Hong Kong J Psychiatry 2003;13:19-25


The Chinese-bilingual SCID-I/P Project: Stage 2 — Reliability for Anxiety Disorders, Adjustment Disorders, and ‘No Diagnosis’
E So, I Kam, CM Leung, A Pang, L Lam


Objective: To report on the reliability of the Chinese-bilingual Structured Clinical Interview for DSM-IV (Axis I, Patient version) Project (CB-SCID-I/P), stage 2: anxiety disorders, adjustment disorders, and ‘no diagnosis’.

Patients and Methods: Newly registered outpatients were consecutively recruited from 2 sites, the Li Ka Shing Psychiatric Centre and Alice Ho Mui Ling Nethersole Psychiatric Clinic. The reliability of the CB-SCID-I/P was assessed individually for anxiety disorders, adjustment disorders, and no diagnosis, and compared with clinician diagnosis using the test-retest method. Kappa value was used to represent the level of reliability.

Results: Seventy-seven outpatients were recruited over an 8-month period. Kappa values were 0.81 for anxiety disorders, 0.64 for adjustment disorders, and 0.57 for no diagnosis. The overall kappa was 0.71.

Conclusion: The CB-SCID-I/P is a reliable diagnostic instrument for DSM-IV anxiety and adjustment disorders for outpatients.

Key words: Adjustment disorders, Anxiety disorders, Chinese, Diagnosis, Mental status schedule, Outpatients

Acknowledgements: The authors wish to thank Professor Gabor Ungvari for commenting on an earlier version of the manuscript. The material support given by the Department of Psychiatry, Prince of Wales Hospital and the Alice Ho Nethersole Hospital Psychiatric Clinic is gratefully acknowledged.

Dr E So, MBBS, FRANZCP, FRACGP, FHKCPsych, Department of Psychiatry, Tai Po Hospital, Tai Po, Hong Kong, China.
Dr I Kam, MBChB, MRCPsych, FHKCPsych, FHKAMPsych, Department of Psychiatry, Shatin Hospital, Shatin, Hong Kong, China.
Dr CM Leung, MBBS, MRCPsych, FHKCPsych, FHKAMPsych, Department of Psychiatry, Prince of Wales Hospital, Shatin, Hong Kong, China.
Dr A Pang, MBBS, MRCPsych, FHKCPsych, FHKAMPsych, Department of Psychiatry, Tai Po Hospital, Tai Po, Hong Kong, China.
Dr L Lam, MBChB, MRCPsych, FHKCPsych, FHKAMPsych, Professor, Department of Psychiatry, Chinese University of Hong Kong, Shatin, Hong Kong, China.

Address for correspondence: Dr E So, Department of Psychiatry, Tai Po Hospital, Tai Po, Hong Kong, China. E-mail: somp@ha.org.hk

Submitted: 14 March 2003; Accepted: 4 August 2003



The Chinese-bilingual Structured Clinical Interview for DSM-IV (Axis I, Patient version) [CB-SCID-I/P], a translated Chinese version of the SCID-I/P, is a semi-structured diagnostic instrument for making Axis I DSM-IV diagnoses. It includes a history taking ‘Overview’ section, followed by 10 modules each representing a major Axis I disorder. Using a decision tree approach, the CB-SCID-I/P guides the clinician in testing diagnostic hypotheses as the interview is conducted. The output is a record of the presence or absence of each of the disorders being considered, for the current episode (past 1 month) and for lifetime occurrence.

The CB-SCID-I/P is designed for clinicians or trained mental health professionals who are familiar with the DSM system and are qualified to perform diagnostic assessments. It is intended for psychiatric or general medical patients, but excludes non-patients, such as community subjects or family members of psychiatric patients. The language and diagnostic coverage are appropriate for use with individuals older than 16 years. Those with severe cognitive impairment or severe psychotic symptoms are not suitable for SCID interviewing. Intrinsic to the semi-structured design, clinical judgement is demanded when going through the order and questions on the SCID manual, the process being punctuated with follow-up questions for clarification.3

In addition to interviewing the subject, use of ancillary data from other sources is permitted in making a final SCID diagnosis, e.g. from family members, previous hospital records, referral notes, and observations of clinical staff. The expected time required to administer the CB-SCID-I/P in a single sitting is approximately 30 to 75 minutes, depending on the complexity of the patient’s history, the ability of the patient to describe the psychopathology, and the engaging skills and experience of the interviewer.4 Optimally, the interviewer should have enough clinical experience to conduct a diagnostic interview without the SCID.

The CB-SCID-I/P Project represents a systematic effort to establish the reliability of a diagnostic instrument. Translation followed the standard back-translation procedure.5 The research methodology was developed by reference to the original document6,7 as well as from local experience.5,8 It is divided into 5 stages:

  • stage 1 — translation of the SCID-I/P, and reliability of CB-SCID-I/P in mood disorders and schizophrenia and related psychotic disorders in inpatients
  • stage 2 — reliability of CB-SCID-I/P in anxiety disorders, adjustment disorders and ‘no diagnosis’ in outpatients
  • stage 3 — multi-site inter-rater reliability of CB-SCID-I/P
  • stage 4 — reliability of CB-SCID-I/P on low prevalence disorders by enhanced sampling
  • stage 5 — reliability of SCID-I/P major psychiatric diagnoses against pooled data from stages 1, 2, 3, and 4.

Stage 1 (mood disorders and schizophrenia) has been completed and reported elsewhere in this issue (see page 7). Results showed a good degree of inter-rater reliability, with percentage agreement at 89.6% and an overall kappa of 0.84. When compared with the clinicians’ best-estimate diagnoses, overall kappa for rater-clinician reliability was 0.77.4 The present report deals with stage 2 of the project.

Overview of Stage 2

The transcript of CB-SCID-I/P as used in stage 1 was adopted without modification. Several insights were gained from the first stage. First, given that the cohort of stage 1 was comprised of inpatients, it is therefore a sample skewed heavily towards signs and symptoms. As expected, the yield for SCID as well as clinician diagnoses was high. Less than 2% of the interviewed subjects were without a diagnosis. The CB-SCID-I/P can confidently be regarded to have a very low false-negative rate (i.e. high in sensitivity), while its propensity for over-diagnosis remains to be determined. Intuitively, use of an outpatient cohort, as in stage 2, would allow this issue to be addressed; it was anticipated that a significant proportion of the stage 2 subjects would have no diagnosable psychiatric illnesses.

Second, as the test-retest method and the joint assessment method yielded similar inter-rater reliability, there was little to gain by repeating both techniques. Consequently, only the test-retest method was employed.

Third, the extremely high inter-rater reliability obtained from stage 1 permits the pooling of SCID diagnoses, such that a useful sample could be obtained within the limits of the recruitment period.

Patients and Methods

Subjects were independently and consecutively recruited from 2 sites. There was no prescreening. Recruitment criteria were the same as in stage 1. Due to resource limitations, only a single consultant diagnosis was used as the ‘gold standard’. The test-retest design was used. Data were pooled from 2 regional outpatient clinics, population characteristics being mostly identical between the 2 sites. Percentage agreement and kappa were employed to represent the degree of reliability.
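As an illustration of the two reliability measures named above, the following minimal sketch computes percentage agreement and Cohen's kappa for a single diagnostic category treated as present or absent. The function name and the counts are hypothetical and chosen only for illustration; they are not data from this study, which was analysed in SPSS.

```python
def agreement_and_kappa(both_yes, rater_only, clinician_only, both_no):
    """Return (percentage agreement, Cohen's kappa) for a 2x2 table.

    The four arguments are counts of subjects for whom the SCID rater and
    the clinician both gave the diagnosis, only one of them did, or
    neither did.
    """
    n = both_yes + rater_only + clinician_only + both_no
    observed = (both_yes + both_no) / n            # proportion of agreement
    rater_yes = (both_yes + rater_only) / n        # rater's base rate
    clinician_yes = (both_yes + clinician_only) / n
    # Agreement expected by chance if the two assessments were independent.
    expected = rater_yes * clinician_yes + (1 - rater_yes) * (1 - clinician_yes)
    kappa = (observed - expected) / (1 - expected)  # undefined if expected == 1
    return 100 * observed, kappa


# Hypothetical counts, not taken from the study tables.
pct, kappa = agreement_and_kappa(both_yes=15, rater_only=5,
                                 clinician_only=5, both_no=75)
print(f"agreement = {pct:.1f}%, kappa = {kappa:.2f}")
```

Kappa corrects the raw agreement for the agreement that two assessors would reach by chance given their own base rates, which is why it is reported alongside percentage agreement rather than instead of it.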

Inclusion Criteria

Requirements for study subjects were as follows:

  • psychiatric outpatients of Li Ka Shing Psychiatric Centre (LKS) and Alice Ho Mui Ling Nethersole Psychiatric Clinic (AHNH), between 1 September 2000 and 31 July 2001.
  • fluent in the Cantonese dialect.
  • aged from 16 to 65 years.
  • able to give written informed consent.
  • absence of previous history of major head injury, serious neurological or medical problems, severe pervasive developmental disorder (e.g. autism), or significant cognitive deficits (e.g. mental retardation, dementia).

SCID Raters and Clinicians

The SCID raters consisted of 2 general psychiatrists (ES and IK), both of whom were involved in the stage 1 reliability assessment. The SCID raters were allowed to use whatever information was available to them at the time of assessment. Diagnoses were generated according to the CB-SCID-I/P manual. The number of diagnoses generated was limited to a maximum of 3: principal diagnosis, comorbid diagnosis, and lifetime diagnosis.

Three senior psychiatrists (LL, AP, and CL) acted as clinicians for generating the clinician’s diagnosis. Diagnostic standardisation and agreement among senior psychiatrists was not pursued as a certain degree of variability was deemed desirable so as to parallel clinical reality. LL and AP were responsible for the AHNH site and CL for the LKS site. All clinical interviews took place within 4 weeks after the SCID interview, so as to minimise the mental state variability while allowing for potential disease progression, where minor pathology might actually herald an evolving major psychiatric disorder.

The clinicians were blinded to the SCID findings. They were allowed similar access to patients’ referral notes, case notes, old records, laboratory results, as well as collateral information from friends and relatives if desired, as in a standard clinical assessment. A principal diagnosis according to DSM-IV was generated for each subject. When indicated, a comorbid diagnosis and/or lifetime diagnosis was also entered. Axis II diagnoses were discarded.


It took between 30 and 75 minutes to conduct an SCID interview. All data were processed by Statistical Package for the Social Sciences (SPSS) version 9.0 software.

Analysis of Subjects and Diagnoses

Seventy-seven subjects were consecutively enrolled, 40 from LKS and 37 from AHNH. All gave written informed consent and completed the study. There were no dropouts from the LKS site. SCID assessment (IK) and clinical assessment (CL) were conducted on the same day. At the AHNH site (ES), 54 patients attended the outpatients department over the 10-month period. Three were older than 65 years and therefore excluded. Four fulfilled the selection criteria but declined participation. Of the remaining 47 patients, the clinicians successfully assessed 37 (32 by LL and 5 by AP) within 4 weeks of the SCID interview.
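The recruitment figures in this paragraph can be cross-checked with a few lines of arithmetic. The sketch below uses only the counts stated in the text; the variable names are illustrative.

```python
# AHNH site: counts as reported in the text.
attended = 54
over_65 = 3            # excluded on age
declined = 4           # eligible but declined participation
scid_interviewed = attended - over_65 - declined

clinician_assessed = {"LL": 32, "AP": 5}
assessed = sum(clinician_assessed.values())
defaulters = scid_interviewed - assessed   # SCID done, no clinician interview

# LKS site: all enrolled subjects completed both assessments.
lks_completed = 40
total_enrolled = lks_completed + assessed

print(f"SCID interviewed at AHNH: {scid_interviewed}")
print(f"Assessed by clinicians at AHNH: {assessed}")
print(f"Defaulters at AHNH: {defaulters} "
      f"({100 * defaulters / scid_interviewed:.1f}% of those interviewed)")
print(f"Total enrolled across both sites: {total_enrolled}")
```

The totals agree with the 47 interviewed, 37 assessed, and 77 enrolled subjects reported here, and with the 10 defaulters described later in this section.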

Table 1 gives a breakdown of patient characteristics at the 2 sites. There is an over-representation of female subjects. The demographics between the 2 groups of subjects were highly comparable.

Among those subjects who had been interviewed by the SCID rater but failed to return for the clinician’s interview (10 of 47, all from the AHNH site), 2 were male and 8 were female. Mean age was 43.4 ± 10.5 years. Seven were married, 2 divorced, and 1 single. The SCID profile of the defaulters showed: 2 patients with major depressive disorder; 1 with depression due to a general medical condition (GMC); 1 with schizophrenia; 2 with agoraphobia without panic disorder; 1 with adjustment disorder with anxious mood; 1 with no principal diagnosis, but a lifetime diagnosis of adjustment disorder; and 2 with no diagnosis.

Table 2 lists the frequency of the pooled diagnoses (principal, comorbid, and lifetime) and the extent of agreement between the SCID rater and the clinician on individual diagnoses. ‘No diagnosis’ was counted as a valid diagnostic entry. It should be noted that the clinicians assigned no diagnosis to 2 subjects, therefore automatically excluding any possibility for comorbid diagnoses, although lifetime diagnoses were still possible. Among the 77 subjects, the SCID rater made 64 principal, 9 comorbid, and 12 lifetime diagnoses. The clinicians made 65 principal, 8 comorbid, and 11 lifetime diagnoses. The SCID raters used 3 more diagnostic categories than the clinicians.

The most frequently made diagnoses were major depressive disorder, adjustment disorder, dysthymia, and panic disorder, findings that fall within expectations for a general outpatient profile.

Grouping together the 4 most frequent diagnostic groups, the SCID raters made 48 diagnostic entries, while the clinicians made 54. Percentage agreement was lowest for the dysthymia group, attesting to the controversial syndromal nature and status of this entity in psychiatry. Anxiety disorders as a group represented 22 SCID entries, versus 21 by clinicians.

Twelve diagnoses of adjustment disorders were made by SCID raters, versus 11 by clinicians. For comorbid and lifetime diagnoses, neither the raters nor the clinicians showed a particular diagnostic pattern. There were insufficient data to carry out a separate analysis on the profile of comorbid or lifetime diagnoses. Other than for the consideration of the instrument’s specificity, these data are dropped from further reliability analysis.

For principal diagnoses, the diagnostic profile was similar between the raters and the clinicians, especially for those more popular diagnostic groups (Table 3). Agreement on major depressive disorder and panic disorder was high, whereas agreement on adjustment disorders was only average. Agreement for dysthymia was worse than chance. Table 3 also addressed the issue of over-diagnosis (i.e. specificity), which was one of the goals of the present project. Of the 12 cases of void entry for principal diagnosis made by the clinicians, the raters disagreed about 4 of them.

Test-retest Reliability

Table 4 gives the kappa scores comparing the raters and the clinicians for diagnostic subtypes of anxiety disorders. The kappa value for panic disorder (0.80) was comparable to that reported by Williams et al (0.87).7

The base rates and kappa values for individual diagnostic groups (i.e. anxiety disorders, adjustment, and no diagnosis) are shown in Table 5. Major depressive disorder, with its significant sample size and a kappa of 0.83, was included for comparison. Panic disorder, with its reasonable sample size, was given a separate kappa value. A more detailed examination revealed that of the 14 agreed anxiety diagnoses, 13 were accurate to the subtypes with only 1 case being at variance.

Kappa scores demonstrated good to very good agreement for no diagnosis (k = 0.57), adjustment disorder (k = 0.64), and anxiety disorders (k = 0.81).

Table 6 gives the overall kappa for anxiety disorders, adjustment disorders, and no diagnosis.

[Table 1]

[Table 2]

The CB-SCID-I/P therefore showed good agreement with the clinician diagnoses for anxiety disorders and adjustment disorders. In addition, it is an instrument with high specificity for differentiating among sub-syndromal cases.


The over-representation of female subjects in the cohort rightly reflected the attendance pattern observed in most general outpatient clinics. The preponderance of neuroses among outpatients necessarily skewed the presenting psychopathology within the sample, such that the commonest diagnoses were identified: depression, anxiety, adjustment disorders, and no diagnosis. Scarcely any psychotic patients presented throughout the recruitment period. Low prevalence disorders were equally inconspicuous. Across the board, SCID raters entered 3 more types of diagnosis (for somatisation and eating disorders) and disagreed on a diagnosis of dysthymia; otherwise they had a diagnostic profile very similar to that of the clinicians. This outcome is believed to be a reliable reflection of daily clinical practice. Agreement among clinicians for making diagnoses such as dysthymia, schizoaffective disorder, or psychotic disorder not otherwise specified was notably low, even in the presence of diagnostic aids. The use of semi-structured instruments such as the CB-SCID-I/P, which intrinsically relies on a substantial degree of clinical judgement, would therefore have little impact in swaying the ideological inertia towards these conditions.9,10

[Table 3]

[Table 4]

In stage 1, out of 144 subjects, SCID raters made 29 comorbid diagnoses in comparison to the clinicians’ 13. The raters in stage 2 entered 9 comorbid diagnoses in 77 subjects as opposed to 8 by the clinicians. This finding argues against a propensity of the instrument towards over-diagnosis.
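The comparison in this paragraph can be made explicit by expressing the comorbid diagnoses as a rate per subject and as a rater-to-clinician ratio. The sketch below uses only the counts quoted above; the function and key names are illustrative.

```python
# Comorbid diagnosis counts as quoted in the text.
stages = {
    "stage 1": {"subjects": 144, "rater": 29, "clinician": 13},
    "stage 2": {"subjects": 77, "rater": 9, "clinician": 8},
}


def comorbid_summary(subjects, rater, clinician):
    """Comorbid diagnoses per subject for each assessor, and their ratio."""
    return {
        "rater_rate": rater / subjects,
        "clinician_rate": clinician / subjects,
        "ratio": rater / clinician,
    }


summaries = {name: comorbid_summary(**counts) for name, counts in stages.items()}

for name, s in summaries.items():
    print(f"{name}: rater {100 * s['rater_rate']:.1f}% of subjects, "
          f"clinician {100 * s['clinician_rate']:.1f}%, "
          f"rater/clinician ratio {s['ratio']:.3f}")
```

In stage 1 the raters entered more than twice as many comorbid diagnoses as the clinicians, whereas in stage 2 the two counts were nearly equal. This is a descriptive comparison only; no significance test is implied.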

On the whole, although the sample was relatively small in size and insufficient for generating kappa scores for each subtype of anxiety disorders, it was large enough to demonstrate a differentiation among the major diagnostic groups. Focusing only on the principal diagnosis, kappa values were all above 0.50, indicating an acceptable degree of reliability (Table 5). Specifically, kappa for panic disorder against other subtypes of anxiety disorder was 0.78 (Table 4), which is higher than the original report by Williams et al.7 More importantly, the kappa for no diagnosis of 0.57 convincingly established the instrument’s specificity. Compared with the reliability in stage 1, the overall kappa is less favourable but acceptable. Given that in stage 1 the subjects were inpatients whereas stage 2 dealt with outpatients, a lesser degree of diagnostic agreement for the latter was predictable.11-13

[Table 5]


With a small sample size and the relatively low base rates, putting a limit on the number of diagnostic items serves to increase the reliability coefficients and compensates for unstable kappa statistics. In the present study, disorders with low prevalence, such as post-traumatic stress disorder, and those with low propensity for presentation, such as social phobia, are under-represented and not suitable for statistical analysis. As previously emphasised, these weaknesses shall be addressed at the subsequent stages of the project.
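The sensitivity of kappa to low base rates can be illustrated with a small sketch. The two tables below are hypothetical and are not study data: both have the same 90% raw agreement in 100 subjects, but the diagnosis is common in one and rare in the other.

```python
def cohen_kappa(both_yes, rater_only, clinician_only, both_no):
    """Cohen's kappa for one diagnosis rated present/absent by two assessors."""
    n = both_yes + rater_only + clinician_only + both_no
    observed = (both_yes + both_no) / n
    rater_yes = (both_yes + rater_only) / n
    clinician_yes = (both_yes + clinician_only) / n
    expected = rater_yes * clinician_yes + (1 - rater_yes) * (1 - clinician_yes)
    return (observed - expected) / (1 - expected)


# Hypothetical tables: 10 disagreements in 100 subjects in both cases.
kappa_common = cohen_kappa(both_yes=40, rater_only=5, clinician_only=5, both_no=50)
kappa_rare = cohen_kappa(both_yes=2, rater_only=5, clinician_only=5, both_no=88)

print(f"common diagnosis: kappa = {kappa_common:.3f}")
print(f"rare diagnosis:   kappa = {kappa_rare:.3f}")
```

With identical raw agreement, the rare diagnosis yields a much lower kappa because chance agreement on 'absent' is already very high. A handful of discordant cases can therefore move kappa substantially when a disorder is under-represented in the sample.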

Apart from the low base rate issue, there are other areas in which the reliability of using CB-SCID-I/P is unknown. This applies to elderly patients and those who declined to participate. In particular, patients who were interviewed by the SCID raters but dropped out of the clinician assessment, who upon telephone tracing claimed to have recovered and declined further service, might represent the milder end of the disease spectrum. The use of CB-SCID-I/P on such patients with minor or short-term mental disturbances is yet to be established.

Methodological Strengths

Apart from the project’s multi-stage multi-site design, one of the strengths of the present study is its consecutive recruitment process, which is crucial in ensuring a ‘real life’ distribution of diagnoses. Another strength is the zero dropout rate achieved at the LKS site. This is likely due to the fact that both the SCID rater and the clinician conducted their sessions at the same time, and therefore capitalised on the readiness of the subjects to complete the assessment on the spot.

Lastly, a critical consideration in a reliability study such as this was the use of senior psychiatrists with all information at their disposal for use in arriving at the subjects’ diagnoses. The addition of rating scales to the clinician’s armamentarium would certainly further strengthen the diagnostic process, although the cost-effectiveness of this measure must be considered.

Implications of the Results

The stage 2 study has demonstrated a satisfactory degree of test-retest reliability in the application of the CB-SCID-I/P, when used to measure DSM-IV anxiety disorders and adjustment disorders in a Chinese adult outpatient population. So far, the findings have been generalised to more than 1 adult psychiatric outpatient clinic with reference to anxiety disorders and adjustment disorders. The accuracy of CB-SCID-I/P as a research instrument in making DSM-IV diagnoses in this defined population is acceptable. Combining the results from stage 1 and stage 2 of the CB-SCID-I/P Project, the results for sensitivity and specificity make it a useful research tool for selecting the ‘case’ from the ‘non-case’. Better generalisability of the CB-SCID-I/P will, however, be dependent upon the stage-by-stage completion of the project.

As DSM-IV diagnostic criteria are embedded within the instrument and the interview is structured in such a manner that it prompts the interviewers to evaluate specific symptom dimensions and guides patients to describe symptoms in a detailed fashion, the CB-SCID-I/P has the potential to move beyond being used as a research tool to facilitate clinical diagnosis and treatment planning. Assuming adequate training can be provided to clinicians, it is probable that in the future the CB-SCID-I/P will be effectively standardised as part of a more extensive intake assessment.14 In the managed care setting, it can be used as a definitive assessment document that is complementary to the formal clinical interview.

As previously stipulated, the CB-SCID-I/P does not purport to be superior to a well-conducted standard clinical assessment, and in this regard it is unlikely that there will ever be a perfect diagnostic instrument.11,15-17 Consequently, the most valid diagnoses are obtained by adding information derived from standardised protocols to extensive data obtained from all other clinical and historical sources.

[Table 6]


The results from this study established that the CB-SCID-I/P yields highly reliable DSM-IV diagnoses in research on anxiety disorders and adjustment disorders, and can serve equally well as an instrument to screen out sub-threshold cases. These results complement those obtained in mood disorders and schizophrenia in stage 1 of the Project. Good diagnostic agreement was obtained for anxiety disorders and adjustment disorders, 2 syndromes that often overlap with each other. Validity for the subtypes of panic disorder is excellent. Our observations also refute claims that the CB-SCID-I/P inherently leads to more comorbid diagnoses. However, the instrument is not yet recommended for use in studying the comorbidity of mental disorders. Overall, our results are compatible with those of other translated versions of the SCID, indicating that the CB-SCID-I/P, at least for the A, B, C, D, E, and F modules, is a satisfactory reproduction of the original English copy and a reliable instrument for making DSM-IV diagnoses.

The CB-SCID-I/P Project has completed the second stage. There are still gaps in the data. The limited application of the present study will be strengthened by the Project’s subsequent stages, such that more diagnostic groups, different patient populations, and treatment settings will be recruited using the multi-stage multi-site approach. It is anticipated that at completion, where cumulated data can be pooled for final analysis, the CB-SCID-I/P will be established as a robust DSM diagnostic instrument for general application.


References

  1. First MB, Gibbon M, Spitzer RL, Williams JBW. Structured Clinical Interview for Axis I DSM-IV Disorders – Patient Edition – (SCID-I/P, Version 2.0, February 1996a Final Version). Biometrics Research Department, New York State Psychiatric Institute.
  2. American Psychiatric Association. DSM-IV: Diagnostic and statistical manual of mental disorders, 4th edition, revised. Washington, DC: American Psychiatric Association;1994.
  3. First MB, Gibbon M, Spitzer RL, Williams JBW. User’s Guide for the Structured Clinical Interview for DSM-IV Axis I Disorders – Research Version – (SCID, Version 2.0, February 1996 Final Version). Biometrics Research Department, New York State Psychiatric Institute.
  4. So E, Kam I, Liu Z, Fong S, Chung D, Leung CM. The Chinese Bilingual SCID-I/P Project: Stage 1 – Reliability for Depression and Schizophrenia and related disorders. Hong Kong J Psychiatry 2003. In press.
  5. Chen CN, Wong J, Lee N, et al. A two stage screening in community survey: report of a pilot study. In: Yeh EK, Rin H, Yeh CC, Hwu HG, editors. Prevalence of mental disorders. Taipei, Taiwan: Department of Health;1985:247-251.
  6. Williams JB, Gibbon M, First MB, et al. The Structured Clinical Interview for DSM-III-R (SCID), II: Multisite test-retest reliability. Arch Gen Psychiatry 1992;49:630-636.
  7. Williams JB, Spitzer RL, Gibbon M. International reliability of a diagnostic intake procedure for panic disorder. Am J Psychiatry 1992;149:560-562.
  8. Leung CM, Ho S, Kan CS, et al. Evaluation of the Chinese Version of the Hospital Anxiety and Depression Scale: a cross-cultural perspective. Int J Psychosom 1993;40:29-34.
  9. Hickie IB, Scott EM, Davenport TA. Somatic distress: developing more integrated concepts. Curr Opin Psychiatry 1998;11:153-158.
  10. Pincus HA, Wakefield Davis W, McQueen LE. Subthreshold mental disorders. Br J Psychiatry 1999;174:288-296.
  11. Robins LN. Epidemiology: reflections on testing the validity of psychiatric interviews. Arch Gen Psychiatry 1985;42:918-924.
  12. Robins LN, Helzer JE, Ratcliff KS, Seyfried W. Validity of the Diagnostic Interview Schedule, Version II DSM-III diagnoses. Psychol Med 1982;12:855-870.
  13. Robins LN, Helzer JE, Croughan J, Ratcliff KS. National Institute of Mental Health Diagnostic Interview Schedule: its history, characteristics and validity. Arch Gen Psychiatry 1981;38:381-389.
  14. Ventura J, Liberman RP, Green MF, et al. Training and quality assurance with the structured clinical interview for DSM-IV (SCID-I/P). Psychiatry Res 1998;79:163-173.
  15. Wing JK, Birley JLT, Cooper JE, et al. Reliability of a procedure for measuring and classifying ‘present psychiatric state’. Br J Psychiatry 1967;113:499-515.
  16. Spitzer RL. Psychiatric diagnosis: are clinicians still necessary? Compr Psychiatry 1983;24:399-411.
  17. Spitzer RL, Williams JB, Gibbon M, First MB. The Structured Clinical Interview for DSM-III-R (SCID), I: history, rationale, and description. Arch Gen Psychiatry 1992;49:624-629.