Does scale length matter? A comparison of nine- versus five-point rating scales for the mini-CEX

David Allan Cook, Thomas J. Beckman

Research output: Contribution to journal › Article

40 Citations (Scopus)

Abstract

Educators must often decide how many points to use in a rating scale. No studies have compared interrater reliability for different-length scales, and few have evaluated accuracy. This study sought to evaluate the interrater reliability and accuracy of mini-clinical evaluation exercise (mini-CEX) scores, comparing the traditional mini-CEX nine-point scale to a five-point scale. Methods: The authors conducted a validity study in an academic internal medicine residency program. Fifty-two program faculty participated. Participants rated videotaped resident-patient encounters using the mini-CEX with both a nine-point scale and a five-point scale. Some cases were scripted to reflect a specific level of competence (unsatisfactory, satisfactory, superior). Outcome measures included mini-CEX scores, accuracy (scores compared to scripted competence level), interrater reliability, and domain intercorrelation. Results: Interviewing, exam, counseling, and overall ratings varied significantly across levels of competence (P < .0001). Nine-point scale scores accurately classified competence more often (391/720 [54%] for overall ratings) than five-point scores (316/723 [44%], P < .0001). Interrater reliability was similar for scores from the nine- and five-point scales (0.43 and 0.40, respectively, for overall ratings). With the exception of correlation between exam and counseling scores using the five-point scale (r = 0.38, P = .13), score correlations among all domain combinations were high (r = 0.46-0.89) and statistically significant (P ≤ .015) for both scales. Conclusions: Mini-CEX scores demonstrated modest interrater reliability and accuracy. Although interrater reliability is similar for nine- and five-point scales, nine-point scales appear to provide more accurate scores. This has implications for many educational assessments.
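
The headline accuracy result can be checked directly from the counts given in the abstract. The sketch below is illustrative only, not the authors' analysis code: the abstract does not name the statistical test used, so the chi-square test of independence (scipy.stats.chi2_contingency) is an assumption made here to compare the proportions of correctly classified overall ratings for the two scales.

# A minimal sketch, not the authors' analysis code: it reproduces the
# classification-accuracy counts from the abstract and compares them with a
# chi-square test of independence (the specific test is an assumption here).
from scipy.stats import chi2_contingency

# Correct vs. incorrect overall-rating classifications, from the abstract.
nine_point = [391, 720 - 391]   # 391/720 correct, about 54%
five_point = [316, 723 - 316]   # 316/723 correct, about 44%

chi2, p, dof, expected = chi2_contingency([nine_point, five_point])

print(f"Nine-point accuracy: {nine_point[0] / sum(nine_point):.0%}")
print(f"Five-point accuracy: {five_point[0] / sum(five_point):.0%}")
print(f"chi-square = {chi2:.1f}, p = {p:.1e}")

Under this assumed test the p-value falls well below .0001, consistent with the significance level reported for the accuracy comparison.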

Original language: English (US)
Pages (from-to): 655-664
Number of pages: 10
Journal: Advances in Health Sciences Education
Volume: 14
Issue number: 5
DOI: 10.1007/s10459-008-9147-x
State: Published - Nov 2009

Keywords

  • Accuracy
  • Assessment
  • Clinical competence
  • Educational measurement
  • Interrater reliability
  • Medical education
  • Psychometrics
  • Reproducibility of results

ASJC Scopus subject areas

  • Medicine (all)
  • Education

Cite this

Does scale length matter? A comparison of nine- versus five-point rating scales for the mini-CEX. / Cook, David Allan; Beckman, Thomas J.

In: Advances in Health Sciences Education, Vol. 14, No. 5, 11.2009, p. 655-664.

Research output: Contribution to journal › Article
