TY - JOUR
T1 - How reliable are assessments of clinical teaching? A review of the published instruments
AU - Beckman, Thomas J.
AU - Ghosh, Amit K.
AU - Cook, David A.
AU - Erwin, Patricia J.
AU - Mandrekar, Jayawant N.
N1 - Funding Information: Received from the Department of Internal Medicine (TJB, AKG, DAC), Department of Medicine, Mayo Clinic College of Medicine, Mayo Clinic and Mayo Foundation; Plummer Medical Library (PJE), Mayo Clinic College of Medicine; Department of Health Sciences Research (JNM), Division of Biostatistics, Mayo Clinic and Mayo Foundation, Rochester, Minn.
PY - 2004/9
Y1 - 2004/9
N2 - BACKGROUND: Learner feedback is the primary method for evaluating clinical faculty, despite few existing standards for measuring learner assessments. OBJECTIVE: To review the published literature on instruments for evaluating clinical teachers and to summarize themes that will aid in developing universally appealing tools. DESIGN: Searching 5 electronic databases revealed over 330 articles. Excluded were reviews, editorials, and qualitative studies. Twenty-one articles describing instruments designed for evaluating clinical faculty by learners were found. Three investigators studied these papers and tabulated characteristics of the learning environments and validation methods. Salient themes among the evaluation studies were determined. MAIN RESULTS: Many studies combined evaluations from both outpatient and inpatient settings and some authors combined evaluations from different learner levels. Wide ranges in numbers of teachers, evaluators, evaluations, and scale items were observed. The most frequently encountered statistical methods were factor analysis and determining internal consistency reliability with Cronbach's α. Less common methods were the use of test-retest reliability, interrater reliability, and convergent validity between validated instruments. Fourteen domains of teaching were identified and the most frequently studied domains were interpersonal and clinical-teaching skills. CONCLUSIONS: Characteristics of teacher evaluations vary between educational settings and between different learner levels, indicating that future studies should utilize more narrowly defined study populations. A variety of validation methods including temporal stability, interrater reliability, and convergent validity should be considered. Finally, existing data support the validation of instruments comprised solely of interpersonal and clinical-teaching domains.
KW - Evaluation studies
KW - Medical faculty
KW - Validity
UR - http://www.scopus.com/inward/record.url?scp=4544334347&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=4544334347&partnerID=8YFLogxK
U2 - 10.1111/j.1525-1497.2004.40066.x
DO - 10.1111/j.1525-1497.2004.40066.x
M3 - Review article
C2 - 15333063
AN - SCOPUS:4544334347
SN - 0884-8734
VL - 19
SP - 971
EP - 977
JO - Journal of General Internal Medicine
JF - Journal of General Internal Medicine
IS - 9
ER -