Appraising the Quality of Medical Education Research Methods: The Medical Education Research Study Quality Instrument and the Newcastle-Ottawa Scale-Education

David Allan Cook, Darcy A. Reed

Research output: Contribution to journal › Article

101 Citations (Scopus)

Abstract

Purpose: The Medical Education Research Study Quality Instrument (MERSQI) and the Newcastle-Ottawa Scale-Education (NOS-E) were developed to appraise methodological quality in medical education research. The study objective was to evaluate the interrater reliability, normative scores, and between-instrument correlation for these two instruments.

Method: In 2014, the authors searched PubMed and Google for articles using the MERSQI or NOS-E. They obtained or extracted data for interrater reliability, using the intraclass correlation coefficient (ICC), and for normative scores. They calculated between-scale correlation using Spearman rho.

Results: Each instrument contains items concerning sampling, controlling for confounders, and integrity of outcomes. Interrater reliability for overall scores ranged from 0.68 to 0.95. Interrater reliability was "substantial" or better (ICC > 0.60) for nearly all domain-specific items on both instruments. Most instances of low interrater reliability were associated with restriction of range, and raw agreement was usually good. Across 26 studies evaluating published research, the median overall MERSQI score was 11.3 (range 8.9-15.1, of a possible 18). Across six studies, the median overall NOS-E score was 3.22 (range 2.08-3.82, of a possible 6). Overall MERSQI and NOS-E scores correlated reasonably well (rho 0.49-0.72).

Conclusions: The MERSQI and NOS-E are useful, reliable, complementary tools for appraising the methodological quality of medical education research. Interpretation and use of their scores should focus on item-specific codes rather than overall scores. Normative scores should be used for relative rather than absolute judgments because different research questions require different study designs.
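The between-instrument correlation the abstract reports was computed with Spearman rho, i.e., a rank-based correlation. As a minimal, self-contained sketch of that statistic (the per-study scores below are fabricated for illustration, not data from the paper), rho can be obtained by ranking each instrument's overall scores, with average ranks for ties, and applying Pearson's formula to the ranks:

```python
# Spearman's rho from scratch: rank both score lists (average ranks for
# ties), then apply Pearson's correlation formula to the ranks.

def _ranks(xs):
    """Return 1-based ranks of xs, averaging ranks over ties."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    r = [0.0] * len(xs)
    i = 0
    while i < len(xs):
        j = i
        while j + 1 < len(xs) and xs[order[j + 1]] == xs[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the tied positions, 1-based
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r

def spearman_rho(x, y):
    """Spearman rank correlation between two equal-length score lists."""
    rx, ry = _ranks(x), _ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / (vx * vy) ** 0.5

# Hypothetical overall scores for five studies (NOT from this article):
mersqi = [9.5, 11.0, 12.5, 13.0, 14.5]
nos_e = [2.1, 2.8, 2.6, 3.3, 3.7]
print(round(spearman_rho(mersqi, nos_e), 2))  # → 0.9
```

Because rho depends only on rank order, it tolerates the two instruments' different score ranges (0-18 for MERSQI, 0-6 for NOS-E), which is one reason a rank correlation suits this comparison.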

Original language: English (US)
Pages (from-to): 1067-1076
Number of pages: 10
Journal: Academic Medicine
Volume: 90
Issue number: 8
DOIs: 10.1097/ACM.0000000000000786
State: Published - Aug 31 2015

ASJC Scopus subject areas

  • Medicine (all)
  • Education

Cite this

@article{93d9d68f72f943f89a0184759c3bf2ea,
title = "Appraising the Quality of Medical Education Research Methods: The Medical Education Research Study Quality Instrument and the Newcastle-Ottawa Scale-Education",
abstract = "Purpose The Medical Education Research Study Quality Instrument (MERSQI) and the Newcastle-Ottawa Scale-Education (NOS-E) were developed to appraise methodological quality in medical education research. The study objective was to evaluate the interrater reliability, normative scores, and between-instrument correlation for these two instruments. Method In 2014, the authors searched PubMed and Google for articles using the MERSQI or NOS-E. They obtained or extracted data for interrater reliability - using the intraclass correlation coefficient (ICC) - and normative scores. They calculated between-scale correlation using Spearman rho. Results Each instrument contains items concerning sampling, controlling for confounders, and integrity of outcomes. Interrater reliability for overall scores ranged from 0.68 to 0.95. Interrater reliability was {"}substantial{"} or better (ICC > 0.60) for nearly all domain-specific items on both instruments. Most instances of low interrater reliability were associated with restriction of range, and raw agreement was usually good. Across 26 studies evaluating published research, the median overall MERSQI score was 11.3 (range 8.9-15.1, of possible 18). Across six studies, the median overall NOS-E score was 3.22 (range 2.08-3.82, of possible 6). Overall MERSQI and NOS-E scores correlated reasonably well (rho 0.49-0.72). Conclusions The MERSQI and NOS-E are useful, reliable, complementary tools for appraising methodological quality of medical education research. Interpretation and use of their scores should focus on item-specific codes rather than overall scores. Normative scores should be used for relative rather than absolute judgments because different research questions require different study designs.",
author = "Cook, {David Allan} and Reed, {Darcy A.}",
year = "2015",
month = "8",
day = "31",
doi = "10.1097/ACM.0000000000000786",
language = "English (US)",
volume = "90",
pages = "1067--1076",
journal = "Academic Medicine",
issn = "1040-2446",
publisher = "Lippincott Williams and Wilkins",
number = "8",

}

TY - JOUR

T1 - Appraising the Quality of Medical Education Research Methods

T2 - The Medical Education Research Study Quality Instrument and the Newcastle-Ottawa Scale-Education

AU - Cook, David Allan

AU - Reed, Darcy A.

PY - 2015/8/31

Y1 - 2015/8/31

N2 - Purpose The Medical Education Research Study Quality Instrument (MERSQI) and the Newcastle-Ottawa Scale-Education (NOS-E) were developed to appraise methodological quality in medical education research. The study objective was to evaluate the interrater reliability, normative scores, and between-instrument correlation for these two instruments. Method In 2014, the authors searched PubMed and Google for articles using the MERSQI or NOS-E. They obtained or extracted data for interrater reliability - using the intraclass correlation coefficient (ICC) - and normative scores. They calculated between-scale correlation using Spearman rho. Results Each instrument contains items concerning sampling, controlling for confounders, and integrity of outcomes. Interrater reliability for overall scores ranged from 0.68 to 0.95. Interrater reliability was "substantial" or better (ICC > 0.60) for nearly all domain-specific items on both instruments. Most instances of low interrater reliability were associated with restriction of range, and raw agreement was usually good. Across 26 studies evaluating published research, the median overall MERSQI score was 11.3 (range 8.9-15.1, of possible 18). Across six studies, the median overall NOS-E score was 3.22 (range 2.08-3.82, of possible 6). Overall MERSQI and NOS-E scores correlated reasonably well (rho 0.49-0.72). Conclusions The MERSQI and NOS-E are useful, reliable, complementary tools for appraising methodological quality of medical education research. Interpretation and use of their scores should focus on item-specific codes rather than overall scores. Normative scores should be used for relative rather than absolute judgments because different research questions require different study designs.

AB - Purpose The Medical Education Research Study Quality Instrument (MERSQI) and the Newcastle-Ottawa Scale-Education (NOS-E) were developed to appraise methodological quality in medical education research. The study objective was to evaluate the interrater reliability, normative scores, and between-instrument correlation for these two instruments. Method In 2014, the authors searched PubMed and Google for articles using the MERSQI or NOS-E. They obtained or extracted data for interrater reliability - using the intraclass correlation coefficient (ICC) - and normative scores. They calculated between-scale correlation using Spearman rho. Results Each instrument contains items concerning sampling, controlling for confounders, and integrity of outcomes. Interrater reliability for overall scores ranged from 0.68 to 0.95. Interrater reliability was "substantial" or better (ICC > 0.60) for nearly all domain-specific items on both instruments. Most instances of low interrater reliability were associated with restriction of range, and raw agreement was usually good. Across 26 studies evaluating published research, the median overall MERSQI score was 11.3 (range 8.9-15.1, of possible 18). Across six studies, the median overall NOS-E score was 3.22 (range 2.08-3.82, of possible 6). Overall MERSQI and NOS-E scores correlated reasonably well (rho 0.49-0.72). Conclusions The MERSQI and NOS-E are useful, reliable, complementary tools for appraising methodological quality of medical education research. Interpretation and use of their scores should focus on item-specific codes rather than overall scores. Normative scores should be used for relative rather than absolute judgments because different research questions require different study designs.

UR - http://www.scopus.com/inward/record.url?scp=84938235559&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84938235559&partnerID=8YFLogxK

U2 - 10.1097/ACM.0000000000000786

DO - 10.1097/ACM.0000000000000786

M3 - Article

C2 - 26107881

AN - SCOPUS:84938235559

VL - 90

SP - 1067

EP - 1076

JO - Academic Medicine

JF - Academic Medicine

SN - 1040-2446

IS - 8

ER -