The GRADE approach is reproducible in assessing the quality of evidence of quantitative evidence syntheses

Reem A. Mustafa, Nancy Santesso, Jan Brozek, Elie A. Akl, Stephen D. Walter, Geoff Norman, Mahan Kulasegaram, Robin Christensen, Gordon H. Guyatt, Yngve Falck-Ytter, Stephanie Chang, Mohammad H Murad, Gunn E. Vist, Toby Lasserson, Gerald Gartlehner, Vijay Shukla, Xin Sun, Craig Whittington, Piet N. Post, Eddy Lang, Kylie Thaler, Ilkka Kunnamo, Heidi Alenius, Joerg J. Meerpohl, Ana C. Alba, Immaculate F. Nevis, Stephen Gentles, Marie Chantal Ethier, Alonso Carrasco-Labra, Rasha Khatib, Gihad Nesrallah, Jamie Kroft, Amanda Selk, Romina Brignardello-Petersen, Holger J. Schünemann

Research output: Contribution to journal, Article

125 Citations (Scopus)

Abstract

Objective: We evaluated the inter-rater reliability (IRR) of assessing the quality of evidence (QoE) using the Grading of Recommendations, Assessment, Development, and Evaluation (GRADE) approach. Study Design and Setting: After completing two training exercises, participants worked independently as individual raters to assess the QoE of 16 outcomes. After recording their initial impression using a global rating, raters graded the QoE following the GRADE approach. Subsequently, randomly paired raters submitted a consensus rating. Results: The IRR without using the GRADE approach for two individual raters was 0.31 (95% confidence interval [95% CI] = 0.21-0.42) among Health Research Methodology students (n = 10) and 0.27 (95% CI = 0.19-0.37) among the GRADE working group members (n = 15). The corresponding IRR of the GRADE approach in assessing the QoE was significantly higher, that is, 0.66 (95% CI = 0.56-0.75) and 0.72 (95% CI = 0.61-0.79), respectively. The IRR further increased for three (0.80 [95% CI = 0.73-0.86] and 0.74 [95% CI = 0.65-0.81]) or four raters (0.84 [95% CI = 0.78-0.89] and 0.79 [95% CI = 0.71-0.85]). The IRR did not improve when QoE was assessed through a consensus rating. Conclusion: Our findings suggest that use of the GRADE approach by trained individuals improves reliability in comparison to intuitive judgments about the QoE and that two individual raters can reliably assess the QoE using the GRADE system.
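
The abstract reports that reliability rises as ratings from more raters are combined. One standard way to reason about that pattern is the Spearman-Brown formula, which projects the reliability of the mean of k raters from the reliability of a single rater. The abstract does not state which reliability coefficient or formula the authors used, so the Python sketch below is only an illustration under two assumptions: that the reported two-rater values (0.66 and 0.72) are reliabilities of the mean of two raters, and that Spearman-Brown applies. The function names are hypothetical, and the projections need not match the estimates reported in the abstract.

```python
def single_rater_reliability(r_k: float, k: int) -> float:
    """Invert Spearman-Brown: the single-rater reliability implied by
    the reliability r_k of the mean of k raters."""
    return r_k / (k - (k - 1) * r_k)


def spearman_brown(r_1: float, k: int) -> float:
    """Projected reliability of the mean of k raters, given the
    single-rater reliability r_1."""
    return k * r_1 / (1 + (k - 1) * r_1)


if __name__ == "__main__":
    # Two-rater values quoted in the abstract; treating them as
    # mean-of-two reliabilities is an assumption of this sketch.
    reported_two_rater = {"students": 0.66, "working group": 0.72}
    for group, r_2 in reported_two_rater.items():
        r_1 = single_rater_reliability(r_2, k=2)
        for k in (3, 4):
            projected = spearman_brown(r_1, k)
            print(f"{group}: projected reliability with {k} raters = {projected:.2f}")
```

Running the script prints the projected values for three and four raters in each group, which can be set beside the estimates and confidence intervals in the abstract.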

Original language: English (US)
Pages (from-to): 736-742
Number of pages: 7
Journal: Journal of Clinical Epidemiology
Volume: 66
Issue number: 7
DOI: https://doi.org/10.1016/j.jclinepi.2013.02.004
State: Published - Jul 2013

Keywords

  • Evidence-based medicine
  • GRADE
  • Inter-rater reliability
  • Levels of evidence
  • Reproducibility
  • Validation studies

ASJC Scopus subject areas

  • Epidemiology

Cite this

Mustafa, R. A., Santesso, N., Brozek, J., Akl, E. A., Walter, S. D., Norman, G., ... Schünemann, H. J. (2013). The GRADE approach is reproducible in assessing the quality of evidence of quantitative evidence syntheses. Journal of Clinical Epidemiology, 66(7), 736-742. https://doi.org/10.1016/j.jclinepi.2013.02.004

@article{647ad558a2a440c09351307e647f4dbf,
title = "The GRADE approach is reproducible in assessing the quality of evidence of quantitative evidence syntheses",
abstract = "Objective: We evaluated the inter-rater reliability (IRR) of assessing the quality of evidence (QoE) using the Grading of Recommendations, Assessment, Development, and Evaluation (GRADE) approach. Study Design and Setting: On completing two training exercises, participants worked independently as individual raters to assess the QoE of 16 outcomes. After recording their initial impression using a global rating, raters graded the QoE following the GRADE approach. Subsequently, randomly paired raters submitted a consensus rating. Results: The IRR without using the GRADE approach for two individual raters was 0.31 (95{\%} confidence interval [95{\%} CI] = 0.21-0.42) among Health Research Methodology students (n = 10) and 0.27 (95{\%} CI = 0.19-0.37) among the GRADE working group members (n = 15). The corresponding IRR of the GRADE approach in assessing the QoE was significantly higher, that is, 0.66 (95{\%} CI = 0.56-0.75) and 0.72 (95{\%} CI = 0.61-0.79), respectively. The IRR further increased for three (0.80 [95{\%} CI = 0.73-0.86] and 0.74 [95{\%} CI = 0.65-0.81]) or four raters (0.84 [95{\%} CI = 0.78-0.89] and 0.79 [95{\%} CI = 0.71-0.85]). The IRR did not improve when QoE was assessed through a consensus rating. Conclusion: Our findings suggest that trained individuals using the GRADE approach improves reliability in comparison to intuitive judgments about the QoE and that two individual raters can reliably assess the QoE using the GRADE system.",
keywords = "Evidence-based medicine, GRADE, Inter-rater reliability, Levels of evidence, Reproducibility, Validation studies",
author = "Mustafa, {Reem A.} and Nancy Santesso and Jan Brozek and Akl, {Elie A.} and Walter, {Stephen D.} and Geoff Norman and Mahan Kulasegaram and Robin Christensen and Guyatt, {Gordon H.} and Yngve Falck-Ytter and Stephanie Chang and Murad, {Mohammad H} and Vist, {Gunn E.} and Toby Lasserson and Gerald Gartlehner and Vijay Shukla and Xin Sun and Craig Whittington and Post, {Piet N.} and Eddy Lang and Kylie Thaler and Ilkka Kunnamo and Heidi Alenius and Meerpohl, {Joerg J.} and Alba, {Ana C.} and Nevis, {Immaculate F.} and Stephen Gentles and Ethier, {Marie Chantal} and Alonso Carrasco-Labra and Rasha Khatib and Gihad Nesrallah and Jamie Kroft and Amanda Selk and Romina Brignardello-Petersen and Sch{\"u}nemann, {Holger J.}",
year = "2013",
month = "7",
doi = "10.1016/j.jclinepi.2013.02.004",
language = "English (US)",
volume = "66",
pages = "736--742",
journal = "Journal of Clinical Epidemiology",
issn = "0895-4356",
publisher = "Elsevier USA",
number = "7",

}

TY - JOUR
T1 - The GRADE approach is reproducible in assessing the quality of evidence of quantitative evidence syntheses
AU - Mustafa, Reem A.
AU - Santesso, Nancy
AU - Brozek, Jan
AU - Akl, Elie A.
AU - Walter, Stephen D.
AU - Norman, Geoff
AU - Kulasegaram, Mahan
AU - Christensen, Robin
AU - Guyatt, Gordon H.
AU - Falck-Ytter, Yngve
AU - Chang, Stephanie
AU - Murad, Mohammad H
AU - Vist, Gunn E.
AU - Lasserson, Toby
AU - Gartlehner, Gerald
AU - Shukla, Vijay
AU - Sun, Xin
AU - Whittington, Craig
AU - Post, Piet N.
AU - Lang, Eddy
AU - Thaler, Kylie
AU - Kunnamo, Ilkka
AU - Alenius, Heidi
AU - Meerpohl, Joerg J.
AU - Alba, Ana C.
AU - Nevis, Immaculate F.
AU - Gentles, Stephen
AU - Ethier, Marie Chantal
AU - Carrasco-Labra, Alonso
AU - Khatib, Rasha
AU - Nesrallah, Gihad
AU - Kroft, Jamie
AU - Selk, Amanda
AU - Brignardello-Petersen, Romina
AU - Schünemann, Holger J.
PY - 2013/7
Y1 - 2013/7

AB - Objective: We evaluated the inter-rater reliability (IRR) of assessing the quality of evidence (QoE) using the Grading of Recommendations, Assessment, Development, and Evaluation (GRADE) approach. Study Design and Setting: On completing two training exercises, participants worked independently as individual raters to assess the QoE of 16 outcomes. After recording their initial impression using a global rating, raters graded the QoE following the GRADE approach. Subsequently, randomly paired raters submitted a consensus rating. Results: The IRR without using the GRADE approach for two individual raters was 0.31 (95% confidence interval [95% CI] = 0.21-0.42) among Health Research Methodology students (n = 10) and 0.27 (95% CI = 0.19-0.37) among the GRADE working group members (n = 15). The corresponding IRR of the GRADE approach in assessing the QoE was significantly higher, that is, 0.66 (95% CI = 0.56-0.75) and 0.72 (95% CI = 0.61-0.79), respectively. The IRR further increased for three (0.80 [95% CI = 0.73-0.86] and 0.74 [95% CI = 0.65-0.81]) or four raters (0.84 [95% CI = 0.78-0.89] and 0.79 [95% CI = 0.71-0.85]). The IRR did not improve when QoE was assessed through a consensus rating. Conclusion: Our findings suggest that trained individuals using the GRADE approach improves reliability in comparison to intuitive judgments about the QoE and that two individual raters can reliably assess the QoE using the GRADE system.

KW - Evidence-based medicine
KW - GRADE
KW - Inter-rater reliability
KW - Levels of evidence
KW - Reproducibility
KW - Validation studies
UR - http://www.scopus.com/inward/record.url?scp=84878261195&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84878261195&partnerID=8YFLogxK
U2 - 10.1016/j.jclinepi.2013.02.004
DO - 10.1016/j.jclinepi.2013.02.004
M3 - Article
C2 - 23623694
AN - SCOPUS:84878261195
VL - 66
SP - 736
EP - 742
JO - Journal of Clinical Epidemiology
JF - Journal of Clinical Epidemiology
SN - 0895-4356
IS - 7
ER -