Poor Agreement between Clinician Response Ratings and Calculated Response Measures in Patients with Chronic Graft-versus-Host Disease

Jeanne Palmer, Stephanie J. Lee, Xiaoyu Chai, Barry E. Storer, Mary E.D. Flowers, Kirk R. Schultz, Yoshihiro Inamoto, Corey Cutler, Joseph Pidala, Mukta Arora, David A. Jacobsohn, Paul A. Carpenter, Steven Z. Pavletic, Paul J. Martin

Research output: Contribution to journal › Article

23 Citations (Scopus)

Abstract

In 2005, a National Institutes of Health consensus conference was held to refine methods for research in patients with chronic graft-versus-host disease, including proposed objective response measures and a provisional algorithm for calculating organ-specific and overall response. In this study, we used weighted kappa statistics to evaluate the level of agreement between clinician response ratings and calculated response categories in patients with chronic graft-versus-host disease. The study included 290 patients who had paired enrollment and follow-up visits. Based on a set of objective measures, 37% of the patients had an overall complete or partial response, whereas clinicians reported an overall complete or partial response rate of 71% (slight to fair agreement, weighted kappa 0.20). Agreement rates between calculated organ-specific responses and clinician-reported changes in skin, mouth, and eyes were fair to moderate (weighted kappa, 0.28-0.54). We conclude that for both overall and organ-specific comparisons, clinician response ratings did not agree well with calculated response categories. Possible reasons for this discrepancy include a high clinical sensitivity for detecting response, a clinical predisposition to recognize selective improvements as overall response, the large change in objective measures proposed to define response, and the high incidence of progressive disease based on new manifestations. Conclusions from prior literature reporting high overall response rates based on clinician judgment would not be supported if the provisional algorithm had been applied to calculate response. Our analysis also highlights the need to define an overall response measure that incorporates both patient-reported and objective measures and accurately reflects the outcome in patients with a mixed response in which one organ or site improves, whereas another shows new involvement.
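The weighted kappa statistic used for the agreement analysis can be illustrated with a minimal sketch. This is not the authors' code. The abstract does not state the weighting scheme, the exact response categories, or the software used, so the linear weights, the four-level category list, and the function names `weighted_kappa` and `landis_koch_label` are assumptions made for illustration. The verbal bands follow the conventional Landis and Koch scale, whose labels (slight, fair, moderate) match those in the abstract. The example ratings are invented and are not study data.

```python
def weighted_kappa(ratings_a, ratings_b, categories, weights="linear"):
    """Cohen's weighted kappa for two raters over ordered categories.

    `categories` must be listed in their natural order. The weighting
    scheme is an assumption; the source does not say which one was used.
    """
    k = len(categories)
    n = len(ratings_a)
    if k < 2 or n == 0 or n != len(ratings_b):
        raise ValueError("need >= 2 categories and equal, non-empty rating lists")
    index = {c: i for i, c in enumerate(categories)}

    # Observed joint proportions: rows = rater A, columns = rater B.
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(ratings_a, ratings_b):
        observed[index[a]][index[b]] += 1.0 / n

    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    def disagreement(i, j):
        d = abs(i - j) / (k - 1)
        return d if weights == "linear" else d * d  # otherwise quadratic

    obs = sum(disagreement(i, j) * observed[i][j]
              for i in range(k) for j in range(k))
    exp = sum(disagreement(i, j) * row[i] * col[j]
              for i in range(k) for j in range(k))
    if exp == 0:
        raise ValueError("kappa is undefined when both raters use one category")
    return 1.0 - obs / exp


def landis_koch_label(kappa):
    """Conventional Landis-Koch verbal band for a kappa value."""
    if kappa < 0:
        return "poor"
    for upper, label in [(0.20, "slight"), (0.40, "fair"), (0.60, "moderate"),
                         (0.80, "substantial")]:
        if kappa <= upper:
            return label
    return "almost perfect"


if __name__ == "__main__":
    # Hypothetical ordered response categories and invented ratings.
    cats = ["progression", "stable", "partial", "complete"]
    clinician = ["partial", "partial", "complete", "stable", "partial", "stable"]
    calculated = ["stable", "progression", "partial", "stable", "partial", "progression"]
    kappa = weighted_kappa(clinician, calculated, cats)
    print(round(kappa, 2), landis_koch_label(kappa))
```

Under this scale, the reported overall value of 0.20 sits at the boundary between slight and fair, and the organ-specific range of 0.28-0.54 spans fair to moderate, consistent with the wording of the abstract.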

Original language: English (US)
Pages (from-to): 1649-1655
Number of pages: 7
Journal: Biology of Blood and Marrow Transplantation
Volume: 18
Issue number: 11
DOI: https://doi.org/10.1016/j.bbmt.2012.05.005
State: Published - Nov 1 2012
Externally published: Yes

Keywords

  • Allogeneic hematopoietic cell transplantation
  • Chronic graft-versus-host disease
  • Response assessment

ASJC Scopus subject areas

  • Hematology
  • Transplantation

Cite this

Poor Agreement between Clinician Response Ratings and Calculated Response Measures in Patients with Chronic Graft-versus-Host Disease. / Palmer, Jeanne; Lee, Stephanie J.; Chai, Xiaoyu; Storer, Barry E.; Flowers, Mary E.D.; Schultz, Kirk R.; Inamoto, Yoshihiro; Cutler, Corey; Pidala, Joseph; Arora, Mukta; Jacobsohn, David A.; Carpenter, Paul A.; Pavletic, Steven Z.; Martin, Paul J.

In: Biology of Blood and Marrow Transplantation, Vol. 18, No. 11, 01.11.2012, p. 1649-1655.

Palmer, J, Lee, SJ, Chai, X, Storer, BE, Flowers, MED, Schultz, KR, Inamoto, Y, Cutler, C, Pidala, J, Arora, M, Jacobsohn, DA, Carpenter, PA, Pavletic, SZ & Martin, PJ 2012, 'Poor Agreement between Clinician Response Ratings and Calculated Response Measures in Patients with Chronic Graft-versus-Host Disease', Biology of Blood and Marrow Transplantation, vol. 18, no. 11, pp. 1649-1655. https://doi.org/10.1016/j.bbmt.2012.05.005
@article{d8d02827cfc8485191925636b2d27760,
title = "Poor Agreement between Clinician Response Ratings and Calculated Response Measures in Patients with Chronic Graft-versus-Host Disease",
keywords = "Allogeneic hematopoietic cell transplantation, Chronic graft-versus-host disease, Response assessment",
author = "Jeanne Palmer and Lee, {Stephanie J.} and Xiaoyu Chai and Storer, {Barry E.} and Flowers, {Mary E.D.} and Schultz, {Kirk R.} and Yoshihiro Inamoto and Corey Cutler and Joseph Pidala and Mukta Arora and Jacobsohn, {David A.} and Carpenter, {Paul A.} and Pavletic, {Steven Z.} and Martin, {Paul J.}",
year = "2012",
month = "11",
day = "1",
doi = "10.1016/j.bbmt.2012.05.005",
language = "English (US)",
volume = "18",
pages = "1649--1655",
journal = "Biology of Blood and Marrow Transplantation",
issn = "1083-8791",
publisher = "Elsevier Inc.",
number = "11",

}
