Identifying meaningful change on PROMIS short forms in cancer patients: a comparison of item response theory and classic test theory frameworks

Minji K. Lee; John D. Peipert; David Cella; Kathleen J. Yost; David T. Eton; Paul J. Novotny; Jeff A. Sloan; Amylou C. Dueck

doi:10.1007/s11136-022-03255-3

Research output: Contribution to journal › Article › peer-review

Abstract

Background: This study compares classical test theory (CTT) and item response theory (IRT) frameworks for determining reliable change. Establishing reliable change and then anchoring it to a change between categorically distinct responses on a criterion measure is a useful method for detecting meaningful change on a target measure.

Methods: Adult cancer patients were recruited from five cancer centers and completed assessments at baseline and at 6-week follow-up. We investigated short forms derived from the PROMIS® item banks for anxiety, depression, fatigue, pain intensity, pain interference, and sleep disturbance. We detected reliable change using the reliable change index (RCI) and derived the T-score changes corresponding to the RCI calculated under the IRT and CTT frameworks. For both approaches, we applied one-sided tests to detect reliable improvement or worsening. Among changes that were reliable, meaningful change was defined as a patient-reported change of at least one level on the PRO-CTCAE. We compared the percentages of patients with reliable change and with change that was both reliable and meaningful.

Results: The change in T-score corresponding to an RCI_CTT of 1.65 ranged from 5.1 to 9.2 across domains. The change corresponding to an RCI_IRT of 1.65 varied across the score range, with the minimum change ranging from 3.0 to 8.2 across domains. Across domains, RCI_CTT and RCI_IRT classified 80% to 98% of patients consistently. When the two disagreed, RCI_IRT tended to identify more patients as reliably changed than RCI_CTT when scores at both timepoints fell within 43 to 78 for anxiety, 45 to 70 for depression, 38 to 80 for fatigue, 35 to 78 for sleep disturbance, and 48 to 74 for pain interference, because the IRT method yields smaller standard errors in these ranges. For the pain intensity domain, which has a shorter form, the CTT method identified more changes than the IRT method. Using RCI_CTT, 22% to 66% of patients had reliable change in either direction, depending on the domain, and 62% to 83% of these patients also had meaningful change. Using RCI_IRT, 37% to 68% had reliable change in either direction, and 62% to 81% of these patients also had meaningful change.

Conclusion: Applying the two-step criteria demonstrated in this study, we determined how much change is needed to declare reliable change at different baseline score levels. We offer reference values for the percentage of patients who meaningfully change, for investigators using PROMIS instruments in oncology.
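The two-step rule described in the abstract can be sketched in code. The abstract does not give the formulas, so this sketch assumes the conventional definitions: a CTT index that divides the change by sqrt(2) times a single standard error of measurement, and an IRT index that divides the change by the pooled score-specific standard errors at the two timepoints. The function names, the reliability of 0.90, and the standard errors of 2 and 3 T-score points are hypothetical illustration values, not figures from the study.

```python
import math

Z_ONE_SIDED = 1.65  # one-sided critical value quoted in the abstract


def rci_ctt(t1, t2, sd, reliability):
    """CTT-style RCI (assumed form): one standard error of measurement for all scores."""
    sem = sd * math.sqrt(1.0 - reliability)
    return (t2 - t1) / (math.sqrt(2.0) * sem)


def rci_irt(t1, t2, se1, se2):
    """IRT-style RCI (assumed form): standard errors specific to each observed score."""
    return (t2 - t1) / math.sqrt(se1 ** 2 + se2 ** 2)


def ctt_threshold(sd, reliability, z=Z_ONE_SIDED):
    """Smallest T-score change that reaches |RCI_CTT| = z under the assumed form."""
    return z * math.sqrt(2.0) * sd * math.sqrt(1.0 - reliability)


def classify(rci, anchor_levels_changed, z=Z_ONE_SIDED):
    """Step 1: reliable change. Step 2: anchor moved by at least one level.

    Simplification: the direction of the anchor change is not compared
    with the direction of the score change.
    """
    if abs(rci) < z:
        return "no reliable change"
    if abs(anchor_levels_changed) >= 1:
        return "reliable and meaningful change"
    return "reliable change only"


if __name__ == "__main__":
    # Hypothetical patient: T-score rises from 50 to 56, anchor worsens by one level.
    ctt = rci_ctt(50.0, 56.0, sd=10.0, reliability=0.90)
    irt = rci_irt(50.0, 56.0, se1=2.0, se2=3.0)
    print("CTT:", round(ctt, 2), classify(ctt, 1))
    print("IRT:", round(irt, 2), classify(irt, 1))
    print("CTT threshold:", round(ctt_threshold(10.0, 0.90), 2))
```

With these illustrative inputs the same 6-point change is reliable under the IRT form but not under the CTT form. This mirrors the reported pattern that RCI_IRT flags more patients where standard errors are small.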

Original language	English (US)
Pages (from-to)	1355-1367
Number of pages	13
Journal	Quality of Life Research
Volume	32
Issue number	5
DOIs	https://doi.org/10.1007/s11136-022-03255-3
State	Published - May 2023

Keywords

Classical Test Theory
Item Response Theory
Meaningful change
PRO-CTCAE
Patient-Reported Outcomes Measurement Information System
Reliable change index

ASJC Scopus subject areas

Public Health, Environmental and Occupational Health

Cite this

@article{377d33b89ed94426bf611c64657eb1d4,

title = "Identifying meaningful change on PROMIS short forms in cancer patients: a comparison of item response theory and classic test theory frameworks",

keywords = "Classical Test Theory, Item Response Theory, Meaningful change, PRO-CTCAE, Patient-Reported Outcomes Measurement Information System, Reliable change index",

author = "Lee, {Minji K.} and Peipert, {John D.} and Cella, David and Yost, {Kathleen J.} and Eton, {David T.} and Novotny, {Paul J.} and Sloan, {Jeff A.} and Dueck, {Amylou C.}",

note = "Publisher Copyright: {\textcopyright} 2022, The Author(s).",

year = "2023",

month = may,

doi = "10.1007/s11136-022-03255-3",

language = "English (US)",

volume = "32",

pages = "1355--1367",

journal = "Quality of Life Research",

issn = "0962-9343",

publisher = "Springer Netherlands",

number = "5",

}

TY - JOUR

T1 - Identifying meaningful change on PROMIS short forms in cancer patients

T2 - a comparison of item response theory and classic test theory frameworks

AU - Lee, Minji K.

AU - Peipert, John D.

AU - Cella, David

AU - Yost, Kathleen J.

AU - Eton, David T.

AU - Novotny, Paul J.

AU - Sloan, Jeff A.

AU - Dueck, Amylou C.

PY - 2023/5

Y1 - 2023/5

KW - Classical Test Theory

KW - Item Response Theory

KW - Meaningful change

KW - PRO-CTCAE

KW - Patient-Reported Outcomes Measurement Information System

KW - Reliable change index

UR - http://www.scopus.com/inward/record.url?scp=85138736280&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85138736280&partnerID=8YFLogxK

U2 - 10.1007/s11136-022-03255-3

DO - 10.1007/s11136-022-03255-3

M3 - Article

C2 - 36152109

AN - SCOPUS:85138736280

SN - 0962-9343

VL - 32

SP - 1355

EP - 1367

JO - Quality of Life Research

JF - Quality of Life Research

IS - 5

ER -
