Background: The National Cancer Database (NCDB) is a valuable resource for studying national cancer treatment patterns. However, data abstraction rules in effect from 2004 to 2007 resulted in missing clinical stage for a high percentage of cases. We investigated how this missingness can bias results in breast cancer studies that include patients treated with neoadjuvant chemotherapy (NAC).

Methods: We examined the impact of missing clinical stage on the estimated percentage of breast cancers treated with NAC versus adjuvant chemotherapy (AC) from 2004 to 2013. Trends in NAC use estimated after excluding cases with missing clinical stage were compared with trends estimated after multiple imputation, performed using the chained equations approach with predictive mean matching.

Results: Clinical stage was missing for 56% of cases in 2004–2007, versus 12% in 2008–2013. It was missing more than twice as often for AC patients as for NAC patients (31% vs. 12% overall), with the largest difference occurring in 2004–2007 (60% vs. 27% missing). Because stage was more frequently missing in AC patients, excluding cases with missing clinical stage introduced bias into comparisons of NAC and AC trends. With multiple imputation, significant increases in NAC use between 2004 and 2013 were identified for each stage: from 2% to 5% for stage I, from 11% to 24% for stage II, and from 34% to 46% for stage III. In contrast, an analysis excluding cases with missing stage suggested little or no increase within any stage.

Conclusion: NCDB data abstraction rules from 2004 to 2007 resulted in missing clinical stage for more than 50% of breast cancers, which may introduce substantial bias. Multiple imputation or exclusion of the years 2004–2007 should be considered to mitigate the problem of missing clinical stage in the NCDB.
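The direction of the complete-case bias can be illustrated with simple arithmetic. The sketch below is not the study's analysis. It assumes a hypothetical true NAC share of 10% (an invented value) and assumes that missingness depends only on treatment group, then applies the missing-stage rates reported above to show how dropping cases with missing stage would change the apparent NAC share. The function name and variables are illustrative only.

```python
def complete_case_nac_share(true_nac_share, miss_nac, miss_ac):
    """Apparent NAC share after dropping cases with missing clinical stage.

    true_nac_share: true fraction of chemotherapy patients who received NAC
    miss_nac, miss_ac: fraction of NAC / AC cases with missing clinical stage
    """
    nac_kept = true_nac_share * (1.0 - miss_nac)
    ac_kept = (1.0 - true_nac_share) * (1.0 - miss_ac)
    return nac_kept / (nac_kept + ac_kept)


# Hypothetical true NAC share: an assumption, not a figure from the study.
TRUE_SHARE = 0.10

# Missing-stage rates as reported in the abstract.
early = complete_case_nac_share(TRUE_SHARE, miss_nac=0.27, miss_ac=0.60)    # 2004-2007
overall = complete_case_nac_share(TRUE_SHARE, miss_nac=0.12, miss_ac=0.31)  # 2004-2013

print(f"true NAC share:                {TRUE_SHARE:.1%}")
print(f"complete-case share, 2004-07:  {early:.1%}")
print(f"complete-case share, overall:  {overall:.1%}")
```

Under these assumptions the apparent NAC share rises from 10% to about 17% with the 2004–2007 missingness rates and to about 12% with the overall rates. Because the inflation is largest in the early years, a genuine increase in NAC use over time could look flat in a complete-case analysis, which is consistent with the pattern described in the Results.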