TY - JOUR
T1 - The effects of physics-based data augmentation on the generalizability of deep neural networks
T2 - Demonstration on nodule false-positive reduction
AU - Omigbodun, Akinyinka O.
AU - Noo, Frederic
AU - McNitt-Gray, Michael
AU - Hsu, William
AU - Hsieh, Scott S.
N1 - Funding Information:
The authors thank John Hoffman, Nastaran Emaminejad, Angela Sultan, and Muhammad Wahi-Anwar for their assistance in accessing CT data used in preliminary analyses. The authors also thank Professor Matt Brown for helpful feedback in the course of this project. Finally, we acknowledge the reviewers of this manuscript for a very insightful review that aided us in revising this manuscript.
Publisher Copyright:
© 2019 American Association of Physicists in Medicine
PY - 2019/10/1
Y1 - 2019/10/1
N2 - Purpose: An important challenge for deep learning models is generalizing to new datasets that may be acquired with acquisition protocols different from the training set. It is not always feasible to expand training data to the range encountered in clinical practice. We introduce a new technique, physics-based data augmentation (PBDA), that can emulate new computed tomography (CT) data acquisition protocols. We demonstrate two forms of PBDA, emulating increases in slice thickness and reductions of dose, on the specific problem of false-positive reduction in the automatic detection of lung nodules. Methods: We worked with CT images from the lung image database consortium (LIDC) collection. We employed a hybrid ensemble convolutional neural network (CNN), which consists of multiple CNN modules (VGG, DenseNet, ResNet), for a classification task of determining whether an image patch was a suspicious nodule or a false positive. To emulate a reduction in tube current, we injected noise by simulating forward projection, noise addition, and backprojection corresponding to 1.5 mAs (a “chest x-ray” dose). To simulate thick slice CT scans from thin slice CT scans, we grouped and averaged spatially contiguous CT within thin slice data. The neural network was trained with 10% of the LIDC dataset that was selected to have either the highest tube current or the thinnest slices. The network was tested on the remaining data. We compared PBDA to a baseline with standard geometric augmentations (such as shifts and rotations) and Gaussian noise addition. Results: PBDA improved the performance of the networks when generalizing to the test dataset in a limited number of cases. We found that the best performance was obtained by applying augmentation at very low doses (1.5 mAs), about an order of magnitude less than most screening protocols. In the baseline augmentation, a comparable level of Gaussian noise was injected. For dose reduction PBDA, the average sensitivity of 0.931 for the hybrid ensemble network was not statistically different from the average sensitivity of 0.935 without PBDA. Similarly for slice thickness PBDA, the average sensitivity of 0.900 when augmenting with doubled simulated slice thicknesses was not statistically different from the average sensitivity of 0.895 without PBDA. While there were cases detailed in this paper in which we observed improvements, the overall picture was one that suggests PBDA may not be an effective data enrichment tool. Conclusions: PBDA is a newly proposed strategy for mitigating the performance loss of neural networks related to the variation of acquisition protocol between the training dataset and the data that is encountered in deployment or testing. We found that PBDA does not provide robust improvements with the four neural networks (three modules and the ensemble) tested and for the specific task of false-positive reduction in nodule detection.
AB - Purpose: An important challenge for deep learning models is generalizing to new datasets that may be acquired with acquisition protocols different from the training set. It is not always feasible to expand training data to the range encountered in clinical practice. We introduce a new technique, physics-based data augmentation (PBDA), that can emulate new computed tomography (CT) data acquisition protocols. We demonstrate two forms of PBDA, emulating increases in slice thickness and reductions of dose, on the specific problem of false-positive reduction in the automatic detection of lung nodules. Methods: We worked with CT images from the lung image database consortium (LIDC) collection. We employed a hybrid ensemble convolutional neural network (CNN), which consists of multiple CNN modules (VGG, DenseNet, ResNet), for a classification task of determining whether an image patch was a suspicious nodule or a false positive. To emulate a reduction in tube current, we injected noise by simulating forward projection, noise addition, and backprojection corresponding to 1.5 mAs (a “chest x-ray” dose). To simulate thick slice CT scans from thin slice CT scans, we grouped and averaged spatially contiguous CT within thin slice data. The neural network was trained with 10% of the LIDC dataset that was selected to have either the highest tube current or the thinnest slices. The network was tested on the remaining data. We compared PBDA to a baseline with standard geometric augmentations (such as shifts and rotations) and Gaussian noise addition. Results: PBDA improved the performance of the networks when generalizing to the test dataset in a limited number of cases. We found that the best performance was obtained by applying augmentation at very low doses (1.5 mAs), about an order of magnitude less than most screening protocols. In the baseline augmentation, a comparable level of Gaussian noise was injected. For dose reduction PBDA, the average sensitivity of 0.931 for the hybrid ensemble network was not statistically different from the average sensitivity of 0.935 without PBDA. Similarly for slice thickness PBDA, the average sensitivity of 0.900 when augmenting with doubled simulated slice thicknesses was not statistically different from the average sensitivity of 0.895 without PBDA. While there were cases detailed in this paper in which we observed improvements, the overall picture was one that suggests PBDA may not be an effective data enrichment tool. Conclusions: PBDA is a newly proposed strategy for mitigating the performance loss of neural networks related to the variation of acquisition protocol between the training dataset and the data that is encountered in deployment or testing. We found that PBDA does not provide robust improvements with the four neural networks (three modules and the ensemble) tested and for the specific task of false-positive reduction in nodule detection.
KW - data augmentation
KW - ensemble CNN
KW - false-positive reduction
KW - lung CT
KW - lung nodule detection
UR - http://www.scopus.com/inward/record.url?scp=85071620384&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85071620384&partnerID=8YFLogxK
U2 - 10.1002/mp.13755
DO - 10.1002/mp.13755
M3 - Article
C2 - 31396974
AN - SCOPUS:85071620384
SN - 0094-2405
VL - 46
SP - 4563
EP - 4574
JO - Medical Physics
JF - Medical Physics
IS - 10
ER -