TY - JOUR
T1 - Evaluation of Computer-Aided Nodule Assessment and Risk Yield (CANARY) in Korean patients for prediction of invasiveness of ground-glass opacity nodule
AU - Lee, Juyoung
AU - Bartholmai, Brian
AU - Peikert, Tobias
AU - Chun, Jaehee
AU - Kim, Hojin
AU - Kim, Jin Sung
AU - Park, Seong Yong
N1 - Publisher Copyright:
© 2021 Lee et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
PY - 2021/6
Y1 - 2021/6
N2 - Differentiating the invasiveness of ground-glass nodules (GGN) is clinically important, and several institutions have attempted to develop their own solutions by using computed tomography images. The purpose of this study is to evaluate Computer-Aided Analysis of Risk Yield (CANARY), a validated virtual biopsy and risk-stratification machine-learning tool for lung adenocarcinomas, in a Korean patient population. To this end, a total of 380 GGNs from 360 patients who underwent pulmonary resection in a single institution were reviewed. Based on the Score Indicative of Lung Cancer Aggression (SILA), a quantitative indicator of CANARY analysis results, all of the GGNs were classified as "indolent"(atypical adenomatous hyperplasia, adenocarcinomas in situ, or minimally invasive adenocarcinoma) or "invasive"(invasive adenocarcinoma) and compared with the pathology reports. By considering the possibility of uneven class distribution, statistical analysis was performed on the 1) entire cohort and 2) randomly extracted six sets of class-balanced samples. For each trial, the optimal cutoff SILA was obtained from the receiver operating characteristic curve. The classification results were evaluated using several binary classification metrics. Of a total of 380 GGNs, the mean SILA for 65 (17.1%) indolent and 315 (82.9%) invasive lesions were 0.195±0.124 and 0.391±0.208 (p < 0.0001). The area under the curve (AUC) of each trial was 0.814 and 0.809, with an optimal threshold SILA of 0.229 for both. The macro F1-score and geometric mean were found to be 0.675 and 0.745 for the entire cohort, while both scored 0.741 in the class-equalized dataset. From these results, CANARY could be confirmed acceptable in classifying GGN for Korean patients after the cutoff SILA was calibrated. We found that adjusting the cutoff SILA is needed to use CANARY in other countries or races, and geometric mean could be more objective than F1-score or AUC in the binary classification of imbalanced data.
AB - Differentiating the invasiveness of ground-glass nodules (GGN) is clinically important, and several institutions have attempted to develop their own solutions by using computed tomography images. The purpose of this study is to evaluate Computer-Aided Analysis of Risk Yield (CANARY), a validated virtual biopsy and risk-stratification machine-learning tool for lung adenocarcinomas, in a Korean patient population. To this end, a total of 380 GGNs from 360 patients who underwent pulmonary resection in a single institution were reviewed. Based on the Score Indicative of Lung Cancer Aggression (SILA), a quantitative indicator of CANARY analysis results, all of the GGNs were classified as "indolent"(atypical adenomatous hyperplasia, adenocarcinomas in situ, or minimally invasive adenocarcinoma) or "invasive"(invasive adenocarcinoma) and compared with the pathology reports. By considering the possibility of uneven class distribution, statistical analysis was performed on the 1) entire cohort and 2) randomly extracted six sets of class-balanced samples. For each trial, the optimal cutoff SILA was obtained from the receiver operating characteristic curve. The classification results were evaluated using several binary classification metrics. Of a total of 380 GGNs, the mean SILA for 65 (17.1%) indolent and 315 (82.9%) invasive lesions were 0.195±0.124 and 0.391±0.208 (p < 0.0001). The area under the curve (AUC) of each trial was 0.814 and 0.809, with an optimal threshold SILA of 0.229 for both. The macro F1-score and geometric mean were found to be 0.675 and 0.745 for the entire cohort, while both scored 0.741 in the class-equalized dataset. From these results, CANARY could be confirmed acceptable in classifying GGN for Korean patients after the cutoff SILA was calibrated. We found that adjusting the cutoff SILA is needed to use CANARY in other countries or races, and geometric mean could be more objective than F1-score or AUC in the binary classification of imbalanced data.
UR - http://www.scopus.com/inward/record.url?scp=85107935796&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85107935796&partnerID=8YFLogxK
U2 - 10.1371/journal.pone.0253204
DO - 10.1371/journal.pone.0253204
M3 - Article
C2 - 34125856
AN - SCOPUS:85107935796
SN - 1932-6203
VL - 16
JO - PLoS One
JF - PLoS One
IS - 6 June
M1 - e0253204
ER -