TY - JOUR
T1 - Study design considerations in clinical outcome research of lung cancer using microarray analysis
AU - Yang, P.
AU - Sun, Z.
AU - Aubry, M. C.
AU - Kosari, F.
AU - Bamlet, W.
AU - Endo, C.
AU - Molina, J. R.
AU - Vasmatzis, G.
N1 - Funding Information: This work was supported by research grants from the US National Cancer Institute (CA80127 and CA84354, Yang) and Mayo Foundation Funds. We would like to thank Drs. A. Visbal, C. Deschamps, and R. Marks for their assistance at various stages of this work. We also thank Ms. Susan Ernst for her technical assistance with the manuscript.
PY - 2004/11
Y1 - 2004/11
N2 - Background: Prognosis following a diagnosis of primary lung cancer is very poor and varies significantly even after adjusting for known predictors. Inherent and acquired gene alterations could cause lung cancer treatment to fail and shorten patient survival. To search for potential molecular markers with significant and independent predictive value in lung cancer survival, we applied oligonucleotide microarray analysis, along with patients' phenotypic profiles, in a case-control study. The focus of this report is on the methodology used to identify potential genes as prognostic factors. Methods: Selected from 304 patients at Mayo Clinic, 18 stage I squamous cell lung cancer patients who died within 2 years (high-aggressive) or lived beyond 5 years (low-aggressive) were included in this study. Both a one-to-one matched design (paired) and a two-group design (grouped) were utilized. Matching variables were age, gender, tumor size and grade, smoking status, and treatment. Affymetrix two-array GeneChip® sets (HG-U133) were used. We applied multiple analytic approaches, including dChip (Harvard University), SAM (Stanford University), ArrayTools (US National Cancer Institute), and MAS5 (Affymetrix), and integrated the results to generate the final candidate genes for further investigation. We evaluated the consistency across the methods and the effects of matched versus grouped design on the results. Results: Using the same pre-processed data under the same criteria for type I error and fold-change in expression intensity, the lists of significant genes from dChip and ArrayTools are 94-100% concordant, and the lists from the paired and the grouped analyses are 53% concordant. With differently pre-processed data, the concordance rate is under 6% even with the same analytic tool. Combining results from all analyses, we found 23 potentially important genes that may distinguish high- from low-aggressive squamous cell tumors of the lung. Conclusion: Given the generally low consistency of results across analytic algorithms and study designs, poor agreement is expected among different investigators reporting candidate genes for the same endpoint. A well-designed study with a carefully planned analytic strategy is critical. We are in the process of validating the 23 preliminary candidate genes found in this study among independent yet comparable cases.
KW - Lung cancer
KW - Microarray
KW - NSCLC
KW - Squamous cell carcinoma
KW - Study design
KW - Survival
UR - http://www.scopus.com/inward/record.url?scp=4944265743&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=4944265743&partnerID=8YFLogxK
U2 - 10.1016/j.lungcan.2004.03.012
DO - 10.1016/j.lungcan.2004.03.012
M3 - Article
C2 - 15474670
AN - SCOPUS:4944265743
SN - 0169-5002
VL - 46
SP - 215
EP - 226
JO - Lung Cancer
JF - Lung Cancer
IS - 2
ER -