Crowdsourcing the General Public for Large Scale Molecular Pathology Studies in Cancer

Francisco J. Candido dos Reis, Stuart Lynn, H. Raza Ali, Diana Eccles, Andrew Hanby, Elena Provenzano, Carlos Caldas, William J. Howat, Leigh Anne McDuffus, Bin Liu, Frances Daley, Penny Coulson, Rupesh J. Vyas, Leslie M. Harris, Joanna M. Owens, Amy F.M. Carton, Janette P. McQuillan, Andy M. Paterson, Zohra Hirji, Sarah K. Christie, Amber R. Holmes, Marjanka K. Schmidt, Montserrat Garcia-Closas, Douglas F. Easton, Manjeet K. Bolla, Qin Wang, Javier Benitez, Roger L. Milne, Arto Mannermaa, Fergus Couch, Peter Devilee, Robert A.E.M. Tollenaar, Caroline Seynaeve, Angela Cox, Simon S. Cross, Fiona M. Blows, Joyce Sanders, Renate de Groot, Jonine Figueroa, Mark Sherman, Maartje Hooning, Hermann Brenner, Bernd Holleczek, Christa Stegmaier, Chris Lintott, Paul D.P. Pharoah

Research output: Contribution to journal › Article

33 Scopus citations

Abstract

Background: Citizen science, scientific research conducted by non-specialists, has the potential to facilitate biomedical research using available large-scale data; however, validating the results is challenging. The Cell Slider is a citizen science project that intends to share images from tumors with the general public, enabling them to score tumor markers independently through an internet-based interface.

Methods: From October 2012 to June 2014, 98,293 Citizen Scientists accessed the Cell Slider web page and scored 180,172 sub-images derived from images of 12,326 tissue microarray cores labeled for estrogen receptor (ER). We evaluated the accuracy of Citizen Scientists' ER classification, and the association between ER status and prognosis, by comparing their test performance against that of trained pathologists.

Findings: The area under the ROC curve was 0.95 (95% CI 0.94 to 0.96) for cancer cell identification and 0.97 (95% CI 0.96 to 0.97) for ER status. ER-positive tumors scored by Citizen Scientists were associated with survival in a similar way to those scored by trained pathologists. Survival probability at 15 years was 0.78 (95% CI 0.76 to 0.80) for ER-positive and 0.72 (95% CI 0.68 to 0.77) for ER-negative tumors based on Citizen Scientists' classification. Based on pathologist classification, survival probability was 0.79 (95% CI 0.77 to 0.81) for ER-positive and 0.71 (95% CI 0.67 to 0.74) for ER-negative tumors. The hazard ratio for death was 0.26 (95% CI 0.18 to 0.37) at diagnosis and became greater than one after 6.5 years of follow-up for ER scored by Citizen Scientists, and 0.24 (95% CI 0.18 to 0.33) at diagnosis, increasing thereafter to one after 6.7 (95% CI 4.1 to 10.9) years of follow-up, for ER scored by pathologists.

Interpretation: Crowdsourcing of the general public to classify cancer pathology data for research is viable, engages the public and provides accurate ER data. Crowdsourced classification of research data may offer a valid solution to problems of throughput requiring human input.

Original language: English (US)
Pages (from-to): 681-689
Number of pages: 9
Journal: EBioMedicine
Volume: 2
Issue number: 7
DOI: https://doi.org/10.1016/j.ebiom.2015.05.009
State: Published - Jul 1 2015

Keywords

  • Breast cancer
  • Citizen science
  • Crowd science
  • Crowdsourcing

ASJC Scopus subject areas

  • Biochemistry, Genetics and Molecular Biology (all)

Cite this

Candido dos Reis, F. J., Lynn, S., Ali, H. R., Eccles, D., Hanby, A., Provenzano, E., Caldas, C., Howat, W. J., McDuffus, L. A., Liu, B., Daley, F., Coulson, P., Vyas, R. J., Harris, L. M., Owens, J. M., Carton, A. F. M., McQuillan, J. P., Paterson, A. M., Hirji, Z., ... Pharoah, P. D. P. (2015). Crowdsourcing the General Public for Large Scale Molecular Pathology Studies in Cancer. EBioMedicine, 2(7), 681-689. https://doi.org/10.1016/j.ebiom.2015.05.009