Autosomal dominantly inherited Alzheimer disease: Analysis of genetic subgroups by machine learning

Diego Castillo-Barnes, Li Su, Javier Ramírez, Diego Salas-Gonzalez, Francisco J. Martinez-Murcia, Ignacio A. Illan, Fermin Segovia, Andres Ortiz, Carlos Cruchaga, Martin R. Farlow, Chengjie Xiong, Neil R. Graff-Radford, Peter R. Schofield, Colin L. Masters, Stephen Salloway, Mathias Jucker, Hiroshi Mori, Johannes Levin, Juan M. Gorriz, Dominantly Inherited Alzheimer Network (DIAN)

Research output: Contribution to journal › Article

Abstract

Although subjects with Dominantly-Inherited Alzheimer's Disease (DIAD) represent less than 1% of all Alzheimer's Disease (AD) cases, the Dominantly Inherited Alzheimer Network (DIAN) initiative has had a strong impact on the understanding of the AD disease course, with special emphasis on the presymptomatic disease phase. Until now, the 3 genes involved in DIAD pathogenesis (PSEN1, PSEN2 and APP) have commonly been merged into one group (Mutation Carriers, MC) and studied using conventional statistical analysis. Comparisons between groups using null-hypothesis testing or longitudinal regression procedures, such as linear mixed-effects models, have been assessed in the extant literature. Within this context, the work presented here compares different groups of subjects by considering the 3 genes, either jointly or separately, using tools based on Machine Learning (ML). This involves a feature selection step that uses ANOVA followed by Principal Component Analysis (PCA) to determine which features are reliable for further comparison purposes. Then, the selected predictors are classified using a Support Vector Machine (SVM) in a nested k-fold cross-validation, resulting in maximum classification rates of 72–74% using PiB PET features, especially when comparing asymptomatic Non-Carrier (NC) subjects with asymptomatic PSEN1 Mutation Carriers (PSEN1-MC). Results obtained from these experiments led to the idea that PSEN1-MC might be considered a mixture of two different subgroups: a first group whose patterns were very close to those of NC subjects, and a second group that differed much more in terms of imaging patterns. Thus, both subgroups were determined using a k-Means clustering algorithm, and a new classification scenario was conducted to validate this process. The comparison between each subgroup and NC subjects resulted in classification rates around 80%, underscoring the importance of considering DIAN as a heterogeneous entity.
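
The pipeline outlined in the abstract (ANOVA feature selection, PCA, an SVM evaluated by nested k-fold cross-validation, then a two-cluster k-Means split of the carrier group) can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes scikit-learn and NumPy, runs on synthetic data rather than DIAN data, and the function names, number of selected features, number of principal components, linear kernel, C grid, and fold counts are all hypothetical choices made only for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import GridSearchCV, StratifiedKFold, cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def make_synthetic_data(n_per_group=60, n_features=40, seed=0):
    """Build a synthetic stand-in for imaging features (NOT DIAN data)."""
    rng = np.random.default_rng(seed)
    controls = rng.normal(0.0, 1.0, size=(n_per_group, n_features))
    carriers = rng.normal(0.0, 1.0, size=(n_per_group, n_features))
    # Assumption: only half of the "carriers" differ from controls,
    # mimicking a group made of two subgroups.
    carriers[: n_per_group // 2, :5] += 2.0
    X = np.vstack([controls, carriers])
    y = np.array([0] * n_per_group + [1] * n_per_group)
    return X, y


def nested_cv_accuracy(X, y, seed=0):
    """Mean outer-fold accuracy of ANOVA -> PCA -> SVM under nested CV."""
    # Every step lives inside the pipeline, so feature selection and PCA
    # are refit on the training part of each fold only.
    pipeline = Pipeline([
        ("scale", StandardScaler()),
        ("anova", SelectKBest(score_func=f_classif, k=10)),
        ("pca", PCA(n_components=3)),
        ("svm", SVC(kernel="linear")),
    ])
    inner = StratifiedKFold(n_splits=3, shuffle=True, random_state=seed)
    outer = StratifiedKFold(n_splits=5, shuffle=True, random_state=seed + 1)
    # Inner loop tunes C; outer loop estimates generalization accuracy.
    search = GridSearchCV(pipeline, {"svm__C": [0.1, 1.0, 10.0]}, cv=inner)
    scores = cross_val_score(search, X, y, cv=outer)
    return scores.mean()


def split_carriers(X_carriers, seed=0):
    """Assign each carrier to one of two k-Means clusters (labels 0 or 1)."""
    km = KMeans(n_clusters=2, n_init=10, random_state=seed)
    return km.fit_predict(X_carriers)


if __name__ == "__main__":
    X, y = make_synthetic_data()
    print("nested CV accuracy:", nested_cv_accuracy(X, y))
    labels = split_carriers(X[y == 1])
    print("carrier cluster sizes:", np.bincount(labels))
```

Keeping selection, projection, and classification inside one pipeline is the design point worth noting: if ANOVA or PCA were fitted on the full dataset before splitting, information from the test folds would leak into the reported classification rate.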

Original language: English (US)
Pages (from-to): 153-167
Number of pages: 15
Journal: Information Fusion
Volume: 58
DOI: https://doi.org/10.1016/j.inffus.2020.01.001
State: Published - Jun 2020

Fingerprint

Learning systems
Genes
Analysis of variance (ANOVA)
Clustering algorithms
Principal component analysis
Support vector machines
Feature extraction
Statistical methods
Imaging techniques
Testing
Experiments

Keywords

  • Alzheimer's disease (AD)
  • DIAN
  • Dominantly-inherited Alzheimer's disease (DIAD)
  • Machine learning
  • Neuroimaging

ASJC Scopus subject areas

  • Software
  • Signal Processing
  • Information Systems
  • Hardware and Architecture

Cite this

Castillo-Barnes, D., Su, L., Ramírez, J., Salas-Gonzalez, D., Martinez-Murcia, F. J., Illan, I. A., ... Dominantly Inherited Alzheimer Network (DIAN) (2020). Autosomal dominantly inherited Alzheimer disease: Analysis of genetic subgroups by machine learning. Information Fusion, 58, 153-167. https://doi.org/10.1016/j.inffus.2020.01.001

@article{b4d95bbf5a6d4d819392b7461c0f9ac4,
title = "Autosomal dominantly inherited Alzheimer disease: Analysis of genetic subgroups by machine learning",
abstract = "Although subjects with Dominantly-Inherited Alzheimer's Disease (DIAD) represent less than 1{\%} of all Alzheimer's Disease (AD) cases, the Dominantly Inherited Alzheimer Network (DIAN) initiative has had a strong impact on the understanding of the AD disease course, with special emphasis on the presymptomatic disease phase. Until now, the 3 genes involved in DIAD pathogenesis (PSEN1, PSEN2 and APP) have commonly been merged into one group (Mutation Carriers, MC) and studied using conventional statistical analysis. Comparisons between groups using null-hypothesis testing or longitudinal regression procedures, such as linear mixed-effects models, have been assessed in the extant literature. Within this context, the work presented here compares different groups of subjects by considering the 3 genes, either jointly or separately, using tools based on Machine Learning (ML). This involves a feature selection step that uses ANOVA followed by Principal Component Analysis (PCA) to determine which features are reliable for further comparison purposes. Then, the selected predictors are classified using a Support Vector Machine (SVM) in a nested k-fold cross-validation, resulting in maximum classification rates of 72–74{\%} using PiB PET features, especially when comparing asymptomatic Non-Carrier (NC) subjects with asymptomatic PSEN1 Mutation Carriers (PSEN1-MC). Results obtained from these experiments led to the idea that PSEN1-MC might be considered a mixture of two different subgroups: a first group whose patterns were very close to those of NC subjects, and a second group that differed much more in terms of imaging patterns. Thus, both subgroups were determined using a k-Means clustering algorithm, and a new classification scenario was conducted to validate this process. The comparison between each subgroup and NC subjects resulted in classification rates around 80{\%}, underscoring the importance of considering DIAN as a heterogeneous entity.",
keywords = "Alzheimer's disease (AD), DIAN, Dominantly-inherited Alzheimer's disease (DIAD), Machine learning, Neuroimaging",
author = "Diego Castillo-Barnes and Li Su and Javier Ram{\'i}rez and Diego Salas-Gonzalez and Martinez-Murcia, {Francisco J.} and Illan, {Ignacio A.} and Fermin Segovia and Andres Ortiz and Carlos Cruchaga and Farlow, {Martin R.} and Chengjie Xiong and Graff-Radford, {Neil R.} and Schofield, {Peter R.} and Masters, {Colin L.} and Stephen Salloway and Mathias Jucker and Hiroshi Mori and Johannes Levin and Gorriz, {Juan M.} and (DIAN), {Dominantly Inherited Alzheimer Network}",
year = "2020",
month = "6",
doi = "10.1016/j.inffus.2020.01.001",
language = "English (US)",
volume = "58",
pages = "153--167",
journal = "Information Fusion",
issn = "1566-2535",
publisher = "Elsevier",

}

TY - JOUR

T1 - Autosomal dominantly inherited Alzheimer disease

T2 - Analysis of genetic subgroups by machine learning

AU - Castillo-Barnes, Diego

AU - Su, Li

AU - Ramírez, Javier

AU - Salas-Gonzalez, Diego

AU - Martinez-Murcia, Francisco J.

AU - Illan, Ignacio A.

AU - Segovia, Fermin

AU - Ortiz, Andres

AU - Cruchaga, Carlos

AU - Farlow, Martin R.

AU - Xiong, Chengjie

AU - Graff-Radford, Neil R.

AU - Schofield, Peter R.

AU - Masters, Colin L.

AU - Salloway, Stephen

AU - Jucker, Mathias

AU - Mori, Hiroshi

AU - Levin, Johannes

AU - Gorriz, Juan M.

AU - (DIAN), Dominantly Inherited Alzheimer Network

PY - 2020/6

Y1 - 2020/6

N2 - Although subjects with Dominantly-Inherited Alzheimer's Disease (DIAD) represent less than 1% of all Alzheimer's Disease (AD) cases, the Dominantly Inherited Alzheimer Network (DIAN) initiative has had a strong impact on the understanding of the AD disease course, with special emphasis on the presymptomatic disease phase. Until now, the 3 genes involved in DIAD pathogenesis (PSEN1, PSEN2 and APP) have commonly been merged into one group (Mutation Carriers, MC) and studied using conventional statistical analysis. Comparisons between groups using null-hypothesis testing or longitudinal regression procedures, such as linear mixed-effects models, have been assessed in the extant literature. Within this context, the work presented here compares different groups of subjects by considering the 3 genes, either jointly or separately, using tools based on Machine Learning (ML). This involves a feature selection step that uses ANOVA followed by Principal Component Analysis (PCA) to determine which features are reliable for further comparison purposes. Then, the selected predictors are classified using a Support Vector Machine (SVM) in a nested k-fold cross-validation, resulting in maximum classification rates of 72–74% using PiB PET features, especially when comparing asymptomatic Non-Carrier (NC) subjects with asymptomatic PSEN1 Mutation Carriers (PSEN1-MC). Results obtained from these experiments led to the idea that PSEN1-MC might be considered a mixture of two different subgroups: a first group whose patterns were very close to those of NC subjects, and a second group that differed much more in terms of imaging patterns. Thus, both subgroups were determined using a k-Means clustering algorithm, and a new classification scenario was conducted to validate this process. The comparison between each subgroup and NC subjects resulted in classification rates around 80%, underscoring the importance of considering DIAN as a heterogeneous entity.

KW - Alzheimer's disease (AD)

KW - DIAN

KW - Dominantly-inherited Alzheimer's disease (DIAD)

KW - Machine learning

KW - Neuroimaging

UR - http://www.scopus.com/inward/record.url?scp=85077807763&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85077807763&partnerID=8YFLogxK

U2 - 10.1016/j.inffus.2020.01.001

DO - 10.1016/j.inffus.2020.01.001

M3 - Article

AN - SCOPUS:85077807763

VL - 58

SP - 153

EP - 167

JO - Information Fusion

JF - Information Fusion

SN - 1566-2535

ER -