Cross–scanner harmonization methods for structural MRI may need further work: A comparison study

Robel K. Gebre; Matthew L. Senjem; Sheelakumari Raghavan; Christopher G. Schwarz; Jeffery L. Gunter; Ekaterina I. Hofrenning; Robert I. Reid; Kejal Kantarci; Jonathan Graff-Radford; David S. Knopman; Ronald C. Petersen; Clifford R. Jack; Prashanthi Vemuri

doi:10.1016/j.neuroimage.2023.119912

Research output: Contribution to journal › Article › peer-review

Abstract

The clinical usefulness of MRI biomarkers for aging and dementia studies relies on precise brain morphological measurements; however, scanner and/or protocol variations may introduce noise or bias. One approach to address this is post-acquisition scan harmonization. In this work, we evaluate deep learning (neural style transfer (NST), CycleGAN, and CGAN), histogram matching (HM), and statistical (ComBat and LongComBat) methods. Two groups of participants who had been scanned on both GE and Siemens scanners were used: cross-sectional participants, known as Crossover (n = 113), and participants scanned longitudinally on both scanners (n = 454). The goal was to match GE MPRAGE (T1-weighted) scans to Siemens improved resolution MPRAGE scans. Harmonization was performed on raw native and preprocessed (resampled, affine transformed to template space) scans. Cortical thicknesses were measured using FreeSurfer (v.7.1.1). Distributions were checked using Kolmogorov-Smirnov tests. Intra-class correlation (ICC) was used to assess the degree of agreement in the Crossover datasets, and annualized percent change in cortical thickness was calculated to evaluate the Longitudinal datasets. Prior to harmonization, the least agreement was found at the frontal pole (ICC = 0.72) for the raw native scans, and at the caudal anterior cingulate (ICC = 0.76) and frontal pole (ICC = 0.54) for the preprocessed scans. Harmonization with NST, CycleGAN, and HM improved the ICCs of the preprocessed scans at the caudal anterior cingulate (>0.81) and frontal pole (>0.67). In the Longitudinal raw native scans, over- and under-estimations of cortical thickness were observed due to the change of scanners. ComBat matched the cortical thickness distributions throughout but was not able to increase the ICCs or remove the effects of scanner changeover in the Longitudinal datasets. CycleGAN and NST performed slightly better at addressing the cortical thickness variations across the scanner change. However, none of the methods succeeded in harmonizing the Longitudinal dataset. CGAN was the worst performer for both datasets. In conclusion, the performance of the methods was overall similar and region dependent. Future research is needed to improve the existing approaches, since no method outperformed the others in harmonizing the datasets at all regions of interest (ROIs). The findings of this study establish a framework for future research into the scan harmonization problem.
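
The three evaluation measures named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code. The abstract does not state which ICC form was used, so the two-way random-effects, absolute-agreement, single-measurement form ICC(2,1) is assumed here; the annualized percent change is assumed to be the relative change from the earlier scan divided by the interval in years; and the Kolmogorov-Smirnov function returns only the test statistic, not a p-value. All function names and the example thickness values are hypothetical.

```python
from statistics import mean


def icc_2_1(ratings):
    """ICC(2,1), assumed form: two-way random effects, absolute agreement.

    ratings: one row per participant, one column per scanner
    (e.g. [thickness_on_GE, thickness_on_Siemens] in mm).
    """
    n = len(ratings)          # participants
    k = len(ratings[0])       # scanners
    grand = mean(v for row in ratings for v in row)
    row_means = [mean(row) for row in ratings]
    col_means = [mean(row[j] for row in ratings) for j in range(k)]

    # Two-way ANOVA sums of squares without replication.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((v - grand) ** 2 for row in ratings for v in row)
    ss_err = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)               # between-participant mean square
    msc = ss_cols / (k - 1)               # between-scanner mean square
    mse = ss_err / ((n - 1) * (k - 1))    # residual mean square
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


def annualized_percent_change(earlier, later, years):
    """Percent change relative to the earlier measurement, per year (assumed definition)."""
    return 100.0 * (later - earlier) / earlier / years


def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: largest gap between empirical CDFs."""
    def ecdf(sample, x):
        return sum(v <= x for v in sample) / len(sample)

    points = sorted(set(a) | set(b))
    return max(abs(ecdf(a, x) - ecdf(b, x)) for x in points)


# Hypothetical cortical thickness values (mm) for three participants.
identical = [[2.0, 2.0], [2.5, 2.5], [3.0, 3.0]]
offset = [[2.0, 2.2], [2.5, 2.7], [3.0, 3.2]]  # second scanner reads 0.2 mm thicker

print(icc_2_1(identical))
print(icc_2_1(offset))
print(annualized_percent_change(2.50, 2.45, 2.0))
print(ks_statistic([2.0, 2.5, 3.0], [2.2, 2.7, 3.2]))
```

In the second example the two scanners rank participants identically but differ by a constant offset. An absolute-agreement ICC penalizes that offset and so falls below 1, which is one reason the choice of ICC form matters when a systematic scanner bias is the effect being measured.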

Original language	English (US)
Article number	119912
Journal	NeuroImage
Volume	269
DOIs	https://doi.org/10.1016/j.neuroimage.2023.119912
State	Published - Apr 1 2023

Keywords

Deep learning
Scan harmonization
Structural MRI

ASJC Scopus subject areas

Neurology
Cognitive Neuroscience

Cite this

Gebre, R. K., Senjem, M. L., Raghavan, S., Schwarz, C. G., Gunter, J. L., Hofrenning, E. I., Reid, R. I., Kantarci, K., Graff-Radford, J., Knopman, D. S., Petersen, R. C., Jack, C. R., & Vemuri, P. (2023). Cross–scanner harmonization methods for structural MRI may need further work: A comparison study. NeuroImage, 269, Article 119912. https://doi.org/10.1016/j.neuroimage.2023.119912

@article{9129ae70306b4d5b93c4b82ff6c1969a,

title = "Cross–scanner harmonization methods for structural MRI may need further work: A comparison study",

keywords = "Deep learning, Scan harmonization, Structural MRI",

author = "Gebre, {Robel K.} and Senjem, {Matthew L.} and Sheelakumari Raghavan and Schwarz, {Christopher G.} and Gunter, {Jeffery L.} and Hofrenning, {Ekaterina I.} and Reid, {Robert I.} and Kejal Kantarci and Jonathan Graff-Radford and Knopman, {David S.} and Petersen, {Ronald C.} and Jack, {Clifford R.} and Prashanthi Vemuri",

note = "Publisher Copyright: {\textcopyright} 2023",

year = "2023",

month = apr,

day = "1",

doi = "10.1016/j.neuroimage.2023.119912",

language = "English (US)",

volume = "269",

journal = "NeuroImage",

issn = "1053-8119",

publisher = "Academic Press Inc.",

}

TY - JOUR

T1 - Cross–scanner harmonization methods for structural MRI may need further work

T2 - A comparison study

AU - Gebre, Robel K.

AU - Senjem, Matthew L.

AU - Raghavan, Sheelakumari

AU - Schwarz, Christopher G.

AU - Gunter, Jeffery L.

AU - Hofrenning, Ekaterina I.

AU - Reid, Robert I.

AU - Kantarci, Kejal

AU - Graff-Radford, Jonathan

AU - Knopman, David S.

AU - Petersen, Ronald C.

AU - Jack, Clifford R.

AU - Vemuri, Prashanthi

PY - 2023/4/1

Y1 - 2023/4/1

KW - Deep learning

KW - Scan harmonization

KW - Structural MRI

UR - http://www.scopus.com/inward/record.url?scp=85147380763&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85147380763&partnerID=8YFLogxK

U2 - 10.1016/j.neuroimage.2023.119912

DO - 10.1016/j.neuroimage.2023.119912

M3 - Article

C2 - 36731814

AN - SCOPUS:85147380763

SN - 1053-8119

VL - 269

JO - NeuroImage

JF - NeuroImage

M1 - 119912

ER -
