Mass spectrometry imaging (MSI) is an emerging technology that holds potential for improving, biomarker discovery, metabolomics research, pharmaceutical applications and clinical diagnosis. Despite many solutions being developed, the large data size and high dimensional nature of MSI, especially 3D datasets, still pose computational and memory complexities that hinder accurate identification of biologically relevant molecular patterns. Moreover, the subjectivity in the selection of parameters for conventional pre-processing approaches can lead to bias. Therefore, we assess if a probabilistic generative model based on a fully connected variational autoencoder can be used for unsupervised analysis and peak learning of MSI data to uncover hidden structures. The resulting msiPL method learns and visualizes the underlying non-linear spectral manifold, revealing biologically relevant clusters of tissue anatomy in a mouse kidney and tumor heterogeneity in human prostatectomy tissue, colorectal carcinoma, and glioblastoma mouse model, with identification of underlying m/z peaks. The method is applied for the analysis of MSI datasets ranging from 3.3 to 78.9 GB, without prior pre-processing and peak picking, and acquired using different mass spectrometers at different centers.
ASJC Scopus subject areas
- Biochemistry, Genetics and Molecular Biology(all)
- Physics and Astronomy(all)