Parkinson's disease (PD) is a chronic, progressive neurodegenerative condition that affects approximately 2% to 3% of people over
the age of 65 [1]. Its rising prevalence among the elderly has made it one of the fastest-growing neurodegenerative diseases, exerting a
heavy social burden [2]. PD is characterized by a reduction in striatal dopamine and a selective degeneration of dopaminergic neurons
in the substantia nigra pars compacta (SNpc), accompanied by intraneuronal proteinaceous inclusions known as Lewy bodies [3].
Common motor symptoms of PD include postural instability, bradykinesia, muscle rigidity, and resting tremor. In addition, PD
patients exhibit non-motor symptoms like dementia, depression, and anxiety [4]. PD is diagnosed primarily by clinical examinations,
medical histories, and responses to dopaminergic treatment, resulting in a high rate of misdiagnosis in clinical practice [5]. Exploring
reliable, specific, and highly predictive biomarkers is, therefore, critical for improving PD diagnosis and developing effective PD
treatments and prevention strategies [6].
Metabolomics, as a powerful phenotyping technique, can detect thousands of features and determine global changes associated with
illness states, as well as identify diagnostic and/or predictive indicators of disease progression [7,8]. Emerging evidence has linked
metabolic dysfunctions to the development and progression of PD [9]. Previous studies indicated the involvement of alterations in
redox metabolism and central carbon metabolism in PD [10,11]. Additionally, amino acids, bile acids, caffeine metabolites, and fatty
acids were assessed and found to be significantly altered in PD patients [12–17]. Lipidomics, as a subfield of metabolomics, has the
potential to provide new insights and answers that may enhance our ability to diagnose and track disease progression, predict critical
endpoints, and identify people at risk before they show symptoms. Recently, lipid abnormalities have also been implicated in many
aspects of PD pathology [18–20]. Combining metabolomics and lipidomics to cover a broad range of compounds would
greatly facilitate the detection of disease biomarkers. To date, relatively few studies have combined untargeted and
targeted metabolomics and lipidomics with machine learning to discover a unique metabolic pattern that improves the diagnosis
of PD.
In the present study, we investigated metabolic changes in the plasma and identified diagnostic biomarkers associated with PD in
two well-characterized cohorts composed of 288 subjects using untargeted metabolomics and lipidomics approaches based on LC-MS.
To improve the reliability of the discovered biomarkers, we used an ensemble-based algorithm consisting of random forest (RF),
support vector machine (SVM), and least absolute shrinkage and selection operator (LASSO) to develop prediction models based on
potential metabolic biomarkers for the discrimination of PD patients from healthy subjects. Multi-algorithm ensemble learning
improved feature selection accuracy by combining complementary algorithms. A panel of six plasma markers performed well in
distinguishing PD patients from healthy subjects, which may enhance PD diagnosis and contribute to future
investigation of the disease.
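The idea of combining RF, SVM, and LASSO for feature selection can be illustrated with a minimal sketch. It is not the pipeline used in this study: the voting rule (keep a feature ranked in the top k by at least two of the three models), the hyperparameters, the helper names `top_k` and `ensemble_select`, and the synthetic data from scikit-learn's `make_classification` are all assumptions made for illustration. LASSO is fitted here as a linear regression on the 0/1 class labels, which is one common simplification.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import Lasso
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC


def top_k(scores, k):
    """Indices of the k largest scores, ignoring features with a zero score."""
    order = np.argsort(scores)[::-1][:k]
    return {int(i) for i in order if scores[i] > 0}


def ensemble_select(X, y, k=10, min_votes=2, seed=0):
    """Keep features ranked in the top k by at least min_votes of three models."""
    Xs = StandardScaler().fit_transform(X)

    # Three complementary learners (hyperparameters are illustrative assumptions).
    rf = RandomForestClassifier(n_estimators=200, random_state=seed).fit(Xs, y)
    svm = LinearSVC(C=0.1, dual=False, max_iter=5000).fit(Xs, y)
    lasso = Lasso(alpha=0.02).fit(Xs, y.astype(float))

    per_model_scores = (
        rf.feature_importances_,   # impurity-based importance
        np.abs(svm.coef_[0]),      # magnitude of linear SVM weights
        np.abs(lasso.coef_),       # magnitude of LASSO coefficients
    )

    votes = np.zeros(X.shape[1], dtype=int)
    for scores in per_model_scores:
        for idx in top_k(scores, k):
            votes[idx] += 1
    return sorted(np.flatnonzero(votes >= min_votes).tolist())


# Synthetic stand-in for a samples-by-metabolites matrix (not study data).
X, y = make_classification(
    n_samples=190, n_features=40, n_informative=6,
    n_redundant=0, shuffle=False, random_state=0,
)
selected = ensemble_select(X, y)
print("features retained by at least two models:", selected)
```

Requiring agreement between models with different inductive biases is one way to reduce the chance that a feature is retained because of a quirk of a single algorithm.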
A total of 288 participants were enrolled and assigned to a discovery cohort (N = 190, 99 PD subjects and 91 healthy controls) and a
validation cohort (N = 98, 44 PD subjects and 54 healthy controls). Table S2 (Supporting information) details the demographics and
clinical characteristics of the recruited participants. The age of PD participants ranged from 50 to 71 years. Control subjects in both
cohorts were between the ages of 47 and 73, with an average age of 57-61. The PD patients of the discovery and validation cohorts had
an average disease duration of 9.40 ± 6.36 and 7.60 ± 3.19 years, respectively. In both PD cohorts, male participants
outnumbered female participants (M/F > 1.2). There is ample evidence that PD prevalence is 1.4-1.5 times higher in men
than in women, although the reasons for this are unknown [21–23].
According to the Hoehn and Yahr (H-Y) scale rating system, the H-Y stages of the patients in both PD groups ranged from 2 to 4 with
an average of 3.05 and 2.64, respectively. A two-tailed Mann-Whitney U-test performed on the discovery and
validation cohorts showed that the gender and age distributions of the groups did not differ significantly (P > 0.05).
LC-MS-based metabolomic and lipid profiling was performed on 190 plasma samples from PD patients and control subjects in
the discovery cohort. A total of 9129 accurate mass-retention time pairs were obtained for metabolites. We analyzed the QC sample repeatedly throughout
the analysis sequence to evaluate the quality of metabolomic data. The data from the QC sample were well clustered in the PCA scores
plot (Figs. S1A-1B in Supporting information). After instrumental analysis, peak alignment, and metabolite identification, 445
metabolites and lipids were identified. A PLS-DA model was developed to identify distinct metabolites in the plasma of PD patients.
The PLS-DA model provided a clear separation between the plasma metabolomes of PD patients and healthy controls (HC) (Fig. 1A). The model was subjected
to a permutation test to ensure that it did not overfit (Fig. S2 in Supporting information). In PD patients, 67 metabolites and lipids were
found to be significantly altered (Fig. 1B, Table S3 in Supporting information). Specifically, these differential metabolites and lipids
comprised 28 glycerophospholipids, 20 organic acids, 6 ceramides, 4 fatty acyls, and 2 sphingolipids. According to the annotations
in the KEGG database, these metabolites map to key metabolic pathways, including alpha-linolenic acid and linoleic acid metabolism,
phenylacetate metabolism, and the citric acid cycle (Fig. 1C). In total, 1840 lipid species were detected by LipidSearch, among
which 399 individual species covering 15 subclasses of lipids were confidently identified after manually checking the results against the
LipidSearch output. Over half of the lipids belonged to the glycerophospholipid (GP) category, including 103 phosphatidylcholines (PCs), 45
lysophosphatidylcholines (LPCs), 18 methyl phosphatidylcholines (MePCs), 24 phosphatidylethanolamines (PEs), 10
lysophosphatidylethanolamines (LPEs), 5 dimethylphosphatidylethanolamines (dMePEs), and 9 phosphatidylinositols (PIs). Approximately
one-fifth of the total lipids belonged to the glycerolipid (GL) category, consisting of 70 triglycerides (TGs) and 15 diglycerides (DGs). Sphingolipids, with 24 ceramides
(Cers), 7 glucosylceramides (CerGs), and 48 sphingomyelins (SMs), comprised nearly 20% of the total 399 lipids. In addition, 4
acylcarnitines (AcCas) and 7 cholesterol esters (ChEs) were identified (Fig. 1D).
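The category proportions quoted above can be checked with a short tally. The sketch uses only the subclass counts enumerated in the text and takes the 399 confidently identified species as the denominator; the grouping of acylcarnitines and cholesterol esters under "Other" and the helper name `category_shares` are assumptions for illustration. Because only the enumerated subclasses are included, the tallies need not add up to 399.

```python
# Subclass counts as listed in the text, grouped by lipid category.
LIPID_COUNTS = {
    "GP": {"PC": 103, "LPC": 45, "MePC": 18, "PE": 24, "LPE": 10, "dMePE": 5, "PI": 9},
    "GL": {"TG": 70, "DG": 15},
    "SP": {"Cer": 24, "CerG": 7, "SM": 48},
    "Other": {"AcCa": 4, "ChE": 7},  # grouping label is an assumption
}
TOTAL_IDENTIFIED = 399  # confidently identified lipid species reported in the text


def category_shares(counts, total):
    """Map each category to (number of species, percentage of the total)."""
    shares = {}
    for category, subclasses in counts.items():
        n = sum(subclasses.values())
        shares[category] = (n, 100.0 * n / total)
    return shares


shares = category_shares(LIPID_COUNTS, TOTAL_IDENTIFIED)
for category, (n, pct) in shares.items():
    print(f"{category}: {n} species ({pct:.1f}% of {TOTAL_IDENTIFIED})")
```

Under these assumptions the tally gives 214 GP species (about 53.6%), 85 GL species (about 21.3%), and 79 sphingolipids (about 19.8%), in line with "over half", "approximately one-fifth", and "nearly 20%".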