A Speech-Based Hybrid Method for Parkinson’s Detection using Pearson Correlation and Mutual Information

Document Type : Original Article

Authors

1 Department of Information Technology, Faculty of Computers and Information, Kafr El-Sheikh University, Egypt

2 Department of Information Technology, Faculty of Computers and Information, Menoufia University, Egypt

3 Department of Computer Science, Faculty of Computers and Information, Kafr El-Sheikh University, Kafr El-Sheikh 33511, Egypt

4 Department of Information Technology, Faculty of Computers and Information

Abstract

Parkinson's disease (PD) is a chronic, progressive neurodegenerative disorder that affects movement. Studies have shown that speech difficulties can appear early in PD, which suggests that speech may serve as an early diagnostic indicator. This study proposes a hybrid approach for PD detection, referred to as PCMI, based on Pearson Correlation (PC) and Mutual Information (MI). The approach combines PC and MI to identify the relevant features in speech signals and uses these features to train five machine learning models: XGBoost, GBoost, CatBoost, AdaBoost, and LightGBM. Two datasets obtained from the UCI repository were used for evaluation. To address class imbalance in the datasets, the synthetic minority oversampling technique (SMOTE) was applied to obtain a more balanced class distribution. PCMI selected 10 features from dataset1 and 55 features from dataset2. For dataset1, CatBoost with SMOTE and PCMI achieved an accuracy of 97.3% with a 75:25 hold-out split and 97.2% with 10-fold cross-validation (CV). For dataset2, LightGBM with SMOTE and PCMI achieved an accuracy of 95.6% with a 60:40 hold-out split and 97.6% with 10-fold CV.
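
The abstract does not state how the PC and MI scores are combined, so the following Python sketch is only one plausible reading of a PC-plus-MI feature ranking, not the authors' method. It assumes a binary label, a simple histogram-based MI estimate, and a hypothetical combination rule (the mean of the min-max-normalized scores). The function names `pearson_scores`, `mutual_info_scores`, and `pcmi_select`, the `bins` parameter, and the synthetic data are all illustrative assumptions.

```python
import numpy as np


def pearson_scores(X, y):
    """Absolute Pearson correlation between each feature column and the label."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    denom = np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum())
    denom[denom == 0] = np.inf  # constant columns get a score of 0
    return np.abs(Xc.T @ yc) / denom


def mutual_info_scores(X, y, bins=10):
    """Histogram-based MI estimate (in nats) between each feature and the label."""
    classes = np.unique(y)
    scores = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        edges = np.histogram_bin_edges(X[:, j], bins=bins)
        xb = np.digitize(X[:, j], edges[1:-1])  # bin index in 0..bins-1
        joint = np.zeros((bins, len(classes)))
        for k, c in enumerate(classes):
            joint[:, k] = np.bincount(xb[y == c], minlength=bins)
        joint /= joint.sum()
        px = joint.sum(axis=1, keepdims=True)
        py = joint.sum(axis=0, keepdims=True)
        nz = joint > 0
        scores[j] = (joint[nz] * np.log(joint[nz] / (px @ py)[nz])).sum()
    return scores


def pcmi_select(X, y, k):
    """Assumed rule: average the normalized PC and MI scores and keep the top k."""
    def norm(s):
        span = s.max() - s.min()
        return (s - s.min()) / span if span > 0 else np.zeros_like(s)

    combined = 0.5 * norm(pearson_scores(X, y)) + 0.5 * norm(mutual_info_scores(X, y))
    return np.sort(np.argsort(combined)[::-1][:k])


# Synthetic stand-in for speech features: only columns 2 and 4 depend on the label.
rng = np.random.default_rng(0)
y = rng.integers(0, 2, size=400)
X = rng.normal(size=(400, 6))
X[:, 2] += 2.0 * y
X[:, 4] -= 1.5 * y

selected = pcmi_select(X, y, k=2)
print(selected)
```

In a fuller pipeline, the selected columns would then be oversampled and passed to a boosting classifier, for example with `SMOTE` from the imbalanced-learn package followed by CatBoost or LightGBM. Those steps are omitted here to keep the sketch dependent on NumPy only.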

Keywords