TY - JOUR
T1 - Enhancing the prediction of type 2 diabetes mellitus using sparse balanced SVM
AU - Shrestha, Bibek
AU - Alsadoon, Abeer
AU - Prasad, P. W. C.
AU - Al-Naymat, Ghazi
AU - Al-Dala’in, Thair
AU - Rashid, Tarik A.
AU - Alsadoon, Omar Hisham
PY - 2022
Y1 - 2022
AB - Natural population-based prediction of type 2 diabetes is costly because it requires substantial resources. Although many studies have used machine learning algorithms to predict type 2 diabetes, they have not achieved sufficient sensitivity because the data are imbalanced and sparse. This research aims to use noninvasive features from electronic health records with a machine learning algorithm, namely Sparse Balance-Support Vector Machine (SB-SVM), to handle the imbalanced data and achieve high precision. The proposed system uses SB-SVM to create sparsity and to implicitly select the most relevant features from the imbalanced data. First, we preprocess the data using different baseline variables and filters. Second, different features are extracted from the preprocessed data using inclusion and exclusion criteria as filters. Third, we select 12 features highly relevant to diabetes prediction using statistical analysis and logistic regression. Then, we train and test the proposed model using nested stratified cross-validation. Finally, the optimal model's performance is evaluated on the test set. The proposed model predicts type 2 diabetes mellitus using the noninvasive features, with enhanced sensitivity and less processing time. Our solution outperforms the state of the art in most performance metrics. The accuracy, precision, recall, and area under the curve (AUC) of the best existing solution are 67.22%, 62.93%, 69.96%, and 69.96%, respectively. In comparison, our solution achieves accuracy, precision, recall, and AUC of 76.39%, 66.86%, 76.74%, and 85.08%, respectively. The average processing time decreases from 40 ~ 85 folds/sec to 8.9 ~ 10.7 folds/sec. In conclusion, the proposed system improves the precision and sensitivity of diabetes prediction with minimal processing time.
UR - https://hdl.handle.net/1959.7/uws:77127
U2 - 10.1007/s11042-022-13087-5
DO - 10.1007/s11042-022-13087-5
M3 - Article
SN - 1380-7501
VL - 81
SP - 38945
EP - 38969
JO - Multimedia Tools and Applications
JF - Multimedia Tools and Applications
IS - 27
ER -