Prediction of complications of type 2 Diabetes: A Machine learning approach
Romeo L.; Frontoni E.
2022-01-01
Abstract
Aim: To construct predictive models of diabetes complications (DCs) using big-data machine learning applied to electronic medical records. Methods: Six groups of DCs were considered: eye complications; cardiovascular, cerebrovascular, and peripheral vascular disease; nephropathy; and diabetic neuropathy. A supervised, tree-based learning approach (XGBoost) was used to predict the onset of each complication within 5 years (task 1). In addition, early (within 2 years) and late (3–5 years) onset of each complication were predicted separately (task 2). The dataset comprised 147,664 patients seen over 15 years at 23 centers. External validation was performed in five additional centers. Models were evaluated in terms of accuracy, sensitivity, specificity, and area under the ROC curve (AUC). Results: For all DCs considered, the predictive models in task 1 showed an accuracy > 70 %, and the AUC substantially exceeded 0.80, reaching 0.97 for nephropathy. For task 2, all predictive models showed an accuracy > 70 % and an AUC > 0.85. Sensitivity in predicting the early occurrence of a complication ranged from 83.2 % (peripheral vascular disease) to 88.5 % (nephropathy). Conclusions: A machine learning approach offers the opportunity to identify patients at greater risk of complications. This can help overcome clinical inertia and improve the quality of diabetes care.
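The four evaluation measures named in the Methods (accuracy, sensitivity, specificity, AUC) can be illustrated with a minimal Python sketch. This is not the authors' code: the function names `confusion_counts`, `roc_auc`, and `evaluate` are hypothetical, the 0.5 decision threshold is an assumption, and the scores stand in for the predicted probabilities that a classifier such as XGBoost would output for one complication.

```python
from typing import Dict, Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (TP, FP, TN, FN) for binary labels, where 1 means 'complication occurred'."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outscores a random negative.

    Ties count as half a win. This pairwise form is quadratic in the number of
    patients, so it suits small illustrations only.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def evaluate(y_true: Sequence[int], scores: Sequence[float], threshold: float = 0.5) -> Dict[str, float]:
    """Compute accuracy, sensitivity, specificity, and AUC from predicted scores.

    The threshold (an assumption here) turns scores into hard predictions;
    AUC is threshold-free and uses the raw scores.
    """
    auc = roc_auc(y_true, scores)  # also guarantees both classes are present
    y_pred = [1 if s >= threshold else 0 for s in scores]
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": tp / (tp + fn),  # share of true complications detected
        "specificity": tn / (tn + fp),  # share of complication-free patients cleared
        "auc": auc,
    }
```

Calling `evaluate(labels, scores)` once per complication group, and once per onset window for task 2, would yield one set of the four measures per model. Sensitivity and specificity depend on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds.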