Prediction of complications of type 2 diabetes: A machine learning approach

Romeo L.; Frontoni E.
2022-01-01

Abstract

Aim: To construct predictive models of diabetes complications (DCs) by applying machine learning to large-scale electronic medical record data. Methods: Six groups of DCs were considered: eye complications, cardiovascular disease, cerebrovascular disease, peripheral vascular disease, nephropathy, and diabetic neuropathy. A supervised, tree-based learning approach (XGBoost) was used to predict the onset of each complication within 5 years (task 1). In addition, early (within 2 years) and late (3–5 years) onset of each complication were predicted separately (task 2). The dataset comprised 147,664 patients seen over 15 years at 23 centers. External validation was performed in five additional centers. Models were evaluated in terms of accuracy, sensitivity, specificity, and area under the ROC curve (AUC). Results: For all DCs considered, the predictive models in task 1 showed an accuracy > 70%, and the AUC largely exceeded 0.80, reaching 0.97 for nephropathy. For task 2, all predictive models showed an accuracy > 70% and an AUC > 0.85. Sensitivity in predicting the early occurrence of a complication ranged from 83.2% (peripheral vascular disease) to 88.5% (nephropathy). Conclusions: A machine learning approach offers the opportunity to identify patients at greater risk of complications. This can help overcome clinical inertia and improve the quality of diabetes care.
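The four evaluation measures named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the labels and scores below are made-up toy values, the 0.5 decision threshold is an assumption, and the function names are hypothetical. The AUC is computed through its rank interpretation, that is, the probability that a randomly chosen patient with the complication receives a higher score than a randomly chosen patient without it.

```python
from typing import Sequence, Tuple


def classification_metrics(
    y_true: Sequence[int], y_pred: Sequence[int]
) -> Tuple[float, float, float]:
    """Return (accuracy, sensitivity, specificity) for binary labels 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative case")
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    sensitivity = tp / (tp + fn)  # share of true complications detected
    specificity = tn / (tn + fp)  # share of complication-free patients cleared
    return accuracy, sensitivity, specificity


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy example (hypothetical values, not data from the study).
toy_labels = [1, 1, 1, 0, 0, 0, 0, 1]
toy_scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.2, 0.7]
THRESHOLD = 0.5  # assumed cut-off for turning a score into a yes/no prediction
toy_preds = [1 if s >= THRESHOLD else 0 for s in toy_scores]

acc, sens, spec = classification_metrics(toy_labels, toy_preds)
auc = roc_auc(toy_labels, toy_scores)
print(f"accuracy={acc:.3f} sensitivity={sens:.3f} "
      f"specificity={spec:.3f} AUC={auc:.4f}")
```

Accuracy, sensitivity, and specificity depend on the chosen threshold, whereas the AUC summarizes ranking quality across all thresholds, which is why a model can combine a moderate accuracy with a high AUC.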
2022
Elsevier Ireland Ltd
International

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11393/301733
Warning: the data displayed have not been validated by the university.

Citations
  • PMC N/A
  • Scopus 24
  • ISI (Web of Science) 10