Diabetic Retinopathy (DR) is the most common and insidious microvascular complication of diabetes, and can progress asymptomatically until a sudden loss of vision occurs. Although DR is prevalent nowadays, its prevention remains challenging. The multiple aim of this study was to predict the risk of developing DR as diabetic complication (task 1) and, subsequently, temporally stratify the DR risk (task 2) using electronic health records data. To perform these objectives, a novel preprocessing procedure was designed to select both control and pathological patients, and moreover, a novel fully annotated/standardized 120K dataset from multiple diabetologic centers was provided. Globally, although the Extreme Gradient Boosting model offers satisfying predictive performance, the Random Forest model obtained the best predictive performance to solve task 1 and task 2, reaching the best Area Under the Precision-Recall Curve of 72.43 % and 84.38 %, respectively. Also the features importance extracted from the best Machine Learning (ML) models is provided. The proposed Artificial Intelligence-based solution was proven to be capable of generalizing across different diabetologic centers while ensuring high-interpretability. Moreover, the proposed ML solution is currently being adopted as a Clinical Decision Support System in several diabetologic centers for DR screening and follow-up purposes.

A Clinical Decision Support System to Stratify the Temporal Risk of Diabetic Retinopathy

Romeo L.;Frontoni E.
2021-01-01

Abstract

Diabetic Retinopathy (DR) is the most common and insidious microvascular complication of diabetes, and can progress asymptomatically until a sudden loss of vision occurs. Although DR is prevalent nowadays, its prevention remains challenging. The multiple aim of this study was to predict the risk of developing DR as diabetic complication (task 1) and, subsequently, temporally stratify the DR risk (task 2) using electronic health records data. To perform these objectives, a novel preprocessing procedure was designed to select both control and pathological patients, and moreover, a novel fully annotated/standardized 120K dataset from multiple diabetologic centers was provided. Globally, although the Extreme Gradient Boosting model offers satisfying predictive performance, the Random Forest model obtained the best predictive performance to solve task 1 and task 2, reaching the best Area Under the Precision-Recall Curve of 72.43 % and 84.38 %, respectively. Also the features importance extracted from the best Machine Learning (ML) models is provided. The proposed Artificial Intelligence-based solution was proven to be capable of generalizing across different diabetologic centers while ensuring high-interpretability. Moreover, the proposed ML solution is currently being adopted as a Clinical Decision Support System in several diabetologic centers for DR screening and follow-up purposes.
2021
Institute of Electrical and Electronics Engineers Inc.
Internazionale
File in questo prodotto:
Non ci sono file associati a questo prodotto.

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11393/316552
 Attenzione

Attenzione! I dati visualizzati non sono stati sottoposti a validazione da parte dell'ateneo

Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 7
  • ???jsp.display-item.citation.isi??? 4
social impact