Fig. 5

Top 20 most important features for diabetes identified by the random forest (RF) model, ranked by Gini coefficients. Dark blue and orange columns represent RF models built on similar samples selected by CCS-based similarity and on randomly selected samples, respectively.