Table 1 The basic characteristics of samples in the test set and training set

From: Measurement and application of patient similarity in personalized predictive modeling based on electronic medical records

| Characteristic | Test set (n = 6490) | Training set (n = 10,000) | P value# |
|---|---|---|---|
| Male gender, n (%) | 4387 (67.6%) | 6838 (68.4%) | 0.282 |
| Age (years), mean ± SD | 60.1 ± 14.7 | 60.1 ± 15.0 | 0.967 |
| Myocardial infarction, n (%) | 443 (6.8%) | 656 (6.6%) | 0.615 |
| Congestive heart failure, n (%) | 507 (7.8%) | 795 (8.0%) | 0.642 |
| Chronic obstructive pulmonary disease, n (%) | 288 (4.4%) | 467 (4.7%) | 0.368 |
| Mild liver disease, n (%) | 799 (12.3%) | 1301 (13.0%) | 0.188 |
| Hypertension, n (%) | 3501 (53.9%) | 5389 (53.9%) | 0.950 |
| Coronary heart disease, n (%) | 2206 (34.0%) | 3331 (33.3%) | 0.366 |
| Serum glucose (mmol/L), mean ± SD | 6.6 ± 2.9 | 6.7 ± 2.9 | 0.793 |
| Abnormal urine glucose, n (%) | 1222 (18.8%) | 1884 (18.8%) | 0.987 |
# Pearson's χ² test for nominal variables and t-test for continuous (scale) variables.
SD: standard deviation.
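
The footnote to Table 1 names Pearson's χ² test as the method used for the nominal characteristics. As a rough illustration of how such a P value can be obtained from the published counts alone, the sketch below applies the uncorrected Pearson χ² statistic to the "Male gender" row. The function name `pearson_chi2_2x2` is hypothetical, and the choice to omit a continuity correction is an assumption, since the table does not state the software or settings the authors used. The result should therefore land near the reported 0.282 without necessarily matching it.

```python
import math


def pearson_chi2_2x2(a, b, c, d):
    """Uncorrected Pearson chi-square test for the 2x2 table [[a, b], [c, d]].

    Returns (chi2, p_value) with 1 degree of freedom.
    Assumption: no continuity correction is applied.
    """
    n = a + b + c + d
    # Shortcut formula for a 2x2 table:
    # chi2 = N * (ad - bc)^2 / (row1 * row2 * col1 * col2)
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Counts taken from the "Male gender" row of Table 1.
test_male, test_total = 4387, 6490
train_male, train_total = 6838, 10000

chi2, p = pearson_chi2_2x2(
    test_male, test_total - test_male,
    train_male, train_total - train_male,
)
print(f"chi2 = {chi2:.3f}, p = {p:.3f}")
```

The same function can be applied to any other "n (%)" row by substituting its counts. The "mean ± SD" rows cannot be checked as closely this way, because the published means and standard deviations are rounded.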