Abstract
With severe acute respiratory syndrome coronavirus 2 spreading globally and causing coronavirus disease 2019 (COVID-19), a challenge for which we were unprepared was how to optimally plan and distribute limited top-tier medical resources for patients in need of urgent care. To address this challenge, physicians urgently needed a scientific tool to methodically differentiate between cases of varying severity. In this study, the unique data of COVID-19 intensive care unit (ICU) patients provided by the national medical team in Wuhan were classified into discrete and continuous variable types. All continuous data were discretized using an entropy-based method and transformed into serial information margins, in which each information margin is related to a specific symptom or clinical meaning. Finally, all these native and processed discrete data were used to configure a readable scorecard through logistic regression, which is the scientific tool described above. A total of 322 ICU patients (age: median 64, interquartile range 54–75; males: 178 [55.28%]; deaths: 72 [22.36%]) were included in the study. The probability of mortality in COVID-19 patients can be evaluated using the scorecard model (calibration slope = 1.343, Brier score = 0.048, Dxy = 0.972, and population stability index = 0.071), with the desired model performance (accuracy = 0.948, area under the curve = 0.99, sensitivity = 1, and specificity = 0.939). Unlike existing machine learning methods, which operate through a black-box mechanism, this new model can interpret clinical meaning from complex data. This new data-information model answers the critical question of how a computing algorithm produces clinically meaningful results that will help physicians logically allocate medical resources for COVID-19 patients. Notably, this tool has limitations, given that this research is a retrospective study. Hopefully, this tool will be tested further and optimized for adaptation to similar clinical cases in the future.
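To make the entropy-based discretization step concrete, the following is a minimal sketch of one common variant: choosing a single cut point for a continuous variable by maximizing information gain against a binary outcome. The abstract does not specify which entropy-based method was used, so this is an illustrative assumption and not necessarily the authors' procedure. The function names (entropy, best_cut_point) and the toy values are hypothetical and are not taken from the study data.

```python
import math


def entropy(labels):
    """Shannon entropy (in bits) of a list of binary (0/1) outcome labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    if p in (0.0, 1.0):
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


def best_cut_point(values, labels):
    """Return (cut, gain) for the threshold that maximizes information gain.

    Candidate cuts are midpoints between consecutive distinct sorted values.
    Returns (None, 0.0) if no split reduces the entropy.
    """
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    base = entropy([y for _, y in pairs])
    best_cut, best_gain = None, 0.0
    for i in range(1, n):
        lo, hi = pairs[i - 1][0], pairs[i][0]
        if lo == hi:
            continue  # no threshold can separate identical values
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        # Weighted entropy of the two bins produced by this cut
        cond = (len(left) * entropy(left) + len(right) * entropy(right)) / n
        gain = base - cond
        if gain > best_gain:
            best_cut, best_gain = (lo + hi) / 2, gain
    return best_cut, best_gain


# Synthetic toy data (NOT from the study): age and a 0/1 mortality flag
ages = [40, 45, 50, 55, 70, 75, 80, 85]
died = [0, 0, 0, 0, 1, 1, 1, 1]

cut, gain = best_cut_point(ages, died)
print(cut, gain)  # 62.5 1.0
```

Applying such a search recursively within each resulting bin, with a stopping rule, would yield multiple intervals per variable; each interval could then enter a logistic regression as a discrete category, which is the general shape of the scorecard construction described above.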