Ensemble learning for poor prognosis predictions: a case study on SARS-CoV2

Authors: Wu, Honghan; Zhang, Huayu; Karwath, Andreas; Ibrahim, Zina; Shi, Ting; Zhang, Xin; Wang, Kun; Sun, Jiaxing; Dhaliwal, Kevin; Bean, Daniel; Cardoso, Victor Roth; Li, Kezhi; Teo, James T.; Banerjee, Amitava; Gao-Smith, Fang; Whitehouse, Tony; Veenith, Tonny; Gkoutos, Georgios V.; Wu, Xiaodong; Dobson, Richard; Guthrie, Bruce

Source: JAMIA: A Scholarly Journal of Informatics in Health and Biomedicine

Full text: https://doi.org/10.1093/jamia/ocaa295

Abstract

OBJECTIVE: Risk prediction models are widely used to inform evidence-based clinical decision making. However, few models developed from single cohorts perform consistently well at the population level, where diverse prognoses exist (as in the SARS-CoV2 pandemic). This study aims to tackle this challenge by synergising prediction models from the literature using ensemble learning.

MATERIALS AND METHODS: In this study we selected and reimplemented seven prediction models for COVID-19, which were derived from diverse cohorts and used different implementation techniques. A novel ensemble learning framework was proposed to synergise them and realise personalised predictions for individual patients. Four diverse international cohorts (2 from the UK and 2 from China; total N=5,394) were used to validate all eight models (the seven individual models and the ensemble) on discrimination, calibration and clinical usefulness.

RESULTS: Individual prediction models performed well on some cohorts but poorly on others. Conversely, the ensemble model consistently achieved the best performance on all metrics quantifying discrimination, calibration and clinical usefulness. Performance disparities were observed between cohorts from the two countries: all models performed better on the China cohorts.

DISCUSSION: When individual models are learned from complementary cohorts, the synergised model has the potential to achieve synergised performance. Results indicate that blood parameters and physiological measurements might have better predictive power when collected early, which remains to be confirmed by further studies.

CONCLUSIONS: By combining a diverse set of individual prediction models, an ensemble method can synergise a robust and well-performing model by choosing the most competent ones for individual patients.
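The idea of favouring the most competent models for each patient can be illustrated with a minimal sketch. This is not the framework from the paper, whose details are not given in this record. It assumes, purely for illustration, that a model's competence for a patient is judged by how close the patient's features are to summary statistics of that model's derivation cohort. The names `CohortModel`, `competence` and `ensemble_risk`, the Gaussian-style weighting, and the toy models are all hypothetical.

```python
import math
from dataclasses import dataclass
from typing import Callable, Dict, List

Patient = Dict[str, float]


@dataclass
class CohortModel:
    """A prediction model plus a summary of the cohort it was derived from.

    Hypothetical structure for illustration only.
    """
    name: str
    predict: Callable[[Patient], float]  # returns a risk in [0, 1]
    cohort_mean: Dict[str, float]        # per-feature mean in the derivation cohort
    cohort_sd: Dict[str, float]          # per-feature standard deviation


def competence(model: CohortModel, patient: Patient) -> float:
    """Assumed competence score: high when the patient resembles the cohort."""
    shared = [f for f in model.cohort_mean if f in patient]
    if not shared:
        return 0.0
    # Mean squared z-score distance over the features both sides have.
    d2 = sum(
        ((patient[f] - model.cohort_mean[f]) / model.cohort_sd[f]) ** 2
        for f in shared
    ) / len(shared)
    return math.exp(-0.5 * d2)


def ensemble_risk(models: List[CohortModel], patient: Patient) -> float:
    """Competence-weighted average of the individual models' risks."""
    weights = [competence(m, patient) for m in models]
    total = sum(weights)
    if total == 0.0:
        # No model looks competent: fall back to a plain average.
        return sum(m.predict(patient) for m in models) / len(models)
    return sum(w * m.predict(patient) for w, m in zip(weights, models)) / total


# Two toy models derived from cohorts with different age profiles.
younger = CohortModel("younger-cohort", lambda p: 0.2, {"age": 50.0}, {"age": 10.0})
older = CohortModel("older-cohort", lambda p: 0.8, {"age": 80.0}, {"age": 10.0})

risk_at_50 = ensemble_risk([younger, older], {"age": 50.0})
risk_at_65 = ensemble_risk([younger, older], {"age": 65.0})
```

In this sketch, a patient who matches one derivation cohort closely receives a prediction dominated by that cohort's model, while a patient equidistant from both cohorts receives the plain average. A real implementation would need a validated competence measure and recalibration on the target population.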

Rights

This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.

Cite as

Wu, H., Zhang, H., Karwath, A., Ibrahim, Z., Shi, T., Zhang, X., Wang, K., Sun, J., Dhaliwal, K., Bean, D., Cardoso, V., Li, K., Teo, J., Banerjee, A., Gao-Smith, F., Whitehouse, T., Veenith, T., Gkoutos, G., Wu, X., Dobson, R. & Guthrie, B. 2020, 'Ensemble learning for poor prognosis predictions: a case study on SARS-CoV2', JAMIA: A Scholarly Journal of Informatics in Health and Biomedicine, 28(4), pp. 791-800. https://doi.org/10.1093/jamia/ocaa295

Identifiers

DOI: https://doi.org/10.1093/jamia/ocaa295

Repository URI: https://www.research.ed.ac.uk/en/publications/3f2e8045-5d12-4423-9bc7-53d5715cccb1