Readmission Prediction for Patients with Chronic Obstructive Pulmonary Disease Based on Machine Learning

Wu Juhua; Zheng Wen; Nie Ya; Tao Lei

doi:10.12052/gdutxb.230206

Abstract

Because chronic obstructive pulmonary disease (COPD) has a high recurrence rate, unplanned readmission of patients has become a serious challenge. This paper proposes a framework and method for risk prediction that integrates differently structured data with multiple machine learning algorithms, and demonstrates it on real electronic medical records from nearly 10 000 COPD patients at a Grade-A tertiary hospital in Guangzhou, China. A Bidirectional Long Short-Term Memory-Conditional Random Field (BiLSTM-CRF) named entity recognition model is built to process the unstructured information, and risk prediction models are built with Support Vector Machine (SVM), Random Forest (RF), Extreme Gradient Boosting (XGBoost), and Back Propagation (BP) neural network algorithms. The XGBoost model achieves the best predictive performance, and length of hospital stay, Charlson Comorbidity Index, disease duration, white blood cell count, and eosinophil count are identified as the most important factors influencing readmission. The study enriches knowledge about COPD and provides research ideas and a decision support tool for its early detection, timely diagnosis, and precise intervention.
