Abstract
Background
Heart failure (HF) is a leading cause of cardiac morbidity among women, whose risk
factors differ from those in men. We used machine-learning approaches to develop
risk-prediction models for incident HF in a cohort of postmenopausal women from the Women’s
Health Initiative (WHI).
Methods
We used 2 machine-learning methods—Least Absolute Shrinkage and Selection Operator
(LASSO) and Classification and Regression Trees (CART)—to perform variable selection
on 1227 baseline WHI variables for the primary outcome of incident HF. The variables
selected by each method were then used to construct separate Cox proportional hazards
models. Using receiver-operating characteristic (ROC) curve analysis, we compared the
discrimination of these models against a comparator model built using variables from the
Atherosclerosis Risk in Communities (ARIC) HF prediction model. We analyzed 43,709 women
who had 2222 incident HF events; median follow-up was 14.3 years.
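The variable-selection step can be illustrated with a minimal sketch of how an L1 (LASSO) penalty drives the coefficients of uninformative variables to exactly zero. The abstract does not state which software or which LASSO formulation was used, so this is an assumption-laden simplification: it uses a least-squares objective solved by cyclic coordinate descent on synthetic data, not the WHI data or the authors' pipeline. All function and variable names (`soft_threshold`, `lasso_coordinate_descent`, `lam`) are hypothetical.

```python
import random

def soft_threshold(z, lam):
    """Shrink z toward zero by lam; anything within [-lam, lam] becomes exactly 0."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0

def standardise(col):
    """Rescale a column to mean 0 and (population) variance 1."""
    n = len(col)
    mean = sum(col) / n
    sd = (sum((v - mean) ** 2 for v in col) / n) ** 0.5
    return [(v - mean) / sd for v in col]

def lasso_coordinate_descent(columns, y, lam, n_iter=100):
    """Minimise (1/2n)*||y - X b||^2 + lam*||b||_1 by cyclic coordinate descent.

    `columns` is a list of predictor columns; y is assumed to be centred.
    """
    n, p = len(y), len(columns)
    beta = [0.0] * p
    residual = list(y)  # equals y - X @ beta, and beta starts at zero
    for _ in range(n_iter):
        for j in range(p):
            col = columns[j]
            # Covariance of column j with the residual that ignores j's own term
            rho = sum(col[i] * (residual[i] + col[i] * beta[j]) for i in range(n)) / n
            norm = sum(c * c for c in col) / n
            new_beta = soft_threshold(rho, lam) / norm
            delta = new_beta - beta[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= col[i] * delta
            beta[j] = new_beta
    return beta

# Synthetic data (an assumption for illustration): 8 candidate predictors,
# of which only the first two actually influence the outcome.
rng = random.Random(0)
n, p = 200, 8
columns = [standardise([rng.gauss(0, 1) for _ in range(n)]) for _ in range(p)]
noise = [rng.gauss(0, 0.5) for _ in range(n)]
y_raw = [2.0 * columns[0][i] - 1.5 * columns[1][i] + noise[i] for i in range(n)]
y_mean = sum(y_raw) / n
y = [v - y_mean for v in y_raw]

beta = lasso_coordinate_descent(columns, y, lam=0.2)
selected = [j for j, b in enumerate(beta) if b != 0.0]
print("selected predictor indices:", selected)
```

In a workflow like the one described, the predictors that survive the penalty would then be passed to a separate, unpenalised model (here, a Cox proportional hazards model). The penalty strength `lam` is typically chosen by cross-validation, which this sketch omits.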
Results
LASSO selected 10 predictors, and CART selected 11 predictors. The highest correlation
between selected variables was 0.46. In addition to selecting well-established predictors
such as age, myocardial infarction, and smoking, novel predictors included physical
function, number of pregnancies, number of previous live births, and age at menopause.
In ROC analysis, the CART-derived model had the highest C-statistic, at 0.83 (95% confidence
interval [CI], 0.81-0.85), followed by the LASSO-derived model at 0.82 (95% CI, 0.81-0.84)
and the ARIC model at 0.73 (95% CI, 0.70-0.76).
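For readers unfamiliar with the C-statistic, the sketch below shows one common way to compute a concordance measure for time-to-event data. The abstract does not say which estimator was used, so Harrell's concordance index is an assumption here, and the five-subject data set is invented purely for illustration. The function name `harrell_c` is hypothetical.

```python
def harrell_c(times, events, risk_scores):
    """Harrell's concordance index for right-censored survival data.

    A pair is comparable when the subject with the shorter follow-up time
    had an event. It is concordant when that subject also has the higher
    predicted risk; ties in predicted risk count as half.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored subject cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable

# Toy data (invented): follow-up in years, event flag (1 = incident HF,
# 0 = censored), and a model's predicted risk for each subject.
times = [2, 4, 6, 8, 10]
events = [1, 1, 0, 1, 0]
risk_scores = [0.9, 0.4, 0.5, 0.3, 0.1]

c_index = harrell_c(times, events, risk_scores)
print(c_index)  # 7 of the 8 comparable pairs are concordant -> 0.875
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking, which gives a scale for reading the reported values of 0.83, 0.82, and 0.73.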
Conclusions
Machine-learning approaches can be used to develop HF risk-prediction models that
can discriminate better than an established HF risk model and may provide a basis
for investigating novel HF predictors.
Article info
Publication history
Published online: August 12, 2021
Accepted: August 4, 2021
Received: October 8, 2019
Footnotes
See page 1713 for disclosure information.
Copyright
© 2021 Canadian Cardiovascular Society. Published by Elsevier Inc. All rights reserved.