Table 3

Comparison of regression and machine learning approaches to clinical prediction

Regression methods | Machine learning methods
Informed by assumptions, background knowledge and theory. | Exploratory and data-driven; learn automatically from data.
Typically use a small number of variables to predict the probability of an outcome. | May be more suited to handling a large number of predictors in data with a high signal-to-noise ratio.
Mainly linear effects of variables on the outcome. | More flexible; capture non-linear associations and interactions between variables, but require strategies to reduce overfitting.
Provide clinically informative relationships between variables and outcome, allowing, for example, consideration of counterfactuals. | Limited clinical interpretability; ‘black-box’ algorithms may lack face validity for clinicians, especially with a large number of unintuitive predictors.
Results are often presented simply for the end-user, for example, by conversion to a score. | Transparent presentation of results is difficult.
Model updating can be undertaken for use in populations with different baseline risk. | Testing calibration and updating to a new baseline risk are difficult for many models.
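
To illustrate what updating a regression model for a population with a different baseline risk can involve, the following is a minimal sketch. It assumes the simplest form of updating, in which only the model intercept is shifted so that the mean predicted risk matches the event rate observed in the new population. The table does not specify a method; the function names and example risks are hypothetical.

```python
import math


def logit(p):
    """Convert a probability to the log-odds scale."""
    return math.log(p / (1.0 - p))


def expit(x):
    """Convert log-odds back to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def intercept_shift(predicted_risks, observed_event_rate, tol=1e-10):
    """Find the intercept shift (on the log-odds scale) that makes the
    mean updated risk equal the observed event rate.

    Assumption: only the baseline risk differs between populations, so
    predictor effects are left unchanged. Solved by bisection, since the
    mean updated risk increases monotonically with the shift.
    """
    linear_predictors = [logit(p) for p in predicted_risks]
    lo, hi = -20.0, 20.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        mean_risk = sum(expit(lp + mid) for lp in linear_predictors) / len(linear_predictors)
        if mean_risk < observed_event_rate:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def update_risks(predicted_risks, shift):
    """Apply the intercept shift to each predicted risk."""
    return [expit(logit(p) + shift) for p in predicted_risks]


# Hypothetical example: the original model under-predicts in a higher-risk population.
original_risks = [0.05, 0.10, 0.20, 0.40]
shift = intercept_shift(original_risks, observed_event_rate=0.30)
updated_risks = update_risks(original_risks, shift)
```

Because the shift is applied on the log-odds scale, the ranking of patients by risk is unchanged; only the overall level of predicted risk moves. Updating predictor effects as well would require refitting with outcome data from the new population.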