Resource scarcity is an inherent aspect of any healthcare system. People, time, facilities, equipment and money can potentially be deployed across and within several different programmes, interventions and strategies, but ultimately only a few options can become reality. In deciding how these limited resources are put to best use, allocative efficiency—achieving maximum benefits with available inputs—should be an important guiding principle.
Health economic evaluation has been developed as a methodology to inform policymakers, payers and others on how to make efficient allocation decisions over competing healthcare interventions or programmes.1 ,2 Rather than dictating and prescribing particular decisions, it aims to establish an economic evidence base for discussions. Alongside other relevant considerations, such as effectiveness, severity of illness, patient preferences and equity, these discussions should consider ‘opportunity costs’ as well: the health gains given up by not investing the same resources in other programmes.
One area where health economic evaluation has become essential and undergone significant development in recent years is in the context of reimbursement decisions. Several countries now routinely require evidence that an intervention offers good value for money for insurance coverage (eg, Australia, Austria, England, Finland and France). Such evidence from economic evaluations has become a key predictor of actual funding decisions.3–5 The term ‘fourth hurdle’ is sometimes used to refer to the need for an intervention to be cost-effective in order to qualify for funding, after having demonstrated safety, quality and efficacy.6
In this paper, we provide a brief introduction to one particular but widely used form of economic evaluation, cost-utility analysis (CUA) and its application to mental healthcare.
What is an economic evaluation?
Health economic evaluation is a ‘container concept’, embodying a broad range of different approaches. First, there are partial evaluations, which provide information on the cost implications of illnesses and interventions, albeit not from an efficiency perspective. For instance, cost-of-illness studies estimate the total costs attributable to a disease, or how much an average patient costs the healthcare system or society. Cost comparisons compare interventions in terms of their cost impact, without considering the health effects of the alternatives being compared. These partial evaluations can be informative about how to minimise costs, but they do not assess the value that is added for the money invested. They are therefore not informative about how to allocate resources efficiently. Second, there are full economic evaluations, which compare the costs and effects of two (or more) competing options, often a new intervention versus an already implemented alternative. The question these analyses answer is that of efficiency: what is the extra cost of one intervention over another, and what gains would the funding body or society get in return for that extra cost?
Full economic evaluations have four possible outcomes, which are represented by the quadrants in figure 1. Intervention A can be more costly and lead to lower health gains than B (quadrant IV). In that case A is said to be ‘dominated’ by B. Vice versa, if A is less expensive, but leads to better outcomes, it is said to ‘dominate’ B (quadrant II). More difficult questions arise when one intervention is more expensive and more effective than the other (quadrant I). Here, we need to judge whether paying more for better outcomes is ‘worth it’. Similarly, if an intervention is less costly but also less effective than the alternative with which it is compared, are the cost savings worth the health losses (quadrant III)?
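The four outcomes can be written as a simple classification of the incremental cost and incremental effect of A versus B. The sketch below is illustrative only: the function name is hypothetical, and the quadrant numbering simply follows the description in the text.

```python
def classify_quadrant(delta_cost: float, delta_effect: float) -> str:
    """Classify intervention A versus B on the cost-effectiveness plane.

    delta_cost   = cost(A) - cost(B)
    delta_effect = effect(A) - effect(B)
    Quadrant numbering follows the text (an assumption about figure 1).
    """
    if delta_cost > 0 and delta_effect > 0:
        return "I: A is more costly and more effective (trade-off)"
    if delta_cost < 0 and delta_effect > 0:
        return "II: A dominates B"
    if delta_cost < 0 and delta_effect < 0:
        return "III: A is less costly and less effective (trade-off)"
    if delta_cost > 0 and delta_effect < 0:
        return "IV: A is dominated by B"
    # At least one difference is exactly zero.
    return "on an axis: no difference in cost or in effect"
```

Only quadrants I and III require a value judgement; in quadrants II and IV the choice follows directly from the comparison.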
There are three types of full economic evaluation: cost-effectiveness analysis (CEA), CUA and cost-benefit analysis (CBA). All three are identical in their approach to capturing costs; however, they differ in how they assess health effects. Depending on the particular type of information that is needed for a given decision, one method will be more appropriate than the others.
CEA compares the economic costs and health effects of two or more interventions. Health outcomes in CEA are expressed in terms of specific clinical, patient-centred or other ‘natural’ end points that are considered important within a particular clinical or health domain. For example, in the field of communicable disease prevention, these outcomes can be ‘number of children vaccinated’, ‘infections avoided’ or, more broadly, lives saved (life years gained). In the field of mental health, relevant outcomes include ‘depression-free days achieved’; ‘relapses prevented’; or changes in the severity of symptoms, behaviours or functioning that are typically associated with a particular condition.
Combining both costs and effects, the findings of a CEA are usually reported as an ‘incremental cost-effectiveness ratio’ (ICER). For instance, a study of computerised cognitive behavioural therapy compared with standard practice found that achieving a one-point improvement on the Beck Depression Inventory would cost £21, or (more intuitively) the cost of achieving one additional depression-free day would be £2.50.7
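As a minimal sketch of how such a ratio is formed, the ICER divides the difference in costs by the difference in effects. The function name and the numbers below are hypothetical and are not taken from the study cited above.

```python
def icer(cost_new: float, cost_old: float,
         effect_new: float, effect_old: float) -> float:
    """Incremental cost-effectiveness ratio: extra cost per extra unit of effect."""
    delta_effect = effect_new - effect_old
    if delta_effect == 0:
        raise ValueError("ICER is undefined when the effects are equal")
    return (cost_new - cost_old) / delta_effect


# Hypothetical example: costs in pounds, effects in depression-free days.
extra_cost_per_day = icer(cost_new=1200.0, cost_old=900.0,
                          effect_new=42.0, effect_old=30.0)
print(extra_cost_per_day)  # 300 extra pounds / 12 extra days = 25.0
```

The ratio on its own does not say whether the extra cost is acceptable; that requires a judgement about what one unit of effect is worth.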
The main advantage of expressing health outcomes in natural units is that these are often observable, relatively easy to measure and, often, meaningful (at least to clinicians, but hopefully also to patients). The main disadvantage, however, is that they limit the scope of comparisons. For example, cost-effectiveness analyses using outcomes such as ‘depression-free days achieved’ only allow comparisons with other interventions that can be expressed using exactly the same metric and are not informative beyond that condition-specific context. Survival-related outcomes such as life years gained allow comparisons over a broader range of conditions, but they disregard morbidity and quality of life. The achievements of most interventions for mental illness cannot be adequately gauged by mortality alone.
CUA is a broader form of economic evaluation in which health outcomes are translated into a generic measure of health, like a currency, that combines morbidity and mortality. There are several generic outcome measures available, but the most widely used are the quality-adjusted life year (QALY, a measure of health8) and the disability-adjusted life year (DALY, a measure of illness, mostly used in low-income and middle-income contexts9). We focus on QALYs in the remainder of this paper. Like the outcome measures used in CEA, QALYs capture observable health outcomes, but they link them to subjective appraisals of how bad it is to experience these outcomes (see Effects: quality-adjusted life years). Owing to its generic health outcome measure, CUA can be used to compare interventions not only within a given condition (such as depression), but also across different conditions. It also allows a study to take into account the adverse health effects associated with interventions and to calculate a net health gain.
Findings of CUA are often reported as an ‘incremental cost-utility ratio’ (ICUR); conventionally, they are sometimes referred to as ICER as well. In box 1, we present an example of a CUA from a study that compares two types of support for family carers of people with dementia.
Box 1 Supporting dementia carers: example of cost-utility analysis
A manual-based coping strategy programme was developed to support family carers of people with dementia, and compared with usual support in a pragmatic, multicentre, randomised controlled trial in England (n=260). The new intervention (called START) was an individual therapy programme of eight sessions delivered over 8–14 weeks by a psychology graduate, plus a manual. Carers were coached in techniques to understand and manage the behaviour of the person they cared for, to change unhelpful thoughts, promote acceptance, improve communication, plan for the future, relax and engage in meaningful, enjoyable activities. The trial compared the effectiveness and cost-effectiveness of START and usual support over 24 months, with interim findings also reported at 8 months as the intervention was found to confer immediate positive benefits.25 ,26 By 24 months, carers with usual support were seven times more likely to have clinically significant depression than carers with the START intervention. There was also a small but statistically significant quality-adjusted life year (QALY) gain. Over the first 8 months, costs were slightly but not significantly higher for the intervention group, and the incremental cost per QALY was found to be £6000. By 24 months, cost per QALY was £12 400 if only carer-related costs were considered. However, one benefit of the intervention was that it reduced service use by the people with dementia being cared for, so the intervention was in fact dominant.
CUA also has limitations. Most importantly, it can be both too broad and too narrow. On the one hand, it can be difficult to translate very specific health effects (eg, improved cognition) meaningfully into broad and generic outcome measures such as QALYs. On the other hand, comparisons are still limited to health interventions whereas, ideally, efficiency assessments should be applied to other sectors as well (eg, building schools or bridges, or enhancing security measures), so that resource use and its impact on social welfare can be optimised across the various domains of public policy, rather than only within each of them.
CBA is the broadest form of economic evaluation. It assesses health consequences in the most common metric used to assess value: money. Expressing health effects of an intervention in monetary terms and comparing it to the costs associated with that intervention allows the decision maker to judge whether the intervention adds net value; this estimate can consequently be compared with other interventions for which the benefits can also be expressed in monetary terms, both within health care and beyond. Putting a monetary figure on health outcomes is a key challenge in CBA. There are different approaches: the ‘human capital approach’ (based on lost income due to illness), revealed preferences (based on observing the actual choices people make where health risks are traded-off against money, eg, higher salaries for riskier jobs, buying safer but more expensive cars, etc) or stated preferences (asking people how much they would be willing to pay to receive particular health gains, or for how much money they would be willing to accept particular health losses). CBA is attractive for economists as it can theoretically inform challenging resource allocation decisions across various sectors, so that they have the biggest impact on population-level welfare.10 Nonetheless, it is infrequently used in healthcare studies because of problems with the reliability, validity and perhaps even morality of putting a monetary figure on health.11
One (rare) example of a CBA in the mental health field is provided by a six-country evaluation of individual placement and support (IPS) compared with standard vocational rehabilitation to help people with serious mental health problems (80% with schizophrenia) to move into open employment.12,13 Over the 18-month follow-up period, IPS generated significant incremental gains over standard arrangements in terms of people finding and staying in jobs, days worked and hospitalisation avoided. The primary economic evaluation was a CEA, but a (partial) CBA attached a value to each day of employment, equal to expected earnings (an approximation to marginal productivity) and found a difference in net benefit of £17 005 in favour of IPS.
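The logic of such a partial CBA can be sketched as follows: days of employment are valued at expected daily earnings and the programme cost is subtracted. The function name and all figures are hypothetical; they do not reproduce the data or the result of the evaluation described above.

```python
def net_benefit(days_worked: float, daily_earnings: float,
                programme_cost: float) -> float:
    """Monetised benefit (days worked valued at earnings) minus programme cost."""
    return days_worked * daily_earnings - programme_cost


# Hypothetical per-person figures for two vocational programmes.
nb_new = net_benefit(days_worked=120, daily_earnings=80.0, programme_cost=3000.0)
nb_standard = net_benefit(days_worked=40, daily_earnings=80.0, programme_cost=2500.0)

# A positive difference favours the new programme.
incremental_net_benefit = nb_new - nb_standard
print(incremental_net_benefit)  # 6600.0 - 700.0 = 5900.0
```

Because both sides of the comparison are in money, the result can in principle be set against investments outside healthcare, which is the main attraction of CBA.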
In the remaining sections of the paper, we focus on CUA based on QALYs, which is, overall, the most widely used type of economic evaluation in healthcare.
Main elements of a CUA
Costs of interventions
Costs are a function of the resources that are consumed by an intervention, multiplied by their value. They can arise in a variety of context-specific forms and categories. Although cost categories are not rigidly defined, we can distinguish between direct, indirect, patient, future and intangible costs (see box 2).
Box 2 Types of costs
Direct costs often represent the resources expended on healthcare services: doctor hours, medications, hospital beds, overhead costs of running facilities, capital costs of buildings or equipment, etc.
Indirect costs are the opportunity costs of patients and care givers losing time by being sick, being treated or providing unpaid care. These costs mainly represent productivity losses because of inability to work due to illness but they could also include disrupted domestic, educational, social and leisure activities.
Patient costs are those costs borne by patients and their families, such as transport costs and out-of-pocket expenses.
Future costs are often split between future costs that are directly related to the disease or the intervention (eg, a mental health problem that gives a higher risk of developing diabetes, or a particular drug that increases the risk of cardiovascular diseases later in life) and those costs that are unrelated (eg, increased life expectancy leading to higher pension costs).
Intangible costs are the psychological ‘costs’ of pain and suffering that patients experience during the episode of illness, or while undergoing the treatment. These are obviously difficult to measure and value and are almost never included on the cost side of an evaluation, but may get included as outcomes (in cost-utility analysis and cost-benefit analysis).
When evaluating mental health interventions, societal costs outside of the healthcare system can also be relevant: costs associated with criminal justice, provision of special housing, social care or extra costs falling on schools because of special educational needs.14
Which cost categories should be taken into account depends on the viewpoint from which the analysis is undertaken. If a healthcare payer perspective is adopted, only those costs that are incurred by the payer are considered. These primarily include the direct costs (other costs predominantly fall on other parties). If a societal perspective is adopted, all costs borne by the whole of society become relevant. Especially in the field of mental health, there can be large differences between the findings obtained from the healthcare payer and societal perspectives, as an atypical cost pattern often emerges in mental illness. For instance, prevention of depression is likely to be much more attractive from a societal perspective than from a payer perspective, as the bulk of its cost burden is indirect, attributable to inability to work rather than costs associated with treatment. Indeed, an English study estimated that 90% of the societal cost of depression was due to unemployment and absenteeism from work.15
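A minimal sketch of how the chosen perspective changes the cost total is given below. It assumes, for simplicity, that a payer perspective counts only direct costs and that a societal perspective counts every category. The class, the category labels and all quantities and unit values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CostItem:
    name: str
    category: str      # eg, "direct", "indirect" or "patient"
    quantity: float    # units of resource consumed
    unit_value: float  # monetary value per unit


# Simplifying assumption: the payer only bears direct costs.
PAYER_CATEGORIES = {"direct"}


def total_cost(items, perspective="societal"):
    """Sum quantity x unit value over the items relevant to the perspective."""
    if perspective not in {"payer", "societal"}:
        raise ValueError(f"unknown perspective: {perspective}")
    total = 0.0
    for item in items:
        if perspective == "payer" and item.category not in PAYER_CATEGORIES:
            continue  # cost falls on another party
        total += item.quantity * item.unit_value
    return total


items = [
    CostItem("therapy sessions", "direct", 8, 100.0),
    CostItem("medication (months)", "direct", 6, 20.0),
    CostItem("work days lost", "indirect", 30, 120.0),
    CostItem("travel to clinic", "patient", 8, 5.0),
]

print(total_cost(items, "payer"))     # 920.0
print(total_cost(items, "societal"))  # 4560.0
```

In this invented example most of the societal cost is indirect, which mirrors the cost pattern described for depression.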
Effects: quality-adjusted life years
The QALY is a generic health metric that enables the researcher to capture both the gains in health-related quality of life (HRQoL) and the increased life expectancy attributable to a healthcare intervention.8,16,17 While measuring a mortality effect (ie, changes in the remaining life expectancy at age of death) is methodologically relatively straightforward, measuring a morbidity effect (ie, changes in HRQoL) is not.18 A key challenge in obtaining QALYs for the morbidity associated with a condition is translating the disutility experienced by people living in that particular health state into a number that accurately captures their HRQoL. Estimating the quality of life associated with living in different health states involves two steps: describing the health state and valuing it.
The first, descriptive part is usually done by asking people to complete a standardised health survey that defines health in terms of a number of essential dimensions. A widely used survey is the EQ-5D,19 which includes five dimensions: mobility, self-care, usual activities, pain/discomfort and anxiety/depression, on which patients can indicate their health state according to a number of levels. In the standard three-level version of the survey, this allows distinctions to be made between 243 possible health states (245 if ‘unconscious’ and ‘dead’ are added). A newer, five-level version distinguishes between 3125 states.
The second part consists of translating these health states into ‘utility’ values on a scale from zero to one, with zero being the value ascribed to being dead and one the value ascribed to ‘perfect health’. Values below zero are also possible and are considered ‘worse than dead’. Valuing health states implies an assessment of the relative weight of different health dimensions, for example, mobility versus depression, in order to arrive at a scoring algorithm that can translate any health state description into a numeric value on a scale. Two methods that are widely used to elicit these values are the ‘time trade-off’ (TTO)20 and the ‘standard gamble’ (SG).21 The TTO method asks for the number of life years lived in a particular health state that would be as attractive as living a smaller number of years in perfect health. The SG asks for the maximum mortality risk people are willing to take for a treatment that would fully cure a state of incomplete health. These methods provide a common ground to compare and rank otherwise incomparable health states.
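In their simplest textbook form, and only for health states regarded as better than dead, the two elicitation methods translate a respondent's answer into a utility value as sketched below. The function names are hypothetical, and real valuation studies use more elaborate protocols.

```python
def tto_utility(years_full_health: float, years_in_state: float) -> float:
    """Time trade-off: the respondent is indifferent between
    `years_in_state` years in the health state and a shorter
    `years_full_health` years in perfect health."""
    if years_in_state <= 0 or not 0 <= years_full_health <= years_in_state:
        raise ValueError("need 0 <= years_full_health <= years_in_state, with years_in_state > 0")
    return years_full_health / years_in_state


def sg_utility(max_death_risk: float) -> float:
    """Standard gamble: `max_death_risk` is the largest probability of death
    the respondent accepts for a treatment that restores perfect health."""
    if not 0 <= max_death_risk <= 1:
        raise ValueError("risk must be a probability between 0 and 1")
    return 1.0 - max_death_risk


print(tto_utility(years_full_health=7, years_in_state=10))  # 0.7
print(sg_utility(max_death_risk=0.25))                      # 0.75
```

The more years or the more risk a respondent is prepared to give up, the lower the utility assigned to the health state.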
Whether patients, experts or the general public are best equipped to perform these assessments is an ongoing matter of debate. From a social perspective, economists and policymakers consider the general public to be the appropriate group to value health states. This is because the general public is responsible for financing healthcare via taxes or health insurance (private or social). The public may also be better suited than patients to adopt a bird's-eye view across a wide range of different health states. Following the seminal work in the UK to characterise and document population-level preferences for various health states,22,23 several countries have done similar TTO and SG experiments in representative samples of the population, resulting in scoring algorithms or ‘tariffs’ that can be used to translate descriptions of health states into utility values.
An example may help to clarify how to link QALYs to health states. In a study investigating the health impact of a hepatitis A infection, 111 identified infected patients were asked to describe their health state on the EQ-5D survey while they were sick.24 Their reported EQ-5D profile during this period was translated into a ‘utility’ value using the available scoring algorithm obtained through TTO experiments in the general population in the UK. It was found that on average a patient was sick for 17.8 days, and it was assumed that (s)he held the same utility throughout this period (other assumptions were tested as well). As compared with living 1 year in full health, this group of patients would only achieve 0.975 QALYs. In other words, experiencing hepatitis A for 17.8 days would lead to an average loss of 0.025 QALYs; put differently, it would take 40 hepatitis A patients to attribute the loss of 1 full QALY to the disease (leaving aside more complicated cases in which patients need a liver transplant or even die).
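The arithmetic behind this example can be sketched as follows. The utility during illness is not stated in the text; the value computed below is back-calculated from the reported figures (17.8 days, a loss of 0.025 QALYs) and is therefore only an approximation, not a number reported by the study. The function names are hypothetical.

```python
DAYS_PER_YEAR = 365.0


def qaly_loss(utility_during_illness: float, days_ill: float) -> float:
    """QALYs lost relative to full health (utility 1) over the illness episode."""
    return (1.0 - utility_during_illness) * days_ill / DAYS_PER_YEAR


def implied_utility(loss: float, days_ill: float) -> float:
    """Constant utility during illness that would produce the given QALY loss."""
    return 1.0 - loss * DAYS_PER_YEAR / days_ill


reported_loss = 0.025   # average QALY loss per patient, from the text
days_ill = 17.8         # average duration of illness, from the text

u = implied_utility(reported_loss, days_ill)   # roughly 0.49 (back-calculated)
patients_per_qaly = 1.0 / reported_loss        # 40 patients per QALY lost

print(round(u, 2), round(patients_per_qaly))
```

A short episode of illness therefore produces only a small QALY loss per patient, even when the utility during the episode is low.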
Trials and models
Determining the comparative cost utility of alternative interventions requires the quantification of costs and QALYs that are associated with these interventions. But these costs and QALYs are likely to differ between groups of individuals undergoing different treatment regimens. In addition to overall differences between groups receiving competing interventions, individuals within groups may also differ in the rate at which their condition progresses, and they may respond differently to treatments.
Trials and models are two common approaches to account for this heterogeneity. In a randomised controlled trial (RCT) of sufficient follow-up duration, both costs and health effects can be recorded on an individual patient basis, after which average costs and QALYs can be calculated.27 However, RCTs are often costly to carry out, take a long time to complete and are not always fully representative of the costs and effects that would occur in the ‘real world’. Therefore, modelling is an important alternative basis for economic evaluation analyses.28 ,29 Using input variables, probabilities and mathematical relationships, a model allows depiction in a simplified way of the possible consequences resulting from different treatment choices or events. Each consequence has a cost and an outcome attached to it, so that we can calculate the expected cost and expected outcome of each treatment choice under consideration. Two popular techniques are decision trees,30 generally used for acute events, and Markov models,31 mostly used to synthesise events that require a longer time frame.
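To illustrate the modelling approach, the sketch below implements a very small Markov cohort model with three states and annual cycles. All states, transition probabilities, costs and utilities are hypothetical, and discounting is omitted for brevity; it is meant as an outline of the technique rather than a usable model.

```python
def run_markov(transitions, costs, utilities, start, cycles):
    """Return (expected cost, expected QALYs) per person for a cohort model.

    transitions[s][t] is the per-cycle probability of moving from s to t.
    One cycle is assumed to last one year, so utilities add up to QALYs.
    """
    states = list(start)
    dist = dict(start)  # share of the cohort in each state
    total_cost = 0.0
    total_qalys = 0.0
    for _ in range(cycles):
        # Accumulate the cost and QALYs of the time spent in each state.
        total_cost += sum(dist[s] * costs[s] for s in states)
        total_qalys += sum(dist[s] * utilities[s] for s in states)
        # Move the cohort according to the transition probabilities.
        dist = {t: sum(dist[s] * transitions[s][t] for s in states)
                for t in states}
    return total_cost, total_qalys


start = {"well": 1.0, "ill": 0.0, "dead": 0.0}
utilities = {"well": 0.9, "ill": 0.5, "dead": 0.0}
dead_row = {"well": 0.0, "ill": 0.0, "dead": 1.0}

usual = {"well": {"well": 0.70, "ill": 0.25, "dead": 0.05},
         "ill": {"well": 0.30, "ill": 0.60, "dead": 0.10},
         "dead": dead_row}
treated = {"well": {"well": 0.80, "ill": 0.15, "dead": 0.05},
           "ill": {"well": 0.40, "ill": 0.50, "dead": 0.10},
           "dead": dead_row}

cost_u, qaly_u = run_markov(usual, {"well": 0.0, "ill": 2000.0, "dead": 0.0},
                            utilities, start, cycles=10)
cost_t, qaly_t = run_markov(treated, {"well": 500.0, "ill": 2500.0, "dead": 0.0},
                            utilities, start, cycles=10)

# Incremental cost per QALY of the treatment versus usual care.
print((cost_t - cost_u) / (qaly_t - qaly_u))
```

A decision tree follows the same principle of weighting costs and outcomes by their probabilities, but without repeated cycles over time.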
A combination of inputs on costs and effects leads to an estimate of the incremental cost per QALY of one intervention versus another. The accuracy of this estimate depends on the degree of uncertainty that is embodied in the underlying analysis, and it is essential to present this uncertainty in the findings of an economic evaluation.
A distinction is often made between three sources of uncertainty in economic evaluation: methodological uncertainty (related to choice of evaluation methods: are the methods we use appropriate instruments to measure the costs and effects of the intervention? See also next section), structural uncertainty (related to uncertainty in the modelling approach or the trial design, for instance, are there any disease outcomes being ignored in the model?) and parameter uncertainty (uncertainty in the input variables that are used).32
The effect of underlying uncertainty can be analysed via a ‘sensitivity analysis’ (SA), which explores how the estimated cost per QALY changes under different assumptions about methods, models and parameters. Sensitivity analysis can be done for one particular source of uncertainty (univariate, deterministic SA, ie, trying different values for a single parameter), by exploring the effect of changing many assumptions at the same time (multivariate, deterministic SA), or by assigning statistical distributions to variables, from which random values are then drawn (probabilistic SA). Repeated draws lead to a ‘cloud’ of cost-effectiveness estimates on the four quadrants shown in figure 1.
Rather than presenting a single estimate of ‘value for money’, sensitivity analysis is used to report a range of probable values for ICURs, thereby giving decision makers an idea of the uncertainty involved in an evaluation. These analyses can also be used to identify the main drivers of the results and the inputs for which further research can reduce observed uncertainty.28
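A probabilistic sensitivity analysis can be sketched as repeated random draws whose results are tallied by quadrant. For brevity, the sketch below draws the incremental cost and incremental QALYs directly from hypothetical normal distributions; a real analysis would instead sample the underlying model inputs and recompute costs and QALYs for each draw. The quadrant numbering follows the description given earlier in the text.

```python
import random


def quadrant_shares(n_draws: int = 5000, seed: int = 1) -> dict:
    """Share of simulated (incremental cost, incremental QALY) pairs per quadrant."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    counts = {"I": 0, "II": 0, "III": 0, "IV": 0}
    for _ in range(n_draws):
        # Hypothetical distributions (mean, standard deviation).
        d_cost = rng.gauss(1000.0, 600.0)
        d_qaly = rng.gauss(0.05, 0.04)
        if d_qaly >= 0:
            quadrant = "I" if d_cost >= 0 else "II"
        else:
            quadrant = "IV" if d_cost >= 0 else "III"
        counts[quadrant] += 1
    return {q: n / n_draws for q, n in counts.items()}


shares = quadrant_shares()
for quadrant, share in shares.items():
    print(f"quadrant {quadrant}: {share:.1%} of draws")
```

The spread of the draws across quadrants gives a first impression of how robust the conclusion is to parameter uncertainty.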
Interpretation and use of ICURs
Correctly interpreting the findings of an economic evaluation and consequently making a decision as to whether an intervention is ‘worth it’ is a complex process with several challenges. Below, we list three key guiding considerations when interpreting and using ICURs.
First, assessing whether an intervention offers an efficient use of resources requires a benchmark—a cost per QALY threshold—that distinguishes health benefits that come at a ‘reasonable’ cost from those that are excessively costly. What this benchmark or threshold value for one life year in full health is, or should be, is however an ongoing subject of discussion.33 Thresholds can be soft or hard (depending on how flexibly or strictly they are applied) and they can be explicit or implicit.34 An explicit threshold defines a particular value that the society is willing to pay for 1 year in full health. Implicit thresholds are more informal (and also more common) and can be inferred retrospectively by analysing past decisions. Oft-cited figures that serve as a rule of thumb (eg, £30 000 or $50 000 per QALY gained) lack a solid rational foundation. Some argue that, in a wealthy society, thresholds reflecting the true value of health gains (as compared with the value that spending income on extra consumption of additional goods and services in the economy could offer) should be much higher.35 ,36 Others, starting from the premise that healthcare budgets are de facto fixed, argue that thresholds should be substantially lower in order not to crowd out more efficient and already implemented programmes that will be displaced in favour of new interventions.37
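One common way of applying a threshold is to express the result as a net monetary benefit: the QALY gain valued at the threshold, minus the extra cost. The sketch below assumes this formulation; the function names and figures are hypothetical, and the threshold values are the rule-of-thumb figures mentioned above rather than recommended values.

```python
def net_monetary_benefit(delta_cost: float, delta_qaly: float,
                         threshold: float) -> float:
    """QALY gain valued at the threshold, minus the incremental cost."""
    return threshold * delta_qaly - delta_cost


def is_cost_effective(delta_cost: float, delta_qaly: float,
                      threshold: float) -> bool:
    """An intervention passes if its net monetary benefit is positive."""
    return net_monetary_benefit(delta_cost, delta_qaly, threshold) > 0


# Hypothetical intervention: 600 pounds extra for 0.05 extra QALYs,
# ie, roughly 12 000 pounds per QALY gained.
print(is_cost_effective(600.0, 0.05, threshold=30000.0))  # True
print(is_cost_effective(600.0, 0.05, threshold=10000.0))  # False
```

The same intervention can pass or fail depending on the threshold chosen, which is why the level of the threshold is so contested.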
Second, CUA narrowly focuses on the value of efficiency in allocating budgets and does not necessarily reflect the impact of interventions on other important objectives of healthcare, including fairness, autonomy and solidarity with the worst-off groups in society.38 ,39 Therefore, efficiency should not necessarily be the over-riding principle for making decisions. There may be good ethical reasons why an expensive intervention still deserves funding (albeit at a high opportunity cost in terms of other more efficient health programmes). For example, health technology assessment of new medications in Scotland explicitly allows for setting a higher cost per QALY threshold for drugs treating exceptionally rare conditions and classified as ultraorphan.40 Conversely, efficient interventions that increase health inequity (eg, a prevention programme that only improves the health of society's best-off groups) may be considered unattractive. While ethical values should be considered alongside efficiency, there are no easy solutions for making such trade-offs.41 ,42
Third, there is methodological uncertainty: an estimated ICUR may not correctly and accurately reflect the full value of one intervention versus another. This is particularly relevant for mental health interventions, both on the effects and on the cost side. For example, effects of mental health interventions may not be fully captured in QALYs.43 Utility values for mental illness assessed by the general population may not adequately reflect its severity. Although the average person may have an idea about the severity of physical illness, what it means to have a mental illness may be much less familiar.44 Moreover, mental illness is associated with important comorbidities (increased rates of cardiovascular disease, type II diabetes, obesity, drug addiction or suicide).14 Failing to incorporate such comorbidities, and their quality of life impact, into the estimated QALYs would significantly underestimate the value of interventions targeting mental illnesses. As mentioned before, economic evaluations of mental health interventions may be particularly sensitive to the perspective adopted in the analysis. Healthcare payer perspectives may neglect key dimensions in which mental illness leads to costs, particularly effects on employment and also costs associated with crime, criminal justice, social care and special schooling arrangements. Such broader costs are often difficult to estimate and they complicate accurate cost assessments. Nonetheless, when this broad range of cost dimensions is considered (instead of narrow treatment costs from a payer perspective), CUAs often indicate potential cost savings (quadrant II).45
CUA is increasingly used to provide decision makers with evidence on how to efficiently allocate a budget. For mental health policy, this presents an opportunity as there is often a strong economic case to increase investments in mental healthcare. Moreover, as an evidence-based platform, CUA can help overcome the biases that are inherent when following a predominantly biomedical interpretation of the concepts of health and sickness. Given its high relevance to decision-making, it is important that mental health practitioners are familiar with the primary components and assumptions of CUA, the difficulties that are inherent to the methodology, and the particular challenges that occur when it is applied to mental health. When applied with the necessary adjustments and nuances, economic evaluation should be seen as a helpful and welcome instrument for mental health policy.
Competing interests None declared.
Provenance and peer review Commissioned; externally peer reviewed.