Review Article
An unadjusted NNT was a moderately good predictor of health benefit
Introduction
Introduced in the late 1980s, the number needed to treat (NNT) has become a widely used method of interpreting the magnitude of treatment benefit [1]. The NNT is calculated as the reciprocal of the absolute risk reduction (ARR) and is commonly defined as how many patients must undergo a therapy to prevent one adverse outcome. For example, if a randomized controlled trial (RCT) demonstrates that patients taking a placebo have a 30% chance of death but those taking a drug have a 10% chance, the drug reduces the absolute risk of death by (30% − 10%) = 20%. The NNT = 1/0.20 = 5. Five patients “need to be treated” to prevent one death.
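The arithmetic in this example can be sketched in a few lines of Python. This is only an illustration of the definition given above; the function names are our own and are not taken from any cited source.

```python
def absolute_risk_reduction(control_event_rate: float, treated_event_rate: float) -> float:
    """ARR: event rate in the control arm minus event rate in the treated arm.

    Both rates are proportions between 0 and 1.
    """
    return control_event_rate - treated_event_rate


def number_needed_to_treat(control_event_rate: float, treated_event_rate: float) -> float:
    """NNT is the reciprocal of the ARR.

    Raises ValueError when the treated arm does no better than control,
    because the NNT is then undefined (or would describe harm, not benefit).
    """
    arr = absolute_risk_reduction(control_event_rate, treated_event_rate)
    if arr <= 0:
        raise ValueError("ARR must be positive for an NNT to be defined")
    return 1.0 / arr


# The example from the text: 30% mortality on placebo, 10% on the drug.
nnt = number_needed_to_treat(0.30, 0.10)
print(f"ARR = {absolute_risk_reduction(0.30, 0.10):.2f}, NNT = {nnt:.1f}")
```

With the rates from the text, the sketch reproduces an ARR of 0.20 and an NNT of 5.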
Texts in evidence-based medicine (EBM) advocate using the NNT to report and interpret RCT results [2], [3], [4]. In searching for the best measure of clinical benefit, it is argued that, in comparison with other outcome measures such as the odds ratio (OR) and relative risk reduction (RRR), the NNT offers clinicians a simpler, more intuitive, and yet more accurate method of understanding the magnitude of health benefit associated with a treatment [5], [6]. EBM advocates promote the NNT as “the most useful measure of clinical effort” [2]. In five general medical journals, the frequency with which RCTs included an NNT (and/or ARR) increased from 4.4% in 1992 to 16.7% in 1998 [7].
Despite widespread use, the NNT does have several well-described limitations. First, users of the NNT must supply their own implicit adjustment for the severity of the illness under consideration [8]. For example, preventing a stroke is of greater value than preventing a headache. Thus, treatments with dramatically different overall benefit may have similar NNTs. Also, the NNT does not incorporate the type of treatment or treatment adverse effects. A different statistic, the “number needed to harm” (NNH) [2], must be calculated to capture side-effect risks.
Similarly, the NNT does not explicitly incorporate the duration of therapy [1]. For chronic conditions, the NNT measures less the avoidance of an adverse outcome than it does the postponement of one, and depends greatly on the point of time in the disease process at which the statistic is measured [9]. Taking the importance of this time component into account, the simple definition of the NNT as “the number of patients who must undergo a therapy to prevent one adverse event” might be more accurately expressed as “the average number of patients who must undergo a therapy over a specified time period to observe one less adverse event at the end of the same or different time period.”
Furthermore, there is a growing body of evidence that despite its apparent simplicity, the NNT is frequently misinterpreted by both lay people and health professionals [9], [10], [11]. Other limitations of the NNT include its failure to take baseline risks into account [8], its limitation to only dichotomous outcomes (it is not possible to calculate an NNT for outcomes measured on continuous scales), and potentially undesirable statistical properties [12].
Our clinical experience has suggested that the NNT does appeal to physicians and trainees. Unfortunately, in contrast to EBM recommendations that the NNT be interpreted in the context of the above limitations and personal patient values [3], we have also observed, particularly in journal club and critical appraisal settings, that clinicians seem to have implicit thresholds, such that a value of <10–15 is considered to be an attractive ratio irrespective of differences in clinical conditions, side effects, length of treatment, and other factors. We are not aware of studies exploring how the NNT is used in clinical settings, but a review of the literature does provide some support for our hypothesis. An informal poll of respiratory health professionals suggested that an NNT ≤20 was considered “clinically worthwhile” [13], and other tutorials teaching the NNT concept suggest that a value of ≤10 indicates a clinically significant effect [14]. Published statements such as “These small NNTs suggest that . . . the cholinesterase inhibitors have a valuable place in the current clinical management of [Alzheimer's disease]” [15] suggest an implicit belief that an unadjusted NNT value adequately captures the overall worth of a treatment. The use of NNT league tables [2], [16] may further add to the impression that NNT values in and of themselves are broadly comparable. Indeed, studies have shown that clinicians tend to misinterpret the NNT and use it incorrectly [9], [11].
A number of authors have proposed modifications to the NNT to improve the fidelity of its representation of overall health benefit. Dividing the NNT by the length of the study has been proposed as a way to adjust for variable observation time [2]. Calculating a “number needed to harm” has been recommended to adjust for the risk of side effects [2]. Dividing the NNT by an individual-to-study-population risk ratio has been offered as a way to adjust for baseline risk [17]. A formula has been derived to calculate an “NNT threshold,” the point at which clinical benefit equals clinical risk [18]. Such modifications address some of the potential limitations of the NNT as an outcome measure, but diminish the simplicity and intuitive appeal that make the NNT so attractive to clinicians. The present study addresses the question of whether an unadjusted NNT is a useful clinical tool despite its potential for oversimplifying the expression of potential benefits.
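Two of the modifications described above can be sketched in Python. This is a minimal illustration under stated assumptions: the NNH is taken to be the reciprocal of the absolute risk increase for a side effect, and the baseline-risk adjustment divides the study NNT by the ratio of the individual patient's risk to the study population's risk, as the text describes for [17]. The function names and all numbers are hypothetical and do not come from the cited papers.

```python
def number_needed_to_harm(treated_harm_rate: float, control_harm_rate: float) -> float:
    """NNH, assumed here to be 1 / (absolute risk increase of the side effect)."""
    absolute_risk_increase = treated_harm_rate - control_harm_rate
    if absolute_risk_increase <= 0:
        raise ValueError("treatment must increase the harm rate for an NNH to be defined")
    return 1.0 / absolute_risk_increase


def risk_adjusted_nnt(study_nnt: float, individual_to_study_risk_ratio: float) -> float:
    """Divide the study NNT by the individual-to-study-population risk ratio.

    A ratio above 1 (a higher-risk patient) lowers the NNT; a ratio below 1 raises it.
    """
    if individual_to_study_risk_ratio <= 0:
        raise ValueError("risk ratio must be positive")
    return study_nnt / individual_to_study_risk_ratio


# Hypothetical values, for illustration only.
nnh = number_needed_to_harm(treated_harm_rate=0.05, control_harm_rate=0.01)
adjusted = risk_adjusted_nnt(study_nnt=20.0, individual_to_study_risk_ratio=2.0)
print(f"NNH = {nnh:.0f}, risk-adjusted NNT = {adjusted:.0f}")
```

Even in this small sketch, each adjustment needs an extra input (a harm rate, a patient-specific risk ratio) that a clinician must estimate, which illustrates the loss of simplicity noted above.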
To evaluate the clinical utility of the NNT, a reference standard of health benefit is required. An alternative outcome measure that, like the NNT, represents the magnitude of clinical benefit is the quality-adjusted-life-year (QALY). Developed in the 1970s, the QALY represents health using two attributes, length of life and quality of life (QOL). The key idea underlying the QALY is that the gains associated with any health intervention or program can be accurately represented and expressed using these two dimensions. The QOL changes are usually represented using utility, a quality of life scale from 0 (imminent death) to 1 (full health). Thus, a treatment that is estimated to extend a patient's life for 5 years but at a utility that is 50% of full health would gain (5 years × 0.5 utility) = 2.5 QALYs [19].
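The QALY arithmetic in this example can be written as a short Python sketch. It assumes the simplest case, in which each period of life is weighted by a single constant utility; the function names are our own, and discounting and other refinements used in formal CEAs are deliberately left out.

```python
from typing import Iterable, Tuple


def qalys(years: float, utility: float) -> float:
    """QALYs for one period: years of life multiplied by a utility in [0, 1]."""
    if years < 0:
        raise ValueError("years must be non-negative")
    if not 0.0 <= utility <= 1.0:
        raise ValueError("utility must lie between 0 (death) and 1 (full health)")
    return years * utility


def total_qalys(periods: Iterable[Tuple[float, float]]) -> float:
    """Sum QALYs over several (years, utility) periods."""
    return sum(qalys(years, utility) for years, utility in periods)


# The example from the text: 5 extra years at 50% of full health.
gain = qalys(5, 0.5)
print(f"QALY gain = {gain}")
```

With the values from the text, the sketch returns 2.5 QALYs; `total_qalys` shows how periods spent in different health states would be added together under the same assumption.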
QALYs are not often compared to NNTs. The latter are usually used to interpret empirical research (e.g., clinical trials), whereas the former are mostly used in decision and cost-effectiveness analyses (CEAs). Nonetheless, both concepts represent alternative ways of representing the net health benefit associated with a program or intervention. As an outcome measure, the QALY offers some theoretical advantages over the NNT. For example, the QALY more explicitly incorporates benefits, side effects, and length of therapy. Also, QALYs are designed to facilitate comparison across different conditions and interventions [19]. The QALY does, however, have its own, well-described limitations [20]. Furthermore, the QALY is typically too complex to be used in everyday clinical practice. Nonetheless, it is arguably one of the most comprehensive health outcome measures available, and more completely captures health benefit than the NNT can.
Our objective was to determine how well the NNT, with its theoretical limitations, predicts the net health benefit of interventions, using the QALY as a reference standard. We did not primarily focus on the relative utility of other outcome measures (such as RRR or OR) as compared to the NNT.
Section snippets
Article selection
We used a set of 228 CEAs performed between 1976 and 1997 inclusive as our primary data set [21]. This database represents the results of a systematic search for original CEAs up to 1997 that expressed health benefit in QALYs and were published in English. It was compiled through extensive electronic database searches [21] and a review of 6,500 titles in two paper-based bibliographies [22], [23]. From these searches, >1,500 candidate articles were extracted. Based on a reading of the study
Characteristics of selected articles
The characteristics of the 65 articles selected are shown in Table 1. Most of the articles examined drug interventions (60.5%) for cardiovascular, neoplastic, or infectious conditions (60%) at the tertiary prevention stage (73%). The majority of the probabilities used to calculate the NNTs were based on RCTs or formal meta-analyses of RCTs (71%).
Empiric exploration of theoretical limitations of NNT
When comparing NNTs to the change in QALYs across all treatments, we observed that as NNTs fell (i.e., increasing health benefit) the QALY gain
Discussion
The NNT has become a popular method of evaluating the magnitude of treatment benefit. Here, we show that, despite its theoretical limitations, the unadjusted NNT is a moderately good predictor of interventions that are associated with clinically significant health benefit, defined as high QALY gains. We provide evidence that an unadjusted NNT may be sufficiently accurate to be used as an easily calculated shorthand measure of clinical benefit. Our ROC analysis suggests that for identifying
Acknowledgments
M.D.K. is supported by the F. Norman Hughes Chair in Pharmacoeconomics, Faculty of Pharmacy, University of Toronto, and an Investigator Award from the Canadian Institutes for Health Research. G.N. is supported by the Mary Trimmer Chair in Geriatric Medicine Research, University of Toronto.
References (122)
Number needed to treat (NNT)
Ann Emerg Med
(1999)
et al.
Number needed to treat: easily understood and intuitively meaningful? Theoretical considerations and a randomized trial
J Clin Epidemiol
(2002)
et al.
A note on the number needed to treat
Control Clin Trials
(1999)
How to estimate treatment effects from reports of clinical trials. II: Dichotomous outcomes
Aust J Physiother
(2000)
et al.
When should an effective treatment be used? Derivation of the threshold number needed to treat and the minimum event rate for treatment
J Clin Epidemiol
(2001)
et al.
Cost-effectiveness of palliative chemotherapy in advanced gastrointestinal cancer
Ann Oncol
(1995)
et al.
Cost utility analysis of maintenance treatment for recurrent depression
Control Clin Trials
(1995)
et al.
A study of the quality of life and cost-utility of renal transplantation
Kidney Int
(1996)
et al.
Cost-effectiveness analysis of potential improvements to emergency medical services for victims of out-of-hospital cardiac arrest
Ann Emerg Med
(1996)
et al.
A cost-utility analysis of second-line antibiotics in the treatment of acute otitis media in children
Clin Ther
(1996)
Cost-utility analysis of paclitaxel in combination with cisplatin for patients with advanced ovarian cancer
Gynecol Oncol
Economic analysis of an immunosuppressive strategy in renal transplantation
Health Policy
Primary prophylaxis of variceal bleeding in cirrhosis: a cost-effectiveness analysis
Gastroenterology
Cost-effectiveness of captopril therapy after myocardial infarction
J Am Coll Cardiol
Systemic treatment of early breast cancer by hormonal, cytotoxic, or immune therapy: 133 randomised trials involving 31,000 recurrences and 24,000 deaths among 75,000 women
Lancet
An assessment of clinically useful measures of the consequences of treatment
N Engl J Med
Evidence-based medicine: how to practise and teach EBM
Evidence-Based Medicine Working Group. Users' guides to the medical literature: XX. Integrating research evidence with the care of the individual patient
JAMA
Measured enthusiasm: does the method of reporting trial results alter perceptions of therapeutic effectiveness?
Ann Intern Med
Therapeutic priorities of Canadian internists
CMAJ
Reporting number needed to treat and absolute risk reduction in randomized controlled trials
JAMA
Using numerical results from systematic reviews in clinical practice
Ann Intern Med
A randomized comparison of patients' understanding of number needed to treat and other common risk reduction formats
J Gen Intern Med
Numeracy and the medical student's ability to interpret data
Eff Clin Pract
A simple method for evaluating the clinical literature
Fam Pract Manag
How useful are cholinesterase inhibitors in the treatment of Alzheimer's disease? A number needed to treat analysis
Int J Geriatr Psychiatry
The number needed to treat: a clinically useful measure of treatment effect
BMJ
Utilities and quality-adjusted life years
Int J Technol Assess Health Care
Cost-utility analysis: use QALYs only with great caution
CMAJ
The quality of reporting in published cost-utility analyses, 1976–1997
Ann Intern Med
Health care CBA/CEA: an update on the growth and composition of the literature
Med Care
Health care CBA and CEA from 1991 to 1996: an updated bibliography
Med Care
Economic outcome for intensive care of infants of birthweight 500–999 g born in Victoria in the post surfactant era
J Paediatr Child Health
Cost-effective models for flutamide for prostate carcinoma patients: are they helpful to policy makers?
Cancer
Estimates of the cost-effectiveness of a single course of interferon-alpha 2b in patients with histologically mild chronic hepatitis C
Ann Intern Med
Economic evaluation of neonatal intensive care of very-low-birth-weight infants
N Engl J Med
Adjuvant therapy for stage III colon cancer: economics returns to research and cost-effectiveness of treatment
J Natl Cancer Inst
Economic and health state utility determinations for schizophrenic patients treated with risperidone or haloperidol
J Clin Psychopharmacol
Evaluating the potential cost-effectiveness of stenting as a treatment for symptomatic single-vessel coronary disease: use of a decision-analytic model
Circulation
Cost-effectiveness of carotid endarterectomy in asymptomatic patients
J Vasc Surg
Should the elderly receive chemotherapy for node-negative breast cancer? A cost-effectiveness analysis examining total and active life-expectancy outcomes
J Clin Oncol
Cost effectiveness of isoniazid chemoprophylaxis
Model of complications of NIDDM. II. Analysis of the health benefits and cost-effectiveness of treating NIDDM with the goal of normoglycemia
Diabetes Care
Cost-effectiveness of the transdermal nicotine patch as an adjunct to physicians' smoking cessation counseling
JAMA
Cost-effectiveness of warfarin and aspirin for prophylaxis of stroke in patients with nonvalvular atrial fibrillation
JAMA
Management of childhood lead poisoning: clinical impact and cost-effectiveness
Med Decis Making
Efficacy and cost-effectiveness of autologous blood predeposit in patients undergoing radical prostatectomy procedures
Urology
Cost-effectiveness of cancer chemotherapy: an economic evaluation of a randomized trial in small-cell lung cancer
J Clin Oncol
Cognitive-educational treatment of fibromyalgia: a randomized clinical trial. II. Economic evaluation
J Rheumatol