GRADE Series
GRADE guidelines: 13. Preparing Summary of Findings tables and evidence profiles—continuous outcomes

https://doi.org/10.1016/j.jclinepi.2012.08.001

Abstract

Presenting continuous outcomes in Summary of Findings tables presents particular challenges to interpretation. When each study uses the same outcome measure, and the units of that measure are intuitively interpretable (e.g., duration of hospitalization, duration of symptoms), presenting differences in means is usually desirable. When the natural units of the outcome measure are not easily interpretable, choosing a threshold to create a binary outcome and presenting relative and absolute effects become a more attractive alternative.

When studies use different measures of the same construct, calculating summary measures requires converting to the same units of measurement for each study. The longest standing and most widely used approach is to divide the difference in means in each study by its standard deviation and present pooled results in standard deviation units (standardized mean difference). Disadvantages of this approach include vulnerability to varying degrees of heterogeneity in the underlying populations and difficulties in interpretation. Alternatives include presenting results in the units of the most popular or interpretable measure, converting to dichotomous measures and presenting relative and absolute effects, presenting the ratio of the means of intervention and control groups, and presenting the results in minimally important difference units. We outline the merits and limitations of each alternative and provide guidance for meta-analysts and guideline developers.
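As an illustrative sketch only (not taken from the article), the following Python computes, for a single study, three of the unit conversions described above: the standardized mean difference, the ratio of means, and the difference expressed in minimally important difference (MID) units. The function names, the use of a pooled two-arm standard deviation, and all numbers are assumptions made for illustration.

```python
import math


def pooled_sd(sd_t, n_t, sd_c, n_c):
    """Pooled standard deviation of the two arms.

    Assumption: the two arm SDs are pooled (Cohen's d style); other choices,
    such as using the control-group SD alone, are also possible.
    """
    return math.sqrt(
        ((n_t - 1) * sd_t ** 2 + (n_c - 1) * sd_c ** 2) / (n_t + n_c - 2)
    )


def standardized_mean_difference(mean_t, sd_t, n_t, mean_c, sd_c, n_c):
    """Difference in means expressed in standard deviation units."""
    return (mean_t - mean_c) / pooled_sd(sd_t, n_t, sd_c, n_c)


def ratio_of_means(mean_t, mean_c):
    """Mean in the intervention group divided by mean in the control group."""
    return mean_t / mean_c


def difference_in_mid_units(mean_t, mean_c, mid):
    """Difference in means divided by the instrument's MID."""
    return (mean_t - mean_c) / mid


if __name__ == "__main__":
    # Hypothetical study: lower scores are better; the MID of 5 points is invented.
    smd = standardized_mean_difference(12.0, 4.0, 50, 14.0, 4.0, 50)
    rom = ratio_of_means(12.0, 14.0)
    mid_units = difference_in_mid_units(12.0, 14.0, mid=5.0)
    print(f"SMD: {smd:.2f}, ratio of means: {rom:.2f}, MID units: {mid_units:.2f}")
```

Each quantity is unit-free, which is what allows studies using different instruments to be combined; pooling across studies would additionally require a variance for each estimate, which this sketch omits.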

Introduction

Key points

  1. Summary of Findings tables provide succinct presentations of evidence quality and magnitude of effects.

  2. Summarizing the findings of continuous outcomes presents special challenges to interpretation that become daunting when individual trials use different measures for the same construct.

  3. The most commonly used approach to providing pooled estimates for different measures, presenting results in standard deviation units, has limitations related to both statistical properties and interpretability.

  4. Potentially preferable alternatives include presenting results in the natural units of the most popular measure, transforming into a binary outcome and presenting relative and absolute effects, presenting the ratio of the means of intervention and control groups, and presenting results in preestablished minimally important difference units.

The first 12 articles in this series introduced the Grading of Recommendations Assessment, Development, and Evaluation (GRADE) approach to systematic reviews and guideline development [1], discussed the framing of the question [2], presented GRADE's concept of quality of evidence and how to apply it [3], [4], [5], [6], [7], [8], [9], presented GRADE's approach to resource use considerations [10], described how to make overall ratings of confidence [11], and discussed Summary of Findings (SoF) tables presenting the results of binary outcomes [12]. In this thirteenth article, we address issues specific to SoF tables that report results of continuous outcomes.

Our recommendations will differ according to whether

  1. investigators have all used the same measure that is familiar to the target audiences

  2. investigators have all used the same or very similar measures that are less familiar to the target audiences

  3. investigators have used different measures

Options when investigators have all used the same measure that is familiar to the target audiences

In the simplest situation, authors of primary studies have all used the same measure of the continuous outcome of interest, and the target audiences will easily interpret that outcome. This is likely to be true, for instance, of durations of events, such as hospitalization or symptoms for conditions such as sore throat, otitis media, or influenza. For such outcomes, the SoF table should include a weighted difference of means.
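One common way to obtain a weighted difference of means is fixed-effect inverse-variance pooling. The sketch below assumes that method (the article excerpt does not specify a weighting scheme), and the function names and study data are hypothetical.

```python
import math


def mean_difference(mean_t, sd_t, n_t, mean_c, sd_c, n_c):
    """Return (difference in means, variance of that difference) for one study."""
    diff = mean_t - mean_c
    variance = sd_t ** 2 / n_t + sd_c ** 2 / n_c
    return diff, variance


def pool_inverse_variance(estimates):
    """Fixed-effect inverse-variance pooling.

    `estimates` is a list of (difference, variance) pairs.
    Returns (pooled difference, standard error of the pooled difference).
    """
    weights = [1.0 / variance for _, variance in estimates]
    total_weight = sum(weights)
    pooled = sum(w * diff for w, (diff, _) in zip(weights, estimates)) / total_weight
    standard_error = math.sqrt(1.0 / total_weight)
    return pooled, standard_error


if __name__ == "__main__":
    # Hypothetical trials reporting duration of symptoms in days.
    studies = [
        mean_difference(5.0, 2.0, 100, 6.0, 2.0, 100),
        mean_difference(4.0, 2.0, 50, 6.0, 2.0, 50),
    ]
    pooled, se = pool_inverse_variance(studies)
    lower, upper = pooled - 1.96 * se, pooled + 1.96 * se
    print(f"Pooled difference: {pooled:.2f} days (95% CI {lower:.2f} to {upper:.2f})")
```

Because every study reports the outcome in the same natural units, the pooled estimate stays in those units (here, days) and can go straight into the SoF table. A random-effects model would weight the studies differently.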

Table 1 presents examples of such outcomes from systematic reviews in

Options when investigators have all used the same or very similar measures that are less familiar to the target audiences

Transparency becomes more challenging when clinicians and patients are unfamiliar with the units of the outcome measure. For instance, Table 2 presents data derived from a systematic review addressing the impact of compression stockings for people taking long flights [16]. Outcomes include the presence of edema. Because each study used the same measurement tool for assessing edema, it is possible to make the pooled difference between the groups (the “weighted mean difference”) of 4.7 units more

Options when investigators have used different measures

Reviewers face further challenges when studies measure the same concept but use different measurement instruments. For instance, one set of trials may have measured depression using the Beck Depression Inventory-II [22], and another set may have used the Hamilton Rating Scale for Depression [23]. Under these circumstances, providing pooled estimates of effect and making results interpretable mandates use of one of five available approaches. Table 3 summarizes the merits of each approach and our

Reflections on the interpretation of the five methods

The prior discussion makes evident that there is no ideal method for making results of continuous variables interpretable, particularly when studies have used different measurement tools for the same construct (e.g., pain, physical function, emotional function). Given the sometimes questionable assumptions that each approach makes, it would be reassuring if the methods led to essentially the same inferences. This is true for the respiratory rehabilitation example: all approaches suggest a

Recommendations for enhancing interpretability in meta-analyses in which primary studies use different instruments to measure the same underlying construct

We have described five approaches to enhancing the interpretability of continuous variables in meta-analyses in which primary studies have used different instruments. Review authors will have to tailor their approach to the individual situation but may find the following guides helpful:

  1. Using more than one presentation is likely to be both informative and, if the clinical message is similar, reassuring. It can also reduce the risk of biased selection of which presentation to use when the

Conclusion

Summarizing continuous variables in ways that are both valid and interpretable is challenging. To achieve these goals, systematic review authors and guideline developers should carefully consider the approaches we have suggested.

References (36)

  • G. Guyatt et al. GRADE guidelines 11 - Making an overall rating of evidence for a single outcome and for all outcomes. J Clin Epidemiol (2013)

  • G.H. Guyatt et al. GRADE guidelines 12 - Preparing summary of findings tables (SOF) - binary outcomes. J Clin Epidemiol (2013)

  • G.H. Guyatt et al. Methods to explain the clinical significance of health status measures. Mayo Clin Proc (2002)

  • R. Jaeschke et al. Measurement of health status. Ascertaining the minimal clinically important difference. Control Clin Trials (1989)

  • S. Suissa. Binary methods for continuous outcomes: a parametric alternative. J Clin Epidemiol (1991)

  • R. Dworkin et al. Interpreting the clinical importance of treatment outcomes in chronic pain clinical trials: IMMPACT recommendations. J Pain (2008)

  • T. Furukawa. From effect size into number needed to treat. Lancet (1999)

  • G. Guyatt et al. How can quality of life researchers make their work more useful to health workers and their patients? Qual Life Res (2007)

    The GRADE system has been developed by the GRADE Working Group. The named authors drafted and revised this article. A complete list of contributors to this series can be found on the Journal of Clinical Epidemiology Web site.
