The practice of collecting biospecimens in population-based studies is becoming more widespread.1 The new combinations of data arising from these enterprises offer opportunities for gaining insight into the foundations of population health that previously could not come from clinical data alone. However, what we can and cannot learn from these rich resources needs to be understood, and respected. In this issue, the study by den Elzen and colleagues,2 who investigated the association between erythropoietin levels and mortality among participants from the 1997–1999 cohort of the Leiden 85-plus Study, is an excellent example of both the opportunities and cautions. Our focus in this commentary will be on a few related concerns: generalizability, selectivity and replication. But we begin by raising a few questions about the specific analyses.
The fundamental conclusion of den Elzen and colleagues is that elevated erythropoietin levels predict an increased risk of death among very elderly people with and without anemia. The authors are careful to point out that the gradient in association of erythropoietin was not statistically discernible among people with anemia in a model that controlled for hemoglobin level. They were also scrupulous about documenting exclusions (participants with severe renal failure or those for whom data on creatinine clearance were unavailable) that resulted in the final sample for analysis and about considering the effects of those exclusions on their findings.
The analysis illustrates an important strength of the Leiden 85-plus Study: rather than attempting to use the (relatively) small sample to discover a biomarker that is related to survival, the authors explore a previously observed association — in patients with heart failure — to see whether it holds in a population-representative sample. Their sample is characterized by high participation rates and good follow-up data on mortality from the municipal registry. However, another potential strength of their sample, namely the estimation of effect size, is not realized.
Despite the care the authors have taken and the strengths of the sample and data, a few specific questions about the analyses arise. First, we wonder why erythropoietin was treated as a categorical variable in the analyses. No a priori argument was made for a threshold effect; it seems strange to throw away information without explanation. Second — and this concern looms larger as we try to understand the implications of their findings — the authors used stratum-specific (within anemia status) tertiles of erythropoietin.
It may be that the analyses support the notion that a threshold exists: compared with the hazard ratios for the lowest erythropoietin tertile, the hazard ratios for the highest tertile are discernibly different, whereas those for the middle tertile are not. However, the authors focus on testing the significance of the linear trend across the three categories. Perhaps it was this interest that led to an unfortunate decision regarding how erythropoietin strata were defined. Because the authors established tertiles within anemia status, it is not possible to compare hazard rates across the anemia states; the tertile cut-offs for erythropoietin among people with anemia are not the same as the cut-offs for those without anemia. We might have been able to compare and interpret effect size had erythropoietin been treated as a continuous variable in the model or even if the cut-offs had been fixed across anemia status. Instead, we forgo any potential insight with regard to underlying links to anemia.
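The comparability problem with stratum-specific tertiles can be made concrete with a minimal sketch in Python. The erythropoietin values below are invented for illustration and are not data from the Leiden 85-plus Study; the helper names `tertile_cutoffs` and `assign_tertile` are likewise hypothetical. The sketch shows only that the same erythropoietin value can fall in different tertiles depending on whether cut-offs are computed within anemia status or fixed across the whole sample.

```python
import statistics

def tertile_cutoffs(values):
    """Return the two cut points that split values into three groups."""
    lower, upper = statistics.quantiles(values, n=3)
    return lower, upper

def assign_tertile(value, cutoffs):
    """Return 1, 2 or 3 according to where value falls relative to the cut-offs."""
    lower, upper = cutoffs
    if value <= lower:
        return 1
    if value <= upper:
        return 2
    return 3

# Invented values for illustration only (assumption: erythropoietin
# tends to be higher among people with anemia).
epo_no_anemia = [6, 7, 8, 9, 10, 11, 12, 13, 14]
epo_anemia = [12, 15, 18, 21, 24, 27, 30, 33, 36]

# Stratum-specific cut-offs: computed separately within anemia status.
within_no_anemia = tertile_cutoffs(epo_no_anemia)
within_anemia = tertile_cutoffs(epo_anemia)

# Fixed cut-offs: computed once on the pooled sample and applied to both strata.
pooled = tertile_cutoffs(epo_no_anemia + epo_anemia)

example_value = 15
print("Without anemia, stratum-specific tertile:", assign_tertile(example_value, within_no_anemia))
print("With anemia, stratum-specific tertile:", assign_tertile(example_value, within_anemia))
print("Either stratum, pooled tertile:", assign_tertile(example_value, pooled))
```

With these invented numbers, a value of 15 sits in the highest tertile among people without anemia but in the lowest tertile among people with anemia, so "highest tertile" does not denote the same erythropoietin range in the two strata. Fixed cut-offs, or a continuous term in the model, would avoid that ambiguity.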
Apart from these quibbles about the analyses, however, the study by den Elzen and colleagues raises questions that are relevant to a wide range of analyses of biomarkers in population-representative samples. We need to think carefully about what we mean by “representative.” This sample, for example, comprised people who resided in Leiden, the Netherlands, and turned 85 years old between 1997 and 1999. Even in a low-mortality country such as the Netherlands, only about 40% of the population survives to age 85 (specifically, in 2006, with an expectation of life of 80 years, the proportion surviving to age 85 is 0.42).3 Many, perhaps most, of the participants lived through the Dutch famine of 1944. Are the results affected by that exposure? By their age at that exposure? How far can we generalize the results? Would we, for example, have seen similar results at age 65? What about a cohort of 85-year-old Canadians? Or Japanese? Would the results even be replicated in another cohort of Leiden residents of the same age? This last question might be addressed by the authors: our understanding is that there are two cohorts in the Leiden 85-plus Study: the 1997–1999 cohort included in the current study and a 1987–1989 cohort.4 The sample may be representative of the population,5 but the population itself may be highly selected because of, for example, mortality or cohort-specific exposures.
More generally, it would be hard to overstate the importance of replication. In this case, in particular, we do not have a clear causal model that provides us with some confidence regarding the applicability of the findings to other populations. Thus, repeating the analyses with other data would help clarify whether we are looking at a universal physiologic association or a more limited result. The need for replication means that, as a field, we need to be making our data as widely available as possible.6 Easier said than done. Our own efforts to provide broader access to data are moving at a glacial pace — appropriate protections for data confidentiality and the privacy of participants are complex, and data editing and documentation take time — but the goal is worthwhile.
Key points
- Analyses of biomarkers in population-based studies may offer insight into the foundations of population health that previously could not come from clinical data alone.
- Replication is crucial in understanding whether results are generalizable or are limited to a particular population with a particular history.
- The need for replication underscores the need for data sharing, with appropriate protections for privacy and confidentiality.
Acknowledgement
Work of the authors is supported by the Behavioural and Social Research Program (Demography and Epidemiology Unit) of the US National Institute on Aging (grant no. R01AG16661) and by Georgetown University.
Footnotes
- See related research article by den Elzen and colleagues, page 1953
- Previously published at www.cmaj.ca
- This commentary was solicited and has not been peer reviewed.
- Competing interests: Maxine Weinstein has received consultancy fees and research grants from the US National Institutes of Health and has received speaker fees and travel assistance from RAND Corporation. No competing interests declared by Dana Glei.
- Contributors: Both authors contributed to the substance and conceptualization of the remarks. Maxine Weinstein drafted the manuscript, and Dana Glei critically revised it for important intellectual content. Both authors approved the final version of the manuscript.