Abstract
BACKGROUND: In 1968, Wilson and Jungner published 10 principles of screening that often represent the de facto starting point for screening decisions today; 50 years on, are these principles still the right ones? Our objectives were to review published work that presents principles for population-based screening decisions since Wilson and Jungner’s seminal publication, and to conduct a Delphi consensus process to assess the review results.
METHODS: We conducted a systematic review and modified Delphi consensus process. We searched multiple databases for articles published in English in 1968 or later that were intended to guide population-based screening decisions, described development or modification of principles, and presented principles as a set or list. Identified sets were compared for basic characteristics (e.g., number, categorization), a citation analysis was conducted, and principles were iteratively synthesized and consolidated into categories to assess evolution. Participants in the consensus process rated their level of agreement with the importance and interpretability of the consolidated screening principles.
RESULTS: We identified 41 sets and 367 unique principles. Each unique principle was coded to 12 consolidated decision principles that were further categorized as disease/condition, test/intervention or program/system principles. Program or system issues were the focus of 3 of Wilson and Jungner’s 10 principles, but comprised almost half of all unique principles identified in the review. The 12 consolidated principles were assessed through 2 rounds of the consensus process, leading to specific refinements to improve their relevance and interpretability. No gaps or missing principles were identified.
INTERPRETATION: Wilson and Jungner’s principles are remarkably enduring, but increasingly reflect a truncated version of contemporary thinking on screening that does not fully capture subsequent focus on program or system principles. Ultimately, this review and consensus process provides a comprehensive and iterative modernization of guidance to inform population-based screening decisions.
In 1968, Wilson and Jungner published Principles and Practice of Screening for Disease,1 a seminal work that highlighted 10 principles that should be considered when making a screening decision (Box 1). These screening principles were set out as normative statements regarding what should be known about the relative importance of a health problem, the natural progression of the disease or condition, the characteristics of available screening tests and follow-up treatments, and the cost-effectiveness of screening, before proceeding with a screening decision. Health care professionals, screening experts and policy-makers from all parts of the world use these principles to guide screening decisions. But despite the popularity of these principles, screening decisions remain challenging.2,3 Recent controversies regarding screening for cancer4–6 and screening in newborns7 highlight the persistent complexity of screening decisions and the intense scrutiny under which they are made. The Wilson and Jungner principles of screening often represent the de facto starting point for these inherently contentious and costly sets of decisions. But after almost 50 years, are these principles still the right ones? Since their original publication, there has not been a systematic attempt to examine how screening principles have evolved or an assessment of what constitutes a comprehensive set of screening principles to guide contemporary screening decisions.
Box 1: Wilson and Jungner’s principles of screening1
1. The condition sought should be an important health problem.
2. The natural history of the condition, including development from latent to declared disease, should be adequately understood.
3. There should be a recognizable latent or early symptomatic stage.
4. There should be a suitable test or examination.
5. The test should be acceptable to the population.
6. There should be an agreed policy on whom to treat as patients.
7. There should be an accepted treatment for patients with recognized disease.
8. Facilities for diagnosis and treatment should be available.
9. The cost of case-finding (including diagnosis and treatment of patients diagnosed) should be economically balanced in relation to possible expenditure on medical care as a whole.
10. Case-finding should be a continuing process and not a “once and for all” project.
The objectives of this study were to review published work that presents principles for guiding population-based screening decisions since the publication of Wilson and Jungner’s principles in 1968, and to conduct a Delphi consensus process to assess a synthesis of the review results.
Methods
We employed a systematic review to identify, synthesize and consolidate existing principles of screening, followed by a modified Delphi consensus process with international screening experts to assess their level of agreement with the importance and interpretability of the consolidated screening principles.
For the first study objective, we conducted a systematic review of English-language literature published from 1968 to September 2015. Multiple databases, including MEDLINE (1968 to September 2015), Embase (1968 to September 2015), the Cumulative Index to Nursing and Allied Health Literature (CINAHL) (1981 to September 2015), Google Scholar and Google Books (September 2015), were searched. The search strategy was database-specific and included, where applicable, a combination of subject headings and free-text terms for “screening,” “guidance” and “decision or policy-making” in the title and abstract (Appendix 1, available at www.cmaj.ca/lookup/suppl/doi:10.1503/cmaj.171154/-/DC1, presents search strategies for MEDLINE, Embase, CINAHL, Google Scholar and Google Books). To augment the search strategy, a cited reference search of the Web of Science database (1968 to September 2015) was used to identify articles citing the Wilson and Jungner principles. Reference lists of all included citations identified were also reviewed.
We used a 2-stage process to select articles, including an initial screen of titles and abstracts followed by full-text review of the subset of articles passing the initial screen. Three research assistants and an investigator (M.J.D.) were responsible for double screening all titles and abstracts, and all articles subjected to full-text review were reviewed by an investigator (M.J.D.) and 1 of the 3 research assistants. Inclusion criteria required that articles (a) were published in English, (b) were published in 1968 or later, (c) were intended to provide guidance on making population-based screening decisions, (d) described the development or modification of principles, criteria, questions or frameworks for screening and (e) were clearly presented as a set or list.
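The five inclusion criteria act as a conjunctive filter: an article is retained only if every criterion holds. A minimal sketch of that logic follows. The `Article` record, its field names and the second example record are hypothetical and are not drawn from the study's data set.

```python
from dataclasses import dataclass


@dataclass
class Article:
    """Hypothetical record for one article at the full-text review stage."""
    title: str
    language: str
    year: int
    guides_population_screening: bool      # criterion (c)
    develops_or_modifies_principles: bool  # criterion (d)
    presented_as_set_or_list: bool         # criterion (e)


def meets_inclusion_criteria(article: Article) -> bool:
    """Return True only if all five criteria, (a) to (e), are satisfied."""
    return (
        article.language == "English"                # (a)
        and article.year >= 1968                     # (b)
        and article.guides_population_screening      # (c)
        and article.develops_or_modifies_principles  # (d)
        and article.presented_as_set_or_list         # (e)
    )


# Illustrative records only; the second is invented for this sketch.
candidates = [
    Article("Principles and Practice of Screening for Disease",
            "English", 1968, True, True, True),
    Article("Hypothetical application of existing criteria",
            "English", 2010, True, False, True),
]
included = [a.title for a in candidates if meets_inclusion_criteria(a)]
print(included)
```

The second record fails criterion (d), which mirrors the review's exclusion of work that only applied existing principles.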
For each included article, the specific set or list of individual screening principles (meeting inclusion criterion “e”) and the reference list were extracted into a data set by an investigator (M.J.D.) and reviewed by the research team. The data set was analyzed in 3 ways. First, the identified sets of principles were compared in terms of their basic characteristics (e.g., number, categorization). Second, a citation analysis was conducted to document formal linkages (e.g., co-authorship of multiple sets of screening principles) and cross-citations. Third, all individual principles identified were iteratively synthesized and consolidated into separate thematic categories. The synthesis and consolidation was led by an investigator (M.J.D.) and independently reviewed by a second investigator (R.C.), with discrepancies discussed at multiple time points with the research team until consensus was achieved.
For the second study objective, a 2-round modified Delphi consensus process followed the systematic review to assess the level of agreement with, and inform refinements to, the consolidated screening principles.8–10 Both rounds were conducted via online survey (www.surveymonkey.com). We identified international screening experts from 4 sources: the lead authors of the included articles identified from our systematic review; members of the International Cancer Screening Network; a targeted Web search of screening councils and networks and screening-related conferences; and input from members of our research team and colleagues. We sought experts who represented a range of fields and disciplines (e.g., medicine, public health, health care management and policy, and health economics), areas of screening (e.g., cancer, prenatal, newborn and infectious disease), and countries. Fifty-six international screening experts received an invitation to participate in the consensus process. Eighteen participated in round 1, and 12 of them also participated in round 2. The participants represented 11 countries and a range of fields and disciplines, and diseases and conditions.
Participants were asked to assess each of the 12 consolidated principles, rating their level of agreement (on a 7-point Likert scale from strongly agree to strongly disagree) with 2 statements: “This principle is important and relevant for population-based programmatic screening decisions” (hereafter referred to as “importance”) and “This principle is clearly defined and understandable” (hereafter referred to as “interpretability”). Participants were also asked to provide qualitative feedback on each principle and, at the end of the survey, were asked to note any apparent gaps, to identify missing principles in the consolidated set and to offer any other general comments. We assessed both consensus (level of agreement within rounds) and stability (consistency of results between rounds)10 of the Delphi process using descriptive statistics (i.e., frequency distributions, measures of central tendency and dispersion) and assessment of the nature and number of participants’ qualitative feedback.10,11 Appendix 2 (available at www.cmaj.ca/lookup/suppl/doi:10.1503/cmaj.171154/-/DC1) provides additional methodologic details on the consensus process, including a summary of participant characteristics.
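As an illustration of how consensus within a round and stability between rounds might be summarized for a single principle, the sketch below computes the median and interquartile range of Likert ratings in each round. The ratings are invented, the numeric coding (1 = strongly disagree, 7 = strongly agree) is an assumption, and the interquartile range is only one possible dispersion measure; the article does not specify which measures were used.

```python
import statistics

# Hypothetical ratings for one principle on a 7-point Likert scale,
# coded here (by assumption) as 1 = strongly disagree ... 7 = strongly agree.
round1 = [7, 6, 6, 5, 7, 4, 6, 7, 5, 6, 6, 7, 3, 6, 5, 7, 6, 6]  # 18 raters
round2 = [7, 6, 6, 6, 7, 6, 7, 6, 7, 6, 6, 7]                    # 12 raters


def summarize(ratings):
    """Return the median (central tendency) and IQR (dispersion) of ratings."""
    q1, _, q3 = statistics.quantiles(ratings, n=4, method="inclusive")
    return {"median": statistics.median(ratings), "iqr": q3 - q1}


r1 = summarize(round1)
r2 = summarize(round2)

# Consensus: a narrower IQR means ratings cluster more tightly within a round.
# Stability: a small change in the median means results are consistent
# from one round to the next.
median_shift = r2["median"] - r1["median"]
iqr_change = r2["iqr"] - r1["iqr"]
print(r1, r2, median_shift, iqr_change)
```

A negative `iqr_change` together with a `median_shift` near zero would correspond to the pattern described as narrowing dispersion with stable agreement.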
Ethics approval
The University of Toronto Research Ethics Board gave ethics approval for the study.
Results
The review identified 41 articles (including the Wilson and Jungner principles) that met the inclusion criteria (Figure 1). These articles included 41 distinct sets of screening principles that accounted for 367 unique principles. The 41 sets exhibited considerable variability in their number and categorization (Table 1). Whereas Wilson and Jungner originally set out 10 principles, subsequent versions varied from 5 to 23. Clark and Reintgen20 were the first to group the principles into distinct categories, with a number of subsequent works25,26,30,31,35,41,44,50 similarly categorizing them by disease or condition, screening test, treatment or screening program. Three of the most recently published sets of principles showed further evolution. In the context of genetic screening, Andermann and colleagues added a number of subcategorizations to a distinct overarching structure (i.e., laboratory testing, clinical services and program management);43 Martin-Moreno and colleagues used 4 health system elements — governance, finance, resource generation and service delivery — to organize their approach to cancer-screening decisions;46 and the most recent screening principles from the UK National Screening Committee added a fifth category for implementation criteria.51
Figure 1: Selection of articles for inclusion. CINAHL = Cumulative Index to Nursing and Allied Health Literature.
Table 1: Characteristics of included sets of screening principles
The citation analysis showed many apparent inconsistencies in the evolution of the documented screening principles we identified in the review. Substantial modifications to the Wilson and Jungner principles were observed as early as the 1970s;13 however, some later formulations contained only very minor departures from the original work.25,40 Only the New Zealand National Health Committee linked the development of their own screening principles to more than 4 of the other 40 identified works.34 Further, whereas 31 identified sets of screening principles cited the Wilson and Jungner principles, another 8 did not cite Wilson and Jungner or any of the other identified sets of screening principles. Beyond Wilson and Jungner’s screening principles, only Cochrane and Holland’s were cited by more than 5 of the other identified sets of principles.12 Appendix 3 (available at www.cmaj.ca/lookup/suppl/doi:10.1503/cmaj.171154/-/DC1) graphically presents the citation analysis for the 41 sets of screening principles.
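The cross-citation counts reported above amount to tallying, for each identified set, how many of the other identified sets cite it, and flagging sets that cite none of the others. A minimal sketch of that tally follows; the set labels and citation links are placeholders and do not reproduce the study's citation data.

```python
from collections import Counter

# Hypothetical citation data: each key is an identified set of principles and
# each value lists the other identified sets that it cites.
cites = {
    "Set A (1968)": [],
    "Set B": ["Set A (1968)"],
    "Set C": ["Set A (1968)", "Set B"],
    "Set D": [],
    "Set E": ["Set A (1968)", "Set B", "Set C"],
}


def cited_by_counts(cites):
    """Count how many other identified sets cite each set."""
    counts = Counter({name: 0 for name in cites})
    for source, targets in cites.items():
        for target in set(targets):  # ignore duplicate citations
            if target != source and target in cites:
                counts[target] += 1
    return counts


def sets_citing_no_others(cites):
    """List sets that cite none of the other identified sets."""
    return [
        name for name, targets in cites.items()
        if not set(targets) & (cites.keys() - {name})
    ]


counts = cited_by_counts(cites)
uncited_sources = sets_citing_no_others(cites)
print(counts.most_common(), uncited_sources)
```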
Based on our synthesis and consolidation, the 367 unique principles were coded into 12 distinct categories of consolidated principles, each of which was then aligned with 1 of 3 overarching domains: disease/condition, test/intervention or program/system principles. The 3 consolidated principles in the disease/condition domain address the relative importance and burden of the disease or condition and who should be targeted for screening. The 3 consolidated principles in the test/intervention domain address the characteristics of screening tests (as opposed to screening programs) (e.g., test safety, performance, acceptability and simplicity), the interpretation of screening test results and who should be targeted for different types of post-screening intervention. The 6 consolidated principles in the program/system domain address characteristics (e.g., infrastructure, coordination, integration, ethics, acceptability, benefits and harms) of screening programs (as opposed to screening tests), as well as economic evaluation and performance management of the programs. Appendix 4 (available at www.cmaj.ca/lookup/suppl/doi:10.1503/cmaj.171154/-/DC1) presents a comprehensive mapping of each unique principle identified to a corresponding consolidated principle.
Since the Wilson and Jungner principles were released in 1968, many of the subsequent publications expanded the focus to program/system principles. Seven of the 10 Wilson and Jungner principles focused on disease/condition or test/intervention principles, while the 3 subsequent sets of principles in the ensuing 16 years maintained a similar focus on disease/condition and/or test/intervention principles. But thereafter, most sets of principles included more program or system principles, with 46% of all individual principles reviewed, and 6 of the 12 (50%) consolidated principles, categorized as program/system principles.
For assessment of the importance of the principles, the median level of agreement was high in both rounds of the consensus process, with agreement strengthening from round 1 to round 2 (Appendix 2C). For most principles, measures of dispersion converged from round 1 to round 2, indicating stable consensus. Qualitative comments noted the need to acknowledge the effect of specific health system organization and structure (e.g., the intent of integration or coordination of the screening program will differ based on health system arrangements) and varying relevance of principles to specific types of screening test (e.g., genetic testing). For interpretability, the median level of agreement was also high but generally lower than for importance (Appendix 2D). The interpretability ratings improved from round 1 to round 2 for all but 2 principles, and measures of dispersion narrowed for all but 1 principle. Most of the qualitative feedback suggested specific edits to the wording or language.
In response to direct questioning, none of the Delphi participants identified gaps or missing principles in their review of the consolidated principles. Given the high levels of agreement and overall strengthening and convergence of consensus from round 1 to round 2, and the overall decrease in qualitative comments from round 1 to round 2, we did not conduct a third round. Drawing on round 2 feedback, we made final refinements to create the final set of consolidated screening principles (Table 2).
Table 2: Final refined set of consolidated screening principles
Interpretation
Our review of the evolution of screening principles since Wilson and Jungner’s original principles in 1968 showed a lack of coordinated progression among subsequent sets of screening principles and limited acknowledgement of other related work, with the Wilson and Jungner principles persisting as the default guide for contemporary screening decisions. This systematic review and consensus process summarizes these disparate efforts and contemporary perspectives, ultimately creating a more comprehensive list of screening principles than has been produced to date.
Although Wilson and Jungner’s principles of screening were ahead of their time, our review showed a shift in subsequent sets of screening principles toward more operational and implementation issues. For example, there was increasing focus on infrastructure requirements and resource or system capacity, the coordination of screening program components and their integration with the broader health care system, the acceptability of screening programs (as opposed to screening tests) to participants and society, and program and performance management systems. Wilson and Jungner’s 10 principles aligned with 8 of the 12 consolidated screening principles developed through our review and synthesis. Of the 4 consolidated principles that did not have a corresponding Wilson and Jungner principle, 3 addressed program/system principles (i.e., screening program acceptability and ethics, screening program benefits and harms, and screening program quality and performance management). This expansion in the focus of subsequent screening principles was consistent with high importance ratings for program/system principles noted from the modified Delphi consensus process. When directly compared, only 3 of the 10 (30%) Wilson and Jungner principles address program or system issues, whereas 6 of the 12 (50%) consolidated principles, and almost half (46%) of all remaining individual principles identified through the review address program or system issues (Appendix 4).
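The direct comparison above reduces to the share of principles in each set that fall in the program/system domain. The short sketch below reproduces that arithmetic from the counts reported in the text; the dictionary layout and function name are illustrative choices only.

```python
# Counts of principles per domain, taken from the figures reported in the text.
wilson_jungner = {
    "disease/condition or test/intervention": 7,
    "program/system": 3,
}
consolidated = {
    "disease/condition": 3,
    "test/intervention": 3,
    "program/system": 6,
}


def program_system_share(counts):
    """Percentage of principles in a set that fall in the program/system domain."""
    total = sum(counts.values())
    return 100 * counts["program/system"] / total


print(f"Wilson and Jungner: {program_system_share(wilson_jungner):.0f}%")
print(f"Consolidated set:   {program_system_share(consolidated):.0f}%")
```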
The shifting emphasis toward program/system principles implies a shift in the types of evidence that could be used to inform screening decisions. The evidence base for disease/condition and test/intervention principles is typically characterized by high-quality experimental or observational studies, whereas the evidence base for addressing program/system principles is much less developed and more context dependent.52,53 This affects which experts and stakeholders are best positioned to identify, interpret and apply a broader and more diverse evidence base. Although the evidence for disease/condition and test/intervention principles may be best assessed by clinical and epidemiologic experts, assessing the evidence for program/system principles would require a more diverse set of experts and stakeholders, including health service program managers, policy analysts, information system specialists, health economists, ethicists and members of both average-risk and high-risk population groups. The dual influences of increasing demand for high-quality evidence (which favours types of evidence that are more likely to be available for disease/condition or test/intervention principles) and continued use of the Wilson and Jungner principles (which do not fully capture the expanded focus on program/system principles) could lead to de-emphasis of program/system principles and suboptimal decision-making.
Limitations
A few limitations of this study should be noted. First, the review focused on published work that explicitly developed new or refined existing sets of screening principles. The review did not include published work describing the application of existing sets of principles for a specific population-based screening recommendation, which might have included insights on interpretations of those principles. Because of feasibility constraints, the review also excluded non–English language work; however, contributions from non-English-speaking countries were included, and international participants from 11 countries (7 of which were non-English speaking) had the opportunity to add principles through the Delphi consensus process. Second, data extraction of screening principles from included articles was conducted by only 1 investigator; however, as one of our inclusion criteria required the principles to be clearly presented as a set or list, data extraction was straightforward and subject to minimal interpretation and, ultimately, the research team reviewed and confirmed the inclusion of all sets extracted. Third, we did not appraise the quality of included articles. We are not aware of dedicated guidance on how to assess the quality of screening principle development, and existing tools to assess development processes of clinical practice guidelines (e.g., AGREE II [Appraisal of Guidelines, Research and Evaluation]54) do not emphasize population-based factors relevant to screening. However, the intent was not to assess relative quality of the sets of principles identified, but rather to assess the evolution and breadth of screening principles, using synthesis and consensus methods to scrutinize their relative importance for screening decisions.
Fourth, as many of the sets of principles were published in books or reports, we needed to complement our search of traditional databases (e.g., MEDLINE) with contemporary search databases (e.g., Google Scholar) that lack the consistency and reproducibility of results that traditional databases offer.55 However, to supplement these searches, we used cited reference searches and extensive manual reviews of reference lists. Fifth, the response rate for the consensus process component of the study was 32%. While Delphi consensus processes do not rely on the representativeness of the sample or statistical generalization, it is important that the recruitment process identifies experts who have a profound understanding of the topic of interest and can provide critical assessment, ideally from a range of perspectives.8 Although the response rate was relatively low, 15 of the 18 Delphi participants (83%) had 15 or more years of experience with screening, and represented the range of fields or disciplines, areas of screening experience and international perspectives (from 11 countries) that we sought. Lastly, it is important to note that although we detected a shift in the emphasis of screening principles over time, our review methods do not allow us to determine the potential influence of publication biases that may favour articles proposing changes to Wilson and Jungner’s principles over articles supporting the original principles. However, to mitigate this limitation, we used the modified Delphi consensus process to subject the review results to critical assessment by a diverse and experienced group of international screening experts.
Conclusion
Screening decisions continue to be challenging. Whereas attention is often driven by emergent evidence, there is a corresponding need to apply a clear logic and consistent rationale to guide the use of varied types of evidence in support of screening decisions. Wilson and Jungner’s suggestion that “[s]ome knowledge of principles and of what it entails in practice should form part of the intellectual equipment of all concerned with the control of disease and the maintenance of health”1 is as important as ever, but approaches to updating and refining these principles require ongoing attention. Almost 50 years on, the Wilson and Jungner principles are still driving decision-making for screening, but their age is beginning to show. Although these 10 individual principles have been remarkably enduring, increasingly, they reflect a truncated version of contemporary thinking on screening that does not fully capture the extended focus of subsequent work toward program and system considerations, in particular the evolving complexity required to develop and implement the necessary infrastructure for population-wide screening. This systematic review and modified Delphi consensus process brings together many disparate efforts and perspectives to represent a comprehensive and iterative modernization of guidance for discourse and deliberations on population-based screening that we hope will contribute to more informed decisions in the future.
Acknowledgements
The authors appreciate the contributions of the participants of the modified Delphi consensus process. The authors also thank Hannah Geddie and Leena Saeed for supporting the systematic review, Angela Du for supporting the modified Delphi consensus process, and Tony Culyer, Muir Gray and Fiona Miller for thoughtful feedback on earlier drafts of the manuscript.
Footnotes
Competing interests: None declared.
This article has been peer reviewed.
Contributors: Mark Dobrow made substantial contributions to the conception or design of the work; all of the authors contributed to the acquisition, analysis or interpretation of data. Mark Dobrow drafted the manuscript, which all of the authors revised. All of the authors gave final approval of the version to be published and agreed to be accountable for all aspects of the work.
Funding: A Canadian Institutes of Health Research Team Grant in Population-Based Colorectal Cancer Screening provided funding for this project. The funder had no role in the design or conduct of the study, the analysis or interpretation of the data, or the preparation, review or approval of the manuscript.