Imaging modalities for the non‐invasive diagnosis of endometriosis

Background

About 10% of women of reproductive age suffer from endometriosis. Endometriosis is a costly chronic disease that causes pelvic pain and subfertility. Laparoscopy, the gold standard diagnostic test for endometriosis, is expensive and carries surgical risks. Currently, no non‐invasive tests that can be used to accurately diagnose endometriosis are available in clinical practice. This is the first review of diagnostic test accuracy of imaging tests for endometriosis that uses Cochrane methods to provide an update on the rapidly expanding literature in this field.

Objectives

• To provide estimates of the diagnostic accuracy of imaging modalities for the diagnosis of pelvic endometriosis, ovarian endometriosis and deeply infiltrating endometriosis (DIE) versus surgical diagnosis as a reference standard.

• To describe performance of imaging tests for mapping of deep endometriotic lesions in the pelvis at specific anatomical sites.

Imaging tests were evaluated as replacement tests for diagnostic surgery and as triage tests that would assist decision making regarding diagnostic surgery for endometriosis.

Search methods

We searched the following databases to 20 April 2015: MEDLINE, CENTRAL, EMBASE, CINAHL, PsycINFO, Web of Science, LILACS, OAIster, TRIP, ClinicalTrials.gov, MEDION, DARE, and PubMed. Searches were not restricted to a particular study design or language nor to specific publication dates. The search strategy incorporated words in the title, abstracts, text words across the record and medical subject headings (MeSH).

Selection criteria

We considered published peer‐reviewed cross‐sectional studies and randomised controlled trials of any size that included prospectively recruited women of reproductive age suspected of having one or more of the following target conditions: endometrioma, pelvic endometriosis, DIE or endometriotic lesions at specific intrapelvic anatomical locations. We included studies that compared the diagnostic test accuracy of one or more imaging modalities versus findings of surgical visualisation of endometriotic lesions.

Data collection and analysis

Two review authors independently collected and performed a quality assessment of data from each study. For each imaging test, data were classified as positive or negative for surgical detection of endometriosis, and sensitivity and specificity estimates were calculated. If two or more tests were evaluated in the same cohort, each was considered as a separate data set. We used the bivariate model to obtain pooled estimates of sensitivity and specificity when sufficient data sets were available. Predetermined criteria for a clinically useful imaging test to replace diagnostic surgery included sensitivity ≥ 94% and specificity ≥ 79%. Criteria for triage tests were set at sensitivity ≥ 95% and specificity ≥ 50%, ruling out the diagnosis with a negative result (SnNout test ‐ if sensitivity is high, a negative test rules out pathology) or at sensitivity ≥ 50% with specificity ≥ 95%, ruling in the diagnosis with a positive result (SpPin test ‐ if specificity is high, a positive test rules in pathology).
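As an illustration only (it is not part of the review's methods), the predetermined criteria above can be written as a short Python sketch. The names `CRITERIA` and `classify` are hypothetical, and the check is applied to point estimates alone, ignoring confidence intervals, which is a simplifying assumption.

```python
# Predetermined thresholds from the text, in percentage points:
# (minimum sensitivity, minimum specificity).
CRITERIA = {
    "replacement": (94, 79),
    "SnNout triage": (95, 50),
    "SpPin triage": (50, 95),
}


def classify(sensitivity, specificity):
    """Return the test roles whose criteria are met by the point estimates.

    Inputs are proportions (e.g. 0.93). They are converted to percentage
    points and rounded to avoid floating-point artefacts at the thresholds.
    """
    sens = round(sensitivity * 100, 2)
    spec = round(specificity * 100, 2)
    return [
        name
        for name, (min_sens, min_spec) in CRITERIA.items()
        if sens >= min_sens and spec >= min_spec
    ]


# A test with sensitivity 0.93 and specificity 0.96 qualifies only as SpPin.
print(classify(0.93, 0.96))
```

Because a pooled estimate can meet a threshold while its confidence interval extends well below it, a classification of this kind is best read alongside the interval and the quality of the underlying studies.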

Main results

We included 49 studies involving 4807 women: 13 studies evaluated pelvic endometriosis, 10 endometriomas and 15 DIE, and 33 studies addressed endometriosis at specific anatomical sites. Most studies were of poor methodological quality. The most studied modalities were transvaginal ultrasound (TVUS) and magnetic resonance imaging (MRI), with outcome measures commonly demonstrating diversity in diagnostic estimates; however, sources of heterogeneity could not be reliably determined. No imaging test met the criteria for a replacement or triage test for detecting pelvic endometriosis, albeit TVUS approached the criteria for a SpPin triage test. For endometrioma, TVUS (eight studies, 765 participants; sensitivity 0.93 (95% confidence interval (CI) 0.87, 0.99), specificity 0.96 (95% CI 0.92, 0.99)) qualified as a SpPin triage test and approached the criteria for a replacement and SnNout triage test, whereas MRI (three studies, 179 participants; sensitivity 0.95 (95% CI 0.90, 1.00), specificity 0.91 (95% CI 0.86, 0.97)) met the criteria for a replacement and SnNout triage test and approached the criteria for a SpPin test. For DIE, TVUS (nine studies, 12 data sets, 934 participants; sensitivity 0.79 (95% CI 0.69, 0.89) and specificity 0.94 (95% CI 0.88, 1.00)) approached the criteria for a SpPin triage test, and MRI (six studies, seven data sets, 266 participants; sensitivity 0.94 (95% CI 0.90, 0.97), specificity 0.77 (95% CI 0.44, 1.00)) approached the criteria for a replacement and SnNout triage test. Other imaging tests assessed in small individual studies could not be statistically evaluated.

TVUS met the criteria for a SpPin triage test in mapping DIE to uterosacral ligaments, rectovaginal septum, vaginal wall, pouch of Douglas (POD) and rectosigmoid. MRI met the criteria for a SpPin triage test for POD and vaginal and rectosigmoid endometriosis. Transrectal ultrasonography (TRUS) might qualify as a SpPin triage test for rectosigmoid involvement but could not be adequately assessed for other anatomical sites because heterogeneous data were scant. Multi‐detector computerised tomography enema (MDCT‐e) displayed the highest diagnostic performance for rectosigmoid and other bowel endometriosis and met the criteria for both SpPin and SnNout triage tests, but studies were too few to provide meaningful results.

Diagnostic accuracies were higher for TVUS with bowel preparation (TVUS‐BP) and rectal water contrast (RWC‐TVS) and for 3.0T MRI than for conventional methods, although the paucity of studies precluded statistical evaluation.

Authors' conclusions

None of the evaluated imaging modalities were able to detect overall pelvic endometriosis with enough accuracy that they would be suggested to replace surgery. Specifically for endometrioma, TVUS qualified as a SpPin triage test. MRI displayed sufficient accuracy to suggest utility as a replacement test, but the data were too scant to permit meaningful conclusions. TVUS could be used clinically to identify additional anatomical sites of DIE compared with MRI, thus facilitating preoperative planning. Rectosigmoid endometriosis was the only site that could be accurately mapped by using TVUS, TRUS, MRI or MDCT‐e. Studies evaluating recent advances in imaging modalities such as TVUS‐BP, RWC‐TVS, 3.0T MRI and MDCT‐e were observed to have high diagnostic accuracies but were too few to allow prudent evaluation of their diagnostic role. In view of the low quality of most of the included studies, the findings of this review should be interpreted with caution. Future well‐designed diagnostic studies undertaken to compare imaging tests for diagnostic test accuracy and costs are recommended.

Imaging tests for the non‐invasive diagnosis of endometriosis

Review question

How accurate are imaging tests in detecting endometriosis? Can any imaging test be accurate enough to replace or reduce the need for surgery in the diagnosis of endometriosis?

Background

Women with endometriosis have endometrial tissue (the tissue that lines the womb and is shed during menstruation) growing outside the womb within the pelvis, causing chronic abdominal pain and difficulty conceiving. Currently, the only reliable way of diagnosing endometriosis is to perform laparoscopic surgery and visualise the endometrial deposits inside the abdomen. Because surgery is risky and expensive, imaging tests have been assessed for their ability to detect endometriosis non‐invasively. An accurate imaging test could lead to the diagnosis of endometriosis without the need for surgery, or it could reduce the need for surgery, so only women who were most likely to have endometriosis would require it. Furthermore, if imaging tests could accurately predict the location of endometriotic lesions, surgeons would have the information they need to plan and improve their surgical approach. Other non‐invasive ways of diagnosing endometriosis by using urine, blood and endometrial and combination tests have been evaluated in separate Cochrane reviews from this series.

Study characteristics

Evidence included in this review is current to April 2015. We included 49 studies involving 4807 participants. Thirteen studies evaluated pelvic endometriosis, 10 studies ovarian endometrioma, 15 studies deep endometriosis (endometriosis deeply situated in tissues in the pelvis) and 33 studies endometriosis at specific sites within the pelvic cavity. All studies included women of reproductive age who were undergoing diagnostic surgery because they had symptoms of endometriosis.

Key results

None of the imaging methods was accurate enough to replace or reduce the need for surgery in diagnosing overall pelvic endometriosis. Transvaginal ultrasound identified ovarian endometriosis with enough accuracy to help surgeons determine whether surgery was needed, and magnetic resonance imaging (MRI) was sufficiently accurate to replace surgery in diagnosing endometrioma but was evaluated in only a small number of studies. Other imaging tests were assessed in small individual studies and could not be evaluated in a meaningful way. Transvaginal ultrasound could be used to locate more anatomical sites of deep endometriosis when compared with MRI, helping surgeons better plan an operative procedure. Endometriosis in the lower bowel appears to be relatively accurately identified by both transvaginal and transrectal ultrasound, by MRI and by multi‐detector computerised tomography enema. New types of ultrasound and MRI show a lot of promise in detecting endometriosis but studies are too few to clearly show their diagnostic value.

Quality of the evidence

Generally the studies were of low methodological quality, and most imaging techniques were assessed by only a small number of studies. Differences between studies involved how they were run, groups of women studied, ways imaging tests were performed and how surgery was undertaken.

Future research

Additional high‐quality research is needed to accurately evaluate the diagnostic potential of non‐invasive imaging tests for endometriosis.

Authors' conclusions

Implications for practice

Transvaginal ultrasound (TVUS), the most studied technique, showed only moderate sensitivity, albeit high specificity, for pelvic endometriosis and DIE. For these conditions, TVUS did not qualify as a replacement test or a triage test but approached the criteria for a SpPin triage test. In this review, the sensitivity and specificity of TVUS for detecting ovarian endometriosis were high but met the criteria only for a SpPin triage test. In clinical practice, this may mean that the presence of endometriosis (pelvic, ovarian, DIE) on TVUS could establish the diagnosis with high certainty, whereas the absence of radiological evidence of the disease could not confirm that participants are disease‐free. This is consistent with international guidelines, which recommend TVUS as first‐line investigation in conjunction with history and pelvic examination among women with suspected endometriosis, but do not recommend its use as a replacement test for diagnostic surgery (ACOG Committee on Gynecology 2010; SOGC 2010; Dunselman 2014). Publications from the past decade suggest that TVUS could accurately detect ovarian endometriosis and could qualify as a replacement test. This theory can be attributed to improved technology and growing experience and should be further validated by use of universal diagnostic criteria and refined radiological protocols.

MRI appeared to be less accurate for peritoneal disease and hence could not qualify as a clinically useful test to replace surgery for overall pelvic endometriosis, but it approached the diagnostic criteria for a replacement test for DIE. Although MRI met the criteria for a replacement test for ovarian endometriosis, evidence is scant and these findings need to be confirmed in larger numbers of studies. In practice, this means that MRI could be utilised in populations for which the risk/benefit ratio of surgery is unclear, such as adolescents, women with significant medical conditions or women with infertility but few pain symptoms of endometriosis. Conservative treatment like the continuous combined oral contraceptive pill or alternative treatments like IVF would be reasonable to consider before surgery. Although guidelines from multiple authorities suggest medical management as first‐line treatment for pelvic pain, most women would prefer to receive a definitive diagnosis before commencing potentially long‐term therapy. If therapeutic surgery is considered, reliable detection of ovarian endometriomas potentially enables surgeons to assess ovarian reserve and counsel women about fertility preservation before operating on ovarian tissue and risking a reduction in future fertility. Reliably detecting DIE could add weight to a decision to prioritise surgery, and the complexity of surgery and increased risk of complications could be discussed with the woman at the time a decision is needed to undertake surgery.

For specific anatomical sites of DIE, results of meta‐analyses suggest that TVUS could qualify as a SpPin triage test for most sites, and MRI could be utilised as a SpPin test only for POD, vaginal wall and rectosigmoid endometriosis. Currently MRI is not recommended for routine use in women with endometriosis, but it has been advocated for those with equivocal ultrasound results, for whom rectovaginal or bladder endometriosis is suspected (ACOG Committee on Gynecology 2010). We did not evaluate bladder endometriosis, but it is interesting to note that MRI did not reach the predetermined diagnostic criteria for USL and RVS endometriosis, and we did not have sufficient data to allow a recommendation on the use of MRI for anterior compartment endometriosis. The clinical utility of a reasonably reliable diagnosis of posterior compartment endometriosis could inform surgeons of the need for a general surgical presence and bowel preparation before the time of surgery. This is particularly important for detecting rectosigmoid endometriosis, as presurgical bowel preparation and surgeries that combine the expertise of gynaecologists and colorectal surgeons (or involve gynaecological surgeons with the expertise to undertake bowel surgery) can be planned preoperatively as rectosigmoid lesions are relatively reliably detected. Rectosigmoid endometriotic lesions were detected with TVUS, TRUS, MRI and MDCT‐e with sufficient accuracy (SpPin criteria for TVUS, MRI, TRUS; SpPin and SnNout criteria for MDCT‐e). Although studies were too few to allow meaningful evaluation of imaging tests used to detect other bowel endometriosis, small individual studies of TVUS, TRUS and MDCT‐e displayed similar performance to that demonstrated for rectosigmoid endometriosis.

We observed that accuracy of the TVUS appeared to be enhanced by bowel preparation (TVUS‐BP) and rectal water contrast (RWC‐TVS), whereas 3.0T MRI and MRI jelly method with introduction of ultrasonographic gel into both the rectum and the vagina yielded very high diagnostic estimates compared with other MRI modalities. This was consistent for all anatomical sites of DIE, but none of these methods were evaluated for overall pelvic endometriosis. Ultimately, an adequate imaging test is expected to have high accuracy for both diagnosis of endometriosis and presurgical mapping of DIE at specific anatomical locations to simplify the diagnostic algorithm and to reduce the costs of testing. Therefore, further evaluation of modified TVUS methods and specific MRI modalities for overall endometriosis, including peritoneal disease, and for specific anatomical sites is needed.

Data for TRUS were insufficient to permit meaningful recommendations but did not appear to be superior to those for TVUS for any type or site of endometriosis; this brings its clinical utility into question. This observation is particularly important in view of considerable discomfort for women associated with TRUS compared with TVUS.

Although diagnostic potential has been demonstrated for many imaging tests, none of the evaluated tests can be recommended for routine clinical practice, in view of the level of heterogeneity and the wide confidence intervals reported by most studies. Diagnostic estimates of imaging tests for ovarian, rectosigmoid and bowel endometriosis exhibited less heterogeneity compared with tests for other types and locations of endometriosis; this suggests greater reliability, although high/unclear risk of bias in all included studies undermines the reliability of presented results in terms of their clinical utility. We suggest cautious interpretation of presented data, which in our view cannot be used to confidently inform clinical practice. We encourage further diagnostic research with a focus on potential diagnostic tests identified in this review, in accordance with suggestions presented below for improving the quality of diagnostic research in this field.

We wish to mention that in the absence of well‐established criteria for an adequate diagnostic test, the authors of this review determined the diagnostic criteria for replacement and triage tests in a way that we believe will aid interpretation for clinically active readers. However, we encourage readers to apply different criteria according to individual clinical populations and situations.

Implications for research

Currently randomised controlled treatment trials require women with and without endometriosis to have undergone diagnostic surgery for accurate group allocation. For ethical reasons, therapeutic surgery is usually performed at the same time, potentially biasing treatment trial outcomes. Thus our current inability to diagnose and assess the progression of endometriosis in a non‐invasive way is a significant limitation in the advancement of clinical research in endometriosis.

Over the past decade, advanced ultrasonographic techniques specifically designed to identify endometriosis, such as the sliding sign, pelvic organ mobility, tenderness‐guided ultrasound and use of rectal water contrast and bowel preparation, have been observed to be associated with improvements in the diagnostic accuracy of TVUS for endometriosis. Furthermore, 3.0T MRI and the MRI 'jelly method' appear to have greater diagnostic accuracy than older MRI modalities. Studies on these methods are too few to show their value as replacement tests or triage tests for a laparoscopic diagnosis. Additional well‐designed diagnostic studies are required to establish the diagnostic test accuracy and clinical utility of these modern imaging methods.

The QUADAS quality assessment of included studies identified several weaknesses in study design that can impede objective evaluation of findings. We recommend that future authors consider (1) including large cohorts after predefining the sample size via a power calculation (Liu 2005); (2) focusing on a 'single‐gate' design that includes only a clinically relevant population (Rutjes 2005); (3) utilising a diagnostic accuracy study design that adheres to the recommendations of the Standards for Reporting of Diagnostic Accuracy (STARD) initiative (Bossuyt 2003); (4) incorporating the QUADAS checklist into the study design (Whiting 2011); (5) formally assessing interobserver and intraobserver variability; (6) establishing universally acceptable diagnostic criteria and radiological protocols; (7) utilising universally acceptable methods of performing laparoscopy (Becker 2014) as the reference standard test; (8) implementing validation techniques to assess how the results of a statistical analysis will generalise to an independent data set; (9) undertaking direct comparisons of promising tests in conjunction with cost‐effectiveness analyses; (10) applying testing to different clinical phenotypes (Vitonis 2014) rather than to women classified according to rASRM staging; and (11) assessing long‐term outcomes and lifetime healthcare costs of women who have participated in diagnostic test accuracy trials of specific diagnostic tests.

Specific opportunities for further research identified by this review include the following.

  • Evaluating the ability of TVUS and 3.0T MRI and/or MRI 'jelly method' to diagnose pelvic endometriosis, ovarian endometriosis and DIE/posterior DIE in larger high‐quality studies, utilising direct comparisons between methods in conjunction with cost‐effectiveness analyses.

  • Comparatively evaluating the diagnostic test accuracy of TVUS, TVUS‐BP and RWC‐TVS in detecting any type of endometriosis.

  • Assessing the diagnostic potential of MDCT‐e as opposed to other methods in detecting DIE/posterior DIE, rectosigmoid and bowel endometriotic lesions in larger high‐quality studies.

  • Exploring the value of sequential testing and implementing SnNout and SpPin triage tests for diagnosing endometriosis in conjunction with a cost‐effectiveness evaluation of such testing.

  • Assessing short‐ and long‐term outcomes and lifetime healthcare costs of women in diagnostic test accuracy trials that have evaluated specific diagnostic imaging tests.

Summary of findings

Summary of findings 1. Summary of findings table: diagnostic tests for endometriosis

Review question

What is the diagnostic accuracy of the imaging tests in detecting endometriosis?

Pelvic endometriosis (any site and depth of invasion)

Ovarian endometriosis

DIE

Importance

A simple and reliable non‐invasive test for endometriosis with the potential to replace laparoscopy or to triage patients to reduce surgery would minimise surgical risk and reduce diagnostic delay

Participants

Women of reproductive age (1) with suspected endometriosis and/or (2) with persistent ovarian mass and/or (3) undergoing infertility workup

Settings

Hospitals (public or private of any level): outpatient clinics (general gynaecology, reproductive medicine, pelvic pain) and/or radiology departments

Reference standard

Visualisation of endometriosis at surgery (laparoscopy or laparotomy) with or without histological confirmation

Study design

Cross‐sectional of 'single‐gate' design (n = 28) or 'two‐gate' design (n = 1); prospective enrolment; 1 study could assess more than 1 test and/or more than 1 type of endometriosis

Risk of bias and applicability concerns

Overall judgement

Poor quality of most studies (only 1 study had 'low risk' assessment in all 4 domains; Thomeer 2014)

Patient selection bias

High risk: 13 studies; unclear risk: 6 studies; low risk: 10 studies

Index test interpretation bias

High risk: 7 studies; unclear risk: 7 studies; low risk: 15 studies

Reference standard interpretation bias

High risk: 6 studies; unclear risk: 16 studies; low risk: 7 studies

Flow and timing selection bias

High risk: 9 studies; unclear risk: 2 studies; low risk: 18 studies

Applicability concerns

Concerns regarding patient selection: high concern ‐ 1 study, unclear concern ‐ 0 studies, low concern ‐ 28 studies

Concerns regarding index test: high concern ‐ 0 studies, unclear concern ‐ 0 studies, low concern ‐ 29 studies

Concerns regarding reference standard: high concern ‐ 0 studies, unclear concern ‐ 0 studies, low concern ‐ 29 studies

Diagnostic thresholds

Replacement test: sensitivity ≥ 94%; specificity ≥ 79%

SnNout triage test: sensitivity ≥ 95%; specificity ≥ 50%

SpPin triage test: sensitivity ≥ 50%; specificity ≥ 95%

Approaching criteria for 1 of the above tests: diagnostic estimates within 5% of set thresholds

Target condition | Test | N of participants; N of studies; N of data sets | Pooled estimates (95% CI) | True positives (endometriosis) | False positives (incorrectly classified as endometriosis) | False negatives (incorrectly classified as disease‐free) | True negatives (disease‐free) | Implications

Pelvic endometriosis (13 studies, 1535 participants) | TVUS | 1222 participants in 5 studies | Sens = 0.65 (0.27 to 1.00); Spec = 0.95 (0.89 to 1.00). Meta‐analysis of 4 studies after removing 1 outlier study: Sens = 0.79 (0.36 to 1.00); Spec = 0.91 (0.74 to 1.00) | 257 | 24 | 372 | 569 | Approaches the criteria for a SpPin triage test when 1 outlier study was excluded. Wide confidence intervals (CIs)

Pelvic endometriosis | MRI | 303 participants in 7 studies; 396 participants in 10 data sets | Sens = 0.79 (0.70 to 0.88); Spec = 0.72 (0.51 to 0.92) | 253 | 21 | 70 | 52 | Neither replacement nor triage test criteria met. Observation: 3.0T MRI (2 studies) demonstrated highest diagnostic accuracy

Pelvic endometriosis | 18FDG PET‐CT | 10 participants in 1 study | Not available (a) | 0 | 0 | 9 | 1 | Insufficient evidence to allow meaningful conclusions

Ovarian endometriosis (10 studies, 852 participants) | TVUS | 765 participants in 8 studies | Sens = 0.93 (0.87 to 0.99); Spec = 0.96 (0.92 to 0.99) | 182 | 28 | 16 | 539 | Meets the criteria for a SpPin triage test and approaches the criteria for a replacement and SnNout triage test. Observation: Studies published after 2006 (4 out of 5 studies) demonstrated highest diagnostic accuracy

Ovarian endometriosis | TRUS | 92 participants in 1 study | Not available (b) | 32 | 13 | 4 | 43 | Insufficient evidence to allow meaningful conclusions

Ovarian endometriosis | MRI | 179 participants in 3 studies | Sens = 0.95 (0.90 to 1.00); Spec = 0.91 (0.86 to 0.97) | 72 | 9 | 4 | 94 | Meets the criteria for a replacement and SnNout triage test, approaches the criteria for a SpPin triage test. Observation: 3.0T MRI (2 studies) demonstrated highest diagnostic accuracy. Insufficient evidence to allow meaningful conclusions

DIE/Posterior DIE (15 studies, 1493 participants) | TVUS | 934 participants in 9 studies; 1383 participants in 12 data sets | Sens = 0.79 (0.69 to 0.89); Spec = 0.94 (0.88 to 1.00) | 435 | 51 | 128 | 769 | Approaches the criteria for a SpPin triage test. Observation: TVUS‐BP (1 study) demonstrated highest diagnostic accuracy

DIE/Posterior DIE | MRI | 266 participants in 6 studies; 289 participants in 7 data sets | Sens = 0.94 (0.90 to 0.97); Spec = 0.77 (0.44 to 1.00) | 210 | 11 | 9 | 59 | Approaches the criteria for a replacement and SnNout triage test. Observation: 3.0T MRI (2 studies) and MRI jelly method (1 study) demonstrated highest diagnostic accuracy

DIE/Posterior DIE | DCBE | 69 participants in 1 study | Not available (c) | 24 | 0 | 43 | 2 | Insufficient evidence to allow meaningful conclusions

(a) For 18FDG PET‐CT in pelvic endometriosis, diagnostic estimates were sensitivity = 0.00 (0.00 to 0.34); specificity = 1.00 (0.03 to 1.00)

(b) For TRUS in ovarian endometriosis, diagnostic estimates were sensitivity = 0.89 (0.74 to 0.97); specificity = 0.77 (0.64 to 0.87)

(c) For DCBE in DIE, diagnostic estimates were sensitivity = 0.36 (0.24 to 0.48); specificity = 1.00 (0.16 to 1.00)
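The single‐study estimates in the footnotes can be reproduced from the true positive, false positive, false negative and true negative counts in the table. The Python sketch below is illustrative only: the helper names `sens_spec`, `binom_cdf` and `exact_ci` are hypothetical, and the use of an exact (Clopper‐Pearson) binomial interval is an assumption, as the table does not state how single‐study confidence intervals were derived. For the boundary cases (0 of 9, 1 of 1 and 2 of 2), this interval gives limits of about 0.34, 0.03 and 0.16, which is consistent with the footnotes.

```python
from math import comb


def sens_spec(tp, fp, fn, tn):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    return tp / (tp + fn), tn / (tn + fp)


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def exact_ci(x, n, alpha=0.05):
    """Exact two-sided interval for x events in n trials, solved by bisection."""

    def solve(k, target):
        # binom_cdf(k, n, p) decreases as p increases, so bisect on p.
        lo, hi = 0.0, 1.0
        for _ in range(60):
            mid = (lo + hi) / 2
            if binom_cdf(k, n, mid) > target:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2

    lower = 0.0 if x == 0 else solve(x - 1, 1 - alpha / 2)
    upper = 1.0 if x == n else solve(x, alpha / 2)
    return lower, upper


# TRUS for ovarian endometriosis: TP = 32, FP = 13, FN = 4, TN = 43.
trus_sens, trus_spec = sens_spec(tp=32, fp=13, fn=4, tn=43)
print(round(trus_sens, 2), round(trus_spec, 2))  # 0.89 0.77
```

With very small denominators, as for the 18FDG PET‐CT and DCBE specificity estimates, the interval spans most of the possible range, which is why these rows are reported as providing insufficient evidence.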

Summary of findings 2. Summary of findings table: surgical mapping of endometriosis to specific anatomical sites

Review question

What is the diagnostic performance of the imaging tests in mapping deep endometriotic lesions in the pelvis at specific anatomical sites?

USL endometriosis

RVS endometriosis

Vaginal wall endometriosis

POD obliteration

Anterior DIE

RS/Bowel endometriosis

Importance

Ability to diagnose DIE at specific anatomical sites at preoperative assessment helps optimise planning of surgery or guides referral to the most appropriate practice, with the potential to improve treatment outcomes

Participants

Women of reproductive age with suspected endometriosis or specifically suspected DIE

Settings

Hospitals (public or private of any level): outpatient clinics (general gynaecology, reproductive medicine, pelvic pain) and/or radiology departments

Reference standard

Visualisation of endometriosis at surgery (laparoscopy or laparotomy) with or without histological confirmation

Study design

Cross‐sectional of 'single‐gate' design (n = 33); prospective enrolment; 1 study could assess more than 1 test and/or more than 1 site of endometriosis

Risk of bias and applicability concerns

Overall judgement

Poor quality of most studies (only 1 study had 'low risk' assessment in all 4 domains; Thomeer 2014)

Patient selection bias

High risk: 16 studies; unclear risk: 6 studies; low risk: 11 studies

Index test interpretation bias

High risk: 8 studies; unclear risk: 4 studies; low risk: 21 studies

Reference standard interpretation bias

High risk: 14 studies; unclear risk: 14 studies; low risk: 5 studies

Flow and timing selection bias

High risk: 8 studies; unclear risk: 3 studies; low risk: 22 studies

Applicability concerns

Concerns regarding patient selection: high concern ‐ 0 studies, unclear concern ‐ 0 studies, low concern ‐ 33 studies

Concerns regarding index test: high concern ‐ 0 studies, unclear concern ‐ 0 studies, low concern ‐ 33 studies

Concerns regarding reference standard: high concern ‐ 0 studies, unclear concern ‐ 0 studies, low concern ‐ 33 studies

Diagnostic thresholds

Replacement test: sensitivity ≥ 94%; specificity ≥ 79%

SnNout triage test: sensitivity ≥ 95%; specificity ≥ 50%

SpPin triage test: sensitivity ≥ 50%; specificity ≥ 95%

Approaching criteria for 1 of the above tests: diagnostic estimates within 5% of set thresholds

Target condition

Test

N of participants;
N of studies;

N of data sets

Pooled estimates
(95% CI)

Outcomes

Implications

True positives

(endometriosis)

False positives (incorrectly

classified as endometriosis)

False negatives (incorrectly

classified as disease‐free)

True negatives (disease‐free)

USL endometriosis (11 studies, 997 participants)

TVUS

751 participants in 7 studies

Sens = 0.64 (0.50 to 0.79)

Spec = 0.97 (0.93 to 1.00)

136

18

63

534

Meets the criteria for a SpPin triage test

Observation: TVUS‐BP (1 study) demonstrated the highest diagnostic accuracy

TRUS

232 participants in 2 studies

Sens = 0.52 (0.29 to 0.74)

Spec = 0.94 (0.86 to 1.00)

48

8

45

131

Approchess the criteria for a SpPin triage test

Wide CIs

Insufficient evidence to allow meaningful conclusions

MRI

199 participants in 4 studies

221 participants in 5 data sets

Sens = 0.86 (0.80 to 0.92)

Spec = 0.84 (0.68 to 1.00)

136

13

22

50

Criteria for a triage test not met

Wide CIs

Observation: 3.0T MRI (1 out of 2 studies) demonstrated the highest diagnostic accuracy

RVS endometriosis (12 studies, 1215 participants)

TVUS

983 participants in 10 studies

1073 participants in 11 data sets

Sens = 0.88 (0.82 to 0.94)

Spec = 1.00 (0.98 to 1.00)

263

10

59

741

Meets the criteria for a SpPin triage test

Observation: TVUS‐BP (3 studies) and RWC‐TVS (1 study) demonstrated the highest diagnostic accuracy

TRUS

232 participants in 2 studies

Sens = 0.78 (0.51 to 1.00)

Spec = 0.96 (0.89 to 1.00)

35

8

10

179

Meets the criteria for a SpPin triage test

Insufficient evidence to allow meaningful conclusions

MRI

288 participants in 3 studies

Sens = 0.81 (0.70 to 0.93)

Spec = 0.86 (0.78 to 0.95)

96

23

22

147

Criteria for a triage test not met

Insufficient evidence to allow meaningful conclusions

Vaginal wall endometriosis

(10 studies, 981 participants)

TVUS

679 participants in 6 studies

Sens = 0.57 (0.21 to 0.94)

Spec = 0.99 (0.96 to 1.00)

70

11

44

554

Meets the criteria for a SpPin triage test

Wide CIs

Observation: tg‐TVUS (1 study) demonstrated the highest diagnostic accuracy

TRUS

232 participants in 2 studies

Sens = 0.39 (0.08 to 0.70)

Spec = 1.00 (1.00 to 1.00)

18

0

28

186

Criteria for a triage test not met

Wide CIs

Insufficient evidence to allow meaningful conclusions

MRI

248 participants in 4 studies

271 participants in 5 data sets

Sens = 0.77 (0.67 to 0.88)

Spec = 0.97 (0.92 to 1.00)

48

11

14

198

Meets the criteria for a SpPin triage test

Observation: 3.0T MRI (1 study) and 3D‐MRI demonstrated the highest diagnostic accuracy

POD obliteration

(11 studies, 909 participants)

TVUS

755 participants in 6 studies

Sens = 0.83 (0.77 to 0.88)

Spec = 0.97 (0.95 to 0.99)

152

17

32

554

Meets the criteria for a SpPin triage test

Observation: TVUS‐BP (2 studies) demonstrated the highest diagnostic accuracy

MRI

154 participants in 5 studies

177 participants in 6 data sets

Sens = 0.90 (0.76 to 1.00)

Spec = 0.98 (0.89 to 1.00)

84

3

12

78

Meets the criteria for a SpPin triage test and approaches the criteria for a SnNout triage test

Observation: 3.0T MRI (3 studies) demonstrated the highest diagnostic accuracy

Anterior DIE (3 studies, 330 participants)

TVUS

289 participants in 2 studies

Sens = 0.41 (0.00 to 0.81)

Spec = 1.00 (1.00 to 1.00)

11

0

16

262

Criteria for a triage test not met

Wide CIs

Insufficient evidence to allow meaningful conclusions

MRI

41 participants in 1 study

Not available (see footnote a)

6

0

2

33

Insufficient evidence to allow meaningful conclusions

Rectosigmoid endometriosis (21 studies, 2222 participants)

TVUS

1616 participants in 14 studies

1817 participants in 15 data sets

Sens = 0.90 (0.82 to 0.97)

Spec = 0.96 (0.94 to 0.99)

648

47

100

1022

Meets the criteria for a SpPin triage test and approaches the criteria for a SnNout triage test

Observation: TVUS‐BP (2 studies) and RWC‐TVS (2 studies) demonstrated the highest diagnostic accuracy

TRUS

330 participants in 4 studies

Sens = 0.91 (0.85 to 0.98)

Spec = 0.96 (0.91 to 1.00)

137

8

13

172

Meets the criteria for a SpPin triage test and approaches the criteria for a SnNout triage test

MRI

612 participants in 6 studies

635 participants in 7 data sets

Sens = 0.92 (0.86 to 0.99)

Spec = 0.96 (0.93 to 0.98)

352

11

30

242

Meets the criteria for a SpPin triage test and approaches the criteria for a SnNout triage test

Observation: MRI jelly method (1 study) and 3.0T MRI (1 study) demonstrated the highest diagnostic accuracy

MDCT‐e

389 participants in 3 studies

Sens = 0.98 (0.94 to 1.00)

Spec = 0.99 (0.97 to 1.00)

241

1

6

141

Meets the criteria for a SpPin test and a SnNout triage test

Insufficient evidence to allow meaningful conclusions

DCBE

106 participants in 2 studies

Sens = 0.56 (0.32 to 0.80)

Spec = 0.77 (0.41 to 1.00)

45

6

35

20

Criteria for a triage test not met

Wide CIs

Insufficient evidence to allow meaningful conclusions

Bowel (ileum ‐ rectum) endometriosis (4 studies, 412 participants)

TVUS

314 participants in 3 studies

Sens = 0.89 (0.81 to 0.97)

Spec = 0.96 (0.91 to 1.00)

135

7

16

156

Meets the criteria for a SpPin triage test

Observation: TVUS, non‐modified method (1 study) demonstrated highest diagnostic estimates

Insufficient evidence to allow meaningful conclusions

TRUS

134 participants in 1 study

Not available (see footnote b)

72

0

3

59

Insufficient evidence to allow meaningful conclusions

MDCT‐e

194 participants in 2 studies

Sens = 0.98 (0.92 to 1.00)

Spec = 1.00 (1.00 to 1.00)

124

0

3

67

Meets the criteria for a SpPin test and a SnNout triage test

Insufficient evidence to allow meaningful conclusions

a. For MRI in anterior DIE, diagnostic estimates were sensitivity = 0.75 (0.35 to 0.97); specificity = 1.00 (0.89 to 1.00)

b. For TRUS in bowel endometriosis, diagnostic estimates were sensitivity = 0.96 (0.89 to 0.99); specificity = 1.00 (0.94 to 1.00)
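The four integers listed under each test can be read as pooled 2×2 counts. The sketch below assumes they are ordered true positives, false positives, false negatives, true negatives; the column headers fall outside this part of the table, so that order is an assumption, although the counts do sum to the stated number of participants. Crude proportions from pooled counts need not equal the meta‐analytic estimates shown in the table. The triage cut‐offs (0.95 and 0.50) and all names in the code are likewise illustrative assumptions, not values stated here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Pooled 2x2 counts for one index test against the surgical reference.

    The field order (TP, FP, FN, TN) is an assumption about the table layout.
    """
    tp: int
    fp: int
    fn: int
    tn: int

    @property
    def sensitivity(self) -> float:
        # Proportion of women with disease whom the test detects.
        return self.tp / (self.tp + self.fn)

    @property
    def specificity(self) -> float:
        # Proportion of disease-free women whom the test clears.
        return self.tn / (self.tn + self.fp)


def triage_labels(sens: float, spec: float,
                  high: float = 0.95, low: float = 0.50) -> list:
    """Label a test as SpPin and/or SnNout.

    The cut-offs `high` and `low` are assumed defaults for illustration only.
    """
    labels = []
    if spec >= high and sens >= low:
        labels.append("SpPin")   # a positive result rules the condition in
    if sens >= high and spec >= low:
        labels.append("SnNout")  # a negative result rules the condition out
    return labels


# TRUS for RVS endometriosis, counts as listed in the table above.
trus_rvs = TwoByTwo(tp=35, fp=8, fn=10, tn=179)
print(round(trus_rvs.sensitivity, 2), round(trus_rvs.specificity, 2))
print(triage_labels(trus_rvs.sensitivity, trus_rvs.specificity))
```

For this row the crude proportions round to the tabulated estimates (0.78 and 0.96), which supports the assumed column order. Other rows differ slightly, because the tabulated estimates come from meta‐analysis and not from simple pooling.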

Background

Target condition being diagnosed

Endometriosis

Endometriosis is defined as an inflammatory condition characterised by endometrium‐like tissue at sites outside the uterus (Johnson and Hummelshoj 2013). Endometriotic lesions can be found at different locations, including the pelvic peritoneum and the ovary, or can penetrate pelvic structures below the surface of the peritoneum as deeply infiltrating endometriosis (DIE). Each of these types of endometriosis is thought to represent a separate clinical entity, but different types can co‐exist in the same woman. Pelvic endometriosis is defined as the presence of any endometrial tissue within the pelvic cavity, including the peritoneum, within any of the pelvic organs and inside the pouch of Douglas (POD). Ovarian endometriosis, an endometrioma, is defined as an ovarian cyst lined by endometrial tissue; it appears as ovarian masses of varying size. Endometriomas are identified more easily by imaging or by pelvic examination than are other forms of endometriosis; however, discrimination of benign ovarian endometriosis from other types of ovarian tumours can be challenging. DIE is defined as endometriotic tissue that penetrates the retroperitoneal space for a distance of 5 mm or more (Koninckx 1991) and may be present in multiple locations, involving anterior or posterior pelvic compartments, or both. Posterior DIE, a multi‐focal disease that may affect a variety of anatomical sites, represents the most common type of DIE (Kinkel 2006). The most typical sites of DIE include uterosacral ligaments (USL), rectovaginal septum (RVS), vaginal wall, POD and bowel, predominantly below the rectosigmoid junction. Anterior DIE corresponds to disease involving the anterior pouch or bladder and is much less common. Rarely, endometriotic implants can be found at more distant sites, including lung, liver, pancreas and operative scars, with consequent variation in presenting symptoms.

Endometriosis afflicts 10% of women of reproductive age, causing dysmenorrhoea (painful periods), dyspareunia (painful intercourse), chronic pelvic pain and infertility (Vigano 2004). The clinical presentation can vary from asymptomatic and unexplained infertility to severe dysmenorrhoea and chronic pain. Symptoms can occur with bowel or urinary symptoms, an abnormal pelvic examination or the presence of a pelvic mass; however, no symptom is specific to endometriosis. Prevalence of endometriosis in the symptomatic population is reported as 35% to 50% (Giudice 2004).

Women with endometriosis are at increased risk of developing several cancers (Somigliana 2006) and autoimmune disorders (Sinaii 2002). The presence of disease is associated with changes in immune response, vascularisation, neural function, peritoneal environment and eutopic endometrium, suggesting that endometriosis is a systemic, rather than a localised, condition (Giudice 2004). Endometriosis has a profound effect on psychological and social well‐being and imposes a substantial economic burden on society. Women with endometriosis incur significant direct medical costs from diagnostic and therapeutic surgeries, hospital admissions and fertility treatments; however, these costs are exceeded by indirect costs of endometriosis, including absenteeism and loss of productivity (Gao 2006; Simoens 2012). In the United States, the financial burden of endometriosis is estimated at US $12,419 per woman (Simoens 2012).

Although the pathogenesis of endometriosis has not been fully elucidated, it is commonly thought that endometriosis occurs when endometrial tissue contained within menstrual fluid flows retrogradely through the fallopian tubes and implants at an ectopic site within the pelvic cavity (Sampson 1927). However, this theory does not explain the fact that although retrograde menstruation is seen in up to 90% of women, only 10% of women develop endometriosis (Halme 1984). Evidence suggests that a variety of environmental, immunological and hormonal factors are associated with endometriosis (Vigano 2004), and genetic loci that confer risk of endometriosis have been identified (Nyholt 2012). The relative contributions of these and other causal factors remain to be elucidated.

Although it is impossible to time the onset of disease, on average, women have a six‐ to 12‐year history of symptoms before obtaining a surgical diagnosis of endometriosis, which indicates considerable diagnostic delay (Matsuzaki 2006). Untreated endometriosis is associated with reduced quality of life and contributes to outcomes such as depression, inability to work, sexual dysfunction and missed opportunities for motherhood (Gao 2006).

Treatment of endometriosis

No cure for endometriosis is known. Treatment options include expectant management, pharmacological (hormonal) therapy and surgery (Johnson and Hummelshoj 2013). Treatment is individualised, taking into consideration the therapeutic goal (pain relief or subfertility) and the location of the disease. Current pharmacological therapies such as the combined oral contraceptive pill, progestogens, weak androgens and gonadotropin‐releasing hormone (GnRH) agonists and antagonists act to reduce the effects of oestrogen on endometrial tissues and to suppress menstruation. These drugs can ameliorate symptoms of dysmenorrhoea and chronic pelvic pain, but they are associated with side effects such as breast discomfort, irritability, androgenic symptoms and bone loss. Surgical excision of endometriotic lesions can reduce pain and improve fertility, but is associated with high recurrence rates of 40% to 50% at five years post surgery (Guo 2009; Duffy 2014). Early treatment of individuals with endometriosis improves pain levels and physical and psychological functioning. Furthermore, improvements in management of menstruation (use of the Mirena coil and continuous use of the combined contraceptive pill) and fertility preservation (oocyte vitrification) raise the possibility of suppressing the progression of endometriosis and prospectively managing subfertility among endometriosis sufferers. The potential success of these preventative strategies is dependent on an accurate and early diagnosis. A major impediment to earlier and more efficacious treatment of this disease is diagnostic delay due to the invasive nature of standard diagnostic tests (Dmowski 1997).

Diagnosis of endometriosis

Clinical history and pelvic examination can raise the possibility of a diagnosis of endometriosis, but heterogeneity in clinical presentation, high prevalence of asymptomatic endometriosis (2% to 50%) and poor association between presenting symptoms and severity of the disease contribute to the difficulty involved in obtaining a reliable diagnosis of endometriosis based solely on presenting symptoms (Spaczynski 2003; Fauconnier 2005; Ballard 2008). Although an abnormal pelvic examination correlates with the presence of endometriosis on laparoscopy in 70% to 90% of cases (Ling 1999), the differential diagnosis for most positive physical findings is wide. Furthermore, a normal clinical examination does not exclude endometriosis, as laparoscopically proven disease has been diagnosed in more than 50% of women with a clinically normal pelvic examination (Eskenazi 2001). A variety of tests utilising pelvic imaging, blood markers, eutopic endometrium characteristics, urinary markers or peritoneal fluid components have been suggested as diagnostic measures for endometriosis. Although large numbers of the reported markers have distinguished women with and without endometriosis in small pilot studies, many have not shown convincing potential as a diagnostic test when evaluated in larger studies by different research groups. The diagnostic value of these tests has not been fully systematically evaluated and summarised by Cochrane methods. Currently, no simple non‐invasive test for the diagnosis of endometriosis is routinely implemented in clinical practice.

Surgical diagnostic procedures for endometriosis include laparoscopy (minimal access surgery) or laparotomy (open surgery via an abdominal incision). Over the past several decades, laparoscopy has become an increasingly common procedure that has largely replaced traditional open surgery among women suspected of having endometriosis (Yeung 2009). Laparoscopy confers significant advantages over laparotomy, with fewer complications and shorter recovery times. Furthermore, a magnified view at laparoscopy allows better visualisation of the peritoneal cavity. Despite continuing controversy in the literature with regard to the superiority of one surgical modality over another for treating women with pelvic disease, laparoscopy is the preferred technique for evaluating the pelvis and abdomen and for treating individuals with benign conditions such as ovarian endometrioma (Medeiros 2009). Surgery is also the only currently accepted way to determine the extent and severity of endometriosis. Several classification systems have been suggested for endometriosis (Batt 2003; Chapron 2003a; Martin 2006; Adamson 2008), but most researchers and clinicians use the revised American Society for Reproductive Medicine (rASRM) classification, which is internationally accepted as a respected, currently available tool for objective assessment of the disease (American Society for Reproductive Medicine 1997). The rASRM classification system considers appearance, size and depth of peritoneal or ovarian implants and adhesions visualised during laparoscopy (Table 1) and allows uniform documentation of the extent of disease. Unfortunately, this classification system has little value in clinical practice because of lack of correlation between laparoscopic staging, severity of symptoms and response to treatment (Vercellini 1996; Guzick 1997; Chapron 2003b). A recent endeavour to attain consensus around the optimal classification for endometriosis has been undertaken by the World Endometriosis Society (Johnson 2015).

Table 1. Staging of endometriosis, rASRM classification

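| Site | Endometriosis | < 1 cm | 1‐3 cm | > 3 cm |
|---|---|---|---|---|
| Peritoneum | Superficial | 1 | 2 | 4 |
| Peritoneum | Deep | 2 | 4 | 6 |
| Ovary (R) | Superficial | 1 | 2 | 4 |
| Ovary (R) | Deep | 4 | 16 | 20 |
| Ovary (L) | Superficial | 1 | 2 | 4 |
| Ovary (L) | Deep | 4 | 16 | 20 |

| Posterior cul‐de‐sac obliteration | Partial | Complete |
|---|---|---|
| Points | 4 | 40 |

| Site | Adhesions | < 1/3 enclosure | 1/3‐2/3 enclosure | > 2/3 enclosure |
|---|---|---|---|---|
| Ovary (R) | Filmy | 1 | 2 | 4 |
| Ovary (R) | Dense | 4 | 8 | 16 |
| Ovary (L) | Filmy | 1 | 2 | 4 |
| Ovary (L) | Dense | 4 | 8 | 16 |
| Tube (R) | Filmy | 1 | 2 | 4 |
| Tube (R) | Dense | 4 (a) | 8 (a) | 16 |
| Tube (L) | Filmy | 1 | 2 | 4 |
| Tube (L) | Dense | 4 (a) | 8 (a) | 16 |

(a) If the fimbriated end of the fallopian tube is completely enclosed, change the point assignment to 16.

Source: American Society for Reproductive Medicine 1997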
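As a rough illustration of how the point values in Table 1 combine, the sketch below transcribes them into lookup tables and sums the points for a set of findings. The function and key names are hypothetical. The handling of boundary values (a lesion of exactly 1 cm or 3 cm, or an enclosure of exactly one‐third or two‐thirds) is an assumption, and the mapping from total score to disease stage is omitted because Table 1 does not give the cut‐offs.

```python
# Point values transcribed from Table 1; tuple order follows the table columns.
LESION_POINTS = {                      # (< 1 cm, 1-3 cm, > 3 cm)
    ("peritoneum", "superficial"): (1, 2, 4),
    ("peritoneum", "deep"): (2, 4, 6),
    ("ovary", "superficial"): (1, 2, 4),
    ("ovary", "deep"): (4, 16, 20),
}
ADHESION_POINTS = {                    # (< 1/3, 1/3-2/3, > 2/3 enclosure)
    ("ovary", "filmy"): (1, 2, 4),
    ("ovary", "dense"): (4, 8, 16),
    ("tube", "filmy"): (1, 2, 4),
    ("tube", "dense"): (4, 8, 16),
}
POD_POINTS = {"partial": 4, "complete": 40}  # posterior cul-de-sac obliteration


def lesion_points(site: str, depth: str, size_cm: float) -> int:
    """Points for one implant; boundary sizes go to the 1-3 cm column (assumed)."""
    column = 0 if size_cm < 1 else 1 if size_cm <= 3 else 2
    return LESION_POINTS[(site, depth)][column]


def adhesion_points(site: str, kind: str, enclosure: float,
                    fimbria_enclosed: bool = False) -> int:
    """Points for adhesions on one ovary or tube.

    `enclosure` is the enclosed fraction (0 to 1). Following the table
    footnote, dense tubal adhesions score 16 when the fimbriated end of the
    tube is completely enclosed.
    """
    if site == "tube" and kind == "dense" and fimbria_enclosed:
        return 16
    column = 0 if enclosure < 1 / 3 else 1 if enclosure <= 2 / 3 else 2
    return ADHESION_POINTS[(site, kind)][column]


# Hypothetical findings, for illustration only.
total = (
    lesion_points("peritoneum", "deep", 2.0)
    + lesion_points("ovary", "deep", 3.5)
    + adhesion_points("tube", "dense", 0.25, fimbria_enclosed=True)
    + POD_POINTS["partial"]
)
print(total)
```

The lookup tables are kept separate from the scoring functions so that the transcribed values can be checked against Table 1 row by row.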

The European Society for Human Reproduction and Embryology (ESHRE) Special Interest Group for Endometriosis stated in its guidelines for the diagnosis and treatment of endometriosis that for women presenting with symptoms suggestive of endometriosis, a definitive diagnosis of most forms of endometriosis requires visual inspection of the pelvis at laparoscopy as the 'gold standard' investigation (Kennedy 2005). Currently, the visual or histological identification of endometriotic tissue in the pelvic cavity during surgery is not just the best available but the only diagnostic test for endometriosis that is used routinely in clinical practice.

Disadvantages of laparoscopic surgery include, but are not limited to, high cost, need for general anaesthesia and potential for adhesion formation post procedure. Laparoscopy has been associated with a 2% risk of injury to pelvic organs, a 0.001% risk of damage to a major blood vessel and a mortality rate of 0.0001% (Chapron 2003c). Only one‐third of women who undergo a laparoscopic procedure will receive a diagnosis of endometriosis; therefore, many disease‐free women are unnecessarily exposed to surgical risk (Frishman 2006).

The validity of laparoscopy as a reference test for endometriosis has been assessed as highly dependent on the skills of the surgeon. The diagnostic accuracy of laparoscopic visualisation has been compared with histological confirmation in a sole systematic review; 94% sensitivity and 79% specificity have been reported (Wykes 2004). Subsequent studies suggested that incorporation of histological verification into the diagnosis of endometriosis may improve diagnostic accuracy (Marchino 2005; Almeida Filho 2008; Stegmann 2008), but these papers have not been systematically reviewed. The clinical significance of histological verification remains debatable, and a diagnosis based on visual findings can be considered reliable with accurate inspection of the abdominal cavity by properly trained experienced surgeons (Redwine 2003). Furthermore, excised potentially endometriotic tissues are rarely serially sectioned in clinical practice, and small lesions can be missed by pathologists in cases of mild disease. Thus sampling inconsistencies are likely to influence the accuracy of histological reporting.

Summary

A diagnostic test in place of surgery would reduce associated surgical risks, increase diagnostic accessibility and improve treatment outcomes. The need for an accurate and non‐invasive diagnostic test for endometriosis continues to encourage extensive research in the field and was endorsed at the international consensus workshop at the 10th World Congress of Endometriosis in 2008 (Rogers 2009). Although multiple markers and imaging techniques have been explored as diagnostic tests for endometriosis, none of them have been implemented routinely in clinical practice, and many have not been subject to systematic review.

Index test(s)

This review assesses the diagnostic imaging techniques that have been proposed as non‐invasive tests for the diagnosis of endometriosis (Table 2) as part of the review series on non‐invasive diagnostic tests for endometriosis. The other reviews from this series include 'Blood biomarkers for the non‐invasive diagnosis of endometriosis', 'Endometrial biomarkers for the non‐invasive diagnosis of endometriosis', 'Urinary biomarkers for the non‐invasive diagnosis of endometriosis' and 'Combination of the non‐invasive tests for the diagnosis of endometriosis', which is the summary review for this series.

Table 2. Index tests ‐ description and common abbreviations

Test name as presented in the review

Description

Alternative names presented in the included studies

MRI tests

MRI (magnetic resonance imaging)

Equipment: 1.5 Tesla magnet device with a parallel or phased array body or pelvic coil for signal excitation and reception

Participants’ preparation: Fasting for 3‐6 hours before the test and/or bowel preparation with oral laxatives was described by some investigators; an intravenous injection of anti‐peristaltic agent at the outset of the examination to decrease bowel peristalsis; supine position. Some groups performed MRI with a full bladder to correct the angle of the ante‐flexed uterus; some groups described introducing ultrasonographic gel (approximately 50 to 60 mL) into the vaginal canal to distend the vaginal fornices

Protocol: Imaging is performed in the axial plane with or without sagittal or coronal planes. Different types of sequences allow the same tissue to be imaged in various ways, and combinations of sequences reveal important diagnostic information about the tissue in question. The imaging parameters (section thickness, field of view (FOV), matrix size) vary between protocols. Images are documented on radiographic film and in digital files and analysed at a workstation

  • MRI T1/T2‐w

(conventional T1‐/T2‐weighted)

The protocol includes axial spin‐echo or gradient echo T1‐weighted (T1‐w) images followed by fast spin‐echo (FSE)/turbo spin‐echo (TSE) images or fast relaxation fast‐spin echo (FR‐FSE) T2‐w images

MRI;

CSE (conventional spin echo)

  • MRI fat‐suppressed

(T1‐weighted)

Protocol includes T1‐w imaging using chemical fat suppression, which aids in the differentiation of lipid and haemorrhagic pathologies. Fat suppression is a generic term that includes various techniques to suppress the signal from normal adipose tissue to reduce chemical shift artefact and can be achieved by various methods. This is commonly a part of the MRI protocol and is rarely used in isolation

Fat‐saturated MRI

  • MRI T1/T2‐w + fat‐suppressed/ Gd

(T1‐/T2‐weighted with fat‐suppression contrast enhanced)

Protocol includes gradient echo T1 images with and without fat suppression followed by FSE or FR‐FSE T2‐w images before and after intravenous injection of the paramagnetic contrast agent gadolinium

MRI;

CSE/TIFS (conventional spin echo in combination with T1‐w fat‐suppressed)

CSE/TIFS/Gd‐TIFS (conventional spin echo in combination with T1‐w fat‐suppressed and gadolinium‐enhanced TIFS)

  • MRI 'jelly method'

Protocol involves pretreatment of participants for MRI by simultaneous injection of ultrasonographic gel into the vagina (approximately 50 mL) and into the rectum (150 mL gel 50% diluted with water). Another technique involves introduction of 300‐400 mL of diluted ultrasonographic gel (1:8 dilution) for rectosigmoid distension without use of intravaginal gel

MRI‐e (magnetic resonance enema)

3D‐MRI (3‐dimensional MRI)

Protocol includes 3D coronal single‐slab (containing all the slices) MRI, entitled 'CUBE' with FSE T2‐w images. The technique involves using variable flip angle refocusing, auto‐calibrating, 2D accelerated parallel imaging and nonlinear view ordering to produce high‐resolution volumetric image data sets and to reduce imaging time by using multi‐planar reformations

3.0T MRI

Equipment: 3.0 Tesla Magnetom system with a multi‐channel phased‐array surface body‐coil

Participants’ preparation: Fasting for 3 hours before the test was reported by some but not all studies; intravenous injection of anti‐peristaltic agent at the outset of the examination to decrease bowel peristalsis; administration of a negative super‐paramagnetic oral contrast agent to reduce signal intensity of the bowels. Examination with the full bladder in a ‘feet first’ supine position

Protocol: combination of all or some of the following sequences: T‐w FSE, 2D‐T2‐w FR‐FSE/FSE, 3D‐T2‐w FR‐FSE CUBE, 3D‐T1‐w fat‐suppressed and/or LAVA‐flex (liver imaging with volume acceleration‐flexible) sequences. MRI images are acquired in multiple scan planes, in particular axial, coronal and sagittal planes of the pelvis and sacral para‐coronal plane. Contrast agent (gadolinium) is administered in selected cases. Total acquisition time is approximately 20 min without or 30‐40 min with contrast injection

Ultrasound tests

TVUS

(transvaginal ultrasonography)

Equipment: any of the commercially available ultrasound machines equipped with a wide‐band high‐resolution vaginal transducer (brands of scanners and frequencies of transducers vary between studies)

Participants' preparation: Examination is performed in a dorsal lithotomy position with empty or half‐full bladder; no bowel preparation is routinely required

Protocol: An ultrasound gel is applied to the tip of the transducer probe to create a lubricating, acoustically correct interface with the tissue. Scans are obtained by inserting the transducer (protected by disposable thin cover) into the vagina, followed by sequential movement of the probe within the vaginal canal to allow systematic evaluation of pelvic structures (uterus and adnexal regions; attention paid to the ovaries, pouch of Douglas, vesicouterine pouch and uterosacral ligament). The technique involves longitudinal, transverse and angled movements of the probe with sliding up and down, back and forward to obtain both longitudinal and transversal scans of pelvic structures. Examination protocols vary between studies. Each examination is interpreted in real time and can be documented in printed photographs

TVS

'transvaginal ultrasound'

'transvaginal sonography'

  • TVUS‐BP

(transvaginal ultrasonography with bowel preparation)

Examination consists of TVUS combined with bowel preparation including the following: low‐residue diet for 1‐3 days, oral laxative on the eve of the examination, rectal enema within an hour before the examination or a combination of the above

  • RWC‐TVS

(rectal water contrast transvaginal ultrasonography)

Examination consists of TVUS combined with bowel preparation and instillation of water contrast in rectum during TVUS; procedure does not require general anaesthesia

Protocol: After the transducer is introduced into the vagina, a flexible thin catheter (18‐28 Ch) with a rubber balloon is inserted into the rectal lumen up to 20 cm from the anus (gel infused with lidocaine is used to facilitate passage of the catheter). Rectal water contrast of 100 to 300 mL of warm saline solution is instilled inside the balloon under ultrasonographic guidance to provide high‐definition images of the rectal wall and its layers. Back flow of the solution is prevented by placement of a Klemmer forceps on the catheter. Images are obtained before, during and after saline injection

'transvaginal sonography with water‐contrast in the rectum'

'water‐contrast in the rectum during transvaginal ultrasonography'

  • SVG

(sonovaginography)

Examination consists of TVUS combined with the introduction of saline solution or gel to the vagina to create an acoustical window between the transvaginal probe and surrounding structures and to distend the vaginal walls, permitting enhanced visualisation of pelvic structures

Protocol: Procedure involves introduction of a Foley catheter into the vagina followed by insertion of the transvaginal probe with further injection of 200‐400 mL of saline through the catheter by the assistant. To prevent reflux of saline solution from the vagina, the vaginal canal is closed with the operator’s hand. Alternative method involves placement of 20 mL of ultrasound gel into the posterior vaginal fornix with a plastic syringe, followed by insertion of a transvaginal probe. Reported procedure time ranges from 30 to 45 minutes

'transvaginal sonography and acoustic window with intravaginal gel'

  • tg‐TVUS

(tenderness‐guided TVUS)

Examination consists of TVUS combined with particular attention to the tender points evoked during examination

Protocol: A larger amount of ultrasound gel (approximately 12 mL instead of the usual 4 mL) is introduced into the probe cover to create a stand‐off for visualisation of the near‐field area. The probe is inserted gently to avoid the risk of squeezing out the gel. After the initial sonographic evaluation, the participant is asked to inform the operator about the onset and site of any tenderness experienced during probe pressure within the posterior fornix. When tenderness is evoked, the sliding movement is stopped, and particular attention is paid to the painful site via gentle pressure with the probe’s tip to detect endometriosis lesions. Reported procedure time is 15 to 20 minutes in cases of suspected lesions, but less time when the examination is negative

  • 3D‐TVUS

(3‐dimensional transvaginal ultrasonography)

Equipment: An ultrasound scanner equipped with 3D/4D imaging modes and a wide‐band high resolution volume transvaginal transducer. The method enables the acquisition of ultrasonographic volumetric data that can be assessed off‐line; in most institutions used as an adjunct to 2D US

Protocol: region‐of‐interest (ROI) is identified using a B‐mode scan and a transvaginal volume transducer. During the volumetric scan, the transducer carries out a series of parallel scans of varying speeds focusing on the ROI. The anatomical ROI is visualised on the monitor as a graphic containing the 3 orthogonal planes. During volumetric scans, the investigator adopts some expedients such as positioning the probe near the anatomical ROI and reducing or eliminating participant movements. The volume obtained is stored on a hard disk and displayed later using dedicated software

  • Introital 3D‐US

(introital 3‐dimensional ultrasound)

Examination is performed with the transducer placed on the perineum against the symphysis pubis (firmly but without causing significant discomfort). To acquire a correct volume, the symphysis pubis, urethra, vagina, and rectum should be visualised in the same image. Gain is adjusted and focal area is set to the region of interest, with the sweep angle set at 90 or 120 degrees to produce a multi‐planar image in 3 planes: longitudinal, transverse and coronal

TRUS (transrectal ultrasonography)

Equipment: An ultrasound scanner with a 2‐dimensional axial and sagittal convex high‐frequency probe with or without a rigid linear probe or a flexible endoscope with lateral view and a convex high frequency echo probe

Participants' preparation: A low‐residue diet for 3 days before the examination with or without laxatives and/or rectal enema is reported in some but not all studies; several groups described using general or local anaesthesia for the procedure, and some groups used no analgesia

Protocol: A gel‐filled rubber sheath or water‐filled balloon is placed over the tip of the transducer to obtain better visibility. The transducer is inserted into the rectum and is advanced until the midline image of the cervix is visualised in the longitudinal view. Pelvic structures are evaluated by moving the transducer along its longitudinal axis and rotating it 130° to 140° along the main axis in both axial and longitudinal planes. Alternative technique includes insertion of the flexible probe into the sigmoid colon, over the aortic bifurcation and/or the upper part of the body of the uterus, with subsequent slow withdrawal, allowing optimum imaging of rectal and sigmoid colon walls/pelvic structures, with instillation of water into the intestinal lumen and alternating use of several frequencies (e.g. 5, 7.5, 12 MHz)

TRS (transrectal sonography)

Tr EUS (transrectal endoscopic ultrasonography)

RES (rectal endoscopic sonography)

REU (rectal endoscopic ultrasonography)

Other tests

MDCT‐e

(multi‐detector computerised tomography enema)

Equipment: multi‐detector computed tomograph, which has a 2‐dimensional array of detector elements that permits CT scanners to acquire multiple slices or sections simultaneously and greatly increase the speed of CT image acquisition (unlike the linear array of detector elements used in typical conventional and helical CT scanners)

Participants’ preparation: low‐residue diet for 3 days and bowel preparation with an oral laxative the day before the examination; intravenous injection of anti‐peristaltic agent during the test

Protocol: colonic distension performed by introducing about 2000 mL of water at 37 °C with the participant in the left lateral decubitus position. All participants receive an intravenous injection of iodine‐containing contrast. Participants are scanned in supine position from the dome of the diaphragm to the pubic symphysis in the portal phase (40 seconds after the arterial peak). Scan parameters (collimation, rotation time, tube voltage, effective mAs) differ between studies. Estimated radiation exposure is calculated by the scanner using CT dose index and is saved to the dose report. Both axial plane and multi‐planar reconstructions (sagittal and coronal) are evaluated. Images are reviewed at a workstation

MSCTe (multi‐slice computed tomography combined with colon distension by water enteroclysis)

'Water enema CT'

18FDG‐PET (fluorodeoxyglucose positron emission tomography)

Equipment: PET‐computed tomograph

Participants’ preparation: Fasting for at least 6 hours before the test; 18FDG (a glucose analogue) injection 60 min before the test

Protocol: Acquisition is performed with the participant in supine position, from mid‐thigh to the base of the skull. No iodine‐based contrast is administered. CT parameters reported in a single included study are 120 kV, 120 mA, pitch 1.5:1, speed 15 mm/rot. The PET element operates in 2D mode for 4 minutes per bed position. Attenuation correction is based on CT data

DCBE (double‐contrast barium enema)

Equipment: motorised tilting radiographic table and standard equipment for fluoroscopic and radiological examination

Participants’ preparation: low‐residue diet for 1‐3 days before the examination with or without oral laxatives the day before the procedure; an anti‐peristaltic agent is administered intravenously at the outset of the examination to decrease bowel peristalsis

Protocol: The procedure is performed in 2 steps to obtain double contrast and involves change of participant positions to ensure detailed visualisation of all intestinal segments. Barium sulphate contrast (600 to 800 mL) is instilled into the rectum under gravity pressure in the left lateral decubitus position. Once the barium reaches the hepatic flexure, the colon is drained by gravity to remove as much barium as possible from the rectal ampulla without completely clearing the rectosigmoid colon of barium. Room air is then gently insufflated into the colon. Sequential views of the bowel are obtained. Each colonic segment is viewed in detail on spot radiographs and in magnification images. The procedure lasts 15 to 20 minutes

The definition of ‘non‐invasive’ varies between medical dictionaries, but the term refers to a procedure that does not involve penetration of skin or physical entrance into the body (McGraw‐Hill Dictionary of Medicine 2006; The Gale Encyclopedia of Medicine 2008). Although some imaging tests are associated with an intracavitary approach (e.g. transvaginal, transrectal) and therefore are invasive by this definition, when compared with diagnostic surgery for endometriosis, these tests are generally considered to be 'non‐invasive' or 'minimally invasive'. For the purpose of these reviews, we will define all tests that do not involve anaesthesia and surgery as ‘non‐invasive’.

Advantages of using imaging tests for the diagnosis of endometriosis include that they are minimally invasive, readily available and more acceptable to women; provide a rapid result; and are more cost‐effective when compared with surgery. However, imaging testing is dependent on the skills of the operator and the ability of women to access appropriate radiology services. At this point in time, all imaging modalities have been assessed in a limited number of small studies, which vary in the type of imaging methods used and the anatomical locations evaluated.

Magnetic resonance imaging (MRI) and ultrasonography (US) (which includes transabdominal, transvaginal and transrectal approaches) are the most widely reported diagnostic modalities for endometriosis. A systematic review that primarily summarised the diagnostic performance of ultrasound for endometriosis‐associated ovarian masses (endometriomas) concluded that transvaginal ultrasound (TVUS) has clinical utility in differentiating endometriomas from other types of ovarian cysts (Moore 2002). This review concentrated on studies that used transabdominal and transvaginal US with or without Doppler and did not include reports on other forms of ultrasound, nor did it evaluate non‐ovarian forms of endometriosis. Studies that evaluated the ability of ultrasound to detect endometriotic implants at other pelvic sites reported varying degrees of accuracy for deep endometriotic lesions and failure to detect small lesions and pelvic adhesions (Kinkel 2006). Because of high costs and limited availability, MRI is not frequently implemented in routine clinical practice; however, a growing number of studies suggest that it has a role in the diagnosis of deep endometriotic lesions and greater ability than other modalities to detect small lesions (Kinkel 2006). Recently, MRI was promoted as the non‐invasive imaging technique of choice for detection and classification of endometriosis (Saba 2014a). Several recent systematic reviews on imaging in endometriosis (Hudelist 2011b; Medeiros 2014; Guerriero 2015) and narrative reviews on the topic primarily addressed diagnostic performance of US and/or MRI for deep endometriosis, predominantly with bowel involvement.

To improve diagnostic performance, variations in ultrasound techniques have been used, including transvaginal ultrasonography with bowel preparation (TVUS‐BP) (Goncalves 2010), use of water contrast in the rectum (RWC‐TVS) (Menada 2008a) or vagina (sonovaginography (SVG)) (Dessole 2003) and three‐dimensional ultrasonography (3D‐US) (Grasso 2010). Several modifications have been made to conventional MRI such as use of T1/T2‐weighted (T1/T2‐w) images, including addition of fat suppression with or without contrast enhancement. Three‐dimensional MRI (3D‐MRI) has also been evaluated as a single test for endometriosis, and 3.0T MRI has been developed using the 3.0T Magnetom system (in contrast to the widely used 1.5T system) with incorporation of T1/T2‐w, fat‐suppressed and 3D sequences (Hottat 2009; Manganaro 2013; Thomeer 2014). Computed tomography (CT)‐based imaging (Biscaldi 2007), barium enema (Ribeiro 2008a) and other techniques have been explored as diagnostic tests for endometriosis. Improvements in imaging technology over time have positively affected the diagnostic ability of the same type of imaging test to detect endometriosis. Re‐evaluation of diagnostic test accuracy by Cochrane methods for a variety of imaging modalities is needed.

Clinical pathway

Women who present with symptoms of endometriosis (dysmenorrhoea, dyspareunia, chronic pelvic pain, difficulty conceiving) generally are investigated by a gynaecological examination and pelvic ultrasound scan to exclude other pathologies, in keeping with international guidelines (ACOG Committee on Gynecology 2010; SOGC 2010; Dunselman 2014). No other standard investigative tests are available, and MRI is used conservatively because of its cost. If women seek pain management rather than conception, empirical treatment with progestogens or the combined oral contraceptive pill is commonly started. Diagnostic laparoscopy is considered if empirical treatment fails, or if women decline or do not tolerate empirical treatment. In women who have difficulty conceiving, laparoscopy can be undertaken before fertility treatment is provided (particularly if severe pelvic pain or endometriomas are present) or after failed ART (assisted reproductive technology) treatment. Endometriosis can be diagnosed during fertility investigations in women who have minimal or no pain symptoms.

On average, a delay of six to 12 years is seen from onset of symptoms to definitive diagnosis at surgery (Matsuzaki 2006). Rapid referral to a gynaecologist with the ability to perform diagnostic surgery is associated with shorter time to diagnosis (Greene 2009). Collectively, young women, women in remote and rural locations and women of lower socioeconomic status have reduced access to surgery and are less likely to obtain prompt diagnosis and/or localisation of endometriosis.

Prior test(s)

Most women who present with symptoms suggestive of endometriosis undergo a full history and physical examination and a routine gynaecological ultrasound before the decision is made to perform diagnostic surgery. However, no consensus exists on whether ultrasound or any other test should be used routinely as part of a standardised approach.

Role of index test(s)

A new diagnostic test can fulfil one of three roles.

  • Replacement: used to replace an existing test by providing greater or similar accuracy, along with other advantages.

  • Triage: used as an initial step in a diagnostic pathway to identify women who need to undergo further testing with an existing test. Although ideally a triage test has high sensitivity and specificity, it may have lower sensitivity but higher specificity than the current test, or vice versa. The triage test does not aim to improve the diagnostic accuracy of the existing test but rather to reduce the number of individuals undergoing an unnecessary diagnostic test.

  • Add‐on: used in addition to an existing test to improve diagnostic performance (Bossuyt 2008).

Ideally, a diagnostic test is expected to correctly identify all women with a specific disease and to exclude all who do not have that disease; in other words, it should have sensitivity and specificity of 100%. High sensitivity indicates that only a small number of women who have the disease receive a negative test result (i.e. small number of false‐negative results). High specificity corresponds to a small number of women who receive a positive test result but do not have the disease (i.e. small number of false‐positive results). In practice, however, it is extremely rare to find a test with equally high sensitivity and specificity. An acceptable replacement test would need to have sensitivity and specificity similar to or higher than those of the current gold standard of laparoscopy. The only systematic review performed to determine the accuracy of laparoscopy in diagnosing endometriosis reported sensitivity of 94% and specificity of 79% (Wykes 2004), and we have used these values as cut‐offs for a replacement test.

The purpose of triage tests can vary depending on clinical context and patient priorities. One reasonable approach is to exclude the diagnosis to avoid further unnecessary and expensive diagnostic investigation. High‐sensitivity tests yield few false‐negative results and act to rule conditions out (SnNout). A negative result from a test with high sensitivity will exclude the disease with high certainty independent of the specificity. As women with a negative test result could be reassured that the disease is very unlikely to be present, unnecessary invasive interventions can be avoided. However, a positive result has less diagnostic value, particularly when specificity is low. We predetermined that a clinically useful 'SnNout' triage test should have sensitivity of 95% or more and specificity of 50% or above. The sensitivity cut‐off was set at 95% or above on the assumption that a 5% false‐negative rate is statistically and clinically acceptable. The specificity cut‐off was set at 50% or above, to avoid diagnostic uncertainty about more than 50% of the population receiving a positive result.

An alternative approach would be to avoid a missed diagnosis. High‐specificity tests yield few false‐positive results and act to rule conditions in (SpPin). A positive result for a highly specific triage test indicates a high likelihood of endometriosis. This information could be used to prioritise women for surgical treatment. A positive 'SpPin' test could also provide a clinical rationale for starting targeted disease‐specific medical treatment for a woman without a surgical diagnosis, under the assumption that disease is present. Surgical management could be reserved for cases when conservative treatment fails. This is particularly relevant in some populations for which the therapeutic benefits of surgery for endometriosis have to be carefully balanced with the disadvantages (e.g. young women, women with medical conditions, pain‐free women with a history of infertility). In this scenario, we considered sensitivity of 50% or above and specificity of 95% or higher as suitable cut‐offs for a 'SpPin' triage test.
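The rule‐out and rule‐in reasoning above can be illustrated with predictive values. The following is a minimal sketch rather than part of the review's methods: it applies Bayes' theorem to the predetermined SnNout (sensitivity 95%, specificity 50%) and SpPin (sensitivity 50%, specificity 95%) cut‐offs. The function names and the 30% prevalence are hypothetical assumptions chosen for illustration only; no prevalence is stated in this section.

```python
def negative_predictive_value(sensitivity: float, specificity: float, prevalence: float) -> float:
    """Probability that disease is absent given a negative test result."""
    true_negatives = specificity * (1.0 - prevalence)
    false_negatives = (1.0 - sensitivity) * prevalence
    return true_negatives / (true_negatives + false_negatives)


def positive_predictive_value(sensitivity: float, specificity: float, prevalence: float) -> float:
    """Probability that disease is present given a positive test result."""
    true_positives = sensitivity * prevalence
    false_positives = (1.0 - specificity) * (1.0 - prevalence)
    return true_positives / (true_positives + false_positives)


if __name__ == "__main__":
    ASSUMED_PREVALENCE = 0.30  # hypothetical value, not taken from the review

    # SnNout cut-offs: a negative result is used to rule the disease out.
    npv = negative_predictive_value(0.95, 0.50, ASSUMED_PREVALENCE)
    # SpPin cut-offs: a positive result is used to rule the disease in.
    ppv = positive_predictive_value(0.50, 0.95, ASSUMED_PREVALENCE)

    print(f"SnNout test, negative predictive value: {npv:.3f}")
    print(f"SpPin test, positive predictive value: {ppv:.3f}")
```

Both values shift with prevalence, so the same cut‐offs would give different post‐test certainty in populations with a higher or lower proportion of affected women.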

We evaluated imaging tests for their potential to replace surgery (replacement test) or to improve selection of women for surgery (triage test) that can rule out (SnNout) or rule in (SpPin) the disease. Both types of triage tests are clinically useful, minimising the number of unnecessary interventions. Sequential implementation of SnNout and SpPin tests can also optimise a diagnostic algorithm (Figure 1). We did not assess any test as an add‐on test, as we sought tests that reduce the need for surgery ‐ not tests that improve the accuracy of the currently available surgical diagnosis.


Sequential approach to non‐invasive testing of endometriosis.


Knowledge of the accuracy of imaging index tests for detecting DIE at specific intrapelvic anatomical locations provides valuable information for surgeons, who can preoperatively arrange bowel preparation or availability of specialist surgical expertise for removal of lesions at particular locations. Surgical mapping of disease in isolated anatomical sites cannot exclude the disease somewhere else in the pelvis, hence it is not appropriate to use replacement test criteria for anatomical mapping, and we considered these types of tests only in the context of SnNout and SpPin triage criteria.

Alternative test(s)

No alternative tests for the diagnosis of endometriosis are available in routine clinical practice.

Rationale

Many women with endometriosis suffer long‐standing pelvic pain and infertility before they receive the diagnosis. Surgery is the only method currently used to diagnose endometriosis, but it is associated with high costs and surgical risks. A simple and reliable non‐invasive test for endometriosis with the potential to replace laparoscopy or to triage women to reduce surgery would minimise surgical risk and reduce diagnostic delay. Endometriosis could be detected at less advanced stages, and earlier interventions instituted. This would provide the opportunity for a preventative approach to this debilitating disease. Healthcare and social security costs of endometriosis would be expected to be reduced by early diagnosis and more cost‐effective and efficient treatments.

Accurate diagnostic tests are important in strategic considerations of treatment planning. Women with severe invasive disease particularly benefit from surgical management, the efficacy of which depends on the completeness of excision of endometriotic lesions (Garry 1997). Therefore, ability to diagnose deep infiltrating endometriosis in general and at specific anatomical sites in particular might lead to selection of surgical technique, involvement of a multi‐disciplinary surgical team or referral to the most appropriate practice (Chapron 2003a).

Objectives

Primary objectives

  • To provide estimates of the diagnostic accuracy of imaging modalities for the diagnosis of pelvic endometriosis, ovarian endometriosis and deeply infiltrating endometriosis (DIE) versus surgical diagnosis as a reference standard.

  • To describe performance of imaging tests for mapping of deep endometriotic lesions in the pelvis at specific anatomical sites.

Imaging tests were evaluated as replacement tests for diagnostic surgery and as triage tests that would assist decision making regarding diagnostic surgery for endometriosis.

Secondary objectives

To investigate the influence of heterogeneity on the diagnostic accuracy of imaging modalities for endometriosis. Potential sources of heterogeneity include the following.

  • Characteristics of the study population: age (adolescence vs later reproductive years); clinical presentation (subfertility, pelvic pain, ovarian mass, asymptomatic women); stage of disease (revised American Society for Reproductive Medicine (rASRM) classification system); geographic location of study.

  • Histological confirmation in conjunction with laparoscopic visualisation versus laparoscopic visualisation alone.

  • Changes in technology over time: year of publication; modifications applied to conventional imaging techniques.

  • Methodological quality: differences in the QUADAS‐2 (Quality Assessment of Diagnostic Accuracy Studies‐2) evaluation (low vs unclear or high risk); consecutive versus non‐consecutive enrolment; blinding of surgeons to results of index tests.

  • Study design ('single gate design' vs 'two‐gate design' studies).

Methods

Criteria for considering studies for this review

Types of studies

We considered published peer‐reviewed studies that compared results of one or several types of imaging tests versus results obtained from a surgical diagnosis of endometriosis.

We included studies if they were:

  • randomised controlled trials;

  • observational studies of prospectively recruited women of the following designs:

    • ‘single gate design’ (studies with a single set of inclusion criteria defined by clinical presentation). All participants had clinically suspected endometriosis; or

    • ‘two‐gate design’ (studies in which participants are sampled from distinct populations with respect to clinical presentation). The same study includes participants with a clinical suspicion of having the target condition (e.g. women with pelvic pain) and participants in whom the target condition is not suspected (e.g. women admitted for tubal ligation). Two‐gate studies were eligible only when all cases and controls belonged to the same population with respect to the reference standard (i.e. all participants were scheduled for laparoscopy) (Rutjes 2005).

  • performed in any healthcare setting; and

  • published in any language.

We did not impose a minimal limit on the number of participants in the included studies nor on the number of studies that have evaluated each index test.

We excluded the following studies.

  • Studies of specific study designs.

    • Narrative or systematic review.

    • Study of retrospective design when the index test was performed after execution of a reference test, or participants were selected through a retrospective review of case notes. Knowledge of the reference test could bias relatively subjective index tests. If endometriosis is found at a diagnostic surgical procedure, excision is commonly carried out concurrently, and this could bias the results of an index test performed after the reference standard.

    • Case report or case series.

  • Studies reported only in abstract form or in conference proceedings for which the full text was not available. This limitation was applied when we faced substantial difficulty in obtaining the information from abstracts, which precluded a reliable assessment of eligibility and methodological quality.

Participants

Study participants included women of reproductive age (puberty to menopause) with suspected endometriosis based on clinical symptoms and/or pelvic examination, who underwent both the index test and the reference standard.

Participants were selected from populations of women undergoing abdominal surgery for the following indications: (1) clinically suspected endometriosis (pelvic pain, infertility, abnormal pelvic examination or a combination of these), (2) ovarian mass regardless of symptoms, (3) a mixed group, which consists of women with suspected endometriosis/ovarian mass and/or women with other benign gynaecological conditions (e.g. surgical sterilisation, fibroid uterus).

Articles that included participants of postmenopausal age were eligible when data for the reproductive age group were available in isolation. Studies were excluded when the study population involved participants who clearly would not undergo the index test in a clinical scenario and/or would not benefit from the test (e.g. women with ectopic pregnancies, gynaecological malignancies, acute pelvic inflammatory disease). We also excluded publications in which only a subset of participants with a positive index test or reference standard were included in the analysis, and data for the whole cohort were not available.

Index tests

All types of imaging modalities for endometriosis, including possible modifications to conventional techniques, were assessed separately or in combination with other imaging tests. We attempted to group several types of tests that were based on common technical principles and similarity in clinical applicability. The index tests assessed are presented and described in Table 2.

We considered studies only if data were reported in sufficient detail for construction of 2 × 2 contingency tables. We included only studies that reported diagnostic accuracy estimates per number of participants ('participant‐level' analysis).
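As an illustration of how participant‐level accuracy estimates follow from a 2 × 2 contingency table, the sketch below computes sensitivity and specificity from the four cell counts. The class name, field names and example counts are hypothetical and are not drawn from any included study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Participant-level 2 x 2 table: index test result versus surgical diagnosis."""

    true_positives: int   # index test positive, endometriosis found at surgery
    false_positives: int  # index test positive, no endometriosis at surgery
    false_negatives: int  # index test negative, endometriosis found at surgery
    true_negatives: int   # index test negative, no endometriosis at surgery

    def sensitivity(self) -> float:
        """Proportion of women with the disease who test positive."""
        return self.true_positives / (self.true_positives + self.false_negatives)

    def specificity(self) -> float:
        """Proportion of women without the disease who test negative."""
        return self.true_negatives / (self.true_negatives + self.false_positives)


if __name__ == "__main__":
    # Hypothetical counts for 100 participants, for illustration only.
    table = TwoByTwo(true_positives=45, false_positives=10,
                     false_negatives=5, true_negatives=40)
    print(f"sensitivity = {table.sensitivity():.2f}")
    print(f"specificity = {table.specificity():.2f}")
```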

We undertook an independent evaluation of the diagnostic test accuracy of imaging tests to anatomically map endometriotic lesions because multiple endometriotic implants can co‐exist at different sites in the same individual. For this 'region level' analysis, only analyses that recorded data estimates per number of participants were included, as information about the accuracy of imaging tests for mapping the disease is more informative and clinically applicable when presented as per‐participant calculations of accuracy estimates.

Combined evaluations of imaging tests and other methods of diagnosing endometriosis (e.g. pelvic examination; urine, endometrial or blood tests) are beyond the scope of this review and are presented separately in another review titled 'Combined tests for the non‐invasive diagnosis of endometriosis'. We excluded from the review studies that solely assessed specific technical aspects, radiological criteria or interobserver variability of index tests without reporting data on diagnostic performance.

The diagnostic performance of an index test was considered high when the test reached the criteria for a replacement test (sensitivity ≥ 94% with specificity ≥ 79%) or a triage test (sensitivity ≥ 95% with specificity ≥ 50%, or vice versa). We categorised as 'approaching' high accuracy imaging tests with diagnostic estimates within 5% of set thresholds. We considered all other diagnostic estimates as low.
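The categorisation above can be sketched as a small decision rule. This is an illustrative reading, not the review authors' implementation: it assumes that 'within 5% of set thresholds' means both estimates lie no more than 5 percentage points below the thresholds of at least one criterion. All names in the code are hypothetical.

```python
from typing import NamedTuple


class Criterion(NamedTuple):
    name: str
    min_sensitivity: float  # percent
    min_specificity: float  # percent


# Thresholds stated in the text, expressed in percent.
CRITERIA = (
    Criterion("replacement", 94.0, 79.0),
    Criterion("SnNout triage", 95.0, 50.0),
    Criterion("SpPin triage", 50.0, 95.0),
)


def meets(criterion: Criterion, sensitivity: float, specificity: float,
          margin: float = 0.0) -> bool:
    """True if both estimates reach the criterion, allowing an optional margin."""
    return (sensitivity >= criterion.min_sensitivity - margin
            and specificity >= criterion.min_specificity - margin)


def classify(sensitivity: float, specificity: float, margin: float = 5.0) -> str:
    """Label a test as 'high', 'approaching high' or 'low' accuracy.

    The 5-percentage-point margin is an assumed interpretation of
    'within 5% of set thresholds'.
    """
    if any(meets(c, sensitivity, specificity) for c in CRITERIA):
        return "high"
    if any(meets(c, sensitivity, specificity, margin) for c in CRITERIA):
        return "approaching high"
    return "low"


if __name__ == "__main__":
    for sens, spec in [(96.0, 80.0), (92.0, 77.0), (60.0, 97.0), (70.0, 70.0)]:
        print(sens, spec, classify(sens, spec))
```

Working in percentages rather than fractions keeps the threshold arithmetic exact for the values used here.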

Target conditions

Investigators assessed three target conditions.

  • Pelvic endometriosis: defined as endometrial tissue located within the pelvic cavity, including any of the pelvic organs, peritoneum and pouch of Douglas.

  • Ovarian endometriosis (endometrioma): defined as ovarian cysts lined by endometrial tissue and appearing as an ovarian mass of varying size.

  • DIE: defined as subperitoneal infiltration of endometrial implants, for example, when endometriotic implants penetrate the retroperitoneal space for a distance of 5 mm or more. Posterior DIE is the most common form of DIE, and both conditions are interchangeably reported. For the purpose of this review, we combined them as a single target condition ‐ DIE/posterior DIE.

In addition, the ability of diagnostic imaging to map endometriotic lesions at specific anatomical pelvic locations was evaluated. Anatomical locations included rectovaginal septum (RVS), uterosacral ligament (USL), vaginal wall, pouch of Douglas (POD) obliteration, anterior DIE, rectosigmoid colon and the entire bowel from ileum to rectum. These locations are defined in Table 3.

Table 3. Target conditions ‐ types and anatomical distribution of endometriosis

| Type of endometriosis | Description |
| --- | --- |
| Main clinical types of endometriosis | |
| Pelvic endometriosis | Endometriotic lesions, deep or superficial, located at any site in pelvic/abdominal cavity: on the peritoneum, fallopian tubes, ovaries, uterus, bowel, bladder or POD[a] |
| Ovarian endometriosis | Ovarian cysts lined by endometrial tissue (endometrioma) |
| DIE[b] | Deep endometriotic lesions extending more than 5 mm under the peritoneum located at any site of pelvic/abdominal cavity |
| Subtypes of deep endometriosis per anatomical localisation[c] | |
| Posterior DIE | Deep endometriotic lesions involve ≥ 1 site of the posterior pelvic compartment (USL[d], RVS[e], vaginal wall, bowel) and/or obliterate POD[a] |
| USL[d] endometriosis | Endometriotic lesions infiltrate uterosacral ligaments unilaterally or bilaterally |
| RVS[e] endometriosis | Deep endometriotic implants infiltrate the retroperitoneal area between posterior wall of vaginal mucosa and anterior wall of rectal muscularis |
| Vaginal endometriosis[f] | Endometriotic lesions infiltrate vaginal wall, particularly posterior vaginal fornix |
| POD[a] obliteration | Defined when the peritoneum of the POD[a] is only partially or no longer visible during surgery, and occurs as a result of adhesion formation; can be partial or complete, respectively |
| Bowel endometriosis | Endometriotic lesions infiltrating at least the muscular layer of the intestinal wall from ileum to rectum; predominantly affects rectosigmoid colon |
| Rectosigmoid endometriosis | Endometriotic lesions infiltrating at least the muscular layer of the rectosigmoid colon; the most common form of bowel endometriosis |
| Anterior DIE | Deep endometriotic lesions located at any site of the anterior pelvic compartment (bladder ± anterior pouch) |
| Rare types of endometriosis (not included in this review) | |
| Bladder endometriosis | Endometriotic lesions infiltrating bladder muscularis propria |
| Ureteral endometriosis | Endometriotic lesions involving ureters |
| Extrapelvic/Atypical endometriosis | Rare types of endometriosis involving various sites outside pelvic cavity, such as: CNS: cerebral endometriosis, extradural spinal endometriosis; Thoracic: pleural endometriosis, pulmonary endometriosis, diaphragmatic endometriosis; Abdominal: hepatic endometriosis, renal endometriosis, appendix endometriosis, pancreas endometriosis; Musculoskeletal: abdominal wall endometriosis, umbilical endometriosis, pyramidalis muscle endometriosis, inguinal endometriosis, canal of Nuck endometriosis; Perianal endometriosis, perineal endometriosis, extrapelvic endometriosis of sciatic nerve; Subcutaneous endometriosis, operative scar endometriosis |

[a] POD: pouch of Douglas

[b] DIE: deep infiltrating endometriosis

[c] Definitions of subtypes of DIE are adopted from Bazot 2007c. Additional definitions presented in the literature include 'Rectovaginal endometriosis (RVE)' defined as DIE that infiltrates the vagina, rectum and RVS and obliterates POD (Martin 2001) or 'deep retrocervical endometriosis' defined as involvement of USL, torus uterini, posterior vaginal fornix and/or RVS by endometriotic lesions (Abrao 2007).

[d] USL: uterosacral ligament

[e] RVS: rectovaginal septum

[f] Vaginal endometriosis is also defined as 'lesions infiltrating the anterior rectovaginal pouch, posterior vaginal fornix and retroperitoneal area between anterior rectovaginal pouch and posterior vaginal fornix' (Chapron 2003a)

Certain rare types of endometriosis such as extrapelvic, bladder and ureteric endometriosis were not included in this review because most of these were described in case reports or in case series, and laparoscopy and laparotomy are not reliable reference standards for these conditions.

We excluded studies in which the diagnosis of endometriosis was not the primary outcome of the trial (e.g. malignant vs benign masses, normal vs abnormal pelvis) and separate data for endometriosis were not available.

We also excluded studies in which findings of the index test formed the basis of selection for the reference standard because this was likely to distort any assessment of the diagnostic value of the index test.

We included studies that involved only selected populations of women with endometriosis (i.e. at specific rASRM stages), in view of emerging evidence on poor correlation of this classification with infertility and pain symptoms. Exclusion of these studies could result in loss of potentially important diagnostic information from otherwise eligible publications. When possible, we addressed the impact of these studies in investigations of heterogeneity. When a study analysed a large population with a wide spectrum of endometriosis and additionally reported subgroup analyses of different stages of disease severity, we considered only estimates for the entire population because subgroup analyses do not directly address the review question regarding clinical utility of imaging tests in detecting the disease.

Reference standards

The reference standard was visualisation of endometriosis at surgery (laparoscopy or laparotomy) with or without histological confirmation, as this is currently the best available test for endometriosis. We reviewed information regarding interobserver and intraobserver correlation of the reference standard, if reported.

We included only studies in which the reference test was performed within 12 months of the index test, on the assumption that disease status could change within a period of one year or longer, naturally or as a result of treatment. We did not include in this review studies in which the participants did not undergo the reference standard, or in which findings of the index test formed the basis of selection for the reference standard.

Summary of inclusion/exclusion criteria
Inclusion criteria

  • Types of studies.

    • Published peer‐reviewed.

    • Randomised controlled trials (RCTs).

    • Observational with prospective recruitment in the following design.

      • ‘Single‐gate design’ (single set of inclusion criteria defined by clinical presentation) ‐ all participants had clinically suspected endometriosis.

      • ‘Two‐gate design’ (two sets of inclusion criteria with respect to clinical presentation and one set of inclusion criteria with respect to reference standard) ‐ participants with or without clinical suspicion of endometriosis scheduled for abdominal surgery.

    • Published in any language.

    • Performed in any healthcare setting.

    • Any sample size.

  • Participants.

    • Women of reproductive age.

    • Women with clinically suspected endometriosis, including women who underwent abdominal surgery for other benign gynaecological conditions and surgical assessment for presence/absence of endometriosis.

    • Those who undertook both index test and reference standard.

  • Index tests.

    • One or several types of imaging tests.

    • Data reported in sufficient detail for construction of 2 × 2 tables and presented as 'participant‐level' analysis.

  • Target conditions.

    • Pelvic endometriosis, ovarian endometrioma, DIE or specific pelvic sites of DIE.

  • Reference standard.

    • Surgical visualisation of lesions for the diagnosis of endometriosis (laparoscopy or laparotomy) with or without histological verification.

    • Performed within 12 months of the index test.

Exclusion criteria

  • Types of studies.

    • Narrative or systematic reviews.

    • Retrospective design in which the index test was performed after execution of the reference test and/or participants were selected from a retrospective review of case notes.

    • Case reports or case series.

    • Conference proceedings.

  • Participants.

    • Included cohort was not representative of the target population that would benefit from the test (e.g. women with known genital tract malignancy, ectopic pregnancy, acute pelvic inflammatory disease).

    • Study included participants of postmenopausal age, and data for the reproductive age group were not available in isolation.

    • Only participants with positive index test or positive reference standard were included in the analysis.

  • Index tests.

    • Imaging tests were presented in combination with other diagnostic tests for endometriosis, and separate information was not available for the imaging modalities.

    • Study presented only specific technical aspects of an index test or data on interobserver variability, rather than diagnostic performance of the test.

    • Study presented only qualitative description of radiological appearance of endometriotic lesions.

    • Only the number of lesions rather than the number of participants with endometriosis was reported (i.e. 'lesion‐level' analyses).

  • Target conditions.

    • Endometriosis was not the primary outcome of the trial (e.g. malignant vs benign masses, normal vs abnormal pelvis).

    • Atypical, rare sites of endometriosis.

  • Reference standard.

    • Reference standard performed only in a subset of study/control group.

    • Findings of the index test formed the basis of selection for the reference standard.

    • Other than specified in inclusion criteria.

Search methods for identification of studies

The search strategy was developed in collaboration with the Trials Search Co‐ordinator of the Gynaecology and Fertility Review Group, according to recommendations provided in the Cochrane Handbook for Systematic Reviews of Diagnostic Test Accuracy (de Vet 2008). Searches were not limited to particular types of study design and did not apply language or publication date restrictions. The search strategy incorporated words in the title, abstracts, text words across the record and medical subject headings (MeSH). The search was created initially for one broad review examining all diagnostic tests for endometriosis, but because of the complexity of this review, it was split into five separate reviews, and separate searches were used for imaging tests and for biomarker tests (endometrial, blood, urinary, combined). All searches were performed from database inception to 20 April 2015. Search strategies for each database and the number of hits per search are presented in Appendix 1. A summary of search results is presented under Results of the search.

Electronic searches

We searched the following databases to identify published articles that assessed the diagnostic value of imaging tests for endometriosis.

  • MEDLINE.

  • EMBASE.

  • Cochrane Central Register of Controlled Trials (CENTRAL).

  • Cumulative Index to Nursing and Allied Health Literature (CINAHL).

  • PsycINFO.

  • Web of Science Core Collection.

  • Latin American Caribbean Health Sciences Literature (LILACS).

  • Open Archives Initiative database (OAIster).

  • Turning Research Into Practice database (TRIP).

  • Databases of trial registers.

    • ClinicalTrials.gov.

    • World Health Organization (WHO) International Clinical Trials Registry Platform (ICTRP) search portal.

  • Databases to identify reviews as a source of references to potentially relevant studies.

    • Database of diagnostic studies and diagnostic reviews (MEDION).

    • Database of Abstracts of Reviews of Effects (DARE).

    • PubMed, a ‘Systematic Review’ search under the ‘Clinical Queries’ link.

  • Searches for papers recently published and not yet indexed in the major databases.

    • PubMed (simple search of the past six months).

Searching other resources

We handsearched reference lists of all relevant publications (retrieved full texts of key articles and identified reviews).

We abandoned an attempt to locate grey literature (unpublished studies and conference proceedings), as we faced substantial difficulty in obtaining full‐text publications or additional details of studies reported in abstract form. This precluded reliable assessment of eligibility and methodological quality of studies, so we decided not to include these publication sources in this review.

Data collection and analysis

Selection of studies

One review author (VN) scanned the titles of studies identified by our search to remove clearly irrelevant articles, and reviewed titles and abstracts of remaining studies to select potentially relevant publications. Two review authors (VN and LH) independently reviewed full‐text versions of articles selected by title and abstract, and assessed eligibility for inclusion on the basis of criteria listed above under Criteria for considering studies for this review. A single failed eligibility criterion was sufficient for a study to be excluded from the review.

Review authors who assessed the relevance of studies and eligibility for inclusion were not blind to information about each article, including publishing journal, names of authors, institutions and results. Disagreements were resolved by discussion and, if necessary, by consultation with a third review author (CF), who is an expert in the field and in methodological aspects of Cochrane systematic reviews.

When papers updated previous publications and were performed on the same study population at different recruitment points, we used the most complete data set that superseded previous publications to avoid double counting of participants or studies. We retrieved missing data by directly contacting the authors to clarify study eligibility. When potentially relevant studies were found in languages other than English, we arranged for a translation. For excluded studies, we documented reasons for exclusion and details of which criteria were not met. We have presented characteristics of included, excluded and awaiting classification studies under Characteristics of included studies, Characteristics of excluded studies and Characteristics of studies awaiting classification, respectively.

Data extraction and management

Two independent review authors (VN, LH) extracted data from eligible studies and resolved disagreements by consulting with a third review author (CF). If required, we contacted study investigators to resolve questions regarding the data.

To collect details from included studies, we designed a data extraction form specifically for this review and pilot‐tested it on three studies of diagnostic accuracy tests for endometriosis. We recorded the following information for each study.

  • General information and study design: first author, year of publication, country, language, setting, objectives, inclusion/exclusion criteria, type of enrolment.

  • Characteristics of study participants: age, symptoms/history/previous tests, type of target condition and its prevalence in the study population, number of participants enrolled and available for analysis, reasons for withdrawal.

  • Features of the index test and the reference standard: type, diagnostic criteria, number and experience of the operators, blinding of operators to other tests and/or clinical data, interobserver variability, time interval between index test and reference standard.

  • Reported numbers of true‐positives (TPs), false‐negatives (FNs), true‐negatives (TNs) and false‐positives (FPs) were used to construct a 2 × 2 table for each index test. If these values were not reported, we attempted to reconstruct 2 × 2 tables from the diagnostic estimates presented in the article.
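The construction of a 2 × 2 table, and its reconstruction from reported estimates, can be sketched as follows. This is a minimal illustration, not the review authors' extraction procedure: the class name `TwoByTwo` and the helper `reconstruct` are hypothetical, and the reconstruction assumes that the numbers of participants with and without the target condition are reported alongside sensitivity and specificity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Counts of index test results cross-classified against surgery."""
    tp: int  # index test positive, target condition present
    fp: int  # index test positive, target condition absent
    fn: int  # index test negative, target condition present
    tn: int  # index test negative, target condition absent

    @property
    def sensitivity(self) -> float:
        return self.tp / (self.tp + self.fn)

    @property
    def specificity(self) -> float:
        return self.tn / (self.tn + self.fp)

    @property
    def prevalence(self) -> float:
        return (self.tp + self.fn) / (self.tp + self.fp + self.fn + self.tn)


def reconstruct(n_with: int, n_without: int,
                sensitivity: float, specificity: float) -> TwoByTwo:
    """Rebuild a table from reported estimates (hypothetical helper).

    Rounding is needed because published estimates are usually rounded.
    """
    tp = round(sensitivity * n_with)
    tn = round(specificity * n_without)
    return TwoByTwo(tp=tp, fp=n_without - tn, fn=n_with - tp, tn=tn)
```

Because published estimates are rounded, a reconstructed table should be checked against any other counts reported in the article before it is used.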

We extracted data into Review Manager (RevMan) software, which was used to graphically display quality assessment, diagnostic estimates data and descriptive analyses.

Assessment of methodological quality

We used QUADAS‐2, a modified version of the QUADAS tool, to assess the quality of each included study (Whiting 2011).

We have presented the review‐specific QUADAS‐2 tool and an explanatory document in Table 4. We judged each paper as having a 'low', 'high' or 'unclear' risk of bias for each of four domains, and we assessed concerns about applicability in three domains. We considered studies as having low methodological quality when classified at high or unclear risk of bias and/or high concern regarding applicability in at least one domain. Two review authors (VN, LH) independently assessed each included study and settled disagreements by reaching consensus. Two review authors independently piloted the topic‐specific tool to rate four of the included studies at a high level of agreement. We made the following modifications (specific to the imaging modalities review) to signalling questions of the original QUADAS‐2 tool.

Table 4. Application of the QUADAS‐2 tool for assessment of methodological quality of included studies

Domain 1 ‐ Patient selection

Description

Describe methods of participant selection and characteristics of the included population

Type of bias assessed

Selection bias, spectrum bias

Review question

Women of reproductive age with clinically suspected endometriosis (symptoms, clinical examination ± presence of pelvic mass), scheduled for surgical exploration of pelvic/abdominal cavity for confirmation of the diagnosis ± treatment

Information collected

Study objectives, study population, selection (inclusion/exclusion criteria), study design, clinical presentation, age, number of enrolled and number available for analysis, setting, place and period of the study

Signalling question

Was a consecutive or random sample of participants enrolled?

Yes

If a consecutive sample or a random sample of eligible participants was included in the study

No

If a non‐consecutive sample or a non‐random sample of eligible participants was included in the study

Unclear

All studies that did not specify enrolment as a consecutive or random sample of patients were classified as 'no'; therefore none of the included studies were classified as 'unclear'

Signalling question

Did the study avoid inappropriate exclusions?

Yes

If all participants with suspected endometriosis were included, with an exception for those not able to undergo an index test (e.g. virgins or genital tract anomalies for transvaginal imaging, claustrophobia for MRI) or unfit for surgery

No

If the study selected participants on the basis of particular clinical features (e.g. only suspected bowel involvement, were referred for treatment of deep endometriosis) or excluded participants with any co‐morbidities, other than specified above

Unclear

If the study did not provide clear definition of selection (inclusion/exclusion) criteria and 'no' judgement was not applicable

Signalling question

Was a two‐gate design avoided?

Yes

If the study had a single set of inclusion criteria, defined by the clinical presentation (i.e. only participants in whom the target condition is suspected) ‐ a ‘single‐gate design’

No

If the study had more than 1 set of inclusion criteria with respect to clinical presentation (i.e. participants suspected of target condition, participants with alternative diagnosis in whom the target condition would not be suspected in clinical practice) ‐ a 'two‐gate' study design

Unclear

If it was unclear whether a 'two‐gate design' was avoided

Risk of bias

Could the selection of participants have introduced bias?

High

If 'no' classification for any of the above 3 questions

Low

If 'yes' classification for 3 questions above

Unclear

If 'unclear' classification for any of the above questions and 'high risk' judgement were not applicable

Concerns about applicability

Are there concerns that included participants do not match the review question?

High

If the study population differed from the population defined in the review question in terms of demographic features and co‐morbidity (e.g. studies with multiple sets of inclusion criteria with respect to clinical presentation, including healthy controls or alternative diagnosis controls that would not have undergone index test in real practice). We excluded studies in which participants were not in the reproductive age group, and most included studies were of 'single‐gate' design; therefore, we expected few studies to be classified as 'high concern'

Low

If the study included only a clinically relevant population that would have undergone index test in real practice

Unclear

If this information was unclear

Domain 2 ‐ Index test

Description

Describe the index test, how it was conducted and interpreted

Type of bias assessed

Test review bias, clinical review bias, interobserver variation bias

Review question

Any type of imaging modality

Information collected

Index test name, description of positive case definition by index test as reported, examiners (numbers, level of expertise, blinding), interobserver variability, conflicts of interest

Signalling question

Were the index test results interpreted without knowledge of results of the reference standard?

Yes

We excluded studies in which the index test was performed retrospectively after execution of the reference standard; therefore, all included studies were classified 'yes'

No

Unclear

Signalling question

Did the study provide a clear prespecified definition of what was considered to be a 'positive' result of the index test?

Yes

If study provided clear definition of positive findings, and this was defined before execution/interpretation of index test

No

If definition of the positive result was not provided, or if study described findings derived from the index test and not defined before its execution

Unclear

If it was unclear whether the criteria were prespecified

Signalling question

Was the index test performed by a single operator or interpreted by consensus in a joint session?

Yes

If test was performed/interpreted by single operator or was interpreted after collegial discussion of the case

No

If test was performed/interpreted by various operators for different participants

Unclear

If this information was unclear

Signalling question

Were the same clinical data available when the index test results were interpreted as would be available when the test is used in practice?

Yes

If operators performing/interpreting the test were aware of suspected endometriosis and/or of the clinical history but were not aware of results of other imaging tests or of a previous diagnosis of endometriosis, including the results of previous surgeries

No

If operators performing/interpreting the test were informed of previously or recently surgically diagnosed endometriosis or were not blinded to results of other imaging tests or tests raising suspicion for endometriosis

Unclear

If this information was unclear

Risk of bias

Could the conduct or interpretation of the index test have introduced bias?

High

If 'no' classification for any of the above 4 questions

Low

If 'yes' classification for all the above 4 questions, or if 'unclear' classification for question 'Was the index test performed by a single operator or interpreted by consensus in a joint session?' and 'yes' classification for the remaining 3 questions

Unclear

If 'unclear' classification at least for the question 'Did the study provide a clear pre‐specified definition of what was considered to be a 'positive' result of index test?' or for the question 'Were the same clinical data available when the index test results were interpreted as would be available when the test is used in practice?' and 'high risk' judgement was not applicable

Concerns about applicability

Are there concerns that the index test, its conduct or its interpretation differs from the review question?

High

We did not consider studies in which index tests other than imaging modalities were included (or that excluded information on other index tests reported in addition to imaging modalities), or in which the index test looked at other target conditions not specified in the review (e.g. studies aimed at classifying pelvic masses as benign and malignant); therefore, none of the included studies was classified as 'high concern'

Low

We considered all types of imaging modalities as eligible; therefore, all included studies were classified as 'low concern', as anticipated

Unclear

Only studies with sufficient information on the index test were included; therefore, none of the included studies was classified as 'unclear concern'

Domain 3 ‐ Reference standard

Description

Describe the reference standard and how it was conducted and interpreted

Type of bias assessed

Verification bias, bias in estimation of diagnostic accuracy due to inadequate reference standard

Review question

Target condition ‐ pelvic endometriosis, ovarian endometriosis, DIE overall or at specific anatomical sites; Reference standard ‐ visualisation of endometriosis at surgery (laparoscopy or laparotomy) with or without histological confirmation

Information collected

Target condition, prevalence of target condition in the sample, reference standard, description of positive case definition by reference test as reported, examiners (numbers, level of expertise, blinding)

Signalling question

Is the reference standard likely to correctly classify the target condition?

Yes

If the study reported at least 1 of the following: surgical procedure described in sufficient detail and/or criteria for positive reference standard stated and/or the procedure was performed by the team with a high level of expertise in diagnosis/surgical treatment of the target condition

No

If the reference standard did not classify the target condition correctly; in the light of inclusion criteria and the nature of the reference standard, no studies were classified as 'no' for this item

Unclear

If information on execution of the reference standard or its interpretation or on operators was unclear

Signalling question

Were reference standard results interpreted without knowledge of results of the index tests?

Yes

If operators performing the reference test were unaware of the results of the index test

No

If operators performing the reference test were aware of the results of the index test

Unclear

If this information was unclear

Risk of bias

Could the reference standard, its conduct or its interpretation have introduced bias?

High

If 'no' classification for either of the above 2 questions

Low

If 'yes' classification for both of the above 2 questions

Unclear

If 'unclear' classification for either of the above 2 questions and 'high risk' judgement was not applicable

Concerns about applicability

Are there concerns that the target condition as defined by the reference standard does not match the question?

High

We excluded studies in which participants did not undergo surgery for diagnosis of endometriosis; therefore, none of the included studies were classified as 'high concern'

Low

In the light of inclusion criteria, all studies were classified as 'low concern', as anticipated

Unclear

Only studies in which laparoscopy/laparotomy served as a reference test were included; therefore, no included studies were classified as 'unclear concern'

Domain 4 ‐ Flow and timing

Description

Describe any participants who did not receive the index tests or the reference standard, or who were excluded from the 2 × 2 table; describe the interval and any interventions between index tests and the reference standard

Type of bias assessed

Disease progression bias, bias of diagnostic performance due to missing data

Review question

Less than 12‐month interval between index test and reference standard ‐ endometriosis may progress over time, so we chose an arbitrary interval of 12 months as the maximum acceptable time between the index test and surgical confirmation of the diagnosis

Information collected

Time interval between index test and reference standard, withdrawals (overall number reported and whether they were explained)

Signalling question

Was there an appropriate interval between index test and reference standard?

Yes

If time interval was reported and was less than 12 months

No

We excluded all studies for which the time interval was longer than 12 months; therefore, no included studies were classified as 'no' for this item

Unclear

If the time interval was not stated clearly but the study authors' description allowed one to assume that the interval was reasonably short

Signalling question

Did all participants receive the same reference standard?

Yes

In the light of inclusion criteria, all studies were classified as 'yes' for this item, as anticipated

No

Unclear

Signalling question

Were all participants included in the analysis?

Yes

If all participants were included in the analysis, or if participants were excluded because they did not meet inclusion criteria or if withdrawals were less than 5% of the enrolled population (arbitrarily selected cut‐off)

No

If any participants were excluded from the analysis because of uninterpretable results, because of inability to undergo index test or reference standard or for unclear reasons

Unclear

No studies were classified as 'unclear' for this item

Risk of bias

Could the participant flow have introduced bias?

High

If 'no' classification for any of the above 3 questions

Low

If 'yes' classification for all of the above 3 questions

Unclear

If 'unclear' classification for any of the above 3 questions and 'high risk' judgement was not applicable

Domain 1

  • An original signalling question 'Was a case control design avoided?' was rephrased as 'Was a two‐gate design avoided?'. Diagnostic studies are cross‐sectional in nature, aiming to compare results of an index test versus results of the reference standard in the same group of participants. In these studies, parameters are measured at a single point in time, and groups are classified by the outcome of the reference standard, although the analysis is performed retrospectively. Therefore, unlike in epidemiological studies, the terms 'cohort' and 'case‐control' are less informative for diagnostic test trials and were replaced by 'single‐gate' and 'two‐gate' designs. We included this question because a two‐gate design has greater potential to introduce selection bias.

Domain 2

  • An additional signalling question 'Was the index test performed by a single operator?' was included to assess interobserver variation bias.

  • An additional signalling question 'Were the same clinical data available when the index test results were interpreted as would be available when the test is used in practice?' was included to assess bias in clinical applicability.

  • An original signalling question 'If a threshold was used, was it prespecified?' was rephrased as 'Did the study provide a clear prespecified definition of what was considered to be a positive index test result?' because this question was more applicable to imaging modalities.

We assessed methodological quality for each domain but did not calculate a summary score to estimate the overall quality of studies (Whiting 2005).
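The judgement rules in Table 4 can be expressed compactly in code. The following is a minimal sketch of our reading of those rules, not a tool used by the review authors; the function names are hypothetical. The generic rule applies to domains 1, 3 and 4, whereas the index test domain tolerates an 'unclear' answer to the single‐operator question. The last function follows the definition of low methodological quality given in this section.

```python
from typing import Sequence

YES, NO, UNCLEAR = "yes", "no", "unclear"


def risk_of_bias(answers: Sequence[str]) -> str:
    """Generic rule for domains 1, 3 and 4 (hypothetical helper)."""
    if NO in answers:
        return "high"
    if all(a == YES for a in answers):
        return "low"
    return "unclear"


def index_test_risk_of_bias(blinded: str, prespecified: str,
                            single_operator: str,
                            same_clinical_data: str) -> str:
    """Rule for domain 2, following the 'Risk of bias' rows of Table 4."""
    answers = (blinded, prespecified, single_operator, same_clinical_data)
    if NO in answers:
        return "high"
    # An 'unclear' answer to the single-operator question alone
    # does not move the domain away from 'low'.
    if all(a == YES for a in (blinded, prespecified, same_clinical_data)):
        return "low"
    return "unclear"


def is_low_quality(risks: Sequence[str], concerns: Sequence[str]) -> bool:
    """High or unclear risk of bias, and/or high applicability concern,
    in at least one domain."""
    return any(r in ("high", "unclear") for r in risks) or "high" in concerns
```

For example, a study with answers 'yes', 'yes', 'unclear', 'yes' in the index test domain is rated 'low' risk for that domain, whereas the same pattern of answers under the generic rule would be rated 'unclear'.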

Statistical analysis and data synthesis

We analysed diagnostic imaging techniques in the following subsets.

  • Tests for detecting pelvic endometriosis.

  • Tests for detecting ovarian endometriosis (endometrioma).

  • Tests for detecting DIE.

  • Tests for identifying deep endometriotic lesions at separate pelvic anatomical sites (USL, RVS, vaginal wall, obliterated POD, rectosigmoid colon, bowel (ileum to caecum)).

We generated estimates of sensitivity and specificity in forest plots and plotted them in the receiver operating characteristic (ROC) space for each index test using RevMan. We investigated the diagnostic performance of each test and visually explored interstudy variation in performance of each index test in relation to participant characteristics, study design and study quality factors. We included two or more tests evaluated in the same cohort as separate data sets because the unit of analysis was the test result ‐ not the participant.

We obtained the estimate of the expected operating point (mean sensitivity and specificity) and corresponding 95% confidence region by using the bivariate logit normal random‐effects model for all meta‐analyses including four or more studies. When fewer than four studies were included, we did not attempt to estimate co‐variance and reported this as zero. To estimate the performance of other tests in small meta‐analyses (two to three data sets), we performed a fixed‐effect meta‐analysis in the absence of substantial heterogeneity, resulting in the summary estimate for sensitivity and for specificity. We performed meta‐analyses by using SAS NLMIXED software. We entered results from SAS into RevMan to provide plots of estimated mean or summary points and confidence regions, superimposed on study‐specific estimates of sensitivity and specificity.
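The choice of model by number of data sets, and the small fixed‐effect analysis, can be illustrated as follows. This is a sketch under stated assumptions and is not the SAS NLMIXED code used for the review. It assumes that a fixed‐effect model with a common sensitivity and a common specificity reduces to pooling counts across data sets, that confidence intervals are Wald intervals on the logit scale, and that a single data set is not meta‐analysed. All function names are hypothetical. The bivariate random‐effects model itself is not implemented here.

```python
import math
from typing import Dict, Iterable, Tuple

Table = Tuple[int, int, int, int]  # (TP, FP, FN, TN) for one data set


def choose_model(n_datasets: int) -> str:
    """Model selection by number of data sets, as described in the text."""
    if n_datasets >= 4:
        return "bivariate random-effects"
    if n_datasets >= 2:
        return "fixed-effect"
    return "no meta-analysis"  # assumption: a single data set is not pooled


def logit(p: float) -> float:
    return math.log(p / (1.0 - p))


def inv_logit(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def pooled_proportion(events: int, non_events: int,
                      z: float = 1.96) -> Tuple[float, float, float]:
    """Pooled proportion with a Wald interval on the logit scale.

    Requires both counts to be greater than zero.
    """
    p = events / (events + non_events)
    se = math.sqrt(1.0 / events + 1.0 / non_events)
    centre = logit(p)
    return p, inv_logit(centre - z * se), inv_logit(centre + z * se)


def fixed_effect_summary(tables: Iterable[Table]) -> Dict[str, Tuple[float, float, float]]:
    """Summary sensitivity and specificity from pooled counts."""
    rows = list(tables)
    tp = sum(r[0] for r in rows)
    fp = sum(r[1] for r in rows)
    fn = sum(r[2] for r in rows)
    tn = sum(r[3] for r in rows)
    return {
        "sensitivity": pooled_proportion(tp, fn),
        "specificity": pooled_proportion(tn, fp),
    }
```

Working on the logit scale keeps the interval inside the range 0 to 1, which matters for tests with sensitivity or specificity close to 100%.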

We assessed the comparative accuracy of index tests for each target condition in two ways. In direct, fully paired comparisons in which all study participants received more than one index test, as well as the reference standard, we plotted the estimates in RevMan. If meta‐analysis was possible, we used test‐level co‐variates in the bivariate logit normal model to identify statistically significant differences. Otherwise, we reported available comparative data in a narrative way and illustrated the data using forest and ROC plots.

When test performance was judged against predetermined diagnostic criteria, we considered the point estimates of sensitivity and specificity as the most informative presentation of test performance. We acknowledge that tests with point estimates that did not reach the predetermined criteria but with confidence intervals (CIs) that contained values above the threshold could have provided diagnostic value. Furthermore, tests with point estimates that reached the criteria but with CIs that contained values below the threshold could have provided overestimated diagnostic value. If the range of the CIs rather than the point estimates is used, the predetermined cut‐off becomes meaningless. Therefore we did not consider CIs in qualifying the test performance but used this information when interpreting reliability of the data obtained.

Dealing with missing data

We defined missing data as any information regarding study population, index tests or reference standards that was not available in the publication but was required to determine the eligibility of the study for inclusion, to assess its methodological quality or to construct results tables. If we identified missing data, we contacted study authors in an attempt to obtain this information. If missing data prevented a clear judgement regarding applicability for inclusion or construction of accurate 2 × 2 tables, and if data were not provided by the primary investigators (e.g. we were not able to locate contact details of study authors, we received no reply from study authors, study authors replied that the requested information was unavailable), we excluded the study from the review.

Investigations of heterogeneity

We initially assessed heterogeneity by visually examining forest plots of sensitivities and specificities and ROC plots for all index tests. We stated potential sources of heterogeneity under Secondary objectives. For diagnostic tests that involved more than 10 eligible studies or data sets, we planned to formally explore heterogeneity by using study level co‐variates. We also planned to assess the sensitivity of results to inclusion and exclusion of outlying studies in all analyses but refrained from doing so because of the small number of studies available for most analyses. It is important to use caution when interpreting small meta‐analyses (few studies) with a limited total sample size.

Sensitivity analyses

We planned to conduct sensitivity analyses to assess the impact of the methodological quality of included studies on results of the meta‐analysis, if sufficient data were available. We defined low‐quality studies as having high risk of bias for one or more QUADAS‐2 domains. We also planned to use the ’leave‐one‐out’ procedure to assess the impact of each study on results of the meta‐analysis (leading study effect), but we were not able to do this because of the small number of studies available for most groups of tests.

Assessment of reporting bias

A comprehensive search of multiple sources for eligible studies, a search of trial registers and application of no language restrictions minimised the risk of reporting bias. However, publication bias generally arises when studies have a greater chance of being published if their results are positive. Therefore, we initially searched unpublished and published study databases and conference proceedings and evaluated identified sources. During the process of qualifying studies for inclusion in this review, we faced substantial difficulty in obtaining full‐text publications or additional details of studies published in abstract form. This precluded reliable assessment of eligibility and methodological quality, and we decided to exclude these publication sources from this review.

Results

Results of the search

The literature search identified 32,275 references as follows: MEDLINE (n = 7391), EMBASE (n = 12,161), CENTRAL (n = 445), CINAHL (n = 668), PsycINFO (n = 174), Web of Science (n = 7425), LILACS (n = 420), OAIster (n = 446), TRIP (n = 1648), trial registers for ongoing and registered trials (n = 523), MEDION (n = 190), DARE (n = 99), PubMed, a ‘Systematic Review’ search (n = 418) and simple search PubMed (n = 267). We searched these databases from inception to 20 April 2015. The flow of the selection process is presented in Figure 2. We screened titles to exclude duplicates (n = 10,705) and clearly irrelevant studies (n = 19,189). We eliminated another 2205 references after reading the abstracts because they did not address the research question or clearly did not meet the inclusion criteria. We retrieved the full texts of the remaining 181 references and assessed them for eligibility. Data from 63 studies required additional clarification from study authors and 25 non‐English publications were translated. Ultimately, 49 studies that were eligible according to the inclusion criteria provided data for the review; we excluded 132 studies. In addition, we identified three ongoing trials through the clinical trials registries (Characteristics of ongoing studies) but found that the outcomes of these studies were not yet available (two trials were still open to participant recruitment, and the status of one study was unclear). We will monitor and address the progress of these studies in future updates.


Flow of studies identified in literature search for systematic review on imaging modalities for a non‐invasive diagnosis of endometriosis.
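The per‐database counts reported above can be checked against the stated totals with a few lines of arithmetic. This is only an illustrative check: the dictionary keys are shorthand labels chosen here, and the figures are copied from the text.

```python
# Hits per source, as reported in Results of the search.
hits = {
    "MEDLINE": 7391,
    "EMBASE": 12161,
    "CENTRAL": 445,
    "CINAHL": 668,
    "PsycINFO": 174,
    "Web of Science": 7425,
    "LILACS": 420,
    "OAIster": 446,
    "TRIP": 1648,
    "Trial registers": 523,
    "MEDION": 190,
    "DARE": 99,
    "PubMed systematic review search": 418,
    "PubMed simple search": 267,
}

total_references = sum(hits.values())

# Full-text assessment: included plus excluded studies.
included, excluded = 49, 132
full_texts_assessed = included + excluded

print(total_references)     # 32275
print(full_texts_assessed)  # 181
```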


Basic features of included studies

We have presented the list and details of the included studies under Characteristics of included studies. The 49 included studies involved 4807 participants, with a median of 87 women per study (range 10 to 710). Of these studies, 27 were conducted in Europe, six in South America, four in Asia, two in North America, three in Australia and one in the Middle East. Ninety‐four per cent (46/49) of these studies were conducted at university hospitals, of which 14 were designated as referral centres for endometriosis. The earliest article was published in 1993, 41 articles were published after 2000 and 26 after 2010. All included studies assessed women of reproductive age; 46 studies included a population with clinical suspicion of endometriosis based on symptoms and clinical examination with or without an ovarian mass. Two studies assessed only women with a persistent ovarian mass (Guerriero 1996a; Guerriero 1996b), and one study focused exclusively on women undergoing infertility workup (Ubaldi 1998). Only one study (Eskenazi 2001) used a 'two‐gate design' and included a wider group of participants, defined as 'women scheduled to undergo laparoscopy or laparotomy for pelvic pain, infertility, tubal ligation, or masses of the adnexa or uterus'. Eleven studies (Okada 1995; Guerriero 1996a; Guerriero 1996b; Ghezzi 2005; Takeuchi 2005; Chamie 2009a; Fastrez 2011; Manganaro 2012a; Manganaro 2012b; Manganaro 2013; Mangler 2013) reported abnormal imaging findings (other than the index test) as one of the inclusion criteria, but the remaining studies presented no information on pre‐enrolment imaging tests. One study limited the study population to 'women with symptoms suggestive of endometriosis with normal ovarian size and no evidence of an ovarian cyst' (Said 2014).
Seventeen studies (Stratton 2003; Biscaldi 2007; Bazot 2009; Hottat 2009; Piketty 2009; Bergamini 2010; Falco 2011; Fastrez 2011; Ferrero 2011; Savelli 2011; Mangler 2013; Reid 2013a; Stabile 2013; Biscaldi 2014; Guerriero 2014; Piessens 2014; Said 2014) included women with a history of previous surgery for endometriosis representing 7% to 66% of the study population. Two studies (Holland 2010; Mangler 2013) included participants with a recent laparoscopic diagnosis of endometriosis who were awaiting definitive surgery; however, index test operators were blind to previous surgical findings. Nine studies described exclusion of participants who had undergone any pelvic surgery (Dessole 2003; Ghezzi 2005; Takeuchi 2005; Chamie 2009a; Biscaldi 2014; Said 2014) or specific excision of DIE (Fedele 1998; Hudelist 2011a; Hudelist 2013). Laparoscopy was the predominant surgical modality in all studies, whereas laparotomy was reserved for selected cases. Eighty‐eight per cent (43/49) of the included studies used histopathology to confirm the surgical diagnosis. The reported prevalence of endometriosis varied, ranging from 43% to 100% for pelvic endometriosis, from 7.5% to 100% for ovarian endometriosis and from 30% to 100% for DIE.

Authors of five papers declared that they received no financial support from external sources (Ribeiro 2008a; Hottat 2009; Fastrez 2011; Manganaro 2012b; Manganaro 2013). Guerriero 2014 stated that this study was partially supported by the Regione Autonoma della Sardegna (project code CPR‐24750) but declared no conflict of interest. Stratton 2003 and Hudelist 2013 declared that work was funded by the Intramural Program, National Institute of Child Health and Human Development, Bethesda, Maryland, and by the OEGEO, Österreichische Gesellschaft für Endokrinologische Onkologie, respectively, but made no statement regarding a conflict of interest. Nine other articles declared no conflict of interest (Guerriero 2008; Bazot 2009; Hottat 2009; Fastrez 2011; Manganaro 2012b; Manganaro 2013; Mangler 2013; Said 2014; Thomeer 2014), and the remaining included studies provided no such information.

Basic features of excluded studies

We have presented the list and descriptions of excluded studies under Characteristics of excluded studies. On the basis of full‐text assessment, we excluded 132 publications, 34 of which were of retrospective design by which the population was selected from medical records, and index tests were reviewed retrospectively. We excluded an additional 26 studies as they reported outcomes for number of lesions ‐ not number of participants (a 'lesion‐level' analysis). Twenty‐six studies were not diagnostic test accuracy studies and focused on technical aspects of the test, interobserver variability or a description of radiological criteria of the target condition. We excluded 11 papers as they enrolled a wide age group (n = 9) or pregnant women (n = 2), and independent assessment of women of reproductive age could not be performed. Many articles in this excluded group were comparisons of endometriomas versus benign and malignant ovarian masses in older women. In eight excluded papers, a reference standard other than surgery was used, or investigators provided no data on surgical diagnosis. In nine of the excluded studies, the target condition was outside the inclusion criteria, and data were reported for benign versus malignant masses or normal versus abnormal pelvis with no separate data given for endometriosis. We excluded another eight studies as they reported on a cohort that overlapped with a cohort in another updated included paper. Four studies presented insufficient descriptions of methods and/or study populations and provided no information. We could not extract data for 2 × 2 tables from three studies. For two other studies, the index test was outside the inclusion criteria, reporting data for a combination of imaging tests and pelvic examination. We excluded one study as investigators did not consider healthy controls in the analysis, and another study because the time interval between index test and surgery exceeded 12 months.

Methodological quality of included studies

We have presented details on the quality of included studies in the QUADAS‐2 results summary (Figure 3 and Figure 4). Overall, most studies were of poor methodological quality, and only one study (Thomeer 2014) was assigned low risk of bias in every domain assessed.


Risk of bias and applicability concerns graph: review authors' judgements about each domain presented as percentages across included studies.


Risk of bias and applicability concerns summary: review authors' judgements about each domain for each included study.

Twenty‐six studies presented high risk of patient selection bias (Ha 1994; Ascher 1995; Okada 1995; Fedele 1998; Ubaldi 1998; Eskenazi 2001; Dessole 2003; Takeuchi 2005; Biscaldi 2007; Menada 2008a; Ribeiro 2008a; Chamie 2009a; Piketty 2009; Bergamini 2010; Grasso 2010; Falco 2011; Fastrez 2011; Ferrero 2011; Hudelist 2011a; Manganaro 2012a; Manganaro 2012b; Manganaro 2013; Reid 2013a; Biscaldi 2014; Leon 2014; Said 2014), nine were rated as having unclear risk (Sugimura 1993; Stratton 2003; Guerriero 2007; Guerriero 2008; Bazot 2009; Pascual 2010; Bazot 2013; Mangler 2013; Piessens 2014) and 14 demonstrated low risk. Non‐consecutive or non‐random enrolment, absence of a clear definition of inclusion/exclusion criteria and inclusion of a highly selected group of participants were the main reasons for assessment of high risk of bias.

Eleven studies presented with high risk of index test interpretation bias (Sugimura 1993; Fedele 1998; Dessole 2003; Bergamini 2010; Holland 2010; Fastrez 2011; Ferrero 2011; Savelli 2011; Mangler 2013; Piessens 2014; Reid 2014), 10 demonstrated unclear risk (Okada 1995; Guerriero 1996a; Guerriero 1996b; Ghezzi 2005; Guerriero 2007; Guerriero 2008; Pascual 2010; Manganaro 2012a; Scarella 2013; Leon 2014) and 28 were rated as having low risk. Lack of clear prespecified criteria for a positive diagnosis and lack of blinding of index test operators to the clinical history or to results of other diagnostic tests were the main reasons for high risk assessment. High risk of bias for this domain was also attributed to articles in which the index test was performed/interpreted by different operators for different participants, as varying skill levels could undermine the reliability of results. Overall, interobserver variability was rarely reported. Six studies stated that disagreement between test operators was resolved by consensus in a joint session (Ascher 1995; Ghezzi 2005; Biscaldi 2007; Chamie 2009a; Manganaro 2012a; Thomeer 2014); two calculated accuracy estimates of the index test separately for the two examiners (Hottat 2009; Holland 2010) and eight assessed interobserver and intraobserver variability in the whole cohort or in a subset of randomly selected participants (Ubaldi 1998; Guerriero 2008; Hottat 2009; Manganaro 2012b; Bazot 2013; Stabile 2013; Guerriero 2014; Thomeer 2014). None of the included studies carried risk of test review bias, as studies in which the index test was performed or interpreted after execution of the reference standard were excluded. As criteria for a positive index test were variable between studies and as index test protocols were not standardised, quality judgements for the index test were complex; however, these factors were not directly addressed by the QUADAS‐2 tool.

Fifteen studies were at high risk of bias in the 'reference standard' domain (Fedele 1998; Dessole 2003; Abrao 2007; Ribeiro 2008a; Bergamini 2010; Ferrero 2011; Hudelist 2011a; Savelli 2011; Hudelist 2013; Mangler 2013; Biscaldi 2014; Guerriero 2014; Leon 2014; Piessens 2014; Reid 2014), 27 were classified as unclear risk (Sugimura 1993; Ha 1994; Ascher 1995; Okada 1995; Guerriero 1996a; Guerriero 1996b; Ubaldi 1998; Eskenazi 2001; Ghezzi 2005; Biscaldi 2007; Guerriero 2007; Guerriero 2008; Menada 2008a; Bazot 2009; Chamie 2009a; Piketty 2009; Goncalves 2010; Grasso 2010; Pascual 2010; Falco 2011; Manganaro 2012a; Manganaro 2012b; Bazot 2013; Manganaro 2013; Reid 2013a; Stabile 2013; Said 2014) and seven demonstrated low risk. We assigned high risk of bias when reference standards were interpreted with knowledge of index test results. Although it would be unethical to withhold from surgeons information on preoperative imaging investigations, lack of blinding to the index test contributes to diagnostic review bias. Most studies provided insufficient information to indicate how likely the reference standard was to have correctly classified the target condition. Specifically, surgical procedures were not well described, criteria for a positive reference standard were not stated or no information was provided regarding the experience of the surgeons and/or pathologists involved.

Ten studies were at high risk of bias in the 'flow and timing' domain (Ascher 1995; Stratton 2003; Hottat 2009; Falco 2011; Savelli 2011; Bazot 2013; Scarella 2013; Guerriero 2014; Leon 2014; Piessens 2014), four were at unclear risk (Dessole 2003; Takeuchi 2005; Chamie 2009a; Bergamini 2010) and 35 demonstrated low risk. A study was classified as having high risk of bias when withdrawals were not adequately explained and exceeded 5% of the enrolled population. In all studies, the interval between index test and reference standard was 12 months or less, and the most commonly reported time interval was up to three months. In every study, all participants received the same reference standard.
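The domain counts reported in the four paragraphs above can be converted to the percentages that a risk of bias graph displays. The following is a minimal sketch; the counts are transcribed from the text, while the dictionary and function names are illustrative and not part of the review.

```python
# Number of included studies per QUADAS-2 risk of bias judgement,
# transcribed from the text above (49 included studies in total).
RISK_OF_BIAS = {
    "patient selection": {"high": 26, "unclear": 9, "low": 14},
    "index test": {"high": 11, "unclear": 10, "low": 28},
    "reference standard": {"high": 15, "unclear": 27, "low": 7},
    "flow and timing": {"high": 10, "unclear": 4, "low": 35},
}


def as_percentages(counts):
    """Convert the judgement counts for one domain to percentages (1 decimal)."""
    total = sum(counts.values())
    return {judgement: round(100 * n / total, 1) for judgement, n in counts.items()}


for domain, counts in RISK_OF_BIAS.items():
    # Each domain should account for all 49 included studies.
    print(domain, sum(counts.values()), as_percentages(counts))
```

Summing each domain is a quick consistency check: every row accounts for all 49 included studies.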

One study presented high concern for patient selection applicability (Eskenazi 2001), and the remaining 48 studies demonstrated low concern. We assigned high concern for patient selection applicability if the study utilised two‐gate selection for cases and controls, as any sampling deviation from a representative group of the entire clinically relevant population could skew the estimates of diagnostic accuracy in any direction. No studies had concerns about applicability in 'index test' and 'reference standard' domains.

Findings

Findings are presented under two main categories.

Diagnostic tests for endometriosis

We analysed the diagnostic test accuracy of imaging tests for three types of endometriosis in a total of 29 studies.

  • Pelvic endometriosis at all locations at any depth of invasion (13 studies, 1535 participants).

  • Ovarian endometriosis (10 studies, 852 participants).

  • DIE/posterior DIE (15 studies, 1493 participants).

Findings are outlined in summary of findings Table 1 and Appendix 2. Twelve studies performed eight head‐to‐head direct comparisons between tests. Data were insufficient to permit meta‐analyses of paired tests, hence, we have reported available comparative results narratively and have illustrated them in forest plots and ROC plots.

Pelvic endometriosis
Pelvic endometriosis using ultrasonography

Five articles, which included a total of 1222 participants, were published between 2001 and 2014 and explored the accuracy of TVUS in diagnosing pelvic endometriosis. These studies were conducted in Europe (n = 4) and in the Middle East (n = 1). The mean sensitivity and specificity of all included studies were 0.65 (95% confidence interval (CI) 0.27 to 1.00) and 0.95 (95% CI 0.89 to 1.00). Four studies evaluated conventional TVUS, and one study addressed the tenderness‐guided method (tg‐TVUS). Forest plots (Figure 5) and the ROC plot (Figure 6) demonstrated a high degree of heterogeneity between papers, which was greater for estimates of sensitivity than of specificity. One of the studies (710 participants) (Ghezzi 2005) utilised the 'kissing sign' as the sole marker of endometriosis, in contrast to the other four studies, which surveyed pelvic anatomy in general. This paper reported markedly low sensitivity at 0.09 (95% CI 0.06 to 0.12), which influenced the sensitivity estimate for the group, as the range of sensitivities for the other four studies (512 participants) was between 0.56 and 0.96, whereas specificities ranged between 0.80 and 0.99. The mean sensitivity and specificity of these four studies were 0.79 (95% CI 0.36 to 1.00) and 0.91 (95% CI 0.74 to 1.00). Even when data from the large outlying study were excluded, sensitivity and specificity estimates were heterogeneous and confidence intervals wide, and estimates did not meet the criteria for a replacement or a triage test but approached the criteria for a SpPin triage test. No other ultrasound techniques were evaluated as a diagnostic test for pelvic endometriosis.


Forest plot of TVUS for detection of pelvic endometriosis. Plot shows study‐specific estimates of sensitivity and specificity (squares) with 95% CI (black line) and country in which the study was conducted. Studies are ordered according to the year of publication. FN: false negative; FP: false positive; TN: true negative; TP: true positive. Modifications to the conventional TVUS are presented as 'modified method'.


Summary ROC plot of TVUS for detection of pelvic endometriosis. Each point represents the pair of sensitivity and specificity from a study. The size of each point is proportional to the study sample size. The solid black circle represents the pooled sensitivity and specificity, which is surrounded by a 95% confidence region (dashed line).

Pelvic endometriosis using MRI

Seven studies, including 10 data sets with a total of 303 participants, assessed the value of MRI in detecting pelvic endometriosis. Eligible MRI evaluations were published between 1993 and 2011, and most (n = 4) were published in the early 1990s. Studies were conducted in Asia (n = 3), North America (n = 2) and Europe (n = 2). Five different MRI methods were assessed: (1) T1/T2‐w MRI (three studies, 97 participants); (2) fat‐suppressed MRI (one study, 31 participants); (3) T1/T2‐w MRI with fat‐suppression (two studies, 105 participants); (4) T1/T2‐w MRI with fat‐suppression/Gd (two studies, 77 participants); and (5) 3.0T MRI (two studies, 86 participants). Three studies compared more than one MRI method in the same cohort of women (Sugimura 1993; Ha 1994; Ascher 1995). The mean sensitivity and specificity of all included studies were 0.79 (95% CI 0.70 to 0.88) and 0.72 (95% CI 0.51 to 0.90), which did not meet the criteria for a replacement or a triage test. Forest plots (Figure 7) and the ROC plot (Figure 8) showed a high degree of heterogeneity for estimates of both sensitivity and specificity.


Forest plot of MRI for detection of pelvic endometriosis. Plot shows study‐specific estimates of sensitivity and specificity (squares) with 95% CI (black line) and country in which the study was conducted. Studies are ordered according to year of publication. Tests on the same population (different MRI methods) are presented separately as MRI* and MRI**. FN: false negative; FP: false positive; TN: true negative; TP: true positive.


Summary ROC plot of MRI for detection of pelvic endometriosis. Each point represents the pair of sensitivity and specificity from a study. The size of each point is proportional to the study sample size. Tests on the same population (different MRI methods) are presented separately as MRI* and MRI**. The solid black circle represents the pooled sensitivity and specificity, which is surrounded by a 95% confidence region (dashed line).

Pelvic endometriosis using other imaging modalities

Authors of one paper determined the accuracy of 18FDG PET‐CT in detecting pelvic endometriosis (10 participants, published in 2011, conducted in Europe), showing sensitivity of 0.00 (95% CI 0.00 to 0.34) and specificity of 1.00 (95% CI 0.03 to 1.00). Similarly, a different group reported negative findings for the same test in another small descriptive study, which did not meet the inclusion criteria (Setubal 2011). No other imaging techniques described in the included studies evaluated pelvic endometriosis.

Indirect comparisons of imaging tests for pelvic endometriosis

With regard to TVUS modalities, no specific technique, year of publication or geographical location resulted in a better performing method. The two most recent small studies evaluated 3.0T MRI; each showed high sensitivity and specificity for diagnosing pelvic endometriosis (sensitivity 0.97, 95% CI 0.84 to 1.00; specificity 1.00, 95% CI 0.77 to 1.00 ‐ Manganaro 2012a; sensitivity 0.81, 95% CI 0.65 to 0.92; specificity 1.00, 95% CI 0.29 to 1.00 ‐ Thomeer 2014). The latter study displayed wide confidence intervals, suggesting that caution should be used in interpreting these findings. Different MRI methods were not formally compared because the small number of studies and their small size precluded meaningful results.
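The width of an interval around a specificity of 1.00 depends only on how many women without the disease were examined. The sketch below illustrates this under two assumptions that this section does not state: that the intervals are exact binomial (Clopper‐Pearson) intervals, and that the two 3.0T MRI studies included 14 and 3 disease‐free women, respectively. These group sizes were chosen only because they reproduce the reported lower limits of 0.77 and 0.29.

```python
def exact_bounds_all_or_none(n, level=0.95):
    """Clopper-Pearson limits for the two extreme outcomes of n observations.

    If all n observations are positive, the exact lower limit is
    (alpha / 2) ** (1 / n) and the upper limit is 1.
    If none are positive, the upper limit is 1 minus that value
    and the lower limit is 0.
    Returns (lower limit when n of n, upper limit when 0 of n).
    """
    alpha = 1 - level
    lower_if_all = (alpha / 2) ** (1 / n)
    return lower_if_all, 1 - lower_if_all


# Assumed numbers of disease-free women (not reported in this section).
for n in (14, 3):
    lower, _ = exact_bounds_all_or_none(n)
    print(f"{n} of {n} correctly negative: specificity 1.00, "
          f"lower 95% limit {lower:.2f}")
```

With only three disease‐free participants, even a perfect observed specificity is compatible with a true value as low as about 0.29, which is why the estimate warrants caution.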

Mean estimates of TVUS after exclusion of the outlier study showed comparable sensitivity but higher specificity than were seen with MRI.

Direct comparisons of imaging tests for pelvic endometriosis

Three studies made a direct head‐to‐head comparison of two or three MRI methods, but all were small and inconclusive and reported wide and overlapping confidence intervals (Ha 1994; Ascher 1995; Sugimura 1993) (see Appendix 2; Figure 9; Figure 10; Figure 11). No studies have compared MRI and TVUS.


Forest plot demonstrating the direct comparison between MRI methods for pelvic endometriosis. Plot shows study‐specific paired estimates of sensitivity and specificity (squares) with 95% CI (black line) and country in which the study was conducted. FN: false negative; FP: false positive; TN: true negative; TP: true positive.


Forest plot demonstrating the direct comparison between MRI methods for pelvic endometriosis. Plot shows study‐specific paired estimates of sensitivity and specificity (squares) with 95% CI (black line) and country in which the study was conducted. FN: false negative; FP: false positive; TN: true negative; TP: true positive.


Forest plot demonstrating the direct comparison between MRI methods for pelvic endometriosis. Plot shows study‐specific paired estimates of sensitivity and specificity (squares) with 95% CI (black line) and country in which the study was conducted. FN: false negative; FP: false positive; TN: true negative; TP: true positive.

Ovarian endometriosis
Ovarian endometriosis using ultrasonography

Eight studies with a total of 765 participants explored the diagnostic accuracy of TVUS for ovarian endometriosis. These studies were published between 1996 and 2015. Studies were conducted in Europe (n = 6), Australia (n = 1) and South America (n = 1). Mean sensitivity and specificity estimates for all included studies were 0.93 (95% CI 0.87 to 0.99) and 0.96 (95% CI 0.92 to 0.99), respectively, meeting the criteria for a SpPin triage test and approaching the criteria for a replacement test and a SnNout triage test. Estimates for both sensitivity and specificity showed less heterogeneity than was seen in other types of endometriosis (Figure 12; Figure 13).


Forest plot of US methods (TVUS, TRUS) for detection of ovarian endometriosis. Plot shows study‐specific estimates of sensitivity and specificity (squares) with 95% CI (black line) and country in which the study was conducted. Studies are presented for TVUS and TRUS and are ordered according to year of publication. FN: false negative; FP: false positive; TN: true negative; TP: true positive. Modifications to the conventional TVUS are presented as 'modified method'.


Summary ROC plot of US methods (TVUS, TRUS) for detection of ovarian endometriosis. Each point represents the pair of sensitivity and specificity from a study. The size of each point is proportional to the study sample size. The solid black circle represents the pooled sensitivity and specificity, which is surrounded by a 95% confidence region (dashed line) (for TVUS).

Ovarian endometriosis using MRI

Three studies with a total of 179 participants were published in 2009 and 2011 and assessed the diagnostic accuracy of MRI for ovarian endometriosis. All studies were conducted in Europe. One study (92 participants) used T1/T2‐w MRI with fat‐suppression/Gd, and two studies (87 participants) utilised 3.0T MRI. Meta‐analysis of these three studies revealed summary sensitivity and specificity of 0.95 (95% CI 0.90 to 1.00) and 0.91 (95% CI 0.86 to 0.97), meeting the criteria for a replacement test and a SnNout triage test, and approaching the criteria for a SpPin triage test (Figure 14). However, the few identified studies provided insufficient evidence to allow meaningful conclusions on the diagnostic role of MRI for endometrioma.
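The judgements about replacement and triage tests for the two pooled ovarian estimates reported above can be expressed as simple threshold checks. This is a sketch only: the thresholds are not given in this section, and the values used below are assumptions chosen to be consistent with the judgements reported here. They should be verified against the criteria defined in the review's methods before being relied upon.

```python
# Minimum (sensitivity, specificity) for each role.
# ASSUMED values, not quoted from this section; check against the Methods.
CRITERIA = {
    "replacement": (0.94, 0.79),
    "SnNout triage": (0.95, 0.50),
    "SpPin triage": (0.50, 0.95),
}


def criteria_met(sensitivity, specificity):
    """Return the names of all roles whose thresholds both estimates reach."""
    return [
        name
        for name, (min_sensitivity, min_specificity) in CRITERIA.items()
        if sensitivity >= min_sensitivity and specificity >= min_specificity
    ]


# Pooled point estimates for ovarian endometriosis, as reported in the text.
print("TVUS:", criteria_met(0.93, 0.96))
print("MRI:", criteria_met(0.95, 0.91))
```

The check uses point estimates only. It ignores the confidence intervals and the small number of MRI studies, which is why the text treats the MRI result as insufficient evidence despite the thresholds being met.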


Forest plot of MRI for detection of ovarian endometriosis. Plot shows study‐specific estimates of sensitivity and specificity (squares) with 95% CI (black line). Studies are ordered by year of publication. FN: false negative; FP: false positive; TN: true negative; TP: true positive.

Indirect comparisons of imaging tests for ovarian endometriosis

For TVUS, articles published after 2006 (n = 5) demonstrated higher sensitivity for diagnosing endometrioma. The most accurate ultrasound methods appeared to be tenderness‐guided TVUS (one study in 50 women), which showed sensitivity of 1.00 (95% CI 0.66 to 1.00) and specificity of 1.00 (95% CI 0.91 to 1.00) (Guerriero 2007), and TVUS‐BP (two studies in 142 women), which demonstrated sensitivity of 0.97 (95% CI 0.83 to 1.00) and 1.00 (95% CI 0.81 to 1.00) and specificity of 1.00 (95% CI 0.87 to 1.00) and 0.93 (95% CI 0.84 to 0.98) (Scarella 2013; Piessens 2014). Data were insufficient to permit formal comparisons of TVUS methods.

Higher estimates were reported for 3.0T MRI with sensitivities of 0.95 and 1.00 (95% CI 0.76 to 1.00 and 0.82 to 1.00) and specificities of 0.95 and 0.96 (95% CI 0.75 to 1.00 and 0.81 to 1.00) than for T1/T2‐w MRI with fat‐suppression/Gd, which showed sensitivity of 0.92 (95% CI 0.78 to 0.98) and specificity of 0.88 (95% CI 0.76 to 0.95), although confidence intervals overlapped.

When pooled estimates were considered, TVUS showed lower sensitivity but higher specificity compared with MRI.

Direct comparisons of imaging tests for ovarian endometriosis

One study (92 participants, published in 2009, conducted in Europe) evaluated TRUS and demonstrated sensitivity of 0.89 (95% CI 0.74 to 0.97) and specificity of 0.77 (95% CI 0.64 to 0.84) for diagnosis of ovarian endometriosis (Figure 12). This study directly compared TRUS, TVUS and MRI (Bazot 2009) and found that TRUS had lower diagnostic estimates than TVUS (sensitivity 0.94, 95% CI 0.81 to 0.99; specificity 0.86, 95% CI 0.74 to 0.94) and MRI (sensitivity 0.92, 95% CI 0.78 to 0.98; specificity 0.88, 95% CI 0.76 to 0.95). TVUS and MRI provided comparable estimates for diagnosing ovarian endometriosis (Appendix 3: Figure 15; Figure 16; Figure 17).


Forest plot demonstrating the direct comparison between TVUS and TRUS for ovarian endometriosis. Plot shows study‐specific paired estimates of sensitivity and specificity (squares) with 95% CI (black line) and country in which the study was conducted. FN: false negative; FP: false positive; TN: true negative; TP: true positive. Modifications to the conventional technique are presented as 'modified method'.


Forest plot demonstrating the direct comparison between TRUS and MRI for ovarian endometriosis. Plot shows study‐specific paired estimates of sensitivity and specificity (squares) with 95% CI (black line) and country in which the study was conducted. FN: false negative; FP: false positive; TN: true negative; TP: true positive. Modifications to the conventional technique are presented as 'modified method'.


Forest plot demonstrating the direct comparison between TVUS and MRI for ovarian endometriosis. Plot shows study‐specific paired estimates of sensitivity and specificity (squares) with 95% CI (black line) and country in which the study was conducted. FN: false negative; FP: false positive; TN: true negative; TP: true positive. Modifications to the conventional technique are presented as 'modified method'.

We identified no comparative studies of MRI and other imaging tests for ovarian endometriosis, other than the one presented above.

Deep infiltrating endometriosis/posterior DIE
Deep infiltrating endometriosis using ultrasonography

Nine articles included 12 data sets with a total of 934 participants and assessed the accuracy of TVUS in detecting DIE (n = 3) and posterior DIE (n = 7). All included studies were published after 2002, and most (n = 7) were published after 2009. These studies were conducted in Europe (n = 7), South America (n = 1) and Australia (n = 1). TVUS techniques included (1) TVUS (seven studies, eight data sets, 721 participants); (2) 3D‐TVUS (two studies, 226 participants); and (3) SVG (two studies, 235 participants). Mean sensitivity and specificity estimates for all included studies were 0.79 (95% CI 0.69 to 0.89) and 0.94 (95% CI 0.88 to 1.00), which approached the criteria for a SpPin triage test. Forest plots (Figure 18) and the ROC plot (Figure 19) revealed a high degree of heterogeneity for both sensitivity and specificity, with greater heterogeneity for sensitivity.


Forest plot of TVUS for detection of DIE/Posterior DIE. Plot shows study‐specific estimates of sensitivity and specificity (squares) with 95% CI (black line) and country in which the study was conducted. Studies are ordered according to year of publication for DIE and Posterior DIE, respectively. Tests on the same population (different TVUS methods) are presented separately as TVUS*. FN: false negative; FP: false positive; TN: true negative; TP: true positive. Modifications to the conventional TVUS are presented as 'modified method'.


Summary ROC plot of TVUS for detection of DIE/Posterior DIE. Each point represents the pair of sensitivity and specificity from a study. The size of each point is proportional to the study sample size. Tests on the same population (different TVUS methods) are presented separately as TVUS*. The solid black circle represents the pooled sensitivity and specificity, which is surrounded by a 95% confidence region (dashed line).

Deep infiltrating endometriosis using MRI

Six studies, including seven data sets with a total of 266 participants, evaluated MRI for the diagnosis of DIE (n = 4) and posterior DIE (n = 2; three data sets). All studies were published after 2004 and were conducted in Europe (n = 5) and Asia (n = 1). MRI methods included (1) MRI jelly (one study, 31 participants); (2) T1/T2‐w MRI with fat‐suppression/Gd (two studies, 125 participants); (3) 2D‐MRI T2‐w (one study, 23 participants); (4) 3D‐MRI (one study, 23 participants); and (5) 3.0T MRI (two studies, 87 participants). Mean estimates of sensitivity and specificity for all studies were 0.94 (95% CI 0.90 to 0.97) and 0.77 (95% CI 0.44 to 1.00), which approached the criteria for a replacement test and a SnNout triage test. Forest plots (Figure 20) and the ROC plot (Figure 21) showed greater heterogeneity for estimates of specificity than sensitivity.


Forest plot of MRI for detection of DIE/Posterior DIE. Plot shows study‐specific estimates of sensitivity and specificity (squares) with 95% CI (black line) and country in which the study was conducted. Studies are ordered according to year of publication. Tests on the same population (different MRI methods) are presented separately as MRI*. FN: false negative; FP: false positive; TN: true negative; TP: true positive. Modifications to the conventional MRI are presented as 'modified method'.


Summary ROC plot of MRI for detection of DIE/Posterior DIE. Each point represents the pair of sensitivity and specificity from a study. The size of each point is proportional to the study sample size. Tests on the same population (different MRI methods) are presented separately as MRI*. The solid black circle represents the pooled sensitivity and specificity, which is surrounded by a 95% confidence region (dashed line).

Deep infiltrating endometriosis using other imaging modalities

One study determined the accuracy of double‐contrast barium enema (DCBE) in detecting DIE (69 participants, published in 2011, conducted in Europe), showing sensitivity of 0.36 (95% CI 0.24 to 0.48) and specificity of 1.00 (95% CI 0.16 to 1.00). This test was inferior to TVUS when directly compared in the same individuals (Appendix 4: Figure 22). The included studies evaluated no other imaging techniques for DIE/posterior DIE.


Forest plot demonstrating the direct comparison between TVUS and DCBE for DIE/Posterior DIE. Plot shows study‐specific paired estimates of sensitivity and specificity (squares) with 95% CI (black line) and country in which the study was conducted. FN: false negative; FP: false positive; TN: true negative; TP: true positive. Modifications to the conventional technique are presented as 'modified method'.

Indirect comparisons of imaging tests for deep infiltrating endometriosis

TVUS‐BP (one study, 57 participants) (Scarella 2013) showed the highest diagnostic accuracy of all TVUS methods with sensitivity of 0.94 (95% CI 0.81 to 0.99) and specificity of 1.00 (95% CI 0.85 to 1.00). Tenderness‐guided TVUS (one study, 50 participants) (Guerriero 2007) had relatively high sensitivity of 0.90 (95% CI 0.74 to 0.98) and high specificity of 0.95 (95% CI 0.74 to 1.00), but a subsequent study by the same group using the same methods in a separate cohort (172 participants) (Guerriero 2014) did not reach a similar level of diagnostic accuracy with sensitivity of 0.71 (95% CI 0.61 to 0.80) and specificity of 0.88 (95% CI 0.81 to 0.94). Data were insufficient for a formal comparison of different methods of TVUS. Researchers evaluated no other ultrasound techniques as a diagnostic test for DIE/posterior DIE.

3.0T MRI (Hottat 2009; Manganaro 2012a) showed the highest diagnostic accuracy, with sensitivity of 0.96 (95% CI 0.78 to 1.00 and 0.81 to 1.00) and specificity of 1.00 (95% CI 0.77 to 1.00 and 0.85 to 1.00), as did the MRI jelly method (Takeuchi 2005), with sensitivity of 0.94 (95% CI 0.71 to 1.00) and specificity of 1.00 (95% CI 0.77 to 1.00). Data were insufficient for formal comparative analyses between MRI methods for DIE/posterior DIE.

Similarly to ovarian endometriosis, pooled estimates of TVUS demonstrated lower sensitivity but higher specificity compared with MRI.

Direct comparisons of imaging tests for deep infiltrating endometriosis

  • Direct comparison between tenderness‐guided TVUS and 3D‐TVUS (one study, 202 participants) (Guerriero 2014) revealed that tenderness‐guided TVUS was less accurate (sensitivity 0.71, 95% CI 0.61 to 0.80; specificity 0.88, 95% CI 0.81 to 0.94) than 3D‐TVUS (sensitivity 0.87, 95% CI 0.78 to 0.93; specificity 0.94, 95% CI 0.87 to 0.97) (Appendix 4: Figure 23).

  • TVUS had lower estimates of sensitivity (0.44, 95% CI 0.26 to 0.62) and specificity (0.50, 95% CI 0.23 to 0.77) than SVG (sensitivity 0.91, 95% CI 0.75 to 0.98; specificity 0.86, 95% CI 0.57 to 0.98) in another study of 46 women (Dessole 2003) (Appendix 4: Figure 23).

  • One paired evaluation (23 participants) (Bazot 2013) demonstrated that 3D‐MRI had higher sensitivity (1.00, 95% CI 0.81 to 1.00) than 2D‐MRI (0.89, 95% CI 0.65 to 0.99), but both tests had identically low specificity of 0.20 (95% CI 0.01 to 0.72) (Appendix 4: Figure 24).

  • MRI (sensitivity 0.96, 95% CI 0.80 to 1.00; specificity 0.86, 95% CI 0.42 to 1.00) appeared to be superior to 3D‐TVUS (sensitivity 0.79, 95% CI 0.54 to 0.94; specificity 0.60, 95% CI 0.15 to 0.95) in one small study that had unequal numbers of participants (MRI, n = 33; 3D‐TVUS, n = 25) from the same cohort (Grasso 2010) (Appendix 4: Figure 25).


Forest plot demonstrating the direct comparison between TVUS methods for DIE/Posterior DIE. Plot shows study‐specific paired estimates of sensitivity and specificity (squares) with 95% CI (black line) and country in which the study was conducted. FN: false negative; FP: false positive; TN: true negative; TP: true positive. Modifications to the conventional technique are presented as 'modified method'.


Forest plot demonstrating the direct comparison between MRI methods for DIE/Posterior DIE. Plot shows study‐specific paired estimates of sensitivity and specificity (squares) with 95% CI (black line) and country in which the study was conducted. FN: false negative; FP: false positive; TN: true negative; TP: true positive. Modifications to the conventional technique are presented as 'modified method'.


Forest plot demonstrating the direct comparison between 3D‐TVUS and MRI for DIE/Posterior DIE. Plot shows study‐specific paired estimates of sensitivity and specificity (squares) with 95% CI (black line) and country in which the study was conducted. FN: false negative; FP: false positive; TN: true negative; TP: true positive. Modifications to the conventional technique are presented as 'modified method'.

Mapping of DIE to specific anatomical sites

A total of 33 studies evaluated the ability of imaging tests to accurately map endometriotic lesions to specific anatomical sites within the pelvic cavity (see Target conditions). Most papers described more than one anatomical site and/or assessed more than one imaging test. Ninety‐four per cent (31/33) were published between 2007 and 2015 (summary of findings Table 2 Appendix 6). Twenty‐seven studies reported a total of 25 direct imaging modality comparisons in mapping endometriotic lesions. Insufficient data and considerable concerns about the risk of bias undermined the validity and reliability of results obtained from these comparisons. Study‐level comparative data are presented in a descriptive form for each anatomical site.

USL endometriosis

Eleven studies (14 data sets) assessed the diagnostic accuracy of TVUS, TRUS and MRI for detecting USL endometriosis in Europe (n = 8), Australia (n = 2) and South America (n = 1). For TVUS (seven studies, 751 participants), mean sensitivity and specificity were 0.64 (95% CI 0.50 to 0.79) and 0.97 (95% CI 0.93 to 1.00). For MRI (four studies, five data sets, 199 participants), mean sensitivity and specificity were 0.86 (95% CI 0.80 to 0.92) and 0.84 (95% CI 0.68 to 1.00). In the two studies that evaluated TRUS in 232 participants, summary sensitivity was 0.52 (95% CI 0.29 to 0.74) and summary specificity was 0.94 (95% CI 0.86 to 1.00). For TVUS, estimates of sensitivity were more heterogeneous than those for specificity (Figure 26; Figure 27; Figure 28), whereas for MRI, specificity was more heterogeneous than sensitivity. For TRUS, both sensitivity and specificity were highly variable.


Forest plot of all imaging tests for diagnosis of USL involvement by endometriosis. Plot shows study‐specific paired estimates of sensitivity and specificity (squares) with 95% CI (black line) and country in which the study was conducted. Studies are ordered according to year of publication for each test. Tests on the same population (different MRI methods) are presented separately as MRI*. FN: false negative; FP: false positive; TN: true negative; TP: true positive. Modifications to the conventional technique are presented as 'modified method'.


Summary ROC plot of TVUS for detection of USL involvement by endometriosis. Each point represents the pair of sensitivity and specificity from a study. The size of each point is proportional to the study sample size. The solid black circle represents the pooled sensitivity and specificity, which is surrounded by a 95% confidence region (dashed line).


Summary ROC plot of MRI for detection of USL involvement by endometriosis. Each point represents the pair of sensitivity and specificity from a study. The size of each point is proportional to the study sample size. Tests on the same population (different MRI methods) are presented separately as MRI*. The solid black circle represents the pooled sensitivity and specificity, which is surrounded by a 95% confidence region (dashed line).

Indirect comparisons of imaging tests for USL endometriosis

TVUS‐BP (one study, 57 participants) demonstrated the highest sensitivity (0.86, 95% CI 0.42 to 1.00) and specificity (1.00, 95% CI 0.93 to 1.00) of the TVUS methods (Scarella 2013). In the MRI group, the 3.0T method appeared to be highly sensitive (0.95, 95% CI 0.74 to 1.00) and specific (0.91, 95% CI 0.72 to 0.99) in one study that included 42 participants (Manganaro 2013), but it yielded lower diagnostic estimates (sensitivity 0.82, 95% CI 0.60 to 0.95; specificity 0.89, 95% CI 0.67 to 0.99) in another study of similar size (41 participants) (Hottat 2009). The latter findings were comparable with those for T1/T2‐w MRI with fat‐suppression/Gd, evaluated in one study of 92 participants, which reported sensitivity of 0.84 (95% CI 0.75 to 0.91) and specificity of 0.89 (95% CI 0.52 to 1.00) (Bazot 2009). Overall, TVUS met the criteria for a SpPin triage test in mapping USL endometriosis. TRUS approached these criteria, but its CIs were wide and data were insufficient for meaningful evaluation. MRI displayed the highest sensitivity of all modalities but did not reach SpPin or SnNout criteria.
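
The way pooled estimates are judged against the replacement and triage criteria can be sketched in Python as follows. The numerical thresholds are assumptions, taken to be the predetermined criteria set out in the review's methods; they are not restated in this section and should be checked against that text. The names `Criterion`, `CRITERIA` and `criteria_met` are hypothetical. The check uses point estimates only and ignores CIs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Criterion:
    name: str
    min_sensitivity: float
    min_specificity: float


# Assumed thresholds (not stated in this section; verify against the methods).
CRITERIA = (
    Criterion("replacement", 0.94, 0.79),
    Criterion("SnNout triage", 0.95, 0.50),
    Criterion("SpPin triage", 0.50, 0.95),
)


def criteria_met(sensitivity: float, specificity: float,
                 criteria=CRITERIA) -> list:
    """Return the names of all criteria satisfied by the point estimates."""
    return [
        c.name
        for c in criteria
        if sensitivity >= c.min_sensitivity and specificity >= c.min_specificity
    ]


# Pooled USL estimates reported in this section (sensitivity, specificity).
usl_estimates = {
    "TVUS": (0.64, 0.97),
    "MRI": (0.86, 0.84),
    "TRUS": (0.52, 0.94),
}

for test, (sens, spec) in usl_estimates.items():
    print(test, criteria_met(sens, spec))
```

Under these assumed thresholds, only TVUS satisfies the SpPin criterion for USL endometriosis, while TRUS falls just short on specificity and MRI meets none of the criteria, which agrees with the summary above.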

Direct comparisons of imaging tests for USL endometriosis

  • Direct comparison between MRI, TVUS and TRUS performed by Bazot et al. (Bazot 2009) showed that MRI was the most accurate method, and TVUS (sensitivity 0.78, 95% CI 0.68 to 0.87; specificity 0.67, 95% CI 0.30 to 0.93) performed better than TRUS (sensitivity 0.48, 95% CI 0.37 to 0.59; specificity 0.44, 95% CI 0.14 to 0.79) for detection of USL endometriosis (Appendix 5: Figure 29; Figure 30; Figure 31).

  • Another direct comparison (23 participants) (Bazot 2013) revealed that 2D‐MRI and 3D‐MRI had the same diagnostic estimates (sensitivity 0.88, 95% CI 0.64 to 0.99; specificity 0.33, 95% CI 0.04 to 0.78 for both tests) (Appendix 5: Figure 32).


Forest plot demonstrating the direct comparison between TVUS and TRUS for USL involvement. Plot shows study‐specific paired estimates of sensitivity and specificity (squares) with 95% CI (black line) and country in which the study was conducted. FN: false negative; FP: false positive; TN: true negative; TP: true positive. Modifications to the conventional technique are presented as 'modified method'.


Forest plot demonstrating the direct comparison between MRI and TRUS for USL involvement. Plot shows study‐specific paired estimates of sensitivity and specificity (squares) with 95% CI (black line) and country in which the study was conducted. FN: false negative; FP: false positive; TN: true negative; TP: true positive. Modifications to the conventional technique are presented as 'modified method'.


Forest plot demonstrating the direct comparison between MRI and TVUS for USL involvement. Plot shows study‐specific paired estimates of sensitivity and specificity (squares) with 95% CI (black line) and country in which the study was conducted. FN: false negative; FP: false positive; TN: true negative; TP: true positive. Modifications to the conventional technique are presented as 'modified method'.


Forest plot demonstrating the direct comparison between 2D‐MRI and 3D‐MRI for USL involvement. Plot shows study‐specific paired estimates of sensitivity and specificity (squares) with 95% CI (black line) and country in which the study was conducted. FN: false negative; FP: false positive; TN: true negative; TP: true positive. Modifications to the conventional technique are presented as 'modified method'.

RVS endometriosis

Twelve studies (16 data sets) assessed the diagnostic accuracy of TVUS, TRUS and MRI in detecting RVS endometriosis in Europe (n = 7), South America (n = 3) and Australia (n = 2). For TVUS (10 studies, 11 data sets, 983 participants), mean sensitivity and specificity were 0.88 (95% CI 0.82 to 0.94) and 1.00 (95% CI 0.98 to 1.00), respectively. For MRI (three studies, 288 participants), summary sensitivity and specificity were 0.81 (95% CI 0.70 to 0.93) and 0.86 (95% CI 0.78 to 0.95), respectively. For TRUS (two studies, 232 participants), summary sensitivity and specificity were 0.78 (95% CI 0.51 to 1.00) and 0.96 (95% CI 0.89 to 1.00), respectively. The heterogeneity of sensitivity was greater than that of specificity for all imaging tests (Figure 33). Substantial scatter of the estimates of sensitivity was evident when TVUS estimates were plotted in the ROC space (Figure 34).


Forest plot of all imaging tests for diagnosis of RVS involvement. Plot shows study‐specific paired estimates of sensitivity and specificity (squares) with 95% CI (black line) and country in which the study was conducted. Studies are ordered according to the year of publication for each test. Tests on the same population (different TVUS methods) are presented separately as TVUS*. FN: false negative; FP: false positive; TN: true negative; TP: true positive. Modifications to the conventional technique are presented as 'modified method'.


Summary ROC plot of TVUS for detection of RVS involvement. Each point represents the pair of sensitivity and specificity from a study. The size of each point is proportional to the study sample size. Tests on the same population (different TVUS methods) are presented separately as TVUS*. The solid black circle represents the pooled sensitivity and specificity, which is surrounded by a 95% confidence region (dashed line).

Indirect comparisons of imaging tests for RVS endometriosis

TVUS‐BP (three studies, 250 participants) (Abrao 2007; Menada 2008a; Scarella 2013) and RWC‐TVS (one study, 90 participants) (Menada 2008a) demonstrated the highest diagnostic accuracy, with sensitivities ranging from 0.93 to 0.97 and specificities ranging from 0.90 to 1.00. Both TVUS and TRUS met the criteria for a SpPin triage test. TRUS could not be adequately assessed because of the paucity of data and displayed lower diagnostic estimates and wider CIs compared with TVUS. MRI did not meet the criteria for either triage test, but data were insufficient for meaningful assessment of its role.

Direct comparisons of imaging tests for RVS endometriosis

  • Direct comparison (one article, 90 participants) (Menada 2008a) showed that TVUS (RWC‐TVS) (sensitivity 0.97, 95% CI 0.90 to 1.00; specificity 1.00, 95% CI 0.84 to 1.00) displayed greater accuracy than conventional TVUS (sensitivity 0.93, 95% CI 0.84 to 0.98; specificity 0.90, 95% CI 0.70 to 0.99) in detecting RVS endometriosis (Appendix 6: Figure 35).

  • When TRUS and TVUS were directly compared (one study, 92 participants) (Bazot 2009), sensitivities were very low for both (0.18, 95% CI 0.02 to 0.52 and 0.09, 95% CI 0.00 to 0.41, respectively), although TVUS had higher specificity (0.99, 95% CI 0.93 to 1.00) than TRUS (0.95, 95% CI 0.88 to 0.99) (Appendix 6: Figure 36). The same study revealed that TRUS and TVUS appeared to be less sensitive than MRI (sensitivity 0.55, 95% CI 0.23 to 0.83; specificity 0.99, 95% CI 0.93 to 1.00); specificity for MRI was higher than for TRUS and comparable with that for TVUS (Appendix 6: Figure 37; Figure 38).

  • In contrast, another comparative study of 104 participants (Abrao 2007) showed that TVUS (sensitivity 0.95, 95% CI 0.83 to 0.99; specificity 0.98, 95% CI 0.91 to 1.00) yielded higher diagnostic estimates than MRI (sensitivity 0.76, 95% CI 0.60 to 0.88; specificity 0.68, 95% CI 0.55 to 0.79) for detection of RVS endometriosis (Appendix 6: Figure 38).


Forest plot demonstrating the direct comparison between TVUS and RWC‐TVS for RVS involvement. Plot shows study‐specific paired estimates of sensitivity and specificity (squares) with 95% CI (black line) and country in which the study was conducted. FN: false negative; FP: false positive; TN: true negative; TP: true positive. Modifications to the conventional technique are presented as 'modified method'.


Forest plot demonstrating the direct comparison between TVUS and TRUS for RVS involvement. Plot shows study‐specific paired estimates of sensitivity and specificity (squares) with 95% CI (black line) and country in which the study was conducted. FN: false negative; FP: false positive; TN: true negative; TP: true positive. Modifications to the conventional technique are presented as 'modified method'.


Forest plot demonstrating the direct comparison between MRI and TRUS for RVS involvement. Plot shows study‐specific paired estimates of sensitivity and specificity (squares) with 95% CI (black line) and country in which the study was conducted. FN: false negative; FP: false positive; TN: true negative; TP: true positive. Modifications to the conventional technique are presented as 'modified method'.


Forest plot demonstrating the direct comparison between MRI and TVUS for RVS involvement. Plot shows study‐specific paired estimates of sensitivity and specificity (squares) with 95% CI (black line) and country in which the study was conducted. FN: false negative; FP: false positive; TN: true negative; TP: true positive. Modifications to the conventional technique are presented as 'modified method'.

The confidence intervals were wide and overlapping in all direct comparisons, and data were insufficient for statistical comparison of the different imaging modalities for RVS endometriosis.

Vaginal wall endometriosis

Ten studies (13 data sets) assessed the diagnostic accuracy of TVUS, TRUS and MRI for detecting vaginal wall endometriosis in Europe (n = 7), South America (n = 1) and Australia (n = 2). For TVUS (six studies, 679 participants), mean sensitivity and mean specificity were 0.57 (95% CI 0.21 to 0.94) and 0.99 (95% CI 0.96 to 1.00), respectively. For MRI (four studies, five data sets, 248 participants), mean sensitivity and specificity were 0.77 (95% CI 0.67 to 0.88) and 0.97 (95% CI 0.92 to 1.00), respectively. In the two studies that evaluated TRUS in 232 participants, summary sensitivity and specificity were 0.39 (95% CI 0.08 to 0.70) and 1.00 (95% CI 1.00 to 1.00), respectively. Heterogeneity was greater for estimates of sensitivity than specificity for all test modalities (Figure 39; Figure 40; Figure 41).


Forest plot of all imaging tests for diagnosis of vaginal wall involvement. Plot shows study‐specific paired estimates of sensitivity and specificity (squares) with 95% CI (black line) and country in which the study was conducted. Studies are ordered according to year of publication for each test. Tests on the same population (different MRI methods) are presented separately as MRI*. FN: false negative; FP: false positive; TN: true negative; TP: true positive. Modifications to the conventional technique are presented as 'modified method'.


Summary ROC plot of TVUS for detection of vaginal wall involvement. Each point represents the pair of sensitivity and specificity from a study. The size of each point is proportional to the study sample size. The solid black circle represents the pooled sensitivity and specificity, which is surrounded by a 95% confidence region (dashed line).


Summary ROC plot of MRI for detection of vaginal wall involvement. Each point represents the pair of sensitivity and specificity from a study. The size of each point is proportional to the study sample size. Tests on the same population (different MRI methods) are presented separately as MRI*. The solid black circle represents the mean sensitivity and specificity, which is surrounded by a 95% confidence region (dashed line).

Indirect comparisons of imaging tests for vaginal wall endometriosis

Tg‐TVUS (one study, 88 participants) had the highest diagnostic estimates among TVUS methods (sensitivity 0.91, 95% CI 0.76 to 0.98; specificity 0.89, 95% CI 0.77 to 0.96) (Guerriero 2008). 3D‐MRI (one study, 23 participants) (Bazot 2013) and 3.0T MRI (one study, 41 participants) (Hottat 2009) were the best performing MRI modalities, with sensitivities of 0.80 and 0.82 (95% CI 0.28 to 0.99 and 0.48 to 0.98) and specificities of 1.00 and 0.97 (95% CI 0.81 to 1.00 and 0.83 to 1.00), respectively. Both TVUS and MRI met the criteria for a SpPin triage test. TVUS showed lower sensitivity but higher specificity compared with MRI. For TRUS, the criteria for either triage test were not met and CIs were wide, although data were insufficient to permit meaningful conclusions.

Direct comparisons of imaging tests for vaginal wall endometriosis

  • In a direct comparison comprising 92 participants (Bazot 2009), MRI (sensitivity 0.80, 95% CI 0.61 to 0.92; specificity 0.86, 95% CI 0.74 to 0.93) showed higher sensitivity but lower specificity than TVUS (sensitivity 0.47, 95% CI 0.28 to 0.66; specificity 0.95, 95% CI 0.87 to 0.99) and TRUS (sensitivity 0.07, 95% CI 0.01 to 0.22; specificity 1.00, 95% CI 0.94 to 1.00); TRUS had much lower sensitivity but higher specificity than either TVUS or MRI (Appendix 7: Figure 42; Figure 43; Figure 44).

  • 2D‐MRI (sensitivity 0.60, 95% CI 0.15 to 0.95; specificity 0.94, 95% CI 0.73 to 1.00) demonstrated lower accuracy estimates than 3D‐MRI (sensitivity 0.80, 95% CI 0.28 to 0.99; specificity 1.00, 95% CI 0.81 to 1.00) in a paired comparative study of 23 participants (Bazot 2013) (Appendix 7: Figure 45).


Forest plot demonstrating the direct comparison between TVUS and TRUS for vaginal wall involvement. Plot shows study‐specific paired estimates of sensitivity and specificity (squares) with 95% CI (black line) and country in which the study was conducted. FN: false negative; FP: false positive; TN: true negative; TP: true positive. Modifications to the conventional technique are presented as 'modified method'.


Forest plot demonstrating the direct comparison between TRUS and MRI for vaginal wall involvement. Plot shows study‐specific paired estimates of sensitivity and specificity (squares) with 95% CI (black line) and country in which the study was conducted. FN: false negative; FP: false positive; TN: true negative; TP: true positive. Modifications to the conventional technique are presented as 'modified method'.


Forest plot demonstrating the direct comparison between TVUS and MRI for vaginal wall involvement. Plot shows study‐specific paired estimates of sensitivity and specificity (squares) with 95% CI (black line) and country in which the study was conducted. FN: false negative; FP: false positive; TN: true negative; TP: true positive. Modifications to the conventional technique are presented as 'modified method'.



Forest plot demonstrating the direct comparison between 2D‐MRI and 3D‐MRI for vaginal wall involvement. Plot shows study‐specific paired estimates of sensitivity and specificity (squares) with 95% CI (black line) and country in which the study was conducted. FN: false negative; FP: false positive; TN: true negative; TP: true positive. Modifications to the conventional technique are presented as 'modified method'.


POD obliteration by endometriosis

Eleven publications (12 data sets) assessed the diagnostic accuracy of TVUS and MRI for detecting an obliterated POD in endometriosis in Europe (n = 6), Australia (n = 3), South America (n = 1) and Asia (n = 1). For TVUS (six studies, 755 participants), mean sensitivity and specificity were 0.83 (95% CI 0.77 to 0.88) and 0.97 (95% CI 0.95 to 0.99), respectively. For MRI (five studies, six data sets, 154 participants), mean sensitivity and specificity were 0.90 (95% CI 0.76 to 1.00) and 0.98 (95% CI 0.89 to 1.00), respectively. Heterogeneity was greater for sensitivity than for specificity for TVUS, whereas both estimates were heterogeneous for MRI (Figure 46; Figure 47; Figure 48).


Forest plot of all imaging tests for diagnosis of POD obliteration by endometriosis. Plot shows study‐specific paired estimates of sensitivity and specificity (squares) with 95% CI (black line) and country in which the study was conducted. Studies are ordered according to year of publication for each test. Tests on the same population (different MRI methods) are presented separately as MRI*. FN: false negative; FP: false positive; TN: true negative; TP: true positive. Modifications to the conventional technique are presented as 'modified method'.



Summary ROC plot of TVUS for detection of POD obliteration by endometriosis. Each point represents the pair of sensitivity and specificity from a study. The size of each point is proportional to the study sample size. The solid black circle represents the pooled sensitivity and specificity, which is surrounded by a 95% confidence region (dashed line).



Summary ROC plot of MRI for detection of POD obliteration by endometriosis. Each point represents the pair of sensitivity and specificity from a study. The size of each point is proportional to the study sample size. Tests on the same population (different MRI methods) are presented separately as MRI*. The solid black circle represents the pooled sensitivity and specificity, which is surrounded by a 95% confidence region (dashed line).


Indirect comparisons of imaging tests for POD obliteration by endometriosis

TVUS‐BP (two studies, 136 participants) demonstrated the highest diagnostic accuracy of all TVUS methods with sensitivities of 0.89 and 0.88 (95% CI 0.71 to 0.98 and 0.73 to 0.97) and specificities of 0.92 and 0.90 (95% CI 0.73 to 0.99 and 0.79 to 0.97) (Leon 2014; Piessens 2014). 3.0T MRI (three studies, 100 participants) was the best performing MRI technique with sensitivities ranging from 0.93 to 1.00 and specificities ranging from 0.75 to 1.00 (Hottat 2009; Manganaro 2012a; Thomeer 2014). Both TVUS and MRI could qualify as a SpPin triage test for detecting POD obliteration in endometriosis with slightly higher diagnostic estimates for MRI, which also approached the criteria for a SnNout triage test.

Direct comparisons of imaging tests for POD obliteration by endometriosis

2D‐MRI had similar accuracy to 3D‐MRI for detection of POD obliteration with sensitivity of 0.71 (95% CI 0.42 to 0.92) and specificity of 1.00 (95% CI 0.66 to 1.00) for both data sets in one small direct comparison comprising 23 participants (Bazot 2013) (Appendix 7: Figure 49).


Forest plot demonstrating the direct comparison between 2D‐MRI and 3D‐MRI for POD obliteration by endometriosis. Plot shows study‐specific paired estimates of sensitivity and specificity (squares) with 95% CI (black line) and country in which the study was conducted. FN: false negative; FP: false positive; TN: true negative; TP: true positive. Modifications to the conventional technique are presented as 'modified method'.


Anterior DIE

Three studies assessed the diagnostic accuracy of TVUS and MRI in diagnosing anterior DIE in Europe. For TVUS (two studies, 289 participants), summary sensitivity and specificity were 0.41 (95% CI 0.00 to 0.81) and 1.00 (95% CI 1.00 to 1.00). MRI (one study, 41 participants) demonstrated sensitivity of 0.75 (95% CI 0.35 to 0.97) and specificity of 1.00 (95% CI 0.89 to 1.00) in detecting anterior DIE (Figure 50). The diagnostic accuracy of bladder and ureteric endometriosis was not assessed in this review (see Target conditions).


Summary ROC plot of TVUS and MRI for detection of anterior DIE. Each point represents the pair of sensitivity and specificity from a study. The size of each point is proportional to the study sample size, and the shape designates different imaging modalities. The solid black circle represents the pooled sensitivity and specificity for TVUS, and the bars correspond to 95% CIs of each individual study.


Rectosigmoid endometriosis

A total of 21 studies (31 data sets) assessed the accuracy of TVUS, TRUS, MRI, MDCT‐e and DCBE for detecting rectosigmoid endometriosis in Europe (n = 15), South America (n = 4) and Australia (n = 2). Mean estimates for each imaging modality were as follows: for TVUS (14 studies, 15 data sets, 1616 participants), sensitivity of 0.90 (95% CI 0.82 to 0.97) and specificity of 0.96 (95% CI 0.94 to 0.99); for TRUS (four studies, 330 participants), sensitivity of 0.91 (95% CI 0.85 to 0.98) and specificity of 0.96 (95% CI 0.91 to 1.00); for MRI (six studies, seven data sets, 612 participants), sensitivity of 0.92 (95% CI 0.86 to 0.99) and specificity of 0.96 (95% CI 0.93 to 0.98). Less heterogeneity was seen in the estimates for TVUS, TRUS and MRI in rectosigmoid endometriosis than in other anatomical locations (Figure 51; Figure 52; Figure 53; Figure 54). For MDCT‐e (three studies, 389 participants), summary sensitivity and specificity were 0.98 (95% CI 0.94 to 1.00) and 0.99 (95% CI 0.97 to 1.00) (Figure 55). For DCBE (two studies, 106 participants), summary sensitivity and specificity were 0.56 (95% CI 0.32 to 0.80) and 0.77 (95% CI 0.41 to 1.00), and both studies displayed considerable heterogeneity (Figure 56).


Forest plot of all imaging tests for diagnosis of rectosigmoid involvement. Plot shows study‐specific paired estimates of sensitivity and specificity (squares) with 95% CI (black line) and country in which the study was conducted. Studies are ordered according to year of publication for each test. Tests on the same population (different TVUS and MRI methods) are presented separately as TVUS* and MRI*. FN: false negative; FP: false positive; TN: true negative; TP: true positive. Modifications to the conventional technique are presented as 'modified method'.



Summary ROC plot of TVUS for detection of rectosigmoid involvement. Each point represents the pair of sensitivity and specificity from a study. The size of each point is proportional to the study sample size. Tests on the same population (different TVUS methods) are presented separately as TVUS*. The solid black circle represents the pooled sensitivity and specificity, which is surrounded by a 95% confidence region (dashed line).



Summary ROC plot of TRUS for detection of rectosigmoid involvement. Each point represents the pair of sensitivity and specificity from a study. The size of each point is proportional to the study sample size. The solid black circle represents the pooled sensitivity and specificity, which is surrounded by a 95% confidence region (dashed line).



Summary ROC plot of MRI for detection of rectosigmoid involvement. Each point represents the pair of sensitivity and specificity from a study. The size of each point is proportional to the study sample size. Tests on the same population (different MRI methods) are presented separately as MRI*. The solid black circle represents the pooled sensitivity and specificity, which is surrounded by a 95% confidence region (dashed line).



Summary ROC plot of MDCT‐e for detection of rectosigmoid involvement. Each point represents the pair of sensitivity and specificity from a study. The size of each point is proportional to the study sample size and the shape designates consecutive or non‐consecutive enrolment. The solid black circle represents the pooled sensitivity and specificity, and the bars correspond to 95% CIs of each individual study.



Summary ROC plot of DCBE for detection of rectosigmoid involvement. Each point represents the pair of sensitivity and specificity from a study. The size of each point is proportional to the study sample size. The solid black circle represents the pooled sensitivity and specificity, and the bars correspond to 95% CIs of each individual study.


Indirect comparisons of imaging tests for rectosigmoid endometriosis

TVUS‐BP (two studies, 288 participants) (Abrao 2007; Goncalves 2010) demonstrated the highest sensitivity (0.98, 95% CI 0.91 to 1.00 and 0.90 to 1.00; specificity 1.00, 95% CI 0.97 to 1.00 and 0.93 to 1.00) of all TVUS methods. The highest diagnostic estimates of all MRI methods included 3.0T MRI (one study, 41 participants) (sensitivity 1.00, 95% CI 0.75 to 1.00; specificity 0.96, 95% CI 0.82 to 1.00) (Hottat 2009) and MRI 'jelly method' of introducing ultrasonographic gel into both the rectum and the vagina (one study, 260 participants) (sensitivity 0.99, 95% CI 0.96 to 1.00; specificity 0.96, 95% CI 0.90 to 0.99) (Biscaldi 2014). TVUS, TRUS and MRI met the criteria for a SpPin triage test and approached the criteria for a SnNout triage test; all demonstrated comparable diagnostic estimates. MDCT‐e displayed the best diagnostic performance and met the criteria for both SpPin and SnNout triage tests; however, only three studies examined MDCT‐e, and further work is required to confirm these findings. Data for DCBE were scant but largely discouraging.
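The triage classification described above can be sketched as a simple threshold check. The snippet below is illustrative only: the cut‐offs (SpPin as specificity of at least 0.95 with sensitivity of at least 0.50; SnNout as sensitivity of at least 0.95 with specificity of at least 0.50) are assumptions, because this section does not restate the review's predefined criteria, and the function name triage_labels is hypothetical. The inputs are the mean estimates for rectosigmoid endometriosis reported earlier in this section.

```python
# Assumed cut-offs; this section does not restate the review's criteria.
HIGH = 0.95  # threshold for the decisive measure
LOW = 0.50   # minimum for the complementary measure


def triage_labels(sensitivity, specificity):
    """Return the triage categories that a pair of point estimates meets."""
    labels = []
    if specificity >= HIGH and sensitivity >= LOW:
        labels.append("SpPin")   # a positive result rules the condition in
    if sensitivity >= HIGH and specificity >= LOW:
        labels.append("SnNout")  # a negative result rules the condition out
    return labels


# Mean (sensitivity, specificity) for rectosigmoid endometriosis.
rectosigmoid = {
    "TVUS": (0.90, 0.96),
    "TRUS": (0.91, 0.96),
    "MRI": (0.92, 0.96),
    "MDCT-e": (0.98, 0.99),
    "DCBE": (0.56, 0.77),
}

for test, (sens, spec) in rectosigmoid.items():
    print(test, triage_labels(sens, spec) or "neither")
```

The check uses point estimates only. It ignores the width of the confidence intervals and the number of contributing studies, both of which the review weighs when interpreting these classifications.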

Direct comparisons of imaging tests for rectosigmoid endometriosis

  • 2D‐TVUS (sensitivity 0.95, 95% CI 0.87 to 0.99; specificity 0.93, 95% CI 0.87 to 0.97) appeared to be more sensitive and less specific than 3D‐TVUS (sensitivity 0.91, 95% CI 0.82 to 0.96; specificity 0.97, 95% CI 0.92 to 0.99) for diagnosing rectosigmoid endometriosis in one paired study of 202 participants (Guerriero 2014) (Appendix 8: Figure 57).

  • The study that directly compared TVUS, TRUS and MRI (92 participants) (Bazot 2009) revealed that TVUS had higher diagnostic values (sensitivity 0.94, 95% CI 0.85 to 0.98; specificity 1.00, 95% CI 0.88 to 1.00) when compared with MRI (sensitivity 0.87, 95% CI 0.77 to 0.94; specificity 0.93, 95% CI 0.77 to 0.99) and TRUS (sensitivity 0.89, 95% CI 0.78 to 0.95; specificity 0.89, 95% CI 0.78 to 0.95); MRI and TRUS yielded comparable estimates (Appendix 8: Figure 58; Figure 59; Figure 60). This finding was in agreement with two other studies that reported similar types of paired data for detection of rectosigmoid endometriosis (presented below).

  • TVUS (sensitivity 0.96, 95% CI 0.87 to 1.00; specificity 0.90, 95% CI 0.55 to 1.00) was more sensitive and specific than TRUS (sensitivity 0.88, 95% CI 0.76 to 0.96; specificity 0.80, 95% CI 0.44 to 0.97) in a study of 61 participants (Bergamini 2010) (Appendix 8: Figure 61).

  • Further, TVUS (sensitivity 0.98, 95% CI 0.90 to 1.00; specificity 1.00, 95% CI 0.93 to 1.00) was more sensitive and specific than MRI (sensitivity 0.83, 95% CI 0.71 to 0.92; specificity 0.98, 95% CI 0.89 to 1.00) in another direct comparison in 104 participants (Abrao 2007) (Appendix 8: Figure 58).

  • TVUS had higher sensitivity (0.91, 95% CI 0.80 to 0.97) than DCBE (0.43, 95% CI 0.30 to 0.57), although both methods displayed identically high specificity (1.00, 95% CI 0.75 to 1.00) in another head‐to‐head comparison of 69 participants (Savelli 2011) (Appendix 8: Figure 62).

  • Estimates for TRUS (sensitivity 1.00, 95% CI 0.87 to 1.00; specificity 0.90, 95% CI 0.55 to 1.00) were higher than those for DCBE (sensitivity 0.88, 95% CI 0.68 to 0.97; specificity 0.54, 95% CI 0.25 to 0.81) in a separate direct comparison of 37 participants (Ribeiro 2008a) (Appendix 8: Figure 63).

  • Another paired study (96 participants) (Ferrero 2011) showed that TVUS (RWC‐TVS) (sensitivity 0.94, 95% CI 0.83 to 0.99; specificity 0.98, 95% CI 0.89 to 1.00) had lower accuracy estimates than MDCT‐e (sensitivity 0.96, 95% CI 0.86 to 0.99; specificity 1.00, 95% CI 0.93 to 1.00) in diagnosing rectosigmoid endometriosis, although both methods demonstrated reasonably high values with overlapping confidence intervals (Appendix 8: Figure 64).

  • MDCT‐e (sensitivity 0.99, 95% CI 0.97 to 1.00; specificity 0.99, 95% CI 0.94 to 1.00) and MRI (sensitivity 0.99, 95% CI 0.96 to 1.00; specificity 0.96, 95% CI 0.90 to 0.99) yielded similarly high diagnostic accuracy estimates in one comparative study (260 participants) (Biscaldi 2014) (Appendix 8: Figure 65).

  • 2D‐MRI (sensitivity 0.85, 95% CI 0.55 to 0.98; specificity 1.00, 95% CI 0.69 to 1.00) demonstrated similar sensitivity to, and higher specificity than, 3D‐MRI (sensitivity 0.85, 95% CI 0.55 to 0.98; specificity 0.90, 95% CI 0.55 to 1.00) in a paired comparative study of 23 participants (Bazot 2013) (Appendix 8: Figure 66).
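Several of the direct comparisons above are qualified by whether the 95% CIs of the two tests overlap. A minimal sketch of that check follows, using sensitivity intervals quoted in the list above; the helper name intervals_overlap is hypothetical. Overlap is an informal reading aid here, not a formal test of a paired difference.

```python
def intervals_overlap(a, b):
    """Return True when closed intervals a=(low, high) and b=(low, high) share any value."""
    return a[0] <= b[1] and b[0] <= a[1]


# Sensitivity 95% CIs as quoted in the direct comparisons above.
ferrero_2011 = {"RWC-TVS": (0.83, 0.99), "MDCT-e": (0.86, 0.99)}
savelli_2011 = {"TVUS": (0.80, 0.97), "DCBE": (0.30, 0.57)}

print(intervals_overlap(ferrero_2011["RWC-TVS"], ferrero_2011["MDCT-e"]))
print(intervals_overlap(savelli_2011["TVUS"], savelli_2011["DCBE"]))
```

The first pair overlaps, in line with the cautious wording used for Ferrero 2011, whereas the TVUS and DCBE sensitivity intervals from Savelli 2011 do not.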


Forest plot demonstrating the direct comparison between TVUS and 3D‐TVUS for rectosigmoid involvement. Plot shows study‐specific paired estimates of sensitivity and specificity (squares) with 95% CI (black line) and country in which the study was conducted. FN: false negative; FP: false positive; TN: true negative; TP: true positive. Modifications to the conventional technique are presented as 'modified method'.



Forest plot demonstrating the direct comparison between TVUS and MRI for rectosigmoid involvement. Plot shows study‐specific paired estimates of sensitivity and specificity (squares) with 95% CI (black line) and country in which the study was conducted. FN: false negative; FP: false positive; TN: true negative; TP: true positive. Modifications to the conventional technique are presented as 'modified method'.



Forest plot demonstrating the direct comparison between TVUS and TRUS for rectosigmoid involvement. Plot shows study‐specific paired estimates of sensitivity and specificity (squares) with 95% CI (black line) and country in which the study was conducted. FN: false negative; FP: false positive; TN: true negative; TP: true positive. Modifications to the conventional technique are presented as 'modified method'.



Forest plot demonstrating the direct comparison between TRUS and MRI for rectosigmoid involvement. Plot shows study‐specific paired estimates of sensitivity and specificity (squares) with 95% CI (black line) and country in which the study was conducted. FN: false negative; FP: false positive; TN: true negative; TP: true positive. Modifications to the conventional technique are presented as 'modified method'.



Forest plot demonstrating the direct comparison between RWC‐TVS and TRUS for rectosigmoid involvement. Plot shows study‐specific paired estimates of sensitivity and specificity (squares) with 95% CI (black line) and country in which the study was conducted. FN: false negative; FP: false positive; TN: true negative; TP: true positive. Modifications to the conventional technique are presented as 'modified method'.



Forest plot demonstrating the direct comparison between TVUS and DCBE for rectosigmoid involvement. Plot shows study‐specific paired estimates of sensitivity and specificity (squares) with 95% CI (black line) and country in which the study was conducted. FN: false negative; FP: false positive; TN: true negative; TP: true positive. Modifications to the conventional technique are presented as 'modified method'.



Forest plot demonstrating the direct comparison between TRUS and DCBE for rectosigmoid involvement. Plot shows study‐specific paired estimates of sensitivity and specificity (squares) with 95% CI (black line) and country in which the study was conducted. FN: false negative; FP: false positive; TN: true negative; TP: true positive. Modifications to the conventional technique are presented as 'modified method'.



Forest plot demonstrating the direct comparison between RWC‐TVS and MDCT‐e for rectosigmoid involvement. Plot shows study‐specific paired estimates of sensitivity and specificity (squares) with 95% CI (black line) and country in which the study was conducted. FN: false negative; FP: false positive; TN: true negative; TP: true positive. Modifications to the conventional technique are presented as 'modified method'.



Forest plot demonstrating the direct comparison between MDCT‐e and MRI for rectosigmoid involvement. Plot shows study‐specific paired estimates of sensitivity and specificity (squares) with 95% CI (black line) and country in which the study was conducted. FN: false negative; FP: false positive; TN: true negative; TP: true positive. Modifications to the conventional technique are presented as 'modified method'.



Forest plot demonstrating the direct comparison between 2D‐MRI and 3D‐MRI for rectosigmoid involvement. Plot shows study‐specific paired estimates of sensitivity and specificity (squares) with 95% CI (black line) and country in which the study was conducted. FN: false negative; FP: false positive; TN: true negative; TP: true positive. Modifications to the conventional technique are presented as 'modified method'.


Bowel endometriosis (ileum ‐ rectum)

Four studies (six data sets) assessed the accuracy of TVUS, TRUS and MDCT‐e in detecting bowel endometriosis from the ileum to the rectum in Europe (n = 3) and Australia (n = 1). For TVUS (three studies, 314 participants), summary sensitivity and specificity were 0.89 (95% CI 0.81 to 0.97) and 0.96 (95% CI 0.91 to 1.00). For TRUS (one study, 134 participants), sensitivity was 0.96 (95% CI 0.89 to 0.99) and specificity was 1.00 (95% CI 0.94 to 1.00). For MDCT‐e (two studies, 194 participants), summary sensitivity and specificity were 0.98 (95% CI 0.92 to 1.00) and 1.00 (95% CI 1.00 to 1.00). Both sensitivity and specificity showed only a small degree of variability, and both values were generally high for all tests (Figure 67; Figure 68; Figure 69).


Forest plot of all imaging tests for diagnosis of bowel [ileum ‐ rectum] involvement. Plot shows study‐specific paired estimates of sensitivity and specificity (squares) with 95% CI (black line) and country in which the study was conducted. Studies are ordered according to year of publication for each test. FN: false negative; FP: false positive; TN: true negative; TP: true positive. Modifications to the conventional technique are presented as 'modified method'.



Summary ROC plot of US methods (TVUS, TRUS) for detection of bowel [ileum ‐ rectum] involvement. Each point represents the pair of sensitivity and specificity from a study. The size of each point is proportional to the study sample size. The solid black circle represents the pooled sensitivity and specificity (for TVUS), and the bars correspond to 95% CIs of each individual study.



Summary ROC plot of MDCT‐e for detection of bowel [ileum ‐ rectum] involvement. Each point represents the pair of sensitivity and specificity from a study. The size of each point is proportional to the study sample size. The solid black circle represents the pooled sensitivity and specificity, and the bars correspond to 95% CIs of each individual study.


Indirect comparisons of imaging tests for bowel endometriosis (ileum ‐ rectum)

The TVUS non‐modified technique (one study, 133 participants) (Piketty 2009) showed higher diagnostic estimates than TVUS‐BP (one study, 85 participants) and RWC‐TVS (one study, 96 participants) with sensitivity of 0.91 (95% CI 0.82 to 0.96) and specificity of 0.97 (95% CI 0.88 to 1.00). Although studies were too few for a meaningful evaluation of the role of imaging tests in diagnosing bowel endometriosis, TVUS, TRUS and MDCT‐e met the criteria for a SpPin triage test, and TRUS and MDCT‐e met the criteria for a SnNout triage test for bowel endometriosis.

Direct comparisons of imaging tests for bowel endometriosis (ileum ‐ rectum)

TVUS (sensitivity 0.91, 95% CI 0.82 to 0.96; specificity 0.97, 95% CI 0.88 to 1.00) yielded lower diagnostic accuracy estimates than TRUS (sensitivity 0.96, 95% CI 0.89 to 0.99; specificity 1.00, 95% CI 0.94 to 1.00) in one paired study of 134 participants (Piketty 2009) (Appendix 8: Figure 70). One study including 96 women (Ferrero 2011) found that MDCT‐e (sensitivity 0.96, 95% CI 0.87 to 1.00; specificity 1.00, 95% CI 0.92 to 1.00) had slightly higher estimates than RWC‐TVS (sensitivity 0.88, 95% CI 0.76 to 0.96; specificity 0.98, 95% CI 0.88 to 1.00) for the diagnosis of bowel endometriosis (Appendix 8: Figure 71).


Forest plot demonstrating the direct comparison between TVUS and TRUS for bowel [ileum ‐ rectum] involvement. Plot shows study‐specific paired estimates of sensitivity and specificity (squares) with 95% CI (black line) and country in which the study was conducted. FN: false negative; FP: false positive; TN: true negative; TP: true positive. Modifications to the conventional technique are presented as 'modified method'.


Forest plot demonstrating the direct comparison between RWC‐TVS and MDCT‐e for bowel [ileum ‐ rectum] involvement. Plot shows study‐specific paired estimates of sensitivity and specificity (squares) with 95% CI (black line) and country in which the study was conducted. FN: false negative; FP: false positive; TN: true negative; TP: true positive. Modifications to the conventional technique are presented as 'modified method'.

Investigation of heterogeneity and sensitivity analyses

Potential sources of heterogeneity are outlined under Secondary objectives. Although we attempted to assess these sources of heterogeneity, studies evaluating each test were too few to make this a meaningful analysis, except for the meta‐analyses with more than 10 studies/data sets (TVUS DIE/posterior DIE, TVUS RVS and TVUS rectosigmoid). For these tests, we found no significant differences in sensitivity or specificity between studies with regard to year of publication, geographical location of the study or application of the modified technique. We were not able to explore the effects of the following potential sources of heterogeneity.

  • Age (adolescents vs later reproductive years): information on isolated subgroups not available in any study.

  • Clinical presentation (pelvic pain ± infertility vs ovarian mass; symptomatic vs asymptomatic women) or stage of disease (minimal to mild, rASRM stage I to II vs moderate to severe, rASRM stage III to IV): information on isolated subgroups not available in any of the studies; all participants symptomatic.

  • Histological confirmation versus laparoscopic visualisation without histology: histological test used in conjunction with surgery in most studies.

  • Modifications applied to conventional imaging techniques: insufficient number of studies for each method.

  • Methodological quality ‐ low versus unclear or high risk: all studies of low methodological quality with high or unclear risk of bias.

  • Study design: 'single‐gate' versus 'two‐gate' studies; all studies except one of single‐gate design.

Furthermore, observer variability bias or bias related to interpretation of results cannot be formally assessed in the context of this review.

Discussion

Summary of main results

Data from 4807 women of reproductive age with symptoms of endometriosis who underwent a non‐invasive imaging test followed by diagnostic surgery for endometriosis were analysed in 49 articles published from 1993 through 2015. This is the first diagnostic test review to use Cochrane methods and the most comprehensive review to date.

For pelvic endometriosis, no imaging method met the sensitivity criteria for a replacement test or a triage test, although TVUS approached the criteria for a SpPin triage test.

For ovarian endometriosis, MRI met the criteria for a replacement test and a SnNout triage test and approached the criteria for a SpPin triage test, but studies were too few to allow conclusions on the role of MRI in detecting ovarian endometriosis. TVUS met the criteria for a SpPin triage test and approached the criteria for a replacement test and a SnNout triage test.

For DIE/posterior DIE, MRI approached the criteria for a replacement test and a SnNout triage test, and TVUS approached the criteria for a SpPin triage test.

Studies were too few for a prudent evaluation of any imaging test for diagnosing anterior DIE.

TVUS, TRUS and MRI reached the criteria for a SpPin triage test and approached the criteria for a SnNout triage test for rectosigmoid endometriosis, which was the most frequently evaluated anatomical site of DIE. TVUS also met SpPin test criteria for other bowel endometriosis (ileum to rectum). MDCT‐e displayed the highest diagnostic performance for rectosigmoid and other bowel endometriosis and met the criteria for both SpPin and SnNout triage tests, but studies were too few to provide meaningful results. We found less heterogeneity among estimates for imaging tests in rectosigmoid and bowel endometriosis compared with other anatomical locations, excluding DCBE, which showed heterogeneous and unsatisfactory diagnostic values.

Concerning other anatomical locations, TVUS met the criteria for a SpPin triage test in mapping DIE to USL, RVS, vaginal wall and POD, and MRI could qualify as a SpPin triage test only for POD and vaginal wall endometriosis. TRUS could not be adequately assessed for any of these sites because heterogeneous data were scant.

Data were insufficient for formal comparative analyses between TVUS and MRI methods, although modified ultrasound methods (TVUS‐BP and RWC‐TVS) and specific MRI modalities (3.0T MRI and MRI jelly method with introduction of ultrasonographic gel into both the rectum and the vagina) showed the highest diagnostic accuracy for evaluated types and anatomical locations of endometriosis.

Studies of poor quality showing considerably heterogeneous results with wide confidence intervals for most evaluated tests suggest caution in interpretation of study results.

Strengths and weaknesses of the review

This review is part of a comprehensive review series of minimally invasive biomarkers for the diagnosis of endometriosis.

Strengths of this review include the following.

  • Review authors undertook a very thorough search of the current literature including studies written in languages other than English.

  • Two independent review authors extracted data and used a modified QUADAS‐2 tool for quality assessments.

  • Stringent selection criteria ensured that eligible studies were prospective, included only symptomatic women of reproductive age and performed the index test before providing results of the reference test, which minimised the risk of bias in interpretation of index test results.

  • Most of the included studies (48/49) were of 'single‐gate' design, including only clinically relevant populations.

  • We approached authors of studies in an attempt to obtain missing information required to assess eligibility and critically appraise studies.

Limitations of this review include the following.

  • Most of the index tests were evaluated in only a few small, heterogeneous studies. This may undermine the reliability of pooled estimates from the meta‐analyses and is likely to have contributed to the marked variability in sensitivity and specificity seen for most index tests. Studies varied with respect to participant preparation, operator experience and imaging equipment used, as well as in the definition of the target condition and the diagnostic criteria for imaging tests. Sources of heterogeneity could not be formally explored for most tests because few studies were available for most evaluations. When assessed, geographical location, prevalence of the target condition and assessed risk of bias did not appear to contribute to variation in results.

  • All included studies had high/unclear risk of bias; this, together with considerable heterogeneity among studies, contributed to the low quality of evidence presented in this review.

  • Reported prevalence of endometriosis in most studies was generally higher than previously reported: 6% to 10% in the general female population and 35% to 50% among symptomatic women for overall endometriosis (Giudice 2004), and 30% for DIE in symptomatic populations (Koninckx and Martin 1994). This may reflect a high level of surgical diagnostic expertise but could be due to preselection of more challenging cases at tertiary referral centres and high risk of patient selection bias in most studies. Selection bias appeared to be reduced but not eliminated by consecutive enrolment of participants; however, information on the method of enrolment was missing from most of the included studies.

  • Inappropriate assignation to endometriosis and control groups could not be excluded in many studies and is another weakness of the review. Surgical misdiagnosis is a potential cause of bias, as the number and experience of the surgical team, surgical diagnostic criteria and surgical methods were poorly described in most included studies. A standardised technique for performing laparoscopy is now available, and we recommend that future studies use it (Becker 2014). Additionally, we did not confine the studies included in this review to those that reported histological confirmation of endometriotic lesions. Although a recent ESHRE guideline stated that evidence is lacking to support laparoscopy without histology to confirm endometriosis (Dunselman 2014), the clinical significance of histological verification remains debatable. Diagnosis by surgical visualisation only remains a common clinical practice and can be considered reliable when accurate inspection of the abdominal cavity is performed by experienced surgeons. We chose to include the six (15%) studies that reported only surgical visualisation as the reference standard, as we did not wish to lose this potentially valuable information; however, this decision could impact the accuracy of assignation to case and control groups. Moreover, surgeons were commonly aware of results of the index imaging test preoperatively, which could potentially contribute to bias in interpretation of the reference standard.

  • Only five studies addressed interobserver and intraobserver variability for TVUS, reporting that both 2D‐ and 3D‐TVUS were reliable and reproducible techniques. High levels of interobserver concordance were seen between experienced operators (Holland 2010) and operators with varying degrees of experience (Guerriero 2007; Pascual 2013; Reid 2013b; Guerriero 2014). For MRI, interobserver agreement varied, with greater intraobserver agreement noted for expert readers and less agreement for junior readers (Bazot 2013). The diagnostic concordance of observers varied with the location of endometriosis, with high interobserver and intraobserver agreement for ovarian endometrioma, rectosigmoid and RVS disease, and less agreement for identification of uterosacral ligament lesions (Saba 2010; Bazot 2011b; Saba 2014b).

  • Methods for systematic reviews of diagnostic accuracy are emerging, and no criteria for replacement or triage diagnostic tests have been established. We chose criteria that were both realistic and clinically applicable to assist in interpretation of complex results. For a replacement test, we considered the threshold reported by the only systematic review on accuracy of the reference standard (laparoscopy) in detecting endometriosis (Wykes 2004) to be the most objective. The meta‐analysis was published in 2004 and included four eligible studies comprising 433 women. We acknowledge the limitations associated with emphasising a single review, particularly if it does not present the latest and possibly more accurate data that reflect advances in surgical expertise and technology. Several studies on the accuracy of laparoscopy in detecting endometriosis have been published over the past decade; however, their results were not addressed in a systematic way. A further systematic analysis to determine the accuracy of laparoscopy was beyond the scope of this review. Criteria for triage tests utilised the common concepts of SnNout and SpPin in medical statistics, and cut‐offs were set at levels that we considered to be clinically relevant (see Role of index test(s)). We encourage readers to apply independent interpretation of the diagnostic estimates presented while using thresholds that may be more applicable to specific populations and clinical circumstances.
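
As one way of applying the estimates to a particular population, sensitivity and specificity can be converted to predictive values for an assumed prevalence using Bayes' theorem. The sketch below is illustrative only: the function name `predictive_values` is hypothetical, the accuracy values are the TVUS point estimates for bowel endometriosis reported in this review, and the prevalence values are simply picked from the ranges cited above rather than being review findings.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied at a given prevalence.

    Works on expected proportions of true/false positives and negatives,
    which is Bayes' theorem written out for a 2x2 table.
    """
    if not 0.0 < prevalence < 1.0:
        raise ValueError("prevalence must lie strictly between 0 and 1")
    tp = sensitivity * prevalence                  # diseased, test positive
    fn = (1.0 - sensitivity) * prevalence          # diseased, test negative
    tn = specificity * (1.0 - prevalence)          # not diseased, test negative
    fp = (1.0 - specificity) * (1.0 - prevalence)  # not diseased, test positive
    ppv = tp / (tp + fp) if (tp + fp) > 0 else float("nan")
    npv = tn / (tn + fn) if (tn + fn) > 0 else float("nan")
    return ppv, npv


# TVUS for bowel endometriosis (point estimates from this review),
# evaluated at illustrative prevalence values.
for prevalence in (0.10, 0.35, 0.50):
    ppv, npv = predictive_values(0.91, 0.97, prevalence)
    print(f"prevalence={prevalence:.2f}  PPV={ppv:.2f}  NPV={npv:.2f}")
```

Because the predictive values shift with prevalence, the same test may be more or less useful for ruling disease in or out in a tertiary referral population than in a lower-prevalence setting.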

Applicability of findings to the review question

Clinical applicability, assessed with QUADAS‐2, was ranked as high for most studies (only one study raised high concern for applicability with regard to patient selection). This reflects inclusion criteria ensuring that prospective symptomatic cohorts of women constituted the participant population, which is highly applicable to the review question and to clinical practice. Most included studies were conducted at specialised centres for endometriosis with a high level of expertise in gynaecological imaging, and index test outcome measures may not be reproducible in all institutions or may not be extrapolated to general practice.

We excluded some potentially relevant well‐designed studies as they did not directly address the review question. These included studies that reported the number of endometriotic lesions instead of the number of affected participants as an endpoint. Studies that compared endometriomas versus other ovarian masses did not meet our inclusion criteria for reproductive age or assessed numbers of cysts rather than numbers of participants. Despite well‐defined radiological criteria, endometriomas can be misdiagnosed because of their complex echo texture and multi‐faceted appearance, and their appearance can differ between premenopausal and postmenopausal women (Exacoustos 2014). We also excluded rare forms of endometriosis, such as that involving the bladder, ureter or extrapelvic sites (e.g. umbilicus, hernia sacs, abdominal wall, lung, kidney), as studies are informed predominantly by case reports or small case series, and diagnostic laparoscopy is not an applicable reference test for these conditions.

Figure 1

Sequential approach to non‐invasive testing of endometriosis.

Figure 2

Flow of studies identified in literature search for systematic review on imaging modalities for a non‐invasive diagnosis of endometriosis.

Figure 3

Risk of bias and applicability concerns graph: review authors' judgements about each domain presented as percentages across included studies.

Figure 4

Risk of bias and applicability concerns summary: review authors' judgements about each domain for each included study.

Figure 5

Forest plot of TVUS for detection of pelvic endometriosis. Plot shows study‐specific estimates of sensitivity and specificity (squares) with 95% CI (black line) and country in which the study was conducted. Studies are ordered according to the year of publication. FN: false negative; FP: false positive; TN: true negative; TP: true positive. Modifications to the conventional TVUS are presented as 'modified method'.

Figure 6

Summary ROC plot of TVUS for detection of pelvic endometriosis. Each point represents the pair of sensitivity and specificity from a study. The size of each point is proportional to the study sample size. The solid black circle represents the pooled sensitivity and specificity, which is surrounded by a 95% confidence region (dashed line).

Figure 7

Forest plot of MRI for detection of pelvic endometriosis. Plot shows study‐specific estimates of sensitivity and specificity (squares) with 95% CI (black line) and country in which the study was conducted. Studies are ordered according to year of publication. Tests on the same population (different MRI methods) are presented separately as MRI* and MRI**. FN: false negative; FP: false positive; TN: true negative; TP: true positive.

Figure 8

Summary ROC plot of MRI for detection of pelvic endometriosis. Each point represents the pair of sensitivity and specificity from a study. The size of each point is proportional to the study sample size. Tests on the same population (different MRI methods) are presented separately as MRI* and MRI**. The solid black circle represents the pooled sensitivity and specificity, which is surrounded by a 95% confidence region (dashed line).

Figure 9

Forest plot demonstrating the direct comparison between MRI methods for pelvic endometriosis. Plot shows study‐specific paired estimates of sensitivity and specificity (squares) with 95% CI (black line) and country in which the study was conducted. FN: false negative; FP: false positive; TN: true negative; TP: true positive.

Figure 10

Forest plot demonstrating the direct comparison between MRI methods for pelvic endometriosis. Plot shows study‐specific paired estimates of sensitivity and specificity (squares) with 95% CI (black line) and country in which the study was conducted. FN: false negative; FP: false positive; TN: true negative; TP: true positive.

Figure 11

Forest plot demonstrating the direct comparison between MRI methods for pelvic endometriosis. Plot shows study‐specific paired estimates of sensitivity and specificity (squares) with 95% CI (black line) and country in which the study was conducted. FN: false negative; FP: false positive; TN: true negative; TP: true positive.

Figure 12

Forest plot of US methods (TVUS, TRUS) for detection of ovarian endometriosis. Plot shows study‐specific estimates of sensitivity and specificity (squares) with 95% CI (black line) and country in which the study was conducted. Studies are presented for TVUS and TRUS and are ordered according to year of publication. FN: false negative; FP: false positive; TN: true negative; TP: true positive. Modifications to the conventional TVUS are presented as 'modified method'.

Figure 13

Summary ROC plot of US methods (TVUS, TRUS) for detection of ovarian endometriosis. Each point represents the pair of sensitivity and specificity from a study. The size of each point is proportional to the study sample size. The solid black circle represents the pooled sensitivity and specificity, which is surrounded by a 95% confidence region (dashed line) (for TVUS).

Figure 14

Forest plot of MRI for detection of ovarian endometriosis. Plot shows study‐specific estimates of sensitivity and specificity (squares) with 95% CI (black line). Studies are ordered by year of publication. FN: false negative; FP: false positive; TN: true negative; TP: true positive.

Figure 15

Forest plot demonstrating the direct comparison between TVUS and TRUS for ovarian endometriosis. Plot shows study‐specific paired estimates of sensitivity and specificity (squares) with 95% CI (black line) and country in which the study was conducted. FN: false negative; FP: false positive; TN: true negative; TP: true positive. Modifications to the conventional technique are presented as 'modified method'.

Figure 16

Forest plot demonstrating the direct comparison between TRUS and MRI for ovarian endometriosis. Plot shows study‐specific paired estimates of sensitivity and specificity (squares) with 95% CI (black line) and country in which the study was conducted. FN: false negative; FP: false positive; TN: true negative; TP: true positive. Modifications to the conventional technique are presented as 'modified method'.

Figure 17

Forest plot demonstrating the direct comparison between TVUS and MRI for ovarian endometriosis. Plot shows study‐specific paired estimates of sensitivity and specificity (squares) with 95% CI (black line) and country in which the study was conducted. FN: false negative; FP: false positive; TN: true negative; TP: true positive. Modifications to the conventional technique are presented as 'modified method'.

Figure 18

Forest plot of TVUS for detection of DIE/Posterior DIE. Plot shows study‐specific estimates of sensitivity and specificity (squares) with 95% CI (black line) and country in which the study was conducted. Studies are ordered according to year of publication for DIE and Posterior DIE, respectively. Tests on the same population (different TVUS methods) are presented separately as TVUS*. FN: false negative; FP: false positive; TN: true negative; TP: true positive. Modifications to the conventional TVUS are presented as 'modified method'.

Figure 19

Summary ROC plot of TVUS for detection of DIE/Posterior DIE. Each point represents the pair of sensitivity and specificity from a study. The size of each point is proportional to the study sample size. Tests on the same population (different TVUS methods) are presented separately as TVUS*. The solid black circle represents the pooled sensitivity and specificity, which is surrounded by a 95% confidence region (dashed line).

Figure 20

Forest plot of MRI for detection of DIE/Posterior DIE. Plot shows study‐specific estimates of sensitivity and specificity (squares) with 95% CI (black line) and country in which the study was conducted. Studies are ordered according to year of publication. Tests on the same population (different MRI methods) are presented separately as MRI*. FN: false negative; FP: false positive; TN: true negative; TP: true positive. Modifications to the conventional MRI are presented as 'modified method'.

Figure 21

Summary ROC plot of MRI for detection of DIE/Posterior DIE. Each point represents the pair of sensitivity and specificity from a study. The size of each point is proportional to the study sample size. Tests on the same population (different MRI methods) are presented separately as MRI*. The solid black circle represents the pooled sensitivity and specificity, which is surrounded by a 95% confidence region (dashed line).

Figure 22

Forest plot demonstrating the direct comparison between TVUS and DCBE for DIE/Posterior DIE. Plot shows study‐specific paired estimates of sensitivity and specificity (squares) with 95% CI (black line) and country in which the study was conducted. FN: false negative; FP: false positive; TN: true negative; TP: true positive. Modifications to the conventional technique are presented as 'modified method'.

Figure 23

Forest plot demonstrating the direct comparison between TVUS methods for DIE/Posterior DIE. Plot shows study‐specific paired estimates of sensitivity and specificity (squares) with 95% CI (black line) and country in which the study was conducted. FN: false negative; FP: false positive; TN: true negative; TP: true positive. Modifications to the conventional technique are presented as 'modified method'.

Figure 24

Forest plot demonstrating the direct comparison between MRI methods for DIE/Posterior DIE. Plot shows study‐specific paired estimates of sensitivity and specificity (squares) with 95% CI (black line) and country in which the study was conducted. FN: false negative; FP: false positive; TN: true negative; TP: true positive. Modifications to the conventional technique are presented as 'modified method'.

Figure 25

Forest plot demonstrating the direct comparison between 3D‐TVUS and MRI for DIE/Posterior DIE. Plot shows study‐specific paired estimates of sensitivity and specificity (squares) with 95% CI (black line) and country in which the study was conducted. FN: false negative; FP: false positive; TN: true negative; TP: true positive. Modifications to the conventional technique are presented as 'modified method'.

Figure 26

Forest plot of all imaging tests for diagnosis of USL involvement by endometriosis. Plot shows study‐specific paired estimates of sensitivity and specificity (squares) with 95% CI (black line) and country in which the study was conducted. Studies are ordered according to year of publication for each test. Tests on the same population (different MRI methods) are presented separately as MRI*. FN: false negative; FP: false positive; TN: true negative; TP: true positive. Modifications to the conventional technique are presented as 'modified method'.

Figure 27

Summary ROC plot of TVUS for detection of USL involvement by endometriosis. Each point represents the pair of sensitivity and specificity from a study. The size of each point is proportional to the study sample size. The solid black circle represents the pooled sensitivity and specificity, which is surrounded by a 95% confidence region (dashed line).
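The per-study points of a summary ROC plot can be sketched as follows: each study contributes one point at (1 − specificity, sensitivity), with a marker area scaled to its sample size. This is a minimal sketch under stated assumptions: the study labels, counts and the `max_area` scaling parameter are hypothetical, and the pooled point and its 95% confidence region come from the review's meta-analytic model, which is not reproduced here.

```python
def sroc_points(studies, max_area=300.0):
    """Map (label, TP, FP, FN, TN) tuples to ROC-space points.

    The marker area is proportional to the study sample size; the largest
    study receives max_area (an arbitrary plotting unit).
    """
    sizes = [tp + fp + fn + tn for _, tp, fp, fn, tn in studies]
    largest = max(sizes)
    points = []
    for (label, tp, fp, fn, tn), n in zip(studies, sizes):
        sensitivity = tp / (tp + fn)
        specificity = tn / (tn + fp)
        points.append({
            "label": label,
            "x": 1 - specificity,  # horizontal axis: false positive rate
            "y": sensitivity,      # vertical axis: sensitivity
            "sensitivity": sensitivity,
            "specificity": specificity,
            "area": max_area * n / largest,
        })
    return points


# Hypothetical studies, for illustration only.
studies = [
    ("Study A", 30, 4, 10, 56),
    ("Study B", 12, 2, 3, 33),
]
points = sroc_points(studies)
for p in points:
    print(p["label"], round(p["x"], 3), round(p["y"], 3), p["area"])
```

The resulting `x`, `y` and `area` values could be passed to any scatter-plot routine; no plotting library is assumed here.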

Figure 28

Summary ROC plot of MRI for detection of USL involvement by endometriosis. Each point represents the pair of sensitivity and specificity from a study. The size of each point is proportional to the study sample size. Tests on the same population (different MRI methods) are presented separately as MRI*. The solid black circle represents the pooled sensitivity and specificity, which is surrounded by a 95% confidence region (dashed line).

Figure 29

Forest plot demonstrating the direct comparison between TVUS and TRUS for USL involvement. Plot shows study‐specific paired estimates of sensitivity and specificity (squares) with 95% CI (black line) and country in which the study was conducted. FN: false negative; FP: false positive; TN: true negative; TP: true positive. Modifications to the conventional technique are presented as 'modified method'.

Figure 30

Forest plot demonstrating the direct comparison between MRI and TRUS for USL involvement. Plot shows study‐specific paired estimates of sensitivity and specificity (squares) with 95% CI (black line) and country in which the study was conducted. FN: false negative; FP: false positive; TN: true negative; TP: true positive. Modifications to the conventional technique are presented as 'modified method'.

Figure 31

Forest plot demonstrating the direct comparison between MRI and TVUS for USL involvement. Plot shows study‐specific paired estimates of sensitivity and specificity (squares) with 95% CI (black line) and country in which the study was conducted. FN: false negative; FP: false positive; TN: true negative; TP: true positive. Modifications to the conventional technique are presented as 'modified method'.

Figure 32

Forest plot demonstrating the direct comparison between 2D‐MRI and 3D‐MRI for USL involvement. Plot shows study‐specific paired estimates of sensitivity and specificity (squares) with 95% CI (black line) and country in which the study was conducted. FN: false negative; FP: false positive; TN: true negative; TP: true positive. Modifications to the conventional technique are presented as 'modified method'.

Figure 33

Forest plot of all imaging tests for diagnosis of RVS involvement. Plot shows study‐specific paired estimates of sensitivity and specificity (squares) with 95% CI (black line) and country in which the study was conducted. Studies are ordered according to the year of publication for each test. Tests on the same population (different TVUS methods) are presented separately as TVUS*. FN: false negative; FP: false positive; TN: true negative; TP: true positive. Modifications to the conventional technique are presented as 'modified method'.

Figure 34

Summary ROC plot of TVUS for detection of RVS involvement. Each point represents the pair of sensitivity and specificity from a study. The size of each point is proportional to the study sample size. Tests on the same population (different TVUS methods) are presented separately as TVUS*. The solid black circle represents the pooled sensitivity and specificity, which is surrounded by a 95% confidence region (dashed line).

Figure 35

Forest plot demonstrating the direct comparison between TVUS and RWC‐TVS for RVS involvement. Plot shows study‐specific paired estimates of sensitivity and specificity (squares) with 95% CI (black line) and country in which the study was conducted. FN: false negative; FP: false positive; TN: true negative; TP: true positive. Modifications to the conventional technique are presented as 'modified method'.

Figure 36

Forest plot demonstrating the direct comparison between TVUS and TRUS for RVS involvement. Plot shows study‐specific paired estimates of sensitivity and specificity (squares) with 95% CI (black line) and country in which the study was conducted. FN: false negative; FP: false positive; TN: true negative; TP: true positive. Modifications to the conventional technique are presented as 'modified method'.

Figure 37

Forest plot demonstrating the direct comparison between MRI and TRUS for RVS involvement. Plot shows study‐specific paired estimates of sensitivity and specificity (squares) with 95% CI (black line) and country in which the study was conducted. FN: false negative; FP: false positive; TN: true negative; TP: true positive. Modifications to the conventional technique are presented as 'modified method'.

Figure 38

Forest plot demonstrating the direct comparison between MRI and TVUS for RVS involvement. Plot shows study‐specific paired estimates of sensitivity and specificity (squares) with 95% CI (black line) and country in which the study was conducted. FN: false negative; FP: false positive; TN: true negative; TP: true positive. Modifications to the conventional technique are presented as 'modified method'.

Figure 39

Forest plot of all imaging tests for diagnosis of vaginal wall involvement. Plot shows study‐specific paired estimates of sensitivity and specificity (squares) with 95% CI (black line) and country in which the study was conducted. Studies are ordered according to year of publication for each test. Tests on the same population (different MRI methods) are presented separately as MRI*. FN: false negative; FP: false positive; TN: true negative; TP: true positive. Modifications to the conventional technique are presented as 'modified method'.

Figure 40

Summary ROC plot of TVUS for detection of vaginal wall involvement. Each point represents the pair of sensitivity and specificity from a study. The size of each point is proportional to the study sample size. The solid black circle represents the pooled sensitivity and specificity, which is surrounded by a 95% confidence region (dashed line).

Figure 41

Summary ROC plot of MRI for detection of vaginal wall involvement. Each point represents the pair of sensitivity and specificity from a study. The size of each point is proportional to the study sample size. Tests on the same population (different MRI methods) are presented separately as MRI*. The solid black circle represents the mean sensitivity and specificity, which is surrounded by a 95% confidence region (dashed line).

Figure 42

Forest plot demonstrating the direct comparison between TVUS and TRUS for vaginal wall involvement. Plot shows study‐specific paired estimates of sensitivity and specificity (squares) with 95% CI (black line) and country in which the study was conducted. FN: false negative; FP: false positive; TN: true negative; TP: true positive. Modifications to the conventional technique are presented as 'modified method'.

Figure 43

Forest plot demonstrating the direct comparison between TRUS and MRI for vaginal wall involvement. Plot shows study‐specific paired estimates of sensitivity and specificity (squares) with 95% CI (black line) and country in which the study was conducted. FN: false negative; FP: false positive; TN: true negative; TP: true positive. Modifications to the conventional technique are presented as 'modified method'.

Figure 44

Forest plot demonstrating the direct comparison between TVUS and MRI for vaginal wall involvement. Plot shows study‐specific paired estimates of sensitivity and specificity (squares) with 95% CI (black line) and country in which the study was conducted. FN: false negative; FP: false positive; TN: true negative; TP: true positive. Modifications to the conventional technique are presented as 'modified method'.

Figure 45

Forest plot demonstrating the direct comparison between 2D‐MRI and 3D‐MRI for vaginal wall involvement. Plot shows study‐specific paired estimates of sensitivity and specificity (squares) with 95% CI (black line) and country in which the study was conducted. FN: false negative; FP: false positive; TN: true negative; TP: true positive. Modifications to the conventional technique are presented as 'modified method'.

Figure 46

Forest plot of all imaging tests for diagnosis of POD obliteration by endometriosis. Plot shows study‐specific paired estimates of sensitivity and specificity (squares) with 95% CI (black line) and country in which the study was conducted. Studies are ordered according to year of publication for each test. Tests on the same population (different MRI methods) are presented separately as MRI*. FN: false negative; FP: false positive; TN: true negative; TP: true positive. Modifications to the conventional technique are presented as 'modified method'.

Figure 47

Summary ROC plot of TVUS for detection of POD obliteration by endometriosis. Each point represents the pair of sensitivity and specificity from a study. The size of each point is proportional to the study sample size. The solid black circle represents the pooled sensitivity and specificity, which is surrounded by a 95% confidence region (dashed line).

Figure 48

Summary ROC plot of MRI for detection of POD obliteration by endometriosis. Each point represents the pair of sensitivity and specificity from a study. The size of each point is proportional to the study sample size. Tests on the same population (different MRI methods) are presented separately as MRI*. The solid black circle represents the pooled sensitivity and specificity, which is surrounded by a 95% confidence region (dashed line).

Figure 49

Forest plot demonstrating the direct comparison between 2D‐MRI and 3D‐MRI for POD obliteration by endometriosis. Plot shows study‐specific paired estimates of sensitivity and specificity (squares) with 95% CI (black line) and country in which the study was conducted. FN: false negative; FP: false positive; TN: true negative; TP: true positive. Modifications to the conventional technique are presented as 'modified method'.

Figure 50

Summary ROC plot of TVUS and MRI for detection of anterior DIE. Each point represents the pair of sensitivity and specificity from a study. The size of each point is proportional to the study sample size, and the shape designates different imaging modalities. The solid black circle represents the pooled sensitivity and specificity for TVUS, and the bars correspond to 95% CIs of each individual study.

Figure 51

Forest plot of all imaging tests for diagnosis of rectosigmoid involvement. Plot shows study‐specific paired estimates of sensitivity and specificity (squares) with 95% CI (black line) and country in which the study was conducted. Studies are ordered according to year of publication for each test. Tests on the same population (different TVUS and MRI methods) are presented separately as TVUS* and MRI*. FN: false negative; FP: false positive; TN: true negative; TP: true positive. Modifications to the conventional technique are presented as 'modified method'.

Figure 52

Summary ROC plot of TVUS for detection of rectosigmoid involvement. Each point represents the pair of sensitivity and specificity from a study. The size of each point is proportional to the study sample size. Tests on the same population (different TVUS methods) are presented separately as TVUS*. The solid black circle represents the pooled sensitivity and specificity, which is surrounded by a 95% confidence region (dashed line).

Figure 53

Summary ROC plot of TRUS for detection of rectosigmoid involvement. Each point represents the pair of sensitivity and specificity from a study. The size of each point is proportional to the study sample size. The solid black circle represents the pooled sensitivity and specificity, which is surrounded by a 95% confidence region (dashed line).

Figure 54

Summary ROC plot of MRI for detection of rectosigmoid involvement. Each point represents the pair of sensitivity and specificity from a study. The size of each point is proportional to the study sample size. Tests on the same population (different MRI methods) are presented separately as MRI*. The solid black circle represents the pooled sensitivity and specificity, which is surrounded by a 95% confidence region (dashed line).

Figure 55

Summary ROC plot of MDCT‐e for detection of rectosigmoid involvement. Each point represents the pair of sensitivity and specificity from a study. The size of each point is proportional to the study sample size and the shape designates consecutive or non‐consecutive enrolment. The solid black circle represents the pooled sensitivity and specificity, and the bars correspond to 95% CIs of each individual study.

Figure 56

Summary ROC plot of DCBE for detection of rectosigmoid involvement. Each point represents the pair of sensitivity and specificity from a study. The size of each point is proportional to the study sample size. The solid black circle represents the pooled sensitivity and specificity, and the bars correspond to 95% CIs of each individual study.

Figure 57

Forest plot demonstrating the direct comparison between TVUS and 3D‐TVUS for rectosigmoid involvement. Plot shows study‐specific paired estimates of sensitivity and specificity (squares) with 95% CI (black line) and country in which the study was conducted. FN: false negative; FP: false positive; TN: true negative; TP: true positive. Modifications to the conventional technique are presented as 'modified method'.

Figure 58

Forest plot demonstrating the direct comparison between TVUS and MRI for rectosigmoid involvement. Plot shows study‐specific paired estimates of sensitivity and specificity (squares) with 95% CI (black line) and country in which the study was conducted. FN: false negative; FP: false positive; TN: true negative; TP: true positive. Modifications to the conventional technique are presented as 'modified method'.

Figure 59

Forest plot demonstrating the direct comparison between TVUS and TRUS for rectosigmoid involvement. Plot shows study‐specific paired estimates of sensitivity and specificity (squares) with 95% CI (black line) and country in which the study was conducted. FN: false negative; FP: false positive; TN: true negative; TP: true positive. Modifications to the conventional technique are presented as 'modified method'.

Figure 60

Forest plot demonstrating the direct comparison between TRUS and MRI for rectosigmoid involvement. Plot shows study‐specific paired estimates of sensitivity and specificity (squares) with 95% CI (black line) and country in which the study was conducted. FN: false negative; FP: false positive; TN: true negative; TP: true positive. Modifications to the conventional technique are presented as 'modified method'.

Forest plot demonstrating the direct comparison between RWC‐TVS and TRUS for rectosigmoid involvement. Plot shows study‐specific paired estimates of sensitivity and specificity (squares) with 95% CI (black line) and country in which the study was conducted. FN: false negative; FP: false positive; TN: true negative; TP: true positive. Modifications to the conventional technique are presented as 'modified method'.
Figure 61

Forest plot demonstrating the direct comparison between TVUS and DCBE for rectosigmoid involvement. Plot shows study‐specific paired estimates of sensitivity and specificity (squares) with 95% CI (black line) and country in which the study was conducted. FN: false negative; FP: false positive; TN: true negative; TP: true positive. Modifications to the conventional technique are presented as 'modified method'.
Figure 62

Forest plot demonstrating the direct comparison between TRUS and DCBE for rectosigmoid involvement. Plot shows study‐specific paired estimates of sensitivity and specificity (squares) with 95% CI (black line) and country in which the study was conducted. FN: false negative; FP: false positive; TN: true negative; TP: true positive. Modifications to the conventional technique are presented as 'modified method'.
Figure 63

Forest plot demonstrating the direct comparison between RWC‐TVS and MDCT‐e for rectosigmoid involvement. Plot shows study‐specific paired estimates of sensitivity and specificity (squares) with 95% CI (black line) and country in which the study was conducted. FN: false negative; FP: false positive; TN: true negative; TP: true positive. Modifications to the conventional technique are presented as 'modified method'.
Figure 64

Forest plot demonstrating the direct comparison between MDCT‐e and MRI for rectosigmoid involvement. Plot shows study‐specific paired estimates of sensitivity and specificity (squares) with 95% CI (black line) and country in which the study was conducted. FN: false negative; FP: false positive; TN: true negative; TP: true positive. Modifications to the conventional technique are presented as 'modified method'.
Figure 65

Forest plot demonstrating the direct comparison between 2D‐MRI and 3D‐MRI for rectosigmoid involvement. Plot shows study‐specific paired estimates of sensitivity and specificity (squares) with 95% CI (black line) and country in which the study was conducted. FN: false negative; FP: false positive; TN: true negative; TP: true positive. Modifications to the conventional technique are presented as 'modified method'.
Figure 66

Forest plot of all imaging tests for diagnosis of bowel [ileum ‐ rectum] involvement. Plot shows study‐specific paired estimates of sensitivity and specificity (squares) with 95% CI (black line) and country in which the study was conducted. Studies are ordered according to year of publication for each test. FN: false negative; FP: false positive; TN: true negative; TP: true positive. Modifications to the conventional technique are presented as 'modified method'.
Figure 67

Summary ROC plot of US methods (TVUS, TRUS) for detection of bowel [ileum ‐ rectum] involvement. Each point represents the pair of sensitivity and specificity from a study. The size of each point is proportional to the study sample size. The solid black circle represents the pooled sensitivity and specificity (for TVUS), and the bars correspond to 95% CIs of each individual study.
Figure 68

Summary ROC plot of MDCT‐e for detection of bowel [ileum ‐ rectum] involvement. Each point represents the pair of sensitivity and specificity from a study. The size of each point is proportional to the study sample size. The solid black circle represents the pooled sensitivity and specificity, and the bars correspond to 95% CIs of each individual study.
Figure 69

Forest plot demonstrating the direct comparison between TVUS and TRUS for bowel [ileum ‐ rectum] involvement. Plot shows study‐specific paired estimates of sensitivity and specificity (squares) with 95% CI (black line) and country in which the study was conducted. FN: false negative; FP: false positive; TN: true negative; TP: true positive. Modifications to the conventional technique are presented as 'modified method'.
Figure 70

Forest plot demonstrating the direct comparison between RWC‐TVS and MDCT‐e for bowel [ileum ‐ rectum] involvement. Plot shows study‐specific paired estimates of sensitivity and specificity (squares) with 95% CI (black line) and country in which the study was conducted. FN: false negative; FP: false positive; TN: true negative; TP: true positive. Modifications to the conventional technique are presented as 'modified method'.
Figure 71

TVUS pelvic.
Test 1

TVUS ovarian.
Test 2

TVUS DIE.
Test 3

TVUS posterior DIE.
Test 4

TVUS* posterior DIE.
Test 5

TVUS USL.
Test 6

TVUS RVS.
Test 7

TVUS* RVS.
Test 8

TVUS vaginal.
Test 9

TVUS POD.
Test 10

TVUS anterior DIE.
Test 11

TVUS rectosigmoid.
Test 12

TVUS* rectosigmoid.
Test 13

TVUS bowel [ileum ‐ rectum].
Test 14

TRUS ovarian.
Test 15

TRUS USL.
Test 16

TRUS RVS.
Test 17

TRUS vaginal.
Test 18

TRUS rectosigmoid.
Test 19

TRUS bowel [ileum ‐ rectum].
Test 20

MRI pelvic.
Test 21

MRI* pelvic.
Test 22

MRI** pelvic.
Test 23

MRI ovarian.
Test 24

MRI DIE.
Test 25

MRI posterior DIE.
Test 26

MRI* posterior DIE.
Test 27

MRI USL.
Test 28

MRI* USL.
Test 29

MRI RVS.
Test 30

MRI vaginal.
Test 31

MRI* vaginal.
Test 32

MRI POD.
Test 33

MRI* POD.
Test 34

MRI anterior DIE.
Test 35

MRI rectosigmoid.
Test 36

MRI* rectosigmoid.
Test 37

MDCT‐e rectosigmoid.
Test 38

MDCT‐e bowel [ileum ‐ rectum].
Test 39

18FDG PET–CT pelvic.
Test 40

DCBE DIE.
Test 41

DCBE rectosigmoid.
Test 42

MRI pelvic1.
Test 43

Summary of findings 1. Summary of findings table: diagnostic tests for endometriosis

Review question

What is the diagnostic accuracy of the imaging tests in detecting endometriosis?

Pelvic endometriosis (any site and depth of invasion)

Ovarian endometriosis

DIE

Importance

A simple and reliable non‐invasive test for endometriosis with the potential to replace laparoscopy or to triage patients to reduce surgery would minimise surgical risk and reduce diagnostic delay

Participants

Women of reproductive age (1) with suspected endometriosis and/or (2) with persistent ovarian mass and/or (3) undergoing infertility workup

Settings

Hospitals (public or private of any level): outpatient clinics (general gynaecology, reproductive medicine, pelvic pain) and/or radiology departments

Reference standard

Visualisation of endometriosis at surgery (laparoscopy or laparotomy) with or without histological confirmation

Study design

Cross‐sectional of 'single‐gate' design (n = 28) or 'two‐gate' design (n = 1); prospective enrolment; 1 study could assess more than 1 test and/or more than 1 type of endometriosis

Risk of bias and applicability concerns

Overall judgement

Poor quality of most studies (only 1 study had 'low risk' assessment in all 4 domains; Thomeer 2014)

Patient selection bias

High risk: 13 studies; unclear risk: 6 studies; low risk: 10 studies

Index test interpretation bias

High risk: 7 studies; unclear risk: 7 studies; low risk: 15 studies

Reference standard interpretation bias

High risk: 6 studies; unclear risk: 16 studies; low risk: 7 studies

Flow and timing selection bias

High risk: 9 studies; unclear risk: 2 studies; low risk: 18 studies

Applicability concerns

Concerns regarding patient selection: high concern ‐ 1 study, unclear concern ‐ 0 studies, low concern ‐ 28 studies

Concerns regarding index test: high concern ‐ 0 studies, unclear concern ‐ 0 studies, low concern ‐ 29 studies

Concerns regarding reference standard: high concern ‐ 0 studies, unclear concern ‐ 0 studies, low concern ‐ 29 studies

Diagnostic thresholds

Replacement test: sensitivity ≥ 94%; specificity ≥ 79%

SnNout triage test: sensitivity ≥ 95%; specificity ≥ 50%

SpPin triage test: sensitivity ≥ 50%; specificity ≥ 95%

Approaching criteria for 1 of the above tests: diagnostic estimates within 5% of set thresholds
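
The thresholds above can be applied mechanically to any pair of pooled estimates. The following is a minimal sketch only: the function name `classify` and the constants are hypothetical, and it assumes that "within 5% of set thresholds" means within 5 percentage points on both sensitivity and specificity. The review's own judgements may also weigh the width of confidence intervals and the number of studies, which this sketch ignores.

```python
# Minimum (sensitivity, specificity) for each role, copied from the
# "Diagnostic thresholds" rows of the table.
THRESHOLDS = {
    "replacement": (0.94, 0.79),
    "SnNout triage": (0.95, 0.50),
    "SpPin triage": (0.50, 0.95),
}
MARGIN = 0.05  # assumption: "within 5%" read as 5 percentage points
EPS = 1e-9     # guards against floating-point rounding at the boundaries


def classify(sens, spec):
    """Return 'meets', 'approaches' or 'not met' for each test role."""
    verdicts = {}
    for role, (min_sens, min_spec) in THRESHOLDS.items():
        if sens >= min_sens - EPS and spec >= min_spec - EPS:
            verdicts[role] = "meets"
        elif (sens >= min_sens - MARGIN - EPS
              and spec >= min_spec - MARGIN - EPS):
            verdicts[role] = "approaches"
        else:
            verdicts[role] = "not met"
    return verdicts


# Pooled estimates reported in this table for ovarian endometriosis.
print("TVUS:", classify(0.93, 0.96))
print("MRI: ", classify(0.95, 0.91))
```

Under these assumptions, the TVUS estimates for ovarian endometriosis meet the SpPin criteria and approach the replacement and SnNout criteria, and the MRI estimates meet the replacement and SnNout criteria and approach the SpPin criteria, in line with the implications stated in the table.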

| Target condition | Test | N of participants; N of studies; N of data sets | Pooled estimates (95% CI) | Outcome: true positives (endometriosis) | Outcome: false positives (incorrectly classified as endometriosis) | Outcome: false negatives (incorrectly classified as disease‐free) | Outcome: true negatives (disease‐free) | Implications |
|---|---|---|---|---|---|---|---|---|
| Pelvic endometriosis (13 studies, 1535 participants) | TVUS | 1222 participants in 5 studies | Sens = 0.65 (0.27 to 1.00); Spec = 0.95 (0.89 to 1.00). Meta‐analysis of 4 studies after removing 1 outlier study: Sens = 0.79 (0.36 to 1.00); Spec = 0.91 (0.74 to 1.00) | 257 | 24 | 372 | 569 | Approaches the criteria for a SpPin triage test when 1 outlier study was excluded. Wide confidence intervals (CIs) |
| | MRI | 303 participants in 7 studies; 396 participants in 10 data sets | Sens = 0.79 (0.70 to 0.88); Spec = 0.72 (0.51 to 0.92) | 253 | 21 | 70 | 52 | Neither replacement nor triage test criteria met. Observation: 3.0T MRI (2 studies) demonstrated highest diagnostic accuracy |
| | 18FDG PET‐CT | 10 participants in 1 study | Not available [a] | 0 | 0 | 9 | 1 | Insufficient evidence to allow meaningful conclusions |
| Ovarian endometriosis (10 studies, 852 participants) | TVUS | 765 participants in 8 studies | Sens = 0.93 (0.87 to 0.99); Spec = 0.96 (0.92 to 0.99) | 182 | 28 | 16 | 539 | Meets the criteria for a SpPin triage test and approaches the criteria for a replacement and SnNout triage test. Observation: Studies published after 2006 (4 out of 5 studies) demonstrated highest diagnostic accuracy |
| | TRUS | 92 participants in 1 study | Not available [b] | 32 | 13 | 4 | 43 | Insufficient evidence to allow meaningful conclusions |
| | MRI | 179 participants in 3 studies | Sens = 0.95 (0.90 to 1.00); Spec = 0.91 (0.86 to 0.97) | 72 | 9 | 4 | 94 | Meets the criteria for a replacement and SnNout triage test, approaches the criteria for a SpPin triage test. Observation: 3.0T MRI (2 studies) demonstrated highest diagnostic accuracy. Insufficient evidence to allow meaningful conclusions |
| DIE/Posterior DIE (15 studies, 1493 participants) | TVUS | 934 participants in 9 studies; 1383 participants in 12 data sets | Sens = 0.79 (0.69 to 0.89); Spec = 0.94 (0.88 to 1.00) | 435 | 51 | 128 | 769 | Approaches the criteria for a SpPin triage test. Observation: TVUS‐BP (1 study) demonstrated highest diagnostic accuracy |
| | MRI | 266 participants in 6 studies; 289 participants in 7 data sets | Sens = 0.94 (0.90 to 0.97); Spec = 0.77 (0.44 to 1.00) | 210 | 11 | 9 | 59 | Approaches the criteria for a replacement and SnNout triage test. Observation: 3.0T MRI (2 studies) and MRI jelly method (1 study) demonstrated highest diagnostic accuracy |
| | DCBE | 69 participants in 1 study | Not available [c] | 24 | 0 | 43 | 2 | Insufficient evidence to allow meaningful conclusions |

[a] For 18FDG PET‐CT in pelvic endometriosis, diagnostic estimates were sensitivity = 0.00 (0.00 to 0.34); specificity = 1.00 (0.03 to 1.00)

[b] For TRUS in ovarian endometriosis, diagnostic estimates were sensitivity = 0.89 (0.74 to 0.97); specificity = 0.77 (0.64 to 0.87)

[c] For DCBE in DIE, diagnostic estimates were sensitivity = 0.36 (0.24 to 0.48); specificity = 1.00 (0.16 to 1.00)
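
The single‐study point estimates in the footnotes follow directly from the outcome counts in the table. As an illustrative sketch (the function names are hypothetical and confidence intervals are not computed here), the TRUS counts for ovarian endometriosis, 32 true positives, 13 false positives, 4 false negatives and 43 true negatives, can be converted as follows:

```python
def sensitivity(tp, fn):
    """Proportion of women with the target condition who test positive."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Proportion of women without the target condition who test negative."""
    return tn / (tn + fp)


# TRUS for ovarian endometriosis: counts taken from the table.
tp, fp, fn, tn = 32, 13, 4, 43
print(f"Sens = {sensitivity(tp, fn):.2f}")  # 32 / 36
print(f"Spec = {specificity(tn, fp):.2f}")  # 43 / 56
```

Rounded to two decimal places, these ratios give 0.89 and 0.77, the values reported for TRUS in ovarian endometriosis.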

Summary of findings 2. Summary of findings table: surgical mapping of endometriosis to specific anatomical sites

Review question

What is the diagnostic performance of the imaging tests in mapping deep endometriotic lesions in the pelvis at specific anatomical sites?

USL endometriosis

RVS endometriosis

Vaginal wall endometriosis

POD obliteration

Anterior DIE

RS/Bowel endometriosis

Importance

Ability to diagnose DIE at specific anatomical sites at preoperative assessment helps optimise planning of surgery or guides referral to the most appropriate practice, with the potential to improve treatment outcomes

Participants

Women of reproductive age with suspected endometriosis or specifically suspected DIE

Settings

Hospitals (public or private of any level): outpatient clinics (general gynaecology, reproductive medicine, pelvic pain) and/or radiology departments

Reference standard

Visualisation of endometriosis at surgery (laparoscopy or laparotomy) with or without histological confirmation

Study design

Cross‐sectional of 'single‐gate' design (n = 33); prospective enrolment; 1 study could assess more than 1 test and/or more than 1 site of endometriosis

Risk of bias and applicability concerns

Overall judgement

Poor quality of most studies (only 1 study had 'low risk' assessment in all 4 domains; Thomeer 2014)

Patient selection bias

High risk: 16 studies; unclear risk: 6 studies; low risk: 11 studies

Index test interpretation bias

High risk: 8 studies; unclear risk: 4 studies; low risk: 21 studies

Reference standard interpretation bias

High risk: 14 studies; unclear risk: 14 studies; low risk: 5 studies

Flow and timing selection bias

High risk: 8 studies; unclear risk: 3 studies; low risk: 22 studies

Applicability concerns

Concerns regarding patient selection: high concern ‐ 0 studies, unclear concern ‐ 0 studies, low concern ‐ 33 studies

Concerns regarding index test: high concern ‐ 0 studies, unclear concern ‐ 0 studies, low concern ‐ 33 studies

Concerns regarding reference standard: high concern ‐ 0 studies, unclear concern ‐ 0 studies, low concern ‐ 33 studies

Diagnostic thresholds

Replacement test: sensitivity ≥ 94%; specificity ≥ 79%

SnNout triage test: sensitivity ≥ 95%; specificity ≥ 50%

SpPin triage test: sensitivity ≥ 50%; specificity ≥ 95%

Approaching criteria for 1 of the above tests: diagnostic estimates within 5% of set thresholds

| Target condition | Test | N of participants; N of studies; N of data sets | Pooled estimates (95% CI) | Outcome: true positives (endometriosis) | Outcome: false positives (incorrectly classified as endometriosis) | Outcome: false negatives (incorrectly classified as disease‐free) | Outcome: true negatives (disease‐free) | Implications |
|---|---|---|---|---|---|---|---|---|
| USL endometriosis (11 studies, 997 participants) | TVUS | 751 participants in 7 studies | Sens = 0.64 (0.50 to 0.79); Spec = 0.97 (0.93 to 1.00) | 136 | 18 | 63 | 534 | Meets the criteria for a SpPin triage test. Observation: TVUS‐BP (1 study) demonstrated the highest diagnostic accuracy |
| | TRUS | 232 participants in 2 studies | Sens = 0.52 (0.29 to 0.74); Spec = 0.94 (0.86 to 1.00) | 48 | 8 | 45 | 131 | Approaches the criteria for a SpPin triage test. Wide CIs. Insufficient evidence to allow meaningful conclusions |
| | MRI | 199 participants in 4 studies; 221 participants in 5 data sets | Sens = 0.86 (0.80 to 0.92); Spec = 0.84 (0.68 to 1.00) | 136 | 13 | 22 | 50 | Criteria for a triage test not met. Wide CIs. Observation: 3.0T MRI (1 out of 2 studies) demonstrated the highest diagnostic accuracy |
| RVS endometriosis (12 studies, 1215 participants) | TVUS | 983 participants in 10 studies; 1073 participants in 11 data sets | Sens = 0.88 (0.82 to 0.94); Spec = 1.00 (0.98 to 1.00) | 263 | 10 | 59 | 741 | Meets the criteria for a SpPin triage test. Observation: TVUS‐BP (3 studies) and RWC‐TVS (1 study) demonstrated the highest diagnostic accuracy |
| | TRUS | 232 participants in 2 studies | Sens = 0.78 (0.51 to 1.00); Spec = 0.96 (0.89 to 1.00) | 35 | 8 | 10 | 179 | Meets the criteria for a SpPin triage test. Insufficient evidence to allow meaningful conclusions |
| | MRI | 288 participants in 3 studies | Sens = 0.81 (0.70 to 0.93); Spec = 0.86 (0.78 to 0.95) | 96 | 23 | 22 | 147 | Criteria for a triage test not met. Insufficient evidence to allow meaningful conclusions |
| Vaginal wall endometriosis (10 studies, 981 participants) | TVUS | 679 participants in 6 studies | Sens = 0.57 (0.21 to 0.94); Spec = 0.99 (0.96 to 1.00) | 70 | 11 | 44 | 554 | Meets the criteria for a SpPin triage test. Wide CIs. Observation: tg‐TVUS (1 study) demonstrated the highest diagnostic accuracy |
| | TRUS | 232 participants in 2 studies | Sens = 0.39 (0.08 to 0.70); Spec = 1.00 (1.00 to 1.00) | 18 | 0 | 28 | 186 | Criteria for a triage test not met. Wide CIs. Insufficient evidence to allow meaningful conclusions |
| | MRI | 248 participants in 4 studies; 271 participants in 5 data sets | Sens = 0.77 (0.67 to 0.88); Spec = 0.97 (0.92 to 1.00) | 48 | 11 | 14 | 198 | Meets the criteria for a SpPin triage test. Observation: 3.0T MRI (1 study) and 3D‐MRI demonstrated the highest diagnostic accuracy |
| POD obliteration (11 studies, 909 participants) | TVUS | 755 participants in 6 studies | Sens = 0.83 (0.77 to 0.88); Spec = 0.97 (0.95 to 0.99) | 152 | 17 | 32 | 554 | Meets the criteria for a SpPin triage test. Observation: TVUS‐BP (2 studies) demonstrated the highest diagnostic accuracy |
| | MRI | 154 participants in 5 studies; 177 participants in 6 data sets | Sens = 0.90 (0.76 to 1.00); Spec = 0.98 (0.89 to 1.00) | 84 | 3 | 12 | 78 | Meets the criteria for a SpPin triage test and approaches the criteria for a SnNout triage test. Observation: 3.0T MRI (3 studies) demonstrated the highest diagnostic accuracy |
| Anterior DIE (3 studies, 330 participants) | TVUS | 289 participants in 2 studies | Sens = 0.41 (0.00 to 0.81); Spec = 1.00 (1.00 to 1.00) | 11 | 0 | 16 | 262 | Criteria for a triage test not met. Wide CIs. Insufficient evidence to allow meaningful conclusions |
| | MRI | 41 participants in 1 study | Not available [a] | 6 | 0 | 2 | 33 | Insufficient evidence to allow meaningful conclusions |
| Rectosigmoid endometriosis (21 studies, 2222 participants) | TVUS | 1616 participants in 14 studies; 1817 participants in 15 data sets | Sens = 0.90 (0.82 to 0.97); Spec = 0.96 (0.94 to 0.99) | 648 | 47 | 100 | 1022 | Meets the criteria for a SpPin triage test and approaches the criteria for a SnNout triage test. Observation: TVUS‐BP (2 studies) and RWC‐TVS (2 studies) demonstrated the highest diagnostic accuracy |
| | TRUS | 330 participants in 4 studies | Sens = 0.91 (0.85 to 0.98); Spec = 0.96 (0.91 to 1.00) | 137 | 8 | 13 | 172 | Meets the criteria for a SpPin triage test and approaches the criteria for a SnNout triage test |
| | MRI | 612 participants in 6 studies; 635 participants in 7 data sets | Sens = 0.92 (0.86 to 0.99); Spec = 0.96 (0.93 to 0.98) | 352 | 11 | 30 | 242 | Meets the criteria for a SpPin triage test and approaches the criteria for a SnNout triage test. Observation: MRI jelly method (1 study) and 3.0T MRI (1 study) demonstrated the highest diagnostic accuracy |
| | MDCT‐e | 389 participants in 3 studies | Sens = 0.98 (0.94 to 1.00); Spec = 0.99 (0.97 to 1.00) | 241 | 1 | 6 | 141 | Meets the criteria for a SpPin test and a SnNout triage test. Insufficient evidence to allow meaningful conclusions |
| | DCBE | 106 participants in 2 studies | Sens = 0.56 (0.32 to 0.80); Spec = 0.77 (0.41 to 1.00) | 45 | 6 | 35 | 20 | Criteria for a triage test not met. Wide CIs. Insufficient evidence to allow meaningful conclusions |
| Bowel (ileum ‐ rectum) endometriosis (4 studies, 412 participants) | TVUS | 314 participants in 3 studies | Sens = 0.89 (0.81 to 0.97); Spec = 0.96 (0.91 to 1.00) | 135 | 7 | 16 | 156 | Meets the criteria for a SpPin triage test. Observation: TVUS, non‐modified method (1 study) demonstrated highest diagnostic estimates. Insufficient evidence to allow meaningful conclusions |
| | TRUS | 134 participants in 1 study | Not available [b] | 72 | 0 | 3 | 59 | Insufficient evidence to allow meaningful conclusions |
| | MDCT‐e | 194 participants in 2 studies | Sens = 0.98 (0.92 to 1.00); Spec = 1.00 (1.00 to 1.00) | 124 | 0 | 3 | 67 | Meets the criteria for a SpPin test and a SnNout triage test. Insufficient evidence to allow meaningful conclusions |

[a] For MRI in anterior DIE, diagnostic estimates were sensitivity = 0.75 (0.35 to 0.97); specificity = 1.00 (0.89 to 1.00)

[b] For TRUS in bowel endometriosis, diagnostic estimates were sensitivity = 0.96 (0.89 to 0.99); specificity = 1.00 (0.94 to 1.00)

Table 1. Staging of endometriosis, rASRM classification

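| | Endometriosis | < 1 cm | 1‐3 cm | > 3 cm |
|---|---|---|---|---|
| Peritoneum | Superficial | 1 | 2 | 4 |
| | Deep | 2 | 4 | 6 |
| Ovary | R Superficial | 1 | 2 | 4 |
| | R Deep | 4 | 16 | 20 |
| | L Superficial | 1 | 2 | 4 |
| | L Deep | 4 | 16 | 20 |

Posterior cul‐de‐sac obliteration: Partial = 4; Complete = 40

| | Adhesions | < 1/3 enclosure | 1/3‐2/3 enclosure | > 2/3 enclosure |
|---|---|---|---|---|
| Ovary | R Filmy | 1 | 2 | 4 |
| | R Dense | 4 | 8 | 16 |
| | L Filmy | 1 | 2 | 4 |
| | L Dense | 4 | 8 | 16 |
| Tube | R Filmy | 1 | 2 | 4 |
| | R Dense | 4 [a] | 8 [a] | 16 |
| | L Filmy | 1 | 2 | 4 |
| | L Dense | 4 [a] | 8 [a] | 16 |

[a] If the fimbriated end of the fallopian tube is completely enclosed, change the point assignment to 16.

Source: American Society for Reproductive Medicine 1997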
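
To illustrate how the points in this table are used, the sketch below sums the points for a hypothetical patient and maps the total to a stage. The stage cut‐offs (1 to 5, 6 to 15, 16 to 40, more than 40) are not shown in the table itself; they are taken from the rASRM 1997 scheme and should be checked against that source. The function name `rasrm_stage` and the example patient are hypothetical.

```python
def rasrm_stage(total_points):
    """Map a total rASRM score to a stage (cut-offs assumed from rASRM 1997)."""
    if total_points < 1:
        raise ValueError("a total of at least 1 point is needed for staging")
    if total_points <= 5:
        return "I (minimal)"
    if total_points <= 15:
        return "II (mild)"
    if total_points <= 40:
        return "III (moderate)"
    return "IV (severe)"


# Hypothetical patient, with points read from the table:
#   deep peritoneal lesion, 1-3 cm              -> 4
#   deep lesion of the right ovary, > 3 cm      -> 20
#   partial posterior cul-de-sac obliteration   -> 4
total = 4 + 20 + 4
print(total, rasrm_stage(total))
```

In this example the total is 28 points, which falls in the moderate range under the assumed cut‐offs.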

Table 2. Index tests ‐ description and common abbreviations

Test name as presented in the review

Description

Alternative names presented in the included studies

MRI tests

MRI (magnetic resonance imaging)

Equipment: 1.5 Tesla magnet device with a parallel or phased array body or pelvic coil for signal excitation and reception

Participants’ preparation: Fasting for 3‐6 hours before the test and/or bowel preparation with oral laxatives was described by some investigators; an intravenous injection of anti‐peristaltic agent at the outset of the examination to decrease bowel peristalsis; supine position. Some groups performed MRI with a full bladder to correct the angle of the ante‐flexed uterus; some groups described introducing ultrasonographic gel (~ 50 to 60 mL) into the vaginal canal to distend the vaginal fornices

Protocol: Imaging is performed in the axial plane with or without sagittal or coronal planes. Different types of sequences allow the same tissue to be imaged in various ways, and combinations of sequences reveal important diagnostic information about the tissue in question. The imaging parameters (section thickness, field of view (FOV), matrix size) vary between protocols. Images are documented on radiographic film and in digital files and analysed at a workstation

  • MRI T1/T2‐w

(conventional T1‐/T2‐weighted)

The protocol includes axial spin‐echo or gradient echo T1‐weighted (T1‐w) images followed by fast spin‐echo (FSE)/turbo spin‐echo (TSE) images or fast relaxation fast‐spin echo (FR‐FSE) T2‐w images

MRI;

CSE (conventional spin echo)

  • MRI fat‐suppressed

(T1‐weighted)

Protocol includes T1‐w imaging using chemical fat suppression, which aids in the differentiation of lipid and haemorrhagic pathologies. Fat suppression is a generic term that includes various techniques to suppress the signal from normal adipose tissue to reduce chemical shift artefact and can be achieved by various methods. This is commonly a part of the MRI protocol and is rarely used in isolation

Fat‐saturated MRI

  • MRI T1/T2‐w + fat‐suppressed/ Gd

(T1‐/T2‐weighted with fat‐suppression contrast enhanced)

Protocol includes gradient echo T1 images with and without fat suppression followed by FSE or FR‐FSE T2‐w images before and after intravenous injection of the paramagnetic contrast agent gadolinium

MRI;

CSE/TIFS (conventional spin echo in combination with T1‐w fat‐suppressed)

CSE/TIFS/Gd‐TIFS (conventional spin echo in combination with T1‐w fat‐suppressed and gadolinium‐enhanced TIFS)

  • MRI 'jelly method'

Protocol involves pretreatment of participants for MRI by simultaneous injection of ultrasonographic gel into the vagina (~ 50 mL) and into the rectum (150 mL gel 50% diluted with water). Another technique involves introduction of 300‐400 mL of diluted ultrasonographic gel (1:8 dilution) for rectosigmoid distension without use of intravaginal gel

MRI‐e (magnetic resonance enema)

3D‐MRI (3‐dimensional MRI)

Protocol includes 3D coronal single‐slab (containing all the slices) MRI, entitled 'CUBE' with FSE T2‐w images. The technique involves using variable flip angle refocusing, auto‐calibrating, 2D accelerated parallel imaging and nonlinear view ordering to produce high‐resolution volumetric image data sets and to reduce imaging time by using multi‐planar reformations

3.0T MRI

Equipment: 3.0 Tesla Magnetom system with a multi‐channel phased‐array surface body‐coil

Participants’ preparation: Fasting for 3 hours before the test was reported by some but not all studies; intravenous injection of anti‐peristaltic agent at the outset of the examination to decrease bowel peristalsis; administration of a negative super‐paramagnetic oral contrast agent to reduce signal intensity of the bowel. Examination is performed with a full bladder in a ‘feet first’ supine position

Protocol: combination of all or some of the following sequences: T‐w FSE, 2D‐T2‐w FR‐FSE/FSE, 3D‐T2‐w FR‐FSE CUBE, 3D‐T1‐w fat‐suppressed and/or LAVA‐flex (liver imaging with volume acceleration‐flexible) sequences. MRI images are acquired in multiple scan planes, in particular axial, coronal and sagittal planes of the pelvis and sacral para‐coronal plane. Contrast agent (gadolinium) is administered in selected cases. Total acquisition time ~ 20 min without or 30‐40 min with contrast injection

Ultrasound tests

TVUS

(transvaginal ultrasonography)

Equipment: any of the commercially available ultrasound machines equipped with a wide‐band high‐resolution vaginal transducer (brands of scanners and frequencies of transducers vary between studies)

Participants' preparation: Examination is performed in a dorsal lithotomy position with empty or half‐full bladder; no bowel preparation is routinely required

Protocol: An ultrasound gel is applied to the tip of the transducer probe to create a lubricating, acoustically correct interface with the tissue. Scans are obtained by inserting the transducer (protected by disposable thin cover) into the vagina, followed by sequential movement of the probe within the vaginal canal to allow systematic evaluation of pelvic structures (uterus and adnexal regions; attention paid to the ovaries, pouch of Douglas, vesicouterine pouch and uterosacral ligament). The technique involves longitudinal, transverse and angled movements of the probe with sliding up and down, back and forward to obtain both longitudinal and transversal scans of pelvic structures. Examination protocols vary between studies. Each examination is interpreted in real time and can be documented in printed photographs

TVS

'transvaginal ultrasound'

'transvaginal sonography'

  • TVUS‐BP

(transvaginal ultrasonography with bowel preparation)

Examination consists of TVUS combined with bowel preparation including the following: low‐residue diet for 1‐3 days, oral laxative on the eve of the examination, rectal enema within an hour before the examination or a combination of the above

  • RWC‐TVS

(rectal water contrast transvaginal ultrasonography)

Examination consists of TVUS combined with bowel preparation and instillation of water contrast in rectum during TVUS; procedure does not require general anaesthesia

Protocol: After the transducer is introduced into the vagina, a flexible thin catheter (18‐28 Ch) with a rubber balloon is inserted into the rectal lumen up to 20 cm from the anus (gel infused with lidocaine is used to facilitate passage of the catheter). Rectal water contrast of 100 to 300 mL of warm saline solution is instilled inside the balloon under ultrasonographic guidance to provide high‐definition images of the rectal wall and its layers. Back flow of the solution is prevented by placement of a Klemmer forceps on the catheter. Images are obtained before, during and after saline injection

'transvaginal sonography with water‐contrast in the rectum'

'water‐contrast in the rectum during transvaginal ultrasonography'

  • SVG

(sonovaginography)

Examination consists of TVUS combined with the introduction of saline solution or gel to the vagina to create an acoustical window between the transvaginal probe and surrounding structures and to distend the vaginal walls, permitting enhanced visualisation of pelvic structures

Protocol: Procedure involves introduction of a Foley catheter into the vagina followed by insertion of the transvaginal probe with further injection of 200‐400 mL of saline through the catheter by the assistant. To prevent reflux of saline solution from the vagina, the vaginal canal is closed with the operator’s hand. Alternative method involves placement of 20 mL of ultrasound gel into the posterior vaginal fornix with a plastic syringe, followed by insertion of a transvaginal probe. Reported procedure time ranges from 30 to 45 minutes

'transvaginal sonography and acoustic window with intravaginal gel'

  • tg‐TVUS

(tenderness‐guided TVUS)

Examination consists of TVUS combined with particular attention to the tender points evoked during examination

Protocol: A larger amount of ultrasound gel (approximately 12 mL instead of the usual 4 mL) is introduced into the probe cover to create a stand‐off for visualisation of the near‐field area. The probe is inserted gently to avoid the risk of squeezing out the gel. After the initial sonographic evaluation, the participant is asked to inform the operator about the onset and site of any tenderness experienced during probe pressure within the posterior fornix. When tenderness is evoked, the sliding movement is stopped, and particular attention is paid to the painful site via gentle pressure with the probe’s tip to detect endometriosis lesions. Reported procedure time is 15 to 20 minutes in cases of suspected lesions, but less time when the examination is negative

  • 3D‐TVUS

(3‐dimensional transvaginal ultrasonography)

Equipment: An ultrasound scanner equipped with 3D/4D imaging modes and a wide‐band high resolution volume transvaginal transducer. The method enables the acquisition of ultrasonographic volumetric data that can be assessed off‐line; in most institutions used as an adjunct to 2D US

Protocol: region‐of‐interest (ROI) is identified using a B‐mode scan and a transvaginal volume transducer. During the volumetric scan, the transducer carries out a series of parallel scans of varying speeds focusing on the ROI. The anatomical ROI is visualised on the monitor as a graphic containing the 3 orthogonal planes. During volumetric scans, the investigator adopts some expedients such as positioning the probe near the anatomical ROI and reducing or eliminating participant movements. The volume obtained is stored on a hard disk and displayed later using dedicated software

  • Introital 3D‐US

(introital 3‐dimensional ultrasound)

Examination is performed with the transducer placed on the perineum against the symphysis pubis (firmly but without causing significant discomfort). To acquire a correct volume, the symphysis pubis, urethra, vagina, and rectum should be visualised in the same image. Gain is adjusted and focal area is set to the region of interest, with the sweep angle set at 90 or 120 degrees to produce a multi‐planar image in 3 planes: longitudinal, transverse and coronal

TRUS (transrectal ultrasonography)

Equipment: An ultrasound scanner with a 2‐dimensional axial and sagittal convex high‐frequency probe with or without a rigid linear probe or a flexible endoscope with lateral view and a convex high frequency echo probe

Participants' preparation: A low‐residue diet for 3 days before the examination with or without laxatives and/or rectal enema is reported in some but not all studies; several groups described using general or local anaesthesia for the procedure, and some groups used no analgesia

Protocol: A gel‐filled rubber sheath or water‐filled balloon is placed over the tip of the transducer to obtain better visibility. The transducer is inserted into the rectum and is advanced until the midline image of the cervix is visualised in the longitudinal view. Pelvic structures are evaluated by moving the transducer along its longitudinal axis and rotating it 130° to 140° along the main axis in both axial and longitudinal planes. Alternative technique includes insertion of the flexible probe into the sigmoid colon, over the aortic bifurcation and/or the upper part of the body of the uterus, with subsequent slow withdrawal, allowing optimum imaging of rectal and sigmoid colon walls/pelvic structures, with instillation of water into the intestinal lumen and alternating use of several frequencies (e.g. 5, 7.5, 12 MHz)

TRS (transrectal sonography)

Tr EUS (transrectal endoscopic ultrasonography)

RES (rectal endoscopic sonography)

REU (rectal endoscopic ultrasonography)

Other tests

MDCT‐e

(multi‐detector computerised tomography enema)

Equipment: multi‐detector computed tomograph, which has a 2‐dimensional array of detector elements that permits CT scanners to acquire multiple slices or sections simultaneously and greatly increase the speed of CT image acquisition (unlike the linear array of detector elements used in typical conventional and helical CT scanners)

Participants’ preparation: low‐residue diet for 3 days and bowel preparation with an oral laxative the day before the examination; intravenous injection of anti‐peristaltic agent during the test

Protocol: Colonic distension is performed by introducing about 2000 mL of water at 37 °C with the participant in the left lateral decubitus position. All participants receive an intravenous injection of iodine‐containing contrast. Participants are scanned in supine position from the dome of the diaphragm to the pubic symphysis in the portal phase (40 seconds after the arterial peak). Scan parameters (collimation, rotation time, tube voltage, effective mAs) differ between studies. Estimated radiation exposure is calculated by the scanner using CT dose index and is saved to the dose report. Both axial plane and multi‐planar reconstructions (sagittal and coronal) are evaluated. Images are reviewed at a workstation

MSCTe (multi‐slice computed tomography combined with colon distension by water enteroclysis)

'Water enema CT'

18FDG‐PET (fluorodeoxyglucose positron emission tomography)

Equipment: PET‐computed tomograph

Participants’ preparation: Fasting for at least 6 hours before the test; 18FDG (a glucose analogue) injection 60 min before the test

Protocol: Acquisition is performed with the participant in supine position, from mid‐thigh to the base of the skull. No iodine‐based contrast is administered. CT parameters reported in a single included study are 120 kV, 120 mA, pitch 1.5:1, speed 15 mm/rot. The PET element operates in 2D mode for 4 minutes per bed position. Attenuation correction is based on CT data

DCBE (double‐contrast barium enema)

Equipment: motorised tilting radiographic table and standard equipment for fluoroscopic and radiological examination

Participants’ preparation: low‐residue diet for 1‐3 days before the examination with or without oral laxatives the day before the procedure; an anti‐peristaltic agent is administered intravenously at the outset of the examination to decrease bowel peristalsis

Protocol: The procedure is performed in 2 steps to obtain double contrast and involves change of participant positions to ensure detailed visualisation of all intestinal segments. Barium sulphate contrast (600 to 800 mL) is instilled into the rectum under gravity pressure with the participant in the left lateral decubitus position. Once the barium reaches the hepatic flexure, the colon is drained by gravity to remove as much barium as possible from the rectal ampulla without completely clearing the rectosigmoid colon of barium. Room air is then gently insufflated into the colon. Sequential views of the bowel are obtained. Each colonic segment is viewed in detail on spot radiographs and in magnification images. The procedure lasts 15 to 20 minutes

Table 2. Index tests ‐ description and common abbreviations
Table 3. Target conditions ‐ types and anatomical distribution of endometriosis

Type of endometriosis

Description

Main clinical types of endometriosis

Pelvic endometriosis

Endometriotic lesions, deep or superficial, located at any site in pelvic/abdominal cavity: on the peritoneum, fallopian tubes, ovaries, uterus, bowel, bladder or PODa

Ovarian endometriosis

Ovarian cysts lined by endometrial tissue (endometrioma)

DIEb

Deep endometriotic lesions extending more than 5 mm under the peritoneum located at any site of pelvic/abdominal cavity

Subtypes of deep endometriosis per anatomical localisationc

Posterior DIE

Deep endometriotic lesions involve ≥ 1 site of the posterior pelvic compartment (USLd, RVSe, vaginal wall, bowel) and/or obliterate PODa

USLd endometriosis

Endometriotic lesions infiltrate uterosacral ligaments unilaterally or bilaterally

RVSe endometriosis

Deep endometriotic implants infiltrate the retroperitoneal area between posterior wall of vaginal mucosa and anterior wall of rectal muscularis

Vaginal endometriosisf

Endometriotic lesions infiltrate vaginal wall, particularly posterior vaginal fornix

PODa obliteration

Defined when the peritoneum of the PODa is only partially or no longer visible during surgery, and occurs as a result of adhesion formation; can be partial or complete, respectively

Bowel endometriosis

Endometriotic lesions infiltrating at least the muscular layer of the intestinal wall (ileum to rectum); predominantly affects rectosigmoid colon

Rectosigmoid endometriosis

Endometriotic lesions infiltrating at least the muscular layer of the rectosigmoid colon; the most common form of bowel endometriosis

Anterior DIE

Deep endometriotic lesions located at any site of the anterior pelvic compartment (bladder ± anterior pouch)

Rare types of endometriosis (not included in this review)

Bladder endometriosis

Endometriotic lesions infiltrating bladder muscularis propria

Ureteral endometriosis

Endometriotic lesions involving ureters

Extrapelvic/Atypical endometriosis

Rare types of endometriosis involving various sites outside pelvic cavity, such as:

CNS: cerebral endometriosis, extradural spinal endometriosis

Thoracic: pleural endometriosis, pulmonary endometriosis, diaphragmatic endometriosis

Abdominal: hepatic endometriosis, renal endometriosis, appendix endometriosis, pancreas endometriosis

Musculoskeletal: abdominal wall endometriosis, umbilical endometriosis, pyramidalis muscle endometriosis, inguinal endometriosis, canal of Nuck endometriosis

Perianal endometriosis, perineal endometriosis, extrapelvic endometriosis of sciatic nerve

Subcutaneous endometriosis, operative scar endometriosis

aPOD: pouch of Douglas

bDIE: deep infiltrating endometriosis
cDefinitions of subtypes of DIE are adopted from Bazot 2007c. Additional definitions presented in the literature include 'Rectovaginal endometriosis (RVE)' defined as DIE that infiltrates the vagina, rectum and RVS and obliterates POD (Martin 2001) or 'deep retrocervical endometriosis' defined as involvement of USL, torus uterini, posterior vaginal fornix and/or RVS by endometriotic lesions (Abrao 2007).

dUSL: uterosacral ligament

eRVS: rectovaginal septum

fVaginal endometriosis is also defined as 'lesions infiltrating the anterior rectovaginal pouch, posterior vaginal fornix and retroperitoneal area between anterior rectovaginal pouch and posterior vaginal fornix' (Chapron 2003a)

Table 4. Application of the QUADAS‐2 tool for assessment of methodological quality of included studies

Domain 1 ‐ Patient selection

Description

Describe methods of participant selection and characteristics of the included population

Type of bias assessed

Selection bias, spectrum bias

Review question

Women of reproductive age with clinically suspected endometriosis (symptoms, clinical examination ± presence of pelvic mass), scheduled for surgical exploration of pelvic/abdominal cavity for confirmation of the diagnosis ± treatment

Information collected

Study objectives, study population, selection (inclusion/exclusion criteria), study design, clinical presentation, age, number of enrolled and number available for analysis, setting, place and period of the study

Signalling question

Was a consecutive or random sample of participants enrolled?

Yes

If a consecutive sample or a random sample of eligible participants was included in the study

No

If a non‐consecutive sample or a non‐random sample of eligible participants was included in the study

Unclear

All studies that did not specify enrolment as a consecutive or random sample of patients were classified as 'no'; therefore none of the included studies were classified as 'unclear'

Signalling question

Did the study avoid inappropriate exclusions?

Yes

If all participants with suspected endometriosis were included, with an exception for those not able to undergo an index test (e.g. virgins or genital tract anomalies for transvaginal imaging, claustrophobia for MRI) or unfit for surgery

No

If the study selected participants on the basis of particular clinical features (e.g. only suspected bowel involvement, were referred for treatment of deep endometriosis) or excluded participants with any co‐morbidities, other than specified above

Unclear

If the study did not provide clear definition of selection (inclusion/exclusion) criteria and 'no' judgement was not applicable

Signalling question

Was a two‐gate design avoided?

Yes

If the study had a single set of inclusion criteria, defined by the clinical presentation (i.e. only participants in whom the target condition is suspected) ‐ a ‘single‐gate design’

No

If the study had more than 1 set of inclusion criteria with respect to clinical presentation (i.e. participants suspected of target condition, participants with alternative diagnosis in whom the target condition would not be suspected in clinical practice) ‐ a 'two‐gate' study design

Unclear

If it was unclear whether a 'two‐gate design' was avoided

Risk of bias

Could the selection of participants have introduced bias?

High

If 'no' classification for any of the above 3 questions

Low

If 'yes' classification for 3 questions above

Unclear

If 'unclear' classification for any of the above questions and 'high risk' judgement was not applicable

Concerns about applicability

Are there concerns that included participants do not match the review question?

High

If the study population differed from the population defined in the review question in terms of demographic features and co‐morbidity (e.g. studies with multiple sets of inclusion criteria with respect to clinical presentation, including healthy controls or alternative diagnosis controls that would not have undergone index test in real practice). We excluded studies in which participants were not in the reproductive age group, and most included studies were of 'single‐gate' design; therefore, we expected few studies to be classified as 'high concern'

Low

If the study included only a clinically relevant population that would have undergone index test in real practice

Unclear

If this information was unclear

Domain 2 ‐ Index test

Description

Describe the index test, how it was conducted and interpreted

Type of bias assessed

Test review bias, clinical review bias, interobserver variation bias

Review question

Any type of imaging modality

Information collected

Index test name, description of positive case definition by index test as reported, examiners (numbers, level of expertise, blinding), interobserver variability, conflicts of interest

Signalling question

Were the index test results interpreted without knowledge of results of the reference standard?

Yes

We excluded studies in which the index test was performed retrospectively after execution of the reference standard; therefore, all included studies were classified 'yes'

No

Unclear

Signalling question

Did the study provide a clear prespecified definition of what was considered to be a 'positive' result of the index test?

Yes

If study provided clear definition of positive findings, and this was defined before execution/interpretation of index test

No

If definition of the positive result was not provided, or if study described findings derived from the index test and not defined before its execution

Unclear

If it was unclear whether the criteria were prespecified

Signalling question

Was the index test performed by a single operator or interpreted by consensus in a joint session?

Yes

If test was performed/interpreted by single operator or was interpreted after collegial discussion of the case

No

If test was performed/interpreted by various operators for different participants

Unclear

If this information was unclear

Signalling question

Were the same clinical data available when the index test results were interpreted as would be available when the test is used in practice?

Yes

If operators performing/interpreting the test were aware of suspected endometriosis and/or of the clinical history but were not aware of results of other imaging tests or of a previous diagnosis of endometriosis, including the results of previous surgeries

No

If operators performing/interpreting the test were informed of previously or recently surgically diagnosed endometriosis or were not blinded to results of other imaging tests or tests raising suspicion for endometriosis

Unclear

If this information was unclear

Risk of bias

Could the conduct or interpretation of the index test have introduced bias?

High

If 'no' classification for any of the above 4 questions

Low

If 'yes' classification for all the above 4 questions, or if 'unclear' classification for question 'Was the index test performed by a single operator or interpreted by consensus in a joint session?' and 'yes' classification for the remaining 3 questions

Unclear

If 'unclear' classification at least for the question 'Did the study provide a clear pre‐specified definition of what was considered to be a 'positive' result of index test?' or for the question 'Were the same clinical data available when the index test results were interpreted as would be available when the test is used in practice?' and 'high risk' judgement was not applicable

Concerns about applicability

Are there concerns that the index test, its conduct or its interpretation differs from the review question?

High

We did not consider studies in which index tests other than imaging modalities were included (or that excluded information on other index tests reported in addition to imaging modalities), or in which the index test looked at other target conditions not specified in the review (e.g. studies aimed at classifying pelvic masses as benign and malignant); therefore, none of the included studies was classified as 'high concern'

Low

We considered all types of imaging modalities as eligible; therefore, all included studies were classified as 'low concern', as anticipated

Unclear

Only studies with sufficient information on the index test were included; therefore, none of the included studies was classified as 'unclear concern'

Domain 3 ‐ Reference standard

Description

Describe the reference standard and how it was conducted and interpreted

Type of bias assessed

Verification bias, bias in estimation of diagnostic accuracy due to inadequate reference standard

Review question

Target condition ‐ pelvic endometriosis, ovarian endometriosis, DIE overall or at specific anatomical sites; Reference standard ‐ visualisation of endometriosis at surgery (laparoscopy or laparotomy) with or without histological confirmation

Information collected

Target condition, prevalence of target condition in the sample, reference standard, description of positive case definition by reference test as reported, examiners (numbers, level of expertise, blinding)

Signalling question

Is the reference standard likely to correctly classify the target condition?

Yes

If the study reported at least 1 of the following: surgical procedure described in sufficient detail and/or criteria for positive reference standard stated and/or the procedure was performed by the team with a high level of expertise in diagnosis/surgical treatment of the target condition

No

If the reference standard did not classify the target condition correctly; in the light of inclusion criteria and the nature of the reference standard, no studies were classified as 'no' for this item

Unclear

If information on execution of the reference standard or its interpretation or on operators was unclear

Signalling question

Were reference standard results interpreted without knowledge of results of the index tests?

Yes

If operators performing the reference test were unaware of the results of the index test

No

If operators performing the reference test were aware of the results of the index test

Unclear

If this information was unclear

Risk of bias

Could the reference standard, its conduct or its interpretation have introduced bias?

High

If 'no' classification for either of the above 2 questions

Low

If 'yes' classification for both of the above 2 questions

Unclear

If 'unclear' classification for either of the above 2 questions and 'high risk' judgement was not applicable

Concerns about applicability

Are there concerns that the target condition as defined by the reference standard does not match the question?

High

We excluded studies in which participants did not undergo surgery for diagnosis of endometriosis; therefore, none of the included studies were classified as 'high concern'

Low

In the light of inclusion criteria, all studies were classified as 'low concern', as anticipated

Unclear

Only studies in which laparoscopy/laparotomy served as a reference test were included; therefore, no included studies were classified as 'unclear concern'

Domain 4 ‐ Flow and timing

Description

Describe any participants who did not receive the index tests or the reference standard, or who were excluded from the 2 × 2 table; describe the interval and any interventions between index tests and the reference standard

Type of bias assessed

Disease progression bias, bias of diagnostic performance due to missing data

Review question

Less than 12‐month interval between index test and reference standard ‐ endometriosis may progress over time, so we chose an arbitrary interval of 12 months as the acceptable time between the index test and surgical confirmation of the diagnosis

Information collected

Time interval between index test and reference standard, withdrawals (overall number reported and whether they were explained)

Signalling question

Was there an appropriate interval between index test and reference standard?

Yes

If time interval was reported and was less than 12 months

No

We excluded all studies for which the time interval was longer than 12 months; therefore, no included studies were classified as 'no' for this item

Unclear

If the time interval was not stated clearly but the study authors' description allowed one to assume that the interval was reasonably short

Signalling question

Did all participants receive the same reference standard?

Yes

In the light of inclusion criteria, all studies were classified as 'yes' for this item, as anticipated

No

Unclear

Signalling question

Were all participants included in the analysis?

Yes

If all participants were included in the analysis, or if participants were excluded because they did not meet inclusion criteria or if withdrawals were less than 5% of the enrolled population (arbitrarily selected cut‐off)

No

If any participants were excluded from the analysis because of uninterpretable results, because of inability to undergo index test or reference standard or for unclear reasons

Unclear

No studies were classified as 'unclear' for this item

Risk of bias

Could the participant flow have introduced bias?

High

If 'no' classification for any of the above 3 questions

Low

If 'yes' classification for all of the above 3 questions

Unclear

If 'unclear' classification for any of the above 3 questions and 'high risk' judgement was not applicable
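
The risk‐of‐bias rules stated in Table 4 can be sketched in code. The following Python example is only an illustration of the decision logic as worded in the table; the function names (domain_risk, index_test_risk) and the string labels are hypothetical and are not part of the review or of the QUADAS‐2 tool. For Domain 2, the table does not say how an 'unclear' answer to the blinding question would be handled (all included studies were classified 'yes'), so treating it as 'unclear' risk is an assumption.

```python
from typing import Sequence

YES, NO, UNCLEAR = "yes", "no", "unclear"


def domain_risk(answers: Sequence[str]) -> str:
    """Generic rule described for Domains 1, 3 and 4.

    'high' if any signalling question is 'no', 'low' if all are 'yes',
    otherwise 'unclear' (at least one 'unclear' and no 'no').
    """
    if NO in answers:
        return "high"
    if all(a == YES for a in answers):
        return "low"
    return "unclear"


def index_test_risk(blinded: str, prespecified: str,
                    single_operator: str, same_clinical_data: str) -> str:
    """Rule described for Domain 2 (index test).

    An 'unclear' answer to the single-operator question alone does not
    raise the risk above 'low'. Returning 'unclear' when the blinding
    question is 'unclear' is an assumption, not stated in the table.
    """
    if NO in (blinded, prespecified, single_operator, same_clinical_data):
        return "high"
    others_all_yes = all(a == YES for a in
                         (blinded, prespecified, same_clinical_data))
    if others_all_yes and single_operator in (YES, UNCLEAR):
        return "low"
    return "unclear"


if __name__ == "__main__":
    print(domain_risk([YES, YES, YES]))
    print(domain_risk([YES, UNCLEAR, NO]))
    print(index_test_risk(YES, YES, UNCLEAR, YES))
```

Keeping the Domain 2 rule in a separate function makes its one exception explicit rather than hiding it in the generic rule.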

Table Tests. Data tables by test

Test | No. of studies | No. of participants
1 TVUS pelvic | 5 | 1222
2 TVUS ovarian | 8 | 765
3 TVUS DIE | 3 | 282
4 TVUS posterior DIE | 7 | 853
5 TVUS* posterior DIE | 2 | 248
6 TVUS USL | 7 | 751
7 TVUS RVS | 10 | 983
8 TVUS* RVS | 1 | 90
9 TVUS vaginal | 6 | 679
10 TVUS POD | 6 | 755
11 TVUS anterior DIE | 2 | 289
12 TVUS rectosigmoid | 14 | 1615
13 TVUS* rectosigmoid | 1 | 202
14 TVUS bowel [ileum ‐ rectum] | 3 | 314
15 TRUS ovarian | 1 | 92
16 TRUS USL | 2 | 232
17 TRUS RVS | 2 | 232
18 TRUS vaginal | 2 | 232
19 TRUS rectosigmoid | 4 | 330
20 TRUS bowel [ileum ‐ rectum] | 1 | 134
21 MRI pelvic | 7 | 303
22 MRI* pelvic | 2 | 62
23 MRI** pelvic | 1 | 31
24 MRI ovarian | 3 | 179
25 MRI DIE | 4 | 212
26 MRI posterior DIE | 2 | 54
27 MRI* posterior DIE | 1 | 23
28 MRI USL | 4 | 198
29 MRI* USL | 1 | 23
30 MRI RVS | 3 | 288
31 MRI vaginal | 4 | 248
32 MRI* vaginal | 1 | 23
33 MRI POD | 5 | 154
34 MRI* POD | 1 | 23
35 MRI anterior DIE | 1 | 41
36 MRI rectosigmoid | 6 | 612
37 MRI* rectosigmoid | 1 | 23
38 MDCT‐e rectosigmoid | 3 | 389
39 MDCT‐e bowel [ileum ‐ rectum] | 2 | 194
40 18FDG PET–CT pelvic | 1 | 10
41 DCBE DIE | 1 | 69
42 DCBE rectosigmoid | 2 | 106
43 MRI pelvic1 | 1 | 35
