TY  - JOUR
T1  - Development of the AGREE II, part 1: performance, usefulness and areas for improvement
JF  - Canadian Medical Association Journal
JO  - CMAJ
SP  - 1045
LP  - 1052
DO  - 10.1503/cmaj.091714
VL  - 182
IS  - 10
AU  - Melissa C. Brouwers
AU  - Michelle E. Kho
AU  - George P. Browman
AU  - Jako S. Burgers
AU  - Francoise Cluzeau
AU  - Gene Feder
AU  - Béatrice Fervers
AU  - Ian D. Graham
AU  - Steven E. Hanna
AU  - Julie Makarski
Y1  - 2010/07/13
UR  - http://www.cmaj.ca/content/182/10/1045.abstract
N2  - Background: We undertook research to improve the AGREE instrument, a tool used to evaluate guidelines. We tested a new seven-point scale, evaluated the usefulness of the original items in the instrument, investigated evidence to support shorter, tailored versions of the tool, and identified areas for improvement. Method: We report on one component of a larger study that used a mixed design with four factors (user type, clinical topic, guideline and condition). For the analysis reported in this article, we asked participants to read a guideline and use the AGREE items to evaluate it based on a seven-point scale, to complete three outcome measures related to adoption of the guideline, and to provide feedback on the instrument’s usefulness and how to improve it. Results: Guideline developers gave lower-quality ratings than did clinicians or policy-makers. Five of six domains were significant predictors of participants’ outcome measures (p < 0.05). All domains and items were rated as useful by stakeholders (mean scores > 4.0) with no significant differences by user type (p > 0.05). Internal consistency ranged between 0.64 and 0.89. Inter-rater reliability was satisfactory. We received feedback on how to improve the instrument. Interpretation: Quality ratings of the AGREE domains were significant predictors of outcome measures associated with guideline adoption: guideline endorsements, overall intentions to use guidelines, and overall quality of guidelines. All AGREE items were assessed as useful in determining whether a participant would use a guideline. No clusters of items were found more useful by some users than others. The measurement properties of the seven-point scale were promising. These data contributed to the refinements and release of the AGREE II.
ER  - 