RT Journal Article
SR Electronic
T1 Development of the AGREE II, part 1: performance, usefulness and areas for improvement
JF Canadian Medical Association Journal
JO CMAJ
FD Canadian Medical Association
SP 1045
OP 1052
DO 10.1503/cmaj.091714
VO 182
IS 10
A1 Melissa C. Brouwers
A1 Michelle E. Kho
A1 George P. Browman
A1 Jako S. Burgers
A1 Francoise Cluzeau
A1 Gene Feder
A1 Béatrice Fervers
A1 Ian D. Graham
A1 Steven E. Hanna
A1 Julie Makarski
A1 for the AGREE Next Steps Consortium
YR 2010
UL http://www.cmaj.ca/content/182/10/1045.abstract
AB Background: We undertook research to improve the AGREE instrument, a tool used to evaluate guidelines. We tested a new seven-point scale, evaluated the usefulness of the original items in the instrument, investigated evidence to support shorter, tailored versions of the tool, and identified areas for improvement. Method: We report on one component of a larger study that used a mixed design with four factors (user type, clinical topic, guideline and condition). For the analysis reported in this article, we asked participants to read a guideline and use the AGREE items to evaluate it based on a seven-point scale, to complete three outcome measures related to adoption of the guideline, and to provide feedback on the instrument’s usefulness and how to improve it. Results: Guideline developers gave lower-quality ratings than did clinicians or policy-makers. Five of six domains were significant predictors of participants’ outcome measures (p < 0.05). All domains and items were rated as useful by stakeholders (mean scores > 4.0) with no significant differences by user type (p > 0.05). Internal consistency ranged between 0.64 and 0.89. Inter-rater reliability was satisfactory. We received feedback on how to improve the instrument.
Interpretation: Quality ratings of the AGREE domains were significant predictors of outcome measures associated with guideline adoption: guideline endorsements, overall intentions to use guidelines, and overall quality of guidelines. All AGREE items were assessed as useful in determining whether a participant would use a guideline. No clusters of items were found more useful by some users than others. The measurement properties of the seven-point scale were promising. These data contributed to the refinements and release of the AGREE II.