The PRISMA 2020 statement: an updated guideline for reporting systematic reviews

Joanne E McKenzie, associate professor 1,
Patrick M Bossuyt, professor 2,
Isabelle Boutron, professor 3,
Tammy C Hoffmann, professor 4,
Cynthia D Mulrow, professor 5,
Larissa Shamseer, doctoral student 6,
Jennifer M Tetzlaff, research product specialist 7,
Elie A Akl, professor 8,
Sue E Brennan, senior research fellow 1,
Roger Chou, professor 9,
Julie Glanville, associate director 10,
Jeremy M Grimshaw, professor 11,
Asbjørn Hróbjartsson, professor 12,
Manoj M Lalu, associate scientist and assistant professor 13,
Tianjing Li, associate professor 14,
Elizabeth W Loder, professor 15,
Evan Mayo-Wilson, associate professor 16,
Steve McDonald, senior research fellow 1,
Luke A McGuinness, research associate 17,
Lesley A Stewart, professor and director 18,
James Thomas, professor 19,
Andrea C Tricco, scientist and associate professor 20,
Vivian A Welch, associate professor 21,
Penny Whiting, associate professor 17,
David Moher, director and professor 22
1 School of Public Health and Preventive Medicine, Monash University, Melbourne, Australia
2 Department of Clinical Epidemiology, Biostatistics and Bioinformatics, Amsterdam University Medical Centres, University of Amsterdam, Amsterdam, Netherlands
3 Université de Paris, Centre of Epidemiology and Statistics (CRESS), Inserm, F 75004 Paris, France
4 Institute for Evidence-Based Healthcare, Faculty of Health Sciences and Medicine, Bond University, Gold Coast, Australia
5 University of Texas Health Science Center at San Antonio, San Antonio, Texas, USA; Annals of Internal Medicine
6 Knowledge Translation Program, Li Ka Shing Knowledge Institute, Toronto, Canada; School of Epidemiology and Public Health, Faculty of Medicine, University of Ottawa, Ottawa, Canada
7 Evidence Partners, Ottawa, Canada
8 Clinical Research Institute, American University of Beirut, Beirut, Lebanon; Department of Health Research Methods, Evidence, and Impact, McMaster University, Hamilton, Ontario, Canada
9 Department of Medical Informatics and Clinical Epidemiology, Oregon Health & Science University, Portland, Oregon, USA
10 York Health Economics Consortium (YHEC Ltd), University of York, York, UK
11 Clinical Epidemiology Program, Ottawa Hospital Research Institute, Ottawa, Canada; School of Epidemiology and Public Health, University of Ottawa, Ottawa, Canada; Department of Medicine, University of Ottawa, Ottawa, Canada
12 Centre for Evidence-Based Medicine Odense (CEBMO) and Cochrane Denmark, Department of Clinical Research, University of Southern Denmark, Odense, Denmark; Open Patient data Exploratory Network (OPEN), Odense University Hospital, Odense, Denmark
13 Department of Anesthesiology and Pain Medicine, The Ottawa Hospital, Ottawa, Canada; Clinical Epidemiology Program, Blueprint Translational Research Group, Ottawa Hospital Research Institute, Ottawa, Canada; Regenerative Medicine Program, Ottawa Hospital Research Institute, Ottawa, Canada
14 Department of Ophthalmology, School of Medicine, University of Colorado Denver, Denver, Colorado, United States; Department of Epidemiology, Johns Hopkins Bloomberg School of Public Health, Baltimore, Maryland, USA
15 Division of Headache, Department of Neurology, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts, USA; Head of Research, The BMJ, London, UK
16 Department of Epidemiology and Biostatistics, Indiana University School of Public Health-Bloomington, Bloomington, Indiana, USA
17 Population Health Sciences, Bristol Medical School, University of Bristol, Bristol, UK
18 Centre for Reviews and Dissemination, University of York, York, UK
19 EPPI-Centre, UCL Social Research Institute, University College London, London, UK
20 Li Ka Shing Knowledge Institute of St. Michael's Hospital, Unity Health Toronto, Toronto, Canada; Epidemiology Division of the Dalla Lana School of Public Health and the Institute of Health Management, Policy, and Evaluation, University of Toronto, Toronto, Canada; Queen's Collaboration for Health Care Quality Joanna Briggs Institute Centre of Excellence, Queen's University, Kingston, Canada
21 Methods Centre, Bruyère Research Institute, Ottawa, Ontario, Canada; School of Epidemiology and Public Health, Faculty of Medicine, University of Ottawa, Ottawa, Canada
22 Centre for Journalology, Clinical Epidemiology Program, Ottawa Hospital Research Institute, Ottawa, Canada; School of Epidemiology and Public Health, Faculty of Medicine, University of Ottawa, Ottawa, Canada
Correspondence to: M J Page matthew.page@monash.edu
Accepted 4 January 2021

The Preferred Reporting Items for Systematic reviews and Meta-Analyses (PRISMA) statement, published in 2009, was designed to help systematic reviewers transparently report why the review was done, what the authors did, and what they found. Over the past decade, advances in systematic review methodology and terminology have necessitated an update to the guideline. The PRISMA 2020 statement replaces the 2009 statement and includes new reporting guidance that reflects advances in methods to identify, select, appraise, and synthesise studies. The structure and presentation of the items have been modified to facilitate implementation. In this article, we present the PRISMA 2020 27-item checklist, an expanded checklist that details reporting recommendations for each item, the PRISMA 2020 abstract checklist, and the revised flow diagrams for original and updated reviews.

Systematic reviews serve many critical roles. They can provide syntheses of the state of knowledge in a field, from which future research priorities can be identified; they can address questions that otherwise could not be answered by individual studies; they can identify problems in primary research that should be rectified in future studies; and they can generate or evaluate theories about how or why phenomena occur. Systematic reviews therefore generate various types of knowledge for different users of reviews (such as patients, healthcare providers, researchers, and policy makers). 1 2 To ensure a systematic review is valuable to users, authors should prepare a transparent, complete, and accurate account of why the review was done, what they did (such as how studies were identified and selected) and what they found (such as characteristics of contributing studies and results of meta-analyses). Up-to-date reporting guidance facilitates authors achieving this. 3

The Preferred Reporting Items for Systematic reviews and Meta-Analyses (PRISMA) statement published in 2009 (hereafter referred to as PRISMA 2009) 4 5 6 7 8 9 10 is a reporting guideline designed to address poor reporting of systematic reviews. 11 The PRISMA 2009 statement comprised a checklist of 27 items recommended for reporting in systematic reviews and an “explanation and elaboration” paper 12 13 14 15 16 providing additional reporting guidance for each item, along with exemplars of reporting. The statement has been widely endorsed and adopted, as evidenced by its co-publication in multiple journals, citation in over 60 000 reports (Scopus, August 2020), endorsement from almost 200 journals and systematic review organisations, and adoption in various disciplines. Evidence from observational studies suggests that use of the PRISMA 2009 statement is associated with more complete reporting of systematic reviews, 17 18 19 20 although more could be done to improve adherence to the guideline. 21

Many innovations in the conduct of systematic reviews have occurred since publication of the PRISMA 2009 statement. For example, technological advances have enabled the use of natural language processing and machine learning to identify relevant evidence, 22 23 24 methods have been proposed to synthesise and present findings when meta-analysis is not possible or appropriate, 25 26 27 and new methods have been developed to assess the risk of bias in results of included studies. 28 29 Evidence on sources of bias in systematic reviews has accrued, culminating in the development of new tools to appraise the conduct of systematic reviews. 30 31 Terminology used to describe particular review processes has also evolved, as in the shift from assessing “quality” to assessing “certainty” in the body of evidence. 32 In addition, the publishing landscape has transformed, with multiple avenues now available for registering and disseminating systematic review protocols, 33 34 disseminating reports of systematic reviews, and sharing data and materials, such as preprint servers and publicly accessible repositories. Capturing these advances in the reporting of systematic reviews necessitated an update to the PRISMA 2009 statement.

Summary points

To ensure a systematic review is valuable to users, authors should prepare a transparent, complete, and accurate account of why the review was done, what they did, and what they found

The PRISMA 2020 statement provides updated reporting guidance for systematic reviews that reflects advances in methods to identify, select, appraise, and synthesise studies

The PRISMA 2020 statement consists of a 27-item checklist, an expanded checklist that details reporting recommendations for each item, the PRISMA 2020 abstract checklist, and revised flow diagrams for original and updated reviews

Development of PRISMA 2020

A complete description of the methods used to develop PRISMA 2020 is available elsewhere. 35 We identified PRISMA 2009 items that were often reported incompletely by examining the results of studies investigating the transparency of reporting of published reviews. 17 21 36 37 We identified possible modifications to the PRISMA 2009 statement by reviewing 60 documents providing reporting guidance for systematic reviews (including reporting guidelines, handbooks, tools, and meta-research studies). 38 These reviews of the literature were used to inform the content of a survey with suggested possible modifications to the 27 items in PRISMA 2009 and possible additional items. Respondents were asked whether they believed we should keep each PRISMA 2009 item as is, modify it, or remove it, and whether we should add each additional item. Systematic review methodologists and journal editors were invited to complete the online survey (110 of 220 invited responded). We discussed proposed content and wording of the PRISMA 2020 statement, as informed by the review and survey results, at a 21-member, two-day, in-person meeting in September 2018 in Edinburgh, Scotland. Throughout 2019 and 2020, we circulated an initial draft and five revisions of the checklist and explanation and elaboration paper to co-authors for feedback. In April 2020, we invited 22 systematic reviewers who had expressed interest in providing feedback on the PRISMA 2020 checklist to share their views (via an online survey) on the layout and terminology used in a preliminary version of the checklist. Feedback was received from 15 individuals and considered by the first author, and any revisions deemed necessary were incorporated before the final version was approved and endorsed by all co-authors.

The PRISMA 2020 statement

Scope of the guideline

The PRISMA 2020 statement has been designed primarily for systematic reviews of studies that evaluate the effects of health interventions, irrespective of the design of the included studies. However, the checklist items are applicable to reports of systematic reviews evaluating other interventions (such as social or educational interventions), and many items are applicable to systematic reviews with objectives other than evaluating interventions (such as evaluating aetiology, prevalence, or prognosis). PRISMA 2020 is intended for use in systematic reviews that include synthesis (such as pairwise meta-analysis or other statistical synthesis methods) or do not include synthesis (for example, because only one eligible study is identified). The PRISMA 2020 items are relevant for mixed-methods systematic reviews (which include quantitative and qualitative studies), but reporting guidelines addressing the presentation and synthesis of qualitative data should also be consulted. 39 40 PRISMA 2020 can be used for original systematic reviews, updated systematic reviews, or continually updated (“living”) systematic reviews. However, for updated and living systematic reviews, there may be some additional considerations that need to be addressed. Where there is relevant content from other reporting guidelines, we reference these guidelines within the items in the explanation and elaboration paper 41 (such as PRISMA-Search 42 in items 6 and 7, Synthesis without meta-analysis (SWiM) reporting guideline 27 in item 13d). Box 1 includes a glossary of terms used throughout the PRISMA 2020 statement.

Box 1: Glossary of terms

Systematic review —A review that uses explicit, systematic methods to collate and synthesise findings of studies that address a clearly formulated question 43

Statistical synthesis —The combination of quantitative results of two or more studies. This encompasses meta-analysis of effect estimates (described below) and other methods, such as combining P values, calculating the range and distribution of observed effects, and vote counting based on the direction of effect (see McKenzie and Brennan 25 for a description of each method)

Meta-analysis of effect estimates —A statistical technique used to synthesise results when study effect estimates and their variances are available, yielding a quantitative summary of results 25

Outcome —An event or measurement collected for participants in a study (such as quality of life, mortality)

Result —The combination of a point estimate (such as a mean difference, risk ratio, or proportion) and a measure of its precision (such as a confidence/credible interval) for a particular outcome

Report —A document (paper or electronic) supplying information about a particular study. It could be a journal article, preprint, conference abstract, study register entry, clinical study report, dissertation, unpublished manuscript, government report, or any other document providing relevant information

Record —The title or abstract (or both) of a report indexed in a database or website (such as a title or abstract for an article indexed in Medline). Records that refer to the same report (such as the same journal article) are “duplicates”; however, records that refer to reports that are merely similar (such as a similar abstract submitted to two different conferences) should be considered unique.

Study —An investigation, such as a clinical trial, that includes a defined group of participants and one or more interventions and outcomes. A “study” might have multiple reports. For example, reports could include the protocol, statistical analysis plan, baseline characteristics, results for the primary outcome, results for harms, results for secondary outcomes, and results for additional mediator and moderator analyses
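The distinction between records, reports, and studies in the glossary can be illustrated with a small data model. The following is a minimal sketch, not part of the PRISMA 2020 statement: the class names, the `report_id` field, and the example entries are hypothetical, and it assumes that each record has already been linked to the report it describes.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class Record:
    """A title/abstract entry retrieved from one database or website."""
    source: str      # hypothetical source label, e.g. "Medline"
    report_id: str   # assumed identifier of the report this record points to
    title: str


@dataclass
class Study:
    """An investigation; it may be described by several reports."""
    name: str
    report_ids: set[str] = field(default_factory=set)


def deduplicate(records: list[Record]) -> list[Record]:
    """Keep the first record seen for each report.

    Records that refer to the same report are duplicates. Records for
    merely similar reports have different report_ids and stay unique.
    """
    seen: set[str] = set()
    unique: list[Record] = []
    for record in records:
        if record.report_id not in seen:
            seen.add(record.report_id)
            unique.append(record)
    return unique


# Hypothetical search results for one trial.
records = [
    Record("Medline", "article-1", "Trial of drug A"),
    Record("Embase", "article-1", "Trial of drug A"),         # same report: duplicate
    Record("Conference X", "abstract-1", "Trial of drug A"),  # similar, different report
    Record("Conference Y", "abstract-2", "Trial of drug A"),  # similar, different report
]
unique_records = deduplicate(records)

# One study can have multiple reports (an article and two conference abstracts).
study = Study("Drug A trial", {r.report_id for r in unique_records})
```

In this sketch the two conference abstracts remain separate records even though their titles match, which mirrors the glossary's advice that records for merely similar reports should be considered unique.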

PRISMA 2020 is not intended to guide systematic review conduct, for which comprehensive resources are available. 43 44 45 46 However, familiarity with PRISMA 2020 is useful when planning and conducting systematic reviews to ensure that all recommended information is captured. PRISMA 2020 should not be used to assess the conduct or methodological quality of systematic reviews; other tools exist for this purpose. 30 31 Furthermore, PRISMA 2020 is not intended to inform the reporting of systematic review protocols, for which a separate statement is available (PRISMA for Protocols (PRISMA-P) 2015 statement 47 48 ). Finally, extensions to the PRISMA 2009 statement have been developed to guide reporting of network meta-analyses, 49 meta-analyses of individual participant data, 50 systematic reviews of harms, 51 systematic reviews of diagnostic test accuracy studies, 52 and scoping reviews 53 ; for these types of reviews we recommend authors report their review in accordance with the recommendations in PRISMA 2020 along with the guidance specific to the extension.

How to use PRISMA 2020

The PRISMA 2020 statement (including the checklists, explanation and elaboration, and flow diagram) replaces the PRISMA 2009 statement, which should no longer be used. Box 2 summarises noteworthy changes from the PRISMA 2009 statement. The PRISMA 2020 checklist includes seven sections with 27 items, some of which include sub-items ( table 1 ). A checklist for journal and conference abstracts for systematic reviews is included in PRISMA 2020. This abstract checklist is an update of the 2013 PRISMA for Abstracts statement, 54 reflecting new and modified content in PRISMA 2020 ( table 2 ). A template PRISMA flow diagram is provided, which can be modified depending on whether the systematic review is original or updated ( fig 1 ).

Box 2: Noteworthy changes to the PRISMA 2009 statement

Inclusion of the abstract reporting checklist within PRISMA 2020 (see item #2 and table 2 ).

Movement of the ‘Protocol and registration’ item from the start of the Methods section of the checklist to a new Other section, with addition of a sub-item recommending authors describe amendments to information provided at registration or in the protocol (see item #24a-24c).

Modification of the ‘Search’ item to recommend authors present full search strategies for all databases, registers and websites searched, not just at least one database (see item #7).

Modification of the ‘Study selection’ item in the Methods section to emphasise the reporting of how many reviewers screened each record and each report retrieved, whether they worked independently, and if applicable, details of automation tools used in the process (see item #8).

Addition of a sub-item to the ‘Data items’ item recommending authors report how outcomes were defined, which results were sought, and methods for selecting a subset of results from included studies (see item #10a).

Splitting of the ‘Synthesis of results’ item in the Methods section into six sub-items recommending authors describe: the processes used to decide which studies were eligible for each synthesis; any methods required to prepare the data for synthesis; any methods used to tabulate or visually display results of individual studies and syntheses; any methods used to synthesise results; any methods used to explore possible causes of heterogeneity among study results (such as subgroup analysis, meta-regression); and any sensitivity analyses used to assess robustness of the synthesised results (see item #13a-13f).

Addition of a sub-item to the ‘Study selection’ item in the Results section recommending authors cite studies that might appear to meet the inclusion criteria, but which were excluded, and explain why they were excluded (see item #16b).

Splitting of the ‘Synthesis of results’ item in the Results section into four sub-items recommending authors: briefly summarise the characteristics and risk of bias among studies contributing to the synthesis; present results of all statistical syntheses conducted; present results of any investigations of possible causes of heterogeneity among study results; and present results of any sensitivity analyses (see item #20a-20d).

Addition of new items recommending authors report methods for and results of an assessment of certainty (or confidence) in the body of evidence for an outcome (see items #15 and #22).

Addition of a new item recommending authors declare any competing interests (see item #26).

Addition of a new item recommending authors indicate whether data, analytic code and other materials used in the review are publicly available and if so, where they can be found (see item #27).

Table 1: PRISMA 2020 item checklist

Table 2: PRISMA 2020 for Abstracts checklist

Fig 1: PRISMA 2020 flow diagram template for systematic reviews. The new design is adapted from flow diagrams proposed by Boers, 55 Mayo-Wilson et al. 56 and Stovold et al. 57 The boxes in grey should only be completed if applicable; otherwise they should be removed from the flow diagram. Note that a “report” could be a journal article, preprint, conference abstract, study register entry, clinical study report, dissertation, unpublished manuscript, government report or any other document providing relevant information.
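The numbers entered in a flow diagram should be internally consistent, and a short script can help check this before the diagram is finalised. The following is a minimal sketch under stated assumptions: the stage names paraphrase boxes in a flow diagram for an original review that searches databases and registers only, the function name and parameters are hypothetical, and all counts are invented for illustration.

```python
def flow_counts(identified, removed_before_screening, excluded_at_screening,
                not_retrieved, excluded_reports):
    """Derive the count at each stage of a simple flow diagram.

    excluded_reports maps each exclusion reason to a number of reports.
    Raises ValueError if any derived count would be negative.
    """
    screened = identified - removed_before_screening
    sought = screened - excluded_at_screening
    assessed = sought - not_retrieved
    included = assessed - sum(excluded_reports.values())
    counts = {
        "records_screened": screened,
        "reports_sought_for_retrieval": sought,
        "reports_assessed_for_eligibility": assessed,
        "reports_included": included,
    }
    if min(counts.values()) < 0:
        raise ValueError(f"Inconsistent flow diagram counts: {counts}")
    return counts


# Invented numbers, for illustration only.
counts = flow_counts(
    identified=1200,
    removed_before_screening=200,   # e.g. duplicate records
    excluded_at_screening=900,
    not_retrieved=10,
    excluded_reports={"wrong population": 50, "wrong outcome": 25},
)
```

With these invented inputs, 1000 records are screened, 100 reports are sought, 90 are assessed, and 15 are included. Reviews that also search other sources, or that update an earlier review, would need additional inputs.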


We recommend authors refer to PRISMA 2020 early in the writing process, because prospective consideration of the items may help to ensure that all the items are addressed. To help keep track of which items have been reported, the PRISMA statement website ( http://www.prisma-statement.org/ ) includes fillable templates of the checklists to download and complete (also available in the data supplement on bmj.com). We have also created a web application that allows users to complete the checklist via a user-friendly interface 58 (available at https://prisma.shinyapps.io/checklist/ and adapted from the Transparency Checklist app 59 ). The completed checklist can be exported to Word or PDF. Editable templates of the flow diagram can also be downloaded from the PRISMA statement website.
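Authors who prefer to track reporting in code rather than in the fillable templates could keep a simple mapping from checklist item to manuscript location. The following is a minimal sketch and not an official PRISMA tool: it covers only the new or modified items named in box 2 rather than the full checklist, and the function name and example locations are hypothetical.

```python
# Item numbers taken from box 2 (new or modified items only, not all 27 items).
ITEMS_FROM_BOX_2 = [
    "2", "7", "8", "10a",
    "13a", "13b", "13c", "13d", "13e", "13f",
    "15", "16b",
    "20a", "20b", "20c", "20d",
    "22", "24a", "24b", "24c", "26", "27",
]


def unreported(locations, items=ITEMS_FROM_BOX_2):
    """Return the items that have no location recorded in the manuscript.

    locations maps an item number to where it is reported, for example
    "Methods, p4" or "Supplementary file 1". Blank entries count as missing.
    """
    return [item for item in items if not locations.get(item, "").strip()]


# Hypothetical draft in which only two of the items have been addressed so far.
locations = {
    "2": "Abstract, p1",
    "7": "Supplementary file 1",
    "27": "",  # placeholder entry: data availability statement not yet written
}
missing = unreported(locations)
```

Here `missing` lists every item other than 2 and 7, including item 27, whose entry is still blank.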

We have prepared an updated explanation and elaboration paper, in which we explain why reporting of each item is recommended and present bullet points that detail the reporting recommendations (which we refer to as elements). 41 The bullet-point structure is new to PRISMA 2020 and has been adopted to facilitate implementation of the guidance. 60 61 An expanded checklist, which comprises an abridged version of the elements presented in the explanation and elaboration paper, with references and some examples removed, is available in the data supplement on bmj.com. Consulting the explanation and elaboration paper is recommended if further clarity or information is required.

Journals and publishers might impose word and section limits, and limits on the number of tables and figures allowed in the main report. In such cases, if the relevant information for some items already appears in a publicly accessible review protocol, referring to the protocol may suffice. Alternatively, placing detailed descriptions of the methods used or additional results (such as for less critical outcomes) in supplementary files is recommended. Ideally, supplementary files should be deposited to a general-purpose or institutional open-access repository that provides free and permanent access to the material (such as Open Science Framework, Dryad, figshare). A reference or link to the additional information should be included in the main report. Finally, although PRISMA 2020 provides a template for where information might be located, the suggested location should not be seen as prescriptive; the guiding principle is to ensure the information is reported.

Use of PRISMA 2020 has the potential to benefit many stakeholders. Complete reporting allows readers to assess the appropriateness of the methods, and therefore the trustworthiness of the findings. Presenting and summarising characteristics of studies contributing to a synthesis allows healthcare providers and policy makers to evaluate the applicability of the findings to their setting. Describing the certainty in the body of evidence for an outcome and the implications of findings should help policy makers, managers, and other decision makers formulate appropriate recommendations for practice or policy. Complete reporting of all PRISMA 2020 items also facilitates replication and review updates, as well as inclusion of systematic reviews in overviews (of systematic reviews) and guidelines, so teams can leverage work that is already done and decrease research waste. 36 62 63

We updated the PRISMA 2009 statement by adapting the EQUATOR Network’s guidance for developing health research reporting guidelines. 64 We evaluated the reporting completeness of published systematic reviews, 17 21 36 37 reviewed the items included in other documents providing guidance for systematic reviews, 38 surveyed systematic review methodologists and journal editors for their views on how to revise the original PRISMA statement, 35 discussed the findings at an in-person meeting, and prepared this document through an iterative process. Our recommendations are informed by the reviews and survey conducted before the in-person meeting, theoretical considerations about which items facilitate replication and help users assess the risk of bias and applicability of systematic reviews, and co-authors’ experience with authoring and using systematic reviews.

Various strategies to increase the use of reporting guidelines and improve reporting have been proposed. They include educators introducing reporting guidelines into graduate curricula to promote good reporting habits of early career scientists 65 ; journal editors and regulators endorsing use of reporting guidelines 18 ; peer reviewers evaluating adherence to reporting guidelines 61 66 ; journals requiring authors to indicate where in their manuscript they have adhered to each reporting item 67 ; and authors using online writing tools that prompt complete reporting at the writing stage. 60 Multi-pronged interventions, which combine more than one of these strategies, may be more effective (such as completion of checklists coupled with editorial checks). 68 However, of 31 interventions proposed to increase adherence to reporting guidelines, the effects of only 11 have been evaluated, mostly in observational studies at high risk of bias due to confounding. 69 It is therefore unclear which strategies should be used. Future research might explore barriers and facilitators to the use of PRISMA 2020 by authors, editors, and peer reviewers, design interventions that address the identified barriers, and evaluate those interventions using randomised trials. To inform possible revisions to the guideline, it would also be valuable to conduct think-aloud studies 70 to understand how systematic reviewers interpret the items, and reliability studies to identify items that are interpreted inconsistently.

We encourage readers to submit evidence that informs any of the recommendations in PRISMA 2020 (via the PRISMA statement website: http://www.prisma-statement.org/ ). To enhance accessibility of PRISMA 2020, several translations of the guideline are under way (see available translations at the PRISMA statement website). We encourage journal editors and publishers to raise awareness of PRISMA 2020 (for example, by referring to it in journal “Instructions to authors”), endorse its use, advise editors and peer reviewers to evaluate submitted systematic reviews against the PRISMA 2020 checklists, and make changes to journal policies to accommodate the new reporting recommendations. We recommend existing PRISMA extensions 47 49 50 51 52 53 71 72 be updated to reflect PRISMA 2020 and advise developers of new PRISMA extensions to use PRISMA 2020 as the foundation document.

We anticipate that the PRISMA 2020 statement will benefit authors, editors, and peer reviewers of systematic reviews, and different users of reviews, including guideline developers, policy makers, healthcare providers, patients, and other stakeholders. Ultimately, we hope that uptake of the guideline will lead to more transparent, complete, and accurate reporting of systematic reviews, thus facilitating evidence based decision making.

Acknowledgments

We dedicate this paper to the late Douglas G Altman and Alessandro Liberati, whose contributions were fundamental to the development and implementation of the original PRISMA statement.

We thank the following contributors who completed the survey to inform discussions at the development meeting: Xavier Armoiry, Edoardo Aromataris, Ana Patricia Ayala, Ethan M Balk, Virginia Barbour, Elaine Beller, Jesse A Berlin, Lisa Bero, Zhao-Xiang Bian, Jean Joel Bigna, Ferrán Catalá-López, Anna Chaimani, Mike Clarke, Tammy Clifford, Ioana A Cristea, Miranda Cumpston, Sofia Dias, Corinna Dressler, Ivan D Florez, Joel J Gagnier, Chantelle Garritty, Long Ge, Davina Ghersi, Sean Grant, Gordon Guyatt, Neal R Haddaway, Julian PT Higgins, Sally Hopewell, Brian Hutton, Jamie J Kirkham, Jos Kleijnen, Julia Koricheva, Joey SW Kwong, Toby J Lasserson, Julia H Littell, Yoon K Loke, Malcolm R Macleod, Chris G Maher, Ana Marušic, Dimitris Mavridis, Jessie McGowan, Matthew DF McInnes, Philippa Middleton, Karel G Moons, Zachary Munn, Jane Noyes, Barbara Nußbaumer-Streit, Donald L Patrick, Tatiana Pereira-Cenci, Ba’ Pham, Bob Phillips, Dawid Pieper, Michelle Pollock, Daniel S Quintana, Drummond Rennie, Melissa L Rethlefsen, Hannah R Rothstein, Maroeska M Rovers, Rebecca Ryan, Georgia Salanti, Ian J Saldanha, Margaret Sampson, Nancy Santesso, Rafael Sarkis-Onofre, Jelena Savović, Christopher H Schmid, Kenneth F Schulz, Guido Schwarzer, Beverley J Shea, Paul G Shekelle, Farhad Shokraneh, Mark Simmonds, Nicole Skoetz, Sharon E Straus, Anneliese Synnot, Emily E Tanner-Smith, Brett D Thombs, Hilary Thomson, Alexander Tsertsvadze, Peter Tugwell, Tari Turner, Lesley Uttley, Jeffrey C Valentine, Matt Vassar, Areti Angeliki Veroniki, Meera Viswanathan, Cole Wayant, Paul Whaley, and Kehu Yang. We thank the following contributors who provided feedback on a preliminary version of the PRISMA 2020 checklist: Jo Abbott, Fionn Büttner, Patricia Correia-Santos, Victoria Freeman, Emily A Hennessy, Rakibul Islam, Amalia (Emily) Karahalios, Kasper Krommes, Andreas Lundh, Dafne Port Nascimento, Davina Robson, Catherine Schenck-Yglesias, Mary M Scott, Sarah Tanveer and Pavel Zhelnov. 
We thank Abigail H Goben, Melissa L Rethlefsen, Tanja Rombey, Anna Scott, and Farhad Shokraneh for their helpful comments on the preprints of the PRISMA 2020 papers. We thank Edoardo Aromataris, Stephanie Chang, Toby Lasserson and David Schriger for their helpful peer review comments on the PRISMA 2020 papers.

Contributors: JEM and DM are joint senior authors. MJP, JEM, PMB, IB, TCH, CDM, LS, and DM conceived this paper and designed the literature review and survey conducted to inform the guideline content. MJP conducted the literature review, administered the survey and analysed the data for both. MJP prepared all materials for the development meeting. MJP and JEM presented proposals at the development meeting. All authors except for TCH, JMT, EAA, SEB, and LAM attended the development meeting. MJP and JEM took and consolidated notes from the development meeting. MJP and JEM led the drafting and editing of the article. JEM, PMB, IB, TCH, LS, JMT, EAA, SEB, RC, JG, AH, TL, EMW, SM, LAM, LAS, JT, ACT, PW, and DM drafted particular sections of the article. All authors were involved in revising the article critically for important intellectual content. All authors approved the final version of the article. MJP is the guarantor of this work. The corresponding author attests that all listed authors meet authorship criteria and that no others meeting the criteria have been omitted.

Funding: There was no direct funding for this research. MJP is supported by an Australian Research Council Discovery Early Career Researcher Award (DE200101618) and was previously supported by an Australian National Health and Medical Research Council (NHMRC) Early Career Fellowship (1088535) during the conduct of this research. JEM is supported by an Australian NHMRC Career Development Fellowship (1143429). TCH is supported by an Australian NHMRC Senior Research Fellowship (1154607). JMT is supported by Evidence Partners Inc. JMG is supported by a Tier 1 Canada Research Chair in Health Knowledge Transfer and Uptake. MML is supported by The Ottawa Hospital Anaesthesia Alternate Funds Association and a Faculty of Medicine Junior Research Chair. TL is supported by funding from the National Eye Institute (UG1EY020522), National Institutes of Health, United States. LAM is supported by a National Institute for Health Research Doctoral Research Fellowship (DRF-2018-11-ST2-048). ACT is supported by a Tier 2 Canada Research Chair in Knowledge Synthesis. DM is supported in part by a University Research Chair, University of Ottawa. The funders had no role in considering the study design or in the collection, analysis, interpretation of data, writing of the report, or decision to submit the article for publication.

Competing interests: All authors have completed the ICMJE uniform disclosure form at http://www.icmje.org/conflicts-of-interest/ and declare: EL is head of research for the BMJ; MJP is an editorial board member for PLOS Medicine; ACT is an associate editor and MJP, TL, EMW, and DM are editorial board members for the Journal of Clinical Epidemiology; DM and LAS were editors in chief, LS, JMT, and ACT are associate editors, and JG is an editorial board member for Systematic Reviews. None of these authors were involved in the peer review process or decision to publish. TCH has received personal fees from Elsevier outside the submitted work. EMW has received personal fees from the American Journal for Public Health, for which he is the editor for systematic reviews. VW is editor in chief of the Campbell Collaboration, which produces systematic reviews, and co-convenor of the Campbell and Cochrane equity methods group. DM is chair of the EQUATOR Network, IB is adjunct director of the French EQUATOR Centre and TCH is co-director of the Australasian EQUATOR Centre, which advocates for the use of reporting guidelines to improve the quality of reporting in research articles. JMT received salary from Evidence Partners, creator of DistillerSR software for systematic reviews; Evidence Partners was not involved in the design or outcomes of the statement, and the views expressed solely represent those of the author.

Provenance and peer review: Not commissioned; externally peer reviewed.

Patient and public involvement: Patients and the public were not involved in this methodological research. We plan to disseminate the research widely, including to community participants in evidence synthesis organisations.

This is an Open Access article distributed in accordance with the terms of the Creative Commons Attribution (CC BY 4.0) license, which permits others to distribute, remix, adapt and build upon this work, for commercial use, provided the original work is properly cited. See: http://creativecommons.org/licenses/by/4.0/.


The guidelines manual

NICE process and methods [PMG6] Published: 30 November 2012

6 Reviewing the evidence

6.1 selecting relevant studies, 6.2 questions about interventions, 6.3 questions about diagnosis, 6.4 questions about prognosis, 6.5 using patient experience to inform review questions, 6.6 published guidelines, 6.7 further reading.

Studies identified during literature searches (see chapter 5) need to be reviewed to identify the most appropriate data to help address the review questions, and to ensure that the guideline recommendations are based on the best available evidence. A systematic review process should be used that is explicit and transparent. This involves five major steps:

writing the review protocol (see section 4.4)

selecting relevant studies

assessing their quality

synthesising the results

interpreting the results.

The process of selecting relevant studies is common to all systematic reviews; the other steps are discussed below in relation to the major types of questions. The same rigour should be applied to reviewing fully and partially published studies, as well as unpublished data supplied by stakeholders.

6.1 Selecting relevant studies

The study selection process for clinical studies and economic evaluations should be clearly documented, giving details of the inclusion and exclusion criteria that were applied.

6.1.1 Clinical studies

Before acquiring papers for assessment, the information specialist or systematic reviewer should sift the evidence identified in the search in order to discard irrelevant material. First, the titles of the retrieved citations should be scanned and those that fall outside the topic of the guideline should be excluded. A quick check of the abstracts of the remaining papers should identify those that are clearly not relevant to the review questions and hence can be excluded.

Next, the remaining abstracts should be scrutinised against the inclusion and exclusion criteria agreed by the Guideline Development Group (GDG). Abstracts that do not meet the inclusion criteria should be excluded. Any doubts about inclusion should be resolved by discussion with the GDG before the results of the study are considered. Once the sifting is complete, full versions of the selected studies can be acquired for assessment. Studies that fail to meet the inclusion criteria once the full version has been checked should be excluded; those that meet the criteria can be assessed. Because there is always a potential for error and bias in selecting the evidence, double sifting (that is, sifting by two people) of a random selection of abstracts should be performed periodically (Edwards et al. 2002).
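The periodic double sifting described above could be supported by a small script. The following is a minimal sketch only: the manual does not specify a sample size or an agreement statistic, so the 10% sampling fraction, the fixed seed, the function names and the use of Cohen's kappa are all assumptions made for illustration.

```python
import random


def pick_double_sift_sample(abstract_ids, fraction=0.1, seed=0):
    """Draw a reproducible random subset of abstracts for a second sifter.

    The 10% default fraction is an illustrative assumption, not a NICE rule.
    """
    ids = list(abstract_ids)
    if not ids:
        return []
    rng = random.Random(seed)  # fixed seed so the sample can be re-created
    k = min(len(ids), max(1, round(len(ids) * fraction)))
    return sorted(rng.sample(ids, k))


def cohen_kappa(first, second):
    """Chance-corrected agreement between two lists of sifting decisions."""
    if len(first) != len(second) or not first:
        raise ValueError("decision lists must be non-empty and of equal length")
    n = len(first)
    observed = sum(a == b for a, b in zip(first, second)) / n
    labels = set(first) | set(second)
    # Agreement expected by chance, from each sifter's marginal proportions.
    expected = sum((first.count(l) / n) * (second.count(l) / n) for l in labels)
    if expected == 1.0:
        return 1.0  # both sifters used a single identical label throughout
    return (observed - expected) / (1 - expected)
```

A low kappa on the sampled abstracts would be a prompt to revisit how the inclusion and exclusion criteria are being interpreted, and any disagreements would still be resolved by discussion as described above.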

6.1.2 Conference abstracts

Conference abstracts can be a good source of information in systematic reviews. For example, conference abstracts can be important in pointing to published trials that may be missed, in estimating the amount of not-fully-published evidence (and hence guiding calls for evidence and judgements about publication bias), or in identifying ongoing trials that are due to be published. These sources of information are important in interpreting systematic reviews, and so conference abstracts should not be excluded in the search strategy.

However, the following should be considered when deciding whether to include conference abstracts as a source of evidence:

Conference abstracts on their own seldom have sufficient information to allow confident judgements to be made about the quality and results of a study.

It could be very time consuming to trace the original studies or additional data relating to the conference abstracts, and the information found may not always be useful.

If sufficient evidence has been identified from full published studies, it may be reasonable not to trace the original studies or additional data related to conference abstracts.

If there is a lack of or limited evidence identified from full published studies, the systematic reviewer may consider an additional process for tracing the original studies or additional data relating to the conference abstracts, in order to allow full critical appraisal and to make judgements on their inclusion in or exclusion from the systematic review.

6.1.3 Economic evaluations

The process for sifting and selecting economic evaluations for assessment is essentially the same as for clinical studies. Consultation between the information specialist, the health economist and the systematic reviewer is essential when deciding the inclusion criteria; these decisions should be discussed and agreed with the GDG. The review should be targeted to identify the papers that are most relevant to current NHS practice and hence likely to inform GDG decision-making. The review should also usually focus on 'full' economic evaluations that compare both the costs and health consequences of the alternative interventions and any services under consideration.

Inclusion criteria for filtering and selection of papers for review by the health economist should specify relevant populations and interventions for the review question. They should also specify the following:

An appropriate date range, as older studies may reflect outdated practices.

The country or setting, as studies conducted in other healthcare systems might not be relevant to the NHS. In some cases it may be appropriate to limit consideration to UK-based or OECD (Organisation for Economic Co-operation and Development) studies.

The type of economic evaluation. This may include cost–utility, cost–benefit, cost-effectiveness, cost-minimisation or cost–consequence analyses. Non-comparative costing studies, 'burden of disease' studies and 'cost of illness' studies should usually be excluded.

6.2 Questions about interventions

These questions concern the relative effects of an intervention, as described in section 4.3.1. The consideration of cost effectiveness is integral to the process of reviewing evidence and making recommendations about interventions. However, the quality criteria and ways of summarising the data are slightly different from those for clinical effectiveness, so these are discussed in separate subsections.

6.2.1 Assessing study quality for clinical effectiveness

Study quality can be defined as the degree of confidence about the estimate of a treatment effect.

The first stage is to determine the study design so that the appropriate criteria can be applied in the assessment. A study design checklist can be obtained from the Cochrane handbook for systematic reviews of interventions (Higgins and Green 2011). Tables 13.2.a and 13.2.b in the Cochrane handbook are lists of study design features for studies with allocation to interventions at the individual and group levels respectively, and box 13.4.a provides useful notes for completing the checklist.

Once a study has been classified, it should be assessed using the methodology checklist for that type of study (see appendices B–E). To minimise errors and any potential bias in the assessment, two reviewers should independently assess the quality of a random selection of studies. Any differences arising from this should be discussed fully at a GDG meeting.

The quality of a study can vary depending on which of its measured outcomes is being considered. Well-conducted randomised controlled trials (RCTs) are more likely than non-randomised studies to produce similar comparison groups, and are therefore particularly suited to estimating the effects of interventions. However, short-term outcomes may be less susceptible to bias than long-term outcomes because of greater loss to follow-up with the latter. It is therefore important when summarising evidence that quality is considered according to outcome.

6.2.1.1 The GRADE (Grading of Recommendations Assessment, Development and Evaluation) approach to assessing the quality of evidence

The GRADE approach for questions about interventions has been used in the development of NICE clinical guidelines since 2009. For more details about GRADE, see the Journal of Clinical Epidemiology series, appendix K and the GRADE working group website.

GRADE is a system developed by an international working group for rating the quality of evidence across outcomes in systematic reviews and guidelines; it can also be used to grade the strength of recommendations in guidelines. The system is designed for reviews and guidelines that examine alternative management strategies or interventions, and these may include no intervention or current best management. The key difference from other assessment systems is that GRADE rates the quality of evidence for a particular outcome across studies and does not rate the quality of individual studies.

In order to apply GRADE, the evidence must clearly specify the relevant setting, population, intervention, comparator(s) and outcomes.

Before starting an evidence review, the GDG should apply an initial rating to the importance of outcomes, in order to identify which outcomes of interest are both 'critical' to decision-making and 'important' to patients. This rating should be confirmed or, if absolutely necessary, revised after completing the evidence review.

Box 6.1 summarises the GRADE approach to rating the quality of evidence.

Box 6.1 The GRADE approach to assessing the quality of evidence for intervention studies

The approach taken by NICE differs from the standard GRADE system in two ways:

It also integrates a review of the quality of cost-effectiveness studies.

It has no 'overall summary' labels for the quality of the evidence across all outcomes or for the strength of a recommendation, but uses the wording of recommendations to reflect the strength of the recommendation (see section 9.3.3).

6.2.2 Summarising and presenting results for clinical effectiveness

Characteristics of data should be extracted to a standard template for inclusion in an evidence table (see appendix J1). Evidence tables help to identify the similarities and differences between studies, including the key characteristics of the study population and interventions or outcome measures. This provides a basis for comparison.

Meta-analysis may be needed to pool treatment estimates from different studies. Recognised approaches to meta-analysis should be used, as described in the manual from NHS Centre for Reviews and Dissemination (2009) and in Higgins and Green (2011).
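As an illustration of what pooling treatment estimates involves, the sketch below applies generic inverse-variance fixed-effect pooling. This is only one of the recognised approaches covered in the sources cited above, and it is not presented as the method those sources require; the function name and the use of a normal approximation for the 95% confidence interval are assumptions. A random-effects model, which may be more appropriate when studies are heterogeneous, is not shown.

```python
import math


def pool_fixed_effect(estimates, standard_errors):
    """Inverse-variance fixed-effect pooling of study-level estimates.

    Estimates should be on an additive scale (for example, mean differences
    or log odds ratios). Returns (pooled estimate, standard error, 95% CI).
    """
    if len(estimates) != len(standard_errors) or not estimates:
        raise ValueError("need one standard error per estimate")
    if any(se <= 0 for se in standard_errors):
        raise ValueError("standard errors must be positive")
    # Each study is weighted by the inverse of its variance.
    weights = [1.0 / se ** 2 for se in standard_errors]
    total = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, estimates)) / total
    pooled_se = math.sqrt(1.0 / total)
    ci = (pooled - 1.96 * pooled_se, pooled + 1.96 * pooled_se)
    return pooled, pooled_se, ci
```

For example, two studies with estimates 1.0 and 3.0 and standard errors 1.0 and 0.5 receive weights 1 and 4, giving a pooled estimate of 2.6.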

The body of evidence addressing a question should then be presented within the text of the full guideline as an evidence profile as described in the GRADE system (see appendix K). GRADEpro software can be used to prepare these profiles. Evidence profiles contain a 'quality assessment' section that summarises the quality of the evidence and a 'summary of findings' section that presents the outcome data for each critical and each important clinical outcome. The 'summary of findings' section includes a limited description of the quality of the evidence and can be presented alone in the text of the guideline (in which case full GRADE profiles should be presented in an appendix).

Short evidence statements for outcomes should be presented after the GRADE profiles, summarising the key features of the evidence on clinical effectiveness (including adverse events as appropriate) and cost effectiveness. The evidence statements should include the number of studies and participants, the quality of the evidence and the direction of estimate of the effect (see box 6.2 for examples of evidence statements). An evidence statement may be needed even if no evidence is identified for a critical or important outcome. Evidence statements may also note the presence of relevant ongoing research.

Box 6.2 Examples of evidence statements

6.2.3 Indirect treatment comparisons and mixed treatment comparisons

NICE has a preference for data from head-to-head RCTs, and these should be presented in the reference case analysis if available. However, there may be situations when data from head-to-head studies of the options (and/or comparators) of interest are not available. In these circumstances, indirect treatment comparison analyses should be considered.

An 'indirect treatment comparison' refers to the synthesis of data from trials in which the interventions of interest have been compared indirectly using data from a network of trials that compare the interventions with other interventions. A 'mixed treatment comparison' refers to an analysis that includes both trials that compare the interventions of interest head-to-head and trials that compare them indirectly.

The principles of good practice for systematic reviews and meta-analyses should be carefully followed when conducting indirect treatment comparisons or mixed treatment comparisons. The rationale for identifying and selecting the RCTs should be explained, including the rationale for selecting the treatment comparisons that have been included. A clear description of the methods of synthesis is required. The methods and results of the individual trials should be documented. If there is doubt about the relevance of particular trials, a sensitivity analysis in which these trials are excluded should also be presented. The heterogeneity between the results of pairwise comparisons and inconsistencies between the direct and indirect evidence on the interventions should be reported.

There may be circumstances in which data from head-to-head RCTs are less than ideal (for example, the sample size may be small or there may be concerns about the external validity). In such cases, additional evidence from mixed treatment comparisons can be considered. In these cases, mixed treatment comparisons should be presented separately from the reference-case analysis and a rationale for their inclusion provided. Again, the principles of good practice apply.

When multiple options are being appraised, data from RCTs (when available) that compare each of the options head-to-head should be presented in a series of pairwise comparisons. Consideration may be given to presenting an additional analysis using a mixed treatment comparison framework.

When evidence is combined using indirect or mixed treatment comparison frameworks, trial randomisation should be preserved. A comparison of the results from single treatment arms from different randomised trials is not acceptable unless the data are treated as observational and appropriate steps are taken to adjust for possible bias and increased uncertainty.
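One simple way of preserving randomisation is an adjusted indirect comparison through a common comparator, in which only within-trial relative effects are combined. The sketch below assumes this technique; the manual does not name a specific method, and the function and parameter names are hypothetical. Effects are assumed to be on an additive scale such as log odds ratios, with A and C each compared against B in separate randomised trials (or separate pooled sets of trials).

```python
import math


def indirect_comparison(d_ab, se_ab, d_cb, se_cb):
    """Adjusted indirect comparison of A versus C via common comparator B.

    d_ab and d_cb are randomised within-trial effects (A vs B, C vs B) on an
    additive scale. The indirect estimate is their difference, and because
    the two sources are independent their variances add.
    """
    if se_ab <= 0 or se_cb <= 0:
        raise ValueError("standard errors must be positive")
    d_ac = d_ab - d_cb
    se_ac = math.sqrt(se_ab ** 2 + se_cb ** 2)
    ci = (d_ac - 1.96 * se_ac, d_ac + 1.96 * se_ac)
    return d_ac, se_ac, ci
```

Because the variances add, the indirect estimate is always less precise than either of the direct estimates it is built from. This differs from comparing single arms across trials, which discards the randomised contrast.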

Analyses using indirect or mixed treatment comparison frameworks may include comparator interventions (including placebo) that have not been defined in the scope of the guideline if they are relevant to the development of the network of evidence. The rationale for the inclusion and exclusion of comparator interventions should be clearly reported. Again, the principles of good practice apply.

Evidence from a mixed treatment comparison can be presented in a variety of ways. The network of evidence can be presented as tables. It may also be presented diagrammatically as long as the direct and indirect treatment comparisons are clearly identified and the number of trials in each comparison is stated.

If sufficient relevant and valid data are not available to include in meta-analyses of head-to-head trials, or mixed or indirect treatment comparisons, the analysis may have to be restricted to a qualitative overview that critically appraises individual studies and presents their results. The results of this type of analysis should be approached with particular caution.

Further information on evidence synthesis is provided by the technical support documents developed by the NICE Decision Support Unit (DSU).

6.2.4 Assessing study quality for cost effectiveness

Estimates of resource use obtained from clinical studies should be treated like other clinical outcomes and reviewed using the processes described above. Reservations about the applicability of these estimates to routine NHS practice should be noted in the economics evidence profile, in the same way as in a GRADE profile (see section 6.2.1.1), and taken into consideration by the GDG.

However, the criteria for appraising other economic estimates – such as costs, cost-effectiveness ratios and net benefits – are rather different, because these estimates are usually obtained using some form of modelling. In addition to formal decision-analytic models, this includes economic evaluations conducted alongside clinical trials. These usually require some external sources of information (for example, unit costs, health-state valuations or long-term prognostic data) and estimation procedures to predict long-term costs and outcomes. These considerations also apply to relatively simple cost calculations based on expert judgement or on observed resource use and unit cost data.

All economic estimates used to inform guideline recommendations should be appraised using the methodology checklist for economic evaluations (appendix G). This should be used to appraise unpublished economic evaluations, such as studies submitted by stakeholders and academic papers that are not yet published, as well as published papers. The same criteria should be applied to any new economic evaluations conducted for the guideline (see chapter 7).

The checklist (appendix G) includes a section on the applicability of the study to the specific question and the context for NICE decision-making (analogous to the GRADE 'directness' criterion). This checklist is designed to determine whether an economic evaluation provides evidence that is useful to inform GDG decision-making, analogous to the assessment of study limitations in GRADE.

The checklist includes an overall judgement on the applicability of the study to the guideline context, as follows:

Directly applicable – the study meets all applicability criteria, or fails to meet one or more applicability criteria but this is unlikely to change the conclusions about cost effectiveness.

Partially applicable – the study fails to meet one or more applicability criteria, and this could change the conclusions about cost effectiveness.

Not applicable – the study fails to meet one or more applicability criteria, and this is likely to change the conclusions about cost effectiveness. Such studies would usually be excluded from further consideration.

The checklist also includes an overall summary judgement on the methodological quality of economic evaluations, as follows:

Minor limitations – the study meets all quality criteria, or fails to meet one or more quality criteria but this is unlikely to change the conclusions about cost effectiveness.

Potentially serious limitations – the study fails to meet one or more quality criteria, and this could change the conclusions about cost effectiveness.

Very serious limitations – the study fails to meet one or more quality criteria, and this is highly likely to change the conclusions about cost effectiveness. Such studies should usually be excluded from further consideration.

The robustness of the study results to methodological limitations may sometimes be apparent from reported sensitivity analyses. If not, judgement will be needed to assess whether a limitation would be likely to change the results and conclusions.

If necessary, the health technology assessment checklist for decision-analytic models (Philips et al. 2004) may also be used to give a more detailed assessment of the methodological quality of modelling studies.

The judgements that the health economist makes using the checklist for economic evaluations (and the health technology assessment modelling checklist, if appropriate) should be recorded and presented in an appendix to the full guideline. The 'comments' column in the checklist should be used to record reasons for these judgements, as well as additional details about the studies where necessary.

6.2.5 Summarising and presenting results for cost effectiveness

Cost, cost effectiveness or net benefit estimates from published or unpublished studies, or from economic analyses conducted for the guideline, should be presented in an 'economic evidence profile' adapted from the GRADE profile (see appendix K). Whenever a GRADE profile is presented in the full version of a NICE clinical guideline, it should be accompanied by relevant economic information (resource use, costs, cost effectiveness and/or net benefit estimates as appropriate). It should be explicitly stated if economic information is not available or if it is not thought to be relevant to the question.

The economic evidence profile includes columns for the overall assessments of study limitations and applicability described above. There is also a comments column where the health economist can note any particular issues that the GDG should consider when assessing the economic evidence. Footnotes should be used to explain the reasons for quality assessments, as in the standard GRADE profile.

The results of the economic evaluations included should be presented in the form of a best-available estimate or range for the incremental cost, the incremental effect and, where relevant, the incremental cost-effectiveness ratio or net benefit estimate. A summary of the extent of uncertainty about the estimates should also be presented in the economic evidence profile. This should reflect the results of deterministic or probabilistic sensitivity analyses or stochastic analyses of trial data, as appropriate.

Each economic evaluation included should usually be presented in a separate row of the economic evidence profile. If large numbers of economic evaluations of sufficiently high quality and applicability are available, a single row could be used to summarise a number of studies based on shared characteristics; this should be explicitly justified in a footnote.

Inconsistency between the results of economic evaluations will be shown by differences between rows of the economic evidence profile (a separate column examining 'consistency' is therefore unnecessary). The GDG should consider the implications of any unexplained differences between model results when assessing the body of clinical and economic evidence and drawing up recommendations. This includes clearly explaining the GDG's preference for certain results when forming recommendations.

If results are available for two or more patient subgroups, these should be presented in separate economic evidence profile tables or as separate rows within a single table.

Costs and cost-effectiveness estimates should be presented only for the appropriate incremental comparisons – where an intervention is compared with the next most expensive non-dominated option (a clinical strategy is said to 'dominate' the alternatives when it is both more effective and less costly; see section 7.3). If comparisons are relevant only for some groups of the population (for example, patients who cannot tolerate one or more of the other options, or for whom one or more of the options is contraindicated), this should be stated in a footnote to the economic evidence profile table.
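The incremental comparison can be sketched as follows. This is an illustration under stated assumptions, not the procedure set out in chapter 7: it reads 'the next most expensive non-dominated option' as the non-dominated option immediately below the intervention in cost, it flags only simple dominance (an option that is no cheaper and no more effective than an option already retained), and it does not handle extended dominance. The function name and the tuple layout are hypothetical.

```python
def incremental_analysis(options):
    """Rank options by cost and compute incremental cost-effectiveness ratios.

    options: iterable of (name, cost, effect) tuples, with effect measured so
    that larger is better (for example, QALYs).
    Returns a list of (name, status, icer) rows in ascending cost order.
    """
    # Sort by cost; among equal costs, put the more effective option first.
    ordered = sorted(options, key=lambda o: (o[1], -o[2]))
    rows = []
    reference = None  # (cost, effect) of the last non-dominated option
    for name, cost, effect in ordered:
        if reference is None:
            rows.append((name, "reference", None))
            reference = (cost, effect)
        elif effect <= reference[1]:
            # Costs at least as much as the reference but is no more effective.
            rows.append((name, "dominated", None))
        else:
            icer = (cost - reference[0]) / (effect - reference[1])
            rows.append((name, "non-dominated", icer))
            reference = (cost, effect)
    return rows
```

With options A (cost 1000, effect 1.0), C (2500, 0.9), B (3000, 1.5) and D (6000, 2.0), C is dominated by A, B is compared with A (2000 / 0.5 = 4000 per unit of effect) and D is compared with B (3000 / 0.5 = 6000 per unit of effect).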

A short evidence statement should be presented alongside the GRADE and economic evidence profile tables, summarising the key features of the evidence on clinical and cost effectiveness.

6.3 Questions about diagnosis

Questions about diagnosis are concerned with the performance of a diagnostic test or test strategy (see section 4.3.2). Note that 'test and treat' studies (in which the outcomes of patients who undergo a new diagnostic test in combination with a management strategy are compared with the outcomes of patients who receive the usual diagnostic test and management strategy) should be addressed in the same way as intervention studies (see section 6.2).

6.3.1 Assessing study quality

Studies of diagnostic test accuracy should be assessed using the methodology checklist for QUADAS-2 (Quality Assessment of Studies of Diagnostic Accuracy included in Systematic Reviews) (appendix F). Characteristics of data should be extracted to a standard template for inclusion in an evidence table (see appendix J2). Questions relating to diagnostic test accuracy are usually best answered by cross-sectional studies. Case–control studies can also be used, but these are more prone to bias and often result in inflated estimates of diagnostic test accuracy.

There is currently a lack of empirical evidence about the size and direction of bias contributed by specific aspects of the design and conduct of studies on diagnostic test accuracy. Making judgements about the overall quality of studies can therefore be difficult. Before starting the review, an assessment should be made to determine which quality appraisal criteria (from the QUADAS-2 checklist) are likely to be the most important indicators of quality for the particular question about diagnostic test accuracy being addressed. These criteria will be useful in guiding decisions about the overall quality of individual studies and whether to exclude certain studies, and when summarising and presenting the body of evidence for the question about diagnostic test accuracy as a whole (see section 6.3.2). Clinical input (for example, from a GDG member) may be needed to identify the most appropriate quality criteria.

6.3.2 Summarising and presenting results

No well designed and validated approach currently exists for summarising a body of evidence for studies on diagnostic test accuracy. In the absence of such a system, a narrative summary of the quality of the evidence should be given, based on the quality appraisal criteria from QUADAS-2 (appendix F) that were considered to be most important for the question being addressed (see section 6.3.1).

Numerical summaries of diagnostic test accuracy may be presented as tables to help summarise the available evidence. Meta-analysis of such estimates from different studies is possible, but is not widely used. If this is attempted, relevant published technical advice should be used to guide reviewers.
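The per-study figures that typically populate such tables can be derived from each study's 2x2 table of index test results against the reference standard. The sketch below is illustrative only: the manual does not list which measures to report, so the choice of sensitivity, specificity and predictive values, and the function name, are assumptions.

```python
def diagnostic_accuracy(tp, fp, fn, tn):
    """Per-study accuracy measures from a 2x2 table.

    tp, fp, fn, tn: counts of true positives, false positives, false
    negatives and true negatives against the reference standard.
    """
    if min(tp, fp, fn, tn) < 0:
        raise ValueError("counts must be non-negative")
    if (tp + fn) == 0 or (tn + fp) == 0 or (tp + fp) == 0 or (tn + fn) == 0:
        raise ValueError("a row or column of the 2x2 table is empty")
    return {
        "sensitivity": tp / (tp + fn),  # diseased correctly identified
        "specificity": tn / (tn + fp),  # non-diseased correctly identified
        "ppv": tp / (tp + fp),          # positive predictive value
        "npv": tn / (tn + fn),          # negative predictive value
    }
```

Predictive values depend on the prevalence in each study population, so they are less comparable across studies than sensitivity and specificity.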

Numerical summaries and analyses should be followed by a short evidence statement summarising what the evidence shows.

6.4 Questions about prognosis

These questions are described in section 4.3.3.

6.4.1 Assessing study quality

Studies that are reviewed for questions about prognosis should be assessed using the methodology checklist for prognostic studies (appendix I). There is currently a lack of empirical evidence about the size and direction of bias contributed by specific aspects of the design and conduct of studies on prognosis. Making judgements about the overall quality of studies can therefore be difficult. Before starting the review, an assessment should be made to determine which quality appraisal criteria (from the checklist in appendix I) are likely to be the most important indicators of quality for the particular question about prognosis being addressed. These criteria will be useful in guiding decisions about the overall quality of individual studies and whether to exclude certain studies, and when summarising and presenting the body of evidence for the question about prognosis as a whole (see section 6.4.2). Clinical input (for example, from a GDG member) may be needed to identify the most appropriate quality criteria.

6.4.2 Summarising and presenting results

No well designed and validated approach currently exists for summarising a body of evidence for studies on prognosis. A narrative summary of the quality of the evidence should therefore be given, based on the quality appraisal criteria from appendix I that were considered to be most important for the question being addressed (see section 6.4.1). Characteristics of data should be extracted to a standard template for inclusion in an evidence table (see appendix J3).

Results from the studies included may be presented as tables to help summarise the available evidence. Reviewers should be wary of using meta-analysis as a tool to summarise large observational studies, because the results obtained may give a spurious sense of confidence in the study results.

The narrative summary should be followed by a short evidence statement summarising what the evidence shows.

These questions are described in section 4.3.4.

6.5.1 Assessing study quality

Studies about patient experience are likely to be qualitative studies or cross-sectional surveys. Qualitative studies should be assessed using the methodology checklist for qualitative studies (appendix H). It is important to consider which quality appraisal criteria from this checklist are likely to be the most important indicators of quality for the specific research question being addressed. These criteria may be helpful in guiding decisions about the overall quality of individual studies and whether to exclude certain studies, and when summarising and presenting the body of evidence for the research question about patient experience as a whole.

There is no methodology checklist for the quality appraisal of cross-sectional surveys. Such surveys should be assessed for the rigour of the process used to develop the questions and their relevance to the population under consideration, and for the existence of significant bias (for example, non-response bias).

6.5.2 Summarising and presenting results

A description of the quality of the evidence should be given, based on the quality appraisal criteria from appendix H that were considered to be the most important for the research question being addressed. If appropriate, the quality of the cross-sectional surveys included should also be summarised.

Consider presenting the quality assessment of included studies in tables (see table 1 in appendix H for an example). Methods to synthesise qualitative studies (for example, meta-ethnography) are evolving, but the routine use of such methods in guidelines is not currently recommended.

The narrative summary should be followed by a short evidence statement summarising what the evidence shows. Characteristics of data should be extracted to a standard template for inclusion in an evidence table (see appendix J4).

Relevant published guidelines from other organisations may be identified in the search for evidence. These should be assessed for quality using the AGREE II [ 10 ] (Appraisal of Guidelines Research and Evaluation II) instrument (Brouwers et al. 2010) to ensure that they have sufficient documentation to be considered. There is no cut-off point for accepting or rejecting a guideline, and each GDG will need to set its own parameters. These should be documented in the methods section of the full guideline, along with a summary of the assessment. The results should be presented as an appendix to the full guideline.

Reviews of evidence from other guidelines that cover questions formulated by the GDG may be considered as evidence if:

they are assessed using the appropriate methodology checklist from this manual and are judged to be of high quality

they are accompanied by an evidence statement and evidence table(s)

the evidence is updated according to the methodology for exceptional updates of NICE clinical guidelines (see section 14.4).

The GDG should create its own evidence summaries or statements. Evidence tables from other guidelines should be referenced with a direct link to the source website or a full reference of the published document. The GDG should formulate its own recommendations, taking into consideration the whole body of evidence.

Recommendations from other guidelines should not be quoted verbatim, except for recommendations from NHS policy or legislation (for example, Health and Social Care Act 2008).

Altman DG (2001) Systematic reviews of evaluations of prognostic variables. British Medical Journal 323: 224–8

Balshem H, Helfand M, Schünemann HJ et al. (2011) GRADE guidelines: 3. Rating the quality of evidence. Journal of Clinical Epidemiology 64: 401–6

Brouwers M, Kho ME, Browman GP et al. for the AGREE Next Steps Consortium (2010) AGREE II: advancing guideline development, reporting and evaluation in healthcare. Canadian Medical Association Journal 182: E839–42

Centre for Reviews and Dissemination (2009) Systematic reviews: CRD's guidance for undertaking reviews in health care. University of York: Centre for Reviews and Dissemination

Chiou CF, Hay JW, Wallace JF et al. (2003) Development and validation of a grading system for the quality of cost-effectiveness studies. Medical Care 41: 32–44

Drummond MF, O'Brien B, Stoddart GL et al. (1997) Critical assessment of economic evaluation. In: Methods for the economic evaluation of health care programmes, 2nd edition. Oxford: Oxford Medical Publications

Eccles M, Mason J (2001) How to develop cost-conscious guidelines. Health Technology Assessment 5: 1–69

Edwards P, Clarke M, DiGuiseppi C et al. (2002) Identification of randomized trials in systematic reviews: accuracy and reliability of screening records. Statistics in Medicine 21: 1635–40

Evers SMAA, Goossens M, de Vet H et al. (2005) Criteria list for assessment of methodological quality of economic evaluations: Consensus on Health Economic Criteria. International Journal of Technology Assessment in Health Care 21: 240–5

Guyatt GH, Oxman AD, Schünemann HJ et al. (2011) GRADE guidelines: a new series of articles in the Journal of Clinical Epidemiology. Journal of Clinical Epidemiology 64: 380–2

Guyatt GH, Oxman AD, Akl EA et al. (2011) GRADE guidelines: 1. Introduction – GRADE evidence profiles and summary of findings tables. Journal of Clinical Epidemiology 64: 383–94

Guyatt GH, Oxman AD, Kunz R et al. (2011) GRADE guidelines: 2. Framing the question and deciding on important outcomes. Journal of Clinical Epidemiology 64: 395–400

Guyatt GH, Oxman AD, Vist G et al. (2011) GRADE guidelines: 4. Rating the quality of evidence – study limitations (risk of bias). Journal of Clinical Epidemiology 64: 407–15

Guyatt GH, Oxman AD, Montori V et al. (2011) GRADE guidelines 5: Rating the quality of evidence – publication bias. Journal of Clinical Epidemiology 64: 1277–82

Guyatt G, Oxman AD, Kunz R et al. (2011) GRADE guidelines 6: Rating the quality of evidence – imprecision. Journal of Clinical Epidemiology 64: 1283–93

Guyatt GH, Oxman AD, Kunz R et al. (2011) GRADE guidelines 7: Rating the quality of evidence – inconsistency. Journal of Clinical Epidemiology 64: 1294–302

Guyatt GH, Oxman AD, Kunz R et al. (2011) GRADE guidelines 8: Rating the quality of evidence – indirectness. Journal of Clinical Epidemiology 64: 1303–10

Guyatt GH, Oxman AD, Sultan S et al. (2011) GRADE guidelines 9: Rating up the quality of evidence. Journal of Clinical Epidemiology 64: 1311–6

Harbord RM, Deeks JJ, Egger M et al. (2007) A unification of models for meta-analysis of diagnostic accuracy studies. Biostatistics 8: 239–51

Higgins JPT, Green S, editors (2011) Cochrane handbook for systematic reviews of interventions. Version 5.1.0 (updated March 2011) [online]

Khan KS, Kunz R, Kleijnen J et al. (2003) Systematic reviews to support evidence-based medicine. How to review and apply findings of healthcare research. London: Royal Society of Medicine Press

Oxman AD, Guyatt GH (1992) A consumer's guide to subgroup analyses. Annals of Internal Medicine 116: 78–84

Philips Z, Ginnelly L, Sculpher M et al. (2004) Review of guidelines for good practice in decision-analytic modelling in health technology assessment. Health Technology Assessment 8: iii–iv, ix–xi, 1–158

Schünemann HJ, Best D, Vist G et al. for the GRADE Working Group (2003) Letters, numbers, symbols and words: how to communicate grades of evidence and recommendations. Canadian Medical Association Journal 169: 677–80

Schünemann HJ, Oxman AD, Brozek J et al. for the GRADE Working Group (2008) Grading quality of evidence and strength of recommendations for diagnostic tests and strategies. British Medical Journal 336: 1106–10

Scottish Intercollegiate Guidelines Network (2008) SIGN 50. A guideline developer's handbook, revised edition, January. Edinburgh: Scottish Intercollegiate Guidelines Network

Sharp SJ, Thompson SG (2000) Analysing the relationship between treatment effect and underlying risk in meta-analysis: comparison and development of approaches. Statistics in Medicine 19: 3251–74

Whiting PF, Rutjes AWS, Westwood ME et al. and the QUADAS-2 group (2011) QUADAS-2: A revised tool for the quality assessment of diagnostic accuracy studies. Annals of Internal Medicine 155: 529–36

[ 10 ] For more details about AGREE II, see the AGREE Enterprise website.

PRISMA 2020 explanation and elaboration: updated guidance and exemplars for reporting systematic reviews

Matthew J Page, David Moher, Patrick M Bossuyt, Isabelle Boutron, Tammy C Hoffmann, Cynthia D Mulrow, Larissa Shamseer, Jennifer M Tetzlaff, Sue E Brennan, Julie Glanville, Jeremy M Grimshaw, Asbjørn Hróbjartsson, Manoj M Lalu, Tianjing Li, Elizabeth W Loder, Evan Mayo-Wilson, Steve McDonald, Luke A McGuinness, Lesley A Stewart, James Thomas, Andrea C Tricco, Vivian A Welch, Penny Whiting, Joanne E McKenzie

Correspondence to: M Page

Accepted 2021 Jan 4; Collection date 2021.

The methods and results of systematic reviews should be reported in sufficient detail to allow users to assess the trustworthiness and applicability of the review findings. The Preferred Reporting Items for Systematic reviews and Meta-Analyses (PRISMA) statement was developed to facilitate transparent and complete reporting of systematic reviews and has been updated (to PRISMA 2020) to reflect recent advances in systematic review methodology and terminology. Here, we present the explanation and elaboration paper for PRISMA 2020, where we explain why reporting of each item is recommended, present bullet points that detail the reporting recommendations, and present examples from published reviews. We hope that changes to the content and structure of PRISMA 2020 will facilitate uptake of the guideline and lead to more transparent, complete, and accurate reporting of systematic reviews.

Systematic reviews are essential for healthcare providers, policy makers, and other decision makers, who would otherwise be confronted by an overwhelming volume of research on which to base their decisions. To allow decision makers to assess the trustworthiness and applicability of review findings, reports of systematic reviews should be transparent and complete. Furthermore, such reporting should allow others to replicate or update reviews. The Preferred Reporting Items for Systematic reviews and Meta-Analyses (PRISMA) statement published in 2009 (hereafter referred to as PRISMA 2009) 1 2 3 4 5 6 7 8 9 10 11 12 was designed to help authors prepare transparent accounts of their reviews, and its recommendations have been widely endorsed and adopted. 13 We have updated the PRISMA 2009 statement (to PRISMA 2020) to ensure currency and relevance and to reflect advances in systematic review methodology and terminology.

Summary points.

The PRISMA 2020 statement includes a checklist of 27 items to guide reporting of systematic reviews

In this article we explain why reporting of each item is recommended, present bullet points that detail the reporting recommendations, and present examples from published reviews

We hope that uptake of the PRISMA 2020 statement will lead to more transparent, complete, and accurate reporting of systematic reviews, thus facilitating evidence based decision making

Scope of this guideline

The PRISMA 2020 statement has been designed primarily for systematic reviews of studies that evaluate the effects of health interventions, irrespective of the design of the included studies. However, the checklist items are applicable to reports of systematic reviews evaluating other non-health-related interventions (for example, social or educational interventions), and many items are applicable to systematic reviews with objectives other than evaluating interventions (such as evaluating aetiology, prevalence, or prognosis). PRISMA 2020 is intended for use in systematic reviews that include synthesis (such as pairwise meta-analysis or other statistical synthesis methods) or do not include synthesis (for example, because only one eligible study is identified). The PRISMA 2020 items are relevant for mixed-methods systematic reviews (which include quantitative and qualitative studies), but reporting guidelines addressing the presentation and synthesis of qualitative data should also be consulted. 14 15 PRISMA 2020 can be used for original systematic reviews, updated systematic reviews, or continually updated (“living”) systematic reviews. However, for updated and living systematic reviews, there may be some additional considerations that need to be addressed. Extensions to the PRISMA 2009 statement have been developed to guide reporting of network meta-analyses, 16 meta-analyses of individual participant data, 17 systematic reviews of harms, 18 systematic reviews of diagnostic test accuracy studies, 19 and scoping reviews 20 ; for these types of reviews we recommend authors report their review in accordance with the recommendations in PRISMA 2020 along with the guidance specific to the extension. Separate guidance for items that should be described in protocols of systematic reviews is available (PRISMA-P 2015 statement). 21 22

PRISMA 2020 explanation and elaboration

PRISMA 2020 is published as a suite of three papers: a statement paper (consisting of the 27-item checklist, an expanded checklist that details reporting recommendations for each item, the PRISMA 2020 abstract checklist, and the revised flow diagram 23 ); a development paper (which outlines the steps taken to update the PRISMA 2009 statement and provides rationale for modifications to the original items 24 ); and this paper, the updated explanation and elaboration for PRISMA 2020. In this paper, for each item, we explain why reporting of the item is recommended and present bullet points that detail the reporting recommendations. This structure is new to PRISMA 2020 and has been adopted to facilitate implementation of the guidance. 25 26 Authors familiar with PRISMA 2020 may opt to use the standalone statement paper 23 ; however, for those who are new to or unfamiliar with PRISMA 2020, we encourage use of this explanation and elaboration document. Box 1 includes a glossary of terms used throughout the PRISMA 2020 explanation and elaboration paper.

Box 1. Glossary of terms.

Systematic review —A review that uses explicit, systematic methods to collate and synthesize findings of studies that address a clearly formulated question 27

Meta-analysis of effect estimates —A statistical technique used to synthesize results when study effect estimates and their variances are available, yielding a quantitative summary of results 28

Outcome —An event or measurement collected for participants in a study (such as quality of life, mortality)

Result —The combination of a point estimate (such as a mean difference, risk ratio or proportion) and a measure of its precision (such as a confidence/credible interval) for a particular outcome
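To make the glossary definitions of "meta-analysis of effect estimates" and "result" concrete, here is a minimal Python sketch of one common approach: inverse-variance (fixed-effect) pooling. It is an assumption chosen for illustration, not a method prescribed by PRISMA 2020. The function name `pool_fixed_effect` and the input values are hypothetical.

```python
import math


def pool_fixed_effect(estimates, variances):
    """Pool study effect estimates using inverse-variance weights.

    Returns the pooled point estimate and its 95% confidence interval,
    that is, a "result" in the glossary's sense.
    """
    if not estimates or len(estimates) != len(variances):
        raise ValueError("need one variance per effect estimate")
    if any(v <= 0 for v in variances):
        raise ValueError("variances must be positive")

    weights = [1.0 / v for v in variances]  # more precise studies weigh more
    total_weight = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, estimates)) / total_weight
    standard_error = math.sqrt(1.0 / total_weight)

    z = 1.959964  # 97.5th percentile of the standard normal distribution
    return pooled, (pooled - z * standard_error, pooled + z * standard_error)


# Invented mean differences and variances from three hypothetical studies.
estimate, (lower, upper) = pool_fixed_effect([0.20, 0.40, 0.30], [0.04, 0.04, 0.02])
print(f"Pooled estimate {estimate:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

A random-effects model would add a between-study variance term to each weight. That choice is left out here to keep the sketch short.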

We use standardised language in the explanation and elaboration to indicate whether the reporting recommendations for each item (which we refer to as “elements” throughout) are essential or additional. Essential elements should be reported in the main report or as supplementary material for all systematic reviews (except for those preceded by “If…,” which should only be reported where applicable). These have been selected as essential because we consider their reporting important for users to assess the trustworthiness and applicability of a review’s findings, or their reporting would aid in reproducing the findings. Additional elements are those which are not essential but provide supplementary information that may enhance the completeness and usability of systematic review reports. The essential and additional elements are framed in terms of reporting the “presence” of a method or result (such as reporting if individuals were contacted to identify studies) rather than reporting on their absence. In some instances, however, reporting the absence of a method may be helpful (for example, “We did not contact individuals to identify studies”). We leave these decisions to the judgment of authors. Finally, although PRISMA 2020 provides a template for where information might be located, the suggested location should not be seen as prescriptive; the guiding principle is to ensure the information is reported.

We sought examples of good reporting for each checklist item from published systematic reviews and present one for each item below; more examples are available in table S1 in the data supplement on bmj.com. We have edited the examples by removing all citations within them (to avoid potential confusion with the citation for each example), and we spelled out abbreviations to aid comprehension. We encourage readers to submit evidence that informs any of the recommendations in PRISMA 2020 and any examples that could be added to our bank of examples of good reporting (via the PRISMA statement website http://www.prisma-statement.org/ ).

Item 1. Identify the report as a systematic review

Explanation: Inclusion of “systematic review” in the title facilitates identification by potential users (patients, healthcare providers, policy makers, etc) and appropriate indexing in databases. Terms such as “review,” “literature review,” “evidence synthesis,” or “knowledge synthesis” are not recommended because they do not distinguish systematic and non-systematic approaches. We also discourage using the terms “systematic review” and “meta-analysis” interchangeably because a systematic review refers to the entire set of processes used to identify, select, and synthesise evidence, whereas meta-analysis refers only to the statistical synthesis. Furthermore, a meta-analysis can be done outside the context of a systematic review (for example, when researchers meta-analyse results from a limited set of studies that they have conducted).

Essential elements

Identify the report as a systematic review in the title.

Report an informative title that provides key information about the main objective or question that the review addresses (for reviews of interventions, this usually includes the population and the intervention(s) that the review addresses).

Additional elements

Consider providing additional information in the title, such as the method of analysis used (for example, “a systematic review with meta-analysis”), the designs of included studies (for example, “a systematic review of randomised trials”), or an indication that the review is an update of an existing review or a continually updated (“living”) systematic review.

Example of item 1 of PRISMA 2020 checklist.

“Comparison of the therapeutic effects of rivaroxaban versus warfarin in antiphospholipid syndrome: a systematic review” 167

Item 2. See the PRISMA 2020 for Abstracts checklist ( box 2 )

Box 2. Items in the PRISMA 2020 for Abstracts checklist.

The PRISMA 2020 for Abstracts checklist retains the same items as those included in the PRISMA for Abstracts statement published in 2013 29 but has been revised to make the wording consistent with the PRISMA 2020 statement and includes a new item recommending authors specify the methods used to present and synthesize results (item #6). The checklist includes the following 12 items:

Identify the report as a systematic review

Provide an explicit statement of the main objective(s) or question(s) the review addresses

Specify the inclusion and exclusion criteria for the review

Specify the information sources (such as databases, registers) used to identify studies and the date when each was last searched

Specify the methods used to assess risk of bias in the included studies

Specify the methods used to present and synthesise results

Give the total number of included studies and participants and summarise relevant characteristics of studies

Present results for main outcomes, preferably indicating the number of included studies and participants for each. If meta-analysis was done, report the summary estimate and confidence/credible interval. If comparing groups, indicate the direction of the effect (that is, which group is favoured)

Provide a brief summary of the limitations of the evidence included in the review (such as study risk of bias, inconsistency, and imprecision)

Provide a general interpretation of the results and important implications

Specify the primary source of funding for the review

Provide the register name and registration number

Explanation: An abstract providing key information about the main objective(s) or question(s) that the review addresses, methods, results, and implications of the findings should help readers decide whether to access the full report. 29 For some readers, the abstract may be all that they have access to. Therefore, it is critical that results are presented for all main outcomes for the main review objective(s) or question(s) regardless of the statistical significance, magnitude, or direction of effect. Terms presented in the abstract will be used to index the systematic review in bibliographic databases. Therefore, reporting keywords that accurately describe the review question (such as population, interventions, outcomes) is recommended.

Report an abstract addressing each item in the PRISMA 2020 for Abstracts checklist (see box 2 ).

Example of item 2 of PRISMA 2020 checklist.

“Title: Psychological interventions for common mental disorders in women experiencing intimate partner violence in low-income and middle-income countries: a systematic review and meta-analysis.

Background: Evidence on the effectiveness of psychological interventions for women with common mental disorders (CMDs) who also experience intimate partner violence is scarce. We aimed to test our hypothesis that exposure to intimate partner violence would reduce intervention effectiveness for CMDs in low-income and middle-income countries (LMICs).

Methods: For this systematic review and meta-analysis, we searched MEDLINE, Embase, PsycINFO, Web of Knowledge, Scopus, CINAHL, LILACS, ScieELO, Cochrane, PubMed databases, trials registries, 3ie, Google Scholar, and forward and backward citations for studies published between database inception and Aug 16, 2019. All randomised controlled trials (RCTs) of psychological interventions for CMDs in LMICs which measured intimate partner violence were included, without language or date restrictions. We approached study authors to obtain unpublished aggregate subgroup data for women who did and did not report intimate partner violence. We did separate random-effects meta-analyses for anxiety, depression, post-traumatic stress disorder (PTSD), and psychological distress outcomes. Evidence from randomised controlled trials was synthesised as differences between standardised mean differences (SMDs) for change in symptoms, comparing women who did and who did not report intimate partner violence via random-effects meta-analyses. The quality of the evidence was assessed with the Cochrane risk of bias tool. This study is registered on PROSPERO, number CRD42017078611.

Findings: Of 8122 records identified, 21 were eligible and data were available for 15 RCTs, all of which had a low to moderate risk of overall bias. Anxiety (five interventions, 728 participants) showed a greater response to intervention among women reporting intimate partner violence than among those who did not (difference in standardised mean differences [dSMD] 0.31, 95% CI 0.04 to 0.57, I²=49.4%). No differences in response to intervention were seen in women reporting intimate partner violence for PTSD (eight interventions, n=1436; dSMD 0.14, 95% CI −0.06 to 0.33, I²=42.6%), depression (12 interventions, n=2940; 0.10, −0.04 to 0.25, I²=49.3%), and psychological distress (four interventions, n=1591; 0.07, −0.05 to 0.18, I²=0.0%, p=0.681).

Interpretation: Psychological interventions treat anxiety effectively in women with current or recent intimate partner violence exposure in LMICs when delivered by appropriately trained and supervised health-care staff, even when not tailored for this population or targeting intimate partner violence directly. Future research should investigate whether adapting evidence-based psychological interventions for CMDs to address intimate partner violence enhances their acceptability, feasibility, and effectiveness in LMICs.

Funding: UK National Institute for Health Research ASSET and King's IoPPN Clinician Investigator Scholarship.” 168

Item 3. Describe the rationale for the review in the context of existing knowledge

Explanation: Describing the rationale should help readers understand why the review was conducted and what the review might add to existing knowledge.

Describe the current state of knowledge and its uncertainties.

Articulate why it is important to do the review.

If other systematic reviews addressing the same (or a largely similar) question are available, explain why the current review was considered necessary (for example, previous reviews are out of date or have discordant results; new review methods are available to address the review question; existing reviews are methodologically flawed; or the current review was commissioned to inform a guideline or policy for a particular organisation). If the review is an update or replication of a particular systematic review, indicate this and cite the previous review.

If the review examines the effects of interventions, also briefly describe how the intervention(s) examined might work.

If there is complexity in the intervention or context of its delivery, or both (such as multi-component interventions, interventions targeting the population and individual level, equity considerations 30 ), consider presenting a logic model (sometimes referred to as a conceptual framework or theory of change) to visually display the hypothesised relationship between intervention components and outcomes. 31 32

Example of item 3 of PRISMA 2020 checklist.

“To contain widespread infection and to reduce morbidity and mortality among health-care workers and others in contact with potentially infected people, jurisdictions have issued conflicting advice about physical or social distancing. Use of face masks with or without eye protection to achieve additional protection is debated in the mainstream media and by public health authorities, in particular the use of face masks for the general population; moreover, optimum use of face masks in health-care settings, which have been used for decades for infection prevention, is facing challenges amid personal protective equipment (PPE) shortages. Any recommendations about social or physical distancing, and the use of face masks, should be based on the best available evidence. Evidence has been reviewed for other respiratory viral infections, mainly seasonal influenza, but no comprehensive review is available of information on SARS-CoV-2 or related betacoronaviruses that have caused epidemics, such as severe acute respiratory syndrome (SARS) or Middle East respiratory syndrome (MERS). We, therefore, systematically reviewed the effect of physical distance, face masks, and eye protection on transmission of SARS-CoV-2, SARS-CoV, and MERS-CoV.” 169

Item 4. Provide an explicit statement of the objective(s) or question(s) the review addresses

Explanation: An explicit and concise statement of the review objective(s) or question(s) will help readers understand the scope of the review and assess whether the methods used in the review (such as eligibility criteria, search methods, data items, and the comparisons used in the synthesis) adequately address the objective(s). Such statements may be written in the form of objectives (“the objectives of the review were to examine the effects of…”) or as questions (“what are the effects of…?”). 31

Provide an explicit statement of all objective(s) or question(s) the review addresses, expressed in terms of a relevant question formulation framework (see Booth et al 33 and Munn et al 34 for various frameworks).

If the purpose is to evaluate the effects of interventions, use the Population, Intervention, Comparator, Outcome (PICO) framework or one of its variants to state the comparisons that will be made.

Example of item 4 of PRISMA 2020 checklist.

“Objectives: To evaluate the benefits and harms of down‐titration (dose reduction, discontinuation, or disease activity‐guided dose tapering) of anti‐tumour necrosis factor-blocking agents (adalimumab, certolizumab pegol, etanercept, golimumab, infliximab) on disease activity, functioning, costs, safety, and radiographic damage compared with usual care in people with rheumatoid arthritis and low disease activity.” 170

Eligibility criteria

Item 5. Specify the inclusion and exclusion criteria for the review and how studies were grouped for the syntheses

Explanation: Specifying the criteria used to decide what evidence was eligible or ineligible in sufficient detail should enable readers to understand the scope of the review and verify inclusion decisions. 35 The PICO framework is commonly used to structure the reporting of eligibility criteria for reviews of interventions. 36 In addition to specifying the review PICO, the intervention, outcome, and population groups that were used in the syntheses need to be identified and defined. 37 For example, in a review examining the effects of psychological interventions for smoking cessation in pregnancy, the authors specified intervention groups (counselling, health education, feedback, incentive-based interventions, social support, and exercise) and the defining components of each group. 38

Specify all study characteristics used to decide whether a study was eligible for inclusion in the review, that is, components described in the PICO framework or one of its variants, 33 34 and other characteristics, such as eligible study design(s) and setting(s) and minimum duration of follow-up.

Specify eligibility criteria with regard to report characteristics, such as year of dissemination, language, and report status (for example, whether reports such as unpublished manuscripts and conference abstracts were eligible for inclusion).

Clearly indicate if studies were ineligible because the outcomes of interest were not measured, or ineligible because the results for the outcome of interest were not reported. Reporting that studies were excluded because they had “no relevant outcome data” is ambiguous and should be avoided. 39

Specify any groups used in the synthesis (such as intervention, outcome, and population groups) and link these to the comparisons specified in the objectives (item #4).

Consider providing rationales for any notable restrictions to study eligibility. For example, authors might explain that the review was restricted to studies published from 2000 onward because that was the year the device was first available.
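The distinction drawn above between outcomes that were not measured and results that were not reported can be recorded explicitly during screening. The Python sketch below shows one possible way to do so. The class names, field names, and reason labels are hypothetical and are not part of PRISMA 2020.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class OutcomeExclusion(Enum):
    """Specific reasons that replace the ambiguous 'no relevant outcome data'."""
    NOT_MEASURED = "outcome of interest was not measured"
    NOT_REPORTED = "outcome was measured but results were not reported"


@dataclass
class StudyRecord:
    study_id: str
    outcome_measured: bool
    result_reported: bool


def outcome_exclusion_reason(study: StudyRecord) -> Optional[OutcomeExclusion]:
    """Return the outcome-related exclusion reason, or None if there is none."""
    if not study.outcome_measured:
        return OutcomeExclusion.NOT_MEASURED
    if not study.result_reported:
        return OutcomeExclusion.NOT_REPORTED
    return None


# Invented screening records for illustration only.
records = [
    StudyRecord("trial-1", outcome_measured=True, result_reported=True),
    StudyRecord("trial-2", outcome_measured=False, result_reported=False),
    StudyRecord("trial-3", outcome_measured=True, result_reported=False),
]
for record in records:
    reason = outcome_exclusion_reason(record)
    print(record.study_id, "-", reason.value if reason else "no outcome-related exclusion")
```

Storing the reason as a distinct value means the two cases can be counted separately when the selection process is reported.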

Example of item 5 of PRISMA 2020 checklist.

“Population: We included randomized controlled trials of adult (age ≥18 years) patients undergoing non-cardiac surgery, excluding organ transplantation surgery (as findings in patients who need immunosuppression may not be generalisable to others).

“Intervention: We considered all perioperative care interventions identified by the search if they were protocolised (therapies were systematically provided to patients according to pre-defined algorithm or plan) and were started and completed during the perioperative pathway (that is, during preoperative preparation for surgery, intraoperative care, or inpatient postoperative recovery). Examples of interventions that we did or did not deem perioperative in nature included long term preoperative drug treatment (not included, as not started and completed during the perioperative pathway) and perioperative physiotherapy interventions (included, as both started and completed during the perioperative pathway). We excluded studies in which the intervention was directly related to surgical technique.

“Outcomes: To be included, a trial had to use a defined clinical outcome relating to postoperative pulmonary complications, such as “pneumonia” diagnosed according to the Centers for Disease Control and Prevention’s definition. Randomized controlled trials reporting solely physiological (for example, lung volumes and flow measurements) or biochemical (for example, lung inflammatory markers) outcomes are valuable but neither patient centric nor necessarily clinically relevant, and we therefore excluded them. We applied no language restrictions. Our primary outcome measure was the incidence of postoperative pulmonary complications, with postoperative pulmonary complications being defined as the composite of any of respiratory infection, respiratory failure, pleural effusion, atelectasis, or pneumothorax…Where a composite postoperative pulmonary complication was not reported, we contacted corresponding authors via email to request additional information, including primary data.” 171

Information sources

Item 6. Specify all databases, registers, websites, organisations, reference lists, and other sources searched or consulted to identify studies. Specify the date when each source was last searched or consulted.

Explanation: Authors should provide a detailed description of the information sources, such as bibliographic databases, registers and reference lists that were searched or consulted, including the dates when each source was last searched, to allow readers to assess the completeness and currency of the systematic review, and facilitate updating. 40 Authors should fully report the “what, when, and how” of the sources searched; the “what” and “when” are covered in item #6, and the “how” is covered in item #7. Further guidance and examples about searching can be found in PRISMA-Search, an extension to the PRISMA statement for reporting literature searches in systematic reviews. 41

Specify the date when each source (such as database, register, website, organisation) was last searched or consulted.

If bibliographic databases were searched, specify for each database its name (such as MEDLINE, CINAHL), the interface or platform through which the database was searched (such as Ovid, EBSCOhost), and the dates of coverage (where this information is provided).

If study registers (such as ClinicalTrials.gov), regulatory databases (such as Drugs@FDA), and other online repositories (such as SIDER Side Effect Resource) were searched, specify the name of each source and any date restrictions that were applied.

If websites, search engines, or other online sources were browsed or searched, specify the name and URL (uniform resource locator) of each source.

If organisations or manufacturers were contacted to identify studies, specify the name of each source.

If individuals were contacted to identify studies, specify the types of individuals contacted (such as authors of studies included in the review or researchers with expertise in the area).

If reference lists were examined, specify the types of references examined (such as references cited in study reports included in the systematic review, or references cited in systematic review reports on the same or a similar topic).

If cited or citing reference searches (also called backwards and forward citation searching) were conducted, specify the bibliographic details of the reports to which citation searching was applied, the citation index or platform used (such as Web of Science), and the date the citation searching was done.

If journals or conference proceedings were consulted, specify the names of each source, the dates covered and how they were searched (such as handsearching or browsing online).
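The reporting elements listed above lend themselves to a simple structured log of sources. The sketch below is illustrative only: the class `SourceRecord`, its fields, and the completeness rules in `missing_details` are assumptions made for this example, covering just a subset of the elements (date last searched, platform for databases, URL for websites). The MEDLINE entry reuses details from the item 7 example; the website entry is invented.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class SourceRecord:
    name: str
    source_type: str                      # "database", "register", "website", ...
    last_searched: Optional[date] = None  # date the source was last searched
    platform: Optional[str] = None        # interface, e.g. Ovid or EBSCOhost
    coverage: Optional[str] = None        # dates of coverage, where provided
    url: Optional[str] = None             # for websites and search engines


def missing_details(source: SourceRecord) -> List[str]:
    """Return the reporting elements (under the assumed rules) still missing."""
    missing = []
    if source.last_searched is None:
        missing.append("date last searched")
    if source.source_type == "database" and not source.platform:
        missing.append("interface or platform")
    if source.source_type == "website" and not source.url:
        missing.append("URL")
    return missing


medline = SourceRecord(
    name="Ovid MEDLINE",
    source_type="database",
    last_searched=date(2013, 8, 29),
    platform="OvidSP",
    coverage="1946 to present",
)
charity_site = SourceRecord(name="Example charity website", source_type="website")

for source in (medline, charity_site):
    print(source.name, "->", missing_details(source))
```

Coverage dates are not checked because they are to be reported only where the database provides them.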

Example of item 6 of PRISMA 2020 checklist.

“On 21 December 2017, MAJ searched 16 health, social care, education, and legal databases, the names and date coverage of which are given in the Table 1 …We also carried out a ‘snowball’ search to identify additional studies by searching the reference lists of publications eligible for full-text review and using Google Scholar to identify and screen studies citing them…On 26 April 2018, we conducted a search of Google Scholar and additional supplementary searches for publications on websites of 10 relevant organisations (including government departments, charities, think-tanks, and research institutes). Full details of these supplementary searches can be found in the Additional file. Finally, we updated the database search on 7 May 2019, and the snowball and additional searches on 10 May 2019 as detailed in the Additional file. We used the same search method, except that we narrowed the searches to 2017 onwards.” 172

The table displays for each database consulted its name (such as MEDLINE), the interface or platform through which the database was searched (such as Ovid), and the dates of coverage (reproduced from Jay et al 172).

Search strategy

Item 7. Present the full search strategies for all databases, registers, and websites, including any filters and limits used.

Explanation: Reporting the full details of all search strategies (such as the full, line by line search strategy as run in each database) should enhance the transparency of the systematic review, improve replicability, and enable a review to be more easily updated. 40 42 Presenting only one search strategy from among several hinders readers’ ability to assess how comprehensive the searchers were and does not provide them with the opportunity to detect any errors. Furthermore, making only one search strategy available limits replication or updating of the searches in the other databases, as the search strategies would need to be reconstructed through adaptation of the one(s) made available. As well as reporting the search strategies, a description of the search strategy development process can help readers judge how far the strategy is likely to have identified all studies relevant to the review’s inclusion criteria. The description of the search strategy development process might include details of the approaches used to identify keywords, synonyms, or subject indexing terms used in the search strategies, or any processes used to validate or peer review the search strategies. Empirical evidence suggests that peer review of search strategies is associated with improvements to search strategies, leading to retrieval of additional relevant records. 43 Further guidance and examples of reporting search strategies can be found in PRISMA-Search. 41

Provide the full line by line search strategy as run in each database with a sophisticated interface (such as Ovid), or the sequence of terms that were used to search simpler interfaces, such as search engines or websites.

Describe any limits applied to the search strategy (such as date or language) and justify these by linking back to the review’s eligibility criteria.

If published approaches such as search filters designed to retrieve specific types of records (for example, filter for randomised trials) 44 or search strategies from other systematic reviews, were used, cite them. If published approaches were adapted—for example, if existing search filters were amended—note the changes made.

If natural language processing or text frequency analysis tools were used to identify or refine keywords, synonyms, or subject indexing terms to use in the search strategy, 45 46 specify the tool(s) used.

If a tool was used to automatically translate search strings for one database to another, 47 specify the tool used.

If the search strategy was validated—for example, by evaluating whether it could identify a set of clearly eligible studies—report the validation process used and specify which studies were included in the validation set. 40

If the search strategy was peer reviewed, report the peer review process used and specify any tool used, such as the Peer Review of Electronic Search Strategies (PRESS) checklist. 48

If the search strategy structure adopted was not based on a PICO-style approach, describe the final conceptual structure and any explorations that were undertaken to achieve it (for example, use of a multi-faceted approach that uses a series of searches, with different combinations of concepts, to capture a complex research question, or use of a variety of different search approaches to compensate for when a specific concept is difficult to define). 40
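Validation against a set of clearly eligible studies, as mentioned above, amounts to checking which members of the validation set appear among the retrieved records. The following is a minimal sketch under stated assumptions: records are matched on a single identifier string, the identifiers shown are invented, and the function name `validate_search` is hypothetical.

```python
from typing import Iterable, List, Tuple


def validate_search(retrieved_ids: Iterable[str],
                    validation_set: Iterable[str]) -> Tuple[List[str], List[str]]:
    """Split the validation set into studies found and missed by the search."""
    retrieved = set(retrieved_ids)
    known = sorted(set(validation_set))
    found = [study_id for study_id in known if study_id in retrieved]
    missed = [study_id for study_id in known if study_id not in retrieved]
    return found, missed


# Invented identifiers standing in for database accession numbers.
retrieved = ["id-101", "id-102", "id-103", "id-250", "id-377"]
known_relevant = ["id-101", "id-103", "id-999"]

found, missed = validate_search(retrieved, known_relevant)
print(f"found {len(found)} of {len(found) + len(missed)}; missed: {missed}")
```

Reporting the missed studies alongside the proportion found tells readers which studies were in the validation set and how the strategy performed against it.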

Example of item 7 of PRISMA 2020 checklist.

Note: the following is an abridged version of an example presented in full in supplementary table S1 on bmj.com.

“MEDLINE(R) In-Process & Other Non-Indexed Citations and Ovid MEDLINE were searched via OvidSP. The database coverage was 1946 to present and the databases were searched on 29 August 2013.

Urinary Bladder, Overactive/

((overactiv$ or over-activ$ or hyperactiv$ or hyper-activ$ or unstable or instability or incontinen$) adj3 bladder$).ti,ab.

(OAB or OABS or IOAB or IOABS).ti,ab.

(urge syndrome$ or urge frequenc$).ti,ab.

((overactiv$ or over-activ$ or hyperactiv$ or hyper-activ$ or unstable or instability) adj3 detrusor$).ti,ab.

Urination Disorders/

exp Urinary Incontinence/

Urinary Bladder Diseases/

(urge$ adj3 incontinen$).ti,ab.

(urin$ adj3 (incontinen$ or leak$ or urgen$ or frequen$)).ti,ab.

(urin$ adj3 (disorder$ or dysfunct$)).ti,ab.

(detrusor$ adj3 (hyperreflexia$ or hyper-reflexia$ or hypertoni$ or hyper-toni$)).ti,ab.

(void$ adj3 (disorder$ or dysfunct$)).ti,ab.

(micturition$ adj3 (disorder$ or dysfunct$)).ti,ab.

exp Enuresis/

(nocturia or nycturia or enuresis).ti,ab.

(mirabegron or betmiga$ or myrbetriq$ or betanis$ or YM-178 or YM178 or 223673-61-8 or “223673618” or MVR3JL3B2V).ti,ab,rn.

exp Electric Stimulation Therapy/

Electric Stimulation/

((sacral or S3) adj3 (stimulat$ or modulat$)).ti,ab.

(neuromodulat$ or neuro-modulat$ or neural modulat$ or electromodulat$ or electro-modulat$ or neurostimulat$ or neuro-stimulat$ or neural stimulat$ or electrostimulat$ or electro-stimulat$).ti,ab.

(InterStim or SNS).ti,ab.

((electric$ or nerve$1) adj3 (stimulat$ or modulat$)).ti,ab.

(electric$ therap$ or electrotherap$ or electro-therap$).ti,ab.

TENS.ti,ab.

exp Electrodes/

electrode$1.ti,ab.

((implant$ or insert$) adj3 pulse generator$).ti,ab.

((implant$ or insert$) adj3 (neuroprosthe$ or neuro-prosthe$ or neural prosthe$)).ti,ab.

PTNS.ti,ab.

(SANS or Stoller Afferent or urosurg$).ti,ab.

(evaluat$ adj3 peripheral nerve$).ti,ab.

exp Botulinum Toxins/

(botulinum$ or botox$ or onabotulinumtoxin$ or 1309378-01-5 or “1309378015”).ti,ab,rn.

randomized controlled trial.pt.

controlled clinical trial.pt.

random$.ti,ab.

placebo.ti,ab.

drug therapy.fs.

trial.ti,ab.

animals/ not humans/

limit 49 to english language

Search strategy development process: Five known relevant studies were used to identify records within databases. Candidate search terms were identified by looking at words in the titles, abstracts and subject indexing of those records. A draft search strategy was developed using those terms and additional search terms were identified from the results of that strategy. Search terms were also identified and checked using the PubMed PubReMiner word frequency analysis tool. The MEDLINE strategy makes use of the Cochrane RCT filter reported in the Cochrane Handbook v5.2. As per the eligibility criteria the strategy was limited to English language studies. The search strategy was validated by testing whether it could identify the five known relevant studies and also three further studies included in two systematic reviews identified as part of the strategy development process. All eight studies were identified by the search strategies in MEDLINE and Embase. The strategy was developed by an information specialist and the final strategies were peer reviewed by an experienced information specialist within our team. Peer review involved proofreading the syntax and spelling and overall structure, but did not make use of the PRESS checklist.” 173

Selection process

Item 8. Specify the methods used to decide whether a study met the inclusion criteria of the review, including how many reviewers screened each record and each report retrieved, whether they worked independently, and, if applicable, details of automation tools used in the process.

Explanation: Study selection is typically a multi-stage process in which potentially eligible studies are first identified from screening titles and abstracts, then assessed through full text review and, where necessary, contact with study investigators. Increasingly, a mix of screening approaches might be applied (such as automation to eliminate records before screening or prioritise records during screening). In addition to automation, authors increasingly have access to screening decisions that are made by people independent of the author team (such as crowdsourcing) (see box 3). Authors should describe in detail the process for deciding how records retrieved by the search were considered for inclusion in the review, to enable readers to assess the potential for errors in selection. 49 50 51 52

Box 3. Study selection methods.

Several approaches to selecting studies exist. Here we comment on the advantages and disadvantages of each.

Assessment of each record by one reviewer— Single screening is an efficient use of time and resources, but there is a higher risk of missing relevant studies. 49 50 51

Assessment of records by more than one reviewer— Double screening can vary from duplicate checking of all records (by two or more reviewers independently) to a second reviewer checking a sample only (for example, a random sample of screened records, or all excluded records). This approach may be more reliable than single screening but at the expense of increased reviewer time, given the time needed to resolve discrepancies. 49 50 51

Priority screening to focus early screening effort on most relevant records— Instead of screening records in year, title, author or random order, machine learning is used to identify relevant studies earlier in the screening process than would otherwise be the case. Priority screening is an iterative process in which the machine continually reassesses unscreened records for relevance. This approach can increase review efficiency by enabling the review team to start on subsequent steps of the review while less relevant records are still being screened. Both single and multiple reviewer assessments can be combined with priority screening. 52 53

Priority screening with the automatic elimination of less relevant records— Once the most relevant records have been identified using priority screening, teams may choose to stop screening based on the assumption that the remaining records are unlikely to be relevant. However, there is a risk of erroneously excluding relevant studies because of uncertainty about when it is safe to stop screening; the balance between efficiency gains and risk tolerance will be review-specific. 52 53

Machine learning classifiers— Machine learning classifiers are statistical models that use training data to rank records according to their relevance. They can be calibrated to achieve a given level of recall, thus enabling reviewers to implement screening rules, such as eliminating records or replacing double with single screening. Because the performance of classifiers is highly dependent on the data used to build them, classifiers should only be used to classify records for which they are designed. 53 54

Previous “known” assessments— Screening decisions for records that have already been manually checked can be reused to exclude the same records from being reassessed, provided the eligibility criteria are the same. For example, groups that maintain registers of controlled trials to facilitate systematic reviews can avoid continually rescreening the same records by matching and then including/excluding those records from further consideration.

Crowdsourcing— Crowdsourcing involves recruiting (usually via the internet) a large group of individuals to contribute to a task or project, such as screening records. If crowdsourcing is integrated with other study selection approaches, the specific platforms used should have well established and documented agreement algorithms, and data on crowd accuracy and reliability. 55 56
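Calibrating a classifier to a given level of recall, as described in the box, can be illustrated with a small threshold calculation. This is a sketch under assumptions rather than a description of any particular tool: the relevance scores and labels are invented, the function `threshold_for_recall` is hypothetical, and recall achieved on a labelled calibration set may not carry over to records unlike those the classifier was built on.

```python
import math
from typing import Sequence


def threshold_for_recall(scores: Sequence[float],
                         labels: Sequence[bool],
                         target_recall: float) -> float:
    """Return the highest score cut-off that keeps at least `target_recall`
    of the relevant records in a labelled calibration set.

    Records scoring at or above the returned threshold are kept for
    screening; records below it would be eliminated.
    """
    relevant = sorted((s for s, y in zip(scores, labels) if y), reverse=True)
    if not relevant:
        raise ValueError("calibration set contains no relevant records")
    needed = max(1, math.ceil(target_recall * len(relevant)))
    return relevant[needed - 1]


# Invented calibration data: classifier score and human relevance label.
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2]
labels = [True, True, False, True, False, True]

cutoff = threshold_for_recall(scores, labels, target_recall=0.75)
kept = [s for s in scores if s >= cutoff]
print(f"cut-off {cutoff}: {len(kept)} kept, {len(scores) - len(kept)} eliminated")
```

In this toy set a 75% recall target eliminates two records, one of which was relevant, which illustrates why the thresholds applied should be reported.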

Essential elements for systematic reviews regardless of the selection processes used

Report how many reviewers screened each record (title/abstract) and each report retrieved, whether multiple reviewers worked independently (that is, were unaware of each other’s decisions) at each stage of screening or not (for example, records screened by one reviewer and exclusions verified by another), and any processes used to resolve disagreements between screeners (for example, referral to a third reviewer or by consensus).

Report any processes used to obtain or confirm relevant information from study investigators.

If abstracts or articles required translation into another language to determine their eligibility, report how these were translated (for example, by asking a native speaker or by using software programs).
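Where two reviewers screen independently, the disagreements to be resolved can be identified mechanically before referral to a third reviewer or discussion. The sketch below assumes each reviewer's decisions are stored as a mapping from a record identifier to an include (`True`) or exclude (`False`) decision; the function name `compare_decisions` and the record identifiers are hypothetical.

```python
from typing import Dict, List, Tuple


def compare_decisions(reviewer_a: Dict[str, bool],
                      reviewer_b: Dict[str, bool]) -> Tuple[Dict[str, bool], List[str]]:
    """Return agreed decisions and the record IDs needing resolution."""
    if reviewer_a.keys() != reviewer_b.keys():
        raise ValueError("both reviewers must screen the same records")
    agreed: Dict[str, bool] = {}
    disputed: List[str] = []
    for record_id in sorted(reviewer_a):
        if reviewer_a[record_id] == reviewer_b[record_id]:
            agreed[record_id] = reviewer_a[record_id]
        else:
            disputed.append(record_id)
    return agreed, disputed


# Invented decisions from two independent screeners.
first = {"rec1": True, "rec2": False, "rec3": True}
second = {"rec1": True, "rec2": True, "rec3": True}

agreed, disputed = compare_decisions(first, second)
print("refer for resolution:", disputed)
```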

Essential elements for systematic reviews using automation tools in the selection process

Report how automation tools were integrated within the overall study selection process; for example, whether records were excluded based solely on a machine assessment or whether machine assessments were used to double-check human decisions.

If an externally derived machine learning classifier was applied (such as Cochrane RCT Classifier), either to eliminate records or to replace a single screener, include a reference or URL to the version used. If the classifier was used to eliminate records before screening, report the number eliminated in the PRISMA flow diagram as “Records marked as ineligible by automation tools.”

If an internally derived machine learning classifier was used to assist with the screening process, identify the software/classifier and version, describe how it was used (such as to remove records or replace a single screener) and trained (if relevant), and what internal or external validation was done to understand the risk of missed studies or incorrect classifications. For example, authors might state that the classifier was trained on the set of records generated for the review in question (as may be the case when updating reviews) and specify which thresholds were applied to remove records.

If machine learning algorithms were used to prioritise screening (whereby unscreened records are continually re-ordered based on screening decisions), state the software used and provide details of any screening rules applied (for example, screening stopped altogether leaving some records to be excluded based on automated assessment alone, or screening switched from double to single screening once a pre-specified number or proportion of consecutive records was eliminated).
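One screening rule mentioned above, stopping once a pre-specified number of consecutive records has been excluded, can be sketched as follows. This is a simplification: real priority screening re-orders unscreened records as decisions accumulate, whereas this sketch assumes a fixed ranking. The function `screen_with_stopping_rule`, the callable standing in for the human screener, and the record identifiers are all hypothetical.

```python
from typing import Callable, List, Sequence, Tuple


def screen_with_stopping_rule(ranked_records: Sequence[str],
                              is_relevant: Callable[[str], bool],
                              max_consecutive_excluded: int) -> Tuple[List[str], int, int]:
    """Screen records in ranked order and stop after a run of exclusions.

    Returns (included records, number screened, number left unscreened).
    """
    included: List[str] = []
    screened = 0
    run_of_exclusions = 0
    for record in ranked_records:
        screened += 1
        if is_relevant(record):           # stands in for a human decision
            included.append(record)
            run_of_exclusions = 0
        else:
            run_of_exclusions += 1
            if run_of_exclusions >= max_consecutive_excluded:
                break
    return included, screened, len(ranked_records) - screened


ranked = ["r1", "r2", "r3", "r4", "r5", "r6", "r7"]
truly_relevant = {"r1", "r3", "r7"}

included, screened, unscreened = screen_with_stopping_rule(
    ranked, lambda record: record in truly_relevant, max_consecutive_excluded=3
)
print(included, screened, unscreened)
```

Here screening stops after six records and the relevant record ranked seventh is never seen, which is the kind of risk that reporting the stopping rule allows readers to judge.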

Essential elements for systematic reviews using crowdsourcing or previous “known” assessments in the selection process

If crowdsourcing was used to screen records, provide details of the platform used and specify how it was integrated within the overall study selection process.

If datasets of already-screened records were used to eliminate records retrieved by the search from further consideration, briefly describe the derivation of these datasets. For example, if prior work has already determined that a given record does not meet the eligibility criteria, it can be removed without manual checking. This is the case for Cochrane’s Screen4Me service, in which an increasingly large dataset of records that are known not to represent randomised trials can be used to eliminate any matching records from further consideration.

Example of item 8 of PRISMA 2020 checklist.

“Three researchers (AP, HB-R, FG) independently reviewed titles and abstracts of the first 100 records and discussed inconsistencies until consensus was obtained. Then, in pairs, the researchers independently screened titles and abstracts of all articles retrieved. In case of disagreement, consensus on which articles to screen full-text was reached by discussion. If necessary, the third researcher was consulted to make the final decision. Next, two researchers (AP, HB-R) independently screened full-text articles for inclusion. Again, in case of disagreement, consensus was reached on inclusion or exclusion by discussion and if necessary, the third researcher (FG) was consulted.” 174

For examples of systematic reviews using automation tools, crowdsourcing, or previous “known” assessments in the selection process, see supplementary table S1 on bmj.com

Data collection process

Item 9. Specify the methods used to collect data from reports, including how many reviewers collected data from each report, whether they worked independently, any processes for obtaining or confirming data from study investigators, and, if applicable, details of automation tools used in the process.

Explanation: Authors should report the methods used to collect data from reports of included studies, to enable readers to assess the potential for errors in the data presented. 57 58 59

Report how many reviewers collected data from each report, whether multiple reviewers worked independently or not (for example, data collected by one reviewer and checked by another), 60 and any processes used to resolve disagreements between data collectors.

Report any processes used to obtain or confirm relevant data from study investigators (such as how they were contacted, what data were sought, and success in obtaining the necessary information).

If any automation tools were used to collect data, report how the tool was used (such as machine learning models to extract sentences from articles relevant to the PICO characteristics), 61 62 how the tool was trained, and what internal or external validation was done to understand the risk of incorrect extractions.

If articles required translation into another language to enable data collection, report how these articles were translated (for example, by asking a native speaker or by using software programs). 63

If any software was used to extract data from figures, 64 specify the software used.

If any decision rules were used to select data from multiple reports corresponding to a study, and any steps were taken to resolve inconsistencies across reports, report the rules and steps used. 65

Example of item 9 of PRISMA 2020 checklist.

“We designed a data extraction form based on that used by Lumley 2009, which two review authors (RC and TC) used to extract data from eligible studies. Extracted data were compared, with any discrepancies being resolved through discussion. RC entered data into Review Manager 5 software (Review Manager 2014), double checking this for accuracy. When information regarding any of the above was unclear, we contacted authors of the reports to provide further details.” 175

Item 10a. List and define all outcomes for which data were sought. Specify whether all results that were compatible with each outcome domain in each study were sought (for example, for all measures, time points, analyses), and, if not, the methods used to decide which results to collect.

Explanation: Defining outcomes in systematic reviews generally involves specifying outcome domains (such as pain, quality of life, adverse events such as nausea) and the time frame of measurement (such as less than six months). 37 Included studies may report multiple results that are eligible for inclusion within the review outcome definition. 66 67 For example, a study may report results for two measures of pain (such as the McGill Pain Questionnaire and the Brief Pain Inventory), at two time points (such as four weeks and eight weeks), all of which are compatible with a review outcome defined as “pain <6 months.” Multiple results compatible with an outcome domain in a study might also arise when study investigators report results based on multiple analysis populations (such as all participants randomised, all participants receiving a specific amount of treatment), methods for handling missing data (such as multiple imputation, last-observation-carried-forward), or methods for handling confounding (such as adjustment for different covariates). 67 68 69

Reviewers might seek all results that were compatible with each outcome definition from each study or use a process to select a subset of the results. 65 69 Examples of processes to select results include selecting the outcome definition that (a) was most common across studies, (b) the review authors considered “best” according to a prespecified hierarchy (for example, which prioritises measures included in a core outcome measurement set), or (c) the study investigators considered most important (such as the study’s primary outcome). It is important to specify the methods that were used to select the results when multiple results were available so that users are able to judge the appropriateness of those methods and whether there is potential for bias in the selection of results.
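A prespecified hierarchy of the kind described in option (b) can be written down as an explicit selection rule. The sketch below reuses the pain example from the explanation above; the ordering of the two measures, the preferred time point of eight weeks, the 26 week approximation of “<6 months,” and the names `Result` and `select_result` are assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass
class Result:
    measure: str
    weeks: int   # time point of measurement, in weeks


def select_result(results: Sequence[Result],
                  hierarchy: List[str],
                  preferred_weeks: int,
                  max_weeks: int = 26) -> Optional[Result]:
    """Pick one result: highest-ranked measure first, then the time point
    closest to the preferred one, within the outcome's time frame."""
    eligible = [r for r in results
                if r.measure in hierarchy and r.weeks < max_weeks]
    if not eligible:
        return None
    return min(eligible,
               key=lambda r: (hierarchy.index(r.measure),
                              abs(r.weeks - preferred_weeks)))


# Hypothetical hierarchy; a real review would prespecify and justify it.
hierarchy = ["Brief Pain Inventory", "McGill Pain Questionnaire"]
reported = [
    Result("McGill Pain Questionnaire", 4),
    Result("McGill Pain Questionnaire", 8),
    Result("Brief Pain Inventory", 4),
    Result("Brief Pain Inventory", 8),
]

chosen = select_result(reported, hierarchy, preferred_weeks=8)
print(chosen)
```

Writing the rule out in this way makes it straightforward to report exactly how one result was chosen when a study offers several.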

Reviewers may make changes to the inclusion or definition of the outcome domains or to the importance given to them in the review (for example, an outcome listed as “important” in the protocol is considered “critical” in the review). Providing a rationale for the change allows readers to assess the legitimacy of the change and whether it has potential to introduce bias in the review process. 70

List and define the outcome domains and time frame of measurement for which data were sought.

Specify whether all results that were compatible with each outcome domain in each study were sought, and, if not, what process was used to select results within eligible domains.

If any changes were made to the inclusion or definition of the outcome domains or to the importance given to them in the review, specify the changes, along with a rationale.

If any changes were made to the processes used to select results within eligible outcome domains, specify the changes, along with a rationale.

Consider specifying which outcome domains were considered the most important for interpreting the review’s conclusions (such as “critical” versus “important” outcomes) and provide rationale for the labelling (such as “a recent core outcome set identified the outcomes labelled ‘critical’ as being the most important to patients”).

Example of item 10a of PRISMA 2020 checklist.

“Eligible outcomes were broadly categorised as follows:

Cognitive function

Global cognitive function

Domain-specific cognitive function (especially domains that reflect specific alcohol-related neuropathologies, such as psychomotor speed and working memory)

Clinical diagnoses of cognitive impairment

Mild cognitive impairment (also referred to as mild neurocognitive disorders)

Any measure of cognitive function was eligible for inclusion. The tests or diagnostic criteria used in each study should have had evidence of validity and reliability for the assessment of mild cognitive impairment, but studies were not excluded on this basis…Results could be reported as an overall test score that provides a composite measure across multiple areas of cognitive ability (i.e. global cognitive function), sub-scales that provide a measure of domain-specific cognitive function or cognitive abilities (such as processing speed, memory), or both…Studies with a minimum follow-up of 6 months were eligible, a time frame chosen to ensure that studies were designed to examine more persistent effects of alcohol consumption…No restrictions were placed on the number of points at which the outcome was measured, but the length of follow-up and number of measurement points (including a baseline measure of cognition) was considered when interpreting study findings and in deciding which outcomes were similar enough to combine for synthesis.

We anticipated that individual studies would report data for multiple cognitive outcomes. Specifically, a single study may report results:

For multiple constructs related to cognitive function, for example, global cognitive function and cognitive ability on specific domains (e.g. memory, attention, problem-solving, language);

Using multiple methods or tools to measure the same or similar outcome, for example reporting measures of global cognitive function using both the Mini-Mental State Examination and the Montreal Cognitive Assessment;

At multiple time points, for example, at 1, 5, and 10 years.

Where multiple cognition outcomes were reported, we selected one outcome for inclusion in analyses and for reporting the main outcomes (e.g. for GRADEing), choosing the result that provided the most complete information for analysis. Where multiple results remained, we listed all available outcomes (without results) and asked our content expert to independently rank these based on relevance to the review question, and the validity and reliability of the measures used. Measures of global cognitive function were prioritised, followed by measures of memory, then executive function. In the circumstance where results from multiple multivariable models were presented, we extracted associations from the most fully adjusted model, except in the case where an analysis adjusted for a possible intermediary along the causal pathway (i.e. post-baseline measures of prognostic factors (e.g. smoking, drug use, hypertension)).” 176

Item 10b. List and define all other variables for which data were sought (such as participant and intervention characteristics, funding sources). Describe any assumptions made about any missing or unclear information.

Explanation: Authors should report the data and information collected from the studies so that readers can understand the type of the information sought and to inform data collection in other similar reviews. Variables of interest might include characteristics of the study (such as countries, settings, number of centres, funding sources, registration status), characteristics of the study design (such as randomised or non-randomised), characteristics of participants (such as age, sex, socioeconomic status), number of participants enrolled and included in analyses, the results (such as summary statistics, estimates of effect and measures of precision, factors adjusted for in analyses), and competing interests of study authors. For reviews of interventions, authors may also collect data on characteristics of the interventions (such as what interventions and comparators were delivered, how they were delivered, by whom, where, and for how long).

List and define all other variables for which data were sought. It may be sufficient to report a brief summary of information collected if the data collection and dictionary forms are made available (for example, as additional files or deposited in a publicly available repository).

Describe any assumptions made about any missing or unclear information from the studies. For example, in a study that includes “children and adolescents,” for which the investigators did not specify the age range, authors might assume that the oldest participants would be 18 years, based on what was observed in similar studies included in the review, and should report that assumption.

If a tool was used to inform which data items to collect (such as the Tool for Addressing Conflicts of Interest in Trials (TACIT) 71 72 or a tool for recording intervention details 73 74 75 ), cite the tool used.

Example of item 10b of PRISMA 2020 checklist.

“We collected data on:

the report: author, year, and source of publication;

the study: sample characteristics, social demography, and definition and criteria used for depression;

the participants: stroke sequence (first ever vs recurrent), social situation, time elapsed since stroke onset, history of psychiatric illness, current neurological status, current treatment for depression, and history of coronary artery disease;

the research design and features: sampling mechanism, treatment assignment mechanism, adherence, non‐response, and length of follow up;

the intervention: type, duration, dose, timing, and mode of delivery.” 177

Study risk of bias assessment

Item 11. Specify the methods used to assess risk of bias in the included studies, including details of the tool(s) used, how many reviewers assessed each study and whether they worked independently, and, if applicable, details of automation tools used in the process.

Explanation: Users of reviews need to know the risk of bias in the included studies to appropriately interpret the evidence. Numerous tools have been developed to assess study limitations for various designs. 76 However, many tools have been criticised because of their content (which may extend beyond assessing study limitations that have the potential to bias findings) and the way in which the items are combined (such as scales where items are combined to yield a numerical score) (see box 4 ). 72 Reporting details of the selected tool enables readers to assess whether the tool focuses solely on items that have the potential to bias findings. Reporting details of how studies were assessed (such as by one or two authors) allows readers to assess the potential for errors in the assessments. 58 Reporting how risk of bias assessments were incorporated into the analysis is addressed in Items #13e and #13f.

Box 4. Assessment of risk of bias in studies and bias due to missing results.

Terminology.

The terms “quality assessment” and “critical appraisal” are often used to describe the process of evaluating the methodological conduct or reporting of studies. 76 In PRISMA 2020, we distinguish “quality” from “risk of bias” and have focused the relevant items and elaborations on the latter. Risk of bias refers to the potential for study findings to systematically deviate from the truth due to methodological flaws in the design, conduct or analysis. 72 Quality is not well defined, but has been shown to encompass constructs beyond those that may bias the findings, including, for example, imprecision, reporting completeness, ethics, and applicability. 77 78 79 In systematic reviews, focus should be given to the design, conduct, and analysis features that may lead to important bias in the findings.

Different types of risk of bias

In PRISMA 2020, two aspects of risk of bias are considered. The first aspect is risk of bias in the results of the individual studies included in a systematic review. Empirical evidence and theoretical considerations suggest that several features of study design are associated with larger intervention effect estimates in studies; these features include inadequate generation and concealment of a random sequence to assign participants to groups, substantial loss to follow-up of participants, and unblinded outcome assessment. 80

The second aspect is risk of bias in the result of a synthesis (such as meta-analysis) due to missing studies or results within studies. Missing studies/results may introduce bias when the decision to publish a study/result is influenced by the observed P value or magnitude or direction of the effect. 81 For example, studies with statistically non-significant results may not have been submitted for publication (publication bias), or particular results that were statistically non-significant may have been omitted from study reports (selective non-reporting bias). 82 83

Tools for assessing risk of bias

Many tools have been developed to assess the risk of bias in studies 76 78 79 or bias due to missing results. 84 Existing tools typically take the form of composite scales and domain-based tools. 78 85 Composite scales include multiple items which each have a numeric score attached, from which an overall summary score might be calculated. Domain-based tools require users to judge risk of bias within specific domains, and to record the information on which each judgment was based. 72 86 87 Specifying the components/domains in the tool used in the review can help readers determine whether the tool focuses on risk of bias only or addresses other “quality” constructs. Presenting assessments for each component/domain in the tool is preferable to reporting a single “quality score” because it enables users to understand the specific components/domains that are at risk of bias in each study.

Incorporating assessments of risk of bias in studies into the analysis

The risk of bias in included studies should be considered in the presentation and interpretation of results of individual studies and syntheses. Different analytic strategies may be used to examine whether the risks of bias of the studies may influence the study results: (i) restricting the primary analysis to studies judged to be at low risk of bias (sensitivity analysis); (ii) stratifying studies according to risk of bias using subgroup analysis or meta-regression; or (iii) adjusting the result from each study in an attempt to remove the bias. Further details about each approach are available elsewhere. 72

Specify the tool(s) (and version) used to assess risk of bias in the included studies.

Specify the methodological domains/components/items of the risk of bias tool(s) used.

Report whether an overall risk of bias judgment that summarised across domains/components/items was made, and if so, what rules were used to reach an overall judgment.

If any adaptations to an existing tool to assess risk of bias in studies were made (such as omitting or modifying items), specify the adaptations.

If a new risk of bias tool was developed for use in the review, describe the content of the tool and make it publicly accessible.

Report how many reviewers assessed risk of bias in each study, whether multiple reviewers worked independently (such as assessments performed by one reviewer and checked by another), and any processes used to resolve disagreements between assessors.

If an automation tool was used to assess risk of bias in studies, report how the automation tool was used (such as machine learning models to extract sentences from articles relevant to risk of bias 88 ), how the tool was trained, and details on the tool’s performance and internal validation.

Example of item 11 of PRISMA 2020 checklist.

“We assessed risk of bias in the included studies using the revised Cochrane ‘Risk of bias’ tool for randomised trials (RoB 2.0) (Higgins 2016a), employing the additional guidance for cluster-randomised and cross-over trials (Eldridge 2016; Higgins 2016b). RoB 2.0 addresses five specific domains: (1) bias arising from the randomisation process; (2) bias due to deviations from intended interventions; (3) bias due to missing outcome data; (4) bias in measurement of the outcome; and (5) bias in selection of the reported result. Two review authors independently applied the tool to each included study, and recorded supporting information and justifications for judgements of risk of bias for each domain (low; high; some concerns). Any discrepancies in judgements of risk of bias or justifications for judgements were resolved by discussion to reach consensus between the two review authors, with a third review author acting as an arbiter if necessary. Following guidance given for RoB 2.0 (Section 1.3.4) (Higgins 2016a), we derived an overall summary 'Risk of bias' judgement (low; some concerns; high) for each specific outcome, whereby the overall RoB for each study was determined by the highest RoB level in any of the domains that were assessed.” 178
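The rule quoted above, in which the overall judgement takes the highest risk of bias level reached in any assessed domain, can be sketched in a few lines. This is a minimal illustration only, not the algorithm published with the tool; the function name `overall_rob`, the `LEVELS` ordering, and the domain labels are hypothetical.

```python
# Ordering of judgements from least to most concerning (assumed for illustration).
LEVELS = {"low": 0, "some concerns": 1, "high": 2}

def overall_rob(domain_judgements):
    """Return the most severe judgement recorded across the domains."""
    if not domain_judgements:
        raise ValueError("at least one domain judgement is required")
    return max(domain_judgements.values(), key=LEVELS.__getitem__)

# Hypothetical assessment of one outcome in one study.
study = {
    "randomisation process": "low",
    "deviations from intended interventions": "some concerns",
    "missing outcome data": "low",
    "measurement of the outcome": "low",
    "selection of the reported result": "low",
}
print(overall_rob(study))  # some concerns
```

Writing the rule down in this form makes it easy for readers to check that the reported overall judgements follow from the domain-level judgements.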

Effect measures

Item 12. Specify for each outcome the effect measure(s) (such as risk ratio, mean difference) used in the synthesis or presentation of results.

Explanation: To interpret a synthesised or study result, users need to know what effect measure was used. Effect measures refer to statistical constructs that compare outcome data between two groups. For instance, a risk ratio is an example of an effect measure that might be used for dichotomous outcomes. 89 The chosen effect measure has implications for interpretation of the findings and might affect the meta-analysis results (such as heterogeneity 90 ). Authors might use one effect measure to synthesise results and then re-express the synthesised results using another effect measure. For example, for meta-analyses of standardised mean differences, authors might re-express the combined results in units of a well known measurement scale, and for meta-analyses of risk ratios or odds ratios, authors might re-express results in absolute terms (such as risk difference). 91 Furthermore, authors need to interpret effect estimates in relation to whether the effect is of importance to decision makers. For a particular outcome and effect measure, this requires specification of thresholds (or ranges) used to interpret the size of effect (such as minimally important difference; ranges for no/trivial, small, moderate, and large effects). 91

Specify for each outcome or type of outcome (such as binary, continuous) the effect measure(s) (such as risk ratio, mean difference) used in the synthesis or presentation of results.

State any thresholds or ranges used to interpret the size of effect (such as minimally important difference; ranges for no/trivial, small, moderate, and large effects) and the rationale for these thresholds.

If synthesised results were re-expressed to a different effect measure, report the methods used to re-express results (such as meta-analysing risk ratios and computing an absolute risk reduction based on an assumed comparator risk).

Consider providing justification for the choice of effect measure. For example, a standardised mean difference may have been chosen because multiple instruments or scales were used across studies to measure the same outcome domain (such as different instruments to assess depression).
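As an illustration of re-expressing a synthesised result, the sketch below converts a risk ratio into an absolute effect per 1000 people for an assumed comparator risk. The function name and all numbers are hypothetical, and the calculation assumes that the risk ratio is constant across comparator risks.

```python
def absolute_effect_per_1000(risk_ratio, assumed_comparator_risk):
    """Risk difference per 1000 people implied by a risk ratio.

    Negative values mean fewer events with the intervention.
    """
    if not 0.0 <= assumed_comparator_risk <= 1.0:
        raise ValueError("assumed comparator risk must be a proportion in [0, 1]")
    if risk_ratio < 0:
        raise ValueError("risk ratio cannot be negative")
    intervention_risk = assumed_comparator_risk * risk_ratio
    return 1000 * (intervention_risk - assumed_comparator_risk)

# Hypothetical pooled risk ratio 0.80 (95% CI 0.70 to 0.92), comparator risk 25%.
for rr in (0.80, 0.70, 0.92):
    print(rr, round(absolute_effect_per_1000(rr, 0.25)))
```

Applying the same conversion to the confidence limits of the risk ratio gives an interval for the absolute effect, which is why the assumed comparator risk and its source should be reported.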

Example of item 12 of PRISMA 2020 checklist.

“We planned to analyse dichotomous outcomes by calculating the risk ratio (RR) of a successful outcome (i.e. improvement in relevant variables) for each trial…Because the included resilience‐training studies used different measurement scales to assess resilience and related constructs, we used standardised mean difference (SMD) effect sizes (Cohen's d) and their 95% confidence intervals (CIs) for continuous data in pair‐wise meta‐analyses.” 179

Synthesis methods

Item 13a. Describe the processes used to decide which studies were eligible for each synthesis (such as tabulating the study intervention characteristics and comparing against the planned groups for each synthesis (item #5)).

Explanation: Before undertaking any statistical synthesis (item #13d), decisions must be made about which studies are eligible for each planned synthesis (item #5). These decisions will likely involve subjective judgments that could alter the result of a synthesis, yet the processes used and information to support the decisions are often absent from reviews. Reporting the processes (whether formal or informal) and any supporting information is recommended for transparency of the decisions made in grouping studies for synthesis. Structured approaches may involve the tabulation and coding of the main characteristics of the populations, interventions, and outcomes. 92 For example, in a review examining the effects of psychological interventions for smoking cessation in pregnancy, the main intervention component of each study was coded as one of the following based on pre-specified criteria: counselling, health education, feedback, incentive-based interventions, social support, and exercise. 38 This coding provided the basis for determining which studies were eligible for each planned synthesis (such as incentive-based interventions versus usual care). Similar coding processes can be applied to populations and outcomes.

Describe the processes used to decide which studies were eligible for each synthesis.
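A structured approach of the kind described in the explanation above could be recorded as follows. This is a hedged sketch: the study identifiers, the coded categories, and the planned comparisons are invented for illustration, and the function `allocate` is hypothetical.

```python
from collections import defaultdict

# Hypothetical coded studies: (study id, main intervention component, comparator).
coded_studies = [
    ("Study A", "counselling", "usual care"),
    ("Study B", "incentive-based", "usual care"),
    ("Study C", "incentive-based", "usual care"),
    ("Study D", "health education", "counselling"),
]

# Planned syntheses, as they might be pre-specified in a protocol (item #5).
planned_syntheses = [
    ("counselling", "usual care"),
    ("incentive-based", "usual care"),
]

def allocate(studies, planned):
    """Map each planned comparison to its eligible study ids; collect the rest."""
    eligible = defaultdict(list)
    unallocated = []
    for study_id, intervention, comparator in studies:
        key = (intervention, comparator)
        if key in planned:
            eligible[key].append(study_id)
        else:
            unallocated.append(study_id)
    return dict(eligible), unallocated

eligible, unallocated = allocate(coded_studies, planned_syntheses)
print(eligible)
print(unallocated)
```

Listing the studies that matched no planned synthesis is as informative as listing those that did, because it shows readers where judgment was exercised.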

Example of item 13a of PRISMA 2020 checklist.

“Given the complexity of the interventions being investigated, we attempted to categorize the included interventions along four dimensions: (1) was housing provided to the participants as part of the intervention; (2) to what degree was the tenants’ residence in the provided housing dependent on, for example, sobriety, treatment attendance, etc.; (3) if housing was provided, was it segregated from the larger community, or scattered around the city; and (4) if case management services were provided as part of the intervention, to what degree of intensity. We created categories of interventions based on the above dimensions:

Case management only

Abstinence-contingent housing

Non-abstinence-contingent housing

Housing vouchers

Residential treatment with case management

Some of the interventions had multiple components (e.g. abstinence-contingent housing with case management). These interventions were categorized according to the main component (the component that the primary authors emphasized). They were also placed in separate analyses. We then organized the studies according to which comparison intervention was used (any of the above interventions, or usual services).” 180

Item 13b. Describe any methods required to prepare the data for presentation or synthesis, such as handling of missing summary statistics or data conversions

Explanation: Authors may need to prepare the data collected from studies so that it is suitable for presentation or to be included in a synthesis. This could involve algebraic manipulation to convert reported statistics to required statistics (such as converting standard errors to standard deviations), 89 transforming effect estimates (such as converting standardised mean differences to odds ratios 93 ), or imputing missing summary data (such as missing standard deviations for continuous outcomes, intra-cluster correlations in cluster randomised trials). 94 95 96 Reporting the methods required to prepare the data will allow readers to judge the appropriateness of the methods used and the assumptions made and aid in attempts to replicate the synthesis.

Report any methods required to prepare the data collected from studies for presentation or synthesis, such as handling of missing summary statistics or data conversions.
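Two common conversions of this kind are sketched below: recovering a standard deviation from the standard error of a group mean, and from a confidence interval for a group mean. The function names and numbers are hypothetical, and the confidence interval conversion assumes a normal approximation, so it is suitable for large groups only.

```python
import math
from statistics import NormalDist

def sd_from_se(se, n):
    """Standard deviation of a group from the standard error of its mean."""
    return se * math.sqrt(n)

def sd_from_ci(lower, upper, n, level=0.95):
    """Standard deviation from a confidence interval for a group mean.

    Uses the normal approximation, so it suits large groups only.
    """
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return math.sqrt(n) * (upper - lower) / (2 * z)

# Hypothetical group of 100 participants.
print(round(sd_from_se(1.5, 100), 2))
print(round(sd_from_ci(10.0, 16.0, 100), 2))
```

Stating which conversion was applied to which study lets readers judge whether the underlying assumption was reasonable.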

Example of item 13b of PRISMA 2020 checklist.

“We used cluster-adjusted estimates from cluster randomised controlled trials (c-RCTs) where available. If the studies had not adjusted for clustering, we attempted to adjust their standard errors using the methods described in the Cochrane Handbook for Systematic Reviews of Interventions (Higgins 2019), using an estimate of the intra-cluster correlation coefficient (ICC) derived from the trial. If the trial did not report the cluster-adjusted estimated or the ICC, we imputed an ICC from a similar study included in the review, adjusting if the nature or size of the clusters was different (e.g. households compared to classrooms). We assessed any imputed ICCs using sensitivity analysis.” 181
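An approximate adjustment of the kind described in this example inflates the standard error by the square root of the design effect, 1 + (m - 1) × ICC, where m is the average cluster size. The sketch below illustrates the calculation and a simple sensitivity analysis over imputed ICC values; the function names and numbers are hypothetical and are not taken from the cited review.

```python
import math

def design_effect(mean_cluster_size, icc):
    """Design effect for a cluster randomised trial: 1 + (m - 1) * ICC."""
    if mean_cluster_size < 1:
        raise ValueError("mean cluster size must be at least 1")
    if not 0.0 <= icc <= 1.0:
        raise ValueError("ICC must lie in [0, 1]")
    return 1 + (mean_cluster_size - 1) * icc

def inflate_se(se, mean_cluster_size, icc):
    """Standard error adjusted for clustering."""
    return se * math.sqrt(design_effect(mean_cluster_size, icc))

# Hypothetical trial: unadjusted SE 0.10, average cluster size 21.
for icc in (0.01, 0.05, 0.10):  # range of imputed ICCs for sensitivity analysis
    print(icc, round(inflate_se(0.10, 21, icc), 4))
```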

Item 13c. Describe any methods used to tabulate or visually display results of individual studies and syntheses

Explanation: Presentation of study results using tabulation and visual display is important for transparency (particularly so for reviews or outcomes within reviews where a meta-analysis has not been undertaken) and facilitates the identification of patterns in the data. Tables may be used to present results from individual studies or from a synthesis (such as Summary of Findings table 97 98; see item #22). The purpose of tabulating data varies but commonly includes the complete and transparent reporting of the results or comparing the results across study characteristics. 28 Different purposes will likely lead to different table structures. Reporting the chosen structure(s), along with details of the data presented (such as effect estimates), can aid users in understanding the basis and rationale for the structure (such as, “Tables have been structured by outcome domain, within which studies are ordered from low to high risk of bias to increase the prominence of the most trustworthy evidence.”).

The principal graphical method for meta-analysis is the forest plot, which displays the effect estimates and confidence intervals of each study and often the summary estimate. 99 100 Similar to tabulation, ordering the studies in the forest plot based on study characteristics (such as by size of the effect estimate, year of publication, study weight, or overall risk of bias) rather than alphabetically (as is often done) can reveal patterns in the data. 101 Other graphs that aim to display information about the magnitude or direction of effects might be considered when a forest plot cannot be used due to incompletely reported effect estimates (such as no measure of precision reported). 28 102 Careful choice and design of graphs is required so that they effectively and accurately represent the data. 99

Report chosen tabular structure(s) used to display results of individual studies and syntheses, along with details of the data presented.

Report chosen graphical methods used to visually display results of individual studies and syntheses.

If studies are ordered or grouped within tables or graphs based on study characteristics (such as by size of the study effect, year of publication), consider reporting the basis for the chosen ordering/grouping.

If non-standard graphs were used, consider reporting the rationale for selecting the chosen graph.
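The ordering described in the explanation above (rows grouped by outcome domain and, within each domain, ordered from low to high risk of bias) can be made explicit and reproducible with a sort key. The rows, field names, and `ROB_ORDER` mapping below are hypothetical.

```python
# Ordering of judgements from low to high risk of bias (assumed for illustration).
ROB_ORDER = {"low": 0, "some concerns": 1, "high": 2}

# Hypothetical table rows, one per study result.
rows = [
    {"study": "Study A", "domain": "memory", "rob": "high"},
    {"study": "Study B", "domain": "global cognition", "rob": "some concerns"},
    {"study": "Study C", "domain": "memory", "rob": "low"},
    {"study": "Study D", "domain": "global cognition", "rob": "low"},
]

# Group by outcome domain, then order by risk of bias, then by study label.
ordered = sorted(rows, key=lambda r: (r["domain"], ROB_ORDER[r["rob"]], r["study"]))

for row in ordered:
    print(row["domain"], row["rob"], row["study"], sep=" | ")
```

The same key can be reused to order studies in a forest plot so that tables and figures present the evidence consistently.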

Example of item 13c of PRISMA 2020 checklist.

“Meta-analyses could not be undertaken due to the heterogeneity of interventions, settings, study designs and outcome measures. Albatross plots were created to provide a graphical overview of the data for interventions with more than five data points for an outcome. Albatross plots are a scatter plot of p-values against the total number of individuals in each study. Small p-values from negative associations appear at the left of the plot, small p-values from positive associations at the right, and studies with null results towards the middle. The plot allows p-values to be interpreted in the context of the study sample size; effect contours show a standardised effect size (expressed as relative risk—RR) for a given p-value and study size, providing an indication of the overall magnitude of any association. We estimated an overall magnitude of association from these contours, but this should be interpreted cautiously.” 182

Item 13d. Describe any methods used to synthesise results and provide a rationale for the choice(s). If meta-analysis was performed, describe the model(s), method(s) to identify the presence and extent of statistical heterogeneity, and software package(s) used

Explanation: Various statistical methods are available to synthesise results, the most common of which is meta-analysis of effect estimates (see box 5 ). Meta-analysis is used to synthesise effect estimates across studies, yielding a summary estimate. Different meta-analysis models are available, with the random-effects and fixed-effect models being in widespread use. Model choice can importantly affect the summary estimate and its confidence interval; hence the rationale for the selected model should be provided (see box 5 ). For random-effects models, many methods are available, and their performance has been shown to differ depending on the characteristics of the meta-analysis (such as the number and size of the included studies 113 114 ).

Box 5. Meta-analysis and its extensions.

Meta-analysis is a statistical technique used to synthesise results when study effect estimates and their variances are available, yielding a quantitative summary of results. 103 The method facilitates interpretation that would otherwise be difficult to achieve if, for example, a narrative summary of each result was presented, particularly as the number of studies increases. Furthermore, meta-analysis increases the chance of detecting a clinically important effect as statistically significant, if it exists, and increases the precision of the estimated effect. 104

Meta-analysis models and methods

The summary estimate is a weighted average of the study effect estimates, where the study weights are determined primarily by the meta-analysis model. The two most common meta-analysis models are the “fixed-effect” and “random-effects” models. 103 The assumption underlying the fixed-effect model is that there is one true (common) intervention effect and that the observed differences in results across studies reflect random variation only. This model is sometimes referred to as the “common-effects” or “equal-effects” model. 103 A fixed-effect model can also be interpreted under a different assumption, that the true intervention effects are different and unrelated. This model is referred to as the “fixed-effects” model. 105 The random-effects model assumes that there is not one true intervention effect but, rather, a distribution of true intervention effects and that the observed differences in results across studies reflect real differences in the effects of an intervention. 104 The random-effects and fixed-effects models are similar in that they assume the true intervention effects are different, but they differ in that the random-effects model assumes the effects are related through a distribution, whereas the fixed-effects model does not make this assumption.
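To make the weighted average concrete, the sketch below computes a fixed-effect (common-effect) summary using inverse-variance weights, with a Wald-type confidence interval. It is an illustration under stated assumptions rather than a replacement for established software: the function name and data are hypothetical, and the inputs are assumed to be on an additive scale (such as log risk ratios).

```python
import math
from statistics import NormalDist

def fixed_effect(estimates, std_errors, level=0.95):
    """Inverse-variance weighted average of study effect estimates."""
    if not estimates or len(estimates) != len(std_errors):
        raise ValueError("need matching, non-empty estimates and standard errors")
    weights = [1 / se ** 2 for se in std_errors]  # weight = 1 / variance
    total = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, estimates)) / total
    pooled_se = math.sqrt(1 / total)
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return pooled, pooled_se, (pooled - z * pooled_se, pooled + z * pooled_se)

# Hypothetical log risk ratios and their standard errors from three studies.
pooled, pooled_se, ci = fixed_effect([-0.2, -0.4, 0.0], [0.1, 0.2, 0.2])
print(round(pooled, 3), round(pooled_se, 3), tuple(round(x, 3) for x in ci))
```

In this example the first study receives two thirds of the total weight because its variance is a quarter of that of each of the other studies.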

Many considerations may influence an author’s choice of meta-analysis model. For example, their choice may be based on the clinical and methodological diversity of the included studies and the expectation that the underlying intervention effects will differ (potentially leading to selection of a random-effects model) or concern about small-study effects (the tendency for smaller studies to show different effects to larger ones, 106 potentially leading to fitting of both a random-effects and fixed-effect model). Sometimes authors select a model based on the heterogeneity statistics observed (for example, switch from a fixed-effect to a random-effects model if the I² statistic was >50%). 107 However, this practice is strongly discouraged.

There are different methods available to assign weights in fixed-effect or random-effects meta-analyses (such as Mantel-Haenszel, inverse-variance). 103 For random-effects meta-analyses, there are also different ways to estimate the between-study variance (such as DerSimonian and Laird, restricted maximum likelihood (REML)) and calculate the confidence interval for the summary effect (such as Wald-type confidence interval, Hartung-Knapp-Sidik-Jonkman 108 ). Readers are referred to Deeks et al 103 for further information on how to select a particular meta-analysis model and method.
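The sketch below shows one of the combinations named above: the DerSimonian and Laird estimator of the between-study variance with a Wald-type confidence interval for the summary effect. It is a minimal illustration with hypothetical names and data; other estimators (such as REML) and other interval methods (such as Hartung-Knapp-Sidik-Jonkman) are not implemented here and can give different results.

```python
import math
from statistics import NormalDist

def random_effects_dl(estimates, std_errors, level=0.95):
    """Random-effects summary: DerSimonian and Laird tau^2, Wald-type interval."""
    k = len(estimates)
    if k < 2 or k != len(std_errors):
        raise ValueError("need at least two studies with matching inputs")
    variances = [se ** 2 for se in std_errors]

    # Fixed-effect (inverse-variance) quantities needed for Cochran's Q.
    w = [1 / v for v in variances]
    fixed = sum(wi * yi for wi, yi in zip(w, estimates)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, estimates))

    # DerSimonian and Laird moment estimator, truncated at zero.
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (k - 1)) / c)

    # Random-effects weights add tau^2 to each within-study variance.
    w_star = [1 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_star, estimates)) / sum(w_star)
    se = math.sqrt(1 / sum(w_star))
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return {"tau2": tau2, "estimate": pooled, "se": se,
            "ci": (pooled - z * se, pooled + z * se)}

# Hypothetical effect estimates with equal standard errors.
result = random_effects_dl([0.0, 0.4, 0.8], [0.1, 0.1, 0.1])
print({key: result[key] for key in ("tau2", "estimate", "se")})
```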

Subgroup analyses, meta-regression, and sensitivity analyses

Extensions to meta-analysis, including subgroup analysis and meta-regression, are available to explore causes of variation of results across studies (that is, statistical heterogeneity). 103 Subgroup analyses involve splitting studies or participant data into subgroups and comparing the effects of the subgroups. Meta-regression is an extension of subgroup analysis that allows for the effect of continuous and categorical variables to be investigated. 109 Authors might use either type of analysis to explore, for example, whether the intervention effect estimate varied with different participant characteristics (such as mild versus severe disease) or intervention characteristics (such as high versus low dose of a drug).

Sensitivity analyses are undertaken to examine the robustness of findings to decisions made during the review process. This involves repeating an analysis but using different decisions from those originally made and informally comparing the findings. 103 For example, sensitivity analyses might have been done to examine the impact on the meta-analysis of including results from conference abstracts that have never been published in full, including studies where most (but not all) participants were in a particular age range, including studies at high risk of bias, or using a fixed-effect versus random-effects meta-analysis model.

Sensitivity analyses differ from subgroup analyses. Sensitivity analyses consist of making informal comparisons between different ways of estimating the same effect, whereas subgroup analyses consist of formally undertaking a statistical comparison across the subgroups. 103

Extensions to meta-analysis that model or account for dependency

In most meta-analyses, effect estimates from independent studies are combined. Standard meta-analysis methods are appropriate for this situation, since an underlying assumption is that the effect estimates are independent. However, standard meta-analysis methods are not appropriate when the effect estimates are correlated. Correlated effect estimates arise when multiple effect estimates from a single study are calculated using some or all of the same participants and are included in the same meta-analysis. For example, this occurs where multiple effect estimates from a multi-arm trial are included in the same meta-analysis, or where effect estimates for multiple outcomes from the same study are included. For this situation, a range of methods are available that appropriately model or account for the dependency of the effect estimates. These methods include multivariate meta-analysis, 110 multilevel models, 111 or robust variance estimation. 112 See Lopez-Lopez for further discussion. 69

When study data are not amenable to meta-analysis of effect estimates, alternative statistical synthesis methods (such as calculating the median effect across studies, combining P values) or structured summaries might be used. 28 115 Additional guidance for reporting alternative statistical synthesis methods is available (see Synthesis Without Meta-analysis (SWiM) reporting guideline 116 ).
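As one illustration of combining P values, the sketch below implements Fisher's method, chosen here only as an example of the approach. It assumes independent studies and one-sided P values all testing the same direction of effect; the function name and P values are hypothetical. The statistic is -2 times the sum of the log P values, referred to a chi-squared distribution with 2k degrees of freedom, whose upper tail has a closed form for even degrees of freedom.

```python
import math

def fisher_combined_p(p_values):
    """Fisher's method: return the chi-squared statistic and the combined P value."""
    k = len(p_values)
    if k == 0 or any(not 0 < p <= 1 for p in p_values):
        raise ValueError("P values must be non-empty and lie in (0, 1]")
    statistic = -2 * sum(math.log(p) for p in p_values)
    half = statistic / 2
    # Upper tail of chi-squared with 2k df: exp(-x/2) * sum_{i<k} (x/2)^i / i!
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half / i
        total += term
    return statistic, math.exp(-half) * total

# Hypothetical one-sided P values from three studies.
statistic, combined = fisher_combined_p([0.04, 0.20, 0.11])
print(round(statistic, 3), round(combined, 4))
```

A combined P value of this kind carries no information about the size of the effect, which is one reason the choice of method should be described and justified.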

Regardless of the chosen synthesis method(s), authors should provide sufficient detail such that readers are able to assess the appropriateness of the selected methods and could reproduce the reported results (with access to the data).

If statistical synthesis methods were used, reference the software, packages, and version numbers used to implement synthesis methods (such as metan in Stata 16, 117 metafor (version 2.1-0) in R 118 ).

If it was not possible to conduct a meta-analysis, describe and justify the synthesis methods (such as combining P values was used because no or minimal information beyond P values and direction of effect was reported in the studies) or summary approach used.

If meta-analysis was done, specify:

the meta-analysis model (fixed-effect, fixed-effects, or random-effects) and provide rationale for the selected model.

the method used (such as Mantel-Haenszel, inverse-variance). 103

any methods used to identify or quantify statistical heterogeneity (such as visual inspection of results, a formal statistical test for heterogeneity, 103 heterogeneity variance (τ²), inconsistency (such as I² 119 ), and prediction intervals 120 ).

If a random-effects meta-analysis model was used, specify:

the between-study (heterogeneity) variance estimator used (such as DerSimonian and Laird, restricted maximum likelihood (REML)).

the method used to calculate the confidence interval for the summary effect (such as Wald-type confidence interval, Hartung-Knapp-Sidik-Jonkman 108 ).

If a Bayesian approach to meta-analysis was used, describe the prior distributions about quantities of interest (such as intervention effect being analysed, amount of heterogeneity in results across studies). 103

If multiple effect estimates from a study were included in a meta-analysis (as may arise, for example, when a study reports multiple outcomes eligible for inclusion in a particular meta-analysis), describe the method(s) used to model or account for the statistical dependency (such as multivariate meta-analysis, multilevel models, or robust variance estimation). 37 69

If a planned synthesis was not considered possible or appropriate, report this and the reason for that decision.

If a random-effects meta-analysis model was used, consider specifying other details about the methods used, such as the method for calculating confidence limits for the heterogeneity variance.

Examples of item 13d of PRISMA 2020 checklist.

Example 1: meta-analysis.

“As the effects of functional appliance treatment were deemed to be highly variable according to patient age, sex, individual maturation of the maxillofacial structures, and appliance characteristics, a random-effects model was chosen to calculate the average distribution of treatment effects that can be expected. A restricted maximum likelihood random-effects variance estimator was used instead of the older DerSimonian-Laird one, following recent guidance. Random-effects 95% prediction intervals were to be calculated for meta-analyses with at least three studies to aid in their interpretation by quantifying expected treatment effects in a future clinical setting. The extent and impact of between-study heterogeneity were assessed by inspecting the forest plots and by calculating the tau-squared and the I-squared statistics, respectively. The 95% CIs (uncertainty intervals) around tau-squared and the I-squared were calculated to judge our confidence about these metrics. We arbitrarily adopted the I-squared thresholds of >75% to be considered as signs of considerable heterogeneity, but we also judged the evidence for this heterogeneity (through the uncertainty intervals) and the localization on the forest plot…All analyses were run in Stata SE 14.0 (StataCorp, College Station, TX) by one author.” 183
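Two of the heterogeneity statistics mentioned in this example can be computed as sketched below: Cochran's Q and the I-squared statistic, here defined as 100% × (Q - df)/Q and truncated at zero. The function name and data are hypothetical; uncertainty intervals and prediction intervals are not implemented in this sketch.

```python
def heterogeneity(estimates, std_errors):
    """Return Cochran's Q, its degrees of freedom, and I-squared (as a percentage)."""
    k = len(estimates)
    if k < 2 or k != len(std_errors):
        raise ValueError("need at least two studies with matching inputs")
    w = [1 / se ** 2 for se in std_errors]  # inverse-variance weights
    fixed = sum(wi * yi for wi, yi in zip(w, estimates)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, estimates))
    df = k - 1
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return q, df, i2

# Hypothetical effect estimates with equal standard errors.
q, df, i2 = heterogeneity([0.0, 0.4, 0.8], [0.1, 0.1, 0.1])
print(round(q, 2), df, round(i2, 2))
```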

Example 2: calculating the median effect across studies

“We based our primary analyses upon consideration of dichotomous process adherence measures (for example, the proportion of patients managed according to evidence-based recommendations). In order to provide a quantitative assessment of the effects associated with reminders without resorting to numerous assumptions or conveying a misleading degree of confidence in the results, we used the median improvement in dichotomous process adherence measures across studies…With each study represented by a single median outcome, we calculated the median effect size and interquartile range across all included studies for that comparison.” 184
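A summary of this kind can be computed as sketched below, assuming each study contributes a single improvement (for example, a difference in percentage points). The function name and data are hypothetical. Quartiles can be calculated in several ways that give slightly different interquartile ranges, so the method used (here the "inclusive" method of Python's statistics module) is an assumption worth reporting.

```python
from statistics import median, quantiles

def median_effect(improvements):
    """Median and interquartile range of one effect per study."""
    if len(improvements) < 2:
        raise ValueError("need at least two studies")
    q1, _, q3 = quantiles(improvements, n=4, method="inclusive")
    return median(improvements), (q1, q3)

# Hypothetical improvements in adherence (percentage points), one per study.
med, iqr = median_effect([2.0, 4.0, 5.0, 7.0, 12.0])
print(med, iqr)  # 5.0 (4.0, 7.0)
```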

Item 13e. Describe any methods used to explore possible causes of heterogeneity among study results (such as subgroup analysis, meta-regression)

Explanation: If authors used methods to explore possible causes of variation of results across studies (that is, statistical heterogeneity) such as subgroup analysis or meta-regression (see box 5 ), they should provide sufficient details so that readers are able to assess the appropriateness of the selected methods and could reproduce the reported results (with access to the data). Such methods might be used to explore whether, for example, participant or intervention characteristics or risk of bias of the included studies explain variation in results.

If methods were used to explore possible causes of statistical heterogeneity, specify the method used (such as subgroup analysis, meta-regression).

If subgroup analysis or meta-regression was performed, specify for each:

which factors were explored, levels of those factors, and which direction of effect modification was expected and why (where possible).

whether analyses were conducted using study-level variables (where each study is included in one subgroup only), within-study contrasts (where data on subsets of participants within a study are available, allowing the study to be included in more than one subgroup), or some combination of the above. 121

how subgroup effects were compared (such as statistical test for interaction for subgroup analyses 103 ).

If other methods were used to explore heterogeneity because data were not amenable to meta-analysis of effect estimates, describe the methods used (such as structuring tables to examine variation in results across studies based on subpopulation, key intervention components, or contextual factors) along with the factors and levels. 28 116

If any analyses used to explore heterogeneity were not pre-specified, identify them as such.

Example of item 13e of PRISMA 2020 checklist.

“Given a sufficient number of trials, we used unadjusted and adjusted mixed-effects meta-regression analyses to assess whether variation among studies in smoking cessation effect size was moderated by tailoring of the intervention for disadvantaged groups. The resulting regression coefficient indicates how the outcome variable (log risk ratio (RR) for smoking cessation) changes when interventions take a socioeconomic-position-tailored versus non-socioeconomic-tailored approach. A statistically significant (p<0.05) coefficient indicates that there is a linear association between the effect estimate for smoking cessation and the explanatory variable. More moderators (study-level variables) can be included in the model, which might account for part of the heterogeneity in the true effects. We pre-planned an adjusted model to include important study covariates related to the intensity and delivery of the intervention (number of sessions delivered (above median vs below median), whether interventions involved a trained smoking cessation specialist (yes vs no), and use of pharmacotherapy in the intervention group (yes vs no). These covariates were included a priori as potential confounders given that programmes tailored to socioeconomic position might include more intervention sessions or components or be delivered by different professionals with varying experience. The regression coefficient estimates how the intervention effect in the socioeconomic-position-tailored subgroup differs from the reference group of non-socioeconomic-position-tailored interventions.” 185
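
To make the role of the regression coefficient concrete, the following sketch fits a meta-regression with a single binary moderator. It is a simplification of the quoted method: it uses fixed-effect (inverse-variance) weighted least squares rather than a mixed-effects model, and it is unadjusted. The function name `meta_regression` and the data are hypothetical.

```python
import math


def meta_regression(effects, variances, moderator):
    """Inverse-variance weighted regression of effect size on one moderator."""
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, moderator)) / sum_w
    y_bar = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, moderator))
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar)
              for wi, xi, yi in zip(w, moderator, effects))
    slope = sxy / sxx                      # regression coefficient
    intercept = y_bar - slope * x_bar      # pooled effect when moderator is 0
    se_slope = math.sqrt(1.0 / sxx)        # within-study variances treated as known
    z = slope / se_slope
    p = math.erfc(abs(z) / math.sqrt(2))   # two-sided P value, normal approximation
    return {"intercept": intercept, "slope": slope, "se": se_slope, "p": p}


# Hypothetical log risk ratios; moderator 1 = tailored, 0 = non-tailored.
log_rr = [0.10, 0.20, 0.30, 0.50]
var = [0.04, 0.04, 0.04, 0.04]
tailored = [1, 1, 0, 0]
fit = meta_regression(log_rr, var, tailored)
print(fit)
```

With a binary moderator the coefficient equals the difference between the two pooled subgroup estimates on the log scale, which is the interpretation given in the final sentence of the quoted example.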

Item 13f. Describe any sensitivity analyses conducted to assess robustness of the synthesised results

Explanation: If authors performed sensitivity analyses to assess robustness of the synthesised results to decisions made during the review process (see box 5), they should provide sufficient details so that readers are able to assess the appropriateness of the analyses and could reproduce the reported results (with access to the data). Ideally, sensitivity analyses should be pre-specified in the protocol, but unexpected issues may emerge during the review process that necessitate their use.

If sensitivity analyses were performed, provide details of each analysis (such as removal of studies at high risk of bias, use of an alternative meta-analysis model).

If any sensitivity analyses were not pre-specified, identify them as such.

Example of item 13f of PRISMA 2020 checklist.

“We conducted sensitivity meta-analyses restricted to trials with recent publication (2000 or later); overall low risk of bias (low risk of bias in all seven criteria); and enrolment of generally healthy women (rather than those with a specific clinical diagnosis). To incorporate trials with zero events in both intervention and control arms (which are automatically dropped from analyses of pooled relative risks), we also did sensitivity analyses for dichotomous outcomes in which we added a continuity correction of 0.5 to zero cells.” 186
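
The continuity correction mentioned in this example can be sketched as follows. The sketch assumes the common convention of adding 0.5 to every cell of a 2×2 table that contains a zero; the quoted review does not state exactly how its correction was applied, so this detail, the function name `risk_ratio`, and the counts are all hypothetical.

```python
import math


def risk_ratio(events_t, n_t, events_c, n_c, correction=0.5):
    """Risk ratio with a 95% CI, correcting tables that contain a zero cell."""
    a, b = events_t, n_t - events_t        # intervention: events, non-events
    c, d = events_c, n_c - events_c        # control: events, non-events
    if 0 in (a, b, c, d):
        # Assumed convention: add the correction to all four cells.
        a, b, c, d = (cell + correction for cell in (a, b, c, d))
    rr = (a / (a + b)) / (c / (c + d))
    se_log = math.sqrt(1 / a - 1 / (a + b) + 1 / c - 1 / (c + d))
    lower = math.exp(math.log(rr) - 1.96 * se_log)
    upper = math.exp(math.log(rr) + 1.96 * se_log)
    return rr, (lower, upper)


# A trial with zero events in both arms: uncorrected, the risk ratio is 0/0.
rr_zero, ci_zero = risk_ratio(0, 50, 0, 50)
# A trial with events in both arms is left unchanged.
rr_plain, ci_plain = risk_ratio(12, 45, 6, 45)
print(rr_zero, ci_zero)
print(rr_plain, ci_plain)
```

The very wide interval for the double-zero trial shows why such trials carry little weight even when they are retained.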

Reporting bias assessment

Item 14. Describe any methods used to assess risk of bias due to missing results in a synthesis (arising from reporting biases)

Explanation: The validity of a synthesis may be threatened when the available results differ systematically from the missing results. This is known as “bias due to missing results” and arises from “reporting biases” such as selective non-publication and selective non-reporting of results (see box 4). 81 Direct methods for assessing the risk of bias due to missing results include comparing outcomes and analyses pre-specified in study registers, protocols, and statistical analysis plans with results that were available in study reports. Statistical and graphical methods exist to assess whether the observed data suggest potential for missing results (such as contour enhanced funnel plots, Egger’s test) and how robust the synthesis is to different assumptions about the nature of potentially missing results (such as selection models). 84 122 123 124 Tools (such as checklists, scales, or domain-based tools) that prompt users to consider some or all of these approaches are available. 81 84 Authors should therefore report the methods (tools, graphical, statistical, or other) used to assess risk of bias due to missing results so that readers are able to assess how appropriate the methods were. The process by which assessments were conducted should also be reported to enable readers to assess the potential for errors and facilitate replicability.

Specify the methods (tool, graphical, statistical, or other) used to assess the risk of bias due to missing results in a synthesis (arising from reporting biases).

If risk of bias due to missing results was assessed using an existing tool, specify the methodological components/domains/items of the tool, and the process used to reach a judgment of overall risk of bias.

If any adaptations to an existing tool to assess risk of bias due to missing results were made (such as omitting or modifying items), specify the adaptations.

If a new tool to assess risk of bias due to missing results was developed for use in the review, describe the content of the tool and make it publicly accessible.

Report how many reviewers assessed risk of bias due to missing results in a synthesis, whether multiple reviewers worked independently, and any processes used to resolve disagreements between assessors.

If an automation tool was used to assess risk of bias due to missing results, report how the tool was used, how the tool was trained, and details on the tool’s performance and internal validation.

Example of item 14 of PRISMA 2020 checklist.

“To assess small-study effects, we planned to generate funnel plots for meta-analyses including at least 10 trials of varying size. If asymmetry in the funnel plot was detected, we planned to review the characteristics of the trials to assess whether the asymmetry was likely due to publication bias or other factors such as methodological or clinical heterogeneity of the trials. To assess outcome reporting bias, we compared the outcomes specified in trial protocols with the outcomes reported in the corresponding trial publications; if trial protocols were unavailable, we compared the outcomes reported in the methods and results sections of the trial publications.” 187
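
Funnel plot asymmetry of the kind described in this example is often quantified with Egger's regression, which the explanation above mentions. The sketch below is one minimal way to compute the regression intercept; it is not the method of the quoted review, which planned visual inspection of funnel plots. The function name `egger_intercept`, the default minimum of 10 studies (borrowed from the quoted example), and the data are assumptions.

```python
import math


def egger_intercept(effects, std_errors, min_studies=10):
    """Return the Egger regression intercept and its standard error."""
    k = len(effects)
    if k < min_studies:
        raise ValueError(f"need at least {min_studies} studies, got {k}")
    x = [1.0 / s for s in std_errors]                  # precision
    y = [e / s for e, s in zip(effects, std_errors)]   # standardised effect
    x_bar, y_bar = sum(x) / k, sum(y) / k
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar                  # asymmetry measure
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    se_intercept = math.sqrt(rss / (k - 2) * (1.0 / k + x_bar ** 2 / sxx))
    return intercept, se_intercept


# Hypothetical data built so that less precise studies show larger effects.
std_errors = [0.05 * i for i in range(1, 11)]
effects = [0.2 + 1.5 * s for s in std_errors]
intercept, se_intercept = egger_intercept(effects, std_errors)
print(intercept, se_intercept)
```

An intercept near zero is compatible with a symmetric funnel plot. The ratio of the intercept to its standard error would be compared with a t distribution on k − 2 degrees of freedom; that step is left out here because it needs a statistics library.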

Certainty assessment

Item 15. Describe any methods used to assess certainty (or confidence) in the body of evidence for an outcome

Explanation: Authors typically use some criteria to decide how certain (or confident) they are in the body of evidence for each important outcome. Common factors considered include precision of the effect estimate (or sample size), consistency of findings across studies, study design limitations and missing results (risk of bias), and how directly the studies address the question. Tools and frameworks can be used to provide a systematic, explicit approach to assessing these factors and provide a common approach and terminology for communicating certainty. 125 126 127 128 For example, using the GRADE approach, authors will first apply criteria to assess each GRADE domain (imprecision, inconsistency, risk of bias, and so forth) and then make an overall judgment of whether the evidence supporting a result is of high, moderate, low, or very low certainty. Reporting the factors considered and the criteria used to assess each factor enables readers to determine which factors fed into reviewers’ assessment of certainty. Reporting the process by which assessments were conducted enables readers to assess the potential for errors and facilitates replication.

Specify the tool or system (and version) used to assess certainty in the body of evidence.

Report the factors considered (such as precision of the effect estimate, consistency of findings across studies) and the criteria used to assess each factor when assessing certainty in the body of evidence.

Describe the decision rules used to arrive at an overall judgment of the level of certainty (such as high, moderate, low, very low), together with the intended interpretation (or definition) of each level of certainty. 125

If applicable, report any review-specific considerations for assessing certainty, such as thresholds used to assess imprecision and ranges of magnitude of effect that might be considered trivial, moderate or large, and the rationale for these thresholds and ranges (item #12). 129

If any adaptations to an existing tool or system to assess certainty were made, specify the adaptations in sufficient detail that the approach is replicable.

Report how many reviewers assessed the certainty of evidence, whether multiple reviewers worked independently, and any processes used to resolve disagreements between assessors.

Report any processes used to obtain or confirm relevant information from investigators.

If an automation tool was used to support the assessment of certainty, report how the automation tool was used, how the tool was trained, and details on the tool’s performance and internal validation.

Describe methods for reporting the results of assessments of certainty, such as the use of Summary of Findings tables (see item #22).

If standard phrases that incorporate the certainty of evidence were used (such as “hip protectors probably reduce the risk of hip fracture slightly”), 130 report the intended interpretation of each phrase and the reference for the source guidance.

Where a published system is adhered to, it may be sufficient to briefly describe the factors considered and the decision rules for reaching an overall judgment and reference the source guidance for full details of assessment criteria.
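
As an illustration of what an explicit decision rule might look like, the sketch below encodes one deliberately simplified rule: start from an initial level and move down one level per downgrade and up one level per upgrade. This is an assumption made for illustration only; frameworks such as GRADE rely on structured judgment rather than mechanical counting, and the names `LEVELS` and `overall_certainty` are hypothetical.

```python
LEVELS = ["very low", "low", "moderate", "high"]


def overall_certainty(start, downgrades, upgrades=0):
    """Apply a simplified counting rule to reach an overall certainty level."""
    # downgrades maps each domain to the number of levels removed (0, 1, or 2).
    position = LEVELS.index(start) - sum(downgrades.values()) + upgrades
    position = max(0, min(position, len(LEVELS) - 1))   # stay within the scale
    return LEVELS[position]


# Hypothetical judgments for one outcome from randomised trials.
judgments = {"risk of bias": 1, "inconsistency": 0, "indirectness": 0,
             "imprecision": 1, "missing results": 0}
print(overall_certainty("high", judgments))
```

Writing the rule down in this form, whatever rule is actually used, makes it clear to readers which domains drove the final rating.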

Example of item 15 of PRISMA 2020 checklist.

“Two people (AM, JS) independently assessed the certainty of the evidence. We used the five GRADE considerations (study limitations, consistency of effect, imprecision, indirectness, and publication bias) to assess the certainty of the body of evidence as it related to the studies that contributed data to the meta-analyses for the prespecified outcomes. We assessed the certainty of evidence as high, moderate, low, or very low. We considered the following criteria for upgrading the certainty of evidence, if appropriate: large effect, dose-response gradient, and plausible confounding effect. We used the methods and recommendations described in sections 8.5 and 8.7, and chapters 11 and 12, of the Cochrane Handbook for Systematic Reviews of Interventions. We used GRADEpro GDT software to prepare the 'Summary of findings' tables (GRADEpro GDT 2015). We justified all decisions to down- or up-grade the certainty of studies using footnotes, and we provided comments to aid the reader’s understanding of the results where necessary.” 188

Study selection

Item 16a. Describe the results of the search and selection process, from the number of records identified in the search to the number of studies included in the review, ideally using a flow diagram (see fig 1)

Fig 1: PRISMA 2020 flow diagram template for systematic reviews (adapted from flow diagrams proposed by Boers, 131 Mayo-Wilson et al, 65 and Stovold et al 132). The boxes in grey should only be completed if applicable; otherwise they should be removed from the flow diagram. Note that a “report” could be a journal article, preprint, conference abstract, study register entry, clinical study report, dissertation, unpublished manuscript, government report, or any other document providing relevant information.

Explanation: Review authors should report, ideally with a flow diagram (see fig 1), the results of the search and selection process so that readers can understand the flow of retrieved records through to inclusion in the review. Such information is useful for future systematic review teams seeking to estimate resource requirements and for information specialists in evaluating their searches. 133 134 Specifying the number of records yielded per database will make it easier for others to assess whether they have successfully replicated a search. The flow diagram in figure 1 provides a template of the flow of records through the review separated by source, although other layouts may be preferable depending on the information sources consulted. 65

Report, ideally using a flow diagram, the number of: records identified; records excluded before screening (for example, because they were duplicates or deemed ineligible by machine classifiers); records screened; records excluded after screening titles or titles and abstracts; reports retrieved for detailed evaluation; potentially eligible reports that were not retrievable; retrieved reports that did not meet inclusion criteria and the primary reasons for exclusion (such as ineligible study design, ineligible population); and the number of studies and reports included in the review. If applicable, authors should also report the number of ongoing studies and associated reports identified.

If the review is an update of a previous review, report results of the search and selection process for the current review and specify the number of studies included in the previous review. An additional box could be added to the flow diagram indicating the number of studies included in the previous review (see fig 1). 132

If applicable, indicate in the PRISMA flow diagram how many records were excluded by a human and how many by automation tools.

Example of item 16a of PRISMA 2020 checklist.

“We found 1,333 records in databases searching. After duplicates removal, we screened 1,092 records, from which we reviewed 34 full-text documents, and finally included six papers [each cited]. Later, we searched documents that cited any of the initially included studies as well as the references of the initially included studies. However, no extra articles that fulfilled inclusion criteria were found in these searches (a flow diagram is available at https://doi.org/10.1371/journal.pone.0233220).” 189
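
The counts in a flow diagram should add up from one stage to the next, and a small check can catch arithmetic slips before publication. The sketch below uses the four figures stated in the example above (1333, 1092, 34, and 6); the remaining counts are derived by subtraction, and the assumption that every sought report was retrieved is ours, not the authors'. The class name `FlowCounts` and its field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class FlowCounts:
    identified: int
    removed_before_screening: int
    screened: int
    excluded_at_screening: int
    reports_sought: int
    reports_not_retrieved: int
    reports_assessed: int
    reports_excluded: int
    included: int

    def problems(self):
        """Return a description of every stage whose counts do not add up."""
        checks = [
            ("identified - removed before screening = screened",
             self.identified - self.removed_before_screening == self.screened),
            ("screened - excluded at screening = reports sought",
             self.screened - self.excluded_at_screening == self.reports_sought),
            ("reports sought - not retrieved = reports assessed",
             self.reports_sought - self.reports_not_retrieved
             == self.reports_assessed),
            ("reports assessed - reports excluded = included",
             self.reports_assessed - self.reports_excluded == self.included),
        ]
        return [label for label, ok in checks if not ok]


flow = FlowCounts(identified=1333, removed_before_screening=241, screened=1092,
                  excluded_at_screening=1058, reports_sought=34,
                  reports_not_retrieved=0, reports_assessed=34,
                  reports_excluded=28, included=6)
print(flow.problems() or "all stages are consistent")
```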

Item 16b. Cite studies that might appear to meet the inclusion criteria, but which were excluded, and explain why they were excluded

Explanation: Identifying the excluded records allows readers to make an assessment of the validity and applicability of the systematic review. 40 135 At a minimum, a list of studies that might appear to meet the inclusion criteria but which were excluded, with citation and a reason for exclusion, should be reported. This would include studies meeting most inclusion criteria (such as those with appropriate intervention and population but an ineligible control or study design). It is also useful to list studies that were potentially relevant but for which the full text or data essential to inform eligibility were not accessible. This information can be reported in the text or as a list/table in the report or in an online supplement. Potentially contentious exclusions should be clearly stated in the report.

Cite studies that might appear to meet the inclusion criteria, but which were excluded, and explain why they were excluded.

Example of item 16b of PRISMA 2020 checklist.

“We excluded seven studies from our review (Bosiers 2015; ConSeQuent; DEBATE‐ISR; EXCITE ISR; NCT00481780; NCT02832024; RELINE), and we listed reasons for exclusion in the Characteristics of excluded studies tables. We excluded studies because they compared stenting in Bosiers 2015 and RELINE, laser atherectomy in EXCITE ISR, or cutting balloon angioplasty in NCT00481780 versus uncoated balloon angioplasty for in‐stent restenosis. The ConSeQuent trial compared DEB versus uncoated balloon angioplasty for native vessel restenosis rather than in‐stent restenosis. The DEBATE‐ISR study compared a prospective cohort of patients receiving DEB therapy for in‐stent restenosis against a historical cohort of diabetic patients. Finally, the NCT02832024 study compared stent deployment versus atherectomy versus uncoated balloon angioplasty alone for in‐stent restenosis.” 190

Study characteristics

Item 17. Cite each included study and present its characteristics

Explanation: Reporting the details of the included studies allows readers to understand the characteristics of studies that have addressed the review question(s) and is therefore important for understanding the applicability of the review. Characteristics of interest might include study design features, characteristics of participants, how outcomes were ascertained (such as smoking cessation self reported or biochemically validated, or specific harms systematically assessed or reported by participants as they emerged), funding source, and competing interests of study authors. Presenting the key characteristics of each study in a table or figure can facilitate comparison of characteristics across the studies. 92 Citing each study enables retrieval of relevant reports if desired.

For systematic reviews of interventions, presenting an additional table that summarises the intervention details for each study (such as using the template based on the Template for Intervention Description and Replication (TIDieR) 73 ) has several benefits. An intervention summary table helps readers compare the characteristics of the interventions and consider those that may be feasible for implementation in their setting; highlights missing or unavailable details; shows which studies did not specify certain characteristics as part of the intervention; and highlights characteristics that have not been investigated in existing studies. 73 75

Cite each included study.

Present the key characteristics of each study in a table or figure (considering a format that will facilitate comparison of characteristics across the studies).

If the review examines the effects of interventions, consider presenting an additional table that summarises the intervention details for each study.

Example of item 17 of PRISMA 2020 checklist.

In a review examining the association between aspirin use and fracture risk, the authors included a table presenting for each included study the citation, study design, country, sample size, setting, mean age, percentage of females, number of years follow-up, exposure details, and outcomes assessed (table 2). 191

Table 2: The table displays for each included study the citation, study design, country, sample size, setting, mean age, percentage of females, number of years follow-up, exposure details, and outcomes assessed. Reproduced from Barker et al. 191

Risk of bias in studies

Item 18. Present assessments of risk of bias for each included study

Explanation: For readers to understand the internal validity of a systematic review’s results, they need to know the risk of bias in results of each included study. Reporting only summary data (such as “two of eight studies successfully blinded participants”) is inadequate because it fails to inform readers which studies had each particular methodological shortcoming. A more informative approach is to present tables or figures indicating for each study the risk of bias in each domain/component/item assessed (such as blinding of outcome assessors, missing outcome data), so that users can understand what factors led to the overall study-level risk of bias judgment. 72 136

Present tables or figures indicating for each study the risk of bias in each domain/component/item assessed and overall study-level risk of bias.

Present justification for each risk of bias judgment—for example, in the form of relevant quotations from reports of included studies.

If assessments of risk of bias were done for specific outcomes or results in each study, consider displaying risk of bias judgments on a forest plot, next to the study results, so that the limitations of studies contributing to a particular meta-analysis are evident (see Sterne et al 86 for an example forest plot).

Example of item 18 of PRISMA 2020 checklist.

“We used the RoB 2.0 tool to assess risk of bias for each of the included studies. A summary of these assessments is provided in table 3. In terms of overall risk of bias, there were concerns about risk of bias for the majority of studies (20/24), with two of these assessed as at high risk of bias (Musher‐Eizenman 2010; Wansink 2013a). A text summary is provided below for each of the six individual components of the ‘Risk of bias’ assessment. Justifications for assessments are available at the following (https://dx.doi.org/10.6084/m9.figshare.9159824).” 178

Table 3: The table displays for each included study the risk-of-bias judgment for each of six domains of bias, and for the overall risk of bias in two results (selection of a product, consumption of a product); the following is an abridged version of the table presented in the review. Reproduced from Hollands et al. 178

CRCT: cluster-randomised controlled trials. Justifications for assessments are available at the following (https://dx.doi.org/10.6084/m9.figshare.9159824).

Results of individual studies

Item 19. For all outcomes, present for each study (a) summary statistics for each group (where appropriate) and (b) an effect estimate and its precision (such as confidence/credible interval), ideally using structured tables or plots

Explanation: Presenting data from individual studies facilitates understanding of each study’s contribution to the findings and reuse of the data by others seeking to perform additional analyses or perform an update of the review. There are different ways of presenting results of individual studies (such as table, forest plot). 28 115 Visual display of results supports interpretation by readers, while tabulation of the results makes it easier for others to reuse the data.

Displaying summary statistics by group is helpful, because it allows an assessment of the severity of the problem in the studies (such as level of depression symptoms), which is not available from between-group results (that is, effect estimates). 137 However, there are some scenarios where presentation of simple summary statistics for each group may be misleading. For example, in the case of cluster-randomised designs, the observed number of events and sample size in each group does not reflect the effective sample size (that is, the sample size adjusted for correlation among observations). However, providing the estimated proportion of events (or another summary statistic) per group will be helpful. 138 The effect estimates from models that appropriately adjust for clustering (and other design features) should be reported and included in the meta-analysis in such instances.

For all outcomes, irrespective of whether statistical synthesis was undertaken, present for each study summary statistics for each group (where appropriate). For dichotomous outcomes, report the number of participants with and without the events for each group; or the number with the event and the total for each group (such as 12/45). For continuous outcomes, report the mean, standard deviation, and sample size of each group.

For all outcomes, irrespective of whether statistical synthesis was undertaken, present for each study an effect estimate and its precision (such as standard error or 95% confidence/credible interval). For example, for time-to-event outcomes, present a hazard ratio and its confidence interval.

If study-level data are presented visually or reported in the text (or both), also present a tabular display of the results.

If results were obtained from multiple sources (such as journal article, study register entry, clinical study report, correspondence with authors), report the source of the data. This need not be overly burdensome. For example, a statement indicating that, unless otherwise specified, all data came from the primary reference for each included study would suffice. Alternatively, this could be achieved by, for example, presenting the origin of each data point in footnotes, in a column of the data table, or as a hyperlink to relevant text highlighted in reports (such as using SRDR Data Abstraction Assistant 139 ).

If applicable, indicate which results were not reported directly and had to be computed or estimated from other information (see item #13b).
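
For a continuous outcome, the per-group summary statistics recommended above (mean, standard deviation, and sample size) are enough to derive the effect estimate and its precision. The sketch below computes a mean difference with a 95% confidence interval using a normal approximation and unpooled variances, which is one of several reasonable choices. The function name `mean_difference` and the numbers are hypothetical.

```python
import math


def mean_difference(mean_1, sd_1, n_1, mean_2, sd_2, n_2, z=1.96):
    """Mean difference between two groups, its standard error, and a 95% CI."""
    md = mean_1 - mean_2
    se = math.sqrt(sd_1 ** 2 / n_1 + sd_2 ** 2 / n_2)
    return md, se, (md - z * se, md + z * se)


# Hypothetical study: intervention group versus comparator group.
md, se, (lower, upper) = mean_difference(10.0, 4.0, 64, 8.0, 3.0, 36)

# One row of a results table: group summaries first, then the effect estimate.
print("10.0 (SD 4.0), n=64 | 8.0 (SD 3.0), n=36 | "
      f"MD {md:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

Presenting both the group summaries and the derived estimate in the same row lets readers verify the calculation and reuse the data.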

Example of item 19 of PRISMA 2020 checklist.

For an example of individual study results presented for a dichotomous outcome, see figure 2. For an example of individual study results presented for a continuous outcome, see figure 3. 192

Fig 2: The figure displays for each study included in the meta-analysis the summary statistics (number of events and sample size) for the quadruple and triple combination antiretroviral therapies (cART) groups, and the risk ratio and its 95% confidence interval for the dichotomous outcome, undetectable HIV-1 RNA. Reproduced from Feng et al. 192

Fig 3: The figure displays for each study included in the meta-analysis the summary statistics (mean, standard deviation, and sample size) for the quadruple and triple combination antiretroviral therapies (cART) groups, and the mean difference and its 95% confidence interval for the continuous outcome, CD4 T cell count (cells/μL). Reproduced from Feng et al. 192

Results of syntheses

Item 20a. For each synthesis, briefly summarise the characteristics and risk of bias among contributing studies

Explanation: Many systematic review reports include narrative summaries of the characteristics and risk of bias across all included studies. 36 However, such general summaries are not useful when the studies contributing to each synthesis vary, and particularly when there are many studies. For example, one meta-analysis might include three studies of participants aged 30 years on average, whereas another meta-analysis might include 10 studies of participants aged 60 years on average; in this case, knowing the mean age per synthesis is more meaningful than the overall mean age across all 13 studies. Providing a brief summary of the characteristics and risk of bias among studies contributing to each synthesis (meta-analysis or other) should help readers understand the applicability and risk of bias in the synthesised result. Furthermore, a summary at the level of the synthesis is more usable since it obviates the need for readers to refer to multiple sections of the review in order to interpret results. 92

Provide a brief summary of the characteristics and risk of bias among studies contributing to each synthesis (meta-analysis or other). The summary should focus only on study characteristics that help in interpreting the results (especially those that suggest the evidence addresses only a restricted part of the review question, or indirectly addresses the question). If the same set of studies contribute to more than one synthesis, or if the same risk of bias issues are relevant across studies for different syntheses, such a summary need be provided once only.

Indicate which studies were included in each synthesis (such as by listing each study in a forest plot or table or citing studies in the text).

Example of item 20a of PRISMA 2020 checklist.

“Nine randomized controlled trials (RCTs) directly compared delirium incidence between haloperidol and placebo groups [9 studies cited]. These RCTs enrolled 3,408 patients in both surgical and medical intensive care and non-intensive care unit settings and used a variety of validated delirium detection instruments. Five of the trials were low risk of bias [5 studies cited], three had unclear risk of bias [3 studies cited], and one had high risk of bias owing to lack of blinding and allocation concealment [1 study cited]. Intravenous haloperidol was administered in all except two trials; in those two exceptions, oral doses were given [two studies cited]. These nine trials were pooled, as they each identified new onset of delirium (incidence) within the week after exposure to prophylactic haloperidol or placebo.” 193

Item 20b. Present results of all statistical syntheses conducted. If meta-analysis was done, present for each the summary estimate and its precision (such as confidence/credible interval) and measures of statistical heterogeneity. If comparing groups, describe the direction of the effect

Explanation: Users of reviews rely on the reporting of all statistical syntheses conducted so that they have complete and unbiased evidence on which to base their decisions. Studies examining selective reporting of results in systematic reviews have found that 11% to 22% of reviews did not present results for at least one pre-specified outcome of the review. 140 141 142 143

Report results of all statistical syntheses described in the protocol and all syntheses conducted that were not pre-specified.

If meta-analysis was conducted, report for each:

the summary estimate and its precision (such as standard error or 95% confidence/credible interval).

measures of statistical heterogeneity (such as τ², I², prediction interval).

If other statistical synthesis methods were used (such as summarising effect estimates, combining P values), report the synthesised result and a measure of precision (or equivalent information, for example, the number of studies and total sample size).

If the statistical synthesis method does not yield an estimate of effect (such as when P values are combined), report the relevant statistics (such as P value from the statistical test), along with an interpretation of the result that is consistent with the question addressed by the synthesis method (for example, “There was strong evidence of benefit of the intervention in at least one study (P < 0.001, 10 studies)” when P values have been combined). 28

If comparing groups, describe the direction of effect (such as fewer events in the intervention group, or higher pain in the comparator group).

If synthesising mean differences, specify for each synthesis, where applicable, the unit of measurement (such as kilograms or pounds for weight), the upper and lower limits of the measurement scale (for example, anchors range from 0 to 10), direction of benefit (for example, higher scores denote higher severity of pain), and the minimally important difference, if known. If synthesising standardised mean differences and the effect estimate is being re-expressed to a particular instrument, details of the instrument, as per the mean difference, should be reported.

Example of item 20b of PRISMA 2020 checklist.

“Twelve studies, including a total of 159,086 patients, reported on the rate of major bleeding complications. Aspirin use was associated with a 46% relative risk increase of major bleeding complications (risk ratio 1.46; 95% CI, 1.30-1.64; p < 0.00001; I² = 31%; absolute risk increase 0.077%; number needed to treat to harm 1295)” 194
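
Complete reporting of this kind lets readers check that the figures are mutually consistent. The sketch below recovers the standard error of the log risk ratio from the reported confidence interval, derives the corresponding P value under a normal approximation, and recomputes the number needed to treat to harm as the reciprocal of the absolute risk increase. Only the reported values come from the example; the function name `se_from_ci` is hypothetical.

```python
import math
from statistics import NormalDist


def se_from_ci(lower, upper, level=0.95):
    """Standard error on the log scale implied by a ratio's confidence interval."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return (math.log(upper) - math.log(lower)) / (2 * z)


rr, lower, upper = 1.46, 1.30, 1.64           # as reported in the example
se_log_rr = se_from_ci(lower, upper)
z_stat = math.log(rr) / se_log_rr
p_value = math.erfc(abs(z_stat) / math.sqrt(2))   # two-sided P value

absolute_risk_increase = 0.077 / 100          # reported as 0.077%
nnh = 1 / absolute_risk_increase

print(f"SE(log RR) = {se_log_rr:.4f}, z = {z_stat:.2f}, P = {p_value:.1e}")
print(f"number needed to treat to harm = {nnh:.0f}")
```

The recomputed P value is well below the reported threshold of 0.00001. The recomputed number needed to harm comes out slightly above the reported 1295, which would be expected if the published absolute risk increase had been rounded, although the source does not say so.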

Item 20c. Present results of all investigations of possible causes of heterogeneity among study results

Explanation: Presenting results from all investigations of possible causes of heterogeneity among study results is important for users of reviews and for future research. For users, understanding the factors that may, and equally, may not, explain variability in the effect estimates, may inform decision making. Similarly, presenting all results is important for designing future studies. For example, the results may help to generate hypotheses about potential modifying factors that can be tested in future studies, or help identify “active” intervention ingredients that might be combined and tested in a future randomised trial. Selective reporting of the results leads to an incomplete representation of the evidence that risks misdirecting decision making and future research.

If investigations of possible causes of heterogeneity were conducted:

present results regardless of the statistical significance, magnitude, or direction of effect modification.

identify the studies contributing to each subgroup.

report results with due consideration to the observational nature of the analysis and risk of confounding due to other factors. 109 144

If subgroup analysis was conducted, report for each analysis the exact P value for a test for interaction as well as, within each subgroup, the summary estimates, their precision (such as standard error or 95% confidence/credible interval) and measures of heterogeneity. Results from subgroup analyses might usefully be presented graphically (see Fisher et al 121 ).

If meta-regression was conducted, report for each analysis the exact P value for the regression coefficient and its precision.

If informal methods (that is, those that do not involve a formal statistical test) were used to investigate heterogeneity—which may arise particularly when the data are not amenable to meta-analysis—describe the results observed. For example, present a table that groups study results by dose or overall risk of bias and comment on any patterns observed. 116

If subgroup analysis was conducted, consider presenting the estimate for the difference between subgroups and its precision.

If meta-regression was conducted, consider presenting a meta-regression scatterplot with the study effect estimates plotted against the potential effect modifier. 109

Example of item 20c of PRISMA 2020 checklist.

“Among the 4 trials that recruited critically ill patients who were and were not receiving invasive mechanical ventilation at randomization, the association between corticosteroids and lower mortality was less marked in patients receiving invasive mechanical ventilation (ratio of odds ratios (ORs), 4.34 [95% CI, 1.46-12.91]; P = 0.008 based on within-trial estimates combined across trials); however, only 401 patients (120 deaths) contributed to this comparison…All trials contributed data according to age group and sex. For the association between corticosteroids and mortality, the OR was 0.69 (95% CI, 0.51-0.93) among 880 patients older than 60 years, the OR was 0.67 (95% CI, 0.48-0.94) among 821 patients aged 60 years or younger (ratio of ORs, 1.02 [95% CI, 0.63-1.65], P = 0.94), the OR was 0.66 (95% CI, 0.51-0.84) among 1215 men, and the OR was 0.66 (95% CI, 0.43-0.99) among 488 women (ratio of ORs, 1.07 [95% CI, 0.58-1.98], P = 0.84).” 195
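
As a rough illustration of how a difference between subgroups and its precision (here a ratio of odds ratios with a 95% confidence interval and P value) can be derived, the following Python sketch compares the two age subgroups quoted above. It assumes that the subgroups are independent and that standard errors can be recovered from the reported confidence intervals; the cited review instead combined within-trial estimates across trials, so the output (a ratio of about 1.03) is close to, but not identical with, the published 1.02. The function names are our own.

```python
import math
from statistics import NormalDist

Z95 = NormalDist().inv_cdf(0.975)  # about 1.96


def se_from_ci(lower: float, upper: float) -> float:
    """Standard error of a log odds ratio recovered from its 95% CI."""
    return (math.log(upper) - math.log(lower)) / (2 * Z95)


def ratio_of_odds_ratios(or_a, ci_a, or_b, ci_b):
    """Compare two independent subgroup odds ratios on the log scale."""
    diff = math.log(or_a) - math.log(or_b)
    se = math.sqrt(se_from_ci(*ci_a) ** 2 + se_from_ci(*ci_b) ** 2)
    z = diff / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))  # two-sided test for interaction
    return {
        "ratio": math.exp(diff),
        "ci": (math.exp(diff - Z95 * se), math.exp(diff + Z95 * se)),
        "p": p_value,
    }


# Older than 60 years v 60 years or younger, using the figures quoted above
age = ratio_of_odds_ratios(0.69, (0.51, 0.93), 0.67, (0.48, 0.94))
print(age)
```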

Item 20d. Present results of all sensitivity analyses conducted to assess the robustness of the synthesised results

Explanation: Presenting results of sensitivity analyses conducted allows readers to assess how robust the synthesised results were to decisions made during the review process. Reporting results of all sensitivity analyses is important; presentation of a subset, based on the nature of the results, risks introducing bias due to selective reporting. Forest plots are a useful way to present results of sensitivity analyses; however, to preserve readability, these may be best placed in an appendix, with the main forest plots presented in the main report. An exception may be when sensitivity analyses reveal the results are not robust to decisions made during the review process.

If any sensitivity analyses were conducted:

report the results for each sensitivity analysis.

comment on how robust the main analysis was given the results of all corresponding sensitivity analyses.

If any sensitivity analyses were conducted, consider:

presenting results in tables that indicate: (i) the summary effect estimate, a measure of precision (and potentially other relevant statistics, for example, I² statistic) and contributing studies for the original meta-analysis; (ii) the same information for the sensitivity analysis; and (iii) details of the original and sensitivity analysis assumptions.

presenting results of sensitivity analyses visually using forest plots.

Example of item 20d of PRISMA 2020 checklist.

“Sensitivity analyses that removed studies with potential bias showed consistent results with the primary meta-analyses (risk ratio 1.00 for undetectable HIV-1 RNA, 1.00 for virological failure, 0.98 for severe adverse effects, and 1.02 for AIDS defining events; supplement 3E, 3F, 3H, and 3I, respectively). Such sensitivity analyses were not performed for other outcomes because none of the studies reporting them was at a high risk of bias. Sensitivity analysis that pooled the outcome data reported at 48 weeks, which also showed consistent results, was performed for undetectable HIV-1 RNA and increase in CD4 T cell count only (supplement 3J and 3K) and not for other outcomes owing to lack of relevant data. When the standard deviations for increase in CD4 T cell count were replaced by those estimated by different methods, the results of figure 3 either remained similar (that is, quadruple and triple arms not statistically different) or favoured triple therapies (supplement 2).” 192

Risk of reporting biases in syntheses

Item 21. Present assessments of risk of bias due to missing results (arising from reporting biases) for each synthesis assessed

Explanation: Presenting assessments of the risk of bias due to missing results in syntheses allows readers to assess potential threats to the trustworthiness of a systematic review’s results. Providing the evidence used to support judgments of risk of bias allows readers to determine the validity of the assessments.

Present assessments of risk of bias due to missing results (arising from reporting biases) for each synthesis assessed.

If a tool was used to assess risk of bias due to missing results in a synthesis, present responses to questions in the tool, judgments about risk of bias, and any information used to support such judgments to help readers understand why particular judgments were made.

If a funnel plot was generated to evaluate small-study effects (one cause of which is reporting biases), present the plot and specify the effect estimate and measure of precision used in the plot (presented typically on the horizontal axis and vertical axis respectively 106 ). If a contour-enhanced funnel plot was generated, specify the “milestones” of statistical significance that the plotted contour lines represent (P=0.01, 0.05, 0.1, etc). 145

If a test for funnel plot asymmetry was used, report the exact P value observed for the test and potentially other relevant statistics, such as the standardised normal deviate, from which the P value is derived. 106
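
One commonly used test for funnel plot asymmetry is Egger's regression test, in which each study's effect estimate divided by its standard error is regressed on its precision (the inverse of the standard error) and the intercept is tested against zero. The Python sketch below is a minimal illustration under that assumption, not the method of any particular review; the function name and the four input studies are invented purely for demonstration. It returns the intercept, its standard error, the t statistic, and the degrees of freedom, from which an exact P value can be obtained using a t distribution.

```python
import math


def egger_test(effects, std_errors):
    """Egger regression: regress effect/SE on 1/SE and test the intercept."""
    n = len(effects)
    if n < 3 or n != len(std_errors):
        raise ValueError("need at least 3 studies with matching inputs")
    x = [1.0 / se for se in std_errors]                  # precision
    y = [e / se for e, se in zip(effects, std_errors)]   # standardised effect
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    resid_var = rss / (n - 2)
    se_intercept = math.sqrt(resid_var * (1.0 / n + x_bar ** 2 / sxx))
    return {
        "intercept": intercept,
        "se": se_intercept,
        "t": intercept / se_intercept,
        "df": n - 2,
    }


# Hypothetical log effect estimates and standard errors (illustration only)
example = egger_test([0.31, 0.32, 0.50, 0.75], [0.1, 0.2, 0.25, 0.5])
print(example)
```

A report based on such a test would give the exact P value together with the statistic and degrees of freedom shown here.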

If any sensitivity analyses seeking to explore the potential impact of missing results on the synthesis were conducted, present results of each analysis (see item #20d), compare them with results of the primary analysis, and report results with due consideration of the limitations of the statistical method. 123

If studies were assessed for selective non-reporting of results by comparing outcomes and analyses pre-specified in study registers, protocols, and statistical analysis plans with results that were available in study reports, consider presenting a matrix (with rows as studies and columns as syntheses) to present the availability of study results. 124

If an assessment of selective non-reporting of results reveals that some studies are missing from the synthesis, consider displaying the studies with missing results underneath a forest plot or including a table with the available study results (for example, see forest plot in Page et al 81 ).

Example of item 21 of PRISMA 2020 checklist.

“Clinical global impression of change was assessed in Doody 2008, NCT00912288 , CONCERT and CONNECTION using the CIBIC-Plus. However, we were only able to extract results from Doody 2008 [because no results for CIBIC-Plus were reported in the other three studies]…The authors reported small but significant improvements on the CIBIC‐Plus for 183 patients (89 on latrepirdine and 94 on placebo) favouring latrepirdine following the 26‐week primary endpoint (MD −0.60, 95% CI −0.89 to −0.31, P<0.001). Similar results were found at the additional 52‐week follow‐up (MD −0.70, 95% CI −1.01 to −0.39, P<0.001). However, we considered this to be low quality evidence due to imprecision and reporting bias. Thus, we could not draw conclusions about the efficacy of latrepirdine in terms of changes in clinical impression.” 196
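
The availability of results described in this example could be displayed as a matrix with studies as rows and syntheses as columns. The Python sketch below shows one possible layout, using only the four studies and the single outcome measure named in the quotation; the variable and function names are our own, and a real review would have one column per synthesis.

```python
# Studies and synthesis named in the example above
studies = ["Doody 2008", "NCT00912288", "CONCERT", "CONNECTION"]
syntheses = ["CIBIC-Plus"]

# True: results available for the synthesis
# False: outcome assessed but no results reported
availability = {
    "Doody 2008": {"CIBIC-Plus": True},
    "NCT00912288": {"CIBIC-Plus": False},
    "CONCERT": {"CIBIC-Plus": False},
    "CONNECTION": {"CIBIC-Plus": False},
}


def render_matrix(studies, syntheses, availability):
    """Return a plain-text matrix with studies as rows and syntheses as columns."""
    width = max(len(study) for study in studies)
    lines = [" " * width + " | " + " | ".join(syntheses)]
    for study in studies:
        cells = []
        for synthesis in syntheses:
            mark = "yes" if availability[study][synthesis] else "no"
            cells.append(mark.ljust(len(synthesis)))
        lines.append(study.ljust(width) + " | " + " | ".join(cells))
    return "\n".join(lines)


matrix_text = render_matrix(studies, syntheses, availability)
print(matrix_text)
```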

Certainty of evidence

Item 22. Present assessments of certainty (or confidence) in the body of evidence for each outcome assessed

Explanation: An important feature of systems for assessing certainty, such as GRADE, is explicit reporting of both the level of certainty (or confidence) in the evidence and the basis for judgments. 97 98 127 Evidence summary tables, such as GRADE Summary of Findings tables, are an effective and efficient way to report assessments of the certainty of evidence. 97 127 146 147

Report the overall level of certainty in the body of evidence (such as high, moderate, low, or very low) for each important outcome.

Provide an explanation of reasons for rating down (or rating up) the certainty of evidence (such as in footnotes to an evidence summary table). Explanations for each judgment should be concise, informative, relevant to the target audience, easy to understand, and accurate (that is, addressing criteria specified in the methods guidance). 148

Communicate certainty in the evidence wherever results are reported (that is, abstract, evidence summary tables, results, conclusions). Use a format appropriate for the section of the review. For example, in text, certainty might be reported explicitly in a sentence (such as “Moderate-certainty evidence (downgraded for bias) indicates that…”) or in brackets alongside an effect estimate (such as “[RR 1.17, 95% CI 0.81 to 1.68; 4 studies, 1781 participants; moderate certainty evidence]”). When interpreting results in “summary of findings” tables or conclusions, certainty might be communicated implicitly using standard phrases (such as “Hip protectors probably reduce the risk of hip fracture slightly”). 130

Consider including evidence summary tables, such as GRADE Summary of Findings tables.
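
Where many effect estimates are reported, the bracketed format described above can be generated consistently rather than typed by hand. The following Python sketch is one way to do this; the function name and argument order are our own choices, and the values are those of the bracketed example given above.

```python
def format_estimate(measure, estimate, lower, upper, studies, participants, certainty):
    """Build a bracketed effect estimate with its certainty rating."""
    return (
        f"[{measure} {estimate:.2f}, 95% CI {lower:.2f} to {upper:.2f}; "
        f"{studies} studies, {participants} participants; "
        f"{certainty} certainty evidence]"
    )


text = format_estimate("RR", 1.17, 0.81, 1.68, 4, 1781, "moderate")
print(text)
```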

Example of item 22 of PRISMA 2020 checklist.

“Compared with non-operative treatment, low-certainty evidence indicates surgery (repair with subacromial decompression) may have little or no effect on function at 12 months. The evidence was downgraded two steps, once for bias and once for imprecision—the 95% CIs overlap minimal important difference in favour of surgery at this time point.” A summary of findings table presents the same information as the text above, with footnotes explaining judgments. 187

Item 23a. Provide a general interpretation of the results in the context of other evidence

Explanation: Discussing how the results of the review relate to other relevant evidence should help readers interpret the findings. For example, authors might compare the current results to results of other similar systematic reviews (such as reviews that addressed the same question using different methods or that addressed slightly different questions) and explore possible reasons for discordant results. Similarly, authors might summarise additional information relevant to decision makers that was not explored in the review, such as findings of studies evaluating the cost-effectiveness of the intervention or surveys gauging the values and preferences of patients.

Provide a general interpretation of the results in the context of other evidence.

Example of item 23a of PRISMA 2020 checklist.

“Although we need to exercise caution in interpreting these findings because of the small number of studies, these findings nonetheless appear to be largely in line with the recent systematic review on what works to improve education outcomes in low‐ and middle‐income countries of Snilstveit et al. (2012). They found that structured pedagogical interventions may be among the effective approaches to improve learning outcomes in low‐ and middle‐income countries. This is consistent with our findings that teacher training is only effective in improving early grade literacy outcomes when it is combined with teacher coaching. The finding is also consistent with our result that technology in education programs may have at best no effects unless they are combined with a focus on pedagogical practices. In line with our study, Snilstveit et al. (2012) also do not find evidence for statistically significant effects of the one‐laptop‐per‐child program. These results are consistent with the results of a meta‐analysis showing that technology in education programs are not effective when not accompanied by parent or student training (McEwan, 2015). However, neither Snilstveit et al. (2012) nor McEwan (2015) find evidence for negative effects of the one‐laptop‐per‐child program on early grade literacy outcomes.” 197

Item 23b. Discuss any limitations of the evidence included in the review

Explanation: Discussing the completeness, applicability, and uncertainties in the evidence included in the review should help readers interpret the findings appropriately. For example, authors might acknowledge that they identified few eligible studies or studies with a small number of participants, leading to imprecise estimates; have concerns about risk of bias in studies or missing results; or identified studies that only partially or indirectly address the review question, leading to concerns about their relevance and applicability to particular patients, settings, or other target audiences. The assessments of certainty (or confidence) in the body of evidence (item #22) can support the discussion of such limitations.

Discuss any limitations of the evidence included in the review.

Example of item 23b of PRISMA 2020 checklist.

“Study populations were young, and few studies measured longitudinal exposure. The included studies were often limited by selection bias, recall bias, small sample of marijuana-only smokers, reporting of outcomes on marijuana users and tobacco users combined, and inadequate follow-up for the development of cancer…Most studies poorly assessed exposure, and some studies did not report details on exposure, preventing meta-analysis for several outcomes.” 198

Item 23c. Discuss any limitations of the review processes used

Explanation: Discussing limitations, avoidable or unavoidable, in the review process should help readers understand the trustworthiness of the review findings. For example, authors might acknowledge the decision to restrict eligibility to studies in English only, search only a small number of databases, have only one reviewer screen records or collect data, or not contact study authors to clarify unclear information. They might also acknowledge that they were unable to access all potentially eligible study reports or to carry out some of the planned analyses because of insufficient data. 149 150 While some limitations may affect the validity of the review findings, others may not.

Discuss any limitations of the review processes used and comment on the potential impact of each limitation.

Example of item 23c of PRISMA 2020 checklist.

“Because of time constraints…we dually screened only 30% of the titles and abstracts; for the rest, we used single screening. A recent study showed that single abstract screening misses up to 13% of relevant studies (Gartlehner 2020). In addition, single review authors rated risk of bias, conducted data extraction and rated certainty of evidence. A second review author checked the plausibility of decisions and the correctness of data. Because these steps were not conducted dually and independently, we introduced some risk of error…Nevertheless, we are confident that none of these methodological limitations would change the overall conclusions of this review. Furthermore, we limited publications to English and Chinese languages. Because COVID-19 has become a rapidly evolving pandemic, we might have missed recent publications in languages of countries that have become heavily affected in the meantime (e.g. Italian or Spanish).” 199

Item 23d. Discuss implications of the results for practice, policy, and future research

Explanation: There are many potential end users of a systematic review (such as patients, healthcare providers, researchers, insurers, and policy makers), each of whom will want to know what actions they should take given the review findings. Patients and healthcare providers may be primarily interested in the balance of benefits and harms, while policy makers and administrators may value data on organisational impact and resource utilisation. For reviews of interventions, authors might clarify trade-offs between benefits and harms and how the values attached to the most important outcomes of the review might lead different people to make different decisions. In addition, rather than making recommendations for practice or policy that apply universally, authors might discuss factors that are important in translating the evidence to different settings and factors that may modify the magnitude of effects.

Explicit recommendations for future research—as opposed to general statements such as “More research on this question is needed”—can better direct the questions future studies should address and the methods that should be used. For example, authors might consider describing the type of understudied participants who should be enrolled in future studies, the specific interventions that could be compared, suggested outcome measures to use, and ideal study design features to employ.

Discuss implications of the results for practice and policy.

Make explicit recommendations for future research.

Example of item 23d of PRISMA 2020 checklist.

“Implications for practice and policy: Findings from this review indicate that bystander programs have significant beneficial effects on bystander intervention behaviour. This provides important evidence of the effectiveness of mandated programs on college campuses. Additionally, the fact that our (preliminary) moderator analyses found program effects on bystander intervention to be similar for adolescents and college students suggests early implementation of bystander programs (i.e. in secondary schools with adolescents) may be warranted. Importantly, although we found that bystander programs had a significant beneficial effect on bystander intervention behaviour, we found no evidence that these programs had an effect on participants' sexual assault perpetration. Bystander programs may therefore be appropriate for targeting bystander behaviour, but may not be appropriate for targeting the behaviour of potential perpetrators. Additionally, effects of bystander programs on bystander intervention behaviour diminished by 6‐month post‐intervention. Thus, programs effects may be prolonged by the implementation of booster sessions conducted prior to 6 months post‐intervention.

Implications for research: Findings from this review suggest there is a fairly strong body of research assessing the effects of bystander programs on attitudes and behaviours. However, there are a couple of important questions worth further exploration…Our understanding of the causal mechanisms of program effects on bystander behaviour would benefit from further analysis (e.g., path analysis mapping relationships between specific knowledge/attitude effects and bystander intervention)…Our understanding of the differential effects of gendered versus gender neutral programs would benefit from the design and implementation of high-quality primary studies that make direct comparisons between these two types of programs (e.g., RCTs comparing the effects of two active treatment arms that differ in their gendered approach)…Our understanding of bystander programs' generalizability to non-US contexts would be greatly enhanced by high quality research conducted across the world.” 200

Registration and protocol

Item 24a. Provide registration information for the review, including register name and registration number, or state that the review was not registered

Explanation: Stating where the systematic review was registered (such as PROSPERO, Open Science Framework) and the registration number or DOI for the register entry (see box 6) facilitates identification of the systematic review in the register. This allows readers to compare what was pre-specified with what was eventually reported in the review and decide if any deviations may have introduced bias. Reporting registration information also facilitates linking of publications related to the same systematic review (such as when a review is presented at a conference and published in a journal). 154

Box 6. Systematic review registration and protocols.

Registration aims to reduce bias, increase transparency, facilitate scrutiny and improve trustworthiness of systematic reviews. 151 152 Registration also aims to reduce unintended duplication; researchers planning a new review should search register listings to identify similar completed or ongoing reviews before deciding whether their review is needed, noting that planned duplication may be justified. 151

A registration entry captures key elements of the review protocol and is submitted to a host register, ideally before starting the review. The register maintains a permanent public record of this information along with any subsequent amendments (date-stamped) and issues a unique number to link the registration entry to completed review publications. 153 Publicly recording details of inclusion and exclusion criteria, planned outcomes, and syntheses enables peer reviewers, journal editors, and readers to compare the completed review with what was planned, identify any deviations, and decide whether these may have introduced bias.

PROSPERO (www.crd.york.ac.uk/prospero/) currently registers systematic reviews with direct health outcomes. It also accepts systematic reviews of animal studies that have direct implications for human health, and methodology reviews which have direct bearing on human health or systematic review conduct. Reviews not meeting the criteria for inclusion in PROSPERO could be registered elsewhere; for example, in the Open Science Framework (OSF) repository. Both PROSPERO and OSF allow for registration without cost.

A review protocol is distinct from a register entry for a review. A review protocol outlines in detail the pre-planned objectives and methods intended to be used to conduct the review, helping to anticipate/avoid potential problems before embarking on a review and providing a methodical approach to prevent arbitrary decision making during the review process. 22 Systematic reviewers are encouraged to report their protocols in accordance with the PRISMA guidance for protocols (PRISMA-P). 21 PRISMA-P consists of a checklist 21 accompanied by a detailed guidance document providing researchers with a step-by-step approach for documenting a systematic review protocol. 22

A review protocol should be a public document in order to facilitate future purposeful replications or updates of the review and to help future users evaluate whether selective reporting and potential bias were present in the review process. 22 Review protocols can be made public through one of several routes. One option is to upload a PDF of the protocol to the corresponding PROSPERO registration record so they are linked in perpetuity. Another option is to make a protocol a document with its own unique identifier (that is, a DOI) so it can be cited across various documents including the PROSPERO registration record and in the full text of the completed review. To achieve this, reviewers may opt to publish a protocol in a journal that is open access or provides free access to content (such as Systematic Reviews, BMJ Open) or a journal using the Registered Reports publishing framework (https://cos.io/rr/), where it will benefit from external feedback before publication, or deposit a protocol in a general purpose or institutional open access repository (such as Open Science Framework Registries, Zenodo).

Provide registration information for the review, including register name and registration number, or state that the review was not registered.

Example of item 24a of PRISMA 2020 checklist.

“…this systematic review has been registered in the international prospective register of systematic reviews (PROSPERO) under the registration number: CRD42019128569” 201

Item 24b. Indicate where the review protocol can be accessed, or state that a protocol was not prepared

Explanation: The review protocol may contain information about the methods that is not provided in the final review report (see box 6). Providing a citation, DOI, or link to the review protocol allows readers to locate the protocol more easily. Comparison of the methods pre-specified in the review protocol with what was eventually done allows readers to assess whether any deviations may have introduced bias. 155 If the review protocol was not published or deposited in a public repository, or uploaded as a supplementary file to the review report, we recommend providing the contact details of the author responsible for sharing the protocol. If authors did not prepare a review protocol, or prepared one but are not willing to make it accessible, this should be stated to prevent users spending time trying to locate the document.

Indicate where the review protocol can be accessed (such as by providing a citation, DOI, or link) or state that a protocol was not prepared.

Example of item 24b of PRISMA 2020 checklist.

“…this systematic review and meta-analysis protocol has been published elsewhere [citation for the protocol provided].” 202

Item 24c. Describe and explain any amendments to information provided at registration or in the protocol

Explanation: Careful consideration of a review’s methodological and analytical approach early on is likely to lessen unnecessary changes after protocol development. 22 However, it is difficult to anticipate all scenarios that will arise, necessitating some clarifications, modifications, and changes to the protocol (for example, the available data may not be amenable to the planned meta-analysis). 155 156 For reasons of transparency, authors should report details of any amendments. Amendments could be recorded in various places, including the full text of the review, a supplementary file, or as amendments to the published protocol or registration record.

Report details of any amendments to information provided at registration or in the protocol, noting: (a) the amendment itself, (b) the reason for the amendment, and (c) the stage of the review process at which the amendment was implemented.

Example of item 24c of PRISMA 2020 checklist.

“Differences from protocol: We modified the lower limit for age in our eligibility criteria from 12 years of age to 10 years of age because the age of adolescence was reduced. We used the WHO measures for severe anaemia, defined by haemoglobin levels < 80 g/L instead of < 70 g/L as stated in the protocol. We decided to add adverse events to our list of primary outcomes (instead of secondary) and we changed reinfection rate to a secondary outcome.” 203

Item 25. Describe sources of financial or non-financial support for the review, and the role of the funders or sponsors in the review

Explanation: As with any research report, authors should be transparent about the sources of support received to conduct the review. For example, funders may provide salary to researchers to undertake the review, the services of an information specialist to conduct searches, or access to commercial databases that would otherwise not have been available. Authors may have also obtained support from a translation service to translate articles or in-kind use of software to manage or analyse the study data. In some reviews, the funder or sponsor (that is, the individual or organisation assuming responsibility for the initiation and management of the review) may have contributed to defining the review question, determining eligibility of studies, collecting data, analysing data, interpreting results, or approving the final review report. There is potential for bias in the review findings arising from such involvement, particularly when the funder or sponsor has an interest in obtaining a particular result. 157

Describe sources of financial or non-financial support for the review, specifying relevant grant ID numbers for each funder. If no specific financial or non-financial support was received, this should be stated.

Describe the role of the funders or sponsors (or both) in the review. If funders or sponsors had no role in the review, this should be declared—for example, by stating, “The funders had no role in the design of the review, data collection and analysis, decision to publish, or preparation of the manuscript.”

Example of item 25 of PRISMA 2020 checklist.

“Funding/Support: This research was funded under contract HHSA290201500009i, Task Order 7, from the Agency for Healthcare Research and Quality (AHRQ), US Department of Health and Human Services, under a contract to support the US Preventive Services Task Force (USPSTF). Role of the Funder/Sponsor: Investigators worked with USPSTF members and AHRQ staff to develop the scope, analytic framework, and key questions for this review. AHRQ had no role in study selection, quality assessment, or synthesis. AHRQ staff provided project oversight, reviewed the report to ensure that the analysis met methodological standards, and distributed the draft for peer review. Otherwise, AHRQ had no role in the conduct of the study; collection, management, analysis, and interpretation of the data; and preparation, review, or approval of the manuscript findings. The opinions expressed in this document are those of the authors and do not reflect the official position of AHRQ or the US Department of Health and Human Services.” 204

Competing interests

Item 26. Declare any competing interests of review authors

Explanation: Authors of a systematic review may have relationships with organisations or entities with an interest in the review findings (for example, an author may serve as a consultant for a company manufacturing the drug or device under review). 158 Such relationships or activities are examples of a competing interest (or conflict of interest), which can negatively affect the integrity and credibility of systematic reviews. For example, evidence suggests that systematic reviews with financial competing interests more often have conclusions favourable to the experimental intervention than systematic reviews without financial competing interests. 159 Information about authors’ relationships or activities that readers could consider pertinent or to have influenced the review should be disclosed using the format requested by the publishing entity (such as using the International Committee of Medical Journal Editors (ICMJE) disclosure form). 160 Authors should report how competing interests were managed for particular review processes. For example, if a review author was an author of an included study, they may have been prevented from assessing the risk of bias in the study results.

Disclose any of the authors’ relationships or activities that readers could consider pertinent or to have influenced the review.

If any authors had competing interests, report how they were managed for particular review processes.

Example of item 26 of PRISMA 2020 checklist.

“Declarations of interest: R Buchbinder was a principal investigator of Buchbinder 2009. D Kallmes was a principal investigator of Kallmes 2009 and Evans 2015. D Kallmes participated in IDE trial for Benvenue Medical spinal augmentation device. He is a stockholder, Marblehead Medical, LLC, Development of spine augmentation devices. He holds a spinal fusion patent license, unrelated to spinal augmentation/vertebroplasty. R Buchbinder and D Kallmes did not perform risk of bias assessments for their own or any other placebo‐controlled trials included in the review.” 205

Availability of data, code, and other materials

Item 27. Report which of the following are publicly available and where they can be found: template data collection forms; data extracted from included studies; data used for all analyses; analytic code; any other materials used in the review.

Explanation: Sharing of data, analytic code, and other materials enables others to reuse the data, check the data for errors, attempt to reproduce the findings, and understand more about the analysis than may be provided by descriptions of methods. 161 162 Support for sharing of data, analytic code, and other materials is growing, both among patients 163 and among journal editors, including those of BMJ and PLOS Medicine. 164

Sharing of data, analytic code, and other materials relevant to a systematic review includes making various items publicly available, such as the template data collection forms; all data extracted from included studies; a file indicating necessary data conversions; the clean dataset(s) used for all analyses in a format ready for reuse (such as a CSV file); metadata (such as complete descriptions of variable names, README files describing each file shared); and the analytic code used in software with a command-line interface, or complete descriptions of the steps used in point-and-click software, to run all analyses. Other materials might include more detailed information about the intervention delivered in the primary studies that is otherwise not available, such as a video of the specific cognitive behavioural therapy supplied by the study investigators to reviewers. 73 Similarly, other material might include a list of all citations screened and any decisions about eligibility.
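As a rough illustration of what a "clean dataset in a format ready for reuse" accompanied by metadata could look like in practice, the following minimal Python sketch writes extracted data to a CSV file together with a data dictionary describing every variable. It is not part of the PRISMA 2020 guidance; the function name `write_shareable_dataset`, the file names, the variable names, and the example rows are all hypothetical placeholders chosen for illustration.

```python
import csv
from pathlib import Path

# Hypothetical extracted data: placeholder values for illustration only,
# not taken from any real study.
EXTRACTED_ROWS = [
    {"study_id": "StudyA_2015", "arm": "intervention", "n_randomised": 50, "events": 10},
    {"study_id": "StudyA_2015", "arm": "control", "n_randomised": 50, "events": 15},
]

# Data dictionary: one plain-language description per variable, so that
# others can reuse the CSV without guessing what each column means.
DATA_DICTIONARY = {
    "study_id": "Identifier of the included study (hypothetical naming scheme)",
    "arm": "Trial arm: 'intervention' or 'control'",
    "n_randomised": "Number of participants randomised to the arm",
    "events": "Number of participants with the outcome event",
}


def write_shareable_dataset(rows, dictionary, out_dir):
    """Write data.csv and data_dictionary.csv into out_dir and return both paths."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    columns = list(dictionary)

    # Refuse to export any variable that has no description.
    for row in rows:
        undocumented = set(row) - set(columns)
        if undocumented:
            raise ValueError(f"Undocumented variables: {sorted(undocumented)}")

    # The dataset itself, with columns in the order given by the dictionary.
    data_path = out_dir / "data.csv"
    with data_path.open("w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=columns)
        writer.writeheader()
        writer.writerows(rows)

    # The accompanying metadata: variable name plus description.
    dict_path = out_dir / "data_dictionary.csv"
    with dict_path.open("w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["variable", "description"])
        writer.writerows(dictionary.items())

    return data_path, dict_path
```

Calling `write_shareable_dataset(EXTRACTED_ROWS, DATA_DICTIONARY, "shared_materials")` would produce two small files that could be deposited in a public repository alongside a README and the analytic code. Checking that every exported variable is documented is one possible design choice; review teams may prefer other conventions for their metadata.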

Because sharing of data, analytic code, and other materials is not yet universal in health and medical research, 164 even interested authors may not know how to make their materials publicly available. Data, analytic code, and other materials can be uploaded to one of several publicly accessible repositories (such as Open Science Framework, Dryad, figshare). The Systematic Review Data Repository (https://srdr.ahrq.gov/) is another example of a platform for sharing materials specific to the systematic review community. 165 All of these open repositories should be given consideration, particularly if the completed review is to be considered for publication in a paywalled journal. The Findable, Accessible, Interoperable, Reusable (FAIR) data principles are also a useful resource for authors to consult, 166 as they provide guidance on the best way to share information.

There are some situations where authors might not be able to share review materials, such as when the review team are custodians rather than owners of individual participant data, or when there are legal or licensing restrictions. For example, records exported directly from bibliographic databases (such as Ovid MEDLINE) typically include copyrighted material; authors should read the licensing terms of the databases they search to see what they can share, and should consider the copyright legislation of their countries.

Report which of the following are publicly available: template data collection forms; data extracted from included studies; data used for all analyses; analytic code; any other materials used in the review.

If any of the above materials are publicly available, report where they can be found (for example, by providing a link to files deposited in a public repository).

If data, analytic code, or other materials will be made available upon request, provide the contact details of the author responsible for sharing the materials and describe the circumstances under which such materials will be shared.

Example of item 27 of PRISMA 2020 checklist.

“All meta-analytic data and all codebooks and analysis scripts (for Mplus and R) are publicly available at the study’s associated page on the Open Science Framework ( https://osf.io/r8a24/ )...The precise sources (table, section, or paragraph) for each estimate are described in notes in the master data spreadsheet, available on the Open Science Framework page for this study ( https://osf.io/r8a24/ )” 206

Conclusion to PRISMA 2020 explanation and elaboration

This explanation and elaboration paper has been designed to assist authors seeking comprehensive guidance on what to include in systematic review reports. We hope that use of this resource will lead to more transparent, complete, and accurate reporting of systematic reviews, thus facilitating evidence-based decision making.

Acknowledgments

We dedicate this paper to the late Douglas G Altman and Alessandro Liberati, whose contributions were fundamental to the development and implementation of the original PRISMA statement.

Extra material supplied by the author

Further examples of good reporting practice

Contributors: DM and JEM are joint senior authors. MJP, JEM, PMB, IB, TCH, CDM, LS and DM conceived this paper and designed the literature review and survey conducted to inform the guideline content. MJP conducted the literature review, administered the survey and analysed the data for both. MJP prepared all materials for the development meeting. MJP and JEM presented proposals at the development meeting. All authors except for TCH, JMT, EAA, SEB and LAM attended the development meeting. MJP and JEM took and consolidated notes from the development meeting. MJP and JEM led the drafting and editing of the article. JEM, PMB, IB, TCH, LS, JMT, EAA, SEB, RC, JG, AH, TL, EMW, SM, LAM, LAS, JT, ACT, PW and DM drafted particular sections of the article. All authors were involved in revising the article critically for important intellectual content. All authors approved the final version of the article. MJP is the guarantor of this work. The corresponding author attests that all listed authors meet authorship criteria and that no others meeting the criteria have been omitted.

Competing interests: All authors have completed the ICMJE uniform disclosure form at http://www.icmje.org/conflicts-of-interest/ and declare: EL is head of research for the BMJ; MJP is an editorial board member for PLOS Medicine; ACT is an associate editor and MJP, TL, EMW, and DM are editorial board members for the Journal of Clinical Epidemiology; DM and LAS were editors in chief, LS, JMT, and ACT are associate editors, and JG is an editorial board member for Systematic Reviews; none of these authors were involved in the peer review process or decision to publish. TCH has received personal fees from Elsevier outside the submitted work. EMW has received personal fees from the American Journal for Public Health, for which he is the editor for systematic reviews. VW is editor in chief of the Campbell Collaboration, which produces systematic reviews, and co-convenor of the Campbell and Cochrane equity methods group. DM is chair of the EQUATOR Network, IB is adjunct director of the French EQUATOR Centre, and TCH is co-director of the Australasian EQUATOR Centre, which advocate for the use of reporting guidelines to improve the quality of reporting in research articles. JMT received salary from Evidence Partners Inc, creators of DistillerSR software for systematic reviews; Evidence Partners Inc was not involved in the design or outcomes of the statement and the views expressed solely represent those of the author.

Provenance and peer review: Not commissioned; externally peer reviewed.

1. Moher D, Liberati A, Tetzlaff J, Altman DG, PRISMA Group. Preferred reporting items for systematic reviews and meta-analyses: the PRISMA statement. BMJ 2009;339:b2535. doi:10.1136/bmj.b2535
2. Moher D, Liberati A, Tetzlaff J, Altman DG, PRISMA Group. Preferred reporting items for systematic reviews and meta-analyses: the PRISMA statement. J Clin Epidemiol 2009;62:1006-12. doi:10.1016/j.jclinepi.2009.06.005
3. Moher D, Liberati A, Tetzlaff J, Altman DG, PRISMA Group. Preferred reporting items for systematic reviews and meta-analyses: the PRISMA statement. Ann Intern Med 2009;151:264-9, W64. doi:10.7326/0003-4819-151-4-200908180-00135
4. Moher D, Liberati A, Tetzlaff J, Altman DG, PRISMA Group. Preferred reporting items for systematic reviews and meta-analyses: the PRISMA statement. PLoS Med 2009;6:e1000097. doi:10.1371/journal.pmed.1000097
5. Moher D, Liberati A, Tetzlaff J, Altman DG, PRISMA Group. Preferred reporting items for systematic reviews and meta-analyses: the PRISMA statement. Int J Surg 2010;8:336-41. doi:10.1016/j.ijsu.2010.02.007
6. Moher D, Liberati A, Tetzlaff J, Altman DG, PRISMA Group. Preferred reporting items for systematic reviews and meta-analyses: the PRISMA Statement. Open Med 2009;3:e123-30.
7. Moher D, Liberati A, Tetzlaff J, Altman DG, PRISMA Group. Reprint--preferred reporting items for systematic reviews and meta-analyses: the PRISMA statement. Phys Ther 2009;89:873-80. doi:10.1093/ptj/89.9.873
8. Liberati A, Altman DG, Tetzlaff J, et al. The PRISMA statement for reporting systematic reviews and meta-analyses of studies that evaluate health care interventions: explanation and elaboration. J Clin Epidemiol 2009;62:e1-34. doi:10.1016/j.jclinepi.2009.06.006
9. Liberati A, Altman DG, Tetzlaff J, et al. The PRISMA statement for reporting systematic reviews and meta-analyses of studies that evaluate healthcare interventions: explanation and elaboration. BMJ 2009;339:b2700. doi:10.1136/bmj.b2700
10. Liberati A, Altman DG, Tetzlaff J, et al. The PRISMA statement for reporting systematic reviews and meta-analyses of studies that evaluate health care interventions: explanation and elaboration. Ann Intern Med 2009;151:W65-94. doi:10.7326/0003-4819-151-4-200908180-00136
11. Liberati A, Altman DG, Tetzlaff J, et al. The PRISMA statement for reporting systematic reviews and meta-analyses of studies that evaluate health care interventions: explanation and elaboration. PLoS Med 2009;6:e1000100. doi:10.1371/journal.pmed.1000100
12. Liberati A, Altman DG, Tetzlaff J, et al. The PRISMA statement for reporting systematic reviews and meta-analyses of studies that evaluate health care interventions: explanation and elaboration. PLoS Med 2009;6:e1000100. doi:10.1371/journal.pmed.1000100
13. Page MJ, Moher D. Evaluations of the uptake and impact of the Preferred Reporting Items for Systematic reviews and Meta-Analyses (PRISMA) Statement and extensions: a scoping review. Syst Rev 2017;6:263. doi:10.1186/s13643-017-0663-8
14. Tong A, Flemming K, McInnes E, Oliver S, Craig J. Enhancing transparency in reporting the synthesis of qualitative research: ENTREQ. BMC Med Res Methodol 2012;12:181. doi:10.1186/1471-2288-12-181
15. France EF, Cunningham M, Ring N, et al. Improving reporting of meta-ethnography: the eMERGe reporting guidance. BMC Med Res Methodol 2019;19:25. doi:10.1186/s12874-018-0600-0
16. Hutton B, Salanti G, Caldwell DM, et al. The PRISMA extension statement for reporting of systematic reviews incorporating network meta-analyses of health care interventions: checklist and explanations. Ann Intern Med 2015;162:777-84. doi:10.7326/M14-2385
17. Stewart LA, Clarke M, Rovers M, et al. PRISMA-IPD Development Group. Preferred Reporting Items for Systematic Review and Meta-Analyses of individual participant data: the PRISMA-IPD Statement. JAMA 2015;313:1657-65. doi:10.1001/jama.2015.3656
18. Zorzela L, Loke YK, Ioannidis JP, et al. PRISMAHarms Group. PRISMA harms checklist: improving harms reporting in systematic reviews. BMJ 2016;352:i157. doi:10.1136/bmj.i157
19. McInnes MDF, Moher D, Thombs BD, et al. and the PRISMA-DTA Group. Preferred reporting items for a systematic review and meta-analysis of diagnostic test accuracy studies: The PRISMA-DTA statement. JAMA 2018;319:388-96. doi:10.1001/jama.2017.19163
20. Tricco AC, Lillie E, Zarin W, et al. PRISMA extension for scoping reviews (PRISMA-SCR): Checklist and explanation. Ann Intern Med 2018;169:467-73. doi:10.7326/M18-0850
21. Moher D, Shamseer L, Clarke M, et al. PRISMA-P Group. Preferred reporting items for systematic review and meta-analysis protocols (PRISMA-P) 2015 statement. Syst Rev 2015;4:1. doi:10.1186/2046-4053-4-1
22. Shamseer L, Moher D, Clarke M, et al. PRISMA-P Group. Preferred reporting items for systematic review and meta-analysis protocols (PRISMA-P) 2015: elaboration and explanation. BMJ 2015;350:g7647. doi:10.1136/bmj.g7647
23. Page MJ, McKenzie JE, Bossuyt PM, et al. The PRISMA 2020 statement: an updated guideline for reporting systematic reviews. BMJ 2021;372:n71. doi:10.1136/bmj.n71
24. Page MJ, McKenzie JE, Bossuyt PM, et al. Updating guidance for reporting systematic reviews: development of the PRISMA 2020 statement. J Clin Epidemiol 2021;S0895-4356(21)00040-8. doi:10.1016/j.jclinepi.2021.02.003
25. Barnes C, Boutron I, Giraudeau B, Porcher R, Altman DG, Ravaud P. Impact of an online writing aid tool for writing a randomized trial report: the COBWEB (Consort-based WEB tool) randomized controlled trial. BMC Med 2015;13:221. doi:10.1186/s12916-015-0460-y
26. Chauvin A, Ravaud P, Moher D, et al. Accuracy in detecting inadequate research reporting by early career peer reviewers using an online CONSORT-based peer-review tool (COBPeer) versus the usual peer-review process: a cross-sectional diagnostic study. BMC Med 2019;17:205. doi:10.1186/s12916-019-1436-0
27. Higgins JPT, Thomas J, Chandler J, et al, eds. Cochrane Handbook for Systematic Reviews of Interventions. Version 6.0. Cochrane Collaboration, 2019. https://training.cochrane.org/handbook
28. McKenzie JE, Brennan SE. Synthesizing and presenting findings using other methods. In: Higgins JPT, Thomas J, Chandler J, et al, eds. Cochrane Handbook for Systematic Reviews of Interventions. Cochrane Collaboration, 2019. doi:10.1002/9781119536604.ch12
29. Beller EM, Glasziou PP, Altman DG, et al. PRISMA for Abstracts Group. PRISMA for Abstracts: reporting systematic reviews in journal and conference abstracts. PLoS Med 2013;10:e1001419. doi:10.1371/journal.pmed.1001419
30. Welch V, Petticrew M, Petkovic J, et al. PRISMA-Equity Bellagio group. Extending the PRISMA statement to equity-focused systematic reviews (PRISMA-E 2012): explanation and elaboration. J Clin Epidemiol 2016;70:68-89. doi:10.1016/j.jclinepi.2015.09.001
31. Thomas J, Kneale D, McKenzie JE, et al. Determining the scope of the review and the questions it will address. In: Higgins JPT, Thomas J, Chandler J, et al, eds. Cochrane Handbook for Systematic Reviews of Interventions. Cochrane Collaboration, 2019. doi:10.1002/9781119536604.ch2
32. Rehfuess EA, Booth A, Brereton L, et al. Towards a taxonomy of logic models in systematic reviews and health technology assessments: A priori, staged, and iterative approaches. Res Synth Methods 2018;9:13-24. doi:10.1002/jrsm.1254
33. Booth A, Noyes J, Flemming K, Moore G, Tunçalp Ö, Shakibazadeh E. Formulating questions to explore complex interventions within qualitative evidence synthesis. BMJ Glob Health 2019;4(Suppl 1):e001107. doi:10.1136/bmjgh-2018-001107
34. Munn Z, Stern C, Aromataris E, Lockwood C, Jordan Z. What kind of systematic review should I conduct? A proposed typology and guidance for systematic reviewers in the medical and health sciences. BMC Med Res Methodol 2018;18:5. doi:10.1186/s12874-017-0468-4
35. IOM (Institute of Medicine). Finding What Works in Health Care: Standards for Systematic Reviews. The National Academies Press, 2011.
36. Page MJ, Shamseer L, Altman DG, et al. Epidemiology and Reporting Characteristics of Systematic Reviews of Biomedical Research: A Cross-Sectional Study. PLoS Med 2016;13:e1002028. doi:10.1371/journal.pmed.1002028
37. McKenzie JE, Brennan SE, Ryan RE, et al. Defining the criteria for including studies and how they will be grouped for the synthesis. In: Higgins JPT, Thomas J, Chandler J, et al, eds. Cochrane Handbook for Systematic Reviews of Interventions. Cochrane Collaboration, 2019. doi:10.1002/9781119536604.ch3
38. Chamberlain C, O’Mara-Eves A, Porter J, et al. Psychosocial interventions for supporting women to stop smoking in pregnancy. Cochrane Database Syst Rev 2017;2:CD001055. doi:10.1002/14651858.CD001055.pub5
39. Dwan KM, Williamson PR, Kirkham JJ. Do systematic reviews still exclude studies with “no relevant outcome data”? BMJ 2017;358:j3919. doi:10.1136/bmj.j3919
40. Lefebvre C, Glanville J, Briscoe S, et al. Searching for and selecting studies. In: Higgins JPT, Thomas J, Chandler J, et al, eds. Cochrane Handbook for Systematic Reviews of Interventions. Cochrane Collaboration, 2019. doi:10.1002/9781119536604.ch4
41. Rethlefsen ML, Kirtley S, Waffenschmidt S, et al. PRISMA-S Group. PRISMA-S: an extension to the PRISMA statement for reporting literature searches in systematic reviews. Syst Rev 2021;10:39. doi:10.1186/s13643-020-01542-z
42. Faggion CM, Jr, Huivin R, Aranda L, Pandis N, Alarcon M. The search and selection for primary studies in systematic reviews published in dental journals indexed in MEDLINE was not fully reproducible. J Clin Epidemiol 2018;98:53-61. doi:10.1016/j.jclinepi.2018.02.011
43. Spry C, Mierzwinski-Urban M. The impact of the peer review of literature search strategies in support of rapid review reports. Res Synth Methods 2018;9:521-6. doi:10.1002/jrsm.1330
44. ISSG Search Filter Resource. Glanville J, Lefebvre C, Wright K, eds. The InterTASC Information Specialists' Sub-Group. 2020. https://sites.google.com/a/york.ac.uk/issg-search-filters-resource/home
45. Stansfield C, O’Mara-Eves A, Thomas J. Text mining for search term development in systematic reviewing: A discussion of some methods and challenges. Res Synth Methods 2017;8:355-65. doi:10.1002/jrsm.1250
46. Glanville J. Text mining for information specialists. In: Levay P, Craven J, eds. Systematic Searching: Practical ideas for improving results. Facet Publishing, 2019.
47. Clark J, Glasziou P, Del Mar C, Bannach-Brown A, Stehlik P, Scott AM. A full systematic review was completed in 2 weeks using automation tools: a case study. J Clin Epidemiol 2020;121:81-90. doi:10.1016/j.jclinepi.2020.01.008
48. McGowan J, Sampson M, Salzwedel DM, Cogo E, Foerster V, Lefebvre C. PRESS Peer Review of Electronic Search Strategies: 2015 guideline statement. J Clin Epidemiol 2016;75:40-6. doi:10.1016/j.jclinepi.2016.01.021
49. Wang Z, Nayfeh T, Tetzlaff J, O’Blenis P, Murad MH. Error rates of human reviewers during abstract screening in systematic reviews. PLoS One 2020;15:e0227742. doi:10.1371/journal.pone.0227742
50. Waffenschmidt S, Knelangen M, Sieben W, Bühn S, Pieper D. Single screening versus conventional double screening for study selection in systematic reviews: a methodological systematic review. BMC Med Res Methodol 2019;19:132. doi:10.1186/s12874-019-0782-0
51. Gartlehner G, Affengruber L, Titscher V, et al. Single-reviewer abstract screening missed 13 percent of relevant studies: a crowd-based, randomized controlled trial. J Clin Epidemiol 2020;121:20-8. doi:10.1016/j.jclinepi.2020.01.005
52. Shemilt I, Khan N, Park S, Thomas J. Use of cost-effectiveness analysis to compare the efficiency of study identification methods in systematic reviews. Syst Rev 2016;5:140. doi:10.1186/s13643-016-0315-4
53. O’Mara-Eves A, Thomas J, McNaught J, Miwa M, Ananiadou S. Using text mining for study identification in systematic reviews: a systematic review of current approaches. Syst Rev 2015;4:5. doi:10.1186/2046-4053-4-5
54. Marshall IJ, Noel-Storr A, Kuiper J, Thomas J, Wallace BC. Machine learning for identifying randomized controlled trials: an evaluation and practitioner’s guide. Res Synth Methods 2018;9:602-14. doi:10.1002/jrsm.1287
55. Noel-Storr A. Working with a new kind of team: harnessing the wisdom of the crowd in trial identification. EFSA J 2019;17(Suppl 1):e170715. doi:10.2903/j.efsa.2019.e170715
56. Nama N, Sampson M, Barrowman N, et al. Crowdsourcing the citation screening process for systematic reviews: validation study. J Med Internet Res 2019;21:e12953. doi:10.2196/12953
57. Li T, Saldanha IJ, Jap J, et al. A randomized trial provided new evidence on the accuracy and efficiency of traditional vs. electronically annotated abstraction approaches in systematic reviews. J Clin Epidemiol 2019;115:77-89. doi:10.1016/j.jclinepi.2019.07.005
58. Robson RC, Pham B, Hwee J, et al. Few studies exist examining methods for selecting studies, abstracting data, and appraising quality in a systematic review. J Clin Epidemiol 2019;106:121-35. doi:10.1016/j.jclinepi.2018.10.003
59. E JY, Saldanha IJ, Canner J, Schmid CH, Le JT, Li T. Adjudication rather than experience of data abstraction matters more in reducing errors in abstracting data in systematic reviews. Res Synth Methods 2020;11:354-62. doi:10.1002/jrsm.1396
60. Li T, Higgins JPT, Deeks JJ. Collecting data. In: Higgins JPT, Thomas J, Chandler J, et al, eds. Cochrane Handbook for Systematic Reviews of Interventions. Cochrane Collaboration, 2019. doi:10.1002/9781119536604.ch5
61. Marshall IJ, Wallace BC. Toward systematic review automation: a practical guide to using machine learning tools in research synthesis. Syst Rev 2019;8:163. doi:10.1186/s13643-019-1074-9
62. Jonnalagadda SR, Goyal P, Huffman MD. Automating data extraction in systematic reviews: a systematic review. Syst Rev 2015;4:78. doi:10.1186/s13643-015-0066-7
63. Jackson JL, Kuriyama A, Anton A, et al. The accuracy of Google Translate for abstracting data from non-English-language trials for systematic reviews. Ann Intern Med 2019;171:677-9. doi:10.7326/M19-0891
64. Jelicic Kadic A, Vucic K, Dosenovic S, Sapunar D, Puljak L. Extracting data from figures with software was faster, with higher interrater reliability than manual extraction. J Clin Epidemiol 2016;74:119-23. doi:10.1016/j.jclinepi.2016.01.002
65. Mayo-Wilson E, Li T, Fusco N, Dickersin K, MUDS investigators. Practical guidance for using multiple data sources in systematic reviews and meta-analyses (with examples from the MUDS study). Res Synth Methods 2018;9:2-12. doi:10.1002/jrsm.1277
66. Page MJ, Forbes A, Chau M, Green SE, McKenzie JE. Investigation of bias in meta-analyses due to selective inclusion of trial effect estimates: empirical study. BMJ Open 2016;6:e011863. doi:10.1136/bmjopen-2016-011863
67. Mayo-Wilson E, Fusco N, Li T, Hong H, Canner JK, Dickersin K, MUDS investigators. Multiple outcomes and analyses in clinical trials create challenges for interpretation and research synthesis. J Clin Epidemiol 2017;86:39-50. doi:10.1016/j.jclinepi.2017.05.007
68. Mayo-Wilson E, Li T, Fusco N, et al. Cherry-picking by trialists and meta-analysts can drive conclusions about intervention efficacy. J Clin Epidemiol 2017;91:95-110. doi:10.1016/j.jclinepi.2017.07.014
69. López-López JA, Page MJ, Lipsey MW, Higgins JPT. Dealing with effect size multiplicity in systematic reviews and meta-analyses. Res Synth Methods 2018;9:336-51. doi:10.1002/jrsm.1310
70. Page MJ, McKenzie JE, Kirkham J, et al. Bias due to selective inclusion and reporting of outcomes and analyses in systematic reviews of randomised trials of healthcare interventions. Cochrane Database Syst Rev 2014;(10):MR000035. doi:10.1002/14651858.MR000035.pub2
71. Lundh A, Boutron I, Stewart L, Hróbjartsson A. What to do with a clinical trial with conflicts of interest. BMJ Evid Based Med 2020;25:157-8. doi:10.1136/bmjebm-2019-111230
72. Boutron I, Page MJ, Higgins JPT, et al. Considering bias and conflicts of interest among the included studies. In: Higgins JPT, Thomas J, Chandler J, et al, eds. Cochrane Handbook for Systematic Reviews of Interventions. Cochrane Collaboration, 2019. doi:10.1002/9781119536604.ch7
73. Hoffmann TC, Oxman AD, Ioannidis JP, et al. Enhancing the usability of systematic reviews by improving the consideration and description of interventions. BMJ 2017;358:j2998. doi:10.1136/bmj.j2998
74. Lewin S, Hendry M, Chandler J, et al. Assessing the complexity of interventions within systematic reviews: development, content and use of a new tool (iCAT_SR). BMC Med Res Methodol 2017;17:76. doi:10.1186/s12874-017-0349-x
75. Montgomery P, Underhill K, Gardner F, Operario D, Mayo-Wilson E. The Oxford Implementation Index: a new tool for incorporating implementation data into systematic reviews and meta-analyses. J Clin Epidemiol 2013;66:874-82. doi:10.1016/j.jclinepi.2013.03.006
76. Bai A, Shukla VK, Bak G, et al. Quality Assessment Tools Project Report. Canadian Agency for Drugs and Technologies in Health, 2012.
77. Büttner F, Winters M, Delahunt E, et al. Identifying the ‘incredible’! Part 1: assessing the risk of bias in outcomes included in systematic reviews. Br J Sports Med 2020;54:798-800. doi:10.1136/bjsports-2019-100806
78. Moher D, Jadad AR, Nichol G, Penman M, Tugwell P, Walsh S. Assessing the quality of randomized controlled trials: an annotated bibliography of scales and checklists. Control Clin Trials 1995;16:62-73. doi:10.1016/0197-2456(94)00031-W
79. Olivo SA, Macedo LG, Gadotti IC, Fuentes J, Stanton T, Magee DJ. Scales to assess the quality of randomized controlled trials: a systematic review. Phys Ther 2008;88:156-75. doi:10.2522/ptj.20070147
80. Page MJ, Higgins JP, Clayton G, Sterne JA, Hróbjartsson A, Savović J. Empirical evidence of study design biases in randomized trials: systematic review of meta-epidemiological studies. PLoS One 2016;11:e0159267. doi:10.1371/journal.pone.0159267
81. Page MJ, Higgins JPT, Sterne JAC. Assessing risk of bias due to missing results in a synthesis. In: Higgins JPT, Thomas J, Chandler J, et al, eds. Cochrane Handbook for Systematic Reviews of Interventions. Cochrane Collaboration, 2019. doi:10.1002/9781119536604.ch13
82. Chan AW, Song F, Vickers A, et al. Increasing value and reducing waste: addressing inaccessible research. Lancet 2014;383:257-66. doi:10.1016/S0140-6736(13)62296-5
83. Dwan K, Gamble C, Williamson PR, Kirkham JJ, Reporting Bias Group. Systematic review of the empirical evidence of study publication bias and outcome reporting bias - an updated review. PLoS One 2013;8:e66844. doi:10.1371/journal.pone.0066844
84. Page MJ, McKenzie JE, Higgins JPT. Tools for assessing risk of reporting biases in studies and syntheses of studies: a systematic review. BMJ Open 2018;8:e019703. doi:10.1136/bmjopen-2017-019703
85. Whiting P, Wolff R, Mallett S, Simera I, Savović J. A proposed framework for developing quality assessment tools. Syst Rev 2017;6:204. doi:10.1186/s13643-017-0604-6
86. Sterne JAC, Savović J, Page MJ, et al. RoB 2: a revised tool for assessing risk of bias in randomised trials. BMJ 2019;366:l4898. doi:10.1136/bmj.l4898
87. Sterne JA, Hernán MA, Reeves BC, et al. ROBINS-I: a tool for assessing risk of bias in non-randomised studies of interventions. BMJ 2016;355:i4919. doi:10.1136/bmj.i4919
88. Marshall IJ, Kuiper J, Wallace BC. RobotReviewer: evaluation of a system for automatically assessing bias in clinical trials. J Am Med Inform Assoc 2016;23:193-201. doi:10.1093/jamia/ocv044
89. Higgins JPT, Li T, Deeks JJ. Choosing effect measures and computing estimates of effect. In: Higgins JPT, Thomas J, Chandler J, et al, eds. Cochrane Handbook for Systematic Reviews of Interventions. Cochrane Collaboration, 2019. doi:10.1002/9781119536604.ch6
90. Deeks JJ. Issues in the selection of a summary statistic for meta-analysis of clinical trials with binary outcomes. Stat Med 2002;21:1575-600. doi:10.1002/sim.1188
91. Schünemann HJ, Vist GE, Higgins JPT, et al. Interpreting results and drawing conclusions. In: Higgins JPT, Thomas J, Chandler J, et al, eds. Cochrane Handbook for Systematic Reviews of Interventions. Cochrane Collaboration, 2019. doi:10.1002/9781119536604.ch15
92. McKenzie JE, Brennan SE, Ryan RE, et al. Summarizing study characteristics and preparing for synthesis. In: Higgins JPT, Thomas J, Chandler J, et al, eds. Cochrane Handbook for Systematic Reviews of Interventions. Cochrane Collaboration, 2019. doi:10.1002/9781119536604.ch9
93. da Costa BR, Rutjes AW, Johnston BC, et al. Methods to convert continuous outcomes into odds ratios of treatment response and numbers needed to treat: meta-epidemiological study. Int J Epidemiol 2012;41:1445-59. doi:10.1093/ije/dys124
94. Weir CJ, Butcher I, Assi V, et al. Dealing with missing standard deviation and mean values in meta-analysis of continuous outcomes: a systematic review. BMC Med Res Methodol 2018;18:25. doi:10.1186/s12874-018-0483-0
95. Mavridis D, White IR. Dealing with missing outcome data in meta-analysis. Res Synth Methods 2020;11:2-13. doi:10.1002/jrsm.1349
96. Kahale LA, Diab B, Brignardello-Petersen R, et al. Systematic reviews do not adequately report or address missing outcome data in their analyses: a methodological survey. J Clin Epidemiol 2018;99:14-23. doi:10.1016/j.jclinepi.2018.02.016
97. Guyatt G, Oxman AD, Akl EA, et al. GRADE guidelines: 1. Introduction-GRADE evidence profiles and summary of findings tables. J Clin Epidemiol 2011;64:383-94. doi:10.1016/j.jclinepi.2010.04.026
98. Schünemann HJ, Higgins JPT, Vist GE, et al. Completing ‘Summary of findings’ tables and grading the certainty of the evidence. In: Higgins JPT, Thomas J, Chandler J, et al, eds. Cochrane Handbook for Systematic Reviews of Interventions. Cochrane Collaboration, 2019. doi:10.1002/9781119536604.ch14
99. Anzures-Cabrera J, Higgins JP. Graphical displays for meta-analysis: An overview with suggestions for practice. Res Synth Methods 2010;1:66-80. doi:10.1002/jrsm.6
100. Lewis S, Clarke M. Forest plots: trying to see the wood and the trees. BMJ 2001;322:1479-80. doi:10.1136/bmj.322.7300.1479
101. Schriger DL, Altman DG, Vetter JA, Heafner T, Moher D. Forest plots in reports of systematic reviews: a cross-sectional study reviewing current practice. Int J Epidemiol 2010;39:421-9. doi:10.1093/ije/dyp370
102. Kossmeier M, Tran US, Voracek M. Charting the landscape of graphical displays for meta-analysis and systematic reviews: a comprehensive review, taxonomy, and feature analysis. BMC Med Res Methodol 2020;20:26. doi:10.1186/s12874-020-0911-9
103. Deeks JJ, Higgins JPT, Altman DG. Analysing data and undertaking meta-analyses. In: Higgins JPT, Thomas J, Chandler J, et al, eds. Cochrane Handbook for Systematic Reviews of Interventions. Cochrane Collaboration, 2019. doi:10.1002/9781119536604.ch10
104. McKenzie JE, Beller EM, Forbes AB. Introduction to systematic reviews and meta-analysis. Respirology 2016;21:626-37. 10.1111/resp.12783 [ DOI ] [ PubMed ] [ Google Scholar ]
105. Rice K, Higgins J, Lumley T. A re‐evaluation of fixed effect(s) meta‐analysis. J R Stat Soc Ser A Stat Soc 2018;181:205-27. 10.1111/rssa.12275 . [ DOI ] [ Google Scholar ]
106. Sterne JA, Sutton AJ, Ioannidis JP, et al. Recommendations for examining and interpreting funnel plot asymmetry in meta-analyses of randomised controlled trials. BMJ 2011;343:d4002. . 10.1136/bmj.d4002 [ DOI ] [ PubMed ] [ Google Scholar ]
107. Page MJ, Altman DG, McKenzie JE, et al. Flaws in the application and interpretation of statistical analyses in systematic reviews of therapeutic interventions were common: a cross-sectional analysis. J Clin Epidemiol 2018;95:7-18. . 10.1016/j.jclinepi.2017.11.022 [ DOI ] [ PubMed ] [ Google Scholar ]
108. Veroniki AA, Jackson D, Bender R, et al. Methods to calculate uncertainty in the estimated overall effect size from a random-effects meta-analysis. Res Synth Methods 2019;10:23-43. . 10.1002/jrsm.1319 [ DOI ] [ PubMed ] [ Google Scholar ]
109. Thompson SG, Higgins JP. How should meta-regression analyses be undertaken and interpreted? Stat Med 2002;21:1559-73. . 10.1002/sim.1187 [ DOI ] [ PubMed ] [ Google Scholar ]
110. Mavridis D, Salanti G. A practical introduction to multivariate meta-analysis. Stat Methods Med Res 2013;22:133-58. . 10.1177/0962280211432219 [ DOI ] [ PubMed ] [ Google Scholar ]
111. Konstantopoulos S. Fixed effects and variance components estimation in three-level meta-analysis. Res Synth Methods 2011;2:61-76. . 10.1002/jrsm.35 [ DOI ] [ PubMed ] [ Google Scholar ]
112. Hedges LV, Tipton E, Johnson MC. Robust variance estimation in meta-regression with dependent effect size estimates. Res Synth Methods 2010;1:39-65. . 10.1002/jrsm.5 [ DOI ] [ PubMed ] [ Google Scholar ]
113. Veroniki AA, Jackson D, Viechtbauer W, et al. Methods to estimate the between-study variance and its uncertainty in meta-analysis. Res Synth Methods 2016;7:55-79. . 10.1002/jrsm.1164 [ DOI ] [ PMC free article ] [ PubMed ] [ Google Scholar ]
114. Langan D, Higgins JPT, Jackson D, et al. A comparison of heterogeneity variance estimators in simulated random-effects meta-analyses. Res Synth Methods 2019;10:83-98. . 10.1002/jrsm.1316 [ DOI ] [ PubMed ] [ Google Scholar ]
115. Higgins JPT, López-López JA, Becker BJ, et al. Synthesising quantitative evidence in systematic reviews of complex health interventions. BMJ Glob Health 2019;4(Suppl 1):e000858. . 10.1136/bmjgh-2018-000858 [ DOI ] [ PMC free article ] [ PubMed ] [ Google Scholar ]
116. Campbell M, McKenzie JE, Sowden A, et al. Synthesis without meta-analysis (SWiM) in systematic reviews: reporting guideline. BMJ 2020;368:l6890. . 10.1136/bmj.l6890 [ DOI ] [ PMC free article ] [ PubMed ] [ Google Scholar ]
117. Harris R, Bradburn M, Deeks J, et al. metan: fixed- and random-effects meta-analysis. Stata J 2008;8:3-28 10.1177/1536867X0800800102 . [ DOI ] [ Google Scholar ]
118. Viechtbauer W. Conducting Meta-Analyses in R with the metafor Package. J Stat Softw 2010;36:1-48. 10.18637/jss.v036.i03 . [ DOI ] [ Google Scholar ]
119. Higgins JP, Thompson SG, Deeks JJ, Altman DG. Measuring inconsistency in meta-analyses. BMJ 2003;327:557-60. . 10.1136/bmj.327.7414.557 [ DOI ] [ PMC free article ] [ PubMed ] [ Google Scholar ]
120. Riley RD, Higgins JP, Deeks JJ. Interpretation of random effects meta-analyses. BMJ 2011;342:d549. . 10.1136/bmj.d549 [ DOI ] [ PubMed ] [ Google Scholar ]
121. Fisher DJ, Carpenter JR, Morris TP, Freeman SC, Tierney JF. Meta-analytical methods to identify who benefits most from treatments: daft, deluded, or deft approach? BMJ 2017;356:j573. . 10.1136/bmj.j573 [ DOI ] [ PMC free article ] [ PubMed ] [ Google Scholar ]
122. Mueller KF, Meerpohl JJ, Briel M, et al. Methods for detecting, quantifying, and adjusting for dissemination bias in meta-analysis are described. J Clin Epidemiol 2016;80:25-33. . 10.1016/j.jclinepi.2016.04.015 [ DOI ] [ PubMed ] [ Google Scholar ]
123. Vevea JL, Coburn K, Sutton A. Publication bias. In: Cooper H, Hedges LV, Valentine JC, eds. The Handbook of Research Synthesis and Meta-Analysis. Russell Sage Foundation, 2019: 383-430 10.7758/9781610448864.21 . [ DOI ] [ Google Scholar ]
124. Kirkham JJ, Altman DG, Chan AW, Gamble C, Dwan KM, Williamson PR. Outcome reporting bias in trials: a methodological approach for assessment and adjustment in systematic reviews. BMJ 2018;362:k3802. . 10.1136/bmj.k3802 [ DOI ] [ PMC free article ] [ PubMed ] [ Google Scholar ]
125. Balshem H, Helfand M, Schünemann HJ, et al. GRADE guidelines: 3. Rating the quality of evidence. J Clin Epidemiol 2011;64:401-6. 10.1016/j.jclinepi.2010.07.015 [ DOI ] [ PubMed ] [ Google Scholar ]
126. Schunemann HJ, Brozek J, Guyatt G, et al., eds. Handbook for grading the quality of evidence and the strength of recommendations using the GRADE approach. McMaster University, 2013. [ Google Scholar ]
127. Berkman ND, Lohr KN, Ansari MT, et al. Grading the strength of a body of evidence when assessing health care interventions: an EPC update. J Clin Epidemiol 2015;68:1312-24. . 10.1016/j.jclinepi.2014.11.023 [ DOI ] [ PubMed ] [ Google Scholar ]
128. Nikolakopoulou A, Higgins JPT, Papakonstantinou T, et al. CINeMA: An approach for assessing confidence in the results of a network meta-analysis. PLoS Med 2020;17:e1003082. . 10.1371/journal.pmed.1003082 [ DOI ] [ PMC free article ] [ PubMed ] [ Google Scholar ]
129. Hultcrantz M, Rind D, Akl EA, et al. The GRADE Working Group clarifies the construct of certainty of evidence. J Clin Epidemiol 2017;87:4-13. . 10.1016/j.jclinepi.2017.05.006 [ DOI ] [ PMC free article ] [ PubMed ] [ Google Scholar ]
130. Santesso N, Glenton C, Dahm P, et al. GRADE Working Group . GRADE guidelines 26: informative statements to communicate the findings of systematic reviews of interventions. J Clin Epidemiol 2020;119:126-35. . 10.1016/j.jclinepi.2019.10.014 [ DOI ] [ PubMed ] [ Google Scholar ]
131. Boers M. Graphics and statistics for cardiology: designing effective tables for presentation and publication. Heart 2018;104:192-200. . 10.1136/heartjnl-2017-311581 [ DOI ] [ PubMed ] [ Google Scholar ]
132. Stovold E, Beecher D, Foxlee R, Noel-Storr A. Study flow diagrams in Cochrane systematic review updates: an adapted PRISMA flow diagram. Syst Rev 2014;3:54. 10.1186/2046-4053-3-54 [ DOI ] [ PMC free article ] [ PubMed ] [ Google Scholar ]
133. Sampson M, Tetzlaff J, Urquhart C. Precision of healthcare systematic review searches in a cross-sectional sample. Res Synth Methods 2011;2:119-25. . 10.1002/jrsm.42 [ DOI ] [ PubMed ] [ Google Scholar ]
134. Haddaway NR, Westgate MJ. Predicting the time needed for environmental systematic reviews and systematic maps. Conserv Biol 2019;33:434-43. . 10.1111/cobi.13231 [ DOI ] [ PubMed ] [ Google Scholar ]
135. McDonagh M, Peterson K, Raina P, et al. AHRQ Methods for Effective Health Care: Avoiding Bias in Selecting Studies. Methods Guide for Effectiveness and Comparative Effectiveness Reviews. Agency for Healthcare Research and Quality, 2008. [ PubMed ] [ Google Scholar ]
136. McGuinness LA, Higgins JPT. Risk-of-bias VISualization (robvis): An R package and Shiny web app for visualizing risk-of-bias assessments. Res Synth Methods 2021;12:55-61. 10.1002/jrsm.1411. [ DOI ] [ PubMed ] [ Google Scholar ]
137. Altman DG, Cates C. The need for individual trial results in reports of systematic reviews. BMJ 2001;323:776.11588078 [ Google Scholar ]
138. Hemming K, Taljaard M, McKenzie JE, et al. Reporting of stepped wedge cluster randomised trials: extension of the CONSORT 2010 statement with explanation and elaboration. BMJ 2018;363:k1614. . 10.1136/bmj.k1614 [ DOI ] [ PMC free article ] [ PubMed ] [ Google Scholar ]
139. Jap J, Saldanha IJ, Smith BT, Lau J, Schmid CH, Li T, Data Abstraction Assistant Investigators . Features and functioning of Data Abstraction Assistant, a software application for data abstraction during systematic reviews. Res Synth Methods 2019;10:2-14. . 10.1002/jrsm.1326 [ DOI ] [ PMC free article ] [ PubMed ] [ Google Scholar ]
140. Silagy CA, Middleton P, Hopewell S. Publishing protocols of systematic reviews: comparing what was done to what was planned. JAMA 2002;287:2831-4. . 10.1001/jama.287.21.2831 [ DOI ] [ PubMed ] [ Google Scholar ]
141. Parmelli E, Liberati A, D’Amico R. Reporting of outcomes in systematic reviews: comparison of protocols and published systematic reviews (abstract). XV Cochrane Colloquium; 2007 Oct 23-27; São Paulo, Brazil . 2007:118-9. [ Google Scholar ]
142. Dwan K, Kirkham JJ, Williamson PR, Gamble C. Selective reporting of outcomes in randomised controlled trials in systematic reviews of cystic fibrosis. BMJ Open 2013;3:e002709. . 10.1136/bmjopen-2013-002709 [ DOI ] [ PMC free article ] [ PubMed ] [ Google Scholar ]
143. Shah K, Egan G, Huan LN, Kirkham J, Reid E, Tejani AM. Outcome reporting bias in Cochrane systematic reviews: a cross-sectional analysis. BMJ Open 2020;10:e032497. . 10.1136/bmjopen-2019-032497 [ DOI ] [ PMC free article ] [ PubMed ] [ Google Scholar ]
144. Schandelmaier S, Briel M, Varadhan R, et al. Development of the Instrument to assess the Credibility of Effect Modification Analyses (ICEMAN) in randomized controlled trials and meta-analyses. CMAJ 2020;192:E901-6. . 10.1503/cmaj.200077 [ DOI ] [ PMC free article ] [ PubMed ] [ Google Scholar ]
145. Peters JL, Sutton AJ, Jones DR, Abrams KR, Rushton L. Contour-enhanced meta-analysis funnel plots help distinguish publication bias from other causes of asymmetry. J Clin Epidemiol 2008;61:991-6. . 10.1016/j.jclinepi.2007.11.010 [ DOI ] [ PubMed ] [ Google Scholar ]
146. Guyatt GH, Oxman AD, Santesso N, et al. GRADE guidelines: 12. Preparing summary of findings tables-binary outcomes. J Clin Epidemiol 2013;66:158-72. 10.1016/j.jclinepi.2012.01.012 [ DOI ] [ PubMed ] [ Google Scholar ]
147. Guyatt GH, Thorlund K, Oxman AD, et al. GRADE guidelines: 13. Preparing summary of findings tables and evidence profiles-continuous outcomes. J Clin Epidemiol 2013;66:173-83. 10.1016/j.jclinepi.2012.08.001 [ DOI ] [ PubMed ] [ Google Scholar ]
148. Santesso N, Carrasco-Labra A, Langendam M, et al. Improving GRADE evidence tables part 3: detailed guidance for explanatory footnotes supports creating and understanding GRADE certainty in the evidence judgments. J Clin Epidemiol 2016;74:28-39. . 10.1016/j.jclinepi.2015.12.006 [ DOI ] [ PubMed ] [ Google Scholar ]
149. Whiting P, Savović J, Higgins JP, et al. ROBIS group . ROBIS: A new tool to assess risk of bias in systematic reviews was developed. J Clin Epidemiol 2016;69:225-34. 10.1016/j.jclinepi.2015.06.005 [ DOI ] [ PMC free article ] [ PubMed ] [ Google Scholar ]
150. Shea BJ, Reeves BC, Wells G, et al. AMSTAR 2: a critical appraisal tool for systematic reviews that include randomised or non-randomised studies of healthcare interventions, or both. BMJ 2017;358:j4008. . 10.1136/bmj.j4008 [ DOI ] [ PMC free article ] [ PubMed ] [ Google Scholar ]
151. Stewart L, Moher D, Shekelle P. Why prospective registration of systematic reviews makes sense. Syst Rev 2012;1:7. . 10.1186/2046-4053-1-7 [ DOI ] [ PMC free article ] [ PubMed ] [ Google Scholar ]
152. Page MJ, Shamseer L, Tricco AC. Registration of systematic reviews in PROSPERO: 30,000 records and counting. Syst Rev 2018;7:32. . 10.1186/s13643-018-0699-4 [ DOI ] [ PMC free article ] [ PubMed ] [ Google Scholar ]
153. Booth A, Clarke M, Dooley G, et al. The nuts and bolts of PROSPERO: an international prospective register of systematic reviews. Syst Rev 2012;1:2. . 10.1186/2046-4053-1-2 [ DOI ] [ PMC free article ] [ PubMed ] [ Google Scholar ]
154. Sender D, Clark J, Hoffmann TC. Analysis of articles directly related to randomized trials finds poor protocol availability and inconsistent linking of articles. J Clin Epidemiol 2020;124:69-74. . 10.1016/j.jclinepi.2020.04.017 [ DOI ] [ PubMed ] [ Google Scholar ]
155. Koensgen N, Rombey T, Allers K, Mathes T, Hoffmann F, Pieper D. Comparison of non-Cochrane systematic reviews and their published protocols: differences occurred frequently but were seldom explained. J Clin Epidemiol 2019;110:34-41. . 10.1016/j.jclinepi.2019.02.012 [ DOI ] [ PubMed ] [ Google Scholar ]
156. Pieper D, Allers K, Mathes T, et al. Comparison of protocols and registry entries to published reports for systematic reviews. Cochrane Database Syst Rev 2020;(2). 10.1002/14651858.MR000053 . [ DOI ] [ PMC free article ] [ PubMed ] [ Google Scholar ]
157. Lundh A, Lexchin J, Mintzes B, Schroll JB, Bero L. Industry sponsorship and research outcome. Cochrane Database Syst Rev 2017;2:MR000033. 10.1002/14651858.MR000033.pub3. [ DOI ] [ PMC free article ] [ PubMed ] [ Google Scholar ]
158. Hakoum MB, Anouti S, Al-Gibbawi M, et al. Reporting of financial and non-financial conflicts of interest by authors of systematic reviews: a methodological survey. BMJ Open 2016;6:e011997. . 10.1136/bmjopen-2016-011997 [ DOI ] [ PMC free article ] [ PubMed ] [ Google Scholar ]
159. Hansen C, Lundh A, Rasmussen K, Hróbjartsson A. Financial conflicts of interest in systematic reviews: associations with results, conclusions, and methodological quality. Cochrane Database Syst Rev 2019;8:MR000047. . 10.1002/14651858.MR000047.pub2 [ DOI ] [ PMC free article ] [ PubMed ] [ Google Scholar ]
160. Taichman DB, Backus J, Baethge C, et al. A disclosure form for work submitted to medical journals: a proposal from the International Committee of Medical Journal Editors. Ann Intern Med 2020;172:429-30. . 10.7326/M19-3933 [ DOI ] [ PubMed ] [ Google Scholar ]
161. Haddaway NR. Open Synthesis: on the need for evidence synthesis to embrace Open Science. Environ Evid 2018;7:26. 10.1186/s13750-018-0140-4 . [ DOI ] [ Google Scholar ]
162. Goldacre B, Morton CE, DeVito NJ. Why researchers should share their analytic code. BMJ 2019;367:l6365. . 10.1136/bmj.l6365 [ DOI ] [ PubMed ] [ Google Scholar ]
163. Mello MM, Lieou V, Goodman SN. Clinical trial participants’ views of the risks and benefits of data sharing. N Engl J Med 2018;378:2202-11. . 10.1056/NEJMsa1713258 [ DOI ] [ PMC free article ] [ PubMed ] [ Google Scholar ]
164. Naudet F, Sakarovitch C, Janiaud P, et al. Data sharing and reanalysis of randomized controlled trials in leading biomedical journals with a full data sharing policy: survey of studies published in The BMJ and PLOS Medicine . BMJ 2018;360:k400. . 10.1136/bmj.k400 [ DOI ] [ PMC free article ] [ PubMed ] [ Google Scholar ]
165. Saldanha IJ, Smith BT, Ntzani E, Jap J, Balk EM, Lau J. The Systematic Review Data Repository (SRDR): descriptive characteristics of publicly available data and opportunities for research. Syst Rev 2019;8:334. . 10.1186/s13643-019-1250-y [ DOI ] [ PMC free article ] [ PubMed ] [ Google Scholar ]
166. Wilkinson MD, Dumontier M, Aalbersberg IJ, et al. The FAIR Guiding Principles for scientific data management and stewardship. Sci Data 2016;3:160018. . 10.1038/sdata.2016.18 [ DOI ] [ PMC free article ] [ PubMed ] [ Google Scholar ]
167. Khodashahi M, Rezaieyazdi Z, Sahebari M. Comparison of the therapeutic effects of rivaroxaban versus warfarin in antiphospholipid syndrome: a systematic review. Arch Rheumatol 2019;35:107-16. . 10.5606/ArchRheumatol.2020.7375 [ DOI ] [ PMC free article ] [ PubMed ] [ Google Scholar ]
168. Keynejad RC, Hanlon C, Howard LM. Psychological interventions for common mental disorders in women experiencing intimate partner violence in low-income and middle-income countries: a systematic review and meta-analysis. Lancet Psychiatry 2020;7:173-90. . 10.1016/S2215-0366(19)30510-3 [ DOI ] [ PMC free article ] [ PubMed ] [ Google Scholar ]
169. Chu DK, Akl EA, Duda S, Solo K, Yaacoub S, Schünemann HJ, COVID-19 Systematic Urgent Review Group Effort (SURGE) study authors . Physical distancing, face masks, and eye protection to prevent person-to-person transmission of SARS-CoV-2 and COVID-19: a systematic review and meta-analysis. Lancet 2020;395:1973-87. . 10.1016/S0140-6736(20)31142-9 [ DOI ] [ PMC free article ] [ PubMed ] [ Google Scholar ]
170. Verhoef LM, van den Bemt BJF, van der Maas A, et al. Down-titration and discontinuation strategies of tumour necrosis factor-blocking agents for rheumatoid arthritis in patients with low disease activity. Cochrane Database Syst Rev 2019;5:CD010455. . 10.1002/14651858.CD010455.pub3 [ DOI ] [ PMC free article ] [ PubMed ] [ Google Scholar ]
171. Odor PM, Bampoe S, Gilhooly D, Creagh-Brown B, Moonesinghe SR. Perioperative interventions for prevention of postoperative pulmonary complications: systematic review and meta-analysis. BMJ 2020;368:m540. . 10.1136/bmj.m540 [ DOI ] [ PMC free article ] [ PubMed ] [ Google Scholar ]
172. Jay MA, Mc Grath-Lone L. Educational outcomes of children in contact with social care in England: a systematic review. Syst Rev 2019;8:155. . 10.1186/s13643-019-1071-z [ DOI ] [ PMC free article ] [ PubMed ] [ Google Scholar ]
173. Freemantle N, Ginsberg DA, McCool R, et al. Comparative assessment of onabotulinumtoxinA and mirabegron for overactive bladder: an indirect treatment comparison. BMJ Open 2016;6:e009122. . 10.1136/bmjopen-2015-009122 [ DOI ] [ PMC free article ] [ PubMed ] [ Google Scholar ]
174. Bomhof-Roordink H, Gärtner FR, Stiggelbout AM, Pieterse AH. Key components of shared decision making models: a systematic review. BMJ Open 2019;9:e031763. . 10.1136/bmjopen-2019-031763 [ DOI ] [ PMC free article ] [ PubMed ] [ Google Scholar ]
175. Claire R, Chamberlain C, Davey MA, et al. Pharmacological interventions for promoting smoking cessation during pregnancy. Cochrane Database Syst Rev 2020;3:CD010078. . 10.1002/14651858.CD010078.pub3 [ DOI ] [ PMC free article ] [ PubMed ] [ Google Scholar ]
176. Brennan SE, McDonald S, Page MJ, et al. Long-term effects of alcohol consumption on cognitive function: a systematic review and dose-response analysis of evidence published between 2007 and 2018. Syst Rev 2020;9:33. . 10.1186/s13643-019-1220-4 [ DOI ] [ PMC free article ] [ PubMed ] [ Google Scholar ]
177. Allida S, Cox KL, Hsieh CF, Lang H, House A, Hackett ML. Pharmacological, psychological, and non-invasive brain stimulation interventions for treating depression after stroke. Cochrane Database Syst Rev 2020;1:CD003437. . 10.1002/14651858.CD003437.pub4 [ DOI ] [ PMC free article ] [ PubMed ] [ Google Scholar ]
178. Hollands GJ, Carter P, Anwer S, et al. Altering the availability or proximity of food, alcohol, and tobacco products to change their selection and consumption. Cochrane Database Syst Rev 2019;9:CD012573. 10.1002/14651858.CD012573.pub3. [ DOI ] [ PMC free article ] [ PubMed ] [ Google Scholar ]
179. Kunzler AM, Helmreich I, König J, et al. Psychological interventions to foster resilience in healthcare students. Cochrane Database Syst Rev 2020;7:CD013684. 10.1002/14651858.CD013684. [ DOI ] [ PMC free article ] [ PubMed ] [ Google Scholar ]
180. Munthe-Kaas HM, Berg RC, Blaasvær N. Effectiveness of interventions to reduce homelessness: a systematic review and meta-analysis. Campbell Syst Rev 2018;14:1-281. 10.4073/csr.2018.3 . [ DOI ] [ PMC free article ] [ PubMed ] [ Google Scholar ]
181. Das JK, Salam RA, Mahmood SB, et al. Food fortification with multiple micronutrients: impact on health outcomes in general population. Cochrane Database Syst Rev 2019;12:CD011400. . 10.1002/14651858.CD011400.pub2 [ DOI ] [ PMC free article ] [ PubMed ] [ Google Scholar ]
182. Hunter BM, Harrison S, Portela A, Bick D. The effects of cash transfers and vouchers on the use and quality of maternity care services: A systematic review. PLoS One 2017;12:e0173068-68. . 10.1371/journal.pone.0173068 [ DOI ] [ PMC free article ] [ PubMed ] [ Google Scholar ]
183. Kyburz KS, Eliades T, Papageorgiou SN. What effect does functional appliance treatment have on the temporomandibular joint? A systematic review with meta-analysis. Prog Orthod 2019;20:32. . 10.1186/s40510-019-0286-9 [ DOI ] [ PMC free article ] [ PubMed ] [ Google Scholar ]
184. Pantoja T, Grimshaw JM, Colomer N, Castañon C, Leniz Martelli J. Manually-generated reminders delivered on paper: effects on professional practice and patient outcomes. Cochrane Database Syst Rev 2019;12:CD001174. . 10.1002/14651858.CD001174.pub4 [ DOI ] [ PMC free article ] [ PubMed ] [ Google Scholar ]
185. Kock L, Brown J, Hiscock R, Tattan-Birch H, Smith C, Shahab L. Individual-level behavioural smoking cessation interventions tailored for disadvantaged socioeconomic position: a systematic review and meta-regression. Lancet Public Health 2019;4:e628-44. . 10.1016/S2468-2667(19)30220-8 [ DOI ] [ PMC free article ] [ PubMed ] [ Google Scholar ]
186. Roth DE, Leung M, Mesfin E, Qamar H, Watterworth J, Papp E. Vitamin D supplementation during pregnancy: state of the evidence from a systematic review of randomised trials. BMJ 2017;359:j5237. . 10.1136/bmj.j5237 [ DOI ] [ PMC free article ] [ PubMed ] [ Google Scholar ]
187. Karjalainen TV, Jain NB, Heikkinen J, Johnston RV, Page CM, Buchbinder R. Surgery for rotator cuff tears. Cochrane Database Syst Rev 2019;12:CD013502. 10.1002/14651858.CD013502. [ DOI ] [ PMC free article ] [ PubMed ] [ Google Scholar ]
188. Mudano AS, Tugwell P, Wells GA, Singh JA. Tai Chi for rheumatoid arthritis. Cochrane Database Syst Rev 2019;9:CD004849. 10.1002/14651858.CD004849.pub2. [ DOI ] [ PMC free article ] [ PubMed ] [ Google Scholar ]
189. Chambergo-Michilot D, Tellez WA, Becerra-Chauca N, Zafra-Tanaka JH, Taype-Rondan A. Text message reminders for improving sun protection habits: A systematic review. PLoS One 2020;15:e0233220. . 10.1371/journal.pone.0233220 [ DOI ] [ PMC free article ] [ PubMed ] [ Google Scholar ]
190. Kayssi A, Al-Jundi W, Papia G, et al. Drug-eluting balloon angioplasty versus uncoated balloon angioplasty for the treatment of in-stent restenosis of the femoropopliteal arteries. Cochrane Database Syst Rev 2019;1:CD012510. . 10.1002/14651858.CD012510.pub2 [ DOI ] [ PMC free article ] [ PubMed ] [ Google Scholar ]
191. Barker AL, Soh SE, Sanders KM, et al. Aspirin and fracture risk: a systematic review and exploratory meta-analysis of observational studies. BMJ Open 2020;10:e026876. . 10.1136/bmjopen-2018-026876 [ DOI ] [ PMC free article ] [ PubMed ] [ Google Scholar ]
192. Feng Q, Zhou A, Zou H, et al. Quadruple versus triple combination antiretroviral therapies for treatment naive people with HIV: systematic review and meta-analysis of randomised controlled trials. BMJ 2019;366:l4179. . 10.1136/bmj.l4179 [ DOI ] [ PMC free article ] [ PubMed ] [ Google Scholar ]
193. Neufeld KJ, Needham DM, Oh ES, et al. Antipsychotics for the Prevention and Treatment of Delirium. Agency for Healthcare Research and Quality, 2019. (AHRQ Comparative Effectiveness Reviews.) 10.23970/AHRQEPCCER219 . [ DOI ] [ PubMed ] [ Google Scholar ]
194. Gelbenegger G, Postula M, Pecen L, et al. Aspirin for primary prevention of cardiovascular disease: a meta-analysis with a particular focus on subgroups. BMC Med 2019;17:198. . 10.1186/s12916-019-1428-0 [ DOI ] [ PMC free article ] [ PubMed ] [ Google Scholar ]
195. Sterne JAC, Murthy S, Diaz JV, et al. WHO Rapid Evidence Appraisal for COVID-19 Therapies (REACT) Working Group . Association between administration of systemic corticosteroids and mortality among critically ill patients with COVID-19: a meta-analysis. JAMA 2020;324:1330-41. . 10.1001/jama.2020.17023 [ DOI ] [ PMC free article ] [ PubMed ] [ Google Scholar ]
196. Chau S, Herrmann N, Ruthirakuhan MT, Chen JJ, Lanctôt KL. Latrepirdine for Alzheimer’s disease. Cochrane Database Syst Rev 2015;(4):CD009524. 10.1002/14651858.CD009524.pub2. [ DOI ] [ PMC free article ] [ PubMed ] [ Google Scholar ]
197. Stone R, de Hoop T, Coombes A, et al. What works to improve early grade literacy in Latin America and the Caribbean? A systematic review and meta-analysis. Campbell Syst Rev 2020;16:e1067. 10.1002/cl2.1067 . [ DOI ] [ PMC free article ] [ PubMed ] [ Google Scholar ]
198. Ghasemiesfe M, Barrow B, Leonard S, Keyhani S, Korenstein D. Association between marijuana use and risk of cancer: a systematic review and meta-analysis. JAMA Netw Open 2019;2:e1916318. . 10.1001/jamanetworkopen.2019.16318 [ DOI ] [ PMC free article ] [ PubMed ] [ Google Scholar ]
199. Nussbaumer-Streit B, Mayr V, Dobrescu AI, et al. Quarantine alone or in combination with other public health measures to control COVID-19: a rapid review. Cochrane Database Syst Rev 2020;4:CD013574. 10.1002/14651858.Cd013574. [ DOI ] [ PMC free article ] [ PubMed ] [ Google Scholar ]
200. Kettrey HH, Marx RA, Tanner-Smith EE. Effects of bystander programs on the prevention of sexual assault among adolescents and college students: A systematic review. Campbell Syst Rev 2019;15:e1013. 10.4073/csr.2019.1 . [ DOI ] [ PMC free article ] [ PubMed ] [ Google Scholar ]
201. Jud L, Fotouhi J, Andronic O, et al. Applicability of augmented reality in orthopedic surgery - A systematic review. BMC Musculoskelet Disord 2020;21:103. . 10.1186/s12891-020-3110-2 [ DOI ] [ PMC free article ] [ PubMed ] [ Google Scholar ]
202. Semahegn A, Torpey K, Manu A, Assefa N, Tesfaye G, Ankomah A. Psychotropic medication non-adherence and its associated factors among patients with major psychiatric disorders: a systematic review and meta-analysis. Syst Rev 2020;9:17. . 10.1186/s13643-020-1274-3 [ DOI ] [ PMC free article ] [ PubMed ] [ Google Scholar ]
203. Tanjong Ghogomu E, Suresh S, Rayco-Solon P, et al. Deworming in non-pregnant adolescent girls and adult women: a systematic review and meta-analysis. Syst Rev 2018;7:239. . 10.1186/s13643-018-0859-6 [ DOI ] [ PMC free article ] [ PubMed ] [ Google Scholar ]
204. Chou R, Dana T, Fu R, et al. Screening for hepatitis C virus infection in adolescents and adults: updated evidence report and systematic review for the US Preventive Services Task Force. JAMA 2020; . 10.1001/jama.2019.20788 [ DOI ] [ PubMed ] [ Google Scholar ]
205. Buchbinder R, Johnston RV, Rischin KJ, et al. Percutaneous vertebroplasty for osteoporotic vertebral compression fracture. Cochrane Database Syst Rev 2018;11:CD006349. 10.1002/14651858.CD006349.pub4. [ DOI ] [ PMC free article ] [ PubMed ] [ Google Scholar ]
206. Ritchie SJ, Tucker-Drob EM. How much does education improve intelligence? A meta-analysis. Psychol Sci 2018;29:1358-69. . 10.1177/0956797618774253 [ DOI ] [ PMC free article ] [ PubMed ] [ Google Scholar ]
