7

Systematic Reviews, Evidence-Based Imaging, and Knowledge Translation


Andrea S. Doria, Jennifer Stinson, and Prakeshkumar Shah


1973: “It is surely a great criticism of our profession that we have not organized a critical summary, by specialty or subspecialty, adapted periodically, of all relevant randomized controlled trials.”


—Professor Archibald Cochrane
(1909–1988)


Learning Objectives


• To provide basic definitions of unstructured reviews, systematic reviews, meta-analyses, pooled analyses, evidence-based imaging, and guidelines.


• To outline and describe the steps for conducting a systematic review and meta-analysis and for developing clinical practice guidelines based on evidence derived from available systematic reviews/meta-analyses.


• To discuss the available tools for assessing the quality of reporting and methodology of papers and systematic reviews in diagnostic imaging.


• To introduce concepts of implementation and knowledge translation, T1 and T2 translational research, and key guideline questions for knowledge translation activities, and to discuss the role of this new science in accelerating the dissemination of knowledge in the radiological sciences.


Introduction


Evidence-based medicine is “the conscientious, explicit and judicious use of current best evidence in making decisions about the care of individual patients. The practice of evidence-based medicine means integrating individual clinical expertise with the best available external evidence from systematic research.”1,2


Evidence-based imaging (EBI), in contrast to the traditional paradigm, acknowledges that intuition, unsystematic clinical experience, and pathophysiologic rationale are insufficient grounds for clinical decision making, and stresses the examination of evidence from clinical research in a critical manner. EBI suggests that a formal set of rules must complement medical training and common sense for clinicians to effectively interpret the results of clinical research. Finally, EBI places a lower value on authority than the traditional paradigm of medical practice.3,4


The evidence-based process involves a series of steps: (1) formulation of the clinical question, (2) identification of the medical literature, (3) critical appraisal of the literature, (4) summary or synthesis of the evidence, and (5) application of the evidence to derive an appropriate course of action.5


An evidence-based practitioner must be able to understand the patient’s circumstances or predicament (including issues such as social supports and financial resources); identify knowledge gaps and frame questions to fill those gaps; conduct an efficient literature search; critically appraise the research evidence; and apply that evidence to patient care.6,7


The overall number of structured review articles in medicine has increased more than 40 times in the last two decades, according to a search of the publication terms “meta-analysis” (MeSH or tw) or “systematic review” (tw) in MEDLINE, derived from Ovid: from 3,255 articles published prior to 1994 to 22,302 articles up to 2004 to 122,232 articles up to 2015 (searched in late December 2015). However, of these review articles catalogued up to 2015, only approximately 4,470 (3.7%) have evaluated radiology-related topics, including conventional radiography, ultrasonography, computed tomography, magnetic resonance imaging, and radionuclide imaging. The proportion of systematic review/meta-analysis articles in radiology compared to the overall number in the entire field of medicine has slowly increased over the last two decades (1.6% in 1994 and 2.6% in 2004). However, these numbers still indicate a paucity of articles that summarize the best estimates of procedures’ effects, imaging as outcome measures in clinical effectiveness studies, diagnostic tests, or economic evaluations in radiology.8 This could be secondary to an insufficient body of primary evidence that can be reviewed for some topics in radiology, and/or relate to the fact that some reviews that contain a substantial number of low-quality primary studies may provide contradictory evidence on the effectiveness of interventions or accuracy of diagnostic tests.


Definitions and Types of Reviews


Reviews are essential tools for researchers and clinicians who want to keep up with the evidence that has accumulated in their fields. They can be unstructured (narrative reviews or commentaries) or structured (systematic reviews, meta-analyses, pooled analyses). The latter enable assessment of the existing evidence on a topic of interest and can conclude by supporting a practice, refuting it, or identifying areas in which additional studies are needed.


The most general type of review article is the narrative overview, followed by systematic reviews, meta-analyses, and pooled analyses. A narrative overview is a potentially biased, nonstructured literature review on a specific topic that raises a broad research question. It provides a qualitative summary of the literature in the field.9 The Radiology series “State of the Art” and “How I Do It” are typical examples of narrative reviews, for which experts are invited to write an article for the journal.2


A systematic review addresses a clearly formulated question and uses systematic and explicit methods to identify, select, and critically appraise relevant research, and to collect and analyze data from the studies included in the review.9 Systematic reviews are classified as “secondary” literature and should be distinguished from original published journal articles, which constitute the “primary” literature.10,11 Table 7.1 compares the characteristics of narrative and systematic reviews.


Systematic reviews aim to estimate summary effects (synthetic goal) from a qualitative perspective and may or may not include a meta-analysis. The term meta-analysis is used when an attempt is made to estimate summary effects (synthetic goal) and differences (analytic goal) from a quantitative perspective by applying statistical methods.12 A pooled analysis is a meta-analysis based on individual-level patient or study data.13 Although pooling the results of multiple studies reduces random error and increases the applicability of results across a broad range of patients, it risks violating the initial assumption of the analysis, which is to provide a nonbiased single best estimate of a patient’s prognosis, the effect of a treatment or diagnostic procedure, or the accuracy of a diagnostic test. The solution to this dilemma is to evaluate the extent to which results differ from study to study, namely, the heterogeneity of study results,14 which is further discussed in Chapter 16.
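Heterogeneity is treated in depth in Chapter 16; as a brief illustration, one common way to quantify it is Cochran's Q statistic with the derived I² percentage. A minimal sketch, using hypothetical study effect estimates (e.g., log odds ratios) and standard errors rather than data from any real review:

```python
# Cochran's Q and I^2 for between-study heterogeneity.
# effects and std_errors are hypothetical per-study estimates.
def heterogeneity(effects, std_errors):
    weights = [1 / se**2 for se in std_errors]      # inverse-variance weights
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    q = sum(w * (e - pooled)**2 for w, e in zip(weights, effects))  # Cochran's Q
    df = len(effects) - 1
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0  # I^2, as a percentage
    return pooled, q, i2

pooled, q, i2 = heterogeneity([0.4, 0.6, 0.5], [0.2, 0.25, 0.3])
```

Here the three hypothetical estimates agree closely, so Q falls below its degrees of freedom and I² is 0%; widely discrepant estimates would drive I² toward 100%.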


The art of conducting and evaluating a systematic review or meta-analysis requires prior knowledge of evidence-based concepts.


Relevance


Unsystematic observations of clinicians constitute one source of evidence, and physiologic experiments another. Unsystematic clinical observations are limited by small sample size and by limitations in human processes of making inferences.15 Predictions about intervention effects on clinically important outcomes from physiologic experiments are usually right, but can be wrong. Observational studies are inevitably limited by the possibility that apparent differences in treatment effect are really due to differences in patients’ prognosis in the treatment and control groups.3 Given the limitations of unsystematic clinical observations and physiologic rationale, EBI is desirable.


Table 7.1 Comparison of characteristics of narrative and systematic reviews

Characteristic | Narrative review | Systematic review
Overarching research question(s) scope | Broad | Narrow (focused)
Authorship | Typically one or a small number of authors from a given discipline | Typically multiple authors from different disciplines that relate to the review scope
Source | Data provided by the authors, therefore often biased | Data obtained systematically to identify all relevant literature, therefore less prone to bias
Data appraisal | No objective assessment of the quality of the data reviewed | Objective assessment of the quality of the primary studies
Data synthesis | No statistical analysis of primary studies | Statistical analysis by meta-analysis where possible; if primary studies are heterogeneous, meta-analysis is not possible and the reasons for data heterogeneity should be explained
Data reliability | Replication of results not expected | Replication of results expected by applying the presented methods to the primary studies
Inferences | Conclusions reflect a small group of experts’ perspectives, therefore may be biased | Conclusions on the overarching questions based on objective data analysis

The overall goal of a systematic review or meta-analysis is to combine results of previous studies to arrive at summary conclusions about a body of research.16 A properly conducted systematic review/meta-analysis can summarize large amounts of data. For health care providers, consumers, and policy makers who are interested in the bottom line of evidence, systematic reviews can help outline conflicting results of research. In radiology, systematic reviews or meta-analyses can be used to provide a summary estimate of effect size of a treatment that used imaging data to assess outcomes in observational or randomized controlled clinical trials, to estimate the clinical effectiveness of an imaging-guided therapy procedure, to synthesize results of economic evaluations that have used imaging data, or to evaluate the summary diagnostic accuracy of an imaging test. With regard to the latter purpose, clinicians, policy makers, and patients would like to know if the application of the test improves the outcome, what test to use or to recommend in practice guidelines, and how to interpret test results.17 Well-designed diagnostic accuracy studies can help in making these decisions, provided that they fully report their participants, tests, methods, and results.


Steps for Conducting a Systematic Review


The steps involved in a systematic review are similar to the phases of any other research undertaking: formulation of the problem to be addressed; collection of data; critical appraisal (quality assessment) and analysis of data from observational or randomized studies; and interpretation of results (assessment of heterogeneity, sensitivity, and subgroup analyses).


Protocol Phase


The initial step in the protocol phase is to define the main outcome of interest, such as clinical effectiveness of diagnostic procedures or drugs in studies that use imaging as outcome measures, performance of diagnostic tests, or the cost–benefit, cost-effectiveness, or cost-utility of treatment strategies or health care programs that involve diagnostic imaging tools. Before the start of the study, a detailed review protocol should be established that clearly states the question to be addressed, subgroups of interest, the methods and criteria to be used to identify and select relevant studies, and how information will be extracted and analyzed. Without such a protocol, unexpected or undesired results can be excluded through post hoc changes to the inclusion criteria; any such changes should be documented and justified in the review. Eligibility criteria for the review should define the study participants, interventions, outcomes, study designs, and methodologic quality of studies to be included in the review. An example of how authors can establish a systematic review/meta-analysis protocol to evaluate the diagnostic performance of ultrasonography (US) and computed tomography (CT) for the diagnosis of appendicitis in pediatric and adult populations18 is available in Table 7.2.


Review Phase


Identification of Studies

The search strategy for identifying relevant studies should be clearly defined considering multiple database sources: MEDLINE, EMBASE, EBM reviews, Cochrane Controlled Clinical Trials Register (CCTR), and bibliographic databases specific to such disciplines as nursing (CINAHL), behavioral sciences (PsycINFO), alternative medicine (MANTIS, AMED), physiotherapy (PeDRO), and oncology (CANCERLIT); checking of reference lists and personal files; hand searching of key journals; and personal communication with experts in the field. The search process should include MeSH terms pertaining to the population, intervention, comparison groups, and outcomes of interest as described in Chapter 10.


A comprehensive search should consist of the following steps:


1. Meta-analysis (pt)


2. Meta-anal: (textword)


3. Metanal: (textword)


4. Quantitative: review: OR quantitative: overview: (textword)


5. Systematic: review: OR systematic: overview: (textword)


6. Methodologic: review: OR methodologic: overview: (textword)


7. Review (pt) AND Medline (text word) [and other databases]


1 OR 2 OR 3 OR 4 OR 5 OR 6 OR 7


It is highly recommended that authors prepare a flow diagram showing the search and selection process used for the identification and quality assessment of articles (Fig. 7.1). Moreover, when feasible, investigators should use a “topic-only” search strategy and avoid restricting inclusion to articles written in certain languages, in an attempt to prevent language bias.19,20


Selection of Studies

Decisions regarding the inclusion or exclusion of individual studies often involve some degree of subjectivity. It is therefore useful to have at least two readers check the eligibility of candidate studies, with disagreements resolved by consensus or by a third reviewer.


It is advisable to keep a log of excluded studies, with the reasons for exclusion, which should be available on request from the authors of the review.


Assessment of “Risk of Bias” for Study Quality

Independent assessment of the methodologic quality of individual studies by more than one reader is recommended. Blinding of readers to investigators’ names and institutions, journal names, and acknowledgments is controversial because it is time consuming and the potential benefits may not always justify the additional costs.21 The quality of primary studies can be measured with scales or checklists. Use of scales involves assigning each item a numerical score; the sum of the item scores then determines the overall quality of the study.22 Use of checklists involves scoring items as “yes” or “no” and assigning one point for each “yes” item; the final score for a study corresponds to the sum of these “yes” items.23 Theoretical considerations24,25 suggest that scales generally should not be used to assess the quality of trials in meta-analyses and that simple checklists are preferable. A grading display of the strength of the evidence level according to the type of study design is shown in Fig. 7.2. In general, the type of design has an effect on the overall quality of the primary study. Double-blind, randomized, controlled clinical trials provide the strongest evidence for a causal relationship. Conversely, the indirect evidence found in case reports, expert opinion, and consensus committees provides the weakest evidence.26 Some potential sources of bias in primary studies include selection bias (caused by incomplete randomization or allocation of patients to the alternative and standard care groups), performance bias (caused by differences in the care provided to patients exposed or not exposed to an intervention or diagnostic procedure), detection bias (caused by differences in outcome assessment between two groups of patients), and attrition bias (caused by differences in withdrawal or participation rates of patients in randomized controlled trials).27
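The checklist scoring just described (one point per “yes” item, final score equal to the sum) can be sketched in a few lines; the items below are hypothetical examples, not a validated instrument:

```python
# Hypothetical quality checklist for one primary study, scored as described:
# each item answered "yes" contributes one point to the final score.
checklist = {
    "randomization adequate": "yes",
    "allocation concealed": "no",
    "outcome assessors blinded": "yes",
    "withdrawals and dropouts reported": "yes",
}
score = sum(1 for answer in checklist.values() if answer == "yes")
# score counts the "yes" items (here, 3 of 4)
```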


Table 7.2 Example of a protocol (inclusion and exclusion criteria) for study selection in a systematic review/meta-analysis

Participants
Inclusion: Segmentation of results according to age groups, with a maximum age of 20 years for children and young adults and a minimum age of 13 years for adults; if this criterion was not fully met, the proportion of patients with outlying ages could not exceed 5% of the total sample size. Inclusion of both female and male patients (i.e., ratio of one sex to the other < 3:1).
Exclusion: Data for pregnant women.

Target disorder
Inclusion: Appendicitis. Studies with a 15%–75% sample prevalence of appendicitis derivable from the reported results (i.e., true-positive plus false-negative results, divided by the total of true-positive, true-negative, false-positive, and false-negative results), as arbitrarily determined and checked by means of sensitivity analysis.

Research design of primary studies
Inclusion: Prospective or retrospective studies evaluating the performance of abdominal ultrasound and/or CT. Availability of data for the absolute number of true-positive, true-negative, false-positive, and false-negative findings, either reported, derivable from the results, or communicated by the authors in response to our request. No language restriction.
Exclusion: Case reports, case series, reviews, pictorial essays, unpublished data, abstracts, and letters to the editor. Focus on topics other than diagnostic test assessment, such as management decision issues or cost-effectiveness analyses. When more than one study uses the same data or when the durations of studies overlap, the study with the larger sample size is selected to avoid duplication of data.

Prior tests
Exclusion: Studies where patients had a prior diagnosis of appendicitis (interval appendectomies).

Ultrasound test methods
Inclusion: Criteria for positive and negative test results defined, with imaging criteria for positivity for appendicitis that included visualization of an inflamed (diameter > 6 mm), noncompressible appendix at ultrasound or, in the case of nonvisualization of the appendix, the presence of inflammatory signs of appendicitis, such as an appendicolith, cecal thickening, arrowhead sign, or cecal bar (as seen on CT images). Experience of operators described.
Exclusion: Performance of more than one ultrasound examination per patient.

CT test methods
Inclusion: Criteria for positive and negative test results defined. In studies evaluating the performance of CT scanning, a description of the technique used, namely, the use of oral, rectal, and intravenous contrast material with a limited or complete scan. Experience of operators described.
Exclusion: Performance of more than one CT examination per patient.

Reference test
Inclusion: Surgical/anatomopathologic or follow-up results. Criteria for positive and negative test results defined.

Let us assume a hypothetical example of a clinical trial that compared the effectiveness of an imaging-guided therapy procedure (alternative arm of the study) and an open laparotomy procedure (standard care arm) in terms of rates of postprocedure complications in subsets of patients with clinically suspected perforated appendicitis. Without a strategy to avoid biases, more patients with severe clinical symptoms were examined by means of CT than sonography before their surgical procedures in each of the study arms (selection bias). In addition, their imaging-guided therapy and open laparotomy procedures were performed by both a radiology fellow and a staff radiologist, rather than by the on-call fellow only (performance bias), and their rate of complications in each procedure arm was evaluated by two observers, rather than by a single observer, as would be the standard approach for patients with less severe abdominal pain (detection bias). Finally, a greater proportion of patients with severe abdominal pain in both study arms refused to participate in the study; this differential participation could distort the observed outcome rates (attrition bias). In this example, because of the presence of these biases in both arms of the study, results in patients with severe abdominal pain were systematically different from results in patients with less severe symptoms. These biases could have influenced the summary estimate of the effect of the procedures in subsets of patients with clinically suspected perforated appendicitis in a meta-analysis that included this particular study.
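The 15%–75% sample-prevalence inclusion criterion in Table 7.2 can be checked directly from a study's reported 2×2 results; a small sketch with hypothetical counts:

```python
# Sample prevalence derivable from a diagnostic study's 2x2 results:
# (true positives + false negatives) / (all results). Counts are hypothetical.
def sample_prevalence(tp, fp, tn, fn):
    return (tp + fn) / (tp + fp + tn + fn)

prev = sample_prevalence(tp=45, fp=5, tn=60, fn=10)   # 55 diseased of 120
meets_criterion = 0.15 <= prev <= 0.75                 # Table 7.2 inclusion range
```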



Extraction of Data

A standardized data record form is needed for this purpose. More than one reader should extract the data, to avoid or minimize errors arising from inadequate indexing of reports or from differences between observers. An arbitrator may be required to resolve disagreements. The rate of disagreement between readers during the data-extraction process should be reported in the final publication. To facilitate extraction and subsequent analysis, data should be entered in a study table designed with the research questions in mind (example in Supplementary Table 1). Important domains and elements for rating the methodologic quality of individual studies (in syntheses of studies of clinical or technology implementation effectiveness, of diagnostic tests, and of economic evaluations) should include:


• Study question


• Study population


• Randomization


• Blinding


• Intervention


• Outcomes


• Statistical analysis


• Results


• Discussion


• Funding (if appropriate)


Data Synthesis of Studies of Clinical or Technology Implementation Effectiveness


For data synthesis of randomized controlled clinical trials as primary studies, checklists of risk of bias (e.g., The Cochrane Collaboration’s tool for assessing risk of bias28; Supplementary Table 2) can be used to assess the quality of primary studies. After studies have been selected and critically appraised and data have been extracted, the characteristics of included studies and their individual results should be expressed in a standardized format to allow for comparison between studies. If the outcome is binary (e.g., disease vs. no disease; intervention vs. standard practice procedure), odds ratios, relative risks, or risk differences can be calculated. If the outcome is continuous (e.g., percentage enhancement of a tissue after contrast administration), the mean difference, standardized mean difference, or correlation coefficients can be applied. Odds ratios have convenient mathematical properties: they do not have the inherent range limitations associated with high baseline rates29 and are suitable for statistical manipulation, as they can be obtained as the antilog of logistic regression coefficients. Details on this are available in Chapter 16. Nevertheless, relative risks usually are preferred over odds ratios because they are more intuitively understandable.30,31 Before pooling the results of individual studies using an effect measure (e.g., odds ratio or relative risk), the investigator should evaluate for the presence of heterogeneity within and between studies.
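For a binary outcome, the three effect measures just mentioned follow directly from a single trial's 2×2 table. A sketch with hypothetical event counts (not taken from any real trial):

```python
# Relative risk, risk difference, and odds ratio from hypothetical
# events/total counts in the intervention and control arms of one trial.
def effect_measures(events_tx, n_tx, events_ctl, n_ctl):
    risk_tx, risk_ctl = events_tx / n_tx, events_ctl / n_ctl
    rr = risk_tx / risk_ctl                      # relative risk
    rd = risk_tx - risk_ctl                      # risk difference
    odds_tx = events_tx / (n_tx - events_tx)
    odds_ctl = events_ctl / (n_ctl - events_ctl)
    or_ = odds_tx / odds_ctl                     # odds ratio
    return rr, rd, or_

rr, rd, or_ = effect_measures(events_tx=10, n_tx=100, events_ctl=20, n_ctl=100)
```

With these counts the event risk is halved (RR = 0.5) while the odds ratio is more extreme (4/9 ≈ 0.44), illustrating why the two measures diverge when baseline rates are not small.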


Data Synthesis of Studies of Diagnostic Tests


Methodologic quality assessment of individual studies in systematic reviews is necessary to identify potential sources of bias and to limit the effects of these biases on the estimates and conclusions of the review. The methodologic quality of a study has been defined as “the extent to which all aspects of a study’s design and conduct can be shown to protect against systematic bias, non-systematic bias that may arise in poorly performed studies, and inferential error.”32


The Standards for Reporting of Diagnostic Accuracy (STARD) checklist was not developed as a tool to assess the methodologic quality of diagnostic studies.33 Rather, this 25-item checklist has been used to evaluate the quality of reporting of diagnostic studies by ensuring that all relevant information is present.


However, many items in the checklist are included in recently developed tools for quality assessment of diagnostic accuracy (the Quality Assessment of Diagnostic Accuracy Studies [QUADAS-2] tool34) and reliability (the Quality Appraisal of Reliability Studies [QAREL] tool,35 Supplementary Table 3). The QUADAS-2 tool is structured as a list of two domains (risk of bias and applicability) and 14 questions on diagnostic accuracy, each of which should be answered “yes,” “no,” or “unclear.” Under both domains, items cover patient selection, index test, reference standard, verification and review bias, clinical review bias, incorporation bias, test execution, study withdrawals, and intermediate results. In addition, the risk of bias domain covers flow/timing. The QAREL tool includes 11 items that explore seven principles: the spectrum of subjects, the spectrum of examiners, examiner blinding, order effects of examination, the suitability of the time interval between repeated measurements, appropriate test application and interpretation, and appropriate statistical analysis.


The results of quality appraisal can be summarized to offer a general impression of the validity of the available evidence. Review authors should not use an overall quality score, as different shortcomings may generate different magnitudes of bias, even in opposing directions, making it very hard to attach sensible weights to each quality item. A way to summarize the quality assessment is shown in Fig. 7.3, where stacked bars are used for each QUADAS-2 item. Another way of presenting the quality assessment results is by tabulating the results of the individual QUADAS-2 items for each single study. The effects of the STARD guidelines for complete and transparent reporting are only gradually becoming visible in the literature.36,37
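Tabulating individual item answers rather than computing an overall score, as recommended above, can be as simple as tallying the “yes”/“no”/“unclear” responses that feed a stacked-bar display; the study names and answers below are hypothetical:

```python
from collections import Counter

# Hypothetical answers for one QUADAS-2 signaling question
# (e.g., patient selection) across the included studies.
answers = {
    "Study A": "yes",
    "Study B": "unclear",
    "Study C": "yes",
    "Study D": "no",
}
tally = Counter(answers.values())  # per-answer counts for one stacked bar
```

Repeating this per item yields the per-item bars of Fig. 7.3 without ever collapsing the assessment into a single summary score.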


Apr 5, 2019 | Posted in GENERAL RADIOLOGY