Overdiagnosis and Risks of Breast Cancer Screening

Since its widespread introduction 30 years ago, screening mammography has contributed to a substantial reduction in breast cancer-associated mortality, ranging from 15% to 50% in observational trials. It is currently the best examination available for diagnosing breast cancer early, when survival and treatment options are most favorable. However, like all medical tests and procedures, screening mammography has associated risks, including overdiagnosis and overtreatment, false-positive examinations, false-positive biopsies, and radiation exposure. Women should be aware of the benefits and risks of screening mammography in order to make the most appropriate care decisions for themselves.

Key points

• The benefits of screening mammography include early diagnosis, reduced breast cancer-associated mortality, and reduced treatment-associated morbidity.
• The risks of screening mammography include overdiagnosis/overtreatment, false-positive screening, false-positive biopsy, and radiation exposure.
• Overdiagnosis of breast cancer is a consequence of screening, and is more common in older women. When lead time and background incidence of breast cancer are adjusted for, the most reliable estimates of breast cancer overdiagnosis are less than 10%.
• For most women 40 years of age and over, the potential benefit of breast cancer screening outweighs its risks.

Introduction

Breast cancer remains a substantial health risk for women worldwide. For 2020, the American Cancer Society estimated that 276,480 women in the United States would be diagnosed with invasive breast cancer and that over 42,000 women would die of the disease. Despite these sobering statistics, mortality would be higher if not for the widespread implementation of screening mammography in the United States 3 decades ago. Screening mammography has contributed to a 40% reduction in breast cancer-associated mortality among US women between 1989 and 2016. Using Cancer Intervention and Surveillance Modeling Network (CISNET) data, Plevritis and colleagues reported a 49% reduction in breast cancer-associated mortality from 2000 to 2012 because of screening mammography and therapeutic advances. Helvie and colleagues reported a 37% decrease in late-stage disease from 2007 to 2009, compared with the prescreening era. Depending on the background incidence rate of breast cancer used in the modeling, Hendrick and colleagues estimated that 384,046 to 614,484 breast cancer deaths have been averted since 1989 because of screening and treatment advances. Tabar and colleagues evaluated long-term Swedish data and found that women who participated in organized breast cancer screening had a 60% lower risk of dying from breast cancer within 10 years after diagnosis and a 47% lower risk of dying within 20 years after diagnosis, compared with nonscreened women who had access to comparable stage-specific national treatment protocols. These results demonstrate that women who participate in screening mammography obtain a significantly greater benefit from the therapies available at the time of diagnosis than do those who do not participate. Overall, screening mammography has contributed considerably to the health of women in the United States over the last 30 years. However, like all medical tests and procedures, screening mammography is imperfect and has associated risks. These risks include overdiagnosis and overtreatment, false-positive examinations, and radiation exposure. In order to make the most appropriate care decisions for themselves, patients should be aware of the benefits and risks of screening mammography.

Overdiagnosis

Definition

Overdiagnosis refers to the detection at screening of a breast cancer that would not have become clinically apparent or harmed the patient in her lifetime. Overdiagnosis is often cited among the most harmful risks of breast cancer screening, because the ensuing treatment (overtreatment) may cause physical and/or psychosocial morbidity. Overdiagnosis can be conceptualized as 2 types. Type I overdiagnosis, also referred to as obligate overdiagnosis, is a screen-detected cancer that would not have become clinically evident before the patient's death from a noncancer etiology, such as cardiovascular disease. If the patient dies of a cause other than breast cancer during the mammographic lead time, the screen-detected cancer is an example of Type I overdiagnosis. Because women of more advanced age face higher competing mortality risks, the risk of Type I overdiagnosis is higher for older women. Type II overdiagnosis, also referred to as nonobligate overdiagnosis, is a screen-detected breast cancer that is extremely indolent (nonobligate progression) or that may even regress. Breast cancer regression, however, is extremely unlikely. In 2017, an expert group of radiologists who had interpreted a total of 6.8 million screening mammograms and diagnosed over 34,000 breast cancers reported zero cases of breast cancer regression among 479 untreated, screen-detected invasive and noninvasive breast cancers. Indolent, nonprogressive breast cancer is possible, particularly in older, postmenopausal women. However, the genetic events associated with breast cancer progression are not yet fully understood, so it is not currently possible to discern which breast cancers will remain indolent. As scientific knowledge of breast cancer genetics and tailored therapies improves, it may become possible to prospectively identify biologically indolent tumors and allow for limited or no treatment in some cases. This would reduce the individual risks of overdiagnosis and associated overtreatment.

Measuring Overdiagnosis

The incidence of overdiagnosed breast cancer cannot be measured directly as an absolute number, and attempts to estimate it are complicated. The lack of a standardized method for measuring breast cancer overdiagnosis is evident in the wide array of published estimates, which range from 1% to 57%. A direct quantification would require following a large cohort of women with similar breast cancer risk and untreated screen-detected breast cancers over an extended period of time, and then measuring the percentage of breast cancers that do not manifest clinically. Such a study is not ethically feasible because of the possible harm to patients. Furthermore, few women would willingly forgo treatment, leading to a small study cohort that could not be used to inform treatment for a large population of women. Therefore, overdiagnosis is usually estimated from randomized control trial (RCT) and observational study data as the difference between the observed breast cancer incidence and the incidence expected in the absence of screening. There is a large discrepancy among published overdiagnosis estimates, in part because of disputes over the expected incidence. One method to quantify overdiagnosis is as follows. It starts with the observation of cumulative breast cancer incidence during and following a screening RCT. During the intervention portion of the trial, breast cancer incidence in the screening arm is greater than in the control arm, because screening brings diagnoses forward in time. After completion of the RCT, breast cancer incidence in both arms is observed over an extended period of time (equal to or greater than the mammographic lead time distribution), with neither arm participating in any additional screening. In randomized groups of equal-risk patients, the cumulative breast cancer incidence should converge to the same value in both groups if there is no overdiagnosis; any excess that persists in the screening arm represents overdiagnosis. An extended observation period of at least 10 years would be necessary to allow control patients to reach the incidence of the patients invited to screen. However, this type of observation from an RCT is often not possible, because multiple RCTs invited the control group to screen at the completion of the trial, rendering an overdiagnosis estimate unreliable.
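The cumulative-incidence comparison described above can be sketched in a few lines of Python. This is a minimal illustration only: the yearly case counts are hypothetical (they are not taken from any trial), the 5-year screening phase and 10-year follow-up are assumptions, and the helper name `cumulative_excess` is invented for this example.

```python
from itertools import accumulate

def cumulative_excess(screened_yearly, control_yearly):
    """Return, year by year, the cumulative case count in the screening arm
    minus the cumulative case count in the control arm."""
    if len(screened_yearly) != len(control_yearly):
        raise ValueError("both arms need the same number of years")
    screened_cum = accumulate(screened_yearly)
    control_cum = accumulate(control_yearly)
    return [s - c for s, c in zip(screened_cum, control_cum)]

# Hypothetical yearly cancer counts in two equal-sized, equal-risk arms.
# Years 1-5: screening in the study arm only (assumed).
# Years 6-15: follow-up with no screening in either arm (assumed).
screened = [30, 24, 24, 24, 24] + [16] * 5 + [20] * 5
control = [20] * 15

excess = cumulative_excess(screened, control)

excess_at_end_of_screening = excess[4]   # mostly lead time (diagnoses brought forward)
excess_at_end_of_follow_up = excess[-1]  # what remains after the control arm catches up

print("Excess at end of screening:", excess_at_end_of_screening)
print("Excess at end of follow-up:", excess_at_end_of_follow_up)
```

In this toy data set, most of the excess seen at the end of the screening phase disappears during follow-up as the control arm catches up; only the excess that persists would be attributed to overdiagnosis.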

The Malmö trial did not invite the control group to screen after completion of the trial, and thus may provide the best opportunity to quantify overdiagnosis. Inviting women aged 55 to 69 (at randomization) to screening detected 10% more breast cancers (invasive and noninvasive) than became clinically evident in age-matched control subjects at 15 years of follow-up. When noninvasive breast cancers were excluded, an overdiagnosis rate of 7% was estimated. It has been postulated that if members of the screening arm continued screening after the trial, the observed incidence of breast cancer in the screened patients would increase, thus increasing the overdiagnosis estimate, with 10% as a potential maximum estimate of overdiagnosis resulting from mammographic screening. Conversely, a fraction of control patients (25% in the Malmö and Canadian trials) reported opportunistic screening during the trial and follow-up period, which may have led to an underestimate of overdiagnosis.

The Canadian National Breast Screening Studies (CNBSS-1 and CNBSS-2) were RCTs that did not systematically invite the control patients to screen at the close of the trial. In 2016, Baines and colleagues evaluated the data from the CNBSS trials and reported that 20 years after cessation of screening, overdiagnosis of invasive breast cancer was 48% for CNBSS-1 (women 40–49 years of age) and 5% for CNBSS-2 (women 50–59 years of age). However, when Marmot and colleagues analyzed the CNBSS trial data, they reported overdiagnosis estimates of 12.4% and 9.7% for CNBSS-1 and CNBSS-2, respectively. The marked difference between the 2 groups' estimates may be explained by the denominators used in the overdiagnosis calculations. For example, the estimate by Marmot and colleagues varied depending on whether the denominator was the “cancers diagnosed over the entire follow-up period” or the “cancers diagnosed during the screening period.” The latter denominator is smaller, leading to a higher estimate of overdiagnosis. When overdiagnosis rates from the Malmö and both Canadian RCTs were calculated using different denominators, the estimates ranged from 9.7% to 29.4%. The appropriate denominator may depend on the purpose of the estimate. For example, if the calculation is made from a population perspective, a denominator that includes all breast cancers diagnosed in women of all ages would be used, whereas if the purpose is to estimate an individual patient’s risk of being overdiagnosed, a denominator that includes women of screening age and older may be used. When the aggregate RCT data were calculated from a population perspective, Marmot and colleagues estimated an 11% risk of overdiagnosis. Similarly, de Gelder and colleagues demonstrated the extent to which overdiagnosis estimates are influenced by the denominator used. Working with observational incidence data from the Dutch screening program from 1990 to 2006, her group found that the estimated overdiagnosis rate could vary by a factor of 3.5 when different denominators were used.
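The effect of the denominator can be illustrated with a short Python sketch. The counts below are hypothetical and chosen only to make the arithmetic easy to follow; they are not data from the Malmö, Canadian, or Dutch analyses, and the function name `overdiagnosis_fraction` is an assumption made for this example.

```python
def overdiagnosis_fraction(screened_cases, control_cases, denominator):
    """Excess cancers in the screening arm, expressed as a fraction of
    whichever denominator the analyst has chosen."""
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    return (screened_cases - control_cases) / denominator

# Hypothetical counts, for illustration only.
screened_total = 1100            # screening-arm cancers over the entire follow-up period
control_total = 1000             # control-arm cancers over the entire follow-up period
screened_during_screening = 500  # screening-arm cancers diagnosed during the screening period

# Same excess (100 cancers), two different denominators.
whole_follow_up = overdiagnosis_fraction(
    screened_total, control_total, screened_total
)
screening_period = overdiagnosis_fraction(
    screened_total, control_total, screened_during_screening
)

print(f"Denominator = all cancers over follow-up:       {whole_follow_up:.1%}")
print(f"Denominator = cancers during screening period:  {screening_period:.1%}")
```

The excess number of cancers is identical in both calculations; only the denominator changes, and the smaller denominator produces the larger percentage.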

The Swedish Two-County and Gothenburg trials are examples in which the control group was invited to screening at the completion of the study arm, 6 to 8 years after randomization. Yen and colleagues note that in the Swedish Two-County trial, the “catch-up” of the control group’s breast cancer incidence began immediately after they were invited to screen, indicating that any degree of overdiagnosis is largely incurred at the prevalence screen. At 29 years of follow-up, Yen reported no excess incidence of breast cancer in the patients invited to screen in the Swedish Two-County trial, whether or not in situ disease was included (relative risk [RR] 1.00, confidence interval [CI] 0.92–1.08). When the data were evaluated by patient age at randomization, no excess cancer incidence was seen in any of the active study cohorts except the oldest (70–74 years of age at randomization), and this excess was not statistically significant (RR 1.25, CI 0.97–1.61). In this age group, a higher degree of overdiagnosis may be expected because of higher competing mortality. Overall, long-term follow-up data from the Two-County trial suggest that overdiagnosis is a minor phenomenon, is more notable in older patients, and is primarily confined to the prevalence screen. Prevalence screening may be more prone to detecting indolent breast cancers, whereas a cancer detected at an incident screen has arisen or grown since the previous examination, which suggests that it is more biologically active and more likely to be clinically relevant. In fact, when Duffy and colleagues evaluated the Two-County and Gothenburg RCT data, they found an estimated overdiagnosis rate of 1% for breast cancers diagnosed at incident screens after accounting for lead time. See Table 1 for adjusted overdiagnosis estimates from randomized control trials.

Table 1

Overdiagnosis estimates from selected randomized control trials

Randomized Control Trial	Country	Patient Age at Randomization	Overdiagnosis (Invasive and DCIS)	Reference
Malmö	Sweden	55–69	10.5%	Zackrisson et al, 2006
Swedish Two-County	Sweden	40–69	RR 1.0	Yen et al, 2012
CNBSS-1	Canada	40–49	12.4%	Marmot et al, 2013
CNBSS-2	Canada	50–59	9.7%	Marmot et al, 2013

Published RCT estimates of overdiagnosis are based on mammographic screening that was performed in an experimental setting at least 30 years ago. Evaluating more recent clinical screening programs through observational studies shows how time and technologic advances may have influenced overdiagnosis estimates. de Gelder and colleagues estimated overdiagnosis based on clinical data from 1990 to 2006 in the Netherlands, and found that the overdiagnosis risk ranged between 2.8% and 9.7%. Puliti and colleagues and the EUROSCREEN Working Group analyzed 13 observational studies from 7 European countries that estimated breast cancer overdiagnosis in clinical screening programs extending to 2006. Overdiagnosis estimates that adjusted for breast cancer risk and lead time ranged from 1% to 10%. Unadjusted estimates, however, ranged from 0% to 54%. In general, overdiagnosis estimates that adjust for breast cancer risk and lead time are similar to those of the Malmö trial and the EUROSCREEN Group, whereas studies that do not adjust for these factors tend to produce higher estimates. Some of the other discrepancies between overdiagnosis estimates could be explained by differences in the patient denominators used, the length of follow-up, and the absence of lead time adjustment. For example, evaluating incidence data from several clinical screening programs in Europe, Canada, and Australia, Jorgensen and colleagues estimated a much higher overdiagnosis rate of 52% (95% CI = 46%–58%). However, as Kopans points out, the expected cancer incidence may have been underestimated, because the background cancer incidence rates were considered stable and no adjustment was made for rising background incidence. Furthermore, focusing on the screening uptake phase of the study can lead to a high observed incidence that cannot be accounted for by the lead time, because of the prevalence screening effect. Evaluating the clinical screening program in Denmark, Jorgensen and colleagues published an overdiagnosis estimate of 33%, but when Njor and colleagues evaluated the same Danish screening program, they found a rate of only 2.3% (95% CI −3% to 8%) when the follow-up period was extended to account for lead time (Table 2).

Table 2
