The MR imaging BI-RADS atlas includes a category for probably benign findings, and many radiologists recommend short-term interval follow-up for breast MR imaging lesions. However, the characteristics of those MR imaging findings that could appropriately be considered probably benign are not well defined. This article reviews the published retrospective and prospective data regarding the use and cancer yield of the BI-RADS 3 assessment category as well as the morphology and kinetics of lesions that may be considered suitable for short-term follow-up. Consideration is also given to how the costs, clinical indication for the examination, experience of the reader, and availability of comparisons may affect the use of the BI-RADS 3 category. Based on the evidence that is available, an algorithm is offered with imaging examples as a guide for determining if lesions are appropriate for short-term follow-up. Additional research is needed to fully clarify the distinct morphologic and kinetic characteristics that allow patients to safely avoid biopsy without changing prognosis if malignancy is present.
The “probably benign” Breast Imaging Reporting and Data System (BI-RADS) 3 assessment category was originally established for use in mammography. The published evidence and current practice support the use of BI-RADS 3 and short-term follow-up for specific mammographic findings that have been shown to have a chance of representing malignancy in less than 2% of cases. Careful observation of these lesions may be preferred over biopsy to avoid the risks and costs of invasive tissue sampling. In addition, if the lesion is malignant, then the follow-up interval should be short enough that there is no change in prognosis. Therefore, current evidence supports short-interval mammographic follow-up as a valid alternative to tissue sampling when adhering to strict criteria.
In 2003 the BI-RADS atlas was revised. BI-RADS introduced the first edition lexicon for breast magnetic resonance (MR) imaging that, like its mammographic predecessor, promotes the standardization of lesion descriptors and assessment categories to facilitate reporting, communication, and research. This lexicon is based on the results of the International Working Group on Breast MR imaging and the American College of Radiology Breast MR imaging Lexicon Committee. An MR imaging BI-RADS 3 assessment category was included in this first edition to allow radiologists the option of following, rather than biopsying, some lesions noted on MR imaging. Although this approach was well established for mammographic lesions, the strict criteria for what safely constitutes BI-RADS 3 lesions had not been established for breast MR imaging. The provided definition appropriately reflected the lack of published literature pertaining to this MR imaging assessment:
A finding placed in this category is highly unlikely for malignancy and should have a very high probability of being benign. It is not expected to change over the follow-up interval, but the radiologist would prefer to establish its stability. Data are becoming available that shed light on the efficacy of short-interval follow-up. At the present time, most approaches are intuitive. These will likely undergo future modifications as more data accrue as to the validity of an approach, the interval required, and the type of findings that should be followed.
Although the lexicon represents an attempt to use similar concepts for each breast imaging modality, there are significant differences that should be considered. For example, the cost of a follow-up MR imaging examination is much greater than the cost of a follow-up mammogram. In addition, the average risk of a patient undergoing screening mammography differs from that of a patient who undergoes breast MR imaging.
This article discusses the issues that are specific to MR imaging and that affect the utility of an MR imaging probably benign assessment. This aspect is especially important, as the last few years have witnessed a significant increase in the number of breast MR examinations performed nationwide as well as concerns over the indications for and outcomes of its use. The authors address these questions through a review of the data on the MR imaging BI-RADS 3 category that have emerged since the first edition of the MR imaging lexicon was published. Given the combination of the high cost of breast MR imaging and MR image–guided biopsy, it is critical that clinicians strive to determine the proper selection criteria and follow-up interval of MR imaging BI-RADS 3 lesions, to prevent unnecessary biopsies and minimize unwarranted follow-up imaging.
Incidence of MR imaging BI-RADS 3
Prior published reports suggest that there is significant variation in the use of the probably benign assessment, which has been applied to 14% to 34% of patients and 6.6% to 25% of examinations in prior studies. At the upper end of this spectrum it is worth noting that as many as 1 out of every 3 patients undergoing breast MR imaging may be recommended to return for a follow-up examination. Such a strategy is unlikely to be sustainable in consideration of the cost of the examinations in dollars, time, and potential for additional false-positive findings.
Four retrospective studies tracked and reported the incidence of BI-RADS 3 even when BI-RADS 4 or 5 assessments were provided for additional lesions in that patient ( Table 1 ). These methods tracked all BI-RADS 3 lesions, even those that might not have been clinically relevant, for example, in a patient with known cancer who is planning mastectomy. The patient populations have been mixed, which may explain the wide range of BI-RADS 3 use in these studies, ranging from 14% to 24%.
Study | Probably Benign Examinations | Probably Benign Patients | Compliance with First Short-Interval Follow-Up | Frequency of Tissue Sampling a | Cancer Yield b |
---|---|---|---|---|---|
Liberman et al, 2003 | NR | 89/367 (24%) | 70/89 (79%) | 20/89 (22%) | 9/89 (10%) |
Sadowski & Kelcz, 2005 | NR | 79/473 (17%) | 68/79 (86%) | 5/79 (6.3%) | 4/79 (5.1%) |
Eby et al, 2009 c | 260/2569 (10%) | 236/1735 (14%) | 150/236 (64%) | 18/236 (7.6%) | 2/236 (0.8%) |
Total | 260/2569 (10%) | 404/2575 (15.7%) | 288/404 (71.3%) | 43/404 (10.6%) | 15/404 (3.7%) |
a Calculated as the number with tissue sampling divided by the number of patients.
b Calculated as the number of malignancies divided by the number of patients.
c Includes all 809 examinations from Eby PR, Demartini WB, Peacock S, et al. Cancer yield of probably benign breast MR examinations. J Magn Reson Imaging 2007;26(4):950–5.
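The Total row of Table 1 can be reproduced from the per-study counts. The following is a minimal sketch in Python, assuming the counts are pooled by simple summation across studies; the names `TABLE_1`, `pct`, and `pooled` are illustrative and do not come from the cited studies.

```python
# Patient-level counts transcribed from Table 1, in the order:
# (probably benign patients, all patients, returned for first follow-up,
#  tissue sampled, malignancies)
TABLE_1 = {
    "Liberman et al, 2003": (89, 367, 70, 20, 9),
    "Sadowski & Kelcz, 2005": (79, 473, 68, 5, 4),
    "Eby et al, 2009": (236, 1735, 150, 18, 2),
}


def pct(numerator, denominator):
    """Return a percentage rounded to one decimal place."""
    return round(100 * numerator / denominator, 1)


def pooled(table):
    """Sum each column across studies, then apply the footnote definitions."""
    pb, total, followed, sampled, cancers = (sum(col) for col in zip(*table.values()))
    return {
        "use": pct(pb, total),                # probably benign patients / all patients
        "compliance": pct(followed, pb),      # first follow-up / probably benign patients
        "tissue_sampling": pct(sampled, pb),  # footnote a
        "cancer_yield": pct(cancers, pb),     # footnote b
    }


totals = pooled(TABLE_1)
```

Because the pooling is unweighted, the largest cohort (Eby and colleagues) contributes most to each pooled percentage.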
There have been many well-designed prospective investigations of the performance of breast MR imaging for screening high-risk patients. A few of them have published data on the use and cancer yield of lesions assessed as probably benign ( Table 2 ). The incidence of probably benign lesions in these trials ranges from 6.6% to 25% of examinations. The methods for assigning and reporting BI-RADS assessments, from which the rate of use is calculated, vary between studies. For example, Kriege and colleagues only reported a case as probably benign if BI-RADS 3 was the most suspicious assessment given for any lesion in the examination. This method results in a lower reported rate of overall use of BI-RADS 3 compared with retrospective studies, such as that by Eby and colleagues, which reported all lesions assessed as BI-RADS 3 regardless of whether that was the highest order assessment for the patient’s examination.
Study | Probably Benign Examinations | Probably Benign Patients | Frequency of Tissue Sampling b | Cancer Yield c |
---|---|---|---|---|
Kuhl et al, 2000 a | 45/363 (12%) | 42/198 (21%) | 2/42 (4.8%) | 1/42 (2.4%) |
Kriege et al, 2004 a | 275/4169 (6.6%) | NR/1909 | 12/275 (4.4%) | 3/275 (1.1%) |
Hartman et al, 2004 | 19/75 (25%) | 14/41 (34%) | 0/14 (0%) | 0/14 (0%) |
Kuhl et al, 2005 d | 167/1452 (11.5%) | NR/529 | NR | NR |
a Examinations were only included if BI-RADS 3 was the most severe assessment.
b Calculated as the number with tissue sampling divided by the number of patients, except for the Kriege study where it is divided by the number of probably benign examinations.
c Calculated as the number of malignancies divided by the number of patients, except for the Kriege study where it is divided by the number of probably benign examinations.
d Includes Data from Kuhl CK, Schmutzler RK, Leutner CC, et al. Breast MR imaging screening in 192 women proved or suspected to be carriers of a breast cancer susceptibility gene: preliminary results. Radiology 2000;215(1):267–9.
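The denominators in Table 2 are not uniform, which matters when the yields are compared. The following is a minimal sketch in Python of the calculation described in footnote c; the names `TABLE_2` and `cancer_yield` are illustrative. The 2005 study by Kuhl and colleagues is omitted because its cancer yield was not reported.

```python
# (study, malignancies, denominator, unit of the denominator), from Table 2.
# Per footnote c, the Kriege study is divided by probably benign examinations,
# whereas the other studies are divided by probably benign patients.
TABLE_2 = [
    ("Kuhl et al, 2000", 1, 42, "patients"),
    ("Kriege et al, 2004", 3, 275, "examinations"),
    ("Hartman et al, 2004", 0, 14, "patients"),
]


def cancer_yield(malignancies, denominator):
    """Cancer yield as a percentage, rounded to one decimal place."""
    return round(100 * malignancies / denominator, 1)


yields = {study: (cancer_yield(m, d), unit) for study, m, d, unit in TABLE_2}
```

Keeping the unit alongside each percentage makes it explicit that the per-examination yield from the Kriege study is not directly comparable with the per-patient yields.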
For comparison, the incidence of probably benign assessments for patients undergoing mammography has been reported to range from 1.2% to 14%. Although strict recommendations for target rate of BI-RADS 3 use in mammography are not published, it has been suggested that it should be “considerably less than 5%.” Kerlikowske and colleagues reported that 5.2% of first and 1.7% of subsequent screening examinations included recommendation for short-interval follow-up in a multicenter study by the Breast Cancer Surveillance Consortium. If possible, one should strive to reach this low level of MR imaging BI-RADS 3 use for reasons that are described later in this article.
Cancer yield of BI-RADS 3
A mix of 7 retrospective and prospective articles have included data on the MR imaging BI-RADS 3 assessment category, with a resulting wide range of cancer yields (0%–10%). Four of these were designed, albeit retrospectively, to specifically investigate the MR imaging BI-RADS 3 assessment (see Table 1 ). The other 3 articles include data regarding the cancer yield of MR imaging BI-RADS 3 in prospective high-risk screening MR imaging trials (see Table 2 ).
Cancer Yield in Retrospective Studies
The cancer yield of MR imaging lesions assessed as probably benign ranges from 0.8% to 10%, and averages 3.7% in retrospective studies that specifically address the follow-up of BI-RADS 3 findings (see Table 1 ). These results paint a mixed picture of the utility of the MR imaging BI-RADS 3 assessment. The lowest cancer yield (0.8%, 2/236) was published by Eby and colleagues. This study included patients who underwent MR imaging for any reason, although the majority of examinations were performed to screen women at high risk or evaluate the extent of disease following a new diagnosis of cancer. The results suggest that the use of a BI-RADS 3 assessment is a safe alternative to biopsy. However, the very low rate of malignancy suggests that the probably benign assessment may have been used for some lesions more appropriately assessed as benign. Such a strategy, while cautious, may result in overuse of expensive medical resources and unnecessary anxiety for patients.
Liberman and colleagues, on the other hand, documented the highest cancer yield of 10% (9/89) among studies of probably benign MR imaging findings. All included lesions were initially assessed as probably benign in a population of high-risk and asymptomatic patients who underwent MR imaging for screening. The risk of malignancy in the cohort of lesions that were assessed as BI-RADS 3 was higher than acceptable for probably benign mammographic findings (<2%). The study identified lesion types that would be more appropriately placed into the BI-RADS 4 category. For example, although the sample size was small, a cancer yield of 25% (1/4) was identified in lesions described as ductal nonmass-like enhancement (NMLE) and initially assessed as probably benign. The investigators subsequently recommended that ductal enhancing lesions be biopsied rather than followed.
Sadowski and Kelcz reported a cancer yield of 5.1% (4/79) among probably benign lesions in a population of patients who underwent MR imaging to further evaluate a mammographic finding that was initially assessed BI-RADS 0 and “unresolved” by diagnostic mammography and/or ultrasound.
It is important to acknowledge that each of these retrospective studies had different inclusion criteria for patient populations: screening MR imaging only (Liberman and colleagues), screening and diagnostic MR imaging (Eby and colleagues), and MR imaging after BI-RADS 0 mammographic and sonographic workups (Sadowski and Kelcz). Because of this variability, it is not surprising that the results have also varied from an acceptable probably benign cancer yield of less than 2% to as high as 10%.
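The comparison against the mammographic benchmark can be made explicit. The following is a minimal sketch in Python, assuming the less than 2% risk accepted for probably benign mammographic findings is applied unchanged to MR imaging, which the available data do not establish; the names `BENCHMARK_PCT`, `RETROSPECTIVE`, and `exceeds_benchmark` are illustrative.

```python
# Upper limit of malignancy risk accepted for probably benign mammographic
# findings (less than 2%). Applying it to MR imaging is an assumption.
BENCHMARK_PCT = 2.0

# (malignancies, probably benign patients) from Table 1
RETROSPECTIVE = {
    "Liberman et al, 2003": (9, 89),
    "Sadowski & Kelcz, 2005": (4, 79),
    "Eby et al, 2009": (2, 236),
}


def exceeds_benchmark(malignancies, patients, benchmark_pct=BENCHMARK_PCT):
    """True when the observed cancer yield is not below the benchmark."""
    return 100 * malignancies / patients >= benchmark_pct


flags = {study: exceeds_benchmark(*counts) for study, counts in RETROSPECTIVE.items()}
```

This check compares point estimates only. With cohorts of this size the uncertainty around each yield is wide, so a flag should be read as a prompt for closer review rather than as a conclusion.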
Cancer Yield in Prospective High-Risk Screening Trials
Four prospective high-risk screening trials have published data on the cancer yield in patients who received a BI-RADS 3 as the highest order assessment on screening breast MR imaging (see Table 2 ). The yields range from 1.1% to 2.4% in large multicenter studies. However, evaluation of the probably benign assessment was not the primary purpose of these investigations. As stated earlier, the methods in some cases underreported the incidence of BI-RADS 3 lesions and, therefore, may have underreported the cancer yield as well. However, because these are prospective trials in populations of screening patients, the low cancer yield suggests that the use of the BI-RADS 3 category may be appropriate in certain clinical scenarios.
Cancer Yield and Practice Patterns
As stated, the available data arise from heterogeneous studies with undefined criteria for probably benign imaging characteristics. It is, therefore, important to place the subsequent cancer yield of such lesions into the context of the practice patterns of the interpreting radiologists. Data that show a high use of BI-RADS 3 and a low cancer yield may reflect an over-cautious pattern of recommending follow-up for many benign lesions. In essence, this results in a BI-RADS 3 category that is composed almost entirely of benign findings with very few cancers. Alternatively, a high cancer yield can occur when some suspicious lesions are followed rather than biopsied, which shifts lesions that may deserve BI-RADS 4 assessments into the BI-RADS 3 category.
The primary mechanism for assessing individual or group practice patterns is an audit. Methods for performing an audit are described in the BI-RADS atlas. An audit can help determine, for example, if biopsies are being recommended appropriately. Although the performance benchmarks from an audit are fairly robust for mammography, additional information will be needed to establish similar targets for MR imaging. Regular audits will be a requirement for accreditation of breast MR imaging programs by the American College of Radiology (ACR).
Risks and Benefits of Short-Term Follow-Up
The mammographic BI-RADS 3 category was implemented when repeating a relatively fast and inexpensive mammogram could obviate the need for a more costly trip to the operating room for surgical excisional biopsy. Considering the need for preoperative and postoperative appointments, risk of general anesthesia, possible undesirable cosmetic outcomes, and morbidity of the procedure, the potential benefit for 98% of patients with benign lesions was large.
The development of percutaneous biopsy methods decreased the gap between follow-up imaging and tissue sampling. These techniques allow radiologists to acquire tissue samples without a visit to a surgeon or an intraoperative procedure. Risks of undesirable cosmetic outcomes are reduced along with morbidity. Compared with an operation, percutaneous tissue sampling is faster and less expensive. In 1997, Brenner and Sickles estimated the cost savings of periodic mammographic follow-up versus percutaneous biopsy to be $1040. A similar study of the cost savings of follow-up breast MR imaging has not been published.
At a minimum, patients who participate in an annual high-risk screening MR imaging program and receive a BI-RADS 3 assessment may undergo a single extra breast MR imaging examination at 6 months in lieu of MR-guided tissue sampling. These patients may thus benefit by having follow-up MR imaging that is less expensive and less invasive than a tissue sampling procedure. However, this must be balanced against the risks. The patient population undergoing MR imaging has a higher baseline level of risk for malignancy than the general population undergoing screening mammography. The high-risk group may also be younger, have higher levels of anxiety, and have more aggressive tumors. Tumor aggressiveness is particularly critical to the safety of short-term follow-up because the stated goal is to allow imaging surveillance of a lesion without a change in prognosis. Additional research is needed to determine whether an acceptable balance can be achieved.