Disease and Treatment Monitoring

Preoperative, or neoadjuvant, chemotherapy (NAC), in which systemic therapy is administered before surgery, is used to downstage primary breast cancers while reducing the risk of recurrence. NAC often results in complete eradication of tumor at the time of surgery (pathological complete response [pCR]), and it is well established that pCR confers excellent survival outcomes. The US Food and Drug Administration (FDA) now accepts pCR as an endpoint in clinical trials to support accelerated drug approval for high-risk early-stage breast cancer.

Among breast imaging methods, dynamic contrast-enhanced magnetic resonance imaging (DCE MRI) is particularly effective for visualizing the effects of neoadjuvant treatment on breast tumors. DCE MRI has been found to be more effective than clinical examination and other routine imaging modalities (mammography and ultrasound) for detecting residual disease and defining its extent. In addition to its high sensitivity, DCE MRI noninvasively provides information about breast tumor biology that can be used to predict response.

Although DCE is the standard MRI method for evaluating breast cancer, diffusion-weighted imaging (DWI) has been shown to provide additional and complementary information about tissue cellularity and microstructure of the tumor and surrounding tissue environment, which can also be used to characterize breast tumors and to monitor their response to treatment. In fact, several studies have suggested that changes in quantitative diffusion measurements from DWI can be detected earlier in the course of NAC than changes in tumor size or vascularity measured by DCE MRI. This is because effective drugs induce apoptosis and/or necrosis in the tumor, which alters the cell density and reduces barriers to water diffusion. Even before the cell death associated with NAC, cell swelling or damage to membrane integrity can occur, which may also affect water diffusion in the tumor. For this reason, many studies have evaluated DWI for prediction of breast cancer treatment response.

Diffusion-Weighted Magnetic Resonance Imaging Approaches for Treatment Monitoring

Three primary requirements common to almost all studies investigating quantitative DWI for treatment outcome prediction are DWI acquisition; image analysis, in particular the selection of regions of interest (ROIs) for quantitation; and statistical analysis for prediction of a clinically useful patient outcome. In the setting of neoadjuvant treatment of invasive breast cancer, the outcome of interest is generally the pathological response at surgery or disease-free survival at 3 to 5 years following treatment, both of which are short-term endpoints commonly used as surrogates of overall survival. Fig. 5.1 illustrates a typical multiregimen NAC treatment timeline, showing sequential MRI examinations at baseline (pretreatment), early in the first drug regimen, between the first and second regimens, and posttreatment before surgery, similar to that used in the I-SPY 2 TRIAL. Timing of the early treatment MRI varies among published studies and is generally between 1 and 6 weeks of treatment. Fig. 5.2 illustrates serial DWI acquisitions for a patient undergoing NAC. In this example, multiparametric MRI examinations including DCE MRI and DWI acquisitions were conducted at fixed time points before, during (early and midtreatment), and after the full course of NAC treatment (see Fig. 5.1). The patient showed a positive but incomplete response to treatment on both MRI modalities. DCE MRI showed a large reduction in tumor volume but residual enhancing tumor following treatment. On DWI, tumor apparent diffusion coefficient (ADC) increased steadily with treatment but remained lower than that of normal fibroglandular tissue.

DWI Acquisition and ADC Mapping

For DWI acquisition, the most common sequence is a 2D, fat-suppressed single-shot echo-planar imaging (SS-EPI) using a single low b value (generally 0) and a single high b value applied in three gradient directions (isotropic acquisition). This sequence has the advantages of simplicity and speed, with reasonably good signal-to-noise ratio (SNR). There is currently no consensus on the optimum high b value for evaluating therapeutic effects, with a range of at least 600 to 1500 s/mm² reported in the majority of treatment response studies. The primary disadvantages of the two-b-value SS-EPI DWI approach are the limitation to monoexponential modeling (yielding a single ADC metric) and image quality issues common to EPI-based acquisitions, including distortions, ghosting, and incomplete fat suppression. These issues are of particular concern in treatment monitoring as they can be inconsistent over the course of multiple longitudinal examinations, contributing errors to the measurement of changes in diffusion parameters.

Image Analysis

For image analysis, a particular challenge of DWI treatment response studies is appropriate definition of ROIs. Tumor localization and delineation on breast DWI can be difficult or even impossible in this setting. As opposed to DCE MRI, which is generally obtained with relatively high spatial resolution and high SNR, DWI scans are typically of lower resolution and poorer SNR. Furthermore, spatial resolution differences and geometrical distortions that are common in DWI also make it difficult to transfer tumor ROIs directly from DCE MRI, where they can be more accurately defined.

Currently, most breast DWI studies employ manually drawn ROIs, defined either on a picture archiving and communication system (PACS) workstation (where ROI geometries may be limited) or on a dedicated research workstation. It is common practice to avoid necrotic and cystic areas and clip artifacts so that only viable tissues are included. Fig. 5.3 illustrates some challenges of ROI definition by contrasting a single mass lesion with a diffusely enhancing tumor being evaluated for ADC before treatment. The guideline applied in these examples was to identify the lesion on a DCE MRI image and then manually delineate the ROI at the corresponding locations on the diffusion scan, referencing the high b value DWI (e.g., b = 800 s/mm²), the ADC map, or a combination thereof, selecting regions hyperintense on DWI and hypointense on ADC. The challenge of getting a true whole-tumor segmentation in diffuse or multifocal disease is clearly illustrated by the number of individual contours seen in the second example in Fig. 5.3. In a consensus publication, Padhani and colleagues recommended that ROIs be drawn to completely delineate lesions on the images that have the highest contrast between lesion and normal tissue. They advised against defining smaller ROIs within lesions, which is considered more subjective and is not recommended for assessing treatment response.

It is common to define “pseudo-3D” regions by selecting tumor on multiple planes to reduce sampling errors inherent with single-plane definitions. However, this can greatly increase the skilled-operator time required for analysis, and there are some indications that there may be no benefit over single-slice ROIs. Studies with multiple readers, unless explicitly investigating interreader reproducibility, generally use consensus ROIs from multiple radiologists or trained researchers. For longitudinal trials, a single operator or consensus team most commonly evaluates all DWI studies for an individual subject to minimize errors from interoperator variability. We also note that in longitudinal studies, defining reproducible tumor ROIs tends to become more difficult over the course of NAC, as the contrast between tumor and normal fibroglandular tissue decreases, particularly in good responders. This may be less of a concern in studies investigating the early-treatment prediction of patient outcome; however, even as early as 3 weeks into NAC, there are examples of excellent responders where the tumor is no longer discernible on MRI (Fig. 5.4).

Statistical Analysis for Response Prediction

Selection of outcome variable(s) and methods for evaluating the predictive capability of the predictor variable(s) can vary between treatment response trials. As mentioned earlier, the most common outcome variable across studies is pCR, defined as the absence of residual invasive cancer in the breast and all sampled axillary lymph nodes. The FDA also allows a stricter definition requiring the absence of both residual invasive cancer and in situ cancer in the breast and all sampled axillary lymph nodes. For binary endpoints like pCR, predictive ability is generally evaluated using receiver operating characteristic (ROC) analysis, with area under the ROC curve (AUC) as the primary metric. In some studies, response is defined by clinical or imaging measurements by applying the Response Evaluation Criteria in Solid Tumors (RECIST) definition of less than 30% reduction to identify nonresponders.
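As a minimal sketch of the AUC metric described above, the following computes the area under the ROC curve through its rank-based (Mann-Whitney) interpretation: the probability that a randomly chosen pCR case has a higher predictor value than a randomly chosen non-pCR case, with ties counted as one half. The function name `roc_auc` and the patient values are hypothetical and serve only for illustration; they are not taken from any study cited here.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    scores: predictor values (e.g., percent change in tumor ADC).
    labels: 1 for the positive outcome (e.g., pCR), 0 otherwise.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case in each outcome group")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive case ranked above negative case
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


# Hypothetical data: percent ADC change and pCR status for six patients.
delta_adc_percent = [35.0, 28.0, 12.0, 20.0, 5.0, 15.0]
pcr = [1, 1, 0, 0, 0, 1]
auc = roc_auc(delta_adc_percent, pcr)
print(f"AUC = {auc:.3f}")
```

An AUC of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation of the two outcome groups. The pairwise loop is quadratic in cohort size, which is acceptable for cohorts of the sizes discussed in this chapter.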

Tumor Mean ADC for Predicting Response

The most common diffusion measurement for NAC monitoring and outcome prediction is the mean ADC value within the tumor. ADC is typically calculated pixel-wise by fitting a monoexponential function to all b values acquired in the DWI sequence (or directly if there are only two b values) and then taking the mean over the region representing some or all of the cancerous lesion. Alternatively, mean ADC can be calculated from the DWI signal intensities averaged over the region at each b value (e.g., b = 0, 800 s/mm²). Iima and colleagues demonstrated that the second approach is less affected by noise than the first approach and thus is more accurate. As mentioned earlier, the region is generally delineated by user-defined ROIs. Numerous studies have investigated the value of tumor mean ADC measures before, during, and after NAC, as well as ADC changes from pretreatment values to later time points during NAC, for the prediction of response to treatment.
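The two ways of obtaining a mean tumor ADC can be sketched as follows for the two-b-value case, where the monoexponential model S(b) = S(0)·exp(−b·ADC) gives ADC = ln(S_low / S_high) / (b_high − b_low) directly. The function names and ROI signal values below are hypothetical, and the sketch assumes strictly positive signals with no noise-floor correction.

```python
import math


def adc_two_point(s_low, s_high, b_low, b_high):
    """Monoexponential ADC (mm^2/s) from signals at two b values (s/mm^2)."""
    if s_low <= 0 or s_high <= 0:
        raise ValueError("signals must be positive")
    return math.log(s_low / s_high) / (b_high - b_low)


def mean_of_pixel_adc(roi_low, roi_high, b_low, b_high):
    """Approach 1: compute ADC for each pixel, then average over the ROI."""
    adcs = [adc_two_point(lo, hi, b_low, b_high)
            for lo, hi in zip(roi_low, roi_high)]
    return sum(adcs) / len(adcs)


def adc_of_mean_signal(roi_low, roi_high, b_low, b_high):
    """Approach 2: average the ROI signal at each b value, then compute one ADC."""
    mean_low = sum(roi_low) / len(roi_low)
    mean_high = sum(roi_high) / len(roi_high)
    return adc_two_point(mean_low, mean_high, b_low, b_high)


# Hypothetical ROI signal intensities at b = 0 and b = 800 s/mm^2.
roi_b0 = [1000.0, 1100.0, 950.0, 1050.0]
roi_b800 = [400.0, 470.0, 360.0, 430.0]

adc_pixelwise = mean_of_pixel_adc(roi_b0, roi_b800, 0.0, 800.0)
adc_roi_signal = adc_of_mean_signal(roi_b0, roi_b800, 0.0, 800.0)
print(f"mean of pixel ADCs : {adc_pixelwise * 1e3:.3f} x 10^-3 mm^2/s")
print(f"ADC of mean signals: {adc_roi_signal * 1e3:.3f} x 10^-3 mm^2/s")
```

The two estimates coincide for a perfectly uniform ROI and diverge as pixel-level heterogeneity and noise increase, because the logarithm is applied before averaging in the first approach and after averaging in the second.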

Pretreatment ADC: Correlation With Treatment Outcome

Although the major focus of monitoring treatment response is on assessing the disease state during the course of NAC, the pretreatment determination of tumor ADC may be helpful for categorizing tumors, especially when combined with tumor subtype information, and may help in assignment of treatment. Several studies have found lower pretreatment tumor ADC to be associated with positive treatment response. A retrospective study by Park and colleagues looked at 53 consecutive women with invasive breast cancer who had MRI examinations with DCE and DWI (b = 0, 750 s/mm², 1.5 T) before and after NAC. Pretreatment ADC was significantly lower in clinical responders (based on RECIST criteria) than in nonresponders. Richard and colleagues performed a similar study in a larger cohort of 118 women who had DWI (b = 50, 700 s/mm², 1.5 T) performed less than 15 days before chemotherapy. In this study, pretreatment ADC was also found to be significantly lower in complete responders but only in an n = 37 subgroup with triple negative tumors (ADC = 1.060 ± 0.143 × 10⁻³ mm²/s vs. 1.227 ± 0.271 × 10⁻³ mm²/s, P = .047). In this analysis, pathological response was classified according to the Chevallier and Sataloff classifications. However, in a 2018 report on 142 women, Yuan and colleagues found small but significant differences in baseline ADC value between pCR and non-pCR groups within all genomic subtypes. This result was part of a broader investigation of optimum time points for DWI evaluation, described in more detail later. A possible explanation for an inverse relationship between pretreatment ADC and response could be that high ADCs may be due to local necrosis or fibrosis, which indicate more poorly perfused tumors and relative difficulty in delivering chemotherapy agents into the tumor. Other studies failed to find a statistically significant difference in pretreatment tumor ADC values between responders and nonresponders.

Change in ADC with Neoadjuvant Treatment

Tumor ADC tends to increase with treatment, approaching the higher values typical of healthy fibroglandular tissue. Several animal studies demonstrated that DWI-detected cellular changes are associated with treatment-induced cell death. For example, Cheng and colleagues observed that ADC values increased on day 3 after chemotherapy, and further showed that these changes preceded measurable tumor volume changes in a gastric cancer mouse model. In a preclinical study of a breast cancer mouse model treated with anti-DR5 antibody, Kim and colleagues found that ADC values increased on day 3, with the magnitude of the increase dependent on dose level, and that ADC increases were inversely proportional to the density of cells showing Ki67 expression.

A number of clinical studies have also reported that an increase in tumor ADC can be detected early in the course of the treatment (e.g., after one cycle of NAC), in some instances before the tumor shows any significant decrease in size. In a prospective study of 62 patients, both DCE MRI and DWI (b = 0, 750 s/mm², 1.5 T) were acquired at pretreatment, after one cycle (3–4 weeks, anthracycline- and cyclophosphamide-based NAC), and posttreatment. The study found that the percent increase in mean ADC after one cycle of NAC was significantly higher in the pCR group (P < .001). The longest diameter was also measured from DCE MRI, but no significant differences were found between pCR and non-pCR groups after one cycle of NAC. Similar results were also seen in studies with smaller cohorts, using different methodologies to measure tumor size before and after one cycle of NAC.
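The percent-change measures compared in such studies reduce to simple arithmetic, sketched below. The function name `percent_change` and the measurements are hypothetical; they merely illustrate a case in which ADC rises noticeably after one cycle while the longest diameter is nearly unchanged.

```python
def percent_change(baseline, followup):
    """Percent change of a measurement relative to its pretreatment value."""
    if baseline == 0:
        raise ValueError("baseline must be nonzero")
    return 100.0 * (followup - baseline) / baseline


# Hypothetical single-patient measurements, pretreatment vs. after one cycle.
adc_pre, adc_cycle1 = 1.00e-3, 1.25e-3    # mean tumor ADC, mm^2/s
diam_pre, diam_cycle1 = 40.0, 38.0        # longest diameter on DCE MRI, mm

adc_change = percent_change(adc_pre, adc_cycle1)
diam_change = percent_change(diam_pre, diam_cycle1)
print(f"ADC change: {adc_change:+.1f}%  diameter change: {diam_change:+.1f}%")
```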

MRI Time Point Selection for Response

Selecting the most effective timing and frequency for DWI assessment in NAC treatment outcome prediction is a challenge. Many factors weigh into the optimum time point or points, including treatment regimens, tumor subtypes, drugs administered, and possibly technical aspects such as the typical SNR and b value selection, which may affect the magnitude of ADC changes that can be detected. Yuan and colleagues systematically studied ADC and change in ADC from pretreatment (ΔADC) in the prediction of pCR in 142 patients, using DWI (b = 0, 300, 600, and 1000 s/mm², 3.0 T) at pretreatment and after the first, second, third, fourth, sixth, and eighth cycles of NAC. Each cycle of therapy in this study was 3 weeks long, in contrast to weekly cycles commonly used in other treatment regimens. The study cohort included patients in three different treatment regimens and included analysis by genomic subtype. The study found first that ΔADC was generally better for pCR prediction than absolute ADC measures. This was generally true across subtypes; for example, for luminal B subtype in one treatment regimen, the best AUC for ΔADC was 0.865 versus only 0.602 for the best single time point ADC. Similar results were seen in other subtypes and treatment arms, though not all showed as strong predictive performance. The optimal timing window during NAC for ADC measurement for prediction of pCR was found to vary by subtype and treatment, but in all cases was either after the first or second 3-week cycle of NAC. By contrast, in the prospective ACRIN-6698 study (discussed in the next section), Partridge and colleagues did not find significant predictive power of ADC for pCR after 3 weekly cycles of NAC but did find significance at 12-week and later time points.

Multicenter Clinical Trials Using DWI for Outcome Prediction

Given the heterogeneity of breast DWI protocols across institutions and the differences in DWI implementations across scanner platforms, well-controlled multicenter, multivendor trials are essential for establishing the efficacy of DWI-based biomarkers for treatment response. In a multicenter study of 39 patients enrolling in three different prospective clinical trials at separate sites, Galban and colleagues tested ADC metrics for assessing response to NAC. All patients had invasive breast cancer and were treated with NAC. MRI examinations with DWI (b = 0, 800 s/mm², 1.5 T and 3 T) were performed at pretreatment and 3 to 7 days, 8 to 11 days, and 35 days after treatment was initiated, respectively, for the three trials. Treatment outcome was assessed either by palpation after the first cycle of NAC (one site) or by RECIST 1.1 evaluated on DCE MRI (two sites). Assessment of treatment response on DWI scans was done using ADC both with the standard ROI-based mean analysis and the parametric response map (PRM) technique. The PRM technique involves intervisit image registration, using a combination of rigid registration and manually assisted nonrigid registration, allowing calculation of voxel-wise maps for change in diffusion parameters. Applied to the ADC maps, volumetric parameters PRM_ADC−, PRM_ADC0, and PRM_ADC+ were calculated as the fractional tumor volumes with decreasing, stable, and increasing ADC values, respectively. Due to the small numbers in each trial and different methods applied for determining response, two patient groups were defined: (1) responders and stable disease (R/SD); and (2) progressive disease (PD). Change of mean ADC was found to be significantly predictive of outcome (P = .012) at the 35-day time point, although the sample size was small (n = 14). PRM_ADC+ was also found to be significantly higher in the R/SD group than the PD group both at the 8- to 11-day time point (P = .006) and the 35-day time point (P = .004).
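The volumetric PRM parameters can be illustrated with a minimal sketch, assuming the two visits have already been registered so that each tumor voxel has a paired baseline and follow-up ADC value. The function name `prm_fractions`, the change threshold, and the voxel values are all hypothetical; in particular, the threshold separating "stable" from "changed" voxels is an assumption chosen for illustration and is not the value used in the study described above.

```python
def prm_fractions(adc_baseline, adc_followup, threshold):
    """Fractional tumor volumes with increased, stable, and decreased ADC.

    adc_baseline, adc_followup: paired voxel ADC values from registered
    images, in the same units as `threshold`.
    """
    if len(adc_baseline) != len(adc_followup) or not adc_baseline:
        raise ValueError("inputs must be non-empty and of equal length")
    n_up = n_down = 0
    for pre, post in zip(adc_baseline, adc_followup):
        change = post - pre
        if change > threshold:
            n_up += 1          # voxel ADC increased beyond the threshold
        elif change < -threshold:
            n_down += 1        # voxel ADC decreased beyond the threshold
    n = len(adc_baseline)
    n_stable = n - n_up - n_down
    return {"PRM_ADC+": n_up / n, "PRM_ADC0": n_stable / n, "PRM_ADC-": n_down / n}


# Hypothetical voxel ADC values in units of 10^-3 mm^2/s.
baseline = [1.0, 1.1, 0.9, 1.2, 1.0]
followup = [1.5, 1.15, 0.5, 1.6, 1.05]
fractions = prm_fractions(baseline, followup, threshold=0.25)  # assumed threshold
print(fractions)
```

Because the classification is made voxel by voxel, a tumor whose mean ADC is unchanged can still show substantial PRM_ADC+ and PRM_ADC- fractions if regional increases and decreases cancel in the mean.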

A larger multicenter study was the American College of Radiology Imaging Network (ACRIN) trial 6698, a multicenter prospective study to evaluate DWI for prediction of pathological response (ClinicalTrials.gov Identifier: NCT01564368), performed as a substudy of I-SPY 2 (Investigation of Serial Studies to Predict Your Therapeutic Response through Imaging and Molecular Analysis 2, ClinicalTrials.gov Identifier: NCT01042379). The primary objective of ACRIN 6698 was to test whole-tumor ADC for prediction of pCR in women undergoing NAC for breast cancer. A secondary aim was to evaluate repeatability of ADC measurements. ACRIN 6698 used a standardized DWI acquisition (b = 0, 100, 600, 800 s/mm², 1.5 T or 3.0 T) optimized for breast scanning, along with a QA/QC program for DWI-specific site qualification and image quality review. DWI was performed within the same examinations as DCE MRI at 4 time points during NAC: pretreatment, early treatment (typically after 3 weekly cycles), interregimen/midtreatment, and posttreatment/presurgery (see Fig. 5.1 for the study schema). Ten I-SPY 2 sites that completed DWI qualification enrolled patients to ACRIN 6698 between August 2012 and January 2015. The study used manually delimited whole-tumor ROIs.

The trial reported primary results in 242 patients, in which the change of tumor ADC was found to be predictive of pCR at midtreatment (between regimens) and before surgery. The change of ADC at the early treatment time point did not show a statistically significant difference between pCR and non-pCR groups. Repeatability of tumor ADC measures was assessed in a subset of 71 subjects who underwent “coffee-break” style repeat DWI scans, with inter- and intrareader reproducibility also assessed in a 20-subject subgroup. Overall, the results showed that excellent repeatability and reproducibility of breast tumor ADC measures could be achieved in a multiinstitution setting using a standardized protocol and QA procedure (e.g., repeatability within-subject coefficient of variation [wCV] = 4.8%). This study is discussed further in Chapter 14. In regard to treatment response monitoring, we note that these repeatability and reproducibility results only pertain directly to ADC measures taken pretreatment or within the first few cycles of NAC treatment. Reproducibility may be poorer at later time points due to increased challenges in defining tumor ROIs after significant treatment response.
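To make the wCV metric concrete, the sketch below uses one common estimator for test-retest pairs: each subject contributes a within-subject variance of (x1 − x2)²/2, which is normalized by the square of that subject's mean, averaged across subjects, and square-rooted. This is an assumption about the estimator for illustration only and is not necessarily the exact procedure used in ACRIN 6698; the function name and ADC values are hypothetical.

```python
import math


def within_subject_cv(pairs):
    """wCV from test-retest pairs using a root-mean-square estimator.

    pairs: list of (scan1, scan2) measurements, one tuple per subject.
    """
    if not pairs:
        raise ValueError("need at least one test-retest pair")
    total = 0.0
    for x1, x2 in pairs:
        subject_mean = (x1 + x2) / 2.0
        if subject_mean == 0:
            raise ValueError("subject mean must be nonzero")
        within_var = (x1 - x2) ** 2 / 2.0      # variance of two repeat scans
        total += within_var / subject_mean ** 2
    return math.sqrt(total / len(pairs))


# Hypothetical repeat tumor ADC measurements in units of 10^-3 mm^2/s.
repeat_adc = [(1.02, 0.98), (1.30, 1.36), (0.85, 0.88), (1.10, 1.05)]
wcv = within_subject_cv(repeat_adc)
print(f"wCV = {100.0 * wcv:.1f}%")
```

A lower wCV means that smaller treatment-induced ADC changes can be distinguished from measurement variability.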

Alternative DWI Metrics to Mean ADC

Histogram and Texture-Based DWI Parameters for Treatment Response

Mean tumor ADC is the most commonly used metric for quantitative measurement in DWI, and it has demonstrated predictive value for patients undergoing NAC, as described earlier. However, it is an average measurement of water diffusivity for the entire tumor and does not reflect the spatial heterogeneity of the tumor microenvironment. Analysis of ADC histograms may provide additional information for characterizing changes with treatment (Fig. 5.5). Several studies found lower percentiles of tumor ADC histograms to be most predictive of treatment response in breast and other cancers. In a study of glioblastoma treated with concurrent chemotherapy and temozolomide, the fifth percentile of the cumulative histogram differentiated between true progression and pseudoprogression. The study of Kyriazi and colleagues found that percentage change of the 25th percentile of the ADC histogram was most predictive of treatment response after the first and third cycles for patients with advanced ovarian cancer. Wilmes and colleagues found strongest correlations between early treatment increase of lower (15th or 25th) ADC percentile and posttreatment tumor volume change in patients with breast cancer. However, other studies found median or higher ADC percentiles to be more predictive of pCR. For example, in a multiparametric 3 T MRI study of breast cancer with 42 patients, the highest AUC using DWI was for median ADC after 2 cycles of NAC.
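Histogram percentiles of the kind discussed above can be extracted from the voxel ADC values within a tumor ROI, as in the following sketch. It assumes linear interpolation between order statistics, which is one of several percentile conventions, and the function name and voxel values are hypothetical.

```python
def percentile(values, p):
    """p-th percentile (0-100) using linear interpolation between sorted values."""
    if not values:
        raise ValueError("values must be non-empty")
    if not 0 <= p <= 100:
        raise ValueError("p must be between 0 and 100")
    ordered = sorted(values)
    rank = (len(ordered) - 1) * p / 100.0
    lower = int(rank)                          # index at or below the rank
    upper = min(lower + 1, len(ordered) - 1)   # neighbor above, clamped
    frac = rank - lower
    return ordered[lower] + frac * (ordered[upper] - ordered[lower])


# Hypothetical tumor voxel ADC values (10^-3 mm^2/s) at two visits.
adc_pre = [0.8, 0.9, 1.0, 1.1, 1.2]
adc_early = [1.0, 1.2, 1.1, 1.4, 1.3]

for p in (15, 25, 50):
    pre = percentile(adc_pre, p)
    early = percentile(adc_early, p)
    change = 100.0 * (early - pre) / pre
    print(f"{p}th percentile: {pre:.3f} -> {early:.3f} ({change:+.1f}%)")
```

Lower percentiles weight the most restricted, and presumably most cellular, portion of the tumor, which is one rationale offered for their sensitivity to early treatment effects.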