Imaging in Musculoskeletal, Metabolic, Endocrinological, and Pediatric Clinical Trials



Fig. 11.1
Semiquantitative scoring system for vertebral deformity in osteoporosis, graphic representation (Adapted with permission from Genant et al. [8])



Bone densitometry using dual-energy X-ray absorptiometry (DXA) has become a standard in osteoporosis clinical trials. In theory it is not the best technique for measuring bone density, because it reports a two-dimensional outcome parameter, in grams per square centimeter (g/cm²), for a three-dimensional object. The regulatory agencies accept DXA data, but not as a primary efficacy outcome in osteoporosis treatment. Once efficacy has been proven by a fracture study, DXA is an acceptable technique for the assessment of prevention trials and, more recently, non-inferiority studies. Despite its theoretical limitation, DXA (or DEXA) has become the best validated technique because of its accessibility, low radiation dose, and ease of use.

The challenge of using DXA for eligibility criteria has been described in more detail elsewhere [12]. Briefly, patients in an osteoporosis study or similar are usually defined as osteoporotic if they have a T-score of −2.5 or lower. The T-score compares the patient's bone mineral density (BMD) against peak bone mass, whereas the Z-score compares it against an age-matched control. These scores are gender, race, and anatomical site specific. Furthermore, the manufacturers' normative databases are not quite interchangeable, so some allowance has to be made to ensure that the population is uniform throughout the study [13]. In practice this variation is limited because two manufacturers, GE Healthcare (Lunar) and Hologic Inc, make 90–95 % of the world's DXA instruments, so most studies are restricted to these two types. A further complication is that the calibrations of the two manufacturers' instruments differ by about 10–15 % (Chap. 1).
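The eligibility rule described above can be illustrated with a minimal Python sketch. It assumes the usual definition of both scores (the difference from a reference mean divided by the reference standard deviation). The function names and all reference values are hypothetical and are not taken from any manufacturer's normative database.

```python
def t_score(bmd, young_adult_mean, young_adult_sd):
    """Standard deviations from the young-adult (peak bone mass) reference."""
    return (bmd - young_adult_mean) / young_adult_sd


def z_score(bmd, age_matched_mean, age_matched_sd):
    """Standard deviations from the age-matched reference."""
    return (bmd - age_matched_mean) / age_matched_sd


def meets_osteoporosis_criterion(t, threshold=-2.5):
    """True when the T-score is at or below the eligibility threshold."""
    return t <= threshold


# Hypothetical subject and reference values (g/cm^2), for illustration only.
subject_bmd = 0.700
t = t_score(subject_bmd, young_adult_mean=1.000, young_adult_sd=0.110)
z = z_score(subject_bmd, age_matched_mean=0.850, age_matched_sd=0.120)
print(f"T-score {t:.2f}, Z-score {z:.2f}, "
      f"eligible: {meets_osteoporosis_criterion(t)}")
```

Because the reference mean and standard deviation come from the normative database, the same measured BMD can in principle fall on either side of the threshold depending on which database is used, which is why the allowance mentioned above is needed.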

The second and ongoing challenge with using DXA in clinical trials is that it is a Type 1 Instrument (see Chap. 2). Instrument performance and calibration therefore need to be monitored. If there is a calibration shift or a change of DXA instrument, there has to be a documented process to evaluate the effect of the shift on the subject data, and a second process to recalculate the subjects' BMD changes to compensate for it. The end point should be each subject's BMD expressed as the percentage change from baseline, with the results then aggregated; this essentially removes inter-instrument variability. Therefore, at the start of the study, each site should measure a phantom that covers a range of densities, such as the Bona Fide Phantom (BFP) (BioClinica Inc, Newtown, PA, USA), ten times without repositioning. If later in the study a site changes its instrument, has an instrument breakdown, or has a change in the underlying calibration, the same BFP should be measured again and the change in calibration evaluated using a regression analysis. If the measurement or calibration changes by more than twice the error of the BFP measurement (nominally 1 %), the regression can be applied to the subject BMD data acquired on that scanner after the calibration change, and each subject's percentage change recalculated.
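The monitoring and correction steps described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the phantom readings, the subject values, and the 1 % phantom error are hypothetical, the helper names are invented for the example, and a simple least-squares line stands in for whatever regression model a given study protocol specifies.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def calibration_shift_pct(before, after):
    """Mean percentage difference of the phantom readings, after vs. before."""
    return mean(100.0 * (a - b) / b for b, a in zip(before, after))


def percent_change(baseline, follow_up):
    """Percentage change from baseline."""
    return 100.0 * (follow_up - baseline) / baseline


# Hypothetical phantom readings (g/cm^2): one value per density insert,
# each taken as the mean of ten scans without repositioning.
phantom_before = [0.700, 1.000, 1.300, 1.500]
phantom_after = [0.721, 1.030, 1.339, 1.545]

PHANTOM_ERROR_PCT = 1.0  # assumed measurement error of the phantom
shift = calibration_shift_pct(phantom_before, phantom_after)
needs_correction = abs(shift) > 2 * PHANTOM_ERROR_PCT

# Regress the old readings on the new ones so that the fitted line maps
# post-change values back onto the baseline calibration.
slope, intercept = fit_line(phantom_after, phantom_before)

subject_baseline = 0.800    # acquired before the calibration change
subject_follow_up = 0.8446  # acquired after the calibration change
if needs_correction:
    corrected = slope * subject_follow_up + intercept
else:
    corrected = subject_follow_up

print(f"calibration shift: {shift:.1f} %")
print(f"uncorrected change: "
      f"{percent_change(subject_baseline, subject_follow_up):.1f} %")
print(f"corrected change: {percent_change(subject_baseline, corrected):.1f} %")
```

In this made-up example the scanner reads about 3 % high after the change, so the uncorrected follow-up would overstate the subject's gain; mapping the follow-up value back onto the baseline calibration before computing the percentage change removes that artifact.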

Quantitative computed tomography (QCT) provides a three-dimensional measure (in grams per cubic centimeter), and thus true bone density, and is more sensitive to change because it measures specifically the trabecular compartment (with high bone turnover) of the vertebral bodies in the spine. These are the standard QCT measurements used to report BMD in the lumbar spine, but the technique can also be applied to other skeletal sites. Peripheral QCT (pQCT) measurements are performed on specially designed small-bore CT scanners. Like QCT in the spine, pQCT can provide separate measurements of the cortical and trabecular structure in peripheral regions such as the forearm, femur, and tibia. High-resolution QCT (HRQCT) is a further development of QCT that allows the analysis of trabecular structure with high-resolution thin slices. HRQCT is commonly used in research settings for microstructure analysis of bone specimens but can be extended to clinical settings.

QCT can be used to measure cortical and/or trabecular bone mineral density, as well as volumetric and cross-sectional areal bone geometry, allowing additional assessments of bone quality and characteristics in osteoporosis. Cortical bone has generally been evaluated in the femur; in the spine, because of its thickness, it has not been possible to assess this bone compartment accurately or precisely. Most femur assessments have evaluated the whole cortical shell [14, 15]. More advanced analysis techniques include finite element analysis of the spine [16] and a technique developed by Mindways Software Inc (Austin, TX, USA) [17, 18] that identifies and evaluates the four quadrants of the femoral neck cortical shell for both vBMD and thickness. Quadrant QCT analysis offers a noninvasive way to elucidate the anatomic distribution of bone, which may be critical in determining resistance to fracture; for example, the superior cortex of the femoral neck is a stronger predictor of fracture than the inferior cortex [18]. The ability to segment trabecular and cortical bone in QCT scans is particularly important for evaluating the effect of new therapeutic agents on each bone compartment. This was recently shown in a study of rosiglitazone, in which a negative therapeutic response was observed within 52 weeks [19]. If such a response is observable with a compound of relatively small therapeutic impact, as the authors state, it is highly likely that this end point may be of value in the treatment of osteoporosis.



Rheumatoid Arthritis


Rheumatoid arthritis is a progressive disease characterized by synovial joint inflammation, eventually leading to destruction of cartilage and the underlying bone. For decades it was very difficult to treat. The drugs used were nonspecific, such as corticosteroids (against inflammation in general) and methotrexate (against tissue proliferation in general). Nowadays, disease-modifying antirheumatic drugs (DMARDs) such as anti-tumor necrosis factor-alpha (anti-TNFα) agents, which are able to halt disease progression, are in use and in development [20–27]. Furthermore, at the time of writing there are a number of new DMARDs in development or in review with the regulatory agencies, such as the so-called JAK inhibitors [28], of which the first has just been approved by the FDA, and a number of compounds targeting interleukins (IL) such as IL-6 and IL-17. In imaging terms, rheumatoid arthritis is characterized by bone destruction and cartilage loss, assessed as bone erosions and joint space narrowing, respectively. As the disease progresses, the joints become deformed and are ultimately destroyed. Conventional radiography of the hands and feet is used to “grade” the disease. Very elaborate semiquantitative grading schemes that encompass both joint space narrowing and bone erosions have been developed over the years [29]. Their historical timeline is shown in Table 11.1. The Sharp score is arguably the best documented, and its variation described by van der Heijde is the one most widely used in clinical drug trials to assess efficacy. It is now the scoring system of choice in the EMA guidelines for assessing DMARDs in clinical trials. As such, these visual scoring systems are regarded as fully validated. Standardized imaging protocols for obtaining radiographs of the hands and feet are described fully elsewhere [38].


Table 11.1
The history of semiquantitative scoring systems in rheumatoid arthritis

Scoring system                          Date of publication and reference
Steinbrocker Index                      (1949) [30]
Kellgren’s Method                       (1957) [31]
Sharp Scoring Method                    (1971) [32]
Larsen Scoring                          (1977) [33]
Genant Scoring Method                   (1983) [34]
Modified Sharp                          (1985) [35]
The Sharp/van der Heijde Scoring Method (1989) [36]
Modified Genant Scoring Method          (1998) [37]

A new challenge is emerging in these trials: patients are being treated at a much earlier stage of the disease, when there are no or only minimal features of the disease visible on radiographs. Since a DMARD indication requires radiological demonstration of a decrease in disease progression, many studies now include eligibility criteria that have to be centrally evaluated to show clear radiological evidence of disease. Furthermore, standard of care is being used as the comparator, so the trials require many more subjects to show that the new molecule has clinical and radiological benefit.

Magnetic resonance imaging (MRI) has been proposed as a new imaging biomarker for the assessment of rheumatoid arthritis. While at the time of writing it is still not accepted by the regulatory agencies as the primary end point for Phase III studies, it is being used very successfully in Phase II studies for “go/no-go” decisions on continuing drug development and in dose-ranging studies [39]. It provides a visual interpretation of synovial inflammation, and quantification of contrast uptake in the inflamed tissue has also been investigated. As with radiographs there is a semiquantitative scoring system, the so-called RAMRIS (rheumatoid arthritis MRI scoring). This requires evaluation by specialists in the field and is labor intensive. The MRI scans also have to be acquired in a very standardized manner, with subjects lying prone in the scanner in the “superman” position or supine with their hands and wrists in a special coil. This can be very daunting, and for those in pain, preventing motion during the 30–45 min scan acquisition can be difficult. The preferred use of contrast agents further adds to the complexity of the study.

Novel inflammation-specific PET tracers are being developed to assess disease activity, and more recently the effect of pharmacologic intervention has been investigated using dynamic contrast-enhanced (DCE) MRI [40]. Ultrasound also has a role to play, particularly in Europe, and with the incentive to reduce radiation dose to patients, ultrasound of the joints has become a recognized end point for Phase IIb and Phase IV studies. Ultrasound, as discussed in Chap. 1, is very operator dependent, so a high degree of site operator training is required if this modality is to be used in clinical trials. Furthermore, the site has to be very careful in labelling all the joints so that the central readers, who have no access to the patient, can clearly identify the anatomy during the central read.


Osteoarthritis (Degenerative Joint Disease)


The classic description of osteoarthritis is cartilage loss due to wear and tear that eventually leads to joint space narrowing and bone remodelling (osteophytes and sclerosis). More recently, however, it has been debated whether it may be an inflammatory disease mediated by so-called mechanokines or by mechanical insult. Furthermore, there may be different pathophysiological pathways, some more clearly elucidated than others: for example, anterior cruciate ligament repair, a meniscal tear, or meniscectomy leading to knee osteoarthritis 20–30 years later, versus a patient who has spent a lifetime in heavy labor and whose joints have undergone bony degeneration, remodelling, and cartilage destruction. Without going into the debate on etiology, radiographically osteoarthritis is now recognized as a disease of the whole joint [41, 42]. Most clinical trials have focused on the knee because of its higher incidence, although osteoarthritis also occurs at the hip, shoulder, and hand. The latter two are non-weight-bearing joints, so it is further debated whether this is truly primary osteoarthritis.

Osteoarthritis is usually detected on radiographs as joint space narrowing and specific features of bone remodelling that can be graded according to the severity of the disease. The Kellgren and Lawrence scale is the best known grading system, originally described in 1952 for knees and hips [43]. It is still the so-called gold standard for eligibility criteria in clinical trials in osteoarthritis [44]. However, there are a number of modifications of the original description, with one paper citing ten different versions [45]. Although the scale appears straightforward and simple, it is very difficult to obtain initial consensus within a group of radiologists because of the nuances of the disease, so “reader calibration” is required when a pool of readers is used in clinical trials or epidemiological studies. Because the joint characteristics assessed by the Kellgren and Lawrence scoring system change slowly, it is not used for efficacy. The regulatory authorities (FDA and EMA) still require joint space narrowing (JSN), as assessed by plain film radiographs, to be the primary outcome in a disease-modifying anti-osteoarthritis drug (DMOAD) model. Joint space width (JSW) is a difficult end point to assess because of the reproducibility required to detect a decrease of 0.1 mm to 0.16 mm per year in subjects with confirmed osteoarthritis (Kellgren and Lawrence score 2 or 3). The acquisition protocol has to be very clearly defined, and the one arguably shown to be the most reliable is the modified Lyon-Schuss using a plexiglass positioning device [46]. With good quality acquisition, a precise measurement of JSW can be obtained. Even then, several different measurement methodologies have been described [47, 48]: usually JSW is measured in the medial compartment at a fixed anatomical point, but it can also be the narrowest width within a predefined area, or the mean of the tibial plateau/femoral condyle space.
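The three JSW definitions mentioned above can be compared with a short Python sketch. It assumes that the joint space has already been measured as a profile of widths (in mm) at evenly spaced positions across the medial compartment; the profile values, the index of the fixed point, the predefined region, and the function names are all hypothetical.

```python
from statistics import mean


def jsw_at_fixed_point(profile, index):
    """JSW at one fixed anatomical location along the profile."""
    return profile[index]


def jsw_minimum(profile, start, stop):
    """Narrowest JSW within a predefined region [start, stop)."""
    return min(profile[start:stop])


def jsw_mean(profile):
    """Mean JSW over the whole measured profile."""
    return mean(profile)


def annual_change(baseline_mm, follow_up_mm, years):
    """Change in JSW per year; negative values indicate narrowing."""
    return (follow_up_mm - baseline_mm) / years


# Hypothetical medial-compartment width profiles (mm), two years apart.
baseline = [4.2, 3.9, 3.6, 3.4, 3.7, 4.0]
follow_up = [4.0, 3.7, 3.3, 3.2, 3.5, 3.9]
YEARS = 2.0

fixed_rate = annual_change(jsw_at_fixed_point(baseline, 2),
                           jsw_at_fixed_point(follow_up, 2), YEARS)
minimum_rate = annual_change(jsw_minimum(baseline, 1, 5),
                             jsw_minimum(follow_up, 1, 5), YEARS)
mean_rate = annual_change(jsw_mean(baseline), jsw_mean(follow_up), YEARS)

print(f"fixed point: {fixed_rate:.2f} mm/year")
print(f"minimum:     {minimum_rate:.2f} mm/year")
print(f"mean:        {mean_rate:.2f} mm/year")
```

Even on the same pair of radiographs the three definitions can give different narrowing rates, which is one reason the measurement method needs to be fixed in the protocol before the study starts.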

MRI for the assessment of osteoarthritis (OA) has, as with rheumatoid arthritis (RA), gained a place in clinical development, especially in Phase II. However, at the time of writing, no single set of criteria or measurements that clearly provides the go/no-go signal has been accepted by the FDA. MRI assessments can be divided into quantitative techniques and semiquantitative or scaling techniques. The former, at a minimum, evaluate cartilage thickness in different subanatomical areas of the medial and lateral cartilage [49–51]. They can also evaluate the shape of the cartilage using active shape modelling [52]. There are a number of so-called “semiquantitative” scoring systems. The first was arguably the Whole-Organ Magnetic Resonance Imaging Score of osteoarthritis in the knee [53]. This has been superseded by the BLOKS (Boston Leeds Osteoarthritis Knee Score) [54], and a combination of the two, the so-called MOAKS (MRI Osteoarthritis Knee Score), has recently been developed by the same team [55].

The field of clinical trials in osteoarthritis is now littered with failed drugs that tried to prove DMOAD status. These include the risedronate study [56, 57], which failed the primary end point but provided significant insight to improve future studies. The doxycycline study was one of the best conducted but was underpowered [58]. More recently, the calcitonin studies reached statistical significance with an MRI evaluation method but failed the primary end point of reduction in JSN on radiographs [59, 60]. Since this study had previously reported failure at futility analysis, it can only be surmised that either the subjects were incorrectly enrolled or the quality control of the images was performed very poorly. In contrast, the most recent program, for the iNOS inhibitor cindunistat, passed futility analysis and showed statistical significance against placebo at year 1 in subjects with a modified Kellgren and Lawrence grade 2 (not grade 3). This is an important landmark study, with the results and the methodology published as separate papers [44] led by the Hellio Le Graverand team [46], since it is the first time a drug has been shown to have statistically significant DMOAD properties with a radiographic end point. Unfortunately, efficacy was lost at year 2, and the FDA requires statistical significance in radiographic joint space narrowing for 2 years.

Unlike joint space narrowing for osteoarthritis, the FDA has accepted MRI as the end point for focal cartilage defect healing using an implant [61]. For cartilage regeneration evaluation the so-called MOCART scale (magnetic resonance observation of cartilage repair tissue) [62] was developed. This has become a standard scoring system for focal cartilage repair and regeneration and is accepted by the FDA.


Fracture Healing


Radiographs as well as CT have been used to describe fracture healing. This is not trivial since the definition of fracture healing on radiographs is not quite clear. Usually bridging of cortical bone (which is usually circumferential) of at least 75 % of the fracture plane is used as a definition of successful fracture healing in tubular bones. This requires radiographs in at least two directions or a dedicated 3D CT scan. The RUST (Radiological Union Score for Tibial fractures) [63] has become the standard approach for this end point and evaluation, at least for fractures of the tibia.


Bone Marrow Disease


Bone marrow disorders can have different origins. Besides several types of leukemic disease and metastasis, there are rarer diseases such as Gaucher’s disease. Radiographs depicting the skeletal status have been used to assess disease severity and disease progression. However, radiographs are sensitive to bone disease but less sensitive to bone marrow changes. MRI is the preferred technique to grade bone marrow burden. Only recently have some imaging biomarkers been validated for use in trials studying drug efficacy in Gaucher’s disease [64].


Pediatric Bone Disease


The development of pediatric studies has lagged behind that of adult studies. In more recent years, however, their number has grown, mainly because both the EMA and the FDA have emphasized the development of new products in this specialized population, including through the so-called “pediatric exclusivity” program. Pediatric development has also increased as the pharmaceutical industry has focused on orphan drug indications and other unmet medical needs, many of which arise from genetic mutations and therefore present in children. Although the standard radiological techniques can be applied, there are challenges in evaluating the growing skeleton. Plain radiographs have beam divergence, so even measuring the length, and hence the growth velocity, of the long bones is challenging, and radiopaque rulers have to be in position during acquisition of the radiographs.
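The role of the radiopaque ruler can be illustrated with a small Python sketch. It assumes that the ruler lies in the same plane as the bone, so that beam divergence magnifies both by the same factor; the measurements and function names are hypothetical.

```python
def magnification(ruler_measured_mm, ruler_true_mm):
    """Magnification factor estimated from a ruler of known length."""
    return ruler_measured_mm / ruler_true_mm


def corrected_length(bone_measured_mm, ruler_measured_mm, ruler_true_mm):
    """Bone length on the image rescaled to true size using the ruler."""
    return bone_measured_mm / magnification(ruler_measured_mm, ruler_true_mm)


def growth_velocity(length_start_mm, length_end_mm, years):
    """Growth in mm per year between two visits."""
    return (length_end_mm - length_start_mm) / years


# Hypothetical visits one year apart; the ruler is truly 100 mm long.
visit1 = corrected_length(bone_measured_mm=330.0,
                          ruler_measured_mm=110.0, ruler_true_mm=100.0)
visit2 = corrected_length(bone_measured_mm=334.8,
                          ruler_measured_mm=108.0, ruler_true_mm=100.0)

raw_velocity = growth_velocity(330.0, 334.8, years=1.0)
true_velocity = growth_velocity(visit1, visit2, years=1.0)
print(f"uncorrected: {raw_velocity:.1f} mm/year, "
      f"corrected: {true_velocity:.1f} mm/year")
```

In this made-up case the magnification differs between the two visits, so the uncorrected image measurements would understate growth by about half.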

For DXA the challenge is that the bones are three-dimensional objects that grow over time, but they are displayed and measured in only two dimensions, which confounds longitudinal measurements. Z-score change is arguably the optimum method to achieve a meaningful end point, since it uses a normal reference data set and therefore accounts for growth when evaluating the change in BMD in a pediatric population. The difficulty is that many pediatric studies are in severely diseased children whose growth is already abnormal and whose pubertal onset, and therefore growth pattern, may be significantly distorted from the norm. A number of approaches have therefore been proposed of late, including the height-adjusted Z-score [65, 66]. Essentially, the subject’s height is located on the standardized growth curves, the age at which the mean of the curve equals that height is assigned to the subject, and this age is then used to calculate the BMD Z-score. In other words, a bone age related to normal development is created. However, at the time of writing no single methodology has come to the fore as the de facto standard.
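The height-based adjustment described above can be sketched in Python as follows. This is a simplified illustration of the idea as stated in the text, not a validated method: the growth curve and the BMD reference table are hypothetical, linear interpolation between tabulated ages is an assumption, and the function names are invented for the example.

```python
def interpolate(x, xs, ys):
    """Piecewise-linear interpolation; xs must be increasing. Clamps at ends."""
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    for i in range(1, len(xs)):
        if x <= xs[i]:
            frac = (x - xs[i - 1]) / (xs[i] - xs[i - 1])
            return ys[i - 1] + frac * (ys[i] - ys[i - 1])


def height_age(height_cm, ages, mean_heights):
    """Age at which the mean growth curve reaches the subject's height."""
    return interpolate(height_cm, mean_heights, ages)


def bmd_z_score(bmd, age, ages, ref_mean, ref_sd):
    """BMD Z-score against the reference values for the given age."""
    mu = interpolate(age, ages, ref_mean)
    sd = interpolate(age, ages, ref_sd)
    return (bmd - mu) / sd


# Hypothetical reference tables, for illustration only.
AGES = [8.0, 10.0, 12.0, 14.0]              # years
MEAN_HEIGHT = [128.0, 138.0, 150.0, 162.0]  # cm
REF_BMD_MEAN = [0.60, 0.66, 0.74, 0.86]     # g/cm^2
REF_BMD_SD = [0.06, 0.07, 0.08, 0.09]       # g/cm^2

# A 12-year-old with short stature and low areal BMD.
chronological_age, height, bmd = 12.0, 133.0, 0.565

z_age = bmd_z_score(bmd, chronological_age, AGES, REF_BMD_MEAN, REF_BMD_SD)
adjusted_age = height_age(height, AGES, MEAN_HEIGHT)
z_height = bmd_z_score(bmd, adjusted_age, AGES, REF_BMD_MEAN, REF_BMD_SD)

print(f"height age {adjusted_age:.1f} years")
print(f"age-matched Z {z_age:.2f}, height-adjusted Z {z_height:.2f}")
```

In this constructed example the child's BMD looks markedly low against chronological-age peers but much less so against children of the same height, which is the distortion the height adjustment is intended to reduce.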

Another approach with DXA has been the assessment of the distal femur [67]. This measurement was originally developed by the team at the Alfred I. duPont Hospital, Delaware, USA, for children with cerebral palsy. The side-lying position is comfortable for the patient and allows them to stay relaxed and still during the measurement. The measurement has since been further developed and extended to other populations, and it has been used successfully in a number of clinical trials [68].

Peripheral quantitative computerized tomography (pQCT) has been used extensively in pediatric studies due to the ease of use, low radiation dosage, and a 3D evaluation of bone. These are dedicated systems of which there are two main manufacturers, Stratec and Scanco. Stratec is the most prevalent system and many studies have reported outcomes based on data collected by this instrumentation. As already stated, the challenge with DXA is the 2D evaluation of the growing bone. pQCT removes this challenge. More recently Mindways has developed a “pQCT” version of their software allowing a standard CT scanner to be used. The subject lies in the scanner in a “superman” position with arms outstretched so the forearms can be scanned avoiding radiation to the brain and torso. This will provide further investigator sites that can be employed in pediatric clinical trials without having to purchase expensive dedicated equipment.

Posted Aug 21, 2016, in General Radiology
