Morphological and Appearance Features for Predicting Physical Disability from MR Images in Multiple Sclerosis Patients

We are given $n$ MRI scans $I = \{I_1,\dots ,I_{n}\}$, where each 3D MRI scan $I_i$ has a corresponding real-valued clinical score $y_i \in Y$ and a corresponding spinal cord segmentation $S_i \in S$. The dimensions of $I_i$ and $S_i$ are the same. Each voxel in $S_i$ has a value between 0 and 1, where 0 represents the background and 1 represents the spinal cord. Voxels in $S_i$ that are on the boundary of the spinal cord are assigned a fuzzy value between 0 and 1 that represents the estimated fraction of the voxel that contains spinal cord (i.e., partial volume) [15].

Our objective is to create a model $M$, using the images $I$ and segmentations $S$, capable of predicting the patients’ clinical scores $Y$ from novel MR images. We extract a set of features $X$ from $I$ and $S$ that are transformed by model $M$ into values $\hat{Y}$, such that these predicted values $\hat{Y}=M(X)$ estimate the corresponding clinical scores $Y$.

One approach is to set $M$ as a simple linear regression model with the spinal cord volume as the single explanatory variable. This is similar to the existing literature, where a Pearson’s correlation coefficient is computed to measure the linear dependency between the spinal cord volume and clinical score. However, as mentioned in the introduction, this linear dependency using spinal cord volume does not always reveal a strong clinical relationship. We improve on this by deriving new morphological and MRI-based appearance features $X$ and examining ways to combine them in more descriptive models $M$.

2.2 Candidate Features

We describe simple candidate morphological and appearance features $X$ that are potentially sensitive to spinal cord changes. This is not meant to be a comprehensive set of features, but is sufficient to explore the potential of going beyond measuring cord size to predict disability. We first define the commonly used spinal cord volume, which is computed by summing all voxels, including the partial volumes $S_i(j)\in [0,1]$, in the segmentation, $vol = \sum _{j=1}^{J} S_i(j)$, where $J$ is the total number of voxels in $S_i$. While spinal cord volume captures a global measure of spinal cord atrophy, we are also interested in features that vary at least partly independently from area or volume, and that are sensitive to spinal cord changes at a local scale.
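The volume definition above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the segmentation is held as nested lists indexed as slice, row, column; the function name `cord_volume` and the toy values are hypothetical and not taken from the original work.

```python
def cord_volume(seg):
    """Sum every voxel of a fuzzy 3D segmentation whose values lie in [0, 1]."""
    return sum(v for axial_slice in seg for row in axial_slice for v in row)


# Toy two-slice segmentation (made-up values):
# 1.0 = cord, 0.0 = background, fractions = partial volume at the boundary.
seg = [
    [[0.0, 0.5, 0.0],
     [0.5, 1.0, 0.5],
     [0.0, 0.5, 0.0]],
    [[0.0, 0.25, 0.0],
     [0.25, 1.0, 0.25],
     [0.0, 0.25, 0.0]],
]

vol = cord_volume(seg)  # expressed in voxel units
```

The result is in voxel units; converting it to a physical volume would require the voxel dimensions, which the text does not state.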

Fig. 1 Illustrations of the proposed features. (a) The distances (dashed line) from the center-of-mass (center box) to the boundary voxels (circles) make up $per_k$. (b) The distances to the nearest boundary point from the voxels inside the cord give $dist_k$ (brighter implies farther). (c) An ellipse is fit to the cord. (d) The normalized intensities of the cord are considered in $int_k$.

Our first proposed feature is designed to be more sensitive to local changes in the spinal cord’s boundary. On each 2D axial slice of the segmentation $S_i$, we find voxels on the boundary between the spinal cord and background by considering voxels in $S_i$ with a partial volume greater than 0.5 to be spinal cord. This results in a 2D binary image that we use to extract the cord’s boundary voxels. For the $k$th 2D axial slice of the spinal cord, we take the Euclidean distance between the center-of-mass $c_k$ of the cord’s $k$th cross section and the spinal cord boundary/perimeter voxels $b_k$, computed as $per_k = ( d(c_k,b^1_{k}), \dots , d (c_k,b_{k}^{m(k)}) )$, where $b_k^i$ represents the $i$th boundary voxel on the $k$th slice, and $d(\cdot ,\cdot )$ computes the Euclidean distance between the two coordinates (Fig. 1a). The number of boundary voxels $m(k)$ can change for each 2D slice. We find the minimum distance from the center-of-mass to the boundary voxels in each 2D slice averaged over $K$ 2D slices,

$\begin{aligned} per_\mathrm {min} = \frac{1}{K} \sum _{k=1}^K \mathrm {min}(per_k) . \end{aligned}$

(1)

In a similar way, to compute additional features we replace the “min” function from (1) with the mean ( $per_{\mathrm {mean}}$ ), standard deviation ( $per_{\mathrm {std}}$ ), and the max ( $per_{\mathrm {max}}$ ) functions.
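A minimal sketch of the $per$ features follows, using only the Python standard library. Several details are assumptions rather than statements from the text: a boundary voxel is taken to be a cord voxel with at least one 4-connected background neighbour, the center-of-mass is computed on the binary cross section, the standard deviation is the population form, and slices without cord are skipped. The function names and the toy square are hypothetical.

```python
import math
import statistics


def perimeter_distances(slice2d, threshold=0.5):
    """Return per_k: distances from the centre of mass to each boundary voxel."""
    cord = [(r, c) for r, row in enumerate(slice2d)
            for c, v in enumerate(row) if v > threshold]
    if not cord:
        return []
    cord_set = set(cord)
    # Centre of mass of the binary cross section.
    c_r = sum(r for r, _ in cord) / len(cord)
    c_c = sum(c for _, c in cord) / len(cord)
    # Assumed boundary rule: a cord voxel touching background (4-connectivity).
    boundary = [(r, c) for r, c in cord
                if any((r + dr, c + dc) not in cord_set
                       for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)))]
    return [math.hypot(r - c_r, c - c_c) for r, c in boundary]


def perimeter_features(seg):
    """Average the per-slice min, mean, std and max of per_k over non-empty slices."""
    per = [p for p in (perimeter_distances(s) for s in seg) if p]
    if not per:
        raise ValueError("segmentation contains no cord voxels")
    K = len(per)
    return {
        "per_min": sum(min(p) for p in per) / K,
        "per_mean": sum(statistics.fmean(p) for p in per) / K,
        "per_std": sum(statistics.pstdev(p) for p in per) / K,
        "per_max": sum(max(p) for p in per) / K,
    }


# Toy example: a 3x3 cord cross section inside a 5x5 slice, repeated twice.
square = [[1.0 if 1 <= r <= 3 and 1 <= c <= 3 else 0.0 for c in range(5)]
          for r in range(5)]
per_features = perimeter_features([square, square])
```

For the toy square, the eight boundary voxels lie either one voxel or $\sqrt{2}$ voxels from the center, so the minimum and maximum features take those two values.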

We define a related measure that focuses on local changes in 3D by calculating a 3D distance transform from the surface of the segmented spinal cord masked by (or restricted to) the interior region of the cord. To compute the distance transform, we calculate the Euclidean distance between voxels inside the spinal cord and the nearest boundary voxel in 3D. To further differentiate this feature from the $per$ features, we consider voxels that contain any partial volume to be spinal cord, which changes the boundary voxels. The distance transform for slice $k$ with $q(k)$ voxels inside the cord is represented as $dist_k = (t^1_k, \dots , t^{q(k)}_k)$, where $t^i_k$ is the distance from the $i$th voxel inside the cord on the $k$th slice to the nearest 3D boundary coordinate (Fig. 1b). The number of voxels inside the cord, $q(k)$, can change for each 2D slice. In a similar fashion to (1), we replace $per_k$ with $dist_k$ and the “min” function with the mean ( $dist_{\mathrm {mean}}$ ), max ( $dist_{\mathrm {max}}$ ), standard deviation ( $dist_{\mathrm {std}}$ ) and the max divided by the mean distance ( $dist^{\mathrm {max}}_{\mathrm {mean}}$ ) functions, averaged over the $K$ 2D slices. For clarity we formally define,

$\begin{aligned} dist^{\mathrm {max}}_{\mathrm {mean}} = \frac{1}{K}\sum _{k=1}^K \frac{\mathrm {max}(dist_k)}{\mathrm {mean}(dist_k)}, \end{aligned}$

(2)

which averages, over the $K$ slices, the ratio of the maximum distance to the mean distance within each slice.
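The $dist$ features, including the ratio in (2), can be sketched with a brute-force 3D distance transform. This is an illustration under assumptions that the text does not confirm: distances are measured from each cord voxel to the nearest background voxel (a common distance-transform convention), voxels are treated as isotropic, and the population standard deviation is used. The brute-force search is only practical for small volumes; the names and toy data are hypothetical.

```python
import math
import statistics


def distance_features(seg):
    """Per-slice statistics of a brute-force 3D distance transform, averaged over slices."""
    cord, background = [], []
    for k, axial_slice in enumerate(seg):
        for r, row in enumerate(axial_slice):
            for c, v in enumerate(row):
                # Any partial volume counts as cord for this feature.
                if v > 0:
                    cord.append((k, r, c))
                else:
                    background.append((k, r, c))
    if not cord or not background:
        raise ValueError("need both cord and background voxels")

    # dist[k] holds t_k^1 ... t_k^q(k) for slice k.
    dist = {}
    for voxel in cord:
        t = min(math.dist(voxel, b) for b in background)
        dist.setdefault(voxel[0], []).append(t)

    K = len(dist)
    slices = list(dist.values())
    return {
        "dist_mean": sum(statistics.fmean(d) for d in slices) / K,
        "dist_max": sum(max(d) for d in slices) / K,
        "dist_std": sum(statistics.pstdev(d) for d in slices) / K,
        "dist_max_mean": sum(max(d) / statistics.fmean(d) for d in slices) / K,
    }


# Toy slice (made-up values): a 3x3 cord with partial-volume edges.
toy = [
    [0.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.3, 0.3, 0.3, 0.0],
    [0.0, 0.3, 1.0, 0.3, 0.0],
    [0.0, 0.3, 0.3, 0.3, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.0],
]
dist_features = distance_features([toy, toy])
```

In the toy volume each slice has eight cord voxels at distance 1 and one at distance 2, so the per-slice ratio of maximum to mean is $2 / (10/9) = 1.8$.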

To compute features that are more robust to local noise, such as small segmentation errors, we fit an ellipse (Fig. 1c) to each 2D cross-sectional slice of the segmented spinal cord and compute the eccentricity, minor axis ( $ax_{\mathrm {min}}$ ), and major axis ( $ax_{\mathrm {maj}}$ ), averaged over the length of the cord.
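The text does not say how the ellipse is fit. One common choice, assumed here purely for illustration, is the ellipse that has the same second central moments as the binary cross section. The sketch below also assumes a 0.5 partial-volume threshold and reports full axis lengths in voxel units; the function names and the rectangular toy slice are hypothetical.

```python
import math


def ellipse_from_moments(slice2d, threshold=0.5):
    """Return (eccentricity, minor axis, major axis) of the moment-matched ellipse."""
    pts = [(r, c) for r, row in enumerate(slice2d)
           for c, v in enumerate(row) if v > threshold]
    if not pts:
        return None
    n = len(pts)
    r_bar = sum(r for r, _ in pts) / n
    c_bar = sum(c for _, c in pts) / n
    # Second central moments of the cord voxel coordinates.
    m_rr = sum((r - r_bar) ** 2 for r, _ in pts) / n
    m_cc = sum((c - c_bar) ** 2 for _, c in pts) / n
    m_rc = sum((r - r_bar) * (c - c_bar) for r, c in pts) / n
    # Eigenvalues of the 2x2 covariance matrix in closed form.
    mid = (m_rr + m_cc) / 2
    half_gap = math.hypot((m_rr - m_cc) / 2, m_rc)
    lam_major = mid + half_gap
    lam_minor = max(mid - half_gap, 0.0)
    ecc = math.sqrt(1 - lam_minor / lam_major) if lam_major > 0 else 0.0
    # A filled ellipse with semi-axis a has variance a^2 / 4 along that axis.
    return ecc, 4 * math.sqrt(lam_minor), 4 * math.sqrt(lam_major)


def ellipse_features(seg):
    """Average eccentricity and axis lengths over slices that contain cord."""
    fits = [f for f in (ellipse_from_moments(s) for s in seg) if f is not None]
    if not fits:
        raise ValueError("segmentation contains no cord voxels")
    K = len(fits)
    return {
        "ecc": sum(f[0] for f in fits) / K,
        "ax_min": sum(f[1] for f in fits) / K,
        "ax_maj": sum(f[2] for f in fits) / K,
    }


# Toy example: a 3x5 rectangular cross section inside a 5x7 slice.
rect = [[1.0 if 1 <= r <= 3 and 1 <= c <= 5 else 0.0 for c in range(7)]
        for r in range(5)]
ell_features = ellipse_features([rect, rect])
```

A least-squares fit to the boundary points would be an equally plausible reading of the text and could give slightly different axis lengths.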

All the features proposed so far are dependent on the geometrical characteristics of the cord, but we also include features based on the intensities found within the MRI. As the intensity values can vary widely in different MRI scans, we normalize a scan’s intensities by its overall 3D scan intensities to produce z-scores. We extract the z-scores of those voxels that are labelled as spinal cord (partial volume $>$ 0.5) and take the mean ( $int_{\mathrm {mean}}$ ) and standard deviation ( $int_{\mathrm {std}}$ ) of the spinal cord intensity values averaged over the $K$ 2D slices (Fig. 1d).
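The intensity features can be sketched as follows, assuming the scan and its segmentation are nested lists of identical shape, that z-scores use the mean and population standard deviation of every voxel in the scan, and that slices without cord are skipped. The function name and the toy intensities are hypothetical.

```python
import statistics


def intensity_features(image, seg, threshold=0.5):
    """Mean and std of cord z-scores per slice, averaged over slices with cord."""
    all_values = [v for axial_slice in image for row in axial_slice for v in row]
    mu = statistics.fmean(all_values)
    sigma = statistics.pstdev(all_values)
    if sigma == 0:
        raise ValueError("scan has constant intensity; z-scores are undefined")

    means, stds = [], []
    for img_slice, seg_slice in zip(image, seg):
        # int_k: z-scores of the voxels labelled as cord on this slice.
        int_k = [(i - mu) / sigma
                 for img_row, seg_row in zip(img_slice, seg_slice)
                 for i, s in zip(img_row, seg_row) if s > threshold]
        if int_k:
            means.append(statistics.fmean(int_k))
            stds.append(statistics.pstdev(int_k))
    if not means:
        raise ValueError("segmentation contains no cord voxels")
    K = len(means)
    return {"int_mean": sum(means) / K, "int_std": sum(stds) / K}


# Toy two-slice scan (made-up intensities) and matching segmentation.
image = [[[1, 3], [1, 3]],
         [[1, 3], [1, 3]]]
seg_mask = [[[0.0, 1.0], [0.0, 1.0]],
            [[1.0, 0.0], [0.0, 1.0]]]
int_features = intensity_features(image, seg_mask)
```

In the toy scan the overall mean is 2 and the standard deviation is 1, so the first slice contributes z-scores (1, 1) and the second (-1, 1).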

2.3 Regression Models

Linear regression employs a linear function to model the relationship between the explanatory variables (e.g., spinal cord volume) and a response variable (clinical score). The parameters of this model are the coefficients $\beta$ of the explanatory variables and the error term $\varepsilon$. These coefficients can be estimated from the data by a least-squares fit that minimizes the sum of squared differences between the observed response and the fitted values. A model with only a single explanatory variable is known as simple linear regression, and is one of the simplest models to analyze. Given a dataset with $n$ observations, this produces a straight line, $y_i = \beta _1 x_{i1} + \varepsilon _i,\ i=1,\dots ,n$. Multiple linear regression builds on this by using $r$ explanatory variables in the model, $y_i = \beta _1 x_{i1} + \dots + \beta _r x_{ir} + \varepsilon _i$.
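As a small illustration of the least-squares fit for simple linear regression, the sketch below uses the closed-form solution. It includes an intercept term $\beta _0$, which is an assumption: the equation in the text does not write one explicitly. The function names and the numbers are made up and are not patient data.

```python
def fit_simple_linear(x, y):
    """Closed-form least-squares fit of y = beta0 + beta1 * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    if s_xx == 0:
        raise ValueError("explanatory variable is constant")
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    beta1 = s_xy / s_xx
    beta0 = y_bar - beta1 * x_bar  # assumed intercept term
    return beta0, beta1


def predict(beta0, beta1, x_new):
    """Evaluate the fitted line at a new value of the explanatory variable."""
    return beta0 + beta1 * x_new


# Hypothetical feature values and scores lying exactly on y = 1 + 2x.
x_demo = [1.0, 2.0, 3.0, 4.0]
y_demo = [3.0, 5.0, 7.0, 9.0]
beta0, beta1 = fit_simple_linear(x_demo, y_demo)
y_hat = predict(beta0, beta1, 5.0)
```

Extending this to $r$ explanatory variables would replace the closed form with a solve of the normal equations, typically through a numerical library.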

While these models assume a linearity of the underlying relations, we also explore a more flexible, non-linear, non-parametric model, known as a regression forest. A regression forest significantly differs from the previously described models as it is completely learned from the data and makes no assumptions about the underlying distributions [2].

2.4 Training and Testing the Models

The models in Sect. 2.3 are described in order of increasing complexity. With this added complexity, we increase the potential to accurately model the underlying function, but also make the model harder to understand intuitively and increase the likelihood of over-fitting the model to the training data. To reduce the possibility of over-fitting, we divide our data into training and testing sets. Given the relatively small size of our dataset, we use leave-one-out cross-validation: a single sample is held out for testing while the model is trained on the remaining samples. This is repeated for all samples to give us an indication of the robustness and generalizability of our regression model and chosen features.
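Leave-one-out cross-validation can be sketched as below, using a simple linear model with an assumed intercept as the regressor. The mean absolute error is used as the summary only for illustration, since the text does not name an error measure here; the function names and data are hypothetical.

```python
def fit_line(x, y):
    """Least-squares fit of y = beta0 + beta1 * x (intercept is an assumption)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    beta1 = s_xy / s_xx
    return y_bar - beta1 * x_bar, beta1


def leave_one_out_predictions(x, y):
    """Predict each sample from a model trained on all the other samples."""
    predictions = []
    for i in range(len(x)):
        x_train = x[:i] + x[i + 1:]
        y_train = y[:i] + y[i + 1:]
        beta0, beta1 = fit_line(x_train, y_train)
        predictions.append(beta0 + beta1 * x[i])
    return predictions


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between observed and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical data lying exactly on a line, so held-out predictions are exact.
x_cv = [1.0, 2.0, 3.0, 4.0, 5.0]
y_cv = [3.0, 5.0, 7.0, 9.0, 11.0]
y_cv_hat = leave_one_out_predictions(x_cv, y_cv)
mae = mean_absolute_error(y_cv, y_cv_hat)
```

With $n$ samples the model is fit $n$ times, which is affordable for small datasets such as the one described.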

2.5 Clinical Scores

As discussed in the introduction, the EDSS and the MSFC scores, which we aim to predict from the MR images, are commonly used to quantify clinical disability. We choose to focus on the MSFC score rather than the EDSS score because the MSFC captures disability to which the EDSS score is relatively insensitive, such as arm/hand function. In addition, the EDSS scores tend to exhibit a poor distribution due to the non-linearity of the scale, with many patients clustered between 4.5 and 6.5 (Fig. 2a).

The MSFC score tests for: upper extremity function, determined by a 9-hole peg test (9-HPT); walking speed, measured by a timed 25-foot walk (T25W); and cognitive function, evaluated by a paced auditory serial addition test (PASAT). These three tests are shown to vary relatively independently, be sensitive to changes over time, and capture aspects of MS that are not captured in the EDSS score [3
