Biomarker Models for Progression Estimation in Alzheimer’s Disease

indicating the disease progress of a subject by fitting its ADAS scores to this curve. Similarly, Delor et al. [2] compute a disease onset time by adjusting subjects according to their CDR-SB score.

The approach presented in this work builds upon these methods. Here, quantile regression is used to estimate typical trajectories of clinical biomarkers (see Sect. 2). In detail, two models are trained, one for the transition from CN to MCI and one for the MCI-to-AD conversion. These models are then combined into a multi-stage model for the whole course of the disease. Thereafter, a probabilistic model is derived that allows the estimation of a subject's current disease progress and rate of progression based on measured biomarker values. The approach is flexible with regard to the considered biomarkers, which can be based, for example, on cognitive scores, neuroimaging, or both. Moreover, missing measurements are handled in a natural way, meaning that the approach can be employed even if the set of observed biomarkers is incomplete. The proposed disease progress estimation is evaluated in Sect. 3 using clinical data. Its applicability to different classification tasks is demonstrated at the end of that section.

2 Methods

To model disease progression, the existence of a set of biomarker values $y^b_{sv}$ acquired from multiple subjects $s\in S=\{1,\dots ,n_S\}$ during multiple visits $v\in V_s$ is assumed. Here, $b\in B_{sv}$ denotes the index of the biomarker. Each biomarker vector is associated with the time $t_{sv}\in T$ of acquisition, measured in days after the first (baseline) visit, as well as the diagnosis $d_{sv}$ that was given during each visit. The number of visits can vary for each subject, $V_s\subseteq V=\{1,\dots ,n_V\}$ . Also, the biomarkers acquired at each visit might differ, such that $B_{sv}\subseteq B=\{1,\dots ,n_B\}$ .

In the training phase of the presented method, characteristic trajectories of biomarkers in the course of disease progression are learned based on a number of training subjects (Sect. 2.1). These models are then employed in the test phase to estimate how far and how fast test subjects have progressed along the disease trajectory (Sect. 2.2).

2.1 Model Learning

The aim of the model training phase is to learn the temporal trajectory of biomarker evolution throughout the disease by determining the probability that a certain biomarker b has a value $y^b$ at a specified time point. More technically, each measured biomarker value $y_{sv}^b$ is understood as an observation of a response variable $Y^b$ at a disease progress (DP) $p_{sv}\in \mathbb {R}$ (the explanatory variable or covariate). The conditional distribution of $Y^b$ given p is then denoted by $f_{Y^b}(y|p)$ .

A disease progression model $\mathcal {M}(p)$ comprises the distributions of all biomarkers in B on a domain $P\subset \mathbb {R}$ , such that $\mathcal {M}(p)=\{\mathcal {M}^1(p),\dots ,\mathcal {M}^{n_B}(p)\}$ with $\mathcal {M}^b(p):=f_{Y^b}(y|p)$ for $p\in P$ . Another way of representing the model is by its q-quantile functions $y_q^b(p)$ , which can be derived directly from $f_{Y^b}(y|p)$ (for example, the median trajectory is denoted by $y_{0.5}^b(p)$ ).
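For concreteness, the derivation of the quantile functions from the conditional density can be written out using the standard definition of a conditional quantile. The cumulative distribution function $F_{Y^b}$ is a symbol introduced here for illustration only and is not part of the original notation:

$\begin{aligned} F_{Y^b}(y\,|\,p):=\int _{-\infty }^{y} f_{Y^b}(u\,|\,p)\,du,\qquad y_q^b(p):=\inf \{y : F_{Y^b}(y\,|\,p)\ge q\},\quad q\in (0,1). \end{aligned}$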

The learning of a model consists of three main steps. First, the training subjects have to be temporally aligned to establish correspondences between the time points of observation. Progression models are then estimated using quantile regression to learn the probability distributions $f_{Y^b}(y|p)$ . Since temporal alignment is based on the point of conversion from either CN to MCI or MCI to AD, two separate models are learned. Finally, these two models are combined into a multi-stage progression model.

Temporal Alignment of the Training Data. Temporal alignment aims at associating the time points $t_{sv}$ of biomarker acquisition to the corresponding DP $p_{sv}$ . In detail, the goal is to find a strictly monotonically increasing time warp function $\tau (t)$ that maps the subject-specific acquisition time $t_{sv}\in T$ to the population-based disease progress $p_{sv}\in P$ , such that $p_{sv}=\tau (t_{sv})$ . During model training, the time point $t_s^0$ at which the clinical diagnosis changes and thus indicates transition to a more severe disease state is set to $p=0$ , that means

$\begin{aligned} p_{sv}=\tau (t_s^0;t_{sv}):=t_{sv}-t_s^0. \end{aligned}$

(1)

For this reason, a specific model is trained for each transition phase (here, CN-to-MCI and MCI-to-AD). To identify $t_s^0$ , the visits $v^*$ and $v^{**}$ with the last CN (MCI) and the first MCI (AD) diagnosis are determined. The time point of conversion is then assumed to be the average of these two visits, i.e. $t_s^0:=0.5\cdot (t_{sv^*}+t_{sv^{**}})$ .
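The alignment step can be illustrated with a minimal Python sketch. The function names, the string encoding of diagnoses, and the example visit data are assumptions made for illustration. The sketch also assumes that diagnoses do not revert to a less severe state over time.

```python
from typing import List, Tuple

Visit = Tuple[float, str]  # (t_sv in days after baseline, diagnosis d_sv)


def conversion_time(visits: List[Visit], before: str, after: str) -> float:
    """Estimate t_s^0 as the midpoint between the last `before` visit
    and the first `after` visit (hypothetical helper)."""
    times_before = [t for t, d in visits if d == before]
    times_after = [t for t, d in visits if d == after]
    if not times_before or not times_after:
        raise ValueError(f"subject does not convert from {before} to {after}")
    t_last, t_first = max(times_before), min(times_after)
    if t_last >= t_first:
        # Reverting diagnoses are outside the scope of this sketch.
        raise ValueError("diagnoses are not ordered in time")
    return 0.5 * (t_last + t_first)


def align(visits: List[Visit], before: str, after: str) -> List[float]:
    """Apply Eq. (1): p_sv = t_sv - t_s^0 for every visit of one subject."""
    t0 = conversion_time(visits, before, after)
    return [t - t0 for t, _ in visits]


# Made-up example: a subject converting from CN to MCI between day 180 and day 365.
visits = [(0.0, "CN"), (180.0, "CN"), (365.0, "MCI"), (730.0, "MCI")]
progress = align(visits, "CN", "MCI")
print(progress)
```

Visits before the assumed conversion receive negative progress values and visits after it receive positive ones, so all training subjects share $p=0$ as a common reference point.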

Learning Disease Progression Model. The conditional distributions $f_{Y^b}$ are learned independently for each biomarker using quantile regression via vector generalised additive models (VGAMs) [11]. In contrast to logistic or exponential regression [10], VGAMs do not depend on prior assumptions on the functional form for each predictor variable other than their smoothness. However, the domain P of $\mathcal {M}^b(p)$ is limited to the progress interval contained in the sample set. This means P is given by $P=[p_-,p_+]$ , with $p_-:=\min _{s,v}(p_{sv})$ and $p_+:=\max _{s,v}(p_{sv})$ being the earliest and latest observed DP, respectively. Therefore, the models are extrapolated by a linear extension of the underlying predictor functions (see [11] for details) and $P=\mathbb {R}$ is assumed in the following.

Fig. 1. Approach for automatically determining the optimal model offset $\delta _p$ .

Fig. 2. Example of the model composition. First, separate models are trained for CN-to-MCI and MCI-to-AD converters. An optimal offset between these models is then automatically determined and model training is repeated with the whole set of samples (Colour figure online).

Model Composition. To combine the CN/MCI and MCI/AD models, an optimal offset $\delta _p$ is determined by optimising the similarity of the models in the overlapping region. Given $\delta _p$ , the end point $p_+^{[1]}$ of the CN/MCI model $\mathcal {M}^{[1]}$ corresponds to $p_+^{[1]}-\delta _p$ in the MCI/AD model $\mathcal {M}^{[2]}$ . Similarly, $\mathcal {M}^{[2]}(p_-^{[2]})$ corresponds to $\mathcal {M}^{[1]}(\delta _p+p_-^{[2]})$ (cf. Fig. 1). The quality of the fit is quantified by

$\begin{aligned} \hat{\delta }_p:=\underset{\delta _p}{{\text {argmin}}}\ \frac{1}{2}\left[ \left( \mathcal {M}^{[1]}(p_+^{[1]})-\mathcal {M}^{[2]}(p_+^{[1]}-\delta _p)\right) + \left( \mathcal {M}^{[1]}(\delta _p+p_-^{[2]})-\mathcal {M}^{[2]}(p_-^{[2]})\right) \right] \end{aligned}$

with $\mathcal {M}^{[1]}(p)-\mathcal {M}^{[2]}(q)=\frac{1}{|B|}\sum _{b\in B}\int \left| f_{Y^b}^{[1]}(y|p) - f_{Y^b}^{[2]}(y|q)\right| \ dy$ being the area between the corresponding density functions (averaged over all biomarkers). A smaller area indicates more similar models, so the offset with the smallest area is selected. After determining $\hat{\delta }_p$ , the multi-stage model is retrained using the full set of samples with $p=0$ defined as the point of conversion from CN to MCI (see Fig. 2).
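The offset search can be sketched in Python under several simplifying assumptions that are not taken from the text. The sketch handles a single biomarker, so the average over B is omitted. Each model is replaced by a Gaussian conditional density with a progress-dependent mean, which is a stand-in for the VGAM-based densities. The offset is chosen by a grid search that minimises the absolute area between densities. All names and numbers are hypothetical.

```python
import math
from typing import Callable, Sequence, Tuple

# Stand-in model: maps progress p to (mean, std) of a Gaussian f(y | p).
Model = Callable[[float], Tuple[float, float]]


def gauss(y: float, mu: float, sigma: float) -> float:
    """Gaussian probability density at y."""
    return math.exp(-0.5 * ((y - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))


def density_distance(m1: Model, p: float, m2: Model, q: float,
                     y_grid: Sequence[float]) -> float:
    """Riemann-sum approximation of the area between f1(y|p) and f2(y|q)."""
    dy = y_grid[1] - y_grid[0]
    (mu1, s1), (mu2, s2) = m1(p), m2(q)
    return sum(abs(gauss(y, mu1, s1) - gauss(y, mu2, s2)) for y in y_grid) * dy


def best_offset(m1: Model, p1_plus: float, m2: Model, p2_minus: float,
                candidates: Sequence[float], y_grid: Sequence[float]) -> float:
    """Grid search for the offset with the smallest mean boundary mismatch."""
    def cost(delta: float) -> float:
        end_of_first = density_distance(m1, p1_plus, m2, p1_plus - delta, y_grid)
        start_of_second = density_distance(m1, delta + p2_minus, m2, p2_minus, y_grid)
        return 0.5 * (end_of_first + start_of_second)
    return min(candidates, key=cost)


# Toy models: the second model is the first one shifted by 700 days.
def model_1(p: float) -> Tuple[float, float]:
    return 10.0 + 0.01 * p, 2.0


def model_2(q: float) -> Tuple[float, float]:
    return 10.0 + 0.01 * (q + 700.0), 2.0


y_grid = [0.1 * i for i in range(-100, 400)]
delta_hat = best_offset(model_1, 500.0, model_2, -800.0,
                        list(range(0, 1501, 100)), y_grid)
print(delta_hat)
```

In this toy setting the two models agree exactly at the true shift, so the grid search recovers it. With learned densities the minimum would generally be non-zero, and a finer grid or a continuous optimiser may be preferable.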

2.2 Progress Estimation

Once the disease progression model is built, the aim is to estimate the progress of any given subject. However, the point of conversion $t_s^0$ is usually unknown and thus Eq. (1) cannot be employed. Progress estimation is accomplished by finding the most likely time warp $\tau (t)$ that optimally fits the evolution of the biomarkers, as measured from the patient, into the progression model $\mathcal {M}$ .

Let $\varvec{t}_s=(t_{s1},\dots ,t_{sn_V})^T$ be the vector containing the time points of all visits of subject s and $\tau (\varvec{t}_s)=(\tau (t_{s1}),\dots ,\tau (t_{sn_V}))^T$ . Let further $\varvec{y}_s=(\varvec{y}_{sv})_{v\in V_s}$ be the biomarker vector measured for s, with $\varvec{y}_{sv}=(y_{sv}^b)_{b\in B_{sv}}$ denoting the values acquired at visit v. Based on $\varvec{t}_s$ , the most probable time warp $\hat{\tau }_s$ given $\varvec{y}_s$ is determined by maximising the logarithm of the likelihood function $\mathcal {L}(\tau (\varvec{t}_s)\,|\,\varvec{y}_s)$ . This means

$\begin{aligned} \hat{\tau }_s:=\underset{\tau }{{\text {argmax}}}\ \log \mathcal {L}(\tau (\varvec{t}_s)\,|\,\varvec{y}_s) = \underset{\tau }{{\text {argmax}}}\ \log f_{\varvec{Y}}(\varvec{y}_s\,|\,\tau (\varvec{t}_s)) \end{aligned}$

(2)

with $\varvec{Y}=(Y^1,\dots ,Y^{n_B})$ . The joint probability of all observations $y_{sv}^b$ is then

$f_{\varvec{Y}}(\varvec{y}_s\,|\,\tau (\varvec{t}_s))=\prod _{v\in V_s} f_{\varvec{Y}}(\varvec{y}_{sv}\,|\,\tau (t_{sv}))=\prod _{v\in V_s}\prod _{b\in B_{sv}} f_{Y^b}(y_{sv}^b\,|\,\tau (t_{sv}))\ ,$

where all biomarker observations are assumed to be independent of each other.
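Because the joint density factorises over visits and over the biomarkers observed at each visit, the log-likelihood in Eq. (2) becomes a plain sum, and a missing measurement simply contributes no term. The following Python sketch illustrates this. The Gaussian densities, the biomarker names, and the numeric values are hypothetical stand-ins for the learned model.

```python
import math
from typing import Callable, Dict, List, Tuple

Density = Callable[[float, float], float]  # (y, p) -> f_{Y^b}(y | p)
Visit = Tuple[float, Dict[str, float]]     # (t_sv, {biomarker name: y_sv^b})


def gaussian_density(intercept: float, slope: float, sigma: float) -> Density:
    """Hypothetical stand-in for a learned density with a linear mean in p."""
    def f(y: float, p: float) -> float:
        mu = intercept + slope * p
        return math.exp(-0.5 * ((y - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))
    return f


def log_likelihood(visits: List[Visit], densities: Dict[str, Density],
                   warp: Callable[[float], float]) -> float:
    """Sum of log f_{Y^b}(y_sv^b | tau(t_sv)) over visits and observed biomarkers."""
    total = 0.0
    for t, measured in visits:
        p = warp(t)
        for name, y in measured.items():  # only biomarkers in B_sv contribute
            total += math.log(densities[name](y, p))
    return total


densities = {
    "marker_a": gaussian_density(10.0, 0.01, 2.0),
    "marker_b": gaussian_density(6.0, -0.005, 1.0),
}
# marker_b is missing at the second visit and is skipped there.
visits = [(0.0, {"marker_a": 12.0, "marker_b": 5.0}),
          (365.0, {"marker_a": 15.65})]

ll_near = log_likelihood(visits, densities, lambda t: 200.0 + t)
ll_far = log_likelihood(visits, densities, lambda t: -500.0 + t)
print(ll_near > ll_far)
```

The two calls compare candidate time warps for the same subject. The warp that places the measurements closer to the model trajectories yields the larger log-likelihood. A density that evaluates to exactly zero would make `math.log` fail, so a practical implementation would need to guard against that case.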

A simple translational time warp parameterisation is given by

$\begin{aligned} \tau (p^0;t):= p^0 + t. \end{aligned}$

(3)

Here, the disease progress (DP) $p^0\in \mathbb R$ is an offset that indicates how far the subject has progressed in the course of disease at the time point of the first visit. However, this simple model cannot accommodate different rates of progression, which are known to exist between subjects [3]. If $|V_s|>1$ , the extended affine time warp definition

$\begin{aligned} \tau (p^0,r;t):= p^0 + r\cdot t \end{aligned}$

(4)

can be employed, where $r\in \mathbb R^+$ is a scaling factor indicating the disease progression rate (DPR). The optimal values $\hat{p}_s^0$ and $\hat{r}_s$ for DP and DPR are determined by maximising Eq. (2) over all