Nonlinear Mixed Effects Models for Longitudinal Image Data

, are usually observed and/or normalized in a large number of locations of a common space, denoted by $\mathcal {S}$ , across multiple time points $\{t_{ij}:j=1,\cdots ,T_i\}_{i\ge 1}$ . Also, $\mathcal {S}$ is often a compact subset of Euclidean space.

Methodology for longitudinal image data is still in its infancy, and further theoretical and practical development is much needed. Most existing methods focus on the analysis of univariate (or multivariate) variables measured longitudinally [4]. Parametric mixed effects models, which include both fixed and random effects, are the predominant approach for characterizing both temporal correlations and random individual variations. Although there is great interest in the analysis of functional data with various levels of hierarchical structure [7, 11, 18], only a handful of studies [6, 17, 21] have focused on the development of linear mixed models for longitudinal image data. Recently, there have been some attempts to develop hierarchical geodesic models on diffeomorphisms for longitudinal shape analysis [15].

Specifically, the functional nonlinear mixed effects model (FNMEM) contains two major components: a random nonlinear association map for characterizing the dynamic association between image data and covariates, and a spatial-temporal process for capturing large subject variation across both spatial and temporal domains. Because of its greater flexibility, FNMEM is generally more interpretable and parsimonious, and the predictions obtained from FNMEM extend more reliably outside the observed range of the data. We explicitly incorporate spatial-temporal smoothness into our estimation procedure in order to accurately estimate the nonlinear association map and the spatial-temporal covariance operator. We also propose a global test statistic for testing the association map and construct its asymptotic simultaneous confidence band.

2 Method

2.1 Functional Nonlinear Mixed Effects Model

A functional nonlinear mixed effects model consists of two major components. The first one is a pointwise nonlinear mixed effects model given by

$\begin{aligned} y_{ij}(s)=f(\phi _i(s),\mathbf {x}_{ij})+\varepsilon _{ij}(s)~~~\text{ for }~~i=1, \ldots , n, \end{aligned}$

(1)

where $f(\cdot ,\cdot )$ is a real-valued, differentiable nonlinear association map, $\phi _i(s)$ is a $p\times 1$ vector of subject-specific functions, $\mathbf {x}_{ij}$ is a p-dimensional covariate of interest, and $\varepsilon _{ij}(s)$ is the corresponding random error process. It is assumed that f has a continuous second-order derivative with respect to $\phi _i(s)$. For image data, it is typical that, after normalization, the $y_{ij}(s)$ are measured at the same locations for all subjects and exhibit both within-curve and between-curve dependence structures. Thus, without loss of generality, it is assumed that the $y_{ij}(s)$ are observed on the same M grid points $\mathcal {S}_0=[0, 1]=\{s_m, 0=s_1\le s_2\le \cdots \le s_M=1\}$ for all subjects and time points.

The second one is a spatial-temporal process for modeling large variations across subject-specific functions $\phi _i(s)$ . Specifically, $\phi _i(s)$ is modeled as

$\begin{aligned} \phi _i(s)=\beta (s)+b_i(s), \end{aligned}$

(2)

where $\beta (\cdot )=(\beta _1(\cdot ),\cdots ,\beta _p(\cdot ))^T$ is a $p\times 1$ vector of fixed effect functions and $b_i(s)=(b_{i1}(s),\cdots ,b_{ip}(s))^T$ is a $p\times 1$ vector of random effect functions. In addition, $\{b_i(s)\}$ and $\{\varepsilon _{ij}(s)\}$ are independent and identically distributed copies of SP(0, $\varSigma _b(s,t)$ ) and SP(0, $\sigma _{\varepsilon }^2(s)1(s=t)$ ), respectively, where SP( $\mu (s)$ , $\varSigma (s,t)$ ) denotes a stochastic process (e.g., a Gaussian process) with mean function $\mu (s)$ and covariance function $\varSigma (s,t)$.

2.2 An Example

Recently, nonlinear mixed effects models based on the Gompertz function have been used to characterize longitudinal white matter development during early childhood [3, 9, 14]. The Gompertz function can be written as

$y=f(\phi , t)=\mathbf {asymptote}~\exp (-\mathbf {delay}~\exp (-\mathbf {speed}~t)) =\phi _1\exp \{-\phi _2\phi _3^t\},$

where $\phi _1$ is the asymptote, $\phi _2$ is the delay, and $\phi _3$ is $\exp (-\mathrm {speed})$. Specifically, in [14], a nonlinear mixed effects model based on the Gompertz function is given by

$\begin{aligned} y_{ij}=\phi _{1i}\exp \{-\phi _{2i}\phi _{3i}^{t_{ij}}\}+\varepsilon _{ij}~~\text{ and }~~\phi _{i}=(\phi _{1i}, \phi _{2i}, \phi _{3i})^T=\beta +b_i, \end{aligned}$

(3)

where $\beta =(\beta _1, \beta _2, \beta _3)^T$ are fixed effects and $b_i$ are random effects. For image data, an extension of model (3) is to consider a FNMEM as

$\begin{aligned} y_{ij}(s)=\phi _{1i}(s)\exp \{-\phi _{2i}(s)\phi _{3i}(s)^{t_{ij}}\}+\varepsilon _{ij}(s)~~\text{ and }~~ \phi _i(s)=\beta (s)+b_i(s). \end{aligned}$

(4)

We will use model (4) to characterize the spatial-temporal dynamics of white-matter fiber tracts.
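To make the structure of model (4) concrete, the following is a minimal simulation sketch, not the authors' code. The fixed effect functions, the single sine basis function used for the random effects, the noise level, and the names `gompertz` and `simulate_fnmem` are all illustrative assumptions; only the form $y_{ij}(s)=f(\beta (s)+b_i(s), t_{ij})+\varepsilon _{ij}(s)$ comes from the text.

```python
import numpy as np

def gompertz(phi, t):
    """Gompertz map f(phi, t) = phi1 * exp(-phi2 * phi3**t).

    phi has shape (..., 3); t is a scalar or an array that broadcasts
    against phi[..., 0].
    """
    phi1, phi2, phi3 = phi[..., 0], phi[..., 1], phi[..., 2]
    return phi1 * np.exp(-phi2 * phi3 ** t)

def simulate_fnmem(n=20, M=50, times=(0.0, 1.0, 2.0), noise_sd=0.01, seed=0):
    """Draw y_ij(s_m) from model (4) on a common grid (hypothetical values)."""
    rng = np.random.default_rng(seed)
    s = np.linspace(0.0, 1.0, M)                       # grid points s_1..s_M
    # Hypothetical fixed effect functions beta(s), shape (M, 3).
    beta = np.stack([0.6 + 0.1 * s,                    # asymptote
                     1.0 + 0.5 * np.sin(np.pi * s),    # delay
                     0.5 + 0.0 * s], axis=1)           # exp(-speed)
    # Smooth random effects: b_i(s) = xi_i * sqrt(2) * sin(pi * s) (assumed form).
    xi = rng.normal(scale=[0.02, 0.05, 0.01], size=(n, 3))
    basis = np.sqrt(2.0) * np.sin(np.pi * s)
    b = xi[:, None, :] * basis[None, :, None]          # shape (n, M, 3)
    phi = beta[None, :, :] + b                         # phi_i(s) = beta(s) + b_i(s)
    t = np.asarray(times, dtype=float)
    # Mean surface with shape (n, T, M): subject, time point, grid point.
    mean = gompertz(phi[:, None, :, :], t[None, :, None])
    y = mean + rng.normal(scale=noise_sd, size=mean.shape)
    return s, t, beta, b, y
```

Calling `simulate_fnmem()` returns the grid, the time points, $\beta (s_m)$, the $b_i(s_m)$, and an array of observations indexed by subject, time point, and grid point.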

2.3 Estimation Procedure

The next interesting question is how to estimate the fixed effect and random effect functions of FNMEM. It should be noted that the estimation procedures used in [6, 17, 21] are not directly applicable here due to the nonlinear association map in (1).

Estimating the Fixed Effect Functions. At each grid point $s_m\in \mathcal {S}_0$ , we treat model (1) as a traditional nonlinear mixed effects model of the form

$\begin{aligned} y_{ij}(s_m)=f(\beta (s_m)+b_i(s_m),\mathbf {x}_{ij})+\varepsilon _{ij}(s_m), \end{aligned}$

(5)

where $b_i(s_m)\sim N(0,~\varSigma _b(s_m,s_m))$ and $\varepsilon _{ij}(s_m)\sim N(0,~\sigma _{\varepsilon }^2(s_m))$. Then, we calculate the maximum likelihood estimator of $\beta (s_m)$ , denoted by $\hat{\beta }(s_m)$ , across all $s_m$. Define $K_h(\cdot )=K(\cdot /h)/h$ as the kernel function with bandwidth $h$ , where K is the Epanechnikov kernel, and $\tilde{K}_{h}(s_m-s)= {K_{h}(s_m-s)}/\{\sum _{m'=1}^MK_{h}(s_{m'}-s)\}.$ We calculate a kernel estimator of $\beta (s)$ as:

$\begin{aligned} \tilde{\beta }(s)=\sum _{m=1}^M\tilde{K}_{h_1}(s_m-s)\hat{\beta }(s_m)~~\text{ for } \text{ all }~~ s\in \mathcal {S}. \end{aligned}$

(6)

The bandwidth $\hat{h}_1$ is selected using a leave-one-out cross-validation method.
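The smoothing step in (6) can be sketched as follows, assuming the usual rescaling $K_h(u)=K(u/h)/h$ and a cross-validation scheme that leaves out one grid point at a time. The text does not state what is left out in the cross-validation, so that choice, like the function names `epanechnikov`, `kernel_smooth`, and `loo_cv_bandwidth`, is an assumption made for illustration.

```python
import numpy as np

def epanechnikov(u):
    """Epanechnikov kernel K(u) = 0.75 * (1 - u^2) for |u| <= 1, else 0."""
    return np.where(np.abs(u) <= 1.0, 0.75 * (1.0 - u ** 2), 0.0)

def kernel_smooth(s_grid, beta_hat, s_eval, h):
    """Normalized-kernel estimate of beta(s), following the form of (6).

    s_grid: (M,) grid points; beta_hat: (M, p) pointwise estimates;
    s_eval: (Q,) evaluation points. Returns an array of shape (Q, p).
    h must be large enough that every s_eval has a grid point within h.
    """
    w = epanechnikov((s_grid[None, :] - s_eval[:, None]) / h) / h  # K_h(s_m - s)
    w = w / w.sum(axis=1, keepdims=True)                            # normalized weights
    return w @ beta_hat

def loo_cv_bandwidth(s_grid, beta_hat, candidates):
    """Return the candidate h with the smallest leave-one-grid-point-out error."""
    best_h, best_err = None, np.inf
    for h in candidates:
        w = epanechnikov((s_grid[None, :] - s_grid[:, None]) / h)
        np.fill_diagonal(w, 0.0)                  # drop the m-th point itself
        total = w.sum(axis=1, keepdims=True)
        if np.any(total == 0.0):                  # h too small: no neighbours left
            continue
        err = np.mean(((w / total) @ beta_hat - beta_hat) ** 2)
        if err < best_err:
            best_h, best_err = h, err
    return best_h
```

Because the weights are normalized to sum to one, the factor $1/h$ in $K_h$ cancels; it is kept in `kernel_smooth` only to mirror the notation.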

Estimating the Covariance Operators. Under certain smoothness conditions on $b_i(s)$, we use a local linear regression technique to estimate all $b_i(s)$. Specifically, by a Taylor expansion of $b_i(s_m)$ around $s$, we have

$b_i(s_m)\approx b_i(s)+\dot{b}_i(s)(s_m-s)=B_i(s)Z(s_m-s),$

where $B_i(s)=(b_i(s),\dot{b}_i(s))$ is a $p\times 2$ matrix and $Z(s_m-s)=(1,(s_m-s))^T$ is a $2\times 1$ vector, in which $\dot{b}_i(s)=(\dot{b}_{i1}(s),\cdots ,\dot{b}_{ip}(s))^T$ and $\dot{b}_{il}(s)=\partial b_{il}(s)/\partial s$ for $l=1,\cdots ,p$. For each i and s, we estimate $B_i(s)$ by minimizing the weighted nonlinear least squares criterion [19]:

$S_M(B_i(s))\mathop {=}\limits ^\mathrm{def}\sum _{j=1}^{n_i}\sum _{m=1}^M\left\{ y_{ij}(s_m) -f(\hat{\beta }(s_m)+B_i(s)Z(s_m-s),\mathbf {x}_{ij})\right\} ^2K_{h_2}(s_m-s).$
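As an illustration, the criterion $S_M(B_i(s))$ can be evaluated for one subject as in the sketch below. It assumes that f is the Gompertz map of Sect. 2.2 with $\mathbf {x}_{ij}=t_{ij}$ and that $K_h(u)=K(u/h)/h$; the function names and array layouts are hypothetical. The sketch only evaluates the objective, so the minimization itself would still require an iterative optimizer.

```python
import numpy as np

def epanechnikov(u):
    """Epanechnikov kernel K(u) = 0.75 * (1 - u^2) for |u| <= 1, else 0."""
    return np.where(np.abs(u) <= 1.0, 0.75 * (1.0 - u ** 2), 0.0)

def gompertz(phi, t):
    """Gompertz map phi1 * exp(-phi2 * phi3**t); phi has shape (..., 3)."""
    return phi[..., 0] * np.exp(-phi[..., 1] * phi[..., 2] ** t)

def local_objective(B, s, s_grid, beta_hat, y_i, t_i, h):
    """Weighted nonlinear least squares criterion S_M(B_i(s)) for one subject.

    B: (p, 2) matrix whose columns are b_i(s) and its derivative at s;
    s_grid: (M,) grid points; beta_hat: (M, p) fixed effect estimates;
    y_i: (n_i, M) observations; t_i: (n_i,) time points; h: bandwidth.
    """
    d = s_grid - s                                    # s_m - s, shape (M,)
    Z = np.stack([np.ones_like(d), d], axis=0)        # (2, M); column m is Z(s_m - s)
    phi = beta_hat + (B @ Z).T                        # local linear phi_i(s_m), (M, p)
    fitted = gompertz(phi[None, :, :], t_i[:, None])  # (n_i, M)
    k = epanechnikov(d / h) / h                       # K_h(s_m - s)
    return float(np.sum((y_i - fitted) ** 2 * k[None, :]))
```

When $b_i$ is exactly linear in s and the data are noise-free, the local linear expansion is exact, so the criterion is zero at the true $B_i(s)$; this gives a simple sanity check for any implementation.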

The optimal bandwidth $\hat{h}_2$ is selected using a leave-one-out cross-validation method, and an iterative algorithm is proposed to obtain the estimators. Finally, letting $\tilde{b}_i(s)$ denote the resulting estimator of $b_i(s)$ and $N=\sum _{i=1}^nn_i$ , we estimate $\varSigma _b(s,t)$ by

$\hat{\varSigma }_b(s,t)=N^{-1}\sum _{i=1}^nn_i\tilde{b}_i(s)\tilde{b}_i(t)^T.$
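On the grid, this weighted empirical covariance is a direct sum over subjects. The sketch below assumes the estimated random effects are stored as an array of shape (n, M, p); the function name and the array layout are illustrative choices, not part of the original method description.

```python
import numpy as np

def empirical_covariance(b_tilde, n_obs):
    """Weighted empirical covariance N^{-1} sum_i n_i b_i(s_m) b_i(s_l)^T.

    b_tilde: (n, M, p) estimated random effects on the grid;
    n_obs: (n,) number of time points n_i for each subject.
    Returns an array of shape (M, M, p, p) indexed by (s_m, s_l, row, column).
    """
    n_obs = np.asarray(n_obs, dtype=float)
    N = n_obs.sum()
    # Sum over subjects i of n_i * outer(b_i(s_m), b_i(s_l)).
    return np.einsum("i,imp,ilq->mlpq", n_obs, b_tilde, b_tilde) / N
```

The result satisfies $\hat{\varSigma }_b(s,t)=\hat{\varSigma }_b(t,s)^T$ by construction, which is what the spectral decomposition in the next step relies on.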

Functional Principal Component Analysis. With the empirical covariance $\hat{\varSigma }_b(s,t)$ , we follow [13] and calculate the spectral decomposition as

$\hat{\varSigma }_{b}(s,t)=\sum _{k=1}^{\infty }\hat{\lambda }_{k}\hat{\psi }_{k}(s)\hat{\psi }_{k}(t)^{T},$

where $\hat{\lambda }_{k}$ are the estimated eigenvalues and $\hat{\psi }_k(s)$ are the corresponding estimated eigenfunctions. Moreover, the k-th functional principal component scores can be computed by $\hat{\xi }_{ik}=\sum _{m=1}^M\tilde{b}_i(s_m)^T\hat{\psi }_k(s_m)(s_m-s_{m-1})$ for $i=1,\cdots ,n$.
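A discretized version of this decomposition can be sketched as below. It assumes a uniform grid with constant spacing `delta` (so that integrals become Riemann sums) and stacks the p components at each grid point into a single index; these simplifications and the function name `fpca` are assumptions for illustration and may differ from the procedure in [13].

```python
import numpy as np

def fpca(sigma, b_tilde, delta, n_components=2):
    """Discretized FPCA of a covariance array on a uniform grid.

    sigma: (M, M, p, p) covariance values; b_tilde: (n, M, p) random effects;
    delta: constant grid spacing s_m - s_{m-1}.
    Returns eigenvalues (K,), eigenfunctions (K, M, p), and scores (n, K).
    """
    M, _, p, _ = sigma.shape
    # Merge (grid point, component) into one index so the operator is a matrix.
    C = sigma.transpose(0, 2, 1, 3).reshape(M * p, M * p)
    evals, evecs = np.linalg.eigh(C * delta)          # Riemann-sum quadrature
    order = np.argsort(evals)[::-1][:n_components]    # largest eigenvalues first
    lam = evals[order]
    # Rescale so that sum_m |psi_k(s_m)|^2 * delta = 1.
    psi = (evecs[:, order] / np.sqrt(delta)).T.reshape(n_components, M, p)
    # Scores: xi_ik = sum_m b_i(s_m)^T psi_k(s_m) * delta.
    scores = np.einsum("imp,kmp->ik", b_tilde, psi) * delta
    return lam, psi, scores
```

Eigenfunctions are determined only up to sign, so the scores returned by such a routine may differ in sign from those of another implementation.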