Longitudinal Registration with Group-Wise Similarity Prior

, the LDDMM energy functional is defined as:



$$\begin{aligned} E(v, I_0, I_1) = \int _{0}^{1}\Vert v\Vert _Vdt + \Vert I_0 \circ \phi _1^{-1} - I_1\Vert _{L_2} \end{aligned}$$

(1)
where $$v \in L^2([0,1], V)$$ is a time dependent velocity field drawn from the reproducing kernel Hilbert space (RKHS) V. The RKHS is specified by the choice of kernel K, and the inner product in V is then given by $$\langle K^{-1}u, u\rangle _{L_2}$$ for any $$u \in V$$. The transformation $$\phi (t, x)$$ is given by the flow of the velocity v(tx) through the ODE: $$(d/dt)\phi (t, x) = v(t, \phi (t, x))$$, with initial condition $$\phi (0, x) = x$$. v(tx) and $$\phi (t, x)$$ will be written as $$v_t(x)$$ and $$\phi _t(x)$$. The minimizer of (1) is considered the optimal $$\phi $$ for the registration of $$I_0$$ and $$I_1$$.

The second term on the right hand side of (1) is a quantitative assessment of the similarity between the images $$I_0 \circ \phi _1^{-1}$$ and $$I_1$$, whereas the first term is the geodesic energy of the flow of $$v_t(x)$$. For suitable choices of K, $$\phi _t(x)$$ is always a diffeomorphism [10]; hence, (1) defines $$\phi _1$$ to be the transformation that best matches $$I_0$$ and $$I_1$$ such that $$\phi _t$$ is a geodesic in a space of diffeomorphisms specified by the choice of K. As $$\phi _t$$ is a geodesic when $$E(v, I_0, I_1)$$ is optimal, $$E(v, I_0, I_1)$$ defines a metric distance $$d(Id, \phi _1)^2$$ in the space of diffeomorphisms. This can also be considered a metric $$d(I_0, I_1)^2$$ on the orbit given by the group action of the space of diffeomorphisms on the template image $$I_0$$.

Background, Geodesic Shooting Algorithm: Several approaches to optimizing (1) have been proposed. In this paper we use the geodesic shooting approach [6, 12], which we now review. The kernel K can also be considered a mapping between $$V^*$$, the space of linear functionals on V, and V itself. Note that $$V^*$$ is also a Hilbert space. An Element of $$V^*$$ is called a momentum. Hence for any momentum $$m \in V^*$$ there is some $$v \in V$$ such that $$Km = v$$ and $$K^{-1}v = m$$.

An optimal solution to (1) specifies a geodesic, which is uniquely determined by its initial velocity $$v_0(x)$$, or equivalently, its initial momentum $$m_0(x)$$. $$m_t$$ for all t, and hence $$v_t$$ and $$\phi _t$$, can then be determined by solving the co-adjoint equation [6]: $$(\partial /\partial t)m_t = -ad_V^*m_t = -(Dv)^Tm_t - Dm_tv - div(v)m_t$$, where D denotes the Jacobian operator and div(.) the divergence operator. If the initial momentum is assumed to be proportional to the template image gradient, that is $$m_0(x) = p_0(x)\nabla I_0(x)$$ for some scalar field $$p_0$$, the adjoint equation can be separated into a disjoint system of differential equations for $$I_{0,t}$$ and $$p_t$$ respectively [12], where $$I_{0,t} = I_0 \circ \phi ^{-1}_t$$. Considering these equations and the gradient of (1) with respect to $$v_t$$, we arrive at a system of partial differential equations that completely specifies $$\phi _t$$ given initial conditions $$I_0$$ and $$p_0$$ ($$\star $$ denotes convolution):


$$\begin{aligned} {\left\{ \begin{array}{ll} &{}(\partial /\partial t)p + \nabla \cdot (pv) = 0\\ &{}(\partial /\partial t)I + \nabla I \cdot v = 0\\ &{}(\partial /\partial t)v + K \star \nabla Ip = 0 \end{array}\right. } \end{aligned}$$

(2)
With this in mind, (1) is replaced with a functional of the initial momentum exclusively:


$$\begin{aligned} \mathcal {E}(p_0, I_0, I_1) = \langle p_0\nabla I_0, K \star p_0\nabla I_0 \rangle _{L_2} +\Vert I_{0,1} - I_1\Vert ^2_{L_2} \end{aligned}$$

(3)
and optimization proceeds within $$V^*$$ only. In order to optimize (3) by gradient descent, we need the gradient of (3) with respect to $$p_0$$, subject to the geodesic shooting constraints (2). This naturally gives way to an optimal control problem. Time dependent Lagrange multipliers $$\hat{p}_t$$, $$\hat{I}_{0,t}$$, and $$\hat{v}_t$$ enable us to write an augmented functional for (3) incorporating the constraints (2):


$$\begin{aligned} \begin{aligned} \tilde{\mathcal {E}}(p_0, I_0, I_1) = \mathcal {E} + \int _{0}^{1} \langle \hat{p}_t, (\partial /\partial t)p + \nabla \cdot (pv) \rangle dt\;\;+\\ \int _{0}^{1} \langle \hat{I}_{0,t}, (\partial /\partial t)I + \nabla I \cdot v \rangle dt\;\;+\\ \int _{0}^{1} \langle \hat{v}_t, (\partial /\partial t)v + K \star \nabla Ip \rangle dt \end{aligned} \end{aligned}$$

(4)
The first variation of (4) gives the gradient of (3) subject to (2):


$$\begin{aligned} \nabla _{p_0}\mathcal {E} = \nabla I_0 \cdot K\star p_0\nabla I_0 - \hat{p}_0 \end{aligned}$$

(5)
where $$\hat{p}_0$$ is specified by a system of partial differential equations solved backward in time termed the adjoint system:


$$\begin{aligned} {\left\{ \begin{array}{ll} &{}(\partial /\partial t)\hat{p} + \nabla \hat{p} \cdot v - \nabla I \cdot K \star \hat{v} = 0\\ &{}(\partial /\partial t)\hat{I} + \nabla \cdot (Iv) + \nabla \cdot p K \star \hat{v} = 0\\ &{}(\partial /\partial t)\hat{v} + \hat{I}\nabla I -p \nabla \hat{p} = 0 \end{array}\right. } \end{aligned}$$

(6)
with initial conditions $$\hat{I}_1 = I_1 - I_{0,1}$$ and $$\hat{p}_1 = 0$$. The gradient descent proceeds by solving the system (2) forward in time to acquire $$p_t$$, $$I_{0,1}$$, and $$v_t$$ for a sufficiently dense sampling of $$t \in [0,1]$$, then solving (6) backward in time to acquire $$\hat{p}_0$$. $$p_0$$ is then updated with (5), and the process is repeated until convergence.

Group-wise Similarity Prior: We consider the case where we are given N longitudinal image pairs $$I^i_0, I^i_1 \in L^2(\Omega , \mathbb {R})$$, $$i \in [1,2,...,N]$$, all taken approximately the same time interval apart. We take $$\Omega $$ to be the unit cube with periodic boundary conditions, and the time interval to be [0, 1]. Additionally, we are given N transformations $$\psi ^i$$ mapping the initial images $$I^i_0$$ to a Minimal Deformation Template (MDT) coordinate system, that is, $$I^k_0 \circ \psi ^k \sim I^j_0 \circ \psi ^j$$ for all k and j. To consider all N registrations simultaneously with no modification to the geodesic shooting approach, we could write $$\tilde{\mathcal {E}}_{tot} = \sum _{i=1}^{N} \tilde{\mathcal {E}}_i$$ where $$\tilde{\mathcal {E}}_i$$ is eq. (3) for the ith image pair. The first variation of $$\tilde{\mathcal {E}}_{tot}$$ with respect to an initial momentum $$p^i_0$$ will only include terms for the ith pair, that is, the N transformations are decoupled. However, we would like the N transformations to explore the space of diffeomorphisms as a group. We couple them by considering equations of the form:


$$\begin{aligned} \tilde{\mathcal {E}}_{tot} = \alpha \mathcal {G} (p_1,p_2,...,p_N)\; + \; \sum _{i=1}^{N} \tilde{\mathcal {E}}_i \end{aligned}$$

(7)
$$\mathcal {G}(.)$$ is intended to enforce some criteria that we may think all $$p^i_0$$ must satisfy. In this paper, we consider longitudinal studies where all N image pairs come from patients in the same diagnostic group, where a predictable distribution of volume change is known to occur. Because $$V^*$$ is a Hilbert space, we can calculate statistical moments in this space in an ordinary manner, being careful to spatially normalize the $$p^i_0$$ to a MDT coordinate system using coadjoint transport [15]. First, let $$p^{mdt,i}_0 = |D\psi ^i|p^i_0 \circ \psi ^i$$, be the ith initial momentum in the MDT coordinate system. Let $$p^{mdt,avg}_0 = (1/N)\sum _{i=1}^{N} p^{mdt,i}_0$$ be the sample average initial momentum in MDT coordinates. Let $$p^{mdt,cen,i}_0 = p^{mdt,i}_0 - p^{mdt,avg}_0$$ be the mean centered initial momentum for image pair i in MDT coordinates, and let $$A = [p^{mdt,cen,1}_0, p^{mdt,cen,2}_0,...,p^{mdt,cen,N}_0]^T$$ be the mean centered design matrix for all initial momenta in MDT coordinates. We take $$\mathcal {G}(.)$$ to be:


$$\begin{aligned} \mathcal {G}(p_1,p_2,...,p_N) = Trace(AA^T) = \sum _{i=1}^{N} \Vert p^{mdt,i}_0 - p^{mdt,avg}_0\Vert _{L_2}^2 \end{aligned}$$

(8)
the trace of the sample inner-product matrix for $$p_0$$.

First we consider the rightmost form of (8). We see that this term maintains a group-wise average of the initial momentum, and requires that all momenta be close to this average. This is similar to hierarchical latent variable models that maintain a group-wise representation of the data and constrain updates to predictions to be similar to this representation.

Only gold members can continue reading. Log In or Register to continue

Stay updated, free articles. Join our Telegram channel

Sep 16, 2016 | Posted by in GENERAL RADIOLOGY | Comments Off on Longitudinal Registration with Group-Wise Similarity Prior

Full access? Get Clinical Tree

Get Clinical Tree app for offline access