Spaces

Fig. 1

Examples of shapes (Taken from the MPEG-7 shape database)

2 Background

During the past decades, several essential contributions have been made, using rigorous mathematical concepts and methods, to address this problem and others of similar nature. This collection of efforts has progressively defined a new discipline that can be called mathematical shape theory.

Probably, the first milestone in the development of the theory is Kendall’s construction of a space of shapes, defined as a quotient of the space of disjoint points in $\mathbb{R}^{d}$ by the action of translation, rotation, and scaling [40]. Kendall’s theory has been the starting point of a huge literature [15, 41, 64] and allowed for new approaches for studying datasets in which the group of similitudes was a nuisance factor (for such data as human skulls, prehistoric jewelry, etc.). One can argue that, as a candidate for a shape space, Kendall’s model suffers from two main limitations. First, it relies on the representation of a shape by a finite number of labeled points, or landmarks. These landmarks need to have been identified on each shape, and shapes with different numbers of landmarks belong to different spaces. From a practical point of view, landmarks are most of the time manually selected, the indexation of large datasets being time consuming and prone to user-dependent errors. The second limitation is that the metric on shapes is obtained by quotienting out the standard Euclidean metric on point sets, using a standard “Riemannian submersion” process that we will discuss later in this chapter. The Euclidean metric ignores a desirable property of shape comparison, which states that shapes that are smooth deformations of one another should be considered more similar than those for which the points in correspondence are randomly displaced, even if the total point displacement is the same.

This important issue, related to smoothness, was partially addressed by another important contribution to the theory, which is Bookstein’s use of the thin plate splines originally developed by Duchon and Meinguet [10, 16, 51]. Splines interpolate between landmark displacements to obtain a smooth, dense, displacement field (or vector field). It can be addressed with the generic point of view of reproducing kernel Hilbert spaces [7, 77], which will also be reviewed later in this chapter.

This work had a tremendous influence on shape analysis based on landmarks, in particular for medical studies. It suffers, however, from two major drawbacks. The first one is that the interpolated displacement can be ambiguous, with several points moved to the same position. This is an important limitation, since inferring unobserved correspondences is one of the objectives of this method. The second drawback, in relation with the subject of this chapter, is that the linear construction associated to splines fails to provide a metric structure on the nonlinear space of shapes. The spline deformation energy provides in fact a first-order approximation of a nonconstant Riemannian metric on point sets, which provides an interesting version of a manifold of landmarks, as introduced in [11, 36, 72].

After point sets, plane curves are certainly the shape representation in which most significant advances have been observed over the last few years. Several important metrics have been discussed in publications like [37, 42, 43, 52, 79–81]. They have been cataloged, among many other metrics, in a quasiencyclopedic effort by D. Mumford and P. Michor [55]. We will return to some of these metrics in section “Spaces of Plane Curves.”

Grenander’s theory of deformable templates [27] is another seminal work for shape spaces. In a nutshell, Grenander’s basic idea, which can be traced back to D’Arcy Thompson’s work on biological shapes at the beginning of the last century [65], is to introduce suitable group actions as generative engines for visual object models, with the natural use of the group of diffeomorphisms for shapes. While the first developments in this context use linear approximations of diffeomorphisms [2, 28, 29], a first computational breakthrough in the nonlinear estimation of diffeomorphisms was provided in [12] with the introduction of flows associated to ordinary differential equations. This idea was further developed in a fully metric approach to diffeomorphisms and shape spaces, in a framework that was introduced in [17, 66, 67] and further developed in [8, 11, 35, 36, 57, 58]. The approach also led to important developments in medical imaging, notably via the establishment of a new discipline, called computational anatomy, dedicated to the study of datasets of anatomical shapes [30, 60, 61, 78].

3 Mathematical Modeling and Analysis

Some Notation

The following notation will be used in this chapter. The Euclidean norm of vectors $a \in \mathbb{R}^{d}$ will be denoted using single bars and the dot product between a and b as $a \cdot b$ or explicitly as $a^{T}b$, where $a^{T}$ is the transpose of a. So

$\displaystyle{\vert a\vert ^{2} = a\ \cdot \ a = a^{T}a}$

for $a \in \mathbb{R}^{d}$ .

Other norms (either Riemannian metrics or norms on infinite-dimensional spaces) will be denoted with double bars, generally with a subscript indicating the corresponding space, or relevant point in the manifold. We will use angle brackets for the corresponding inner product, with the same index, so that, for a Hilbert space V, the notation for the inner product between v and w in V will be $\left \langle v\,,\,w\right \rangle _{V }$ with

$\displaystyle{\|v\|_{V }^{2} = \left \langle v\,,\,v\right \rangle _{ V }.}$

When f is a function that depends on a variable t, its derivative with respect to t computed at some point $t_{0}$ will be denoted either $\partial _{t}f(t_{0})$ or $\dot{f}_{t}(t_{0})$ , depending on which form gives the most readable formula. Primes are never used to denote derivatives, that is, $f^{\prime}$ is not the derivative of f, but just another function. The differential at x of a function of several variables F is denoted DF(x). If F is scalar valued, its gradient is denoted ∇F(x). The divergence of a vector field $v: \mathbb{R}^{d} \rightarrow \mathbb{R}^{d}$ is denoted $\nabla \cdot v$ .

If M is a differential manifold, the tangent space to M at x ∈ M will be denoted $T_{x}M$ and its cotangent space (dual of the former) $T_{x}^{{\ast}}M$ . The tangent bundle (disjoint union of the tangent spaces) is denoted TM and the cotangent bundle $T^{{\ast}}M$ .

When μ is a linear form on a vector space V (i.e., a scalar-valued linear transformation), the natural pairing between μ and v ∈ V will be denoted $\left (\mu \vert \,v\right )$ , that is,

$\displaystyle{\left (\mu \vert \,v\right ) =\mu (v).}$

A Riemannian Manifold of Deformable Landmarks

Interpolating Splines and RKHSs

Let us start with some preliminary facts on Hilbert spaces of functions or vector fields and their relation with interpolating splines. A Hilbert space is a possibly infinite-dimensional vector space equipped with an inner product which induces a complete topology. Letting V be such a space, with norm and inner product, respectively, denoted $\|\,\cdot \,\|_{V }$ and $\left \langle \cdot \,,\,\cdot \right \rangle _{V }$ , a linear form on V is a continuous linear transformation $\mu: V \rightarrow \mathbb{R}$ . The set of such transformations is called the dual space of V and denoted $V ^{{\ast}}$ . An element μ in $V ^{{\ast}}$ being continuous by definition, there exists a constant C such that

$\displaystyle{\forall v \in V,\mu (v) \leq C\|v\|_{V }.}$

The smallest number C for which this assertion is true is called the operator norm of μ and denoted $\|\mu \|_{V ^{{\ast}}}$ .

Instead of μ(v) as above, the notation $\left (\mu \vert \,v\right )$ will be used to represent the result of μ applied to v. The Riesz representation theorem implies that $V ^{{\ast}}$ is in one-to-one correspondence with V, so that for any μ in $V ^{{\ast}}$ , there exists a unique element $v = K_{V }\mu \in V$ such that, for any w ∈ V,

$\displaystyle{\left (\mu \vert \,w\right ) = \left \langle K_{V }\mu \,,\,w\right \rangle _{V };}$

$K_{V }$ and its inverse $L_{V } = K_{V }^{-1}$ are called the duality operators of V. They provide an isometric identification between V and $V ^{{\ast}}$ , with, in particular, $\|\mu \|_{V ^{{\ast}}}^{2} = \left (\mu \vert \,K_{V }\mu \right ) =\| K_{V }\mu \|_{V }^{2}$ .

Of particular interest is the case when V is a space of vector fields in d dimensions, that is, of functions $v: \mathbb{R}^{d} \rightarrow \mathbb{R}^{d}$ (or $v: \Omega \rightarrow \mathbb{R}^{d}$ where $\Omega$ is an open subset of $\mathbb{R}^{d}$ ), and when the norm in V is such that the evaluation functionals $a \otimes \delta _{x}$ belong to $V ^{{\ast}}$ for any $a,x \in \mathbb{R}^{d}$ , where

$\displaystyle{ \left (a \otimes \delta _{x}\vert \,v\right ) = a^{T}v(x),v \in V. }$

(1)

In this case, the vector field $K_{V }(a \otimes \delta _{x})$ is well defined and linear in a. One can define the matrix-valued function $(y,x)\mapsto \tilde{K}_{V }(y,x)$ by

$\displaystyle{\tilde{K}_{V }(y,x)a = (K_{V }(a \otimes \delta _{x}))(y);}$

$\tilde{K}_{V }$ is the kernel of the space V. In the following, we will write $K_{V }(x,y)$ instead of $\tilde{K}_{V }(x,y)$ , with the customary abuse of notation of identifying the kernel and the operator that it defines.

One can easily deduce from its definition that $K_{V }$ satisfies the reproducing property

$\displaystyle{\forall a,b \in \mathbb{R}^{d},\left \langle K_{ V }(\cdot,x)a\,,\,K_{V }(\cdot,y)b\right \rangle _{V } = a^{T}K_{ V }(x,y)b,}$

which also implies the symmetry property $K_{V }(x,y) = K_{V }(y,x)^{T}$ . Unless otherwise specified, it will always be assumed that V is a space of vector fields that vanish at infinity, which implies the same property for the kernel (one variable tending to infinity and the other remaining fixed).

A space V as considered above is called a reproducing kernel Hilbert space (RKHS) of vector fields. Fixing such a space, one can consider the spline interpolation problem, which is to find v ∈ V with minimal norm such that $v(x_{i}) = c_{i}$ for i = 1, …, N, where $x_{1},\ldots,x_{N}$ are points in $\mathbb{R}^{d}$ and $c_{1},\ldots,c_{N}$ are d-dimensional vectors. It is quite easy to prove that the solution takes the form

$\displaystyle{ v(y) =\sum _{i=1}^{N}K_{V }(y,x_{i})\alpha _{i}, }$

(2)

where $\alpha _{1},\ldots,\alpha _{N}$ are identified by solving the dN-dimensional system

$\displaystyle{ \sum _{i=1}^{N}K_{ V }(x_{j},x_{i})\alpha _{i} = c_{j},\text{ for }j = 1,\ldots,N. }$

(3)

Let $S_{V }({\mathbf{x}})$ (where ${\mathbf{x}} = (x_{1},\ldots,x_{N})$ ) denote the dN by dN block matrix

$\displaystyle{S_{V }({\mathbf{x}}) = (K_{V }(x_{i},x_{j}))_{i,j=1,\ldots,N}.}$

Stacking $c_{1},\ldots,c_{N}$ and $\alpha _{1},\ldots,\alpha _{N}$ in dN-dimensional column vectors c and $\boldsymbol{\alpha }$ , one can show that, for the optimal v,

$\displaystyle{ \|v\|_{V }^{2} =\boldsymbol{\alpha } ^{T}S_{ V }({\mathbf{x}})\boldsymbol{\alpha } = \mathbf{c}^{T}S_{ V }({\mathbf{x}})^{-1}\mathbf{c}, }$

(4)

each term representing this spline deformation energy for the considered interpolation problem.
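The interpolation problem (2)–(4) can be sketched numerically as follows. This is only an illustration under stated assumptions: the kernel is taken to be the scalar Gaussian kernel $K_{V }(x,y) =\exp (-\vert x - y\vert ^{2}/2\sigma ^{2}){\mathrm{Id}}$ , so that the dN by dN system (3) decouples into d systems with the same N by N matrix, and the function names `gaussian_kernel_matrix`, `fit_spline`, and `evaluate_spline` are hypothetical.

```python
import numpy as np

def gaussian_kernel_matrix(x, y, sigma=1.0):
    """Scalar kernel values exp(-|x_i - y_j|^2 / (2 sigma^2)), shape (len(x), len(y))."""
    sq_dist = np.sum((x[:, None, :] - y[None, :, :]) ** 2, axis=-1)
    return np.exp(-sq_dist / (2.0 * sigma ** 2))

def fit_spline(x, c, sigma=1.0):
    """Solve system (3) for alpha and return it with the energy of Eq. (4).

    Assumes K_V = (scalar Gaussian) * Id, so each coordinate is solved
    with the same N x N matrix.
    """
    gram = gaussian_kernel_matrix(x, x, sigma)
    alpha = np.linalg.solve(gram, c)                  # alpha has shape (N, d)
    energy = float(np.sum(alpha * (gram @ alpha)))    # alpha^T S_V(x) alpha
    return alpha, energy

def evaluate_spline(y, x, alpha, sigma=1.0):
    """Evaluate v(y) = sum_i K_V(y, x_i) alpha_i, as in Eq. (2)."""
    return gaussian_kernel_matrix(y, x, sigma) @ alpha

# Three landmarks in the plane with prescribed vectors c_i.
x = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
c = np.array([[0.1, 0.0], [0.0, 0.2], [-0.1, 0.1]])
alpha, energy = fit_spline(x, c)
```

Evaluating `evaluate_spline(x, x, alpha)` recovers `c`, and `energy` agrees with `np.sum(c * alpha)`, which is the second expression $\mathbf{c}^{T}S_{V }({\mathbf{x}})^{-1}\mathbf{c}$ in (4).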

How one uses this interpolation method now depends on how one interprets the vector field v. One possibility is to consider it as a displacement field, in the sense that a particle at position x in space is moved to position x + v(x), therefore involving the space transformation $\varphi ^{v}:=\mathrm{ id} + v$ . In this view, the interpolation problem can be rephrased as finding the smoothest (in the V-norm sense) full space interpolation of given landmark displacements. The deformation energy in (4) can then be interpreted as some kind of “elastic” energy that evaluates the total stress involved in the transformation $\varphi ^{v}$ . This (with some variants, including allowing for some no-cost affine, or polynomial, transformations) is the framework of interpolation based on thin plates, or radial basis functions, as introduced in [3, 4, 9, 10, 18], for example. As discussed in the introduction, this approach does not lead to a nice mathematical notion of a shape space of landmarks; moreover, in the presence of large displacements, the interpolated transformation $\varphi ^{v}$ may fail to be one to one and therefore to provide a well-defined dense correspondence.

The other way to interpret v is as a velocity field, so that v(x) is the velocity of a particle at x at a given time. The interpolation problem is then to obtain a smooth velocity field given the velocities $c_{1},\ldots,c_{N}$ of particles $x_{1},\ldots,x_{N}$ . This point of view has the double advantage of providing a diffeomorphic displacement when the velocity field is integrated over time and allowing for the interpretation of the deformation energy as a kinetic energy, directly related to a Riemannian metric on the space of landmarks.

Riemannian Structure

Let $\mathit{Lmk}_{N}$ denote the submanifold of $\mathbb{R}^{dN}$ consisting of all ordered collections of N distinct points in $\mathbb{R}^{d}$ :

$\displaystyle{\mathit{Lmk}_{N} =\{ {\mathbf{x}} = (x_{1},\ldots,x_{N}) \in (\mathbb{R}^{d})^{N},x_{ i}\neq x_{j}\text{ if }i\neq j\}.}$

The tangent space to $\mathit{Lmk}_{N}$ at x can be identified with the space of all families of d-dimensional vectors $\mathbf{c} = (c_{1},\ldots,c_{N})$ , and one defines (with the same notation as in the previous section) the Riemannian metric on $\mathit{Lmk}_{N}$

$\displaystyle{\|\mathbf{c}\|_{{\mathbf{x}}}^{2} = \mathbf{c}^{T}S_{ V }({\mathbf{x}})^{-1}\mathbf{c}.}$

As already pointed out, $\|\mathbf{c}\|_{{\mathbf{x}}}^{2}$ is the minimum of $\|v\|_{V }^{2}$ among all v in V such that $v(x_{i}) = c_{i}$ , i = 1, …, N. This minimum is attained at

$\displaystyle{v^{\mathbf{c}}(\cdot ) =\sum _{ j=1}^{N}K_{V }(\cdot,x_{ j})\alpha _{j}}$

with $\boldsymbol{\alpha }= S_{V }({\mathbf{x}})^{-1}\mathbf{c}$ .

Now, given any differentiable curve $t\mapsto {\mathbf{x}}(t)$ in $\mathit{Lmk}_{N}$ , one can build an optimal time-dependent velocity field

$\displaystyle{v(t,\cdot ) = v^{\mathbf{c}(t)}(\cdot )}$

with $\mathbf{c} = \partial _{t}{\mathbf{x}}$ . One can then define the flow associated to this time-dependent velocity, namely, the time-dependent diffeomorphism $\varphi ^{v}$ , such that $\varphi ^{v}(0,x) = x$ and

$\displaystyle{\partial _{t}\varphi ^{v}(t,x) = v(t,\varphi ^{v}(t,x))}$

which is, by construction, such that $\varphi ^{v}(t,x_{i}(0)) = x_{i}(t)$ for i = 1, …, N. So, this construction provides a diffeomorphic extrapolation of any curve in $\mathit{Lmk}_{N}$ , which is optimal in the sense that its velocity has minimal V norm, given the induced constraints. The metric that has been defined on $\mathit{Lmk}_{N}$ is the projection of the V norm via the infinitesimal action of velocity fields on $\mathit{Lmk}_{N}$ , which is defined by

$\displaystyle{v \cdot (x_{1},\ldots,x_{N}) = (v(x_{1}),\ldots,v(x_{N})).}$

This concept will be extensively discussed later on in this chapter.

Geodesic Equation

Geodesics on $\mathit{Lmk}_{N}$ are curves that locally minimize the energy, that is, they are curves $t\mapsto {\mathbf{x}}(t)$ such that, for any t, there exists h > 0 such that

$\displaystyle{\int _{t-h}^{t+h}\|\dot{{\mathbf{x}}}_{ u}(u)\|_{{\mathbf{x}}(u)}^{2}du}$

is minimal over all possible curves in $\mathit{Lmk}_{N}$ that connect ${\mathbf{x}}(t - h)$ and ${\mathbf{x}}(t + h)$ . The geodesic, or Riemannian, distance between ${\mathbf{x}}_{0}$ and ${\mathbf{x}}_{1}$ is defined as the square root of the minimum of the geodesic energy

$\displaystyle{\int _{0}^{1}\|\dot{{\mathbf{x}}}_{ u}\|_{{\mathbf{x}}(u)}^{2}du}$

over all curves in $\mathit{Lmk}_{N}$ that connect ${\mathbf{x}}_{0}$ and ${\mathbf{x}}_{1}$ .

Geodesics are characterized by a second-order equation, called the geodesic equation. If one denotes $G_{V }({\mathbf{x}}) = S_{V }({\mathbf{x}})^{-1}$ , with coefficients $g_{(k,i),(l,j)}$ for k, l = 1, …, N and i, j = 1, …, d, the classical expression of this equation is

$\displaystyle{\ddot{x}_{k,i} +\sum _{ l,l^{\prime}=1}^{N}\sum _{ j,j^{\prime}=1}^{d}\Gamma _{ (l,j),(l^{\prime},j^{\prime})}^{(k,i)}\dot{x}_{ l,j}\dot{x}_{l^{\prime},j^{\prime}} = 0,}$

where $\Gamma _{(l,j),(l^{\prime},j^{\prime})}^{(k,i)}$ are the Christoffel symbols, given by

$\displaystyle{\Gamma _{(l,j),(l^{\prime},j^{\prime})}^{(k,i)} = \frac{1} {2}\sum _{m=1}^{N}\sum _{p=1}^{d}s_{(k,i),(m,p)}\Big(\partial _{x_{l^{\prime},j^{\prime}}}g_{(m,p),(l,j)} + \partial _{x_{l,j}}g_{(m,p),(l^{\prime},j^{\prime})} - \partial _{x_{m,p}}g_{(l,j),(l^{\prime},j^{\prime})}\Big),}$

with $s_{(k,i),(m,p)}$ denoting the coefficients of $S_{V }({\mathbf{x}}) = G_{V }({\mathbf{x}})^{-1}$ .

In these formulae, the two indices that describe the coordinates in $\mathit{Lmk}_{N}$ were made explicit, $x_{k,i}$ representing the ith coordinate of the kth landmark. Solutions of this equation are unique as soon as ${\mathbf{x}}(0)$ and $\dot{{\mathbf{x}}}(0)$ are specified.

Equations put in this form rapidly become intractable when the number of landmarks becomes large. The inversion of the matrix $S_{V }({\mathbf{x}})$ or even simply its storage can be computationally impossible when N gets larger than a few thousand. It is much more efficient, and analytically simpler as well, to use the Hamiltonian form of the geodesic equation, which is (see [38])

$\displaystyle{ \left \{\begin{array}{@{}l@{}} \partial _{t}{\mathbf{x}} = S_{V }({\mathbf{x}})\boldsymbol{\alpha } \\ \partial _{t}\boldsymbol{\alpha } = -\frac{1} {2}\partial _{{\mathbf{x}}}(\boldsymbol{\alpha }^{T}S_{ V }({\mathbf{x}})\boldsymbol{\alpha })\end{array} \right. }$

(5)

This equation will be justified in section “General Principles,” in which the optimality conditions for geodesics will be retrieved as a particular case of general problems in calculus of variations and optimal control. Its solution is uniquely defined as soon as x(0) and $\boldsymbol{\alpha }(0)$ are specified. The time-dependent collection of vectors $t\mapsto \boldsymbol{\alpha }(t)$ is called the momentum of the motion. It is related to the velocity $\mathbf{c}(t) =\dot{ {\mathbf{x}}}(t)$ by the identity $\mathbf{c} = S_{V }({\mathbf{x}})\boldsymbol{\alpha }$ .

Introducing $K_{V }$ and letting $K_{V }^{ij}(x,y)$ denote the i, j entry of $K_{V }(x,y)$ , this geodesic equation can be rewritten in the following even more explicit form:

$\displaystyle{ \left \{\begin{array}{@{}l@{}} \partial _{t}x_{k} =\sum _{ l=1}^{N}K_{ V }(x_{k},x_{l})\alpha _{l},\quad \ k = 1,\ldots,N, \\ \partial _{t}\alpha _{k} = -\sum _{l=1}^{N}\sum _{ i,j=1}^{d}\nabla _{ 1}K_{V }^{ij}(x_{ k},x_{l})\alpha _{k,i}\alpha _{l,j},\quad \ k = 1,\ldots,N,\end{array} \right. }$

(6)

where $\nabla _{1}K_{V }^{ij}$ denotes the gradient of the i, j entry of $K_{V }$ with respect to its first variable.
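System (6) can be integrated numerically from an initial configuration and momentum. The sketch below is a minimal illustration and not a reference implementation: it assumes the scalar Gaussian kernel $K_{V }(x,y) =\exp (-\vert x - y\vert ^{2}/2\sigma ^{2}){\mathrm{Id}}$ , for which $\nabla _{1}K_{V }^{ij}(x,y) = -\delta _{ij}\,\sigma ^{-2}\exp (-\vert x - y\vert ^{2}/2\sigma ^{2})(x - y)$ , and it uses a fixed-step fourth-order Runge-Kutta scheme chosen only for simplicity. The names `hamiltonian_rhs`, `hamiltonian`, and `shoot` are hypothetical.

```python
import numpy as np

def hamiltonian_rhs(x, alpha, sigma):
    """Right-hand side of system (6) for a scalar Gaussian kernel (assumption)."""
    diff = x[:, None, :] - x[None, :, :]                      # x_k - x_l, shape (N, N, d)
    gram = np.exp(-np.sum(diff ** 2, axis=-1) / (2.0 * sigma ** 2))
    dx = gram @ alpha                                         # sum_l K(x_k, x_l) alpha_l
    weights = gram * (alpha @ alpha.T) / sigma ** 2           # gamma_kl (alpha_k . alpha_l) / sigma^2
    dalpha = np.sum(weights[:, :, None] * diff, axis=1)       # minus the gradient term of (6)
    return dx, dalpha

def hamiltonian(x, alpha, sigma):
    """Kinetic energy (1/2) alpha^T S_V(x) alpha."""
    diff = x[:, None, :] - x[None, :, :]
    gram = np.exp(-np.sum(diff ** 2, axis=-1) / (2.0 * sigma ** 2))
    return 0.5 * float(np.sum(alpha * (gram @ alpha)))

def shoot(x0, alpha0, sigma=1.0, n_steps=100):
    """Integrate (6) from t = 0 to t = 1 with fixed-step RK4; returns (x(1), alpha(1))."""
    x, a = x0.astype(float).copy(), alpha0.astype(float).copy()
    h = 1.0 / n_steps
    for _ in range(n_steps):
        k1x, k1a = hamiltonian_rhs(x, a, sigma)
        k2x, k2a = hamiltonian_rhs(x + 0.5 * h * k1x, a + 0.5 * h * k1a, sigma)
        k3x, k3a = hamiltonian_rhs(x + 0.5 * h * k2x, a + 0.5 * h * k2a, sigma)
        k4x, k4a = hamiltonian_rhs(x + h * k3x, a + h * k3a, sigma)
        x = x + (h / 6.0) * (k1x + 2.0 * k2x + 2.0 * k3x + k4x)
        a = a + (h / 6.0) * (k1a + 2.0 * k2a + 2.0 * k3a + k4a)
    return x, a

x0 = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
alpha0 = 0.5 * np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]])
x1, alpha1 = shoot(x0, alpha0)
```

Two quantities are useful sanity checks for such an integrator: the energy $\frac{1} {2}\boldsymbol{\alpha }^{T}S_{V }({\mathbf{x}})\boldsymbol{\alpha }$ should stay constant along the solution up to discretization error, and, because this kernel only depends on x − y, the sum of the momenta $\alpha _{k}$ should be preserved.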

The geodesic equation defines the Riemannian exponential map as follows. Fix ${\mathbf{x}}_{0} \in \mathit{Lmk}_{N}$ . The exponential map at ${\mathbf{x}}_{0}$ is the transformation $\mathbf{c}\mapsto \exp _{{\mathbf{x}}_{ 0}}(\mathbf{c})$ defined over all tangent vectors c to $\mathit{Lmk}_{N}$ at ${\mathbf{x}}_{0}$ (which are identified with all families of d-dimensional vectors, $\mathbf{c} = (c_{1},\ldots,c_{N})$ ), such that $\exp _{{\mathbf{x}}_{ 0}}(\mathbf{c})$ is the solution at time t = 1 of the geodesic equation initialized at ${\mathbf{x}}(0) = {\mathbf{x}}_{0}$ and $\dot{{\mathbf{x}}}(0) = \mathbf{c}$ . Alternatively, one can define the exponential chart in Hamiltonian form, which will also be called the momentum representation in $\mathit{Lmk}_{N}$ , by the transformation

$\displaystyle{\boldsymbol{\alpha }_{0}\mapsto {\mathrm{exp}}_{{\mathbf{x}}_{0}}^{\flat }(\boldsymbol{\alpha }_{ 0}),}$

where ${\mathrm{exp}}_{{\mathbf{x}}_{0}}^{\flat }(\boldsymbol{\alpha }_{0})$ is the solution at time 1 of system (6) initialized at $({\mathbf{x}}_{0},\boldsymbol{\alpha }_{0})$ .

For the metric that is considered here, one can prove that the exponential map at ${\mathbf{x}}_{0}$ (resp. the momentum representation) is defined for any vector c (resp. $\boldsymbol{\alpha }_{0}$ ); this also implies that they both are onto, so that any landmark configuration y can be written as $\mathbf{y} =\mathrm{ exp}_{{\mathbf{x}}_{0}}^{\flat }(\boldsymbol{\alpha }_{0})$ for some $\boldsymbol{\alpha }_{0} \in (\mathbb{R}^{d})^{N}$ . The representation is not one to one, because geodesics may intersect, but it is so if restricted to a small-enough neighborhood of 0. More precisely, there exists an open subset $U \subset T_{{\mathbf{x}}_{0}}\mathit{Lmk}_{N}$ containing 0 over which $\exp _{{\mathbf{x}}_{0}}$ is a diffeomorphism onto its image. This provides the so-called exponential chart at ${\mathbf{x}}_{0}$ on the manifold.

Metric Distortion and Curvature

Exponential charts are often used for data analysis on a manifold, because they provide, in a neighborhood of a reference point x ₀, a vector-space representation which has no radial metric distortion, in the sense that, in the chart, the geodesic distance between x ₀ and $\exp _{{\mathbf{x}}_{ 0}}(\mathbf{c})$ is equal to $\|\mathbf{c}\|_{{\mathbf{x}}_{0}}$ . The representation does distort the metric in the other directions. One way to measure this is by comparing (see Fig. 2), for given c ₀ and c ₁ with $\|\mathbf{c}_{0}\|_{{\mathbf{x}}_{ 0}} =\| \mathbf{c}_{1}\|_{{\mathbf{x}}_{0}} = 1$ , the points $\exp _{{\mathbf{x}}_{0}}(t\mathbf{c}_{0})$ and $\exp _{{\mathbf{x}}_{0}}(t(\mathbf{c}_{0} + s\mathbf{c}_{1}))$ . Let F(t, s) denote the last term (so that the first one is F(t, 0)). One can write

$\displaystyle{{\mathrm{dist}}(F(t,s),F(t,0)) = s\|\partial _{s}F(t,0)\|_{F(t,0)} + o(s).}$

Without metric distortion, this distance would be given by $st\|\mathbf{c}_{1}\|_{{\mathbf{x}}_{0}} = st$ . However, it turns out that [14]

$\displaystyle{\|\partial _{s}F(t,0)\|_{F(t,0)} = t -\rho _{{\mathbf{x}}_{0}}(\mathbf{c}_{0},\mathbf{c}_{1})\frac{t^{3}} {6} + o(t^{3}),}$

where $\rho _{{\mathbf{x}}_{0}}(\mathbf{c}_{0},\mathbf{c}_{1})$ is the sectional curvature of the plane generated by $\mathbf{c}_{0}$ and $\mathbf{c}_{1}$ in $T_{{\mathbf{x}}_{0}}\mathit{Lmk}_{N}$ . So, this sectional curvature directly measures (among many other things) the first order of metric distortion in the manifold and is therefore an important indicator of the distortion induced by exponential charts.

Fig. 2

Metric distortion for the exponential chart

The usual explicit formula for the computation of the curvature involves the second derivatives of the metric tensor matrix $G_{V }({\mathbf{x}})$ , which, as we have seen, is intractable for large values of N. In a recent work, Micheli [53] introduced an interesting new formula for the computation of the curvature in terms of the inverse tensor, $S_{V }({\mathbf{x}}_{0})$ .

Invariance

The previous landmark space ignored the important fact that two shapes are usually considered as identical when one can be deduced from the other by a Euclidean transformation, which is a combination of a rotation and a translation (scale invariance is another important aspect that will not be discussed in this section). To take this into account, we need to “mod out” these transformations, that is, to consider the quotient space of $\mathit{Lmk}_{N}$ by the Euclidean group.

One can pass from the metric discussed in the previous section to a metric on the quotient space via the mechanism of Riemannian submersion (Fig. 3). The scheme is relatively simple, and we describe it and set up notation in a generic framework before taking the special case of the landmark manifold. So, let Q be a Riemannian manifold and $\pi: Q \rightarrow M$ be a submersion, that is, a smooth surjection from Q to another manifold M such that its differential $D\pi$ has full rank everywhere. This implies that, for m ∈ M, the set $\pi ^{-1}(m)$ is a submanifold of Q, called the fiber at m. If q ∈ Q and m = π(q), the tangent space $T_{q}Q$ can be decomposed into the direct sum of the tangent space to $\pi ^{-1}(m)$ and the space perpendicular to it. We will refer to the former as the space of vertical vectors at q, and denote it $\mathcal{V}_{q}$ , and to the latter as the space of horizontal vectors, denoted $\mathcal{H}_{q}$ . We therefore have

$\displaystyle{T_{q}Q = \mathcal{V}_{q} \perp \mathcal{H}_{q}.}$

The differential of π at q, $D\pi (q)$ , vanishes on $\mathcal{V}_{q}$ (since π is constant on $\pi ^{-1}(m)$ ) and is an isomorphism between $\mathcal{H}_{q}$ and $T_{m}M$ . Let us make the abuse of notation of still denoting by $D\pi (q)$ the restriction of $D\pi (q)$ to $\mathcal{H}_{q}$ . Then, if $q,q^{\prime}\in \pi ^{-1}(m)$ , the map $p_{q^{\prime},\,q}:= D\pi (q)^{-1} \circ D\pi (q^{\prime})$ is an isomorphism between $\mathcal{H}_{q^{\prime}}$ and $\mathcal{H}_{q}$ . One says that π is a Riemannian submersion if and only if the maps $p_{q^{\prime},\,q}$ are in fact isometries between $\mathcal{H}_{q^{\prime}}$ and $\mathcal{H}_{q}$ whenever q and $q^{\prime}$ belong to the same fiber, that is, if one has, for all $v^{\prime}\in \mathcal{H}_{q^{\prime}}$ ,

$\displaystyle{\|p_{q^{\prime},\,q}v^{\prime}\|_{ q} =\| v^{\prime}\|_{ q^{\prime}}.}$

Another way to phrase this property is

$\displaystyle{\pi (q) =\pi (q^{\prime}),v \in \mathcal{H}_{ q},v^{\prime}\in \mathcal{H}_{ q^{\prime}},D\pi (q)v = D\pi (q^{\prime})v^{\prime}\Rightarrow \| v\|_{ q} =\| v^{\prime}\|_{ q^{\prime}}.}$

A Riemannian submersion naturally induces a Riemannian metric on M, simply defining, for m ∈ M and $h \in T_{m}M$ ,

$\displaystyle{\|h\|_{m} =\| D\pi (q)^{-1}h\|_{ q}}$

for any $q \in \pi ^{-1}(m)$ , the definition being independent of q by assumption. This is the Riemannian projection of the metric on Q via the Riemannian submersion π.

Fig. 3

Riemannian submersion (geodesics in the quotient space)

Let us now return to the landmark case, and consider the action of rotations and translations, that is of the special Euclidean group of $\mathbb{R}^{d}$ , which is traditionally denoted $SE(\mathbb{R}^{d})$ . The action of a transformation $g \in SE(\mathbb{R}^{d})$ on a landmark configuration ${\mathbf{x}} = (x_{1},\ldots,x_{N}) \in \mathit{Lmk}_{N}$ is

$\displaystyle{g \cdot {\mathbf{x}} = (g(x_{1}),\ldots,g(x_{N})).}$

We want to use a Riemannian projection to deduce a metric on the quotient space $M = \mathit{Lmk}_{N}/SE(\mathbb{R}^{d})$ from the metric that has been defined on $\mathit{Lmk}_{N}$ , the surjection π being the projection $\pi: \mathit{Lmk}_{N} \rightarrow M$ , which assigns to a landmark configuration x its equivalence class, or orbit under the action of $SE(\mathbb{R}^{d})$ , defined by

$\displaystyle{[{\mathbf{x}}] =\{ g \cdot {\mathbf{x}},g \in SE(\mathbb{R}^{d})\} \in M.}$

To make sure that M is a manifold, one needs to restrict to affinely independent landmark configurations, which form an open subset of $\mathit{Lmk}_{N}$ ; we therefore let Q be this subset and restrict π to Q. In this context, one can show that a sufficient condition for π to be a Riemannian submersion is that the action of $SE(\mathbb{R}^{d})$ is isometric, that is, for all $g \in SE(\mathbb{R}^{d})$ , the operation $a_{g}: {\mathbf{x}}\mapsto g \cdot {\mathbf{x}}$ is such that, for all $u,v \in T_{{\mathbf{x}}}Q$ ,

$\displaystyle{\left \langle Da_{g}({\mathbf{x}})u\,,\,Da_{g}({\mathbf{x}})v\right \rangle _{g\cdot {\mathbf{x}}} = \left \langle u\,,\,v\right \rangle _{{\mathbf{x}}}.}$

This property can be translated into equivalent properties on the metric. For translations, for example, it says that, for every x ∈ Q and $\tau \in \mathbb{R}^{d}$ , one must have

$\displaystyle{S_{V }({\mathbf{x}}+\tau ) = S_{V }({\mathbf{x}})}$

which is in turn equivalent to the requirement that, for all $x,y,\tau \in \mathbb{R}^{d}$ , $K_{V }(x+\tau,y+\tau ) = K_{V }(x,y)$ , so that $K_{V }$ only depends on x − y. With rotations, one needs

$\displaystyle{{\mathrm{diag}}(R)^{T}S_{ V }(R{\mathbf{x}}){\mathrm{diag}}(R) = S_{V }({\mathbf{x}}),}$

which again translates into a similar property for the kernel, namely,

$\displaystyle{ R^{T}K_{ V }(Rx,Ry)R = K_{V }(x,y). }$

Here, R is an arbitrary d-dimensional rotation, and diag(R) is the dN by dN block-diagonal matrix with R repeated N times.

Kernels that satisfy these properties can be characterized in explicit forms. These kernels include all positive radial kernels, that is, all kernels taking the form

$\displaystyle{K_{V }(x,y) =\gamma (\vert x - y\vert ^{2}){\mathrm{Id}}_{ \mathbb{R}^{d}},}$

where $\gamma: [0,+\infty ) \rightarrow \mathbb{R}$ is the Laplace transform of some positive measure μ, that is,

$\displaystyle{\gamma (t) =\int _{ 0}^{\infty }e^{-ty}d\mu (y).}$

Such functions include Gaussians,

$\displaystyle{ \gamma (t^{2}) =\exp (-t^{2}/2\sigma ^{2}), }$

(7)

Cauchy,

$\displaystyle{ \gamma (t^{2}) = \frac{1} {1 + t^{2}/\sigma ^{2}}, }$

(8)

or Laplacian kernels, defined for any integer c ≥ 0 by

$\displaystyle{ \gamma _{c}(t^{2}) = \left (\sum _{ l=0}^{c}\rho (c,l)\frac{t^{l}} {\sigma ^{l}} \right )\exp (-t/\sigma ) }$

(9)

with $\rho (c,l) = 2^{l-c}(2c - l)\cdots (c + 1 - l)/l!$ .

One can also use non-diagonal kernels. One simple construction of such kernels is to start with a scalar kernel, for example, associated to a radial function γ as above, and, for some parameter λ ≥ 0, to implicitly define $K_{V }$ via the identity, valid for all pairs of smooth compactly supported vector fields v and w,

$\displaystyle\begin{array}{rcl} \int _{\mathbb{R}^{d}}\int _{\mathbb{R}^{d}}v(x)^{T}K_{ V }(x,y)w(y)dxdy =\int _{\mathbb{R}^{d}}\int _{\mathbb{R}^{d}}\gamma \left (\vert x - y\vert ^{2}\right )(v(x)^{T}w(y))dxdy& & {}\\ + \frac{\lambda } {2}\int _{\mathbb{R}^{d}}\int _{\mathbb{R}^{d}}\gamma \left (\vert x - y\vert ^{2}\right )(\nabla \cdot v(x))(\nabla \cdot w(y))dxdy,& & {}\\ \end{array}$

where $(\nabla \cdot )$ denotes the divergence operator. The explicit form of the kernel can be deduced after a double application of the divergence theorem yielding

$\displaystyle{K_{V }(x,y) = (\gamma (r^{2}) -\lambda \dot{\gamma } (r^{2})){\mathrm{Id}}_{ \mathbb{R}^{d}} - 2\lambda \ddot{\gamma }(r^{2})(x - y)(x - y)^{T}}$

with $r = \vert x - y\vert$ .
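The translation and rotation conditions stated earlier for $S_{V }$ can be checked numerically on this non-diagonal kernel. The sketch below is an illustration only: it assumes the Gaussian choice $\gamma (u) =\exp (-u/2\sigma ^{2})$ , for which $\dot{\gamma }= -\gamma /2\sigma ^{2}$ and $\ddot{\gamma }=\gamma /4\sigma ^{4}$ , and the function names `divergence_kernel` and `block_matrix`, as well as the values of σ and λ, are hypothetical.

```python
import numpy as np

def divergence_kernel(x, y, sigma=1.0, lam=0.5):
    """K_V(x, y) = (gamma - lam*gamma') Id - 2 lam gamma'' (x - y)(x - y)^T.

    gamma is assumed Gaussian: gamma(u) = exp(-u / (2 sigma^2)), u = |x - y|^2.
    """
    delta = x - y
    u = float(delta @ delta)
    g = np.exp(-u / (2.0 * sigma ** 2))
    g_dot = -g / (2.0 * sigma ** 2)        # first derivative of gamma at u
    g_ddot = g / (4.0 * sigma ** 4)        # second derivative of gamma at u
    d = x.shape[0]
    return (g - lam * g_dot) * np.eye(d) - 2.0 * lam * g_ddot * np.outer(delta, delta)

def block_matrix(points, sigma=1.0, lam=0.5):
    """Assemble the dN x dN matrix S_V(x) = (K_V(x_i, x_j))_{i,j}."""
    n, d = points.shape
    s = np.zeros((n * d, n * d))
    for i in range(n):
        for j in range(n):
            s[i * d:(i + 1) * d, j * d:(j + 1) * d] = divergence_kernel(
                points[i], points[j], sigma, lam)
    return s

x = np.array([[0.0, 0.0], [1.0, 0.2], [-0.3, 0.8], [0.5, -0.6]])
theta = 0.7
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta), np.cos(theta)]])
tau = np.array([0.3, -1.2])

S = block_matrix(x)
S_translated = block_matrix(x + tau)               # S_V(x + tau)
diag_R = np.kron(np.eye(len(x)), R)                # block-diagonal diag(R)
S_rotated = diag_R.T @ block_matrix(x @ R.T) @ diag_R
```

Both `S_translated` and `S_rotated` coincide with `S` up to rounding, which is the isometry requirement for translations and rotations in matrix form.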

Assume that one of these choices has been made for $K_{V }$ , so that one can use a Riemannian submersion to define a metric on the quotient space $Q/SE(\mathbb{R}^{d})$ . One of the appealing properties of this construction is that geodesics in the quotient space are given by (equivalence classes of) geodesics in the original space, provided that they are initialized with horizontal velocities. Another interesting feature is that the horizontality condition is very simply expressed in terms of the momenta, which provides another advantage of the momentum representation in Eq. (6). Take translations, for example. A vertical tangent vector for their action at any point x ∈ Q is a vector of the form (τ, …, τ), where τ is a d-dimensional vector repeated N times. A momentum, or covector, $\boldsymbol{\alpha }$ is horizontal if and only if it vanishes when applied to any such vertical vector, which yields

$\displaystyle{ \sum _{k=1}^{N}\alpha _{ k} = 0. }$

(10)

A similar analysis for rotations yields the horizontality condition

$\displaystyle{ \sum _{k=1}^{N}\left (\alpha _{ k}x_{k}^{T} - x_{ k}\alpha _{k}^{T}\right ) = 0. }$

(11)

These two conditions provide the $d(d + 1)/2$ constraints that must be imposed on the momentum representation on Q to obtain a momentum representation on $Q/SE(\mathbb{R}^{d})$ .
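Conditions (10) and (11) are linear in the momentum, so they can be tested, or enforced, with elementary linear algebra. The following sketch is a hedged illustration: it builds the d(d + 1)/2 vertical directions (d translations and d(d − 1)/2 infinitesimal rotations), evaluates the left-hand sides of (10) and (11), and removes the vertical component of an arbitrary momentum by an ordinary least-squares projection in coordinates. That projection is an assumption made for illustration and is not prescribed by the text; the names `vertical_basis`, `horizontality_defect`, and `project_to_horizontal` are hypothetical.

```python
import numpy as np

def vertical_basis(x):
    """Columns span the vertical vectors at x: translations and infinitesimal rotations."""
    n, d = x.shape
    columns = []
    for i in range(d):                                  # (tau, ..., tau) with tau = e_i
        tau = np.zeros(d)
        tau[i] = 1.0
        columns.append(np.tile(tau, n))
    for i in range(d):
        for j in range(i + 1, d):                       # antisymmetric generator A
            a = np.zeros((d, d))
            a[i, j], a[j, i] = 1.0, -1.0
            columns.append((x @ a.T).ravel())           # (A x_1, ..., A x_N)
    return np.stack(columns, axis=1)

def horizontality_defect(x, alpha):
    """Left-hand sides of conditions (10) and (11)."""
    linear = alpha.sum(axis=0)                          # sum_k alpha_k
    m = alpha.T @ x                                     # sum_k alpha_k x_k^T
    return linear, m - m.T

def project_to_horizontal(x, alpha):
    """Remove the component of alpha along the vertical directions (least squares)."""
    w = vertical_basis(x)
    coeffs, *_ = np.linalg.lstsq(w, alpha.ravel(), rcond=None)
    return (alpha.ravel() - w @ coeffs).reshape(alpha.shape)

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 3))                             # 4 landmarks in dimension 3
alpha = rng.normal(size=(4, 3))
alpha_h = project_to_horizontal(x, alpha)
linear, angular = horizontality_defect(x, alpha_h)
```

After projection, both `linear` and `angular` vanish up to rounding, and `vertical_basis(x)` has exactly d(d + 1)/2 = 6 columns for d = 3.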

Hamiltonian Point of View

General Principles

This section presents an alternate formulation in which the emphasis is placed on variational principles rather than on geometric concepts. Although the results obtained using the Hamiltonian approach presented here are partially redundant with those obtained from the Riemannian point of view, there is a genuine benefit in understanding both and being able to connect them. As will be seen below, working with the Hamiltonian formulation brings new, relatively simple concepts, especially when dealing with invariance and symmetries. It is also often the best way to handle numerical implementations.

To lighten the conceptual burden, the presentation will remain within the elementary formulation that uses a state variable q and a momentum p, rather than the more general symplectic formulation. On a manifold, this implies that the presentation is made with variables restricted to a local chart.

An optimal control problem in Lagrangian form is associated to a real-valued cost function (or Lagrangian) $(q,u)\mapsto L(q,u)$ defined on Q × U, where Q is a manifold and U is the space of controls, and to a function $(q,u)\mapsto f(q,u) \in T_{q}Q$ . The resulting variational problem consists in the minimization of

$\displaystyle{ \int _{t_{i}}^{t_{f} }L(q,u)dt }$

(12)

subject to the constraint $\dot{q}_{t} = f(q,u)$ and some boundary conditions for $q(t_{i})$ and $q(t_{f})$. The simplest situation is the classical problem in the calculus of variations for which f(q, u) = u, and the problem is to minimize $\int _{t_{i}}^{t_{f}}L(q,\dot{q}_{ t})dt$. Here, $[t_{i},t_{f}]$ is a fixed finite interval. The values $t_{i} = 0$ and $t_{f} = 1$ will be assumed in the following.

The general situation in (12) can be formally addressed by introducing Lagrange multipliers, denoted p(t), associated to the constraint $\partial _{t}q = f(q,u)$ at time t; p is called the costate in the optimal control setting. One then looks for critical paths of

$\displaystyle{J_{0}(q,p,u)\doteq\int _{0}^{1}\left (L(q,u) + \left (p\vert \,\dot{q}_{ t} - f(q,u)\right )\right )dt,}$

where the paths p, u, and q now vary freely, as long as q(0) and q(1) remain fixed. The costate is here a linear form on $T_{q}Q$, that is, an element of $T_{q}^{{\ast}}Q$.

Introduce the Hamiltonian

$\displaystyle{H(q,p,u)\doteq\left (p\vert \,f(q,u)\right ) - L(q,u)}$

for which

$\displaystyle{J_{0} =\int _{ 0}^{1}\left (\left (p\vert \,\dot{q}_{ t}\right ) - H(q,p,u)\right )dt.}$

Writing the conditions for criticality, $\delta J_{0}/\delta u =\delta J_{0}/\delta q =\delta J_{0}/\delta p = 0$ , directly leads to the Hamiltonian equation:

$\displaystyle{ \partial _{t}q = \partial _{p}H,\ \partial _{t}p = -\partial _{q}H,\ \partial _{u}H = 0. }$

(13)

The above derivation is only formal. A rigorous derivation in various finite-dimensional as well as infinite-dimensional situations is the central object of Pontryagin Maximum Principle (PMP) theorems which state that along a solution (q _∗, p _∗, u _∗), one has

$\displaystyle{H(q_{{\ast}}(t),p_{{\ast}}(t),u_{{\ast}}(t)) =\max _{u}H(q_{{\ast}}(t),p_{{\ast}}(t),u).}$

Introducing $\tilde{H}(q,p)\doteq\max _{u}H(q,p,u)$ , one gets the usual Hamiltonian equation:

$\displaystyle{ \partial _{t}p = -\partial _{q}\tilde{H},\ \partial _{t}q = \partial _{p}\tilde{H}. }$

(14)

One can notice that, in the classical case f(q, u) = u, $\tilde{H}(q,p)$ coincides with the Hamiltonian obtained via the Legendre transformation, in which a function u(q, p) is defined via the equation $p = \partial _{u}L$ and

$\displaystyle{\tilde{H}(q,p) = \left (p\vert \,u(q,p)\right ) - L(q,u(q,p)).}$
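
As an illustration, and under the assumption (made here only for concreteness) of a quadratic Lagrangian $L(q,u) = u^{T}A(q)u/2 - U(q)$ with A(q) symmetric and positive definite, the equation $p = \partial _{u}L$ reads p = A(q)u, so that $u(q,p) = A(q)^{-1}p$ and

$\displaystyle{\tilde{H}(q,p) = p^{T}A(q)^{-1}p -\frac{1} {2}p^{T}A(q)^{-1}p + U(q) = \frac{1} {2}p^{T}A(q)^{-1}p + U(q).}$

Since $u\mapsto \left (p\vert \,u\right ) - L(q,u)$ is strictly concave in this case, its critical point is also its maximizer, which is consistent with the definition of $\tilde{H}$ as a maximum over u.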

Application to Geodesics in a Riemannian Manifold

Let Q be a Riemannian manifold with metric at q denoted $\left \langle \cdot \,,\,\cdot \right \rangle _{q}$ . The computation of geodesics in Q can be seen as a particular case of the previous framework in at least two (equivalent) ways. The first one is to take

$\displaystyle{L(q,u) =\| u\|_{q}^{2}/2\text{ and }f(q,u) = u,}$

which gives a variational problem in standard form. For the other choice, introduce the duality operator $K_{q}: T_{q}^{{\ast}}Q \rightarrow T_{q}Q$ defined by

$\displaystyle{\left (\alpha \vert \,\xi \right ) = \left \langle K_{q}\alpha \,,\,\xi \right \rangle _{q},}$

for $\alpha \in T_{q}^{{\ast}}Q$ and $\xi \in T_{q}Q$, and let, denoting the control by α,

$\displaystyle{L(q,\alpha ) = \left (\alpha \vert \,K_{q}\alpha \right )/2\text{ and }f(q,\alpha ) = K_{q}\alpha.}$

The Hamiltonian equation in this case yields p = α and

$\displaystyle{ \left \{\begin{array}{l} \partial _{t}q = K_{q}\alpha, \\ \partial _{t}\alpha = -\frac{1} {2}\partial _{q}\big(\left (\alpha \vert \,K_{q}\alpha \right )\big). \end{array} \right. }$

(15)

This equation obviously reduces to (5) with q = x, $K_{q}\alpha = S_{V }({\mathbf{x}})\boldsymbol{\alpha }$ .
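
System (15) can be integrated numerically once $K_{q}$ is specified. The following is a minimal sketch for landmarks, assuming that $K_{q}$ is given by a scalar Gaussian kernel of width `sigma` multiplied by the identity matrix; the kernel choice, the function names, and the fourth-order Runge–Kutta integrator are assumptions made for illustration and are not prescribed by the text.

```python
import numpy as np


def kernel_matrix(q, sigma=1.0):
    """Gaussian kernel values k(q_k, q_l); the Gaussian form is an assumption."""
    diff = q[:, None, :] - q[None, :, :]
    return np.exp(-np.sum(diff**2, axis=-1) / (2.0 * sigma**2))


def hamiltonian(q, alpha, sigma=1.0):
    """H(q, alpha) = (alpha | K_q alpha) / 2 for the assumed kernel."""
    return 0.5 * np.sum(kernel_matrix(q, sigma) * (alpha @ alpha.T))


def hamiltonian_field(q, alpha, sigma=1.0):
    """Right-hand side of system (15): (K_q alpha, -dH/dq)."""
    k = kernel_matrix(q, sigma)
    diff = q[:, None, :] - q[None, :, :]
    dq = k @ alpha
    weights = k * (alpha @ alpha.T)
    # dH/dq_k = -(1/sigma^2) sum_l (alpha_k . alpha_l) k(q_k, q_l) (q_k - q_l)
    dH_dq = -np.sum(weights[:, :, None] * diff, axis=1) / sigma**2
    return dq, -dH_dq


def shoot(q0, alpha0, n_steps=100, dt=0.01, sigma=1.0):
    """Integrate (15) from (q0, alpha0) with a classical Runge-Kutta scheme."""
    q = np.array(q0, dtype=float)
    alpha = np.array(alpha0, dtype=float)
    for _ in range(n_steps):
        k1 = hamiltonian_field(q, alpha, sigma)
        k2 = hamiltonian_field(q + 0.5 * dt * k1[0], alpha + 0.5 * dt * k1[1], sigma)
        k3 = hamiltonian_field(q + 0.5 * dt * k2[0], alpha + 0.5 * dt * k2[1], sigma)
        k4 = hamiltonian_field(q + dt * k3[0], alpha + dt * k3[1], sigma)
        q = q + dt * (k1[0] + 2.0 * k2[0] + 2.0 * k3[0] + k4[0]) / 6.0
        alpha = alpha + dt * (k1[1] + 2.0 * k2[1] + 2.0 * k3[1] + k4[1]) / 6.0
    return q, alpha
```

A simple sanity check is to call `shoot` on a small random configuration and to compare `hamiltonian` at both ends of the trajectory: the value should be preserved up to the integration error, since H is constant along solutions of (15).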

Momentum Map and Conserved Quantities

A central aspect of the Hamiltonian formulation is its ability to turn symmetries into conserved quantities. This directly relates to the Riemannian submersion discussed in section “Invariance.”

Consider a Lie group G acting on the state variable q ∈ Q, assuming, for the rest of this section and the next one, an action on the right denoted $(g,q) \rightarrow q \cdot g$ . Notice that results obtained with a right action immediately translate to left actions, by transforming a left action $(g,q)\mapsto g \cdot q$ into the right action $(g,q)\mapsto g^{-1} \cdot q$ . In fact, both right and left actions are encountered in this chapter. The standard notation $T_{{\mathrm{id}}}G = \mathfrak{G}$ will be used in the following to represent the Lie algebra of G.

By differentiation in the q variable, the action can be extended to the tangent bundle, with notation $(g,v) \rightarrow v \cdot g$ for v ∈ TQ. By duality, this induces an action on the costate variable through the equality $\left (p \cdot g\vert \,v\right )\doteq\left (p\vert \,v \cdot g^{-1}\right )$. Differentiating again in g at $g ={ \mathrm{id}}_{G}$ gives the infinitesimal actions on the tangent and cotangent bundles, defined for any $\xi \in \mathfrak{G}\doteq T_{{\mathrm{id}}_{G}}G$ by $(\xi,v) \rightarrow v\cdot \xi$ and $(\xi,p) \rightarrow p\cdot \xi$, such that $\left (p\cdot \xi \vert \,v\right ) + \left (p\vert \,v\cdot \xi \right ) = 0$ for all v ∈ TQ and p ∈ T ^∗ Q.

Now, assume that H is G-invariant, that is, $H(q \cdot g,p \cdot g) = H(q,p)$ for any g ∈ G, and define the momentum map $(q,p) \rightarrow \mathfrak{m}(q,p) \in \mathfrak{G}^{{\ast}}$ by

$\displaystyle{ \left (\mathfrak{m}(q,p)\vert \,\xi \right ) = \left (p\vert \,q\cdot \xi \right ). }$

(16)

Then, one has, along a Hamiltonian trajectory,

$\displaystyle{ \partial _{t}\mathfrak{m}(q,p) = 0, }$

(17)

that is, the momentum map is a conserved (vectorial) quantity along the Hamiltonian flow. This result is proved as follows. First notice that if g(t) is a curve in G with $g(0) ={ \mathrm{id}}_{G}$ and $\dot{g}_{t}(0) =\xi$, then, if H is G-invariant,

$\displaystyle{0 = \partial _{t}H(q \cdot g,p \cdot g) = \left (\partial _{q}H\vert \,q\cdot \xi \right ) + \left (p\cdot \xi \vert \,\partial _{p}H\right ).}$

On the other hand, from the definitions of the actions, one has

$\displaystyle{\left (\partial _{t}\mathfrak{m}(q,p)\vert \,\xi \right ) = \partial _{t}\left (p\vert \,q\cdot \xi \right ) = \left (\partial _{t}p\vert \,q\cdot \xi \right ) -\left (p\cdot \xi \vert \,\partial _{t}q\right ),}$

so that if (q, p) is a Hamiltonian trajectory,

$\displaystyle{\left (\partial _{t}\mathfrak{m}(q,p)\vert \,\xi \right ) = -\left (\partial _{q}H\vert \,q\cdot \xi \right ) -\left (p\cdot \xi \vert \,\partial _{p}H\right ) = 0}$

which gives (17).
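
As a simple illustration, take the landmark setting of the previous section as an assumed example: q = (x_1, …, x_N) is a configuration of N points in $\mathbb{R}^{d}$, p = (p_1, …, p_N), and $G = \mathbb{R}^{d}$ acts by translation, $q\cdot \tau = (x_{1}+\tau,\ldots,x_{N}+\tau )$. The infinitesimal action is then $q\cdot \xi = (\xi,\ldots,\xi )$, and Eq. (16) gives

$\displaystyle{\left (\mathfrak{m}(q,p)\vert \,\xi \right ) = \left (p\vert \,q\cdot \xi \right ) =\sum _{ k=1}^{N}p_{ k}^{T}\xi,\quad \text{that is,}\quad \mathfrak{m}(q,p) =\sum _{ k=1}^{N}p_{ k}.}$

For a translation-invariant Hamiltonian, (17) therefore expresses the conservation of the total momentum, and the horizontality condition (10) corresponds to the level set $\mathfrak{m} = 0$.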

Notice that the momentum map has an interesting equivariance property:

$\displaystyle\begin{array}{rcl} \left (\mathfrak{m}(q \cdot g,p \cdot g)\vert \,\xi \right )& =& \left (p \cdot g\vert \,(q \cdot g)\cdot \xi \right ) {}\\ & =& \left (p \cdot g\vert \,q \cdot (g\xi )\right ) {}\\ & =& \left (p\vert \,(q \cdot (g\xi )) \cdot g^{-1}\right ) {}\\ & =& \left (p\vert \,q \cdot ((g\xi )g^{-1})\right ) {}\\ \end{array}$

where $g\xi$ denotes the derivative of $h\mapsto gh$ in h at $h ={ \mathrm{id}}_{G}$ along the direction ξ and $(g\xi )g^{-1}$ the derivative of $h\mapsto hg^{-1}$ in h at h = g along the direction $g\xi$. The map $\xi \mapsto (g\xi )g^{-1}$ defined on $\mathfrak{G}$ is called the adjoint representation and usually denoted $\xi \mapsto {\mathrm{Ad}}_{g}\xi$. One therefore gets

$\displaystyle{\left (\mathfrak{m}(q \cdot g,p \cdot g)\vert \,\xi \right ) = \left (p\vert \,q \cdot \text{Ad}_{g}(\xi )\right ) = \left (\mathfrak{m}(q,p)\vert \,\text{Ad}_{g}(\xi )\right ) = \left (\text{Ad}_{g}^{{\ast}}(\mathfrak{m}(q,p))\vert \,\xi \right ),}$

where Ad_g^∗ is the conjugate of Ad_g. Hence

$\displaystyle{ \mathfrak{m}(q \cdot g,p \cdot g) = \text{Ad}_{g}^{{\ast}}(\mathfrak{m}(q,p)), }$

(18)

that is, $\mathfrak{m}$ is Ad^∗-equivariant.

Euler–Poincaré Equation

Consider the particular case in which Q = G and G acts on itself. In this case,

$\displaystyle{\left (\mathfrak{m}({\mathrm{id}}_{G},p)\vert \,v\right ) = \left (p\vert \,v\right ),}$

so that $\mathfrak{m}({\mathrm{id}}_{G},p) = p$ and one gets from Eq. (18)

$\displaystyle{pg^{-1} = \mathfrak{m}({\mathrm{id}}_{ G},pg^{-1}) = \text{Ad}_{ g^{-1}}^{{\ast}}(\mathfrak{m}(g,p)).}$

Hence, along a trajectory starting from $g(0) ={ \mathrm{id}}_{G}$ of a G-invariant Hamiltonian H, one has (denoting $\rho = pg^{-1} \in \mathfrak{G}^{{\ast}}$ and using the fact that the momentum map is conserved over time)

$\displaystyle\begin{array}{rcl} \rho (t)\doteq p(t)g(t)^{-1}& =& \text{Ad}_{ g^{-1}(t)}^{{\ast}}(\mathfrak{m}(g(t),p(t))) \\ & =& \text{Ad}_{g^{-1}(t)}^{{\ast}}(\mathfrak{m}({\mathrm{id}}_{ G},p(0))) = \text{Ad}_{g^{-1}(t)}^{{\ast}}(\rho (0)).{}\end{array}$

(19)

This is the integrated version of the so-called Euler–Poincaré equation on $\mathfrak{G}^{{\ast}}$ [6, 50],

$\displaystyle{ \partial _{t}\rho + \text{ad}_{v(\rho )}^{{\ast}}(\rho ) = 0, }$

(20)

where $v(\rho ) =\dot{ g}g^{-1} = \partial _{p}H({\mathrm{id}}_{G},pg^{-1}) = \partial _{p}H({\mathrm{id}}_{G},\rho )$ and ad is the differential at location $g ={ \mathrm{id}}_{G}$ of ${\mathrm{Ad}}_{g}$.

A special case of this, which will be important later, is when the Hamiltonian corresponds to a right-invariant Riemannian metric on G. There is a large literature on invariant metrics on Lie groups, which can be shown to be related to important finite and infinite-dimensional mechanical models, including the Euler equation for perfect fluids. The interested reader can refer to [5, 6, 33, 34, 49, 50].

Such a metric is characterized by an inner product $\left \langle \cdot \,,\,\cdot \right \rangle _{\mathfrak{G}}$ on $\mathfrak{G}$ and defined by

$\displaystyle{ \left \langle v\,,\,w\right \rangle _{g} = \left \langle vg^{-1}\,,\,wg^{-1}\right \rangle _{\mathfrak{G}}. }$

(21)

If one lets $K_{\mathfrak{G}}$ be the duality operator on $\mathfrak{G}$ so that

$\displaystyle{\left (\rho \vert \,v\right ) = \left \langle K_{\mathfrak{G}}\rho \,,\,v\right \rangle _{\mathfrak{G}},}$

the issue of finding minimizing geodesics can be rephrased as an optimal control problem, as in the case of landmarks, with Lagrangian $L(g,\mu ) = \left (\mu \vert \,K_{g}\mu \right )/2$, $f(g,\mu ) = K_{g}\mu$, and

$\displaystyle{ K_{g}\mu = (K_{\mathfrak{G}}(\mu g^{-1}))g. }$

(22)

The Hamiltonian equations are then directly given by (15), namely,

$\displaystyle{ \left \{\begin{array}{l} \partial _{t}g = K_{g}\mu, \\ \partial _{t}\mu = -\frac{1} {2}\partial _{g}\big(\left (\mu \vert \,K_{g}\mu \right )\big).\end{array} \right. }$

(23)

This equation is equivalent to the one obtained from the conservation of the momentum map, which is (with $\rho =\mu g^{-1}$ )

$\displaystyle{ \left \{\begin{array}{@{}l@{}} \partial _{t}g = vg, \\ v = K_{\mathfrak{G}}\rho, \\ \partial _{t}\rho = -{\mathrm{ad}}_{v}^{{\ast}}\rho.\end{array} \right. }$

(24)
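
To make system (24) concrete, the following sketch integrates it on the rotation group SO(3), an example chosen here for illustration and not one treated in the text. It assumes that the Lie algebra and its dual are both identified with $\mathbb{R}^{3}$ through the hat map and the Euclidean pairing, and that $K_{\mathfrak{G}}$ is a symmetric positive definite 3 × 3 matrix `K` (a hypothetical choice). Under these assumptions, ${\mathrm{ad}}_{v}^{{\ast}}\rho$ becomes $\rho \times v$, so that the last line of (24) reads $\partial _{t}\rho = v \times \rho$, and the integrated form (19) becomes $\rho (t) = g(t)\rho (0)$. The function names and the Runge–Kutta integrator are illustrative.

```python
import numpy as np


def hat(w):
    """Map a vector of R^3 to the skew-symmetric matrix acting as x -> w x x."""
    return np.array([[0.0, -w[2], w[1]],
                     [w[2], 0.0, -w[0]],
                     [-w[1], w[0], 0.0]])


def euler_poincare_rhs(g, rho, K):
    """Right-hand side of (24) on SO(3) under the stated identifications."""
    v = K @ rho                            # v = K rho
    return hat(v) @ g, np.cross(v, rho)    # dg/dt = v g, drho/dt = -ad*_v rho


def integrate(rho0, K, n_steps=2000, dt=1e-3):
    """Integrate (24) from g(0) = identity with a classical Runge-Kutta scheme."""
    g = np.eye(3)
    rho = np.array(rho0, dtype=float)
    for _ in range(n_steps):
        a1, b1 = euler_poincare_rhs(g, rho, K)
        a2, b2 = euler_poincare_rhs(g + 0.5 * dt * a1, rho + 0.5 * dt * b1, K)
        a3, b3 = euler_poincare_rhs(g + 0.5 * dt * a2, rho + 0.5 * dt * b2, K)
        a4, b4 = euler_poincare_rhs(g + dt * a3, rho + dt * b3, K)
        g = g + dt * (a1 + 2.0 * a2 + 2.0 * a3 + a4) / 6.0
        rho = rho + dt * (b1 + 2.0 * b2 + 2.0 * b3 + b4) / 6.0
    return g, rho
```

Comparing the returned `rho` with `g @ rho0` gives a numerical check that the differential form (24) and the integrated form (19) describe the same trajectory, up to the integration error.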

A Note on Left Actions

Invariance with respect to left actions is handled in a symmetrical way to right actions. If G is acting on the left on Q, define the momentum map by

$\displaystyle{\left (\mathfrak{m}(p,q)\vert \,v\right ) = \left (p\vert \,v \cdot q\right )}$

which is conserved along Hamiltonian trajectories. Working out the equivariance property gives

$\displaystyle{\mathfrak{m}(g \cdot p,g \cdot q) =\mathrm{ Ad}_{g^{-1}}^{{\ast}}\mathfrak{m}(p,q)\,.}$

When G acts on itself on the left, the Euler–Poincaré equation reads

$\displaystyle{\rho (t) =\mathrm{ Ad}_{g}^{{\ast}}(\rho (0))}$

$\displaystyle{\partial _{t}\rho -\mathrm{ ad}_{v(\rho )}^{{\ast}}\rho = 0}$

with $\rho = g^{-1}p$ and $v(\rho ) = g^{-1}\dot{g}_{t}$ .

Application to the Group of Diffeomorphisms

Let $G \subset \mathrm{ Diff}(\mathbb{R}^{d})$ be a group of smooth diffeomorphisms of $\mathbb{R}^{d}$ (which, say, smoothly converge to the identity at infinity). Elements of the tangent space to G, which are derivatives of curves $t\mapsto \varphi (t,\cdot )$ where $\varphi (t,\cdot ) \in G$ for all t, can be identified to vector fields $x\mapsto v(x) \in \mathbb{R}^{d}$ , $x \in \mathbb{R}^{d}$ .

To define a right-invariant metric on G, introduce a Hilbert space V of vector fields on $\mathbb{R}^{d}$ with inner product $\left \langle \cdot \,,\,\cdot \right \rangle _{V }$. As in section “A Riemannian Manifold of Deformable Landmarks,” let $L_{V }$ and $K_{V } = L_{V }^{-1}$ denote the duality operators on V, with $\left \langle v\,,\,w\right \rangle _{V } = \left (L_{V }v\vert \,w\right )$ and $\left \langle \mu \,,\,\nu \right \rangle _{V ^{{\ast}}} = \left (\mu \vert \,K_{V }\nu \right )$; $K_{V }$ is furthermore identified with a matrix-valued kernel $K_{V }(x,y)$ acting on vector fields.

The application of the formulae derived for Hamiltonian systems and of the Euler–Poincaré equation will remain, for the rest of this section, at a highly formal level, just computing the expressions taken in the case of diffeomorphisms by the general quantities introduced in the previous section. There will be no attempt at proving that these formulae are indeed valid in this infinite-dimensional context, which is beyond the scope of this chapter. As an example of the difficulties that can be encountered, let us mention the dilemma that is involved in the mere choice of the group G. On the one hand, G can be chosen as a group of infinitely differentiable diffeomorphisms that coincide with the identity outside a compact set. This would provide a rather nicely behaved manifold with a Lie group structure in the sense of [44, 45]. The problem with such a choice is that the structure would be much stronger than what Riemannian metrics of interest would induce and that geodesics would typically spring out of the group. One can, on the other hand, try to place the emphasis on the Riemannian and variational aspects so that the computation of geodesics in G, for example, remains well posed. This leads to a solution, introduced in Trouvé (Infinite dimensional group action and pattern recognition. Technical report. DMI, Ecole Normale Supérieure, unpublished, 1995) (see also [70]), in which G is completed in a way which depends on the metric $\left \langle \cdot \,,\,\cdot \right \rangle _{V }$, so that the resulting group (denote it $G_{V }$) is complete for the geodesic distance. This extension, however, comes at the cost of losing the nice features of infinitely differentiable transformations, resulting in $G_{V }$ not being a Lie group, for example.

This being acknowledged, first consider the transcription of (23) to the case of diffeomorphisms. This equation will involve a time-evolving diffeomorphism $\varphi (t,\cdot )$ and a time-evolving covector, denoted μ(t), which is a linear form over vector fields (it takes a vector field $x\mapsto v(x)$ and returns a number that has so far been denoted $\left (\mu (t)\vert \,v\right )$). It will be useful to apply μ(t) to vector-valued functions of several variables, say v(x, y) defined for $x,y \in \mathbb{R}^{d}$, by keeping one of the variables fixed and considering v as a function of the other. This will be denoted by adding a subscript representing the effective variable, so that

$\displaystyle{\left (\mu (t)\vert \,v(x,y)\right )_{x}}$

is the number, depending on y, obtained by applying μ(t) to the vector field $x\mapsto v(x,y)$.

One first needs to identify the operator $K_{\varphi }$ in Eq. (22), defined by

$\displaystyle{K_{\varphi }\mu = (K_{V }(\mu \varphi ^{-1}))\varphi = (K_{ V }(\mu \varphi ^{-1}))\circ \varphi }$

since right translation here coincides with composition. Now, for any vector $a \in \mathbb{R}^{d}$ and $y \in \mathbb{R}^{d}$ , one has

$\displaystyle\begin{array}{rcl} a^{T}(K_{\varphi }\mu )(y)& =& a^{T}(K_{ V }(\mu \varphi ^{-1}))(\varphi (y)) {}\\ & =& \left (a \otimes \delta _{\varphi (y)}\vert \,K_{V }(\mu \varphi ^{-1})\right ) {}\\ & =& \left (\mu \varphi ^{-1}\vert \,K_{ V }(a \otimes \delta _{\varphi (y)})\right ) {}\\ & =& \left (\mu \vert \,K_{V }(a \otimes \delta _{\varphi (y)})\circ \varphi \right ) {}\\ & =& \left (\mu \vert \,K_{V }(\varphi (x),\varphi (y))a\right )_{x}. {}\\ \end{array}$

So, letting $e_{1},\ldots,e_{d}$ denote the canonical basis of $\mathbb{R}^{d}$ , one has

$\displaystyle{(K_{\varphi }\mu )(y) =\sum _{ i=1}^{d}e_{ i}^{T}(K_{\varphi }\mu )(y)\,e_{ i} =\sum _{ i=1}^{d}\left (\mu \vert \,K_{ V }^{i}(\varphi (x),\varphi (y))\right )_{ x}e_{i},}$

where $K_{V }^{i}$ is the ith column of $K_{V }$. Therefore

$\displaystyle{\left (\mu \vert \,K_{\varphi }\mu \right ) =\sum _{ i=1}^{d}\left (\mu \vert \,\left (\mu \vert \,K_{ V }^{i}(\varphi (x),\varphi (y))\right )_{ x}e_{i}\right )_{y}}$

and (using the symmetry of $K_{V }$)

$\displaystyle{\left (\partial _{\varphi }\left (\mu \vert \,K_{\varphi }\mu \right )\vert \,w\right ) = 2\sum _{i=1}^{d}\left (\mu \vert \,\left (\mu \vert \,D_{ 2}K_{V }^{i}(\varphi (x),\varphi (y))w(y)\right )_{ x}e_{i}\right )_{y},}$

where $D_{2}K_{V }^{i}$ is the derivative of $K_{V }^{i}$ with respect to its second variable. These computations directly give the transcription of (23) for diffeomorphisms, namely,

$\displaystyle{ \left \{\begin{array}{@{}l@{}} \partial _{t}\varphi (t,y) =\sum _{ i=1}^{d}\left (\mu (t)\vert \,K_{ V }^{i}(\varphi (t,x),\varphi (t,y))\right )_{ x}e_{i} \\ \forall w: \left (\partial _{t}\mu (t)\vert \,w\right ) = -\sum _{i=1}^{d}\left (\mu (t)\vert \,\left (\mu (t)\vert \,D_{ 2}K_{V }^{i}(\varphi (t,x),\varphi (t,y))w(y)\right )_{ x}e_{i}\right )_{y}.\end{array} \right. }$

(25)
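
As a consistency check, consider the special case, assumed here only for illustration, in which μ(t) is a finite sum of point masses attached to fixed points x_1, …, x_N, that is, $\left (\mu (t)\vert \,w\right ) =\sum _{ k=1}^{N}\alpha _{k}(t)^{T}w(x_{k})$. Using the symmetry property $K_{V }(x,y)^{T} = K_{V }(y,x)$, the first equation in (25) becomes

$\displaystyle{\partial _{t}\varphi (t,y) =\sum _{ k=1}^{N}K_{ V }(\varphi (t,y),\varphi (t,x_{k}))\alpha _{k}(t).}$

Evaluating this at y = x_l and writing $q_{l}(t) =\varphi (t,x_{l})$ yields $\partial _{t}q_{l} =\sum _{ k=1}^{N}K_{V }(q_{l},q_{k})\alpha _{k}$, which has the form of the first equation in (15) for landmarks. In the second equation of (25), w then only appears through the values w(x_l), so that μ(t) remains a sum of point masses at the same points.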

To transcribe Eq. (24) to diffeomorphisms, one only needs to work out the expressions of ${\mathrm{Ad}}_{\varphi }$ and ${\mathrm{ad}}_{v}$ in this context. Recall that ${\mathrm{Ad}}_{\varphi }w$ was defined by $(\varphi w)\varphi ^{-1}$; $\varphi w$ being the differential of the left translation (i.e., $\partial _{t}(\varphi \circ \psi (t))(0)$ with $\psi (0) ={ \mathrm{id}}$ and $\partial _{t}$ …
