Template Matching in Color Images

be an Euclidean or digital space (i.e. $E=\mathbb {R}^n$ or $E=\mathbb {Z}^n$ with $n\in \mathbb {N}^*$ ). Let be a subset of . Let ${\mathcal {P}}\left( X \right)$ denote the class of all subsets of : ${\mathcal {P}}\left( X \right) =\left\{ Y\subseteq X\right\}$ . Let ${X^{c}}$ denote the complement of , i.e. the set of all points of that do not belong to : ${X^{c}}=\left\{ y\in E\, | \, y\notin X\right\}$ . Let $\breve{{X}}$ denote the reflection of , i.e. the set transformed by a central symmetry: $\breve{{X}}=\left\{ -x \, | \, x\in X\right\}$ . Finally, we will denote by the translate of by $p\in E$ defined by $X_p=\left\{ x+p \, | \, x\in X\right\}$ .

The basic operators of erosion and dilation of

by the Structuring Element (SE) $B\in {\mathcal {P}}\left( E\right)$ are written respectively $\varepsilon _B(X)$ and $\delta _B(X)$ . They are defined using the Minkowski subtraction ( $\ominus$ ) and addition ( $\oplus$ ):

$\begin{aligned} \varepsilon _B(X)=X\ominus B=\bigcap _{b\in B}X_{-b} \end{aligned}$

(1)

$\begin{aligned} \delta _B(X)=X\oplus B=\bigcup _{b\in B}X_{b} \end{aligned}$

(2)

2.2 Binary HMT

Being given a subset

, the principle of the HMT is to look for all positions

where a structuring element

called the foreground fits in the shape ( $F_p\subseteq X$ ) while another structuring element

called the background fits in the complement of

( $B_p\subseteq {X^{c}}$ ). The HMT of

by the structuring elements

, noted $\mathrm HMT _{F,B}(X)$ , can be defined in terms of Minkowski subtraction and addition (structuring erosion and dilation):

$\begin{aligned} \mathrm HMT _{F,B}(X)&= \left\{ p \in X \, | \, F_p\subseteq X \text { and } B_p\subseteq {X^{c}}\right\} \nonumber \\&= \left\{ p \in X \, | \, F_p\subseteq X \subseteq B_p^c\right\} \nonumber \\&= (X \ominus F) \cap ({X^{c}} \ominus B) \nonumber \\&= \varepsilon _F(X) \backslash \delta _{\breve{{B}}}(X) \end{aligned}$

(3)

A direct consequence of this definition is that if

and

have a non-empty intersection, then $\mathrm HMT _{F,B}(X)$ is empty for all

in ${\mathcal {P}}\left( E \right)$ (a given pixel cannot simultaneously belongs to the foreground and to the background). One can also note that: first, the second formulation of (Eq. 3) is known as the interval operator from [15] and second, the complementation does not appear in the last formulation which allows to naturally extend the HMT to gray-level images where complementation is not defined.

Figure 1 illustrates the application of the hit-or-miss transform on a binary image. Our goal here is to detect $3\,\times 3$ squares with possible extensions along edges. It is noteworthy that the two structuring elements do not need to cover the whole neighborhood of a pixel and some pixels, as the middle of the edges of the square, can be excluded from the decision process by neither putting them in

nor in

. This is a first attempt to ensure robustness to uncertainty or noise. Some more advanced solutions, not specific to binary images will be reviewed in Sec. 5.

Fig. 1

Example of application of binary HMT to detect 3 x 3 squares with possible extensions along edges [27]. The pixels belonging to the images are in black while those belonging to the complement are in white. The top left image represents the foreground SE: $$F$$

, origin of the image is at the centre of the square. The bottom left image is the background SE $$B$$

, with same origin as $$F$$

. The middle image is the original image $$I$$

. And the right image is the result of the HMT applied on $$I$$

with SEs

and

3 Template Matching in Grayscale Images

We recall here basics of Mathematical Morphology in the grayscale case, with adequate definitions and notations. The various definitions of the HMT on grayscale images (e.g.,[8, 23, 31, 34]) will then be presented and discussed, with comprehensive examples.

3.1 Grayscale Mathematical Morphology

We now consider the case of grayscale images, i.e.of mappings from $$E$$

to the set of values $$T$$

(

). Moreover, we assume that $$T$$

is a complete lattice, i.e. a non empty set equipped with a partial ordering $\le$ , an infimum $\bigwedge$ and a supremum $\bigvee$ such that the infimum and supremum of any non empty subset of $$T$$

belongs to $$T$$

(for all $A\subseteq T$ , $\bigwedge A\in T$ and $\bigvee A\in T$ ). The lattice $$T$$

is thus bounded by its least element $\bot =\bigwedge T$ and its greatest element $\top =\bigvee T$ . For grayscale images we usually take $T=\overline{\mathbb {R}}=\mathbb {R}\cup \left\{ -\infty ,\infty \right\}$ or $T=\overline{\mathbb {Z}}=\mathbb {Z}\cup \left\{ -\infty ,\infty \right\}$ . The notion of structuring element naturally extends to the one of structuring function (SF) which is simply an element of $$T^E$$

The definitions of the binary erosion (Eq. 1) and binary dilation (Eq. 2) then naturally extend to grayscale images. Let $$F$$

be a grayscale image and $$V$$

be a structuring function. The grayscale erosion $\varepsilon _V(F)$ and dilation $\delta _V(F)$ of $$F$$

are then defined for all $p\in E$ by:

$\begin{aligned} \varepsilon _V(F)(p)=\bigwedge _{x\in {{\mathrm{supp}}}(V)}\left\{ F(p+x)-V(x)\right\} \end{aligned}$

(4)

$\begin{aligned} \delta _V(F)(p)=\bigvee _{x\in {{\mathrm{supp}}}(V)}\left\{ F(p-x)+V(x)\right\} \end{aligned}$

(5)

where ${{\mathrm{supp}}}(V)$ is the support of $$V$$

, i.e.the set of points of $$E$$

where

is greater than the least element $\bot$ : ${{\mathrm{supp}}}(V)=\left\{ p\in E\, | \, V(p)\ne \bot \right\}$ . These formulas can lead to disinclination like $+\infty$ plus $-\infty$ . To keep consistency $+\infty$ plus $-\infty$ must be valued as $-\infty$ in Eq. 5 and as $+\infty$ in Eq. 4.

3.2 Grayscale HMT

The binary HMT was first extended to grayscale images by Ronse [31] and then many other definitions followed: by Shaefer [32], Khosravi [18], Raducanu [29], Soille [36], Barat [7], Perret [27]. A recent survey on grayscale HMT is given in [22]. These various definitions have been recently unified into a common theoretical framework for graylevel interval operators [23]. In the binary case, the interval operator (second line of Eq. 3) looks for each translation $p\in E$ if the image fits between the background SE translated at $$p$$

and the complement of the background SE also translated at $$p$$

. In the grayscale case, the sought template is translated not only horizontally (by a point $p \in E$ ), but vertically as well (by a finite graylevel $t \in T$ ) in an attempt to detect the positions where it fits (Fig. 2). Specifically, we will note $V_{(p,t)}$ the translation of $V\in T^E$ by a couple $(p,t)\in E\times T$ : for all $x\in E$ , $V_{(p,t)}(x)=V(x-p) + t$ .

Fig. 2

The integral interval operator, (Eq. 13) [6]

According to the unified theory for grayscale HMT [23], a grayscale HMT is decomposed into two stages:

the fitting, where the locations fitting the given structuring functions, describing the sought template, are computed,
and valuation, where the resulting image containing the previously detected locations is constructed.

Let us observe that in the binary case, the fitted pixels are the ones retained by the HMT, and their valuation is simply equal to one (or foreground).

3.2.1 Fitting

We first describe the different fittings. Let $F\in T^E$ be a grayscale image and $V,W\in T^E$ be a couple of structuring functions describing the sought template such that $V \le W$ . Formally a fitting is a mapping from $$T^E$$

into ${\mathcal {P}}\left( E\times T' \right)$ where $$T'$$

can be a subset of $$T$$

or any set of values. A first fitting involved in Ronse’s HMT is given by:

$\begin{aligned} H_{V,W}(F)&= \left\{ (p,t) \in E \times T ~ | ~ V_{(p,t)} \le F \le W_{(p,t)} \right\} \end{aligned}$

(6)

$\begin{aligned}&= \left\{ (p,t) \in E \times T \, | \, \varepsilon _V(F)(p) \le t \le \delta _{W^{*}}(F)(p)\right\} \end{aligned}$

(7)

where $W^{*}:x \rightarrow -W(-x)$ , is the dual of $$W$$

. $H_{V,W}(F)$ is thus the set of all translations where the image lies between both structuring functions (Fig. 3).

Fig. 3

Illustration of $H_{V,W}$ (Eq. (6)). We consider a function $$F$$

from $E=\mathbb {Z}$ into $T=\overline{\mathbb {Z}}$ and the two structuring functions $$V$$

and

(the origin is on the left pixel of $$V$$

). The result of $H_{V,W}(F)$ is the set of points of $E\times T$ marked by a black circle inside $$F$$

. We also give the result of the application of the supremal valuation (Eq. (12)) of $H_{V,W}(F)$ in gray, i.e. the result of $RHMT_{V,W}(F)$ (Eq. (14)).

A second fitting, involved in Soille’s and Barat’s HMT is:

$\begin{aligned} K_{V,W}(F)&= \left\{ (p,t) \in E \times T ~ | ~ V_{(p,t)} \le F \ll W_{(p,t)} \right\} \end{aligned}$

(8)

$\begin{aligned}&= \left\{ (p,t) \in E \times T \, | \, \varepsilon _V(F)(p) \le t < \delta _{W^{*}}(F)(p) \right\} \end{aligned}$

(9)

where $F \ll W$ means that there is some $$h > 0$$” src=”/wp-content/uploads/2016/03/A308467_1_En_8_Chapter_IEq128.gif”></SPAN> such that for every <SPAN id=IEq129 class=InlineEquation><IMG alt=

$$h > 0$$” src=”/wp-content/uploads/2016/03/A308467_1_En_8_Chapter_IEq128.gif”></SPAN> such that for every <SPAN id=IEq129 class=InlineEquation><IMG alt=

we have $F(p) \le W(p) - h$ . $K_{V,W}(F)$ is thus the set of all translations where the image lies between both structuring functions and does not touch $$W$$

(Fig. 4).

Fig. 4

Illustration of $K_{V,W}$ (Eq. 8). We consider a function $$F$$

from $E=\mathbb {Z}$ into $T=\overline{\mathbb {Z}}$ and the two structuring functions $$V$$

and

(the origin is on the left pixel of $$V$$

). The result of $K_{V,W}(F)$ is the set of points of $E\times T$ marked by a black circle inside $$F$$

. We also give the result of the application of the integral valuation (Eq. 13) of $K_{V,W}(F)$ in gray, i.e. the result of $SHMT_{V,W}(F)$ (Eq. 15)

Fig. 5

Example of application of the $P_{V,W}(F)$ (Eq. 10) to detect an exponential profile with Gaussian noise [27]. The first image represents a 1D noisy signal. The second image represents the uncertainty area defined by the 2 SFs $$V$$

(lower function) and $$W$$

(upper function). The third image shows how well the pattern can fit the signal

Finally, a third fitting involved in Perret’s HMT is given by:

$\begin{aligned} P_{V,W}(F) = \left\{ \left( p,\textstyle {\frac{ |\left\{ q \in S \, | \, V(q)+t\le F(p+q)\le W(q)+t\right\} | }{|S|}}\right) \in E \times \left[ 0,1\right] ~\big |~ t\in T\right\} \end{aligned}$

(10)

where $S=\left\{ x\in E\, | \, V(x) \ne \bot \mathrm{ or } W(x) \ne \top \right\}$ and $$|X|$$

is the cardinal of the set $$X$$

. Hence, $P_{V,W}(F)$ does not look for locations where both structuring functions completely fit the image. Instead, it measures at each location how well both structuring functions fit the image by counting, among the points of the image that lies in the support of one of the SF, the ratio of points that fits between both SF (Fig. 5).

3.2.2 Valuation

Likewise, there are several types of valuations. Formally, a valuation $\eta$ is a mapping that associates a (grayscale) image ( $$T^E$$

) to any result of a fitting $\left( {\mathcal {P}}\left( E\times T' \right) \right)$ . Let $Y\in {\mathcal {P}}\left( E\times T' \right)$ , the simplest valuation $\eta ^B(Y)$ , the binary valuation, consists in taking the set of points $p \in E$ for which there is at least one $t \in T$ such that $$(p,t)$$

belongs to $$Y$$

$\begin{aligned} \eta ^B(Y)(p) = {\left\{ \begin{array}{ll} 1\; \mathrm{ if }\; \exists \,t\in T', (p,t)\in Y \\ 0\; \mathrm{otherwise } \end{array}\right. }. \end{aligned}$

(11)

Another valuation is the supremal valuation $\eta ^S(Y)$ , which for every point $p\in E$ of fit couples $\left\{ (p,t)\right\} \subseteq Y$ , takes the supremum of $$t$$

(Fig. 3):

$\begin{aligned} \eta ^S(Y)(p) = \sup \left\{ t \, | \, (p,t)\in Y\right\} \end{aligned}$

(12)

Finally there is the integral valuation $\eta ^S(Y)$ which instead for every point $$p$$

of fit couples $\left\{ (p,t)\right\} \subseteq Y$ , uses the length of the interval of $$t$$

for which the couples $$(p,t)$$

fit (Fig. 4).

$\begin{aligned} \eta ^I(Y)(p) = \mathrm{mes }\left\{ t \, | \, (p,t)\in Y\right\} \end{aligned}$

(13)

where $\mathrm{mes }$ is an appropriate measure of intervals in $$T'$$

(simply the Lebesgue measure if $$T'$$

is an interval of $\mathbb {R}$ or $\mathbb {Z}$ ).

3.3 HMT Definitions

The different combinations of fittings and valuations allow to recover the different definitions of HMT. Let $F,V,W\in T^E$ with $V\le W$ and $p\in E$ , among others, we mention:

$\begin{aligned} \mathrm{(Ronse) } \; RHMT_{V,W}(F)(p)&=\eta ^S(H_{V,W}(F))(p) \nonumber \\&=\left\{ \begin{array}{ll} \varepsilon _V(F)(p) &{} \text {if}\, \varepsilon _V(F)(p) \ge \delta _{W^*}(F)(p),\\ \bot &{} \text {otherwise} \end{array} \right. \end{aligned}$

(14)

$\begin{aligned} \mathrm{(Soille) } \; SHMT_{V,W}(F)(p)&=\eta ^I(K_{V,W}(F))(p) \nonumber \\&=\max \left\{ \varepsilon _V(F)(p) - \delta _{W^{*}}(F)(p),0\right\} \end{aligned}$

(15)

$\begin{aligned} \mathrm{(Perret) } \; PHMT_{V,W}(F)(p)&=\eta ^S(P_{V,W}(F))(p) \nonumber \\&= \max _{t\in T} \frac{ |\left\{ q \in S \, | \, V(q)+t\le F(p+q)\le W(q)+t\right\} |}{|S|} \end{aligned}$

(16)

Fig. 6

Grayscale HMT : a original image ( $64 \times 64$ pixels); b couple of structuring functions used within the HMT; c result with Ronse’s, d and Soille’s definition (results are displayed in inverse gray levels and for the sake of clarity, $\bot$ value was replaced with $$0$$

) [39]

Ronse’s and Soille’s HMT are illustrated in Fig. 3 and 4 and further compared in Fig. 6 while an application of Perret’s HMT is given in Sect. 6.1.

4 Template Matching in Color and Multivariate Images

The extension of the HMT to color and more generally to multivariate images such as multi- and hyper-spectral data, can amplify the application potential of the operator at a significant level. Of particular importance in this context are applications regarding color object detection from color data, target detection from remote sensing data, as well as the detection of challenging celestial objects from astronomical imaging sources.

However, given the multitude of existing approaches for the definition of the grayscale HMT, it is no surprise that the situation is no less clearer with multivariate images. In fact, conversely to grayscale morphological image processing, there is not even a generally accepted color mathematical morphology framework, let alone a standardized multivariate HMT. Nevertheless, the increasingly widespread availability of multivariate images, especially in the context of remote sensing, as well as recent advances in color morphology, have provided the missing impetus that has led to the intensification of the research work on extending the HMT to color, multi- and hyper-spectral data.

We will start this section by first recalling the basic theoretical implications of using multivariate images to the mathematical morphology theory, and elaborate on the difficulties of its extension to this type of images. Then, we will present the different methods that have been developed especially in the past few years for applying the HMT to multivariate images.

4.1 Multivariate Mathematical Morphology

According to the lattice based approach to mathematical morphology [30], digital images are represented as mappings $F: E \rightarrow T$ between the discrete coordinate grid $$E$$

and the set of pixel values $$T$$

, with

being a complete lattice. More precisely, imposing a complete lattice structure on an arbitrary set of pixel values $$T$$

is possible, if $$T$$

is equipped with at least a partial ordering relation which enables the computation of the infimum and supremum of any non-empty subset of $$T$$

. Consequently, the morphological operators can be in fact applied to any type of image data, as long as the set of pixel values $$T$$

possesses a complete lattice structure. For a detailed account of multivariate morphology the reader is referred to Refs. [14, 33].

For instance, in the case of continuous multidimensional grayscale images where $F : \mathbb {R}^{d} \rightarrow \overline{\mathbb {R}}$ , it suffices to employ the usual comparison operator $\le$ , in order to induce a complete lattice structure on $\overline{\mathbb {R}}$ . Likewise, the inclusion operator $\subseteq$ can be used with binary images $F : \mathbb {R}^{d} \rightarrow \left\{ 0,1\right\}$ . However, if we now consider multivariate images $$$\mathbf {F} : \mathbb {R}^{d} \rightarrow \overline{\mathbb {R}}^n, n > 1$$” src=”/wp-content/uploads/2016/03/A308467_1_En_8_Chapter_IEq194.gif”></SPAN>, where <SPAN id=IEq195 class=InlineEquation><IMG alt=$ for the specific case of color images, it becomes problematic to find an ordering relation for the vectors of $\overline{\mathbb {R}}^n$ , due to the fact that there is no universal method for ordering multivariate data. As a result, in the last 15 years several ordering approaches have been explored with the end of extending mathematical morphology to multivariate images, for a detailed survey of which the reader is referred to Ref. [2].

We briefly recall the main properties of an ordering, i.e.a binary relation $\le$ on a set $\mathcal {T}$ being reflexive ( $\forall \,x \in \mathcal {T},x \, \le \, x$ ), anti-symmetric ( $\forall \, x,y \in \mathcal {T},x \, \le \, y$ and $y \, \le \, x \Rightarrow x = y$ ), and transitive ( $\forall \, x,y,w \in \mathcal {T},x \, \le \, y$ and $y \, \le \, w \Rightarrow x \, \le \, w$ ). An ordering is total if the totality statement ( $\forall \, x,y \in \mathcal {T},x \, \le \, y$ or $y \, \le \, x$ ) holds, partial otherwise. It is a pre-ordering if the anti-symmetry statement does not hold.

Hence, the HMT being a morphological tool, its extension to multivariate images also requires the implication of a vector ordering. We will now proceed to examine various solutions developed for applying the HMT to color or to multivariate data in general.

4.2 HMT Based on a Vector Ordering

Aptoula et al. [6] provide the first study on the theoretical requirements of a vector HMT (VHMT). In detail, the initial step for extending the HMT to multivariate data, consists in defining the erosion and dilation operators for multivariate images in combination with multivariate structuring functions (SF). More precisely, these operators are based on horizontal translations (by a point $p \in E$ ) as well as on vertical ones (by a finite pixel value $\mathbf t \in T$ ) as in the grayscale case, the difference is however that pixel values are now multi-dimensional; in particular, given a multivariate image $\mathbf {F} : E \rightarrow T$ :

$\begin{aligned} \forall \,(p,\mathbf t ) \in E \times T, \mathbf F _{(p,\mathbf t )}(x) = \mathbf F (x-p) + \mathbf t \end{aligned}$

(17)

Furthermore, according to the fundamental Refs. [16, 17] of Heijmans and Ronse, translations need to be complete lattice automorphisms ( $\textit{i.e.}$ bijections $T \rightarrow T$ that preserve order, and whose inverse also preserve order). Consequently, the vector ordering $(\le _v)$ from which the complete lattice is derived, must be translation invariant. In other words:

$\begin{aligned} \forall \, \mathbf w ,\mathbf w ^{\prime },\mathbf t \in T, \mathbf w \le _v \mathbf w ^{\prime } \Leftrightarrow \mathbf w + \mathbf t \le _v \mathbf w ^{\prime } + \mathbf t \end{aligned}$

(18)

Thus one can give the definition of the erosion and dilation respectively of a multivariate image F by a multivariate SF B:

$\begin{aligned} \varvec{\varepsilon }_\mathbf{B }(\mathbf F )(p)&= \underset{x\,\in \,{{\mathrm{supp}}}(\mathbf B )}{{{\mathrm{inf_{\textit{v}}}}}}\left\{ \mathbf F (p+x) - \mathbf B (x)\right\} \end{aligned}$