Image Segmentation by Weighted Image Gradient Norm Terms Based on Local Histogram and Active Contours

Let $\varOmega$ be an open and bounded set in ${\mathbb {R}}^{2}$, and $I :\overline{\varOmega }\longrightarrow [0,L]$ a scalar function representing the observed grayscale input image. Based on the CV model [12, 15], the minimization problem to consider is

$\begin{aligned} \begin{array}{ll} \min \limits _{\begin{array}{c} u\in \{0,1\} \\ {\mathbf {c}} \end{array}}\Bigg \{G({\mathbf {c}},u)=\displaystyle \int _{\varOmega }\,|\nabla u |\, {\mathrm {d}}{\mathbf {x}}&{} +\displaystyle \sum _{i=1}^{n}\lambda ^{in}_{i}\displaystyle \int _{\varOmega }r_{i}\big (c_{i}^{in},f_{i}(I)\big )u\, {\mathrm {d}}{\mathbf {x}}\\ &{}+ \displaystyle \sum _{i=1}^{n}\lambda ^{out}_{i}\displaystyle \int _{\varOmega }r_{i}\big (c_{i}^{out},f_{i}(I)\big )(1-u)\, {\mathrm {d}}{\mathbf {x}}\Bigg \}, \end{array} \end{aligned}$

(1)

where, for each $i$, $f_{i}(I)$ is a feature input channel depending on the image, and the vector ${\mathbf {c}}=({\mathbf {c}}^{in},{\mathbf {c}}^{out})$, with ${\mathbf {c}}^{in}=(c_{i}^{in})$ and ${\mathbf {c}}^{out}=(c_{i}^{out})$, collects the unknown averages of $f_{i}(I)$ inside and outside the segmented region, respectively. The fitting residuals are $r_{i}(c_{i}^{in},f_{i}(I)) = (f_{i}(I)-c_{i}^{in})^{2}$ and $r_{i}(c_{i}^{out},f_{i}(I)) = (f_{i}(I)-c_{i}^{out})^{2}$. The vector ${\lambda }=(\lambda ^{in}_{i},\lambda ^{out}_{i})$ has constant components that weight the fitting terms.
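As a concrete reading of (1), the following sketch evaluates a discrete version of $G$ for a single feature channel ($n=1$) on a small pixel grid. The forward-difference discretization of the total variation term, the function names and the toy data are assumptions made for illustration and are not taken from the chapter.

```python
import math

def fitting_residual(c, f):
    """r(c, f) = (f - c)^2 applied pixel-wise to a 2D list f."""
    return [[(v - c) ** 2 for v in row] for row in f]

def total_variation(u):
    """Discrete isotropic TV with forward differences (assumed discretization)."""
    h, w = len(u), len(u[0])
    tv = 0.0
    for y in range(h):
        for x in range(w):
            dx = u[y][x + 1] - u[y][x] if x + 1 < w else 0.0
            dy = u[y + 1][x] - u[y][x] if y + 1 < h else 0.0
            tv += math.hypot(dx, dy)
    return tv

def energy(u, f, c_in, c_out, lam_in=1.0, lam_out=1.0):
    """Discrete G(c, u) of (1) for a single channel f."""
    r_in = fitting_residual(c_in, f)
    r_out = fitting_residual(c_out, f)
    data = 0.0
    for y in range(len(u)):
        for x in range(len(u[0])):
            data += lam_in * r_in[y][x] * u[y][x]
            data += lam_out * r_out[y][x] * (1 - u[y][x])
    return total_variation(u) + data

# Toy feature channel: a bright 2x2 square inside a dark 4x4 image.
f = [[0, 0, 0, 0],
     [0, 9, 9, 0],
     [0, 9, 9, 0],
     [0, 0, 0, 0]]
u_good = [[1 if v == 9 else 0 for v in row] for row in f]  # marks the square
u_bad = [[0] * 4 for _ in range(4)]                        # marks nothing

print(energy(u_good, f, c_in=9, c_out=0))  # only the boundary length remains
print(energy(u_bad, f, c_in=9, c_out=0))   # pays the full outside residual
```

With the correct region and averages, both fitting terms vanish and only the length term contributes, which is why the labeling that matches the square has the lower energy.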

Fig. 1

a Image with texture information, b input feature $f_{1}(I)$ computed from (a) with $\sigma =10$ , see (2)

2.1 The Proposed Feature Input Channels

Derivatives are a powerful tool for the extraction of features in images with texture (see e.g. [23]). Given a grayscale image containing two different textures, it is possible to discriminate between them using the gradient vector and its norm [19].

Indeed, given a scalar image $I:\varOmega \subset {\mathbb {R}}^2\rightarrow {\mathbb {R}}$ , the new input channel is

$\begin{aligned} f_{1}(I) = (|\nabla I |)_{\sigma }, \end{aligned}$

(2)

where the subscript $\sigma$ in (2) denotes a smoothed version of the indexed function. Smoothing is done using the Gaussian function

$\begin{aligned} \displaystyle G_\sigma ({\mathbf {x}}) = (2\pi \sigma )^{-1}e^{-\frac{|{\mathbf {x}}|^2}{2\sigma }}. \end{aligned}$

(3)

This new input channel $f_{1}(I)$ embodies different texture information. Nevertheless, it is not able to differentiate between textures of similar scale in natural images, as can be seen in Fig. 1b. The input image, Fig. 1a, is an example of a natural image containing a textured object. When the image has comparable texture information both inside and outside the object of interest, the feature channel $f_{1}(I)$ cannot adequately distinguish the object itself, potentially leading to poor segmentation.
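A minimal sketch of how the channel (2) might be computed on a pixel grid is given below. The forward-difference gradient, the replicated border handling, the kernel truncation radius and all function names are assumptions for illustration. Following (3) literally, $\sigma$ enters the exponent as the variance, and the continuous factor $(2\pi \sigma )^{-1}$ is replaced by a discrete normalization so that the kernel sums to one.

```python
import math

def gradient_norm(img):
    """|grad I| by forward differences with a replicated border (assumed scheme)."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            ix = img[y][min(x + 1, w - 1)] - img[y][x]
            iy = img[min(y + 1, h - 1)][x] - img[y][x]
            out[y][x] = math.hypot(ix, iy)
    return out

def gaussian_kernel_1d(sigma, radius):
    """Samples of exp(-t^2 / (2 sigma)), normalized to sum to 1.

    As in (3), sigma plays the role of the variance.
    """
    k = [math.exp(-(t * t) / (2.0 * sigma)) for t in range(-radius, radius + 1)]
    s = sum(k)
    return [v / s for v in k]

def smooth(img, sigma, radius=None):
    """Separable Gaussian smoothing with replicated borders."""
    if radius is None:
        # Truncate at about three standard deviations (assumed choice).
        radius = max(1, int(math.ceil(3.0 * math.sqrt(sigma))))
    k = gaussian_kernel_1d(sigma, radius)
    h, w = len(img), len(img[0])
    # Horizontal pass, then vertical pass.
    rows = [[sum(k[j] * img[y][min(max(x + j - radius, 0), w - 1)]
                 for j in range(len(k))) for x in range(w)] for y in range(h)]
    return [[sum(k[j] * rows[min(max(y + j - radius, 0), h - 1)][x]
                 for j in range(len(k))) for x in range(w)] for y in range(h)]

def f1(img, sigma):
    """Feature channel (2): smoothed gradient norm."""
    return smooth(gradient_norm(img), sigma)
```

The two one-dimensional passes are equivalent to a two-dimensional convolution because the Gaussian in (3) factorizes into a product of one-dimensional Gaussians.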

To overcome this difficulty and to obtain a good input channel for the minimization approach (1), an image feature data descriptor is proposed using functions $\varPsi :\varOmega \rightarrow {\mathbb {R}}$ which contain information about image intensity. These functions are combined with the channel $|\nabla I |$ by multiplication. The main idea behind using $\varPsi$ is to help identify the edges between the texture object and the background. In this chapter, two different choices of the function $\varPsi$ are proposed.

$\bullet$ The first natural choice is $\varPsi _{1} = I$ (global information).

$\bullet$ Another choice for $\varPsi$, called $\varPsi _{2}$, is based on the local cumulative distribution in a neighborhood of a given pixel: the total intensity is computed as the area under the local cumulative distribution curve (local information).

2.2 Weighted Gradient Based Fitting Terms I

Given a grayscale image $I:\varOmega \rightarrow [0,L]$ , using the Gaussian function (3) the following smoothed input channel is obtained:

$\begin{aligned} f_{2}(I) = (|\nabla I|\varPsi _{1})_{\sigma }=(|\nabla I|I)_{\sigma }. \end{aligned}$

(4)
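The effect of the intensity weight in (4) can be illustrated on synthetic data. The sketch below builds two striped textures with the same contrast but different brightness, and compares the gradient norm with the product $|\nabla I|\,I$. The stripe image, the forward-difference gradient and the use of plain region averages in place of the Gaussian smoothing are all simplifying assumptions.

```python
import math

def gradient_norm(img):
    """|grad I| by forward differences with a replicated border (assumed scheme)."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            ix = img[y][min(x + 1, w - 1)] - img[y][x]
            iy = img[min(y + 1, h - 1)][x] - img[y][x]
            out[y][x] = math.hypot(ix, iy)
    return out

def stripes(width=16, height=4):
    """Left half alternates 10/20, right half alternates 110/120 (toy data)."""
    row = [(10 if x % 2 == 0 else 20) + (100 if x >= width // 2 else 0)
           for x in range(width)]
    return [list(row) for _ in range(height)]

def region_mean(channel, x0, x1):
    """Mean of a 2D channel over columns x0..x1-1 and all rows."""
    vals = [row[x] for row in channel for x in range(x0, x1)]
    return sum(vals) / len(vals)

image = stripes()
grad = gradient_norm(image)
weighted = [[g * i for g, i in zip(g_row, i_row)]
            for g_row, i_row in zip(grad, image)]

# Interior columns only, to stay away from the jump between the two halves.
print(region_mean(grad, 0, 7), region_mean(grad, 8, 15))          # equal
print(region_mean(weighted, 0, 7), region_mean(weighted, 8, 15))  # differ
```

In this toy case the gradient norm alone gives the same average on both textures, whereas the intensity-weighted channel separates them.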

Then, the new fitting term is

$\begin{aligned} \lambda \int _{\varOmega }r_{2}\big (c_{2}^{in}, f_{2}(I)\big )u\,{\mathrm {d}}{\mathbf {x}}+ \lambda \int _{\varOmega }r_{2}\big (c_{2}^{out}, f_{2}(I)\big )(1-u)\,{\mathrm {d}}{\mathbf {x}}. \end{aligned}$

(5)

It should be noted that $f_{2}(I)$ involves computing three feature channels for each scale, $I_{x}$, $I_{y}$ and $|\nabla I|$, which differentiates the proposed model from approaches based on Gabor filters [33, 36] for texture image segmentation. Gabor filters have the drawback of producing many redundant feature channels. Other texture features involving the image gradient were also considered by Brox et al. [7, 35]. They used the structure tensor defined by the matrix

$\begin{aligned} \left( \begin{array}{cc} I^{2}_{x} &{} I_{x}I_{y}\\ I_{y}I_{x} &{} I^{2}_{y} \\ \end{array} \right) . \end{aligned}$

(6)

2.2.1 Color Extension

The proposed model can be extended to color images using the vector-valued extension of the CV model [13]. For a color image ${\mathbf {I}}=(I_{1},I_{2},I_{3})$, where $I_{1}, I_{2}$ and $I_{3}$ represent the red, green and blue input channels, respectively, the new fitting terms are

$\begin{aligned} \lambda \sum _{j=1}^{3}\int _{\varOmega }r_{2_{j}}\big (c_{2_{j}}^{in},f_{2}(I_{j})\big )\,u\,{\mathrm {d}}{\mathbf {x}}+\lambda \sum _{j=1}^{3}\int _{\varOmega }r_{2_{j}}\big (c_{2_{j}}^{out},f_{2}(I_{j})\big )\,(1-u)\,{\mathrm {d}}{\mathbf {x}}. \end{aligned}$

(7)

2.3 Weighted Gradient Based Fitting Terms II

Another weighted gradient based fitting term is proposed herein using local histograms. Local histograms give a complete description of image intensity around each pixel and do not depend on regions [2, 11, 32]. For a given grayscale image $I :\overline{\varOmega }\longrightarrow [0,L]$, let ${\fancyscript{N}}_{{\mathbf {x}},r}$ be the local region centered at a pixel ${\mathbf {x}}$ with radius $r$. The local histogram of a pixel ${\mathbf {x}}\in \varOmega$ and its corresponding cumulative distribution function for the input channel $I$ are defined, respectively, by

$\begin{aligned} P_{{\mathbf {x}}}(I,{\mathsf {y}}) = \frac{|\{{\mathbf {z}}\in {\fancyscript{N}}_{{\mathbf {x}},r}\cap \varOmega \,|\, I({\mathbf {z}})={\mathsf {y}}\}|}{|{\fancyscript{N}}_{{\mathbf {x}},r}\cap \varOmega |}, \end{aligned}$

and

$\begin{aligned} F_{{\mathbf {x}}}(I,{\mathsf {y}}) = \frac{|\{{\mathbf {z}}\in {\fancyscript{N}}_{{\mathbf {x}},r}\cap \varOmega \,|\, I({\mathbf {z}})\le {\mathsf {y}}\}|}{|{\fancyscript{N}}_{{\mathbf {x}},r}\cap \varOmega |}, \end{aligned}$

for $0\le {\mathsf {y}}\le L$. Then, $\varPsi _{2,I}:\varOmega \rightarrow {\mathbb {R}}$ is defined by

$\begin{aligned} \varPsi _{2,I}({\mathbf {x}}) = \int _{0}^{L} F_{{\mathbf {x}}}(I,{\mathsf {y}})\,{\mathrm {d}}{\mathsf {y}}. \end{aligned}$
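The local cumulative distribution and the area under it can be sketched as follows. The sketch assumes integer intensities in $\{0,\dots ,L\}$ and a disc-shaped neighborhood; the function names are illustrative. Under these assumptions $F_{{\mathbf {x}}}(I,\cdot )$ is a step function that is constant on each interval $[k,k+1)$, so the integral over $[0,L]$ equals the sum of $F_{{\mathbf {x}}}(I,k)$ for $k=0,\dots ,L-1$. Since $\int _{0}^{L}{1\!\!1}_{\{I({\mathbf {z}})\le {\mathsf {y}}\}}\,{\mathrm {d}}{\mathsf {y}} = L-I({\mathbf {z}})$ for each pixel, the result coincides with $L$ minus the mean intensity over the neighborhood, which gives a simple consistency check.

```python
def neighborhood(h, w, cy, cx, r):
    """Pixels of the disc of radius r around (cy, cx), clipped to the image."""
    return [(y, x)
            for y in range(max(0, cy - r), min(h, cy + r + 1))
            for x in range(max(0, cx - r), min(w, cx + r + 1))
            if (y - cy) ** 2 + (x - cx) ** 2 <= r * r]

def local_cdf(img, cy, cx, r, L):
    """F_x(I, y) for the integer levels y = 0, ..., L."""
    pts = neighborhood(len(img), len(img[0]), cy, cx, r)
    return [sum(1 for (y, x) in pts if img[y][x] <= level) / len(pts)
            for level in range(L + 1)]

def psi2(img, cy, cx, r, L):
    """Area under the local CDF on [0, L] for integer-valued images."""
    return sum(local_cdf(img, cy, cx, r, L)[:L])

def local_mean(img, cy, cx, r):
    """Mean intensity over the same clipped disc."""
    pts = neighborhood(len(img), len(img[0]), cy, cx, r)
    return sum(img[y][x] for (y, x) in pts) / len(pts)

img = [[0, 50, 100],
       [150, 200, 250],
       [10, 20, 30]]
L = 255
print(psi2(img, 1, 1, 1, L), L - local_mean(img, 1, 1, 1))  # the two agree
```

In practice the same identity suggests that the weight could be obtained from a local average without building the histogram explicitly, although the explicit form is kept here to mirror the definition.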

With $\varPsi _{2,I}({\mathbf {x}})$, the total intensity weight within the neighborhood ${\fancyscript{N}}_{{\mathbf {x}},r}$ of ${\mathbf {x}}$ is obtained, in contrast with $\varPsi _{1}({\mathbf {x}})=I({\mathbf {x}})$, which only gives the intensity value at ${\mathbf {x}}$. Together with the image gradient norm, the following input channel is considered:

$\begin{aligned} f_{3}(I)({\mathbf {x}}) =\varPsi _{2,I}({\mathbf {x}}) f_{1}(I)({\mathbf {x}})=\varPsi _{2,I}({\mathbf {x}})|\nabla I({\mathbf {x}})|_{\sigma }, \end{aligned}$

(8)

with an associated fitting term

$\begin{aligned} \lambda \int _{\varOmega }r_{3}(c_{3}^{in},f_{3}(I))\,u\,{\mathrm {d}}{\mathbf {x}} +\lambda \int _{\varOmega }r_{3}(c_{3}^{out},f_{3}(I))\,(1-u)\,{\mathrm {d}}{\mathbf {x}}. \end{aligned}$

The notion of local histogram for texture image segmentation has also been taken into account by Ni et al. [22, 32]. The current approach differs from previous work mainly because it does not use the Wasserstein distance in the segmentation formulation.

Fig. 2

Local cumulative distribution functions of two images with different texture information. a Input images. b The scaled pixel-values of the input images showing the nature of texture present. c Corresponding local cumulative distribution functions inside the region of interest using $r=10$. d The scaled pixel-values of the gradient image. e Corresponding local cumulative distribution functions inside the region of interest in (d) using $r=10$

Usually, two different types of texture are found in natural images. The first are small scale textures, corresponding to many oscillations inside the main object, and the second consists of different flat regions given by large scale textures within the object of interest. Figure 2 shows two real images with different texture information. Figure 2a shows the two input images, together with their corresponding pixel-valued images (scaled grayscale values) in Fig. 2b and the computed local cumulative distribution functions of the inner region of the main object to be segmented in Fig. 2c, with $r=10$. As shown, the local cumulative distribution function of one image has an almost convex shape over the whole domain, whereas that of the other image oscillates between convex and concave shapes. This is because the latter image has many different flat regions inside the body, which can be considered as large scale textures. The image gradient norm is therefore used in the computation of local histograms in order to apply the weighted gradient fitting term approach. Figure 2d shows how the image gradient norm recovers oscillation patterns inside that image. Furthermore, Fig. 2e shows that the local cumulative distribution functions corresponding to the image gradient norm have an almost single convex shape over the whole domain.

2.3.1 Some Extensions

Extensions for the input channel (8) are listed below.

$\bullet$ The local histogram computation can also be applied to the image gradient norm instead of the original input image $I$. That is, $\varPsi _{2,|\nabla I|}({\mathbf {x}}) = \int _{0}^{L} F_{{\mathbf {x}}}(|\nabla I|,{\mathsf {y}})\,{\mathrm {d}}{\mathsf {y}}$ is considered together with the input channel

$\begin{aligned} f_{4}(I)({\mathbf {x}}) = \varPsi _{2,|\nabla I|}({\mathbf {x}})f_{1}(I)({\mathbf {x}}) = \varPsi _{2,|\nabla I|}({\mathbf {x}})|\nabla I({\mathbf {x}})|_{\sigma }, \end{aligned}$

(9)

and the corresponding fitting term is

$\begin{aligned} \lambda \int _{\varOmega }r_{4}\big (c_{4}^{in},f_{4}(I)\big )\,u\,{\mathrm {d}}{\mathbf {x}} +\lambda \int _{\varOmega }r_{4}\big (c_{4}^{out},f_{4}(I)\big )\,(1-u)\,{\mathrm {d}}{\mathbf {x}}. \end{aligned}$

$\bullet$ By combining (8) and (9), another extension is defined with the data term

$\begin{aligned} \begin{array}{ll} \lambda \displaystyle \int _{\varOmega }\Big (r_{3}\big (c_{3}^{in},f_{3}(I)\big )&{}+\,r_{4}\big (c_{4}^{in},f_{4}(I)\big )\Big )\,u\,{\mathrm {d}}{\mathbf {x}}\\ &{}+\,(1-\lambda )\displaystyle \int _{\varOmega }\Big (r_{3}\big (c_{3}^{out},f_{3}(I)\big )+r_{4}\big (c_{4}^{out},f_{4}(I)\big )\Big )\,(1-u)\,{\mathrm {d}}{\mathbf {x}}, \end{array} \end{aligned}$

(10)

where $0\le \lambda \le 1$ .

$\bullet$ The previous extension can be applied to color images using RGB decomposition. For a texture color image ${\mathbf {I}}=(I_{1},I_{2},I_{3})$ , where $I_{1}, I_{2}$ and $I_{3}$ represent red, green and blue input channels, respectively, the new fitting terms are

$\begin{aligned} \begin{array}{ll} \lambda \displaystyle \sum _{j=1}^{3}\displaystyle \int _{\varOmega }&{}\Big (r_{3_{j}}\big (c_{3_{j}}^{in},f_{3}(I_{j})\big )+r_{4_{j}}\big (c_{4_{j}}^{in},f_{4}(I_{j})\big )\Big )\,u\,{\mathrm {d}}{\mathbf {x}}\\ &{}+\,(1-\lambda )\displaystyle \sum _{j=1}^{3}\displaystyle \int _{\varOmega }\Big (r_{3_{j}}\big (c_{3_{j}}^{out},f_{3}(I_{j})\big )+r_{4_{j}}\big (c_{4_{j}}^{out},f_{4}(I_{j})\big )\Big )\,(1-u)\,{\mathrm {d}}{\mathbf {x}}, \end{array} \end{aligned}$

(11)

where $r_{i_{j}}\big (c_{i_{j}}^{in},f_{i}(I_{j})\big ) = \big (f_{i}(I_{j})-c_{i_{j}}^{in}\big )^{2}$ and $r_{i_{j}}\big (c_{i_{j}}^{out},f_{i}(I_{j})\big ) = \big (f_{i}(I_{j})-c_{i_{j}}^{out}\big )^{2}$.

3 Numerical Solution for the Proposed Model

A two-step methodology is implemented for solving the proposed texture segmentation approach (1). In the first step, $u$ is fixed and the minimization with respect to ${\mathbf {c}}$ is computed. Then, ${\mathbf {c}}$ is fixed and the minimization with respect to $u$ is performed. Although the functional $G({\mathbf {c}},\cdot )$ is convex in $u$, the minimization is carried out over a binary set, which is not convex. Following the same approach as in Chan et al. [12], it is proposed to use a soft smooth membership function $u\in [0,1]$. The new convex minimization problem is

$\begin{aligned} \begin{array}{lcl } \min \limits _{u\,\in \,[0,1]}G({\mathbf {c}},u), \end{array} \end{aligned}$

(12)

for which the existence of a global minimizer is guaranteed. Moreover, if $u$ is a minimizer of the convex relaxed problem (12), then for a.e. $s\in [0,1]$, the function $\hat{u}={1\!\!1}_{\{{\mathbf {x}}\,\in \,\varOmega \,:\,u({\mathbf {x}})> s\}}$ is a minimizer of the original binary problem (1). The minimization of (12) can be efficiently solved using the dual minimization approach of the TV term [6, 10].
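Two ingredients of this scheme are simple enough to sketch directly: the update of ${\mathbf {c}}$ for fixed $u$, and the thresholding that turns a relaxed minimizer into a binary one. For fixed $u$, setting the derivative of the fitting terms in (1) with respect to $c_{i}^{in}$ and $c_{i}^{out}$ to zero gives averages of $f_{i}(I)$ weighted by $u$ and $1-u$. The function names, the small constant guarding against an empty region and the default threshold $s=0.5$ are assumptions for illustration.

```python
def update_means(f, u):
    """Minimizers of the fitting terms in c for fixed u.

    c_in  = sum(f * u)       / sum(u)
    c_out = sum(f * (1 - u)) / sum(1 - u)
    """
    num_in = den_in = num_out = den_out = 0.0
    for f_row, u_row in zip(f, u):
        for fv, uv in zip(f_row, u_row):
            num_in += fv * uv
            den_in += uv
            num_out += fv * (1.0 - uv)
            den_out += 1.0 - uv
    eps = 1e-12  # avoids division by zero for an empty region (assumed safeguard)
    return num_in / max(den_in, eps), num_out / max(den_out, eps)

def threshold(u, s=0.5):
    """Indicator function of the set {x : u(x) > s}."""
    return [[1 if v > s else 0 for v in row] for row in u]

f = [[1.0, 3.0],
     [10.0, 20.0]]
u = [[1, 1],
     [0, 0]]
print(update_means(f, u))                         # averages of the two rows
print(threshold([[0.2, 0.7], [0.5, 0.9]], 0.5))   # strict inequality u > s
```

The minimization with respect to $u$ itself is not shown here; it is the part handled by the dual approach referred to above.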

3.1 Fast Dual Minimization

A fast numerical algorithm applying the methods in Aujol et al. [4] and Bresson et al. [6] is implemented for solving the minimization problem (12). First, it should be noted that solving the minimization problem (12) is equivalent to solving

$\begin{aligned} \min _{u\,\in \,[0,1]}\Bigg \{\tilde{G}({\mathbf {c}},u) =\int _{\varOmega }\,|\nabla u |\, {\mathrm {d}}{\mathbf {x}} +\sum _{i=1}^{n}\int _{\varOmega }\Big (\lambda _{i}^{in}r_{i}\big (c_{i}^{in},f_{i}(I)\big )- \lambda _{i}^{out}r_{i}\big (c_{i}^{out},f_{i}(I)\big )\Big )\,u\, {\mathrm {d}}{\mathbf {x}}\Bigg \}. \end{aligned}$
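The equivalence can be checked by expanding the factor $(1-u)$ in (1) and collecting the terms that multiply $u$:

$\begin{aligned} G({\mathbf {c}},u) &= \int _{\varOmega }|\nabla u |\,{\mathrm {d}}{\mathbf {x}} +\sum _{i=1}^{n}\int _{\varOmega }\Big (\lambda _{i}^{in}r_{i}\big (c_{i}^{in},f_{i}(I)\big )- \lambda _{i}^{out}r_{i}\big (c_{i}^{out},f_{i}(I)\big )\Big )\,u\,{\mathrm {d}}{\mathbf {x}} +\sum _{i=1}^{n}\lambda _{i}^{out}\int _{\varOmega }r_{i}\big (c_{i}^{out},f_{i}(I)\big )\,{\mathrm {d}}{\mathbf {x}}\\ &= \tilde{G}({\mathbf {c}},u)+\sum _{i=1}^{n}\lambda _{i}^{out}\int _{\varOmega }r_{i}\big (c_{i}^{out},f_{i}(I)\big )\,{\mathrm {d}}{\mathbf {x}}. \end{aligned}$

The last sum does not depend on $u$, so for fixed ${\mathbf {c}}$ the functionals $G({\mathbf {c}},\cdot )$ and $\tilde{G}({\mathbf {c}},\cdot )$ differ by a constant and have the same minimizers over $u\in [0,1]$.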