of Medical Image Registration by Using High-Accuracy Image Matching Techniques



Fig. 1
Framework of registration evaluation



In the procedure of landmark detection, 3D SIFT keypoint detection [14, 20], which detects local extrema in an image pyramid consisting of differences of Gaussian-blurred images at multiple scales, is first applied to the fixed image.

After the SIFT feature points are detected in the fixed image, a 3D phase-based image matching algorithm is used to locate the corresponding landmarks in the registered moving image. Discrete Fourier transforms (DFTs) of image blocks around the SIFT feature points and of their candidate correspondences are computed. The phase components of the image-block pairs in the frequency domain are used to estimate the locations of the correspondences. This image-block matching method can achieve high accuracy at the sub-voxel level [15, 29]. Even under rotation, the matching method performs well through coarse-to-fine and iterative procedures. As a result, a registration error can be measured as the distance between the SIFT keypoints and their correspondences. Note that although this work focuses on registration validation on thoracic CT images, the proposed method can be extended to other image types and modalities, since both the feature detection and image matching algorithms apply to general image data.



Feature Point Detection


In this work, a 3D landmark detector is required to extract fiducial landmarks as reference points for the spatial accuracy evaluation of image registration. Hartkens et al. [16] investigated nine 3D differential operators for the detection of anatomical landmarks in medical images. More recently, the SIFT detector and descriptor introduced by Lowe [17] has become very popular and has been widely adopted in medical imaging research [14, 18–21, 30]. In this work, the SIFT detector, which was extended to 3D in [14, 20], is adopted to extract candidate fiducial landmarks for registration evaluation.


Candidate Feature Point Detection


Original SIFT feature points are detected by using difference-of-Gaussian (DoG) images [17]. For a medical image, which is usually a 3D volume, the DoG images can be computed as follows [14]:


$$ D\left(x,y,z,\sigma \right)=L\left(x,y,z, k\sigma \right)-L\left(x,y,z,\sigma \right), $$

(1)
where k is a constant multiplicative factor. L(x,y,z,σ) is obtained by smoothing the original image I(x,y,z) with a variable-scale Gaussian filter G(x,y,z,σ):


$$ L\left(x,y,z,\sigma \right)=G\left(x,y,z,\sigma \right)\ast I\left(x,y,z\right)=\frac{1}{{\left(\sqrt{2\pi}\sigma \right)}^3}{e}^{-\left({x}^2+{y}^2+{z}^2\right)/2{\sigma}^2}\ast I\left(x,y,z\right). $$

(2)

Figure 2 illustrates the computation of a group of DoG images. Using such DoG images, local extrema can be detected at the pyramid level of σ. For a voxel v in a DoG image at scale σ, the intensity of v is compared with those of its 80 neighbor voxels (26 neighbors at the same scale σ, and 27 counterparts at each of the adjacent scales kσ and σ/k). A voxel whose intensity is greater than, or less than, all of these neighbors is taken as a candidate feature point. Figure 3 illustrates the procedure of local extrema detection. In this work, the multi-scale search parameters are set as k = 2^{1/3} and σ = {1.00, 1.26, 1.58, 2.00, 2.52, 3.17, 4.00}.
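As an illustration, the DoG pyramid and the 80-neighbor extrema scan described above can be sketched as follows. This is a minimal NumPy/SciPy sketch; the function names and the brute-force voxel loop are illustrative, not the chapter's implementation:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def dog_pyramid(volume, sigmas):
    """Blur the volume at each scale and subtract adjacent levels (Eq. 1)."""
    blurred = [gaussian_filter(volume.astype(np.float64), s) for s in sigmas]
    return [b2 - b1 for b1, b2 in zip(blurred, blurred[1:])]

def local_extrema(dogs, level):
    """Return voxels whose DoG value is the maximum or minimum among the
    80 neighbors: 26 at the same scale and 27 at each adjacent scale."""
    lower, mid, upper = dogs[level - 1], dogs[level], dogs[level + 1]
    points = []
    for x in range(1, mid.shape[0] - 1):
        for y in range(1, mid.shape[1] - 1):
            for z in range(1, mid.shape[2] - 1):
                v = mid[x, y, z]
                nb = np.concatenate([d[x - 1:x + 2, y - 1:y + 2, z - 1:z + 2].ravel()
                                     for d in (lower, mid, upper)])
                if v >= nb.max() or v <= nb.min():
                    points.append((x, y, z))
    return points

# scales used in this work: sigma_i = 2**(i/3), i.e. k = 2**(1/3)
sigmas = [1.00, 1.26, 1.58, 2.00, 2.52, 3.17, 4.00]
```

A production implementation would vectorize the neighborhood comparison, for example with scipy.ndimage.maximum_filter, rather than loop over voxels.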



Fig. 2
Computation of a group of DoG images




Fig. 3
Local extrema detection. This figure is a modified version of Fig. 2 in [14]


Feature Point Refinement and Filtering


For 3D medical images, a large number of candidate feature points, i.e., scale-space extrema, are usually detected. It is necessary to eliminate candidates that have low contrast or are poorly localized along edges. Moreover, the locations of the feature points need to be refined for better accuracy. In the 2D SIFT algorithm [17], an approach was developed to estimate the feature point locations, combined with a threshold on the local contrast. In addition, edge-like features were removed by testing the relative ratios of the 2D Hessian eigenvalues. In this work, following [20], the feature point refinement and filtering procedures are extended to 3D.


Feature Point Localization Refinement


The initial implementation of 2D SIFT [24] simply located keypoints at the coordinates of the detected maxima or minima. Brown and Lowe [25] developed a method that fits a 3D quadratic function to the local sample points to determine the interpolated locations and scales of the extrema. For 3D medical images, this approach was extended to a fitting problem for a 4D scale-space function, D(x,y,z,σ) [20]. The Taylor expansion (up to the quadratic terms) of D(x,y,z,σ) at a point x = (x,y,z,σ) T is formulated as:


$$ D\left(\mathbf{x}\right)=D+\frac{\partial {D}^T}{\partial \mathbf{x}}\mathbf{x}+\frac{1}{2}{\mathbf{x}}^{\mathrm{T}}\frac{\partial^2D}{\partial {\mathbf{x}}^2}\mathbf{x}. $$

(3)

The location of the extremum, $$ \widehat{\mathbf{x}} $$, is determined by taking the derivative of this function with respect to x and setting it to 0:


$$ \frac{\partial D}{\partial \mathbf{x}}+\frac{\partial^2D}{\partial {\mathbf{x}}^2}\widehat{\mathbf{x}}=0. $$

(4)

Therefore, the sub-voxel location of the extremum is obtained:


$$ \widehat{\mathbf{x}}=-{\left(\frac{\partial^2D}{\partial {\mathbf{x}}^2}\right)}^{-1}\cdot \frac{\partial D}{\partial \mathbf{x}}. $$

(5)

As suggested by Brown and Lowe [25], the Hessian and the derivative of D are approximated by using finite differences of neighboring points.
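Equations (4) and (5) amount to one small linear solve per keypoint. The following is a minimal sketch, assuming NumPy and, for brevity, a purely spatial 3D fit rather than the chapter's 4D (x, y, z, σ) fit; the function name is illustrative:

```python
import numpy as np

def refine_extremum(D, p):
    """One quadratic (Taylor) fit around the integer extremum p of the sampled
    DoG volume D: solve Eq. (4) for the sub-voxel offset x_hat = -H^{-1} g (Eq. 5).
    Gradient and Hessian use central finite differences, as in [25]."""
    x, y, z = p
    g = np.array([(D[x + 1, y, z] - D[x - 1, y, z]) / 2.0,
                  (D[x, y + 1, z] - D[x, y - 1, z]) / 2.0,
                  (D[x, y, z + 1] - D[x, y, z - 1]) / 2.0])
    Dxx = D[x + 1, y, z] - 2 * D[x, y, z] + D[x - 1, y, z]
    Dyy = D[x, y + 1, z] - 2 * D[x, y, z] + D[x, y - 1, z]
    Dzz = D[x, y, z + 1] - 2 * D[x, y, z] + D[x, y, z - 1]
    Dxy = (D[x + 1, y + 1, z] - D[x + 1, y - 1, z]
           - D[x - 1, y + 1, z] + D[x - 1, y - 1, z]) / 4.0
    Dxz = (D[x + 1, y, z + 1] - D[x + 1, y, z - 1]
           - D[x - 1, y, z + 1] + D[x - 1, y, z - 1]) / 4.0
    Dyz = (D[x, y + 1, z + 1] - D[x, y + 1, z - 1]
           - D[x, y - 1, z + 1] + D[x, y - 1, z - 1]) / 4.0
    H = np.array([[Dxx, Dxy, Dxz],
                  [Dxy, Dyy, Dyz],
                  [Dxz, Dyz, Dzz]])
    return -np.linalg.solve(H, g)  # sub-voxel offset from p
```

Because central differences are exact for quadratics, a quadratic test function is recovered to machine precision; on real DoG data the fit is only a local approximation.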


Removal of Low-Contrast Feature Points


The function value at the extremum, $$ D\left(\widehat{\mathbf{x}}\right) $$, is useful for rejecting unstable extrema with low contrast. $$ D\left(\widehat{\mathbf{x}}\right) $$ can be obtained by substituting (5) into (3):


$$ D\left(\widehat{\mathbf{x}}\right)=D+\frac{1}{2}\frac{\partial {D}^T}{\partial \mathbf{x}}\widehat{\mathbf{x}}. $$

(6)

In the 2D SIFT method [17], all extrema with a value of $$ \left|D\left(\widehat{\mathbf{x}}\right)\right| $$ less than 0.03 were discarded; the threshold was determined experimentally. In [20], the threshold of 0.03 was suggested for CT images, whereas a less selective value of 0.01 was suggested for the lower-contrast appearance of MR and CBCT images.
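A minimal sketch of this contrast test, assuming NumPy (the helper name is illustrative; grad and offset are the gradient and sub-voxel offset from the refinement of Eq. (5)):

```python
import numpy as np

def passes_contrast_test(D_value, grad, offset, threshold=0.03):
    """Eq. (6): D(x_hat) = D + (1/2) (dD/dx)^T x_hat.
    Keep the keypoint only if |D(x_hat)| reaches the threshold
    (0.03 for CT; 0.01 was suggested for MR/CBCT in [20])."""
    D_hat = D_value + 0.5 * float(np.dot(grad, offset))
    return abs(D_hat) >= threshold
```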


Rejecting Poorly Localized Points Along Edges


For stable image matching, it is not sufficient to reject only keypoints with low contrast. The DoG function has a strong response along edges, even where the location along the edge is poorly determined and therefore unstable to small amounts of noise. Such edge responses were eliminated by using the ratio of the eigenvalues of the 2D Hessian matrix [17].

In this work, adopting the approach developed in [20], the elimination of poorly localized feature points is extended to 3D. Feature points located along edges or ridges, rather than at blob-like structures, are rejected. Blob-like structures can be characterized by the following properties:

(a)

All principal curvatures are of the same sign.

 

(b)

They all have a magnitude of the same order.

 

Feature points that satisfy the above two conditions are considered as fiducial landmarks for registration evaluation in this work.

Similar to the 2D case, edge responses in 3D space can be measured by a 3 × 3 Hessian matrix:


$$ H=\left[\begin{array}{ccc}\hfill {D}_{xx}\hfill & \hfill {D}_{xy}\hfill & \hfill {D}_{xz}\hfill \\ {}\hfill {D}_{xy}\hfill & \hfill {D}_{yy}\hfill & \hfill {D}_{yz}\hfill \\ {}\hfill {D}_{xz}\hfill & \hfill {D}_{yz}\hfill & \hfill {D}_{zz}\hfill \end{array}\right]. $$

(7)

The matrix is computed from finite differences at the location and scale of the feature point, and the principal curvatures are proportional to the eigenvalues of H [26]. Let α, β, and γ be the three eigenvalues of H, in decreasing order of signed magnitude. The trace and the determinant of H can be computed as:


$$ \mathrm{tr}\left(\mathbf{H}\right)=\alpha +\beta +\gamma ={D}_{xx}+{D}_{yy}+{D}_{zz}; $$

(8)



$$ \det \left(\mathbf{H}\right)=\alpha \beta \gamma ={D}_{xx}{D}_{yy}{D}_{zz}+2{D}_{xy}{D}_{yz}{D}_{xz}-{D}_{xx}{\left({D}_{yz}\right)}^2-{D}_{yy}{\left({D}_{xz}\right)}^2-{D}_{zz}{\left({D}_{xy}\right)}^2. $$

(9)

In addition, the sum of principal second-order minors, $$ {\displaystyle \sum }{ \det}_2^{\mathrm{P}}\left(\mathbf{H}\right) $$, is computed as:


$$ {\displaystyle \sum }{ \det}_2^{\mathrm{P}}\left(\mathbf{H}\right)=\beta \gamma +\gamma \alpha +\alpha \beta ={D}_{yy}{D}_{zz}-{\left({D}_{yz}\right)}^2+{D}_{zz}{D}_{xx}-{\left({D}_{xz}\right)}^2+{D}_{xx}{D}_{yy}-{\left({D}_{xy}\right)}^2. $$

(10)

When the eigenvalues are either all positive or all negative, condition (a) is satisfied, which corresponds to a dark blob or a bright blob, respectively [27]. This is equivalent to the condition:


$$ {\displaystyle \sum }{ \det}_2^{\mathrm{P}}\left(\mathbf{H}\right)>0,\kern1em \mathrm{and}\kern1em \mathrm{tr}\left(\mathbf{H}\right)\cdot \det \left(\mathbf{H}\right)>0. $$

(11)

In order to satisfy condition (b) and reject plate-like or tubular structures, let r and s be the ratios such that α = rβ and β = sγ. Then,

$$ \frac{\mathrm{tr}{\left(\mathbf{H}\right)}^3}{ \det \left(\mathbf{H}\right)}=\frac{{\left(\alpha +\beta +\gamma \right)}^3}{\alpha \beta \gamma }=\frac{{\left( rs+s+1\right)}^3}{r{s}^2}, $$

(12)
which depends only on the ratio of the eigenvalues rather than their individual values, and increases with both rs and s. In order to check that the ratio t = rs of principal curvatures is below some threshold, t max, we only need to check:


$$ \frac{\mathrm{tr}{\left(\mathbf{H}\right)}^3}{ \det \left(\mathbf{H}\right)}<\frac{{\left(2{t}_{\max }+1\right)}^3}{{t_{\max}}^2}. $$

(13)

Otherwise, features look more like edges or ridges than blobs, and are discarded. In [20], an upper threshold t max = 5 was experimentally determined for CT images, whereas t max = 20 was found to be suited to MR and CBCT images.
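Taken together, Eqs. (8)–(13) reduce to a few arithmetic checks on the 3 × 3 Hessian. A minimal NumPy sketch of this filter (the function name is illustrative):

```python
import numpy as np

def is_blob_like(H, t_max=5.0):
    """Keep a keypoint only if its 3x3 DoG Hessian (Eq. 7) is blob-like:
    condition (a), eigenvalues of one sign, via Eq. (11);
    condition (b), bounded curvature ratios, via Eq. (13)."""
    tr = np.trace(H)                                  # Eq. (8)
    det = np.linalg.det(H)                            # Eq. (9)
    minors = (H[1, 1] * H[2, 2] - H[1, 2] ** 2        # Eq. (10)
              + H[2, 2] * H[0, 0] - H[0, 2] ** 2
              + H[0, 0] * H[1, 1] - H[0, 1] ** 2)
    same_sign = minors > 0 and tr * det > 0           # Eq. (11)
    bounded = det != 0 and tr ** 3 / det < (2 * t_max + 1) ** 3 / t_max ** 2  # Eq. (13)
    return bool(same_sign and bounded)
```

For equal eigenvalues, tr(H)^3/det(H) = 27, its minimum, which passes for any t_max ≥ 1; strongly anisotropic (plate- or tube-like) Hessians push the ratio past the bound and are rejected.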


Feature Point Matching


To measure the distance between the reference points and their correspondences, an image matching method based on a 3D phase-only correlation (POC) function is applied. The original POC function [22, 31, 32] was computed with a 2D DFT to estimate the displacement between image blocks, and it was extended to a 3D implementation while maintaining good performance [15, 29].


3D Phase-Only Correlation Function


The POC function [22] is a correlation function used in phase-based image matching to evaluate the similarity between two images. According to [22], image matching with POC can achieve sub-pixel accuracy, and it can be extended to 3D while maintaining excellent performance [15, 29]. This section introduces the 3D POC function applied in the image matching of the accuracy evaluation.

Consider two N 1 × N 2 × N 3 volumes r(n 1,n 2,n 3) and f(n 1,n 2,n 3), where n 1 = − M 1, …, M 1, n 2 = − M 2, …, M 2, n 3 = − M 3, …, M 3 and N 1 = 2M 1 + 1, N 2 = 2M 2 + 1, N 3 = 2M 3 + 1. The 3D DFT of the two volumes is given by


$$ \begin{array}{lll}R\left({k}_1,{k}_2,{k}_3\right)&={\displaystyle \sum}_{n_1,{n}_2,{n}_3}r\left({n}_1,{n}_2,{n}_3\right){W}_{N_1}^{k_1{n}_1}{W}_{N_2}^{k_2{n}_2}{W}_{N_3}^{k_3{n}_3}\\&={A}_R\left({k}_1,{k}_2,{k}_3\right){e}^{\,j{\theta}_R\left({k}_1,{k}_2,{k}_3\right)}, \end{array}$$

(14)



$$\begin{array}{lll} F\left({k}_1,{k}_2,{k}_3\right)&={\displaystyle \sum}_{n_1,{n}_2,{n}_3}f\left({n}_1,{n}_2,{n}_3\right){W}_{N_1}^{k_1{n}_1}{W}_{N_2}^{k_2{n}_2}{W}_{N_3}^{k_3{n}_3}\\&={A}_F\left({k}_1,{k}_2,{k}_3\right){e}^{\,j{\theta}_F\left({k}_1,{k}_2,{k}_3\right)}, \end{array}$$

(15)
where k 1 = − M 1, …, M 1, k 2 = − M 2, …, M 2, k 3 = − M 3, …, M 3 and $$ {W}_{N_1}={e}^{-j\frac{2\pi }{N_1}},{W}_{N_2}={e}^{-j\frac{2\pi }{N_2}},{W}_{N_3}={e}^{-j\frac{2\pi }{N_3}} $$. A R (k 1,k 2,k 3) and A F (k 1,k 2,k 3) are amplitude components, and $$ {e}^{j{\theta}_R\left({k}_1,{k}_2,{k}_3\right)} $$ and $$ {e}^{j{\theta}_F\left({k}_1,{k}_2,{k}_3\right)} $$ are phase components. The normalized cross spectrum $$ \widehat{P}\left({k}_1,{k}_2,{k}_3\right) $$ is defined as


$$ \widehat{P}\left({k}_1,{k}_2,{k}_3\right)=\frac{R\left({k}_1,{k}_2,{k}_3\right)\overline{F\left({k}_1,{k}_2,{k}_3\right)}}{\left|R\left({k}_1,{k}_2,{k}_3\right)\overline{F\left({k}_1,{k}_2,{k}_3\right)}\right|}={e}^{j\left\{{\theta}_R\left({k}_1,{k}_2,{k}_3\right)-{\theta}_F\left({k}_1,{k}_2,{k}_3\right)\right\}}, $$

(16)
where $$ \overline{F\left({k}_1,{k}_2,{k}_3\right)} $$ denotes the complex conjugate of F(k 1,k 2,k 3). The POC function $$ \widehat{p}\left({n}_1,{n}_2,{n}_3\right) $$ between r(n 1,n 2,n 3) and f(n 1,n 2,n 3) is the 3D inverse DFT (3D IDFT) of $$ \widehat{P}\left({k}_1,{k}_2,{k}_3\right) $$.
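Under these definitions, the 3D POC function is straightforward to compute with FFTs. A minimal NumPy sketch (numpy.fft uses indices 0…N−1 rather than the symmetric −M…M convention above, which only permutes the output indices; the guard constant is an implementation detail, not from the chapter):

```python
import numpy as np

def poc(r, f):
    """3D POC: normalized cross spectrum (Eq. 16) followed by an inverse DFT.
    The peak of the output sits at the displacement of r relative to f."""
    R = np.fft.fftn(r)
    F = np.fft.fftn(f)
    cross = R * np.conj(F)
    P = cross / np.maximum(np.abs(cross), 1e-12)  # guard against zero bins
    return np.real(np.fft.ifftn(P))

# two blocks related by a known integer shift
rng = np.random.default_rng(0)
r = rng.standard_normal((16, 16, 16))
f = np.roll(r, shift=(2, -3, 1), axis=(0, 1, 2))  # f(n) = r(n - s), s = (2, -3, 1)

p = poc(r, f)
peak = np.unravel_index(np.argmax(p), p.shape)
# indices above N/2 wrap around to negative displacements
disp = [int(d) if d <= n // 2 else int(d) - n for d, n in zip(peak, p.shape)]
# disp is -s: r is f displaced by (-2, 3, -1)
```

For an exact cyclic shift the POC output is a unit impulse at the displacement; sub-voxel estimation, as used in this work, additionally fits a model to the correlation peak [15, 29].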
