False Discovery Rate in Signal Space for Transformation-Invariant Thresholding of Statistical Maps

where X is an m-by-n ($$m>n$$) full column-rank matrix, Y is a column vector of length m, $$\beta $$ is a column vector of length n, and the $$\varepsilon _{i}$$ ($$i=1,\cdots ,m$$) are independent and identically distributed (i.i.d.) variables following a Gaussian distribution $$N(0,\sigma ^{2})$$. In research applications, Y may be an fMRI time series at a voxel location and X the design matrix of functional tasks; in a tensor-based morphometry (TBM) study, Y may be the subjects’ Jacobian determinants at a voxel location and X a factor matrix of age, gender, disease state, etc. Please note:




  • We assume that X is full column-rank so that the inverse of $$X^{\intercal }X$$ exists.


  • We assume that $$m>n$$ so that it is possible to estimate $$\sigma ^{2}$$.


The maximum-likelihood and unbiased estimate of $$\beta $$ is


$$\begin{aligned} \hat{\beta }\equiv & {} \left( X^{\intercal }X\right) ^{-1}X^{\intercal }Y=\beta +\left( X^{\intercal }X\right) ^{-1}X^{\intercal }\varepsilon . \end{aligned}$$
The residuals and the unbiased estimate of $$\sigma ^{2}$$ are


$$\begin{aligned} {\left\{ \begin{array}{ll} \hat{\varepsilon } &{} \equiv Y-X\hat{\beta }=\left[ I-X\left( X^{\intercal }X\right) ^{-1}X^{\intercal }\right] \varepsilon ,\\ \hat{\sigma }^{2} &{} \equiv \frac{\hat{\varepsilon }^{\intercal }\hat{\varepsilon }}{df},\text { where }df=m-n. \end{array}\right. } \end{aligned}$$
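As a minimal numerical sketch of the two estimators above, the following NumPy snippet simulates data from the linear model and computes $$\hat{\beta }$$, the residuals $$\hat{\varepsilon }$$, and $$\hat{\sigma }^{2}$$. The function name `ols_estimates`, the matrix sizes, the seed, and the true coefficients are illustrative assumptions, not part of the original text.

```python
import numpy as np


def ols_estimates(X, Y):
    """Return (beta_hat, residuals, sigma2_hat) for the model Y = X beta + eps."""
    m, n = X.shape
    # Solve the normal equations (X^T X) beta = X^T Y rather than forming the inverse.
    beta_hat = np.linalg.solve(X.T @ X, X.T @ Y)
    residuals = Y - X @ beta_hat
    df = m - n  # degrees of freedom
    sigma2_hat = residuals @ residuals / df
    return beta_hat, residuals, sigma2_hat


# Hypothetical example: m = 50 observations, n = 3 regressors, noise std 0.1.
rng = np.random.default_rng(0)
X = rng.standard_normal((50, 3))
beta_true = np.array([1.0, -2.0, 0.5])
Y = X @ beta_true + 0.1 * rng.standard_normal(50)

beta_hat, residuals, sigma2_hat = ols_estimates(X, Y)
print(beta_hat, sigma2_hat)
```

Solving the normal equations with `np.linalg.solve` is one reasonable choice here; a QR- or SVD-based least-squares solver would give the same estimate for a well-conditioned X.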
Let us focus on the probabilistic and geometric properties of $$\hat{\varepsilon }$$.



1. It follows an isotropic multivariate Gaussian distribution with variance $$\sigma ^{2}$$, embedded in the null space of X’s columns. More specifically, there exists an m-by-df matrix Z which satisfies $$X^{\intercal }Z=0$$ and $$Z^{\intercal }Z=I$$, such that $$\tilde{\varepsilon }\equiv Z^{\intercal }\hat{\varepsilon }\sim N(0,\sigma ^{2}I_{df\times df})$$ and $$\hat{\varepsilon }=Z\tilde{\varepsilon }$$.

2. It is independent of $$\hat{\beta }$$.

3. Its normalized vector $$\hat{u}=\hat{\varepsilon }/|\hat{\varepsilon }|$$ is uniformly distributed on the unit hyper-sphere in the null space of X’s columns, independent of $$\sigma ^{2}$$ and $$\hat{\sigma }^{2}$$.

 

Property 1 is the most insightful, and properties 2 and 3 follow easily from it. We outline its proof as follows:

1. Because X is a full column-rank m-by-n matrix, the null space of its columns (the set of vectors orthogonal to every column of X) has $$df=m-n$$ dimensions.

2. Define Z as an m-by-df matrix whose columns form an orthonormal basis of the null space of X’s columns. By definition, Z satisfies $$X{}^{\intercal }Z=0$$ and $$Z{}^{\intercal }Z=I$$.

3. Define $$\tilde{\varepsilon }\equiv Z^{\intercal }\hat{\varepsilon }$$. This df-element random vector follows a Gaussian distribution $$N(0,\sigma ^{2}I_{df\times df})$$, because:

  • $$\tilde{\varepsilon }\equiv Z^{\intercal }\hat{\varepsilon }=Z^{\intercal }\left[ I-X\left( X^{\intercal }X\right) ^{-1}X^{\intercal }\right] \varepsilon =Z^{\intercal }\varepsilon $$, as a linear combination of $$\varepsilon $$, follows a multivariate Gaussian distribution;

  • The expected value of $$\tilde{\varepsilon }$$ is $$E\tilde{\varepsilon }=Z^{\intercal }E\varepsilon =0$$;

  • The covariance matrix of $$\tilde{\varepsilon }$$ is $$E\tilde{\varepsilon }\tilde{\varepsilon }^{\intercal }=E\left[ Z^{\intercal }\varepsilon \varepsilon ^{\intercal }Z\right] =Z^{\intercal }E\left[ \varepsilon \varepsilon ^{\intercal }\right] Z=I\sigma ^{2}$$.

4. Z also satisfies $$ZZ^{\intercal }=I-X\left( X^{\intercal }X\right) ^{-1}X^{\intercal }$$ because:

  • Both $$ZZ^{\intercal }\left[ \begin{array}{cc} X&Z\end{array}\right] $$ and $$\left[ I-X\left( X^{\intercal }X\right) ^{-1}X^{\intercal }\right] \left[ \begin{array}{cc} X&Z\end{array}\right] $$ equal $$\left[ \begin{array}{cc} 0&Z\end{array}\right] $$;

  • $$\left[ \begin{array}{cc} X&Z\end{array}\right] $$ is a full-rank m-by-m matrix, so its inverse exists;

  • Therefore both $$ZZ^{\intercal }$$ and $$\left[ I-X\left( X^{\intercal }X\right) ^{-1}X^{\intercal }\right] $$ equal $$\left[ \begin{array}{cc} 0&Z\end{array}\right] \left[ \begin{array}{cc} X&Z\end{array}\right] ^{-1}$$, and hence each other.

5. $$\hat{\varepsilon }$$ equals $$Z\tilde{\varepsilon }$$ because

$$\begin{aligned} \hat{\varepsilon }&=\left[ I-X\left( X^{\intercal }X\right) ^{-1}X^{\intercal }\right] \varepsilon =ZZ^{\intercal }\varepsilon =Z\tilde{\varepsilon }. \end{aligned}$$
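The identities used in the proof can be checked numerically. The sketch below builds one possible Z from the full singular value decomposition of a random X (an assumed construction; any orthonormal basis of the same subspace would do) and verifies $$X^{\intercal }Z=0$$, $$Z^{\intercal }Z=I$$, $$ZZ^{\intercal }=I-X\left( X^{\intercal }X\right) ^{-1}X^{\intercal }$$, and $$\hat{\varepsilon }=Z\tilde{\varepsilon }$$ for a single noise draw. The sizes and seed are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(1)
m, n = 8, 3                      # hypothetical sizes with m > n
X = rng.standard_normal((m, n))  # full column rank with probability one
df = m - n

# Full SVD: the last df left singular vectors are orthonormal
# and orthogonal to every column of X.
U, _, _ = np.linalg.svd(X, full_matrices=True)
Z = U[:, n:]

# Residual-forming matrix I - X (X^T X)^{-1} X^T.
M = np.eye(m) - X @ np.linalg.solve(X.T @ X, X.T)

eps = rng.standard_normal(m)     # one draw of the noise vector
eps_hat = M @ eps                # residuals
eps_tilde = Z.T @ eps_hat        # coordinates in the df-dimensional subspace

checks = {
    "X^T Z = 0": np.allclose(X.T @ Z, 0),
    "Z^T Z = I": np.allclose(Z.T @ Z, np.eye(df)),
    "Z Z^T = M": np.allclose(Z @ Z.T, M),
    "Z^T eps_hat = Z^T eps": np.allclose(eps_tilde, Z.T @ eps),
    "eps_hat = Z eps_tilde": np.allclose(eps_hat, Z @ eps_tilde),
}
print(checks)
```

This only confirms the algebraic identities for one realization; the distributional claims (Gaussianity, independence from $$\hat{\beta }$$) would need a Monte Carlo study to illustrate.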

 



2.2 Weighted FDR in Volume


Let $$R_{pos}$$ denote the detected region, $$R_{tru}$$ the underlying true region, and $$\left| \bullet \right| $$ the volume of a region. The volume-based FDR is defined as follows:


$$\begin{aligned} FDR\equiv E\left[ \frac{\left| R_{pos}\setminus R_{tru}\right| }{\left| R_{pos}\right| }\right] \text { where }\frac{\left| R_{pos}\setminus R_{tru}\right| }{\left| R_{pos}\right| }\equiv 0\text { if }\left| R_{pos}\right| =0. \end{aligned}$$

(1)
Genovese, Lazar, and Nichols [9] defined the volumetric measure in the image space, so in voxel-based analysis it reduces to the number of voxels. They applied Benjamini and Hochberg’s step-up procedure [2] to control the FDR. The step-up procedure finds


$$\begin{aligned} k^{*}=\max \{k|\frac{p_{(k)}N}{k}\leqslant q\}, \end{aligned}$$
where q is the user-specified FDR level, $$p_{(k)}$$ is the k-th smallest voxel p-value, and N is the number of voxels; the voxels with the $$k^{*}$$ smallest p-values are declared positive. This step-up procedure remains valid under positive dependence among tests, as discussed by Benjamini and Yekutieli [4], who also treat more general dependence.
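The step-up rule can be sketched in a few lines of plain Python. The function name `bh_step_up` and the example p-values are hypothetical; the example is chosen so that the condition fails at k = 2 and k = 3 but holds at k = 4, which shows why the procedure takes the largest qualifying k rather than stopping at the first failure.

```python
def bh_step_up(p_values, q):
    """Return the indices rejected by the Benjamini-Hochberg step-up procedure."""
    N = len(p_values)
    # Indices of the tests ordered by ascending p-value.
    order = sorted(range(N), key=lambda i: p_values[i])
    k_star = 0
    for k, idx in enumerate(order, start=1):
        if p_values[idx] * N / k <= q:
            k_star = k  # keep the largest k that satisfies the condition
    return sorted(order[:k_star])


# Hypothetical p-values for N = 5 voxels.
p = [0.5, 0.035, 0.005, 0.038, 0.025]
rejected = bh_step_up(p, q=0.05)
print(rejected)  # [1, 2, 3, 4]
```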

Benjamini and Hochberg (1997) [3] extended it to a weighted version, whose FDR and control procedure are


$$\begin{aligned} FDR\equiv E\left[ \frac{\sum _{i\in R_{pos}\setminus R_{tru}}w_{i}}{\sum _{i\in R_{pos}}w_{i}}\right] ,\text { }k^{*}=\max \{k|\frac{p_{(k)}\sum _{i=1}^{N}w_{(i)}}{\sum _{i=1}^{k}w_{(i)}}\leqslant q\}, \end{aligned}$$

(2)
where $$w_{i}$$ is the weight associated with voxel i, and $$w_{(i)}$$ is the weight of the voxel with the i-th smallest p-value.
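A minimal sketch of the weighted step-up rule in Eq. (2) follows, assuming strictly positive weights. The function name `weighted_bh_step_up` and the example values are hypothetical. With equal weights the rule reduces to the unweighted procedure; with one heavily weighted voxel the outcome can change.

```python
def weighted_bh_step_up(p_values, weights, q):
    """Return the indices rejected by the weighted step-up rule of Eq. (2).

    Assumes every weight is strictly positive.
    """
    N = len(p_values)
    # Indices of the tests ordered by ascending p-value.
    order = sorted(range(N), key=lambda i: p_values[i])
    total_weight = sum(weights)
    cumulative = 0.0  # running sum of w_(1), ..., w_(k)
    k_star = 0
    for k, idx in enumerate(order, start=1):
        cumulative += weights[idx]
        if p_values[idx] * total_weight / cumulative <= q:
            k_star = k  # keep the largest k that satisfies the condition
    return sorted(order[:k_star])


# Hypothetical p-values for N = 5 voxels.
p = [0.5, 0.035, 0.005, 0.038, 0.025]
equal = weighted_bh_step_up(p, [1.0] * 5, q=0.05)
skewed = weighted_bh_step_up(p, [4.0, 1.0, 1.0, 1.0, 1.0], q=0.05)
print(equal)   # [1, 2, 3, 4]
print(skewed)  # [2]
```

In the skewed case the voxel with the largest p-value carries half of the total weight, so the cumulative weight of the small-p-value voxels grows slowly and only the smallest p-value passes the threshold.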


Sep 16, 2016 | Posted in GENERAL RADIOLOGY
