and inter subject analyses of brain functional Magnetic Resonance Images (fMRI)

associated to the m ^th stimulus type with the canonical HRF $\hat{x}_{c}$ , i.e. a composition of two gamma functions which reflects the BOLD signal best in the visual and motor cortices [8]. The GLM therefore reads:

$\displaystyle{ \left [\boldsymbol{y}_{1},\,\ldots,\,\boldsymbol{y}_{J}\right ] = \mathbf{X}\left [\boldsymbol{a}\,_{1},\,\ldots,\,\boldsymbol{a}\,_{J}\right ] + \left [\mathbf{b}_{1},\,\ldots,\,\mathbf{b}_{J}\right ] }$

(1)

where $\boldsymbol{y}_{j}$ is the fMRI time series measured in voxel V _jat times $(t_{n})_{n=1:N}$ and $\boldsymbol{a}\,_{j} \in \mathbb{R}^{M}$ defines the vector of BOLD effects in V _jfor all stimulus type m = 1: M. Noise $\mathbf{b}_{j}$ is usually modelled as a first-order autoregressive (i.e. AR(1)) process in order to account for the spatially-varying temporal correlation of fMRI data [5]: $b_{j,t_{n}} =\rho _{j}\,b_{j,t_{n-1}} +\varepsilon _{j,t_{n}},\,\forall j,\,t$ , with $\boldsymbol{\varepsilon }\,_{j} \sim \mathcal{N}\,\,(\mathbf{0}\,_{N},\sigma _{\varepsilon _{j}}^{2}\boldsymbol{I}\,_{N})$ , where $\mathbf{0}\,_{N}$ is a null vector of length N, and $\boldsymbol{I}\,_{N}$ stands for the identity matrix of size N. Then, the estimated BOLD magnitudes $\widehat{\boldsymbol{a}\,}_{j}$ in V _jare computed in the maximum likelihood sense by:

$\displaystyle{ \widehat{\boldsymbol{a}\,}_{j} =\mathop{ \mathrm{arg\,min}}_{\boldsymbol{a}\,\in \mathbb{R}^{M}}\|\boldsymbol{y}_{j} -\mathbf{X}\boldsymbol{a}\,_{j}\|_{\widehat{\sigma }_{\varepsilon _{ j}}^{-2}\widehat{\boldsymbol{\varLambda }\,}_{j}}^{2}, }$

(2)

where $\widehat{\sigma }_{\varepsilon _{j}}^{-2}\widehat{\boldsymbol{\varLambda }\,}_{j}$ defines the inverse of the estimated autocorrelation matrix of $\mathbf{b}_{j}$ ; see for instance [9] for details about the identification of the noise structure. Later, extensions that incorporate prior information on the BOLD effects $(\boldsymbol{a}\,_{j})_{j=1:J}$ have been developed in the Bayesian framework [10]. In such cases, vectors $(\widehat{\boldsymbol{a}\,}_{j})_{j=1:J}$ are computed using more computationally demanding strategies [10]. However, all these GLM-based contributions consider a unique and global model of the HRF shape while intra-individual differences in its characteristics have been exhibited between cortical areas [11].

2.2 Flexible GLM models

Intra-individual differences in the characteristics of the HRF have been exhibited between cortical areas in [11–14]. Although smaller than inter-individual fluctuations, this regional variability is large enough to be regarded with care. To account for these spatial fluctuations at the voxel level, one usually resorts to hemodynamic function basis. For instance, the canonical HRF $\hat{x}_{c}$ can be supplemented with its first and second time derivatives ( $\left [\hat{x}_{c}\,\vert \,\hat{x}_{c}'\,\vert \,\hat{x}''_{c}\right ]$ ) to model eg. differences in time-to-peak. Although powerful and elegant, flexibility is achievable at the expense of fewer effective degrees of freedom and decreased sensitivity in any subsequent statistical test. Importantly, in a GLM involving several regressors per condition, the BOLD effect becomes multivariate (i.e. $\boldsymbol{a}\,_{j}^{m} \in \mathbb{R}^{P}$ ) and the Student-t statistic can no longer be used to infer on differences $\boldsymbol{a}\,_{j}^{m}\! -\!\boldsymbol{a}\,_{j}^{n}$ between the m ^th and n ^th stimulus types. Rather, an unsigned Fisher statistic has to be computed, making direct interpretation of activation maps more difficult. Indeed, the null hypothesis is actually rejected whenever any of the contrast components deviates from zero and not specifically when the difference of the response magnitudes is far from zero.

2.3 Beyond parametric modelling

The localisation of brain activation strongly depends on the modelling of the brain response and thus of its estimation. Of course, the converse also holds: HRF estimation is only relevant in voxels that elicit signal fluctuations correlated with the paradigm. Hence, detection and estimation are intrinsically linked to each other. The key point is therefore to tackle the two problems in a common setting, i.e. to set up a formulation in which detection and estimation enter naturally and simultaneously. This setting cannot be the classical hypothesis testing framework. Indeed, the sequential procedure which consists in first estimating the HRF on a given dataset and then building a specific GLM upon this estimate for detecting activations in the same dataset, entails statistical problems in terms of sensitivity and specificity: the control of the false positive rate actually becomes hazardous due to the use of an erroneous number of degrees of freedom. Instead, we explore a Bayesian approach that provides an appropriate framework, called the Joint Detection Estimation (JDE) framework in what follows, to address both detection and estimation issues in the same formalism.

Regional non-parametric modelling of the BOLD signal

A spatially varying HRF model is necessary to keep a single regressor per condition, and thus enable direct statistical comparison ( $\widehat{\beta }_{j}^{m} -\widehat{\beta }_{j}^{n}$ ). The JDE framework proposed in [15, 16] allows us to introduce such a spatially adaptive GLM in which a local estimation of $\hat{x}$ is performed. To conduct the analysis efficiently, HRF estimation is carried out at a regional scale coarser than the voxel level. To properly define this scale, the functional brain mask is divided in K functionally homogeneous parcels using the parcellation technique proposed in [17]. This algorithm relies on the minimisation of a compound criterion reflecting both the spatial and functional structures and hence the topology of the dataset. The spatial similarity measure favours the closeness in the Talairach coordinates system. The functional part of this criterion is computed on parameters that characterise the functional properties of the voxels, for instance the fMRI time series themselves.

The number of parcels K is set by hand. The larger the number of parcels, the higher the degree of within-parcel homogeneity but potentially the lower the signal-to-noise ratio (SNR). To objectively choose an adequate number of parcels, Bayesian information criterion (BIC) and cross validation techniques have been used in [18] on an fMRI study of ten subjects. The authors have shown converging evidence for K ≈ 500 for a whole brain analysis leading to typical parcel sizes around a few hundreds voxels. Importantly, since the parcellation is derived at the group level, there is a one-to-one correspondance of parcels across subjects, as shown in Fig. 1.

Fig. 1

Sagittal views of a color-coded multi-subject parcellation. Left: Subject 1. Right: Subject 2

The parcel-based model of the BOLD signal introduced in [15, 16] is illustrated in Fig. 2. As shown, this model supposes that the HRF shape $\hat{x}$ is constant within a parcel, while the magnitudes of activation a _j^mcan vary in space and across stimulus types. Moreover, the model is said non-parametric since no model is assumed for the impulse response $\hat{x}$ , which has therefore to be identified in each parcel. Let $\mathcal{P}\,\, = (V _{j})_{j=1:J}$ be the current parcel. Then, the generative BOLD model reads:

$\displaystyle{ \boldsymbol{y}_{j} =\sum _{ m=1}^{M}a_{ j}^{m}\mathbf{X}^{m}\hat{x} + \boldsymbol{P}\,\boldsymbol{\ell}\,_{ j} + \mathbf{b}_{j},\quad \forall \,j,\ V _{j} \in \mathcal{P}\,\,. }$

(3)

$\mathbf{X}^{m}$ denotes the N × (D + 1) binary matrix that codes the onsets of the m ^th stimulus. Vector $\hat{x} \in \mathbb{R}^{D+1}$ represents the unknown HRF shape in $\mathcal{P}\,\,$ . The term $\boldsymbol{P}\,\ell_{j}$ models a low-frequency trend to account for physiological artifacts and noise $\mathbf{b}_{j} \backsim \mathcal{N}\,\,(\mathbf{0}\,_{N},\sigma _{\varepsilon _{j}}^{2}\boldsymbol{\varLambda }\,_{j}\mathcal{M})$ stands for the above mentioned AR(1) process.

Fig. 2

Regional model of the BOLD signal in the JDE framework. The neural response levels a _j^mmatch with the BOLD effects a _j^m

Spatial mixture modelling and Bayesian inference

The HRF shape $\hat{x}$ and the associated BOLD effects $(\boldsymbol{a}\,_{j})_{j=1:J}$ are jointly estimated in $\mathcal{P}\,\,$ . Since no parametric model is considered for $\hat{x}$ , a smoothness constraint on the second order derivative is introduced to regularise its estimation; see [15]. On the other hand, our approach also aims at detecting which voxels in $\mathcal{P}\,\,$ elicit activations in response to stimulation. To this end, prior mixture models are introduced on $(\boldsymbol{a}\,^{m})_{m=1:M}$ to segregate activating voxels from the non-activating ones in a stimulus-specific manner (i.e. for each m). In [16], it has been shown that Spatial Mixture Models (SMMs) make it possible to recover clusters of activation instead of isolated spots and hence to account for spatial correlation in the activation detection process without smoothing the data. As our approach stands in the Bayesian framework, other priors are formulated upon every other sought object in model (3). The reader is referred to [15, 16] for their expressions. Finally, inference is based upon the full posterior distribution $p(\hat{x},(\boldsymbol{a}\,_{j}),(\boldsymbol{\ell}\,_{j}),\boldsymbol{\varTheta }\,\,\vert \,\mathbb{Y})$ , which is sampled using a Gibbs sampling scheme [16]. Posterior mean (PM) estimates are therefore computed from these samples according to: $\widehat{x}^{\mathrm{PM}} =\sum _{ k=L_{0}}^{L_{1}}x^{(k)}/L,\quad \forall \,x \in \left \{\hat{x},(\boldsymbol{a}\,_{j}),\boldsymbol{\varTheta }\,\right \}$ where $L = L_{1} - L_{0} + 1$ and L ₀ stands for the length of the burn-in period. Note that this estimation process has to be repeated over each parcel of each subject’s brain. Since the fMRI data are considered spatially independent across parcels, parallel implementation makes the computation faster: whole brain analysis is achievable in about 60 mn for N = 125 and K = 500.

2.4 Illustration on a real fMRI dataset

Data acquisition

Real fMRI data were recorded in fifteen volunteers during an experiment, which consisted of a single session of N = 125 scans lasting $\mathit{TR} = 2.4$ s each. The main goal of this experiment was to quickly map several brain functions such as motor, visual and auditory responses, as well as higher cognitive functions like computation. Here, we only focus on the auditory and visual experimental conditions and so on the auditory-visual contrast of interest (referenced as A. − V. ).

Results

We compare the BOLD effect estimates for the two within-subject analyses under study. Fig. 3 clearly emphasizes for the A. − V. contrast that the JDE method achieves a better sensitivity (bilateral activations) in comparison with GLM-based inference when processing unsmoothed data. Indeed, the BOLD effects $\widehat{d}_{j}^{\mathrm{A.-V.}}$ have higher values in Fig. 3(b) and appear more enhanced. This is partly due to the modeling of spatial correlation using SMM in the JDE framework. As shown in Fig. 3(c)-[red line], notice that the HRF estimate $\widehat{\hat{x}}_{\mathcal{P}\,\,}$ computed in the mostly activating parcel deviates from the canonical shape depicted in Fig. 3(c)[green line].