The Role of Centering in Dual Regression

Dual regression is perhaps the simplest way to obtain subject-specific estimates of ICA-based resting-state networks (RSNs).  RSNs are sets of brain regions that tend to act in a coordinated manner in the absence of a specific task, such as the three shown below (from a 50-component ICA of the Human Connectome Project).

Popular ICA toolboxes for fMRI analysis like GIFT and MELODIC can perform dual regression, but such a simple approach is also straightforward to implement on one’s own.  I end up doing this a lot (and would love to hear in the comments whether others do too or prefer to use existing implementations).  This post is about the important role of centering in dual regression, which is easy to overlook (at least for me) when implementing this simple technique.

In short, given a set of group-level independent components (ICs) each representing a resting-state network, dual regression estimates the subject-level ICs and their time courses by fitting two regression models.  In Regression 1, the group ICs are regressed against the subject’s fMRI data to estimate the subject-level time courses associated with each IC.  In Regression 2, those estimated time courses are regressed against the subject’s fMRI data to obtain estimates of subject-level ICs (spatial maps).  To formalize it a bit:

T – number of volumes (time points) in fMRI timeseries

V – number of vertices or voxels within the brain

Q – number of ICs

\mathbf{Y}_i\ ({T\times V}) – fMRI timeseries for subject i

\mathbf{S}_{grp}\ ({Q\times V}) – group-level ICs

\mathbf{S}_i\ ({Q\times V}) – subject-level ICs (to be estimated)

\mathbf{A}_i\ ({T\times Q}) – subject-level IC time courses (to be estimated)

Based on the probabilistic ICA model \mathbf{Y}_i =\mathbf{A}_i\mathbf{S}_i +\mathbf{E}_i, dual regression fits the following two regression models:

Regression 1: \mathbf{Y}_i' =\mathbf{S}_{grp}'\mathbf{A}_i' +\mathbf{E}_{i1}'.

Note that \mathbf{Y}_i has been transposed so that the V locations are treated as data observations.  \mathbf{S}_{grp}' is the design matrix, and we obtain the OLS estimates of the coefficients, \hat{\mathbf{A}}_i. This is a multivariate (and multiple) regression model, since the response \mathbf{Y}_i' is a matrix with T columns. In the absence of additional assumptions, this is just equivalent to fitting T separate linear regression models, each with V data observations.  (This makes it very computationally convenient, since we can use matrix operations to estimate the entire coefficient matrix in one step as \hat{\mathbf{A}}_i' = (\mathbf{S}_{grp}\mathbf{S}_{grp}')^{-1}\mathbf{S}_{grp}\mathbf{Y}_i'.)

Regression 2: \mathbf{Y}_i = \hat{\mathbf{A}}_i\mathbf{S}_i +\mathbf{E}_{i2}.

Note that in this model, \mathbf{Y}_i is not transposed, so that the T time points, rather than the V locations, are treated as data observations.  \hat{\mathbf{A}}_i serves as the design matrix, and we obtain the coefficient estimates \hat{\mathbf{S}}_i. Again, this can be estimated efficiently in one step using matrix operations as \hat{\mathbf{S}}_i = (\hat{\mathbf{A}}_i'\hat{\mathbf{A}}_i)^{-1}\hat{\mathbf{A}}_i'\mathbf{Y}_i.
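
To make the two steps concrete, here is a minimal numpy sketch of both regressions.  The arrays Y and S_grp are random placeholders with arbitrary dimensions, standing in for real data; centering, which the rest of this post is about, is deliberately omitted at this point.

import numpy as np

T, V, Q = 200, 5000, 3                 # time points, locations, components (arbitrary)
rng = np.random.default_rng(0)
Y = rng.standard_normal((T, V))        # placeholder for the subject's fMRI timeseries (T x V)
S_grp = rng.standard_normal((Q, V))    # placeholder for the group-level ICs (Q x V)

# Regression 1: Y' = S_grp' A' + E, with design matrix S_grp' (V x Q).
# lstsq returns A' (Q x T), equivalent to (S_grp S_grp')^{-1} S_grp Y'.
A_hat = np.linalg.lstsq(S_grp.T, Y.T, rcond=None)[0].T    # subject time courses (T x Q)

# Regression 2: Y = A_hat S + E, with design matrix A_hat (T x Q).
# lstsq returns S (Q x V), equivalent to (A_hat' A_hat)^{-1} A_hat' Y.
S_hat = np.linalg.lstsq(A_hat, Y, rcond=None)[0]          # subject spatial maps (Q x V)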

Since neither regression model includes an intercept, both the data and design matrix must be centered in order to avoid model misspecification.  And since each model involves a different data matrix and design matrix, we need to worry about centering 4 different matrices. First, though, a brief review of the role of the intercept and centering in linear regression. 

Why center in linear regression?

The goal of simple linear regression is to describe the linear relationship between an explanatory variable x and a response variable y.  The left-hand scatterplot below shows a classic motivating example. In this model, the true intercept is not zero, and fitting the model without the intercept results in a flawed estimate of the relationship between x and y (red line). The model that does include the intercept (blue line) correctly captures the relationship.

Generally, an intercept is required in linear regression, unless (1) there is reason to believe that the intercept is truly zero, or (2) both the response and (each) predictor have been centered.  An example of the former might be the relationship between the gestational age of a fetus and its size (although that relationship would likely not be linear).  If the response and predictor(s) have been centered, however, the intercept is exactly zero and can therefore be excluded from the model.  The right-hand scatterplot below shows the same data after centering x and y, overlaid with the estimated linear regression lines with and without an intercept.  The two fits are now exactly the same, showing that the intercept can safely be dropped from the model.
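
As a quick numerical sanity check of that last point (using simulated data rather than the scatterplot above), the no-intercept slope on centered data matches the with-intercept slope on the raw data, while the no-intercept slope on the raw data does not:

import numpy as np

rng = np.random.default_rng(1)
x = rng.uniform(0, 10, 100)
y = 5.0 + 2.0 * x + rng.standard_normal(100)    # true intercept 5, true slope 2

# Fit with an intercept on the raw data
X = np.column_stack([np.ones_like(x), x])
b0, b1 = np.linalg.lstsq(X, y, rcond=None)[0]

# Fit without an intercept on the raw data: the slope is biased
b1_noint = np.linalg.lstsq(x[:, None], y, rcond=None)[0][0]

# Fit without an intercept after centering x and y: the slope matches b1
xc, yc = x - x.mean(), y - y.mean()
b1_centered = np.linalg.lstsq(xc[:, None], yc, rcond=None)[0][0]

print(b1, b1_centered, b1_noint)    # the first two agree; the third does not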

Dual regression requires centering across time and space

Coming back to dual regression, then, we should either include an intercept in each model OR properly center both the “data” and the “design matrix” in both regressions.  I find it simpler to do the latter and will therefore focus on centering.  A key point is that we need to center across observations in both regressions, where in Regression 1 the observations are the V locations, and in Regression 2 the observations are the T time points. Therefore, the data must be centered across both space (remove the global or gray matter signal) and time (remove the mean image, centering each time course at zero).

This double-centering can be performed in two steps, and it turns out that it doesn’t matter which is done first.  To see this, allow me to get a bit matrix-y for a moment.  Note that the mean across time for each location is \tfrac{1}{T}\mathbf{1}_T'\mathbf{Y}_i, and the data centered across time is \tilde{\mathbf{Y}}_i = \mathbf{Y}_i -  \tfrac{1}{T}\mathbf{1}_T\mathbf{1}_T'\mathbf{Y}_i =(\mathbf{I}_T -  \tfrac{1}{T}\mathbf{1}_T\mathbf{1}_T')\mathbf{Y}_i.  This thing (\mathbf{I}_T -  \tfrac{1}{T}\mathbf{1}_T\mathbf{1}_T') is a centering matrix, which is symmetric and idempotent.  Now to center the other way, we can do the same thing to \tilde{\mathbf{Y}}_i after transposing.  The data centered across time then space is \tilde{\tilde{\mathbf{Y}}}_i' = (\mathbf{I}_V -  \tfrac{1}{V}\mathbf{1}_V\mathbf{1}_V')\tilde{\mathbf{Y}}_i' =(\mathbf{I}_V -  \tfrac{1}{V}\mathbf{1}_V\mathbf{1}_V')\mathbf{Y}_i'(\mathbf{I}_T -  \tfrac{1}{T}\mathbf{1}_T\mathbf{1}_T').  By doing it the other way (centering across space first, then time), we would have ended up with (\mathbf{I}_T -  \tfrac{1}{T}\mathbf{1}_T\mathbf{1}_T')\mathbf{Y}_i(\mathbf{I}_V -  \tfrac{1}{V}\mathbf{1}_V\mathbf{1}_V'), which is just \tilde{\tilde{\mathbf{Y}}}_i.  This also shows that the second centering does not somehow “mess up” the first centering, since centering in either order leads to the same result.
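
For anyone who prefers code to centering matrices, here is a small numerical check (on an arbitrary random matrix) that centering across time then space gives exactly the same result as centering across space then time:

import numpy as np

rng = np.random.default_rng(2)
T, V = 50, 200
Y = rng.standard_normal((T, V))    # stand-in for a T x V timeseries

def center_cols(M):
    # remove the mean of each column, i.e. center across the rows (the observations)
    return M - M.mean(axis=0, keepdims=True)

# Center across time (columns of Y), then across space (columns of Y')
Y_time_space = center_cols(center_cols(Y).T).T
# Center across space first, then across time
Y_space_time = center_cols(center_cols(Y.T).T)

print(np.allclose(Y_time_space, Y_space_time))    # True: the order does not matter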

This double-centering prior to Regression 1 (along with centering the group ICs \mathbf{S}_{grp} across locations, i.e. \tilde{\mathbf{S}}_{grp}' = (\mathbf{I}_V -  \tfrac{1}{V}\mathbf{1}_V\mathbf{1}_V')\mathbf{S}_{grp}') also happens to take care of the second regression, because \hat{\mathbf{A}}_i ends up centered.  To see this, note that \hat{\mathbf{A}}_i' = (\tilde{\mathbf{S}}_{grp}\tilde{\mathbf{S}}_{grp}')^{-1}\tilde{\mathbf{S}}_{grp}\tilde{\tilde{\mathbf{Y}}}_i'.  This can be written \hat{\mathbf{A}}_i' = (\tilde{\mathbf{S}}_{grp}\tilde{\mathbf{S}}_{grp}')^{-1}\tilde{\mathbf{S}}_{grp}(\mathbf{I}_V -  \tfrac{1}{V}\mathbf{1}_V\mathbf{1}_V')\mathbf{Y}_i'(\mathbf{I}_T -  \tfrac{1}{T}\mathbf{1}_T\mathbf{1}_T'), where (\mathbf{I}_V -  \tfrac{1}{V}\mathbf{1}_V\mathbf{1}_V')\mathbf{Y}_i' is the data centered across locations only (the centering required for Regression 1).  The part of \hat{\mathbf{A}}_i' preceding the final centering matrix is exactly what we would have obtained if we had centered \mathbf{Y}_i across space but not across time.  Therefore, by centering \mathbf{Y}_i across time as well as space, our estimate of \mathbf{A}_i is itself centered across time.  So both the data and the design matrix for Regression 2 are centered, and we’re good to go.
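
Putting the pieces together, here is a sketch of the fully centered dual regression on the same kind of placeholder arrays as before.  The check in the middle confirms the point above: after double-centering \mathbf{Y}_i and centering \mathbf{S}_{grp} across locations, the estimated time courses come out centered across time with no extra work.

import numpy as np

rng = np.random.default_rng(3)
T, V, Q = 200, 5000, 3
Y = rng.standard_normal((T, V))        # placeholder subject timeseries (T x V)
S_grp = rng.standard_normal((Q, V))    # placeholder group ICs (Q x V)

# Center the data across time (each location's time course) and across space (each volume)
Yc = Y - Y.mean(axis=0, keepdims=True)
Yc = Yc - Yc.mean(axis=1, keepdims=True)
# Center the group ICs across locations
Sc = S_grp - S_grp.mean(axis=1, keepdims=True)

# Regression 1: subject time courses (T x Q)
A_hat = np.linalg.lstsq(Sc.T, Yc.T, rcond=None)[0].T
print(np.allclose(A_hat.mean(axis=0), 0.0))    # True: A_hat is already centered across time

# Regression 2: subject spatial maps (Q x V)
S_hat = np.linalg.lstsq(A_hat, Yc, rcond=None)[0]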

What happens if we don’t center both ways?

Let’s see what happens if we fail to center the fMRI timeseries data both ways.  Using subject 100206 of the Human Connectome Project, we’ll estimate via dual regression the subject-level ICs associated with the three group ICs shown at the beginning.  Each slideshow below shows one of the ICs (motor, auditory and executive control) and toggles through the cases where the data were centered across space only, time only, or both space and time.

The most obvious problem occurs when we center across space only: we remove the global or gray matter signal, but not the mean image, which then gets absorbed into the IC estimates.  When we center across time only, we remove the mean image but not the global signal.  In this case, we see more widespread “activation” in all three ICs.  We’d also expect (perhaps even greater) contamination of the time courses associated with each IC in this case, since by failing to center across space, Regression 1 is misspecified.

[Slideshow: motor network, estimated with data centered across space only, time only, and both space and time]

[Slideshow: auditory network, same three centering cases]

[Slideshow: executive control network, same three centering cases]

So in conclusion, before performing dual regression using your own implementation, center, center, and center some more!
