I was recently looking online for an explicit derivation of leverage — the diagonals of the “hat matrix” — in simple linear regression. If X is the n × p matrix of explanatory variables in a linear model, then the hat matrix is H=X(X’X)^{-1}X’, so called because it puts the “hat” on the predicted values, since Ŷ = HY.

Leverage (h_{i}) has a lot of nice properties that can be quite informative for model diagnostics. For example, if Y_{i} were to change by 1 unit, then Ŷ_{i} would change by h_{i} units. Leverage is also proportional to the uncertainty in the predicted values Ŷ_{i}, since Var(Ŷ)=σ^{2}H, where σ^{2} is the variance of the model residuals.
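This sensitivity property follows directly from the linearity of Ŷ = HY, and is easy to check numerically. Here is a quick sketch (my own illustration, not from the original post — the data are simulated):

```python
import numpy as np

# Simulate a small SLR dataset (illustrative values, arbitrary seed).
rng = np.random.default_rng(2)
x = rng.normal(size=10)
y = 2 + 3 * x + rng.normal(size=10)

# Design matrix with an intercept column, and the hat matrix H.
X = np.column_stack([np.ones_like(x), x])
H = X @ np.linalg.inv(X.T @ X) @ X.T

# Bump Y_3 by one unit; the change in Ŷ_3 should equal h_{3} = H[3, 3].
yhat = H @ y
y_bumped = y.copy()
y_bumped[3] += 1.0
yhat_bumped = H @ y_bumped

print(np.isclose(yhat_bumped[3] - yhat[3], H[3, 3]))  # True
```

Because Ŷ = HY is linear in Y, the same holds for any observation index and any size of perturbation (scaled accordingly).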

In the simple linear regression (SLR) case, if your model contains an intercept, then leverage takes the form

h_{i} = 1/n + (x_{i} − x̄)^{2} / Σ_{j}(x_{j} − x̄)^{2}.

Therefore, leverage is similar to Mahalanobis distance, which measures distance from the “center” of the data, standardized by the “scale” or variance. The same is true for multivariate linear regression with an intercept. However, the relationship isn’t entirely obvious (at least to me) from the hat matrix formula, and I couldn’t find the proof anywhere online (I’m sure it’s in someone’s linear models class notes but I didn’t find it from my digging around). So I wrote it up and posted it to my Resources page.
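To see the two formulas agree, here is a small numerical check (my own sketch, with simulated data): the diagonal of H computed from the matrix formula matches the closed-form SLR expression with the 1/n term.

```python
import numpy as np

# Simulated predictor values (arbitrary seed, illustrative only).
rng = np.random.default_rng(0)
x = rng.normal(size=20)
n = len(x)

# Design matrix with an intercept column.
X = np.column_stack([np.ones_like(x), x])

# Hat matrix H = X (X'X)^{-1} X', leverage = its diagonal.
H = X @ np.linalg.inv(X.T @ X) @ X.T
h_hat = np.diag(H)

# Closed-form SLR leverage: 1/n + (x_i - x̄)^2 / Σ_j (x_j - x̄)^2.
h_formula = 1 / n + (x - x.mean()) ** 2 / np.sum((x - x.mean()) ** 2)

print(np.allclose(h_hat, h_formula))  # True
```

The (x_{i} − x̄)^{2} / Σ_{j}(x_{j} − x̄)^{2} part is exactly a squared, scale-standardized distance from the center of the x’s, which is the Mahalanobis-distance connection.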

One interesting thing to note: if your model does not contain an intercept, this relationship doesn’t hold. However, if you center the x’s, the relationship above holds again, but without the 1/n term at the beginning.
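The no-intercept, centered case can be checked the same way. A sketch of what I take “center the x’s” to mean (a single centered column, no intercept; simulated data, my own illustration):

```python
import numpy as np

# Simulated predictor, then centered (arbitrary seed, illustrative only).
rng = np.random.default_rng(1)
x = rng.normal(size=15)
xc = x - x.mean()

# No-intercept design: a single column holding the centered x's.
Xc = xc[:, None]

# Hat matrix for the one-column design reduces to xc xc' / (xc'xc).
H = Xc @ np.linalg.inv(Xc.T @ Xc) @ Xc.T
h_hat = np.diag(H)

# Leverage formula without the 1/n term.
h_formula = xc ** 2 / np.sum(xc ** 2)

print(np.allclose(h_hat, h_formula))  # True
```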
