
Mean Squared Error (MSE)

In plain English: When you fit a model to your data, you are naturally interested in how “far” your estimated model parameters are from the true values. The Mean Squared Error (MSE) of an estimator (for instance, β0-hat and β1-hat for a simple linear regression) measures the average squared difference between the values of the estimator (β1-hat) and the true value of the quantity being estimated (β1). MSE is a risk function: it is the expected value of the squared error, so it quantifies the error by averaging the squares of the individual errors.

It is the second moment (about the origin) of the error, and thus incorporates both the variance and the bias of the estimator.
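To see why both pieces show up, here is a short derivation. The notation is my own choice for illustration: θ is the true value and θ-hat is the estimator (think β1 and β1-hat). The trick is to add and subtract the estimator's expected value inside the square.

```latex
\begin{aligned}
\operatorname{MSE}(\hat\theta)
  &= \mathbb{E}\big[(\hat\theta - \theta)^2\big] \\
  &= \mathbb{E}\big[(\hat\theta - \mathbb{E}[\hat\theta]
       + \mathbb{E}[\hat\theta] - \theta)^2\big] \\
  &= \mathbb{E}\big[(\hat\theta - \mathbb{E}[\hat\theta])^2\big]
     + 2\,(\mathbb{E}[\hat\theta] - \theta)\,
       \mathbb{E}\big[\hat\theta - \mathbb{E}[\hat\theta]\big]
     + (\mathbb{E}[\hat\theta] - \theta)^2 \\
  &= \operatorname{Var}(\hat\theta) + \operatorname{Bias}(\hat\theta)^2
\end{aligned}
```

The cross term drops out because the expected deviation of θ-hat from its own mean is zero, and Bias(θ-hat) is defined as E[θ-hat] − θ.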

For an unbiased estimator, the bias term is zero, so the MSE is equal to the variance of the estimator.
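A quick simulation can make this concrete. The sketch below is only an illustration under assumptions I made up: a "true" model y = 2 + 0.5x + noise, a fixed set of x values, and NumPy's polyfit as the least-squares fitter. It refits the line many times and compares the MSE of β1-hat with its variance and squared bias.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed (arbitrary) for reproducibility

# Hypothetical "true" parameters, chosen only for this illustration
beta0, beta1, sigma = 2.0, 0.5, 1.0
n, n_sims = 30, 5000
x = np.linspace(0.0, 10.0, n)  # fixed design points

slopes = np.empty(n_sims)
for i in range(n_sims):
    # Draw a fresh sample from the assumed model and refit the line
    y = beta0 + beta1 * x + rng.normal(0.0, sigma, size=n)
    # np.polyfit returns coefficients highest degree first: [slope, intercept]
    slopes[i] = np.polyfit(x, y, 1)[0]

# Monte Carlo approximations of the three quantities
mse = np.mean((slopes - beta1) ** 2)       # average squared error of beta1-hat
variance = np.var(slopes)                  # spread around its own mean (ddof=0)
bias_sq = (np.mean(slopes) - beta1) ** 2   # squared bias

print(f"MSE      = {mse:.6f}")
print(f"variance = {variance:.6f}")
print(f"bias^2   = {bias_sq:.6f}")
```

Because ordinary least squares is unbiased under this setup, the squared bias should come out close to zero and the MSE should be almost entirely variance. The identity MSE = variance + bias² holds exactly for the simulated values, up to floating-point rounding.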

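The gist is that MSE is one of the most popular ways to quantify the “performance” of an estimator. Strictly speaking, the MSE of an estimator depends on the unknown true value, so it usually has to be estimated itself. When the MSE is computed for a set of predictions from a particular sample, however, it is a known, computed quantity and is therefore “sample-dependent.”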

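In regression analysis, the expression MSE refers to a sum of squared errors ∑( y  –  y-hat )² divided by either the sample size (typically when predictions are evaluated on “external,” out-of-sample data) or the degrees of freedom (when the residual sum of squares comes from the data used to fit the model). In the latter case, under the usual linear model assumptions, the MSE is an unbiased estimator of the error variance, which is why it is used to gauge how well the regression model fits.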
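Here is a small sketch of the two denominators side by side. Everything in it is assumed for illustration: the simulated model, the sample sizes, the helper name `simulate`, and the use of NumPy's polyfit for the fit.

```python
import numpy as np

rng = np.random.default_rng(1)  # fixed seed (arbitrary) for reproducibility

# Hypothetical "true" model, for illustration only
beta0, beta1, sigma = 2.0, 0.5, 1.0

def simulate(n):
    """Draw n (x, y) pairs from the assumed model y = beta0 + beta1*x + noise."""
    x = rng.uniform(0.0, 10.0, size=n)
    y = beta0 + beta1 * x + rng.normal(0.0, sigma, size=n)
    return x, y

x_train, y_train = simulate(50)   # data used to fit the line
x_test, y_test = simulate(200)    # "external" data, not used in the fit

# Least-squares fit; np.polyfit returns [slope, intercept] for degree 1
slope, intercept = np.polyfit(x_train, y_train, 1)

# Version 1: residual sum of squares divided by the degrees of freedom
resid = y_train - (intercept + slope * x_train)
rss = np.sum(resid ** 2)
p = 2  # number of fitted parameters (intercept and slope)
mse_df = rss / (len(y_train) - p)

# Version 2: squared prediction errors on external data, divided by its size
test_err = y_test - (intercept + slope * x_test)
mse_out = np.mean(test_err ** 2)

print(f"RSS / (n - p), in-sample     : {mse_df:.3f}")
print(f"mean squared prediction error: {mse_out:.3f}")
```

With a noise standard deviation of 1 in this made-up setup, both numbers should land in the neighborhood of 1, but they answer different questions: the first estimates the error variance, and the second measures how well the fitted line predicts new data.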

Source: Wikipedia, often the best resource out there.

(I will continue to post these as they come up in my work and studies. In my mind, these atomic pieces of knowledge in statistics are useful because they complete the details of the map.)