Mean Squared Error (MSE)

In plain English: When you are fitting a model to your data, you are naturally interested in how “far” your estimated model parameters (for instance, β0-hat and β1-hat for a simple linear regression) are from the true values (β0, β1). The Mean Squared Error (MSE) of an estimator quantifies exactly this: it averages the squares of the differences between the values of the estimator (β1-hat) and the true value of the quantity being estimated (β1). In other words, MSE is a risk function: the expected value of the squared error.
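To make that concrete, here is a small Monte Carlo sketch: simulate many samples from a simple linear regression with known parameters, fit the slope by ordinary least squares each time, and average the squared errors of β1-hat. (The true coefficients, design points, and noise scale below are illustrative choices, not from the post.)

```python
import random

random.seed(0)

# True parameters of the simple linear regression y = b0 + b1*x + noise
# (b0, b1, the x grid, and the noise scale are illustrative choices).
b0_true, b1_true = 2.0, 0.5
xs = [float(i) for i in range(20)]

def fit_slope(ys):
    """Ordinary least squares estimate of the slope b1-hat."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sxy / sxx

# MSE of b1-hat: average the squared error over many simulated samples.
n_reps = 5000
sq_errors = []
for _ in range(n_reps):
    ys = [b0_true + b1_true * x + random.gauss(0, 1) for x in xs]
    sq_errors.append((fit_slope(ys) - b1_true) ** 2)

mse_b1 = sum(sq_errors) / n_reps
print(mse_b1)  # small: the OLS slope is unbiased, so its MSE is just its variance
```

Because the OLS slope is unbiased here, the number printed is essentially the sampling variance of β1-hat, which for this design is sigma² / Sxx.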

It is the second moment (about the origin) of the error, and thus incorporates both the variance and the bias of the estimator: MSE(θ-hat) = Var(θ-hat) + Bias(θ-hat)².

For an unbiased estimator, therefore, the MSE is simply the variance of the estimator.
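The decomposition above can be checked numerically. The sketch below uses a deliberately biased estimator, the “divide by n” sample variance, and confirms that the empirical MSE equals the empirical variance plus the squared bias. (The sample size, true variance, and replication count are illustrative choices.)

```python
import random

random.seed(1)

# Check MSE = Var + Bias^2 for a deliberately biased estimator:
# the "divide by n" sample variance, whose expectation is (n-1)/n * sigma^2.
# (sigma^2 = 1, n = 10, and n_reps are illustrative choices.)
sigma2_true = 1.0
n, n_reps = 10, 20000

estimates = []
for _ in range(n_reps):
    xs = [random.gauss(0, 1) for _ in range(n)]
    m = sum(xs) / n
    estimates.append(sum((x - m) ** 2 for x in xs) / n)  # biased: divides by n

mean_est = sum(estimates) / n_reps
var_est = sum((e - mean_est) ** 2 for e in estimates) / n_reps
bias = mean_est - sigma2_true          # negative: this estimator undershoots
mse = sum((e - sigma2_true) ** 2 for e in estimates) / n_reps

# The two sides of MSE = Var + Bias^2 should agree:
print(mse, var_est + bias ** 2)
```

Note that for the empirical quantities the identity holds exactly (up to floating point), since the cross term averages to zero by construction.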

The gist is that MSE is one of the most popular ways to quantify the “performance” of an estimator. When computed from a particular sample it is a known quantity, and is therefore “sample-dependent.”

When used in regression analysis, the term MSE refers to the residual sum of squares ∑( y  –  y-hat )² divided either by the sample size (when evaluating predictions on external data) or by the residual degrees of freedom. In the latter case, the MSE gauges the performance of a regression model in the sense that it is an unbiased estimator of the error variance.
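The two denominators can be compared directly. The sketch below simulates many simple-linear-regression data sets with noise variance 1, fits each by ordinary least squares, and averages RSS/n versus RSS/(n − 2); the degrees-of-freedom version should land near the true noise variance. (The design and true coefficients are illustrative choices.)

```python
import random

random.seed(2)

# Compare RSS divided by the sample size with RSS divided by the residual
# degrees of freedom (n - 2 for simple linear regression). Averaged over
# many simulated data sets with noise variance sigma^2 = 1, the second
# version is unbiased for sigma^2. (Design and coefficients are illustrative.)
xs = [float(i) for i in range(15)]
n = len(xs)

def rss(ys):
    """Residual sum of squares after an OLS fit of y on x."""
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b1 = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sxx
    b0 = y_bar - b1 * x_bar
    return sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))

n_reps = 5000
mse_n, mse_df = 0.0, 0.0
for _ in range(n_reps):
    ys = [1.0 + 0.3 * x + random.gauss(0, 1) for x in xs]
    r = rss(ys)
    mse_n += (r / n) / n_reps          # divides by sample size
    mse_df += (r / (n - 2)) / n_reps   # divides by degrees of freedom

print(mse_n, mse_df)  # mse_df should be close to sigma^2 = 1
```

The RSS/n version systematically undershoots, because the fitted line uses up two degrees of freedom chasing the noise; dividing by n − 2 corrects exactly for that.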

Source: Wikipedia, often the best resource out there.

(I will continue to post these as they come up in my work and studies. In my mind, these atomic pieces of knowledge in statistics are useful because they complete the details of the map.)
