# Relationship between standard deviation and root mean square error

### Standard deviation of residuals or Root-mean-square error (RMSD) (video) | Khan Academy


## Mean squared error

We could consider this to be the standard deviation of the residuals, and that's essentially what we're going to calculate. You could also call it the root-mean-square error, and you'll see why it's called this, because the name really describes how we calculate it.

So, what we're going to do is look at the residuals for each of these points and then we're going to find the standard deviation of them. So, just as a bit of review, the ith residual is going to be equal to the ith Y value for a given X minus the predicted Y value for a given X.


Now, when I say Y hat right over here, this just says what would the linear regression predict for a given X? And this is the actual Y for a given X.
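In symbols, the residual described above and the root-mean-square quantity built from it can be sketched as follows. The symbols $r_i$ and $n$ are assumed notation, not taken from the transcript, and this version divides by $n$; some treatments divide by $n - 1$ or $n - 2$ instead.

$$
r_i = y_i - \hat{y}_i, \qquad
\mathrm{RMSD} = \sqrt{\frac{1}{n}\sum_{i=1}^{n} r_i^{2}}
$$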


So, for example, and we've done this in other videos, this is all review: for the residual here, when X is equal to one, we have Y is equal to one, but what was predicted by the model is 2. Now, for the residual over here, you also have the actual point being higher than the model, so this is also going to be a positive residual. Once again, when X is equal to three, the actual Y is six, the predicted Y is 2.


So, you have six minus 5. So, once again you have a positive residual.

Now, for this point that sits right on the model, the actual is the predicted: when X is two, the actual is three and what was predicted by the model is three, so the residual here is three minus three, which is equal to zero. And then, last but not least, you have this data point where the residual is going to be the actual, which when X is equal to two is two, minus the predicted.

Well, when X is equal to two, you have 2.


So, two minus three is equal to negative one.

This is no longer so, because the calculations are done by computer.
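The four residuals walked through in the transcript can be checked numerically. The sketch below assumes the model is the ordinary least-squares line through the four points mentioned (the transcript only says "linear regression" and its decimal values are cut off), and the helper `rms_residual` with its `lost_dof` parameter is a hypothetical name introduced here to show the effect of the choice of denominator.

```python
import numpy as np

# The four (x, y) points, in the order the transcript mentions them.
x = np.array([1.0, 3.0, 2.0, 2.0])
y = np.array([1.0, 6.0, 3.0, 2.0])

# Assumption: the model is the ordinary least-squares line through these points.
# np.polyfit returns coefficients from highest degree to lowest.
slope, intercept = np.polyfit(x, y, deg=1)

y_hat = slope * x + intercept   # predicted Y for each X
residuals = y - y_hat           # actual minus predicted


def rms_residual(res, lost_dof=0):
    """Square root of the summed squared residuals divided by (n - lost_dof).

    lost_dof=0 gives the plain root mean square; lost_dof=1 or 2 gives the
    variants that divide by n - 1 or n - 2.
    """
    res = np.asarray(res, dtype=float)
    return float(np.sqrt(np.sum(res ** 2) / (res.size - lost_dof)))


print("slope, intercept:", slope, intercept)
print("residuals:", residuals)
for lost in (0, 1, 2):
    print(f"divide by n - {lost}:", rms_residual(residuals, lost_dof=lost))
```

Under this assumption the signs agree with the transcript: two positive residuals, one zero, and one equal to negative one. Which denominator to report depends on the convention being followed, so the loop prints all three.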

Now, to address your concerns about the standard deviation. To be honest, yes, the standard deviation is superficially less intuitive than the mean deviation. I don't think anyone's arguing that point.

However, the reason we teach it is because it is vastly more useful. There wouldn't be any point teaching you something, and saying, "Well, this is easy to understand, but there's no real reason to use it". The standard deviation just fits in nicely.


It's slightly more advanced than school stuff, but really, it'd be horrible to see how you'd have to state, say, the central limit theorem or Chebyshev's inequality using the mean deviation (if you even could).

But here's a slightly more accessible illustration of why the standard deviation is in some way "good". Let's say that, instead of taking the standard deviation from the mean, we'll take it from an arbitrary point x.

That is, we take all the differences between our data points and x, square them, add them up, and take the square root. We get something that looks like a standard deviation, as if we could put the mean wherever we wanted. Now, if we plot the graph we get from this, it turns out there's a single point where the "standard deviation" is minimised. Any guesses as to what this point is in relation to our sample data?
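The experiment described above is easy to try numerically. The sketch below uses a made-up sample (the values in `data` are hypothetical) and a helper named `spread_about`, also hypothetical, that follows the recipe literally: differences from a chosen point, squared, summed, then square-rooted. It scans a grid of candidate points and reports where the result is smallest.

```python
import numpy as np

# Hypothetical sample data, chosen only for illustration.
data = np.array([2.0, 4.0, 4.0, 5.0, 7.0, 9.0])


def spread_about(point, values):
    """Square root of the summed squared differences between values and point."""
    return float(np.sqrt(np.sum((values - point) ** 2)))


# Candidate points on a grid with step 0.01 across the range of the data.
candidates = np.linspace(data.min(), data.max(), 701)
spreads = np.array([spread_about(c, data) for c in candidates])

best = candidates[np.argmin(spreads)]

print("grid point with the smallest spread:", best)
print("sample mean:", data.mean())
```

The minimising grid point lands next to the sample mean, to within the grid spacing: the sum of squared differences from a point is smallest when that point is the mean.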