Sample Standard Deviation vs. Population Standard Deviation

I have an HP 50g graphing calculator and I am using it to calculate the standard deviation of some data. In the statistics calculation there is a “Type” setting which can have two values: “Sample” and “Population”.


I didn’t change it, but I kept getting the wrong results for the standard deviation. When I changed it to “Population” type, I started getting correct results!

Why is that? As far as I know, there is only one type of standard deviation, which is calculated as the root-mean-square of the values!

Did I miss something?


There are, in fact, two different formulas for standard deviation here: The population standard deviation σ and the sample standard deviation s.

If $x_1, x_2, \ldots, x_N$ denote all $N$ values from a population, then the (population) standard deviation is
$$\sigma = \sqrt{\frac{1}{N} \sum_{i=1}^N (x_i - \mu)^2},$$
where $\mu$ is the mean of the population.

If $x_1, x_2, \ldots, x_N$ denote $N$ values from a sample, however, then the (sample) standard deviation is
$$s = \sqrt{\frac{1}{N-1} \sum_{i=1}^N (x_i - \bar{x})^2},$$
where $\bar{x}$ is the mean of the sample.
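To make the difference concrete, here is a minimal Python sketch of the two calculations: the population version divides the sum of squared deviations by N, and the sample version divides it by N − 1. The function names `population_sd` and `sample_sd` and the data set are made up for illustration; they are not anything the calculator exposes.

```python
import math

def population_sd(values):
    """Standard deviation treating `values` as the whole population (divide by N)."""
    n = len(values)
    mu = sum(values) / n
    return math.sqrt(sum((x - mu) ** 2 for x in values) / n)

def sample_sd(values):
    """Standard deviation treating `values` as a sample (divide by N - 1)."""
    n = len(values)
    xbar = sum(values) / n
    return math.sqrt(sum((x - xbar) ** 2 for x in values) / (n - 1))

# Illustrative data: mean 5, sum of squared deviations 32.
data = [2, 4, 4, 4, 5, 5, 7, 9]
print(population_sd(data))  # sqrt(32 / 8) = 2.0
print(sample_sd(data))      # sqrt(32 / 7), slightly larger than 2.0
```

Python's standard library offers the same pair as `statistics.pstdev` and `statistics.stdev`, which mirrors the “Population” and “Sample” choice on the calculator.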

The reason for the change in formula with the sample is this: When you’re calculating $s$ you are normally using $s^2$ (the sample variance) to estimate $\sigma^2$ (the population variance). The problem, though, is that if you don’t know $\sigma$ you generally don’t know the population mean $\mu$, either, and so you have to use $\bar{x}$ in the place in the formula where you normally would use $\mu$. Doing so introduces a slight bias into the calculation: Since $\bar{x}$ is calculated from the sample, the values of $x_i$ are on average closer to $\bar{x}$ than they would be to $\mu$, and so the sum of squares $\sum_{i=1}^N (x_i - \bar{x})^2$ turns out to be smaller on average than $\sum_{i=1}^N (x_i - \mu)^2$. It just so happens that that bias can be corrected by dividing by $N-1$ instead of $N$. (Proving this is a standard exercise in an advanced undergraduate or beginning graduate course in statistical theory.) The technical term here is that $s^2$ (because of the division by $N-1$) is an unbiased estimator of $\sigma^2$.
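The bias can also be seen numerically. The following is a rough simulation sketch, not a proof: it draws many small samples from a normal population with variance 1 (an arbitrary choice for illustration), computes the sum of squared deviations from each sample mean, and averages the result of dividing by N and by N − 1. The function name, sample size, trial count, and seed are all assumptions of this example.

```python
import random

def mean_variance_estimates(n=5, trials=20000, seed=0):
    """Average the divide-by-N and divide-by-(N-1) variance estimates
    over many samples of size n drawn from a normal(0, 1) population."""
    rng = random.Random(seed)
    total_div_n = 0.0
    total_div_n_minus_1 = 0.0
    for _ in range(trials):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        xbar = sum(sample) / n
        ss = sum((x - xbar) ** 2 for x in sample)  # squared deviations from the sample mean
        total_div_n += ss / n
        total_div_n_minus_1 += ss / (n - 1)
    return total_div_n / trials, total_div_n_minus_1 / trials

biased, unbiased = mean_variance_estimates()
print(biased, unbiased)
```

With N = 5 the first average should land near (N − 1)/N = 0.8, and the second near the true variance of 1, up to simulation noise.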

Another way to think about it is that with a sample you have $N$ independent pieces of information. However, since $\bar{x}$ is the average of those $N$ pieces, if you know $x_1 - \bar{x}, x_2 - \bar{x}, \ldots, x_{N-1} - \bar{x}$, you can figure out what $x_N - \bar{x}$ is. So when you’re squaring and adding up the residuals $x_i - \bar{x}$, there are really only $N-1$ independent pieces of information there. So in that sense perhaps dividing by $N-1$ rather than $N$ makes sense. The technical term here is that there are $N-1$ degrees of freedom in the residuals $x_i - \bar{x}$.
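The degrees-of-freedom point rests on the fact that residuals about the sample mean always sum to zero, so the last one is determined by the others. A small Python sketch, using made-up data, shows this:

```python
import math

# Illustrative data; any list of numbers behaves the same way.
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
xbar = sum(data) / len(data)
residuals = [x - xbar for x in data]

# The residuals sum to zero, so the last one is minus the sum of the first N - 1.
recovered_last = -sum(residuals[:-1])

print(residuals[-1], recovered_last)
print(math.isclose(residuals[-1], recovered_last))
```

Knowing the first N − 1 residuals is therefore enough to reconstruct the Nth, which is why only N − 1 of them carry independent information.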

For more information, see Wikipedia’s article on the sample standard deviation.

Source : Link , Question Author : Rafid , Answer Author : Mike Spivey
