**THE STANDARD DEVIATION**

** of the mean is that they think that this attitude will bring them happiness. It will not.**

The standard deviation of a set of data is a measure of the spread of the data about the arithmetic mean value. Roughly speaking, it represents the typical amount by which each value differs from the mean.

*Alarm and despondency are spreading through the ranks of the RSPB (Royal Society for the Protection of Beards). The trustees of the organization are deeply concerned that stubbly “designer beards” have been observed infecting the chins of some members. These “shorty” appendages are strictly illegal (a minimum permissible length of 0.47m and a willingness to house a homeless badger being the basic requirements for members). You have been employed to investigate suspect groups and report back to the Minimum Standards (Pogonophilia) Committee.*

## You collect the following data:

| Membership Category | Beard lengths in metres | | | | | | | | | |
|---|---|---|---|---|---|---|---|---|---|---|
| Ecologists | 0.1 | 0.9 | 0.4 | 0.5 | 0.5 | 0.5 | 0.5 | 0.6 | 0.5 | 0.5 |
| Druids | 0.4 | 0.5 | 0.5 | 0.5 | 0.5 | 0.5 | 0.5 | 0.5 | 0.5 | 0.6 |
| Eco-Warriors | 0.1 | 0.2 | 0.5 | 0.8 | 0.5 | 0.7 | 0.4 | 0.5 | 0.6 | 0.7 |

Plot out the data for each category of RSPB member as a line plot:

Axis is in metres. Mean Ecologist beard-length = 0.5m

Axis is in metres. Mean Druid beard-length = 0.5m

Axis is in metres. Mean Eco-Warrior beard-length = 0.5m

Notice that the arithmetic mean beard length of all groups is the same (0.5m). The data for each group are however obviously different in the way that individual data points are scattered about the mean.
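If you would rather check the means than take them on trust, here is a minimal Python sketch. The dictionary name `beards` and the variable `group_means` are just labels chosen for this illustration; the numbers are the ones from the table above.

```python
from statistics import mean

# Beard lengths in metres, copied from the data table
beards = {
    "Ecologists":   [0.1, 0.9, 0.4, 0.5, 0.5, 0.5, 0.5, 0.6, 0.5, 0.5],
    "Druids":       [0.4, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.6],
    "Eco-Warriors": [0.1, 0.2, 0.5, 0.8, 0.5, 0.7, 0.4, 0.5, 0.6, 0.7],
}

# Arithmetic mean of each group, rounded to hide floating-point noise
group_means = {group: round(mean(lengths), 3) for group, lengths in beards.items()}

for group, m in group_means.items():
    print(f"{group}: mean = {m} m")
```

All three groups should come out at 0.5 m, which is exactly why the mean alone cannot tell them apart.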

Most Ecologists (six out of ten) have beards of 0.5m (the mean value) but there are a couple of illegally short beards and two bigger growths (one you will note is of such massive proportions that it could conceal a veritable menagerie of nesting creatures).

Druids are very consistent in their beard lengths. Eight out of ten measured growths are of 0.5m. There is just one shorter beard (probably a trainee) and one slightly bigger version belonging to a non-conformist druid.

The Eco-Warriors’ beard lengths are much more spread out around the mean, reflecting an inconsistent approach to bristle growth and (in 3 cases) *blatant* disregard for the constitution of the RSPB.

How can we describe the differences in the 3 data sets? What is needed is a measure that indicates the average amount that each piece of data is different from the mean. The standard deviation is just such a number.

Here is how it’s calculated:

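| x | x – x̄ | (x – x̄)² |
|---|---|---|
| 0.1 | –0.4 | 0.16 |
| 0.9 | 0.4 | 0.16 |
| 0.4 | –0.1 | 0.01 |
| 0.5 | 0 | 0 |
| 0.5 | 0 | 0 |
| 0.5 | 0 | 0 |
| 0.5 | 0 | 0 |
| 0.6 | 0.1 | 0.01 |
| 0.5 | 0 | 0 |
| 0.5 | 0 | 0 |
| Σ = 5.0 | Σ = 0 | Σ = 0.34 |

x = a piece of data, x̄ (x-bar) = the mean, Σ (sigma) = the sum of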

The first column (from the left) is each piece of raw data.

The second column from the left is the distance of each piece of raw data from the mean (i.e. Each piece of data minus the mean).

Note that if we just added this column up the values would cancel each other out and we end up with a value of zero. Obviously this is no use as a measure of scatter of the data about the mean.
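The cancelling-out is easy to see for yourself. The sketch below uses the Ecologists' data; the variable names are made up for this example, and because computers store decimals imprecisely the total may come out as a vanishingly small number rather than a perfect zero.

```python
# Ecologists' beard lengths in metres
ecologists = [0.1, 0.9, 0.4, 0.5, 0.5, 0.5, 0.5, 0.6, 0.5, 0.5]

mean_length = sum(ecologists) / len(ecologists)

# Second column: each piece of data minus the mean
deviations = [x - mean_length for x in ecologists]

# Adding the column up: positives and negatives cancel
total_deviation = sum(deviations)
print(round(total_deviation, 10))
```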

The way around this problem is to square each of the values from the second column. The negative values now become positive values and we can add them up. This has been done at the bottom of the third column (= 0.34).

So we now have a value (0.34) that represents the total squared deviation of all our Ecologists’ beard lengths from the mean. If we divided that value by the number of items of data we would have a measure of the mean squared amount by which each bit of data varies from the mean.
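The squaring step can be sketched in a few lines of Python. Again the names (`squared_deviations`, `sum_of_squares`) are only illustrative, and the result is rounded for printing because of floating-point noise.

```python
# Ecologists' beard lengths in metres
ecologists = [0.1, 0.9, 0.4, 0.5, 0.5, 0.5, 0.5, 0.6, 0.5, 0.5]

mean_length = sum(ecologists) / len(ecologists)

# Third column: square each deviation so that negatives become positive
squared_deviations = [(x - mean_length) ** 2 for x in ecologists]

# The total at the bottom of the third column
sum_of_squares = sum(squared_deviations)
print(round(sum_of_squares, 2))
```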

There is a slight complication though. If we could measure every single ecological beard in the world we could calculate the actual mean length of the whole population (**μ**) and measure the scatter about that.

Our sample, however, has only ten measurements in it, and we have had to use the sample mean in place of μ. Deviations measured from the sample mean tend to come out slightly too small, so to give us a more realistic estimate of the spread in the whole population we divide by one less than the number of measurements. In our case that = 10 – 1 = 9. Statisticians call this value (n – 1) the degrees of freedom. When working from a sample you almost always use n – 1 to calculate standard deviation.
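To see how much difference the choice of divisor makes, here is a small comparison using Python's standard `statistics` module, where `pvariance` divides by n and `variance` divides by n – 1. The variable names are invented for this sketch.

```python
from statistics import pvariance, variance

# Ecologists' beard lengths in metres
ecologists = [0.1, 0.9, 0.4, 0.5, 0.5, 0.5, 0.5, 0.6, 0.5, 0.5]

# Dividing the sum of squares by n (treats the data as the whole population)
divide_by_n = pvariance(ecologists)

# Dividing by n - 1 (treats the data as a sample from a bigger population)
divide_by_n_minus_1 = variance(ecologists)

print(f"divide by n:     {divide_by_n:.4f}")
print(f"divide by n - 1: {divide_by_n_minus_1:.4f}")
```

With only ten beards the n – 1 version is noticeably bigger; with hundreds of beards the two would be almost identical.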

0.34/9 = 0.0378

We now have a number (0.0378) representing the scatter or spread of our Ecologists’ beard lengths about the mean value. Statisticians call this number the variance of the data and it is used in numerous ways that do not immediately concern us here.

Remember we squared all our differences from the mean to get rid of the negative values. To complete our calculation of the standard deviation we must take the square root to convert the number back to its original units.

√0.0378 = 0.194 = standard deviation of Ecologists’ beard lengths in metres

The calculation we have just done can be represented by this formula:

$$s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$$

where s is the standard deviation of the sample and n is the number of measurements.

Now, think back to our original beard length data. We needed a measure of the scatter of data points about the mean value. We now have it: the standard deviation. If we do the calculations for the Druids and the Eco-Warriors as well we get the following values:

Standard deviation (Ecologists) = 0.194

Standard deviation (Druids) = 0.047

Standard deviation (Eco-Warriors) = 0.221

Notice that the Druids (whose beards were all clustered around the mean) have a very small value.

The Ecologists (whose beards were mostly clustered around the mean but not to the same extent as the Druids) have a bigger value.

The Eco-Warriors whose beard lengths were all over the place (i.e. widely scattered about the mean) have the biggest value of all.
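For anyone who wants to reproduce all three values, the whole recipe (subtract the mean, square, add up, divide by n – 1, take the square root) fits in one short function. This is a sketch, not the only way to do it: `sample_sd` is a hypothetical helper written for this post, and it is cross-checked against `stdev` from Python's standard `statistics` module, which also divides by n – 1.

```python
from math import sqrt
from statistics import stdev

def sample_sd(values):
    """Sample standard deviation, dividing by n - 1."""
    n = len(values)
    mean_value = sum(values) / n
    # Sum of squared differences from the mean
    sum_of_squares = sum((x - mean_value) ** 2 for x in values)
    # Divide by the degrees of freedom, then return to the original units
    return sqrt(sum_of_squares / (n - 1))

# Beard lengths in metres, copied from the data table
beards = {
    "Ecologists":   [0.1, 0.9, 0.4, 0.5, 0.5, 0.5, 0.5, 0.6, 0.5, 0.5],
    "Druids":       [0.4, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.6],
    "Eco-Warriors": [0.1, 0.2, 0.5, 0.8, 0.5, 0.7, 0.4, 0.5, 0.6, 0.7],
}

results = {group: sample_sd(lengths) for group, lengths in beards.items()}

for group, sd in results.items():
    library_sd = stdev(beards[group])
    print(f"{group}: {sd:.3f} m (statistics.stdev gives {library_sd:.3f} m)")
```

The Druids should come out smallest and the Eco-Warriors biggest, matching the ranking described above.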

Popular computer programmes (like Microsoft Excel for instance) will do all this and much more. Scientific calculators will also save you the tedium of multiple calculations if you know how to operate their statistical functions. Remember though that it might be a requirement that you show your working.

### Look out for Blog 21…….. it won’t be about statistics.
