If the numbers come from a census of the entire population and not a sample, when we calculate the average of the squared deviations to find the variance, we divide by N, the number of items in the population. If the data are from a sample rather than a population, when we calculate the average of the squared deviations, we divide by n – 1, one less than the number of items in the sample.
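For reference, the two cases can be written out for data summarized by frequency. This is a sketch in standard notation rather than a quotation from the text: x̄ (the sample mean), μ (the population mean), and σ (the population standard deviation) are the usual conventions and are assumed here.

$$
s^2 = \frac{\sum f\,(x - \bar{x})^2}{n - 1}, \qquad s = \sqrt{\frac{\sum f\,(x - \bar{x})^2}{n - 1}}
$$

$$
\sigma^2 = \frac{\sum f\,(x - \mu)^2}{N}, \qquad \sigma = \sqrt{\frac{\sum f\,(x - \mu)^2}{N}}
$$

The only differences between the two lines are the mean used as the reference point and the divisor, n – 1 for a sample versus N for a population.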
In the formulas for the variance and standard deviation, f represents the frequency with which a value appears. For example, if a value appears once, f is one. If a value appears three times in the data set or population, f is three.
There are two important observations concerning the variance and standard deviation: the deviations are measured from the mean, and the deviations are squared. In principle, the deviations could be measured from any point. Our interest, however, is in measurement from the center of the data, the "normal" or most usual value of the observations. Later we will be trying to measure the "unusualness" of an observation or a sample mean, and thus we need a measure from the mean.
The second observation is that the deviations are squared. This does two things: first, it makes every nonzero deviation positive, and second, it changes the units of measurement from those of the mean and the original observations to the square of those units. If the data are weights, then the mean is measured in pounds, but the variance is measured in pounds-squared. One reason to use the standard deviation is to return to the original units of measurement by taking the square root of the variance. Further, squaring magnifies large deviations far more than small ones. For example, a deviation of 10 from the mean, when squared, is 100, but a deviation of 100 from the mean becomes 10,000. The effect is to place great weight on outliers when calculating the variance.
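The two effects of squaring described above can be seen in a short Python sketch. The weights below are hypothetical values invented for illustration (they do not come from the text); the last one is deliberately an outlier.

```python
# Hypothetical weights in pounds, invented for illustration only.
weights = [150, 152, 148, 151, 149, 250]  # the last value is an outlier

mean = sum(weights) / len(weights)

# Deviations are measured from the mean and carry the original units (pounds).
deviations = [x - mean for x in weights]

# Squared deviations are never negative and are in pounds squared.
squared = [d ** 2 for d in deviations]

# Fraction of the total squared deviation contributed by the outlier.
total_squared = sum(squared)
outlier_share = squared[-1] / total_squared

for x, d, sq in zip(weights, deviations, squared):
    print(f"x = {x:>3}  deviation = {d:+8.2f}  squared = {sq:9.2f}")
print(f"outlier share of the sum of squares: {outlier_share:.1%}")
```

In this made-up data set the single outlier accounts for most of the sum of squared deviations, which is the sense in which the variance places great weight on outliers.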
In a fifth grade class, the teacher was interested in the average age and the sample standard deviation of the ages of her students. The following data are the ages for a SAMPLE of n = 20 fifth grade students. The ages are rounded to the nearest half year:
9; 9.5; 9.5; 10; 10; 10; 10; 10.5; 10.5; 10.5; 10.5; 11; 11; 11; 11; 11; 11; 11.5; 11.5; 11.5;
The average age is 10.53 years, rounded to two places.
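As a quick check of the average, the twenty ages can be entered directly and the mean computed with Python's standard `statistics` module. The variable names `ages`, `n`, and `mean_age` are illustrative choices, not names from the text.

```python
from statistics import mean

# The 20 sampled ages listed above, rounded to the nearest half year.
ages = [9.0, 9.5, 9.5, 10.0, 10.0, 10.0, 10.0, 10.5, 10.5, 10.5, 10.5,
        11.0, 11.0, 11.0, 11.0, 11.0, 11.0, 11.5, 11.5, 11.5]

n = len(ages)          # sample size
mean_age = mean(ages)  # sample mean before rounding

print(n, mean_age)
```

The unrounded mean, 10.525, is the value used for the deviations in the table that follows; 10.53 is that value rounded to two places.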
The variance may be calculated by using a table. Then the standard deviation is calculated by taking the square root of the variance. We will explain the parts of the table after calculating s.
Data | Freq. | Deviations | Deviations² | (Freq.)(Deviations²) |
---|---|---|---|---|
x | f | (x – x̄) | (x – x̄)² | (f)(x – x̄)² |
9 | 1 | 9 – 10.525 = –1.525 | (–1.525)² = 2.325625 | 1 × 2.325625 = 2.325625 |
9.5 | 2 | 9.5 – 10.525 = –1.025 | (–1.025)² = 1.050625 | 2 × 1.050625 = 2.101250 |
10 | 4 | 10 – 10.525 = –0.525 | (–0.525)² = 0.275625 | 4 × 0.275625 = 1.1025 |
10.5 | 4 | 10.5 – 10.525 = –0.025 | (–0.025)² = 0.000625 | 4 × 0.000625 = 0.0025 |
11 | 6 | 11 – 10.525 = 0.475 | (0.475)² = 0.225625 | 6 × 0.225625 = 1.35375 |
11.5 | 3 | 11.5 – 10.525 = 0.975 | (0.975)² = 0.950625 | 3 × 0.950625 = 2.851875 |
Total | 20 | | | 9.7375 |
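
The columns of the table can be reproduced programmatically. The following is a minimal sketch, assuming the ages are held in a plain list and tallied with `collections.Counter`; the names `freq`, `x_bar`, and `total` are illustrative.

```python
from collections import Counter

ages = [9.0, 9.5, 9.5, 10.0, 10.0, 10.0, 10.0, 10.5, 10.5, 10.5, 10.5,
        11.0, 11.0, 11.0, 11.0, 11.0, 11.0, 11.5, 11.5, 11.5]

n = len(ages)
x_bar = sum(ages) / n   # sample mean, 10.525

freq = Counter(ages)    # maps each distinct age x to its frequency f

total = 0.0
for x in sorted(freq):
    f = freq[x]
    dev = x - x_bar          # deviation from the mean
    weighted = f * dev ** 2  # frequency times squared deviation
    total += weighted
    print(f"{x:>5} | {f} | {dev:+.3f} | {dev ** 2:.6f} | {weighted:.6f}")

print(f"sum of last column: {total:.4f}")
```

Each printed row corresponds to one row of the table, and the final line is the sum of the last column.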
The sample variance, s², is equal to the sum of the last column (9.7375) divided by the total number of data values minus one (20 – 1):
$$s^2 = \frac{9.7375}{20 - 1} = 0.5125$$
The sample standard deviation s is equal to the square root of the sample variance:
$$s = \sqrt{0.5125} \approx 0.715891$$
which is rounded to two decimal places, s = 0.72.
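The hand calculation can be cross-checked with Python's standard `statistics` module, where `variance` and `stdev` divide by n – 1 (sample) and `pstdev` divides by N (population). This is offered as a verification sketch; treating the same twenty ages as a population in the last line is done only to show the effect of the divisor.

```python
from statistics import pstdev, stdev, variance

ages = [9.0, 9.5, 9.5, 10.0, 10.0, 10.0, 10.0, 10.5, 10.5, 10.5, 10.5,
        11.0, 11.0, 11.0, 11.0, 11.0, 11.0, 11.5, 11.5, 11.5]

s2 = variance(ages)   # sample variance: sum of squared deviations / (n - 1)
s = stdev(ages)       # sample standard deviation: square root of s2
sigma = pstdev(ages)  # population standard deviation: divides by N instead

print(f"s^2 = {s2:.4f}, s = {s:.2f}, population version = {sigma:.4f}")
```

Because n – 1 is smaller than N, the sample standard deviation always comes out slightly larger than the population version computed from the same numbers.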