What is Standard Deviation?
Standard deviation is the square root of the sum of the squared deviations from the mean, divided by the number of observations. It is also called the ‘mean error deviation’, ‘mean square error deviation’ or ‘root mean square deviation’. It is a second-moment measure of dispersion. Because the sum of squared deviations is smallest when the deviations are measured from the mean, the deviations are always taken from the mean (not from the median or mode).
In other words, the standard deviation is the root mean square (RMS) of all the deviations from the mean. It is denoted by sigma (σ).
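The minimum property mentioned above can be checked with a short identity. As a sketch, let a be any reference value (a symbol introduced here for illustration only) and N the number of observations:
\[\sum{{{\left( X-a \right)}^{2}}}=\sum{{{\left( X-\overline{X} \right)}^{2}}}+N{{\left( \overline{X}-a \right)}^{2}}\]
The cross term vanishes because the deviations from the mean sum to zero. The last term is never negative and equals zero only when a is the mean, so the sum of squares is smallest when deviations are taken from the mean.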
Standard Deviation Formula
For discrete series without frequency it is given by:
\[Variance=\frac{\sum{{{\left( X-\overline{X} \right)}^{2}}}}{N}\qquad (1)\]
\[\sigma =\sqrt{Variance}\]
For discrete series with frequency, it is given by:
\[Variance=\frac{\sum{{{\left( X-\overline{X} \right)}^{2}}f}}{\sum{f}}\qquad (2)\]
\[\sigma =\sqrt{Variance}\]
Here, for a continuous series, ‘X’ is the mid value of the class interval.
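The two formulas above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names `population_variance` and `population_sd` and the sample data are assumptions made for this example.

```python
from math import sqrt

def population_variance(values, freqs=None):
    """Variance by formula (1), or by formula (2) when frequencies are given."""
    if freqs is None:
        # Formula (1) is formula (2) with every frequency equal to 1.
        freqs = [1] * len(values)
    total = sum(freqs)                                      # N or sum of f
    mean = sum(f * x for x, f in zip(values, freqs)) / total
    return sum(f * (x - mean) ** 2 for x, f in zip(values, freqs)) / total

def population_sd(values, freqs=None):
    """Standard deviation is the square root of the variance."""
    return sqrt(population_variance(values, freqs))

# Hypothetical data, chosen only for illustration.
print(population_sd([2, 4, 4, 4, 5, 5, 7, 9]))           # 2.0
print(population_sd([2, 4, 5, 7, 9], [1, 3, 2, 1, 1]))   # 2.0 (same data, with frequencies)
```

Both calls describe the same eight observations, so they give the same result.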
Formulas (1) and (2) also have the following alternative forms, which are especially convenient for grouped data:
\[\text{For (1):}\quad Variance=\frac{\sum{{{d}^{2}}}}{N}-{{\left( \overline{d} \right)}^{2}}\]
\[\sigma =\sqrt{Variance}\]
\[\text{For (2):}\quad Variance=\left( \frac{\sum{f{{d}^{2}}}}{\sum{f}}-{{\left( \frac{\sum{fd}}{\sum{f}} \right)}^{2}} \right)\times {{\left( C.F. \right)}^{2}}\]
\[\sigma =\sqrt{Variance}\]
\[\text{where } d=\frac{X-A}{C.F.}\]
\[A=\text{assumed mean},\quad C.F.=\text{class width (taken as 1 in the form for (1), so that } d=X-A\text{)}\]
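The step-deviation form for (2) can be sketched as follows. The function name `step_deviation_variance` and the three-class data set are hypothetical and serve only to show that the result does not depend on which assumed mean is chosen.

```python
def step_deviation_variance(mid_values, freqs, assumed_mean, class_width):
    """Variance of grouped data using step deviations d = (X - A) / C.F."""
    n = sum(freqs)
    d = [(x - assumed_mean) / class_width for x in mid_values]
    mean_fd = sum(f * di for f, di in zip(freqs, d)) / n         # sum(fd) / sum(f)
    mean_fd2 = sum(f * di ** 2 for f, di in zip(freqs, d)) / n   # sum(fd^2) / sum(f)
    return (mean_fd2 - mean_fd ** 2) * class_width ** 2

# Hypothetical grouped data: mid values 5, 15, 25 with class width 10.
mids, freqs = [5, 15, 25], [2, 5, 3]
print(step_deviation_variance(mids, freqs, assumed_mean=15, class_width=10))
print(step_deviation_variance(mids, freqs, assumed_mean=25, class_width=10))
```

Both calls should print a value of about 49, which is what the direct formula (2) gives for the same data. Choosing A near the middle of the distribution simply keeps the intermediate numbers small.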
Note: The square of the standard deviation is called the variance. It is denoted by σ².
Properties of Standard Deviation
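1. It is independent of a change of origin but not of a change of scale: adding a constant to every value leaves it unchanged, while multiplying every value by a constant multiplies it by the absolute value of that constant.
2. Standard deviation is always non-negative.
3. It is the least of all root-mean-square deviations, that is, the RMS deviation taken about any point other than the mean is larger.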
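These properties can be checked numerically. The sketch below uses Python's standard `statistics` module; the data values and the helper `rms_deviation` are made up for illustration.

```python
from math import sqrt
from statistics import fmean, median, pstdev

def rms_deviation(values, about):
    """Root-mean-square deviation of the values about an arbitrary point."""
    return sqrt(fmean([(x - about) ** 2 for x in values]))

data = [5, 10, 20, 25, 50]             # hypothetical values
shifted = [x + 100 for x in data]      # change of origin
scaled = [3 * x for x in data]         # change of scale

print(pstdev(data), pstdev(shifted))   # equal: origin does not matter
print(pstdev(scaled) / pstdev(data))   # about 3: scale does matter
print(rms_deviation(data, median(data)) > pstdev(data))   # True
```

Here the median (20) differs from the mean (22), so the RMS deviation about the median is strictly larger than the standard deviation.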
Combined Standard Deviation
Suppose one group of n₁ values has mean X̄₁ and standard deviation σ₁, and a second group of n₂ values has mean X̄₂ and standard deviation σ₂. Then the standard deviation of the two groups combined is given by:
\[Variance=\frac{{{n}_{1}}\left( {{\sigma }_{1}}^{2}+{{d}_{1}}^{2} \right)+{{n}_{2}}\left( {{\sigma }_{2}}^{2}+{{d}_{2}}^{2} \right)}{{{n}_{1}}+{{n}_{2}}}\]
\[\sigma =\sqrt{Variance}\]
\[\text{where } {{d}_{1}}=\overline{X}-\overline{{{X}_{1}}}\text{ and } {{d}_{2}}=\overline{X}-\overline{{{X}_{2}}}\]
\[\overline{X}=\text{combined mean of the } {{n}_{1}}\text{ and } {{n}_{2}}\text{ values}\]
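The combined formula can be sketched as a small function and checked against pooling the raw values directly. The name `combined_sd` and the two small groups are assumptions made for this illustration.

```python
from math import sqrt
from statistics import fmean, pstdev

def combined_sd(n1, mean1, sd1, n2, mean2, sd2):
    """Return (combined mean, combined SD) of two groups from their summaries."""
    combined_mean = (n1 * mean1 + n2 * mean2) / (n1 + n2)
    d1 = combined_mean - mean1
    d2 = combined_mean - mean2
    variance = (n1 * (sd1 ** 2 + d1 ** 2) + n2 * (sd2 ** 2 + d2 ** 2)) / (n1 + n2)
    return combined_mean, sqrt(variance)

# Hypothetical raw groups, used only to cross-check the formula.
group_a = [1, 2, 3]
group_b = [4, 6, 8, 10]

mean, sd = combined_sd(len(group_a), fmean(group_a), pstdev(group_a),
                       len(group_b), fmean(group_b), pstdev(group_b))
print(sd, pstdev(group_a + group_b))   # the two values should agree
```

The formula needs only the size, mean and standard deviation of each group, which is why it is useful when the raw observations are not available.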
Advantages of Standard Deviation
It is –
1. Rigidly defined
2. Based on all values
3. Capable of further algebraic treatment
4. Not very much affected by sampling fluctuations
Disadvantages of Standard Deviation
It is –
1. Difficult to understand
2. Gives undue weight to extreme values
3. Can’t be calculated for distributions with open-ended class intervals
Example 01
The means of two samples of sizes 50 and 100 are 54.1 and 50.3 respectively, and their standard deviations are 8 and 7 respectively. Obtain the SD of the combined group.
Solution:
\[Given,{{n}_{1}}=50,\overline{{{X}_{1}}}=54.1,{{\sigma }_{1}}=8\]
\[{{n}_{2}}=100,\overline{{{X}_{2}}}=50.3,{{\sigma }_{2}}=7\]
\[Now,\overline{X}=\frac{{{n}_{1}}\overline{{{X}_{1}}}+{{n}_{2}}\overline{{{X}_{2}}}}{{{n}_{1}}+{{n}_{2}}}\]
\[\Rightarrow \overline{X}=\frac{\left( 50\times 54.1 \right)+\left( 100\times 50.3 \right)}{50+100}=51.57\]
\[Variance=\frac{{{n}_{1}}\left( {{\sigma }_{1}}^{2}+{{d}_{1}}^{2} \right)+{{n}_{2}}\left( {{\sigma }_{2}}^{2}+{{d}_{2}}^{2} \right)}{{{n}_{1}}+{{n}_{2}}}\]
\[i.e.,{{\sigma }^{2}}=\frac{{{n}_{1}}\left( {{\sigma }_{1}}^{2}+{{d}_{1}}^{2} \right)+{{n}_{2}}\left( {{\sigma }_{2}}^{2}+{{d}_{2}}^{2} \right)}{{{n}_{1}}+{{n}_{2}}}\]
\[Where,{{d}_{1}}=\overline{X}-\overline{{{X}_{1}}},and,{{d}_{2}}=\overline{X}-\overline{{{X}_{2}}}\]
\[\therefore {{d}_{1}}=51.57-54.1=-2.53\]
\[\therefore {{d}_{2}}=51.57-50.3=1.27\]
\[\Rightarrow {{d}_{1}}^{2}=6.40,{{d}_{2}}^{2}=1.61\]
\[\therefore {{\sigma }^{2}}=\frac{50\left( {{8}^{2}}+6.40 \right)+100\left( {{7}^{2}}+1.61 \right)}{50+100}\]
\[\Rightarrow {{\sigma }^{2}}=\frac{3520+5061}{150}=\frac{8581}{150}=57.21\]
\[\therefore \sigma =\sqrt{57.21}=7.56\]
Therefore, the Standard Deviation is 7.56.
Example 02
Ten students of a class have obtained the following marks in a particular subject out of 100. Calculate SD: 5, 10, 20, 25, 40, 42, 45, 48, 70, 80
Solution:
\[\overline{X}=\frac{\sum{X}}{N}=\frac{385}{10}=38.5\]
Sl. No. | Marks (X) | \[d=X-\overline{X}\] | \[{{\left( X-\overline{X} \right)}^{2}}\] |
1. | 5 | -33.5 | 1122.25 |
2. | 10 | -28.5 | 812.25 |
3. | 20 | -18.5 | 342.25 |
4. | 25 | -13.5 | 182.25 |
5. | 40 | 1.5 | 2.25 |
6. | 42 | 3.5 | 12.25 |
7. | 45 | 6.5 | 42.25 |
8. | 48 | 9.5 | 90.25 |
9. | 70 | 31.5 | 992.25 |
10. | 80 | 41.5 | 1722.25 |
Total | ∑ X = 385 | | \[\sum{{{\left( X-\overline{X} \right)}^{2}}}=\sum{{{d}^{2}}}=5320.50\] |
\[\therefore Variance=\frac{\sum{{{\left( X-\overline{X} \right)}^{2}}}}{N}\]
\[\Rightarrow \sigma =\sqrt{\frac{\sum{{{\left( X-\overline{X} \right)}^{2}}}}{N}}=\sqrt{\frac{5320.50}{10}}=23.066\]
Hence, the SD is 23.066.
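The hand calculation can be cross-checked with Python's standard `statistics` module. Note that `pstdev` and `pvariance` divide by N, as this example does; `stdev` divides by N - 1 and would give a different figure.

```python
from statistics import fmean, pstdev, pvariance

marks = [5, 10, 20, 25, 40, 42, 45, 48, 70, 80]

print(fmean(marks))              # 38.5
print(pvariance(marks))          # population variance, 5320.50 / 10
print(round(pstdev(marks), 3))   # 23.066
```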
Example 03
The diastolic blood pressures of men are distributed as shown in the table. Find the SD and variance.
Pressure | 78-80 | 80-82 | 82-84 | 84-86 | 86-88 | 88-90 |
No. of Men | 3 | 15 | 26 | 23 | 9 | 4 |
Solution:
The table below sets out the frequency distribution together with the quantities required for calculating the standard deviation, taking the assumed mean A = 83 and the class width C.F. = 2.
Class Interval | Mid Value (X) | Frequency (f) | X-83 | d=(X-83)/2 | d² | fd | fd² |
78-80 | 79 | 3 | -4 | -2 | 4 | -6 | 12 |
80-82 | 81 | 15 | -2 | -1 | 1 | -15 | 15 |
82-84 | 83 | 26 | 0 | 0 | 0 | 0 | 0 |
84-86 | 85 | 23 | 2 | 1 | 1 | 23 | 23 |
86-88 | 87 | 9 | 4 | 2 | 4 | 18 | 36 |
88-90 | 89 | 4 | 6 | 3 | 9 | 12 | 36 |
Total | | ∑ f = 80 | | | | ∑ fd = 32 | ∑ fd² = 122 |
\[{{\sigma }^{2}}=\left( \frac{\sum{f{{d}^{2}}}}{\sum{f}}-{{\left( \frac{\sum{fd}}{\sum{f}} \right)}^{2}} \right)\times {{\left( C.F. \right)}^{2}}\]
\[\therefore {{\sigma }^{2}}=\left( \frac{122}{80}-{{\left( \frac{32}{80} \right)}^{2}} \right)\times {{\left( 2 \right)}^{2}}=\left( 1.525-0.16 \right)\times 4=5.46\]
Therefore, Variance = 5.46 (in squared units)
SD = σ = √5.46 = 2.337 units
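As a cross-check on the step-deviation arithmetic, the grouped data can be expanded into one mid value per man and passed to the standard `statistics` functions. This is only a sketch: like the hand calculation, it assumes every observation sits at the mid value of its class.

```python
from statistics import pstdev, pvariance

mid_values = [79, 81, 83, 85, 87, 89]
freqs = [3, 15, 26, 23, 9, 4]

# Repeat each mid value as many times as its frequency (80 values in total).
expanded = [x for x, f in zip(mid_values, freqs) for _ in range(f)]

print(round(pvariance(expanded), 2))   # 5.46
print(round(pstdev(expanded), 3))      # 2.337
```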