The following points highlight the four main measures of dispersion. They are: 1. Range 2. Quartile Deviation 3. Average Deviation or AD 4. Standard Deviation or SD.
1. Range:
Range is the difference between the highest and the lowest scores in a series. In the words of R. W. Marks, the range is “a set of all values that can be represented by a given variable or function in a specific mathematical statement.” According to Garrett, “Range is the interval between the highest and the lowest scores.” The range is the simplest and most general measure of dispersion.
The formula for calculating range is given below:
Range = The highest score – the lowest score.
Suppose that in a distribution the highest score is 80 and the lowest score is 15.
Range = The highest score – the lowest score i.e., 80 – 15 = 65.
If the scores in a series are 6, 18, 9, 57, 43, 62, 98, then the range is calculated as under:
The highest score = 98
The lowest score = 6
The Range = The highest score – The lowest score
= 98-6 = 92
Steps in computing Range:
1. Find the highest score of the data.
2. Find the lowest score of the data.
3. Subtract the lowest score from the highest score.
4. Report the result.
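The steps above can be sketched in a few lines of Python. This is only an illustration: the function name `score_range` is an assumption made for this example, and the data are the seven scores used in the text.

```python
def score_range(scores):
    """Return the range: the highest score minus the lowest score."""
    if not scores:
        raise ValueError("scores must not be empty")
    highest = max(scores)   # step 1: find the highest score
    lowest = min(scores)    # step 2: find the lowest score
    return highest - lowest  # step 3: subtract the lowest from the highest


scores = [6, 18, 9, 57, 43, 62, 98]
print(score_range(scores))  # step 4: report the result, 98 - 6 = 92
```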
Characteristics of Range:
It is useful only for a rough comparison of groups, and only when the groups are small.
It takes into account the extreme measures only.
It is not a reliable measure of variability.
Limitations of Range:
1. The range is not useful for the purpose of comparison when the groups are very large.
2. It is also not useful when there are many gaps in the distribution.
3. It is not considered a reliable measure of dispersion.
When to use Range?
1. When the data are too scant or too scattered to justify the computation of a more precise measure of variability.
2. When a knowledge of extreme scores or of total spread is all that is required.
2. Quartile Deviation or Q:
Quartile Deviation or Q is sometimes known as the Semi-inter-quartile Range.
The Quartile Deviation or Q is one half of the scale distance between the 75th and 25th percentiles in a frequency distribution. The 25th percentile or Q1 is the first quartile on the score scale, the point below which lie 25% of the scores. The 75th percentile or Q3 is the third quartile on the score scale, the point below which lie 75% of the scores. According to English & English, “Quartile deviation or Q is half of the distance between quartiles one and three.”
When we have these two points the Quartile Deviation or Q is found from the following formula:
Q = (Q3 – Q1) / 2
(Quartile deviation or Q calculated from a frequency distribution).
To find Q, it is clear that we must first calculate the 75th and 25th percentiles. These statistics are found in exactly the same way as was the median, which is, of course, the 50th percentile or Q2. The only difference is that ¼ of N is counted off from the low end of the distribution to find Q1 and that ¾ N is counted off to find Q3.
The formulas are as under:
Q1 = L + i × (N/4 – cum f) / fq
Q3 = L + i × (3N/4 – cum f) / fq
where,
L = the exact lower limit of the interval in which the quartile falls
i = the length of the interval
cum f = cumulative frequency up to the interval which contains the quartile
fq = the frequency on the interval containing the quartile
Table I shows the computations needed to get Q in a distribution of 60 scores. First, to find Q1, we count off ¼ of N or 15 from the low-score end of the distribution. When the scores are added in order, the first three intervals (30-34 through 40-44) contain 9 scores and take us up to 44.5. Q1 must fall on the next interval (45-49), which contains 7 scores.
From Table I we have that:
L = 44.5, exact lower limit of the interval on which Q1 falls
cum f = 9, cumulated scores up to the interval containing Q1
fq = 7, the f on the interval on which Q1 falls
i = 5, the length of the interval
Substituting in the formula for Q1 we have that
Q1 = 44.5 + {(15 – 9)/7} × 5 = 48.79
To find Q3 we count off ¾ N from the low score end of the distribution.
From Table I it is clear that ¾ N is 45, and that the f’s on intervals 30-34 through 60-64, inclusive, total 44. Q3 must fall on the next interval (65-69), which contains 7 scores.
Substituting the necessary data from Table I we have that:
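L = 64.5, exact lower limit of the interval which contains Q3
cum f = 44, cumulated scores up to the interval containing Q3
fq = 7, the f on the interval on which Q3 falls
i = 5, the length of the interval
Substituting in the formula for Q3 we have that
Q3 = 64.5 + {(45 – 44)/7} × 5 = 65.21
Substituting in the formula for Q we have that
Q = (65.21 – 48.79)/2 = 8.21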
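The substitution for Table I can be checked with a short Python sketch. The helper name `quartile_point` and its parameter names are assumptions made for this illustration; the numbers are the ones read off Table I in the text (N = 60, i = 5, cum f = 9 and fq = 7 for Q1, cum f = 44 and fq = 7 for Q3).

```python
def quartile_point(L, i, target, cum_f, fq):
    """Interpolate a quartile inside the class interval that contains it.

    L      -- exact lower limit of the interval containing the quartile
    i      -- length of the interval
    target -- number of scores to count off (N/4 for Q1, 3N/4 for Q3)
    cum_f  -- cumulative frequency up to that interval
    fq     -- frequency on that interval
    """
    return L + i * (target - cum_f) / fq


N = 60
q1 = quartile_point(L=44.5, i=5, target=N / 4, cum_f=9, fq=7)
q3 = quartile_point(L=64.5, i=5, target=3 * N / 4, cum_f=44, fq=7)
q = (q3 - q1) / 2  # quartile deviation: half the distance from Q1 to Q3

print(round(q1, 2), round(q3, 2), round(q, 2))  # 48.79 65.21 8.21
```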
Table I:
The calculation of the Quartile Deviation or Q from data grouped into a frequency distribution:
A second illustration of the calculation of Q from a frequency distribution is given in Example 2, Table II. To find Q1, we count off ¼ of N (200) or 50 scores from the low-score end of the distribution. The intervals 103.5-107.5 and 107.5-111.5, taken together, include 25 scores. Q1, therefore, must fall on the next interval, 111.5-115.5, which contains 27 scores. These 27 scores, when added to the 25 counted off, total 52, just 2 more than the 50 wanted.
From Table II we find that:
L = 111.5, exact lower limit of the interval containing Q1
¼ N = 50
cum f = 25, sum of the scores up to the interval on which Q1 falls
fq = 27, number of scores on the interval containing Q1
i = 4
Substituting in the formula for Q1:
Q1 = 111.5 + {(50 – 25)/27} × 4 = 115.20
To find Q3, we count off ¾ of N or 150 from the low-score end of the distribution. The first 4 intervals include 101 scores, and Q3 falls on the next interval, 119.5-123.5, which contains 52 scores. Data from Table II are:
L = 119.5, exact lower limit of the interval containing Q3
¾ N = ¾ × 200 = 150
cum f = 101, sum of scores up to the interval which contains Q3
fq = 52, the f on the interval on which Q3 falls
i = 4
Substituting in the formula for Q3:
Q3 = 119.5 + {(150 – 101)/52} × 4 = 123.27
Substituting in the formula for Q:
Q = (123.27 – 115.20)/2 ≈ 4.03
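The "counting off" procedure used in both examples can also be automated, so that the interval containing the quartile is located by the program rather than by eye. The sketch below is one possible way to do this. The function name `grouped_percentile_point` and the small frequency table are hypothetical and serve only as an illustration; they are not Table I or Table II.

```python
def grouped_percentile_point(intervals, fraction):
    """Find the point below which `fraction` of the scores lie.

    intervals -- list of (exact_lower_limit, exact_upper_limit, frequency),
                 ordered from the low-score end to the high-score end
    fraction  -- 0.25 for Q1, 0.50 for the median, 0.75 for Q3
    """
    n = sum(f for _, _, f in intervals)
    target = fraction * n          # number of scores to count off
    cum_f = 0                      # scores counted off so far
    for lower, upper, f in intervals:
        if f > 0 and cum_f + f >= target:
            # The point falls on this interval: interpolate within it.
            return lower + (upper - lower) * (target - cum_f) / f
        cum_f += f
    raise ValueError("fraction must lie between 0 and 1")


# Hypothetical distribution of 40 scores (illustration only).
table = [
    (9.5, 19.5, 4),
    (19.5, 29.5, 6),
    (29.5, 39.5, 10),
    (39.5, 49.5, 12),
    (49.5, 59.5, 8),
]

q1 = grouped_percentile_point(table, 0.25)
q3 = grouped_percentile_point(table, 0.75)
print(round(q1, 2), round(q3, 2), round((q3 - q1) / 2, 2))  # 29.5 47.83 9.17
```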
3. Average Deviation or AD:
The example under Table III illustrates the calculation of the Average Deviation (AD) or Mean Deviation (MD) in a frequency distribution. The calculation of the mean of the data is the first requisite. Then we take the deviations (x’s) of each separate score around this mean. Since the scores have been grouped into class-intervals, we are unable to get the deviation of each separate score from the mean. In lieu of separate score deviations, therefore, we take the deviation of the midpoint of each interval from the mean.
In the example under Table III, when we start from the top, the x for the first class interval is X – M = (82 – 57) = 25; the x for the second class interval is (77 – 57) = 20; and so on until the column of x’s is complete. From interval 50-54 downwards all x’s are minus, as the midpoints of these intervals are all smaller than 57.
Now the number of scores varies from interval to interval. There are more scores on some intervals than on others, hence each midpoint deviation must be multiplied by the number of scores (f) which it represents. This gives us the fx column. The first value of fx is 1 × 25 = 25; the second value is 3 × 20 = 60; and so on. All the values of fx are thus obtained by multiplying, in each case, the x by its corresponding f. Now the fx column is added without regard to sign, and the resulting sum is divided by N to obtain the AD or MD.
In the above example AD is:
Summary of Steps in Calculation of AD from Grouped Data:
1. Find the midpoint (X) of each class-interval.
2. Find N, the total number of frequencies.
3. Find fX for each class-interval.
4. Compute the mean, M = ∑fX / N.
5. Find x (or X – M) for each class-interval.
6. Find fx for each class-interval.
7. Find ∑fx by adding all fx values without regard to sign.
8. Divide ∑fx by N.
9. Check and report the result.
Note:
For calculation of the AD, the mean may also be computed by Assumed mean method.
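The summary of steps for grouped data can be sketched in Python as follows. The function name `average_deviation` is an assumption, and the midpoints and frequencies are hypothetical values chosen for easy arithmetic; they are not the data of Table III.

```python
def average_deviation(midpoints, freqs):
    """Average (mean) deviation of grouped data, taken around the mean."""
    n = sum(freqs)                                          # N
    mean = sum(f * x for x, f in zip(midpoints, freqs)) / n  # M = sum(fX) / N
    # Add the f*x products without regard to sign, then divide by N.
    total = sum(f * abs(x - mean) for x, f in zip(midpoints, freqs))
    return total / n


# Hypothetical grouped data (illustration only).
midpoints = [12, 17, 22, 27, 32]   # X, the midpoint of each class-interval
freqs = [2, 3, 5, 3, 2]            # f, the number of scores on each interval

# Mean is 22; sum of |fx| is 20 + 15 + 0 + 15 + 20 = 70; AD = 70 / 15.
print(round(average_deviation(midpoints, freqs), 2))  # 4.67
```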
Characteristics of the Average Deviation or Mean Deviation:
1. Average deviation is rigidly defined and its value is definite and precise.
2. It has to be ascertained whether it has been calculated by using the mean, median or mode as the central tendency.
3. The calculation is also not very difficult.
4. It is readily understood.
5. It is a clear average of the separate deviations.
6. It is calculated from all the observations of a series. It is less affected by the extreme items of the series than the Standard Deviation.
7. It ignores the algebraic signs of the deviations and, as such, is not capable of further algebraic treatment.
8. It cannot be calculated if even a single figure is missing from the data.
9. Average Deviation is not a very accurate measure of dispersion, particularly when it is calculated from the median or mode.
10. A large Average Deviation or Mean Deviation signifies that the scores of the distribution are widely scattered around the central tendency; a small Average Deviation or Mean Deviation signifies that the scores tend to be concentrated within a relatively narrow range.
When to Use the Average Deviation or the Mean Deviation:
This measure of dispersion is not in common use.
It may be used:
1. When it is desired to weight all deviations from the mean according to their size.
2. When extreme deviations would influence Standard Deviation unduly.
4. Standard Deviation or SD:
Of all the measures of variability, the Standard Deviation or SD is the most important and the most widely used. It is commonly used in experimental work and research studies as an estimate of dispersion. Its value fluctuates least from sample to sample, which makes it the most reliable of these measures.
The Standard Deviation may be defined as the square root of the mean of the squared deviations from the Mean. It is a popular and stable measure of the spread of scores in a distribution. It should be noted that the squared deviations used in computing the SD are always taken from the mean of the distribution, and never from the median or mode. The conventional symbol for the SD is the Greek letter sigma (σ).
According to Tinker and Russell, “The square root of the mean of the squared deviation taken from the arithmetic mean of the distribution is called standard deviation.”
Calculation of the SD from Ungrouped Data:
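The formula for calculating the SD from ungrouped scores is as under:
σ = √(∑x² / N)
where x = X – M is the deviation of a score from the mean and N is the number of scores.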
We may illustrate the calculation of SD for an ungrouped set of data with the help of example given below:
Steps in the calculation of SD from Ungrouped Data:
1. First of all, calculate the mean of the data.
2. Find out deviation from the mean.
3. Square each deviation and find out the total.
4. Divide the sum by N and find the square root of the quotient. The result is the standard deviation of the series.
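The four steps can be sketched in Python as below. The function name `sd_ungrouped` and the eight scores are hypothetical, chosen so that the arithmetic is easy to follow. The sum of squares is divided by N, as in the steps above, not by N – 1.

```python
import math


def sd_ungrouped(scores):
    """Standard deviation of ungrouped scores, dividing by N."""
    n = len(scores)
    mean = sum(scores) / n                           # step 1: the mean
    deviations = [x - mean for x in scores]          # step 2: deviations from the mean
    sum_of_squares = sum(d * d for d in deviations)  # step 3: square and total
    return math.sqrt(sum_of_squares / n)             # step 4: divide by N, take the root


# Hypothetical scores (illustration only): mean = 5, sum of squares = 32.
scores = [2, 4, 4, 4, 5, 5, 7, 9]
print(sd_ungrouped(scores))  # sqrt(32 / 8) = 2.0
```

Python's standard library function `statistics.pstdev` applies the same divide-by-N definition and can serve as a cross-check.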
Calculation of Standard Deviation (SD) from Grouped Data:
From the grouped data standard deviation can be calculated by two methods:
(A) Long Method.
(B) Short or Coded Method.
(A) Calculation of Standard Deviation (SD) by Long Method:
We may illustrate the calculation of SD from grouped data with the help of the following example:
Steps for Calculating Standard Deviation (SD) by Long Method:
1. Find out midpoints (X) of the intervals.
2. Find out mean by long method.
3. Find out the deviations (x) of the midpoints from the mean by using the formula x = X – M, and square them (x²).
4. Find out fx².
5. Add all the fx² values; that will be ∑fx².
6. Divide ∑fx² by the total number of cases {∑fx²/N}.
7. Find out the square root of the quotient. The result is the Standard Deviation or SD.
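A minimal Python sketch of the long method follows. The function name `sd_long_method` is an assumption, and the midpoints and frequencies are hypothetical values used only for illustration.

```python
import math


def sd_long_method(midpoints, freqs):
    """SD of grouped data by the long method: sqrt(sum(f * x^2) / N)."""
    n = sum(freqs)
    mean = sum(f * x for x, f in zip(midpoints, freqs)) / n   # mean by the long method
    # x = X - M for each midpoint; weight each squared deviation by its frequency.
    sum_fx2 = sum(f * (x - mean) ** 2 for x, f in zip(midpoints, freqs))
    return math.sqrt(sum_fx2 / n)


# Hypothetical grouped data (illustration only).
midpoints = [12, 17, 22, 27, 32]   # X
freqs = [2, 3, 5, 3, 2]            # f

# Mean is 22; sum of f * x^2 is 200 + 75 + 0 + 75 + 200 = 550; N = 15.
print(round(sd_long_method(midpoints, freqs), 2))  # sqrt(550 / 15), about 6.06
```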
(B) Calculation of Standard Deviation (SD) by Short or Coded Method:
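The formula for calculating SD by short method is:
σ = i × √(∑fx’²/N – (∑fx’/N)²)
where x’ is the coded deviation of an interval from the assumed mean and i is the size of the class interval.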
We may illustrate the calculation of SD by short method with the help of following example:
Steps for calculating standard deviation (SD) by short or coded method:
1. Assume a mean as near the centre of the distribution as possible, preferably on the interval containing the largest frequency. Place 0 against the interval which contains the assumed mean.
2. Place 1, 2, 3 and so on above and below the zero as coded values (x’). Minus signs are assigned to the figures written against the intervals below the interval which contains zero.
3. Find out the product of each frequency and its coded value, i.e., fx’.
4. Find out the sum of fx’, i.e., ∑fx’.
5. Find out fx’² and the sum of fx’², i.e., ∑fx’².
6. Find out ∑fx’²/N and ∑fx’/N.
7. Square the quotient ∑fx’/N.
8. Subtract the square of ∑fx’/N from ∑fx’²/N.
9. Find out the square root of the remainder and multiply it by the size of the class interval. The product will be the SD.
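The coded method can be sketched in Python as follows. The function name `sd_coded` and its parameters are assumptions made for this illustration, and the frequencies are hypothetical. The intervals are listed here from the low-score end upwards, so coded values below the assumed mean come out negative. The result does not depend on which interval is chosen for the assumed mean, which the last two lines demonstrate.

```python
import math


def sd_coded(freqs, interval_size, zero_index):
    """SD of grouped data by the short (coded) method.

    freqs         -- frequencies, ordered from the lowest interval to the highest
    interval_size -- i, the size of the class interval
    zero_index    -- position of the interval holding the assumed mean (coded 0)
    """
    n = sum(freqs)
    coded = [k - zero_index for k in range(len(freqs))]      # x': ..., -1, 0, 1, ...
    sum_fx = sum(f * c for f, c in zip(freqs, coded))        # sum of f * x'
    sum_fx2 = sum(f * c * c for f, c in zip(freqs, coded))   # sum of f * x'^2
    return interval_size * math.sqrt(sum_fx2 / n - (sum_fx / n) ** 2)


# Hypothetical frequencies on five intervals of size 5 (illustration only).
freqs = [2, 3, 5, 3, 2]

print(round(sd_coded(freqs, interval_size=5, zero_index=2), 2))  # 6.06
print(round(sd_coded(freqs, interval_size=5, zero_index=1), 2))  # 6.06
```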
Characteristics of the Standard Deviation (SD):
1. The Standard Deviation possesses most of the characteristics which an ideal measure of dispersion should have.
2. Its value is always definite and rigidly defined.
3. It is based on all the observations of the data.
4. It is amenable to further algebraic treatment and is used for calculating the higher statistics like Variance, Standard Scores, Standard Error, Co-efficient of variability and coefficient of correlation etc.
5. It is less affected by the fluctuations of sampling than most of other measures of dispersion.
6. As already stated, the squaring of the deviations makes them positive, so the difficulty about algebraic signs which arose in the average or mean deviation is not found here.
7. The SD is larger than the AD or MD which is in turn larger than Q. These relationships supply a rough check upon the accuracy of the measures of variability.
8. The Standard Deviation has a definite relation to the Normal Distribution. The Standard Deviation is the distance from the Mean to the point of ‘inflection’ on the shoulder of the curve.
9. In a normal distribution, a distance of one SD on either side of the mean includes 34.13% of the cases, i.e., 68.26% of the cases in a normal distribution lie within one SD above and below the Mean.
10. Very few cases fall more than 3 SD’s from the Mean, when the distribution is normal.
11. On the basis of the Mean and the Standard Deviation it is possible to describe the distribution.
12. In a normal or moderately skewed curve, the Standard Deviation is about 1/6th of the range.
13. In a normal curve, the Mean Deviation is 0.7979 times the Standard Deviation.
14. The Standard Deviation is used to know the homogeneity or heterogeneity of two groups, when the same test has been administered on them.
However, the SD is not easy to calculate, nor is it easily understood. It is more cumbersome to calculate than the other measures of dispersion. It gives greater weight to extreme items and less to those near the Mean, because the squares of large deviations are proportionately greater than the squares of small deviations.
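The normal-curve figures quoted in the characteristics above can be checked numerically. The sketch below uses two standard results for the normal distribution, which are brought in here as assumptions since the text does not derive them: the proportion of cases within k SDs of the mean is erf(k/√2), and the ratio of the Mean Deviation to the SD is √(2/π).

```python
import math

# Proportion of cases within one SD of the mean (both sides together).
within_one_sd = math.erf(1 / math.sqrt(2))

# Proportion between the mean and one SD on a single side.
one_side = within_one_sd / 2

# Proportion of cases within three SDs of the mean.
within_three_sd = math.erf(3 / math.sqrt(2))

# Mean Deviation expressed as a multiple of the SD.
md_over_sd = math.sqrt(2 / math.pi)

print(round(one_side, 4))         # 0.3413
print(round(within_one_sd, 4))    # 0.6827
print(round(within_three_sd, 4))  # 0.9973
print(round(md_over_sd, 4))       # 0.7979
```

The figure of 68.26% in the text is twice the rounded one-side value of 34.13%; computed directly, the proportion rounds to 68.27%.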
When to Use the Standard Deviation (SD):
1. When more accuracy and stability are required in the statistics.
2. When extreme deviations should exercise a proportionately greater effect upon the variability.
3. When coefficients of correlation and other statistics are subsequently to be computed.