M10. Dispersion, Skewness, Correlation and Regression:

1. List of Formulae of partition values:

I. Individual Series:
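1) Mean ($\bar{X}$) = $\frac{\sum X}{n}$;           [where n = total number of observations].
2) Median (Md) = value of the $\left(\frac{n+1}{2}\right)$th item.   [items arranged in ascending order]
3) Mode = the value that is repeated the maximum number of times.
4) Q1 = value of the $\left(\frac{n+1}{4}\right)$th item.
5) Q2 = value of the $\left(\frac{n+1}{2}\right)$th item.               [Q2 = Median]
6) Q3 = value of the $\left(\frac{3(n+1)}{4}\right)$th item.
7) D3 = value of the $\left(\frac{3(n+1)}{10}\right)$th item.           [called the 3rd decile]
8) P30 = value of the $\left(\frac{30(n+1)}{100}\right)$th item.         [called the 30th percentile]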
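
The positional formulas for an individual series can be tried out with a short Python sketch. The function names and the sample marks are illustrative and not part of the original notes, and the linear interpolation used for fractional positions (such as the 2.4th item) is an assumed convention; other textbooks round instead.

```python
from collections import Counter

def partition_value(data, k, parts):
    """Value of the (k(n+1)/parts)th item of the sorted data (1-based).

    Fractional positions are resolved by linear interpolation between
    the two neighbouring items (an assumed convention).
    """
    xs = sorted(data)
    n = len(xs)
    pos = k * (n + 1) / parts      # 1-based position, possibly fractional
    pos = min(max(pos, 1), n)      # keep the position inside the data
    lower = int(pos)               # whole part of the position
    frac = pos - lower             # fractional part of the position
    if lower >= n:
        return xs[-1]
    return xs[lower - 1] + frac * (xs[lower] - xs[lower - 1])

def modes(data):
    """All values that occur the maximum number of times."""
    counts = Counter(data)
    top = max(counts.values())
    return [x for x, c in counts.items() if c == top]

marks = [12, 15, 18, 20, 22, 25, 30]       # illustrative data, n = 7
mean = sum(marks) / len(marks)
q1 = partition_value(marks, 1, 4)          # (7+1)/4 = 2nd item
median = partition_value(marks, 2, 4)      # 2(7+1)/4 = 4th item
q3 = partition_value(marks, 3, 4)          # 3(7+1)/4 = 6th item
d3 = partition_value(marks, 3, 10)         # 3(7+1)/10 = 2.4th item
p30 = partition_value(marks, 30, 100)      # 30(7+1)/100 = 2.4th item
print(mean, q1, median, q3, d3, p30)
```

Note that D3 and P30 point at the same position, as the formulas require.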

II. Discrete Series:
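1) Mean ($\bar{X}$) = $\frac{\sum fX}{N}$;           [where N = $\sum f$ = total frequency].
2) Median (Md) = value of the $\left(\frac{N+1}{2}\right)$th item.   [located from the cumulative frequencies]
3) Mode = the value with the maximum frequency.
4) Q1 = value of the $\left(\frac{N+1}{4}\right)$th item.
5) Q2 = value of the $\left(\frac{N+1}{2}\right)$th item.               [Q2 = Median]
6) Q3 = value of the $\left(\frac{3(N+1)}{4}\right)$th item.
7) D4 = value of the $\left(\frac{4(N+1)}{10}\right)$th item.           [called the 4th decile]
8) P20 = value of the $\left(\frac{20(N+1)}{100}\right)$th item.         [called the 20th percentile]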
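
For a discrete series the required item is usually located through cumulative frequencies. The sketch below assumes that convention (the first value whose cumulative frequency reaches the required position); the function name and the data are made up for illustration.

```python
def discrete_partition(values, freqs, k, parts):
    """Value whose cumulative frequency first reaches k(N+1)/parts."""
    pairs = sorted(zip(values, freqs))     # order the series by value
    N = sum(freqs)                         # total frequency
    pos = k * (N + 1) / parts              # required item position
    cum = 0
    for x, f in pairs:
        cum += f                           # running cumulative frequency
        if cum >= pos:
            return x
    return pairs[-1][0]

sizes = [5, 6, 7, 8, 9]                    # illustrative values X
counts = [3, 8, 12, 5, 2]                  # illustrative frequencies f, N = 30

N = sum(counts)
mean = sum(x * f for x, f in zip(sizes, counts)) / N
mode = max(zip(sizes, counts), key=lambda p: p[1])[0]   # value with maximum f
median = discrete_partition(sizes, counts, 2, 4)        # 15.5th item
q1 = discrete_partition(sizes, counts, 1, 4)            # 7.75th item
q3 = discrete_partition(sizes, counts, 3, 4)            # 23.25th item
d4 = discrete_partition(sizes, counts, 4, 10)           # 12.4th item
p20 = discrete_partition(sizes, counts, 20, 100)        # 6.2th item
print(mean, mode, median, q1, q3, d4, p20)
```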

III. Continuous Series:
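1) Mean ($\bar{X}$) = $\frac{\sum fX}{N}$;           [where N = $\sum f$ = total frequency and X = mid-value of each class].
2) Median (Md) = $l + \frac{\frac{N}{2} - \text{c.f.}}{f} \times h$  [the median class is the class containing the $\left(\frac{N}{2}\right)$th item]
[where l = lower limit of the class; N = total frequency; f = frequency of the class; c.f. = cumulative frequency preceding the class; h = class size].
3) Mode (M0) = $l + \frac{\Delta_1}{\Delta_1 + \Delta_2} \times h$   [where $\Delta_1 = f_1 - f_0$; $\Delta_2 = f_1 - f_2$]
[where $f_1$ = maximum frequency; $f_0$ = frequency preceding the modal class; $f_2$ = frequency following the modal class]
4) Mode = 3 Median - 2 Mean.   [empirical relation]
5) Q1 = $l + \frac{\frac{N}{4} - \text{c.f.}}{f} \times h$
6) Q2 = $l + \frac{\frac{N}{2} - \text{c.f.}}{f} \times h$          [Q2 = Median]
7) Q3 = $l + \frac{\frac{3N}{4} - \text{c.f.}}{f} \times h$
8) D8 = $l + \frac{\frac{8N}{10} - \text{c.f.}}{f} \times h$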
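
The interpolation formulas for a continuous series all share one pattern, which the following Python sketch makes explicit. It assumes equal class widths and a single modal class; the function names and the frequency table are illustrative only.

```python
def grouped_partition(lowers, h, freqs, k, parts):
    """l + ((kN/parts - c.f.) / f) * h for the class holding the (kN/parts)th item."""
    N = sum(freqs)
    target = k * N / parts                 # e.g. N/2 for the median
    cum = 0                                # c.f. preceding the current class
    for l, f in zip(lowers, freqs):
        if f > 0 and cum + f >= target:
            return l + (target - cum) / f * h
        cum += f
    raise ValueError("no class contains the requested item")

def grouped_mode(lowers, h, freqs):
    """l + d1 / (d1 + d2) * h for the modal class (assumes d1 + d2 != 0)."""
    i = max(range(len(freqs)), key=lambda j: freqs[j])   # modal class index
    f1 = freqs[i]
    f0 = freqs[i - 1] if i > 0 else 0                    # preceding frequency
    f2 = freqs[i + 1] if i < len(freqs) - 1 else 0       # following frequency
    d1, d2 = f1 - f0, f1 - f2
    return lowers[i] + d1 / (d1 + d2) * h

# Illustrative classes 0-10, 10-20, 20-30, 30-40, 40-50
lowers = [0, 10, 20, 30, 40]
h = 10
freqs = [5, 8, 12, 10, 5]                                # N = 40

N = sum(freqs)
mids = [l + h / 2 for l in lowers]                       # mid-values X
mean = sum(f * x for f, x in zip(freqs, mids)) / N
median = grouped_partition(lowers, h, freqs, 1, 2)
q1 = grouped_partition(lowers, h, freqs, 1, 4)
q3 = grouped_partition(lowers, h, freqs, 3, 4)
d8 = grouped_partition(lowers, h, freqs, 8, 10)
mode = grouped_mode(lowers, h, freqs)
empirical_mode = 3 * median - 2 * mean                   # Mode = 3 Median - 2 Mean
print(mean, median, q1, q3, d8, mode, empirical_mode)
```

For this table the interpolated mode and the empirical mode come out close to each other but not identical, which is expected because the empirical relation is only approximate.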

2. Method of Measuring Dispersion:
I. - Range = L - S;  [L = largest item, S = smallest item].
    - Coefficient of Range = $\frac{L - S}{L + S}$
II. - Semi-interquartile range (Quartile Deviation) = $\frac{Q_3 - Q_1}{2}$
     - Coefficient of Q.D. = $\frac{Q_3 - Q_1}{Q_3 + Q_1}$
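
As a quick arithmetic check of the range and quartile deviation formulas, here is a small Python sketch. The data and the quartile values passed in are illustrative assumptions, not figures from these notes.

```python
def range_measures(data):
    """Return (range, coefficient of range) for a list of items."""
    L, S = max(data), min(data)            # largest and smallest item
    return L - S, (L - S) / (L + S)

def quartile_measures(q1, q3):
    """Return (quartile deviation, coefficient of Q.D.)."""
    return (q3 - q1) / 2, (q3 - q1) / (q3 + q1)

marks = [12, 15, 18, 20, 22, 25, 30]       # illustrative data
rng, coeff_range = range_measures(marks)
qd, coeff_qd = quartile_measures(15, 25)   # Q1 = 15, Q3 = 25 assumed for these marks
print(rng, coeff_range, qd, coeff_qd)
```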
III. Mean Deviation (M. D.)/Average Deviation:
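1) M.D. from Mean = $\frac{\sum |X - \bar{X}|}{n}$          [for individual series]
                  = $\frac{\sum f|X - \bar{X}|}{N}$        [for discrete and continuous series]
2) M.D. from Median = $\frac{\sum |X - M_d|}{n}$      [for individual series]
                    = $\frac{\sum f|X - M_d|}{N}$    [for discrete and continuous series]
3) Coefficient of M.D. from Mean = $\frac{\text{M.D. from Mean}}{\text{Mean}}$
4) Coefficient of M.D. from Median = $\frac{\text{M.D. from Median}}{\text{Median}}$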
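
The mean deviation formulas differ only in the central value used and in whether frequencies are present, so one helper can cover both cases. This is a minimal sketch; the function name and both data sets are hypothetical.

```python
def mean_deviation(values, center, freqs=None):
    """Sum of f * |X - center| divided by total frequency.

    With freqs omitted every item has frequency 1, which gives the
    plain form: sum of |X - center| divided by n.
    """
    if freqs is None:
        freqs = [1] * len(values)
    N = sum(freqs)
    return sum(f * abs(x - center) for x, f in zip(values, freqs)) / N

# Individual observations (illustrative): mean = median = 8
items = [4, 6, 8, 10, 12]
items_mean = sum(items) / len(items)
md_mean = mean_deviation(items, items_mean)
coeff_md_mean = md_mean / items_mean        # coefficient of M.D. from mean

# Frequency distribution (illustrative): X = 1, 2, 3 with f = 2, 5, 3
xs, fs = [1, 2, 3], [2, 5, 3]
freq_mean = sum(x * f for x, f in zip(xs, fs)) / sum(fs)
md_freq = mean_deviation(xs, freq_mean, fs)
print(md_mean, coeff_md_mean, md_freq)
```

Passing the median instead of the mean as `center` gives the M.D. from median in the same way.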

IV. Standard Deviation (S.D.):
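1) Standard Deviation (σ) = $\sqrt{\frac{\sum (X - \bar{X})^2}{n}} = \sqrt{\frac{\sum X^2}{n} - \left(\frac{\sum X}{n}\right)^2}$   [for individual series]
2) Standard Deviation (σ) = $\sqrt{\frac{\sum f(X - \bar{X})^2}{N}} = \sqrt{\frac{\sum fX^2}{N} - \left(\frac{\sum fX}{N}\right)^2}$   [for discrete and continuous series]
3) Variance ($\sigma^2$) = $\frac{\sum f(X - \bar{X})^2}{N} = \frac{\sum fX^2}{N} - \left(\frac{\sum fX}{N}\right)^2$
4) Coefficient of S.D. = $\frac{\text{S.D.}}{\text{Mean}} = \frac{\sigma}{\bar{X}}$
5) Coefficient of Variation = $\frac{\text{S.D.}}{\text{Mean}} \times 100 = \frac{\sigma}{\bar{X}} \times 100$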
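
Both forms of the variance (the definition and the shortcut with the sum of squares) should give the same number. The sketch below checks that on a small illustrative data set and cross-checks against `statistics.pstdev` from the Python standard library; the helper name is hypothetical.

```python
import math
import statistics

def variance_both_ways(values, freqs=None):
    """Return (mean, variance by definition, variance by shortcut form)."""
    if freqs is None:
        freqs = [1] * len(values)
    N = sum(freqs)
    mean = sum(f * x for x, f in zip(values, freqs)) / N
    var_def = sum(f * (x - mean) ** 2 for x, f in zip(values, freqs)) / N
    var_short = sum(f * x * x for x, f in zip(values, freqs)) / N - mean ** 2
    return mean, var_def, var_short

data = [2, 4, 4, 4, 5, 5, 7, 9]            # illustrative observations
mean, var_def, var_short = variance_both_ways(data)
sd = math.sqrt(var_def)
coeff_sd = sd / mean                       # coefficient of S.D.
cv = sd / mean * 100                       # coefficient of variation, in percent

# The same data written as a frequency distribution
mean_f, var_f, _ = variance_both_ways([2, 4, 5, 7, 9], [1, 3, 2, 1, 1])
print(mean, var_def, var_short, sd, cv, statistics.pstdev(data))
```

The divisor here is n (or N), matching the formulas above; `statistics.stdev`, by contrast, divides by n - 1.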

3. Skewness:
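(A measure of central tendency gives information about the concentration of the items around the central value; skewness describes the lack of symmetry of the distribution about that value).
I. Measures of Skewness:
1) Karl Pearson's coefficient of skewness (Sk(P)) = $\frac{\text{Mean} - \text{Mode}}{\text{Std. Deviation}} = \frac{\bar{X} - M_0}{\sigma}$
(It is also called the Pearsonian coefficient of skewness).
2) Sk(P) = $\frac{3(\text{Mean} - \text{Median})}{\text{Std. Deviation}} = \frac{3(\bar{X} - M_d)}{\sigma}$
[∵ Mean - Mode = 3(Mean - Median)]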
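
The two forms of Pearson's coefficient agree exactly only when the empirical relation Mean - Mode = 3(Mean - Median) holds. The sketch below uses made-up summary values chosen to satisfy that relation; the function names are illustrative.

```python
def pearson_skew_mode(mean, mode, sd):
    """Sk(P) = (Mean - Mode) / S.D."""
    return (mean - mode) / sd

def pearson_skew_median(mean, median, sd):
    """Sk(P) = 3 (Mean - Median) / S.D."""
    return 3 * (mean - median) / sd

# Hypothetical summary values: Mean - Mode = 6 = 3 * (Mean - Median)
mean, median, mode, sd = 50, 48, 44, 10
sk_mode = pearson_skew_mode(mean, mode, sd)
sk_median = pearson_skew_median(mean, median, sd)
print(sk_mode, sk_median)    # positive values indicate a longer right tail
```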

4. Correlation:
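1) Karl Pearson's correlation coefficient (r) = $\frac{\text{Cov}(X, Y)}{\sqrt{\text{Var}(X) \cdot \text{Var}(Y)}}$
                                              = $\frac{\sum (X - \bar{X})(Y - \bar{Y})}{\sqrt{\sum (X - \bar{X})^2 \cdot \sum (Y - \bar{Y})^2}}$
                                              = $\frac{\sum xy}{\sqrt{\sum x^2 \cdot \sum y^2}}$
[where $x = X - \bar{X}$ and $y = Y - \bar{Y}$]
2) Karl Pearson's correlation coefficient (r) = $\frac{n\sum XY - \sum X \cdot \sum Y}{\sqrt{n\sum X^2 - (\sum X)^2}\,\sqrt{n\sum Y^2 - (\sum Y)^2}}$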
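
The deviation form and the raw-sums form of r are algebraically equivalent, which a short Python sketch can confirm numerically. The function names and the paired data are illustrative assumptions.

```python
import math

def r_from_deviations(X, Y):
    """r = sum(xy) / sqrt(sum(x^2) * sum(y^2)), with x = X - mean(X), y = Y - mean(Y)."""
    n = len(X)
    mx, my = sum(X) / n, sum(Y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(X, Y))
    sxx = sum((a - mx) ** 2 for a in X)
    syy = sum((b - my) ** 2 for b in Y)
    return sxy / math.sqrt(sxx * syy)

def r_from_raw_sums(X, Y):
    """r = (n*SXY - SX*SY) / (sqrt(n*SXX - SX^2) * sqrt(n*SYY - SY^2))."""
    n = len(X)
    SX, SY = sum(X), sum(Y)
    SXY = sum(a * b for a, b in zip(X, Y))
    SXX = sum(a * a for a in X)
    SYY = sum(b * b for b in Y)
    return (n * SXY - SX * SY) / (
        math.sqrt(n * SXX - SX ** 2) * math.sqrt(n * SYY - SY ** 2)
    )

X = [1, 2, 3, 4, 5]          # illustrative paired observations
Y = [2, 4, 5, 4, 5]
r1 = r_from_deviations(X, Y)
r2 = r_from_raw_sums(X, Y)
print(r1, r2)
```

The raw-sums form avoids computing the means first, which is why it is often preferred for hand calculation.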

5. Regression:
Regression is a mathematical measure of the average relationship between two or more variables, expressed in terms of the original data.
[Figure: regression line of y on x]
Line of Regression: In a scatter diagram, the points cluster around a curve called the regression curve.
If the curve is a straight line, it is called the regression line.
The regression line can be expressed by two different algebraic equations, as follows:
i) The regression equation of y on x is $y = a + bx$, where b is known as the regression coefficient ($b_{yx}$) of y on x.
ii) The regression equation of x on y is $x = a + by$, where b is known as the regression coefficient ($b_{xy}$) of x on y.
1) The correlation coefficient between the two variables x and y is:
$r = \pm\sqrt{b_{yx} \cdot b_{xy}}$    (i)
[r takes the same sign as the two regression coefficients]
2) Regression equation of y on x:
$y - \bar{y} = b_{yx}(x - \bar{x})$    (ii)
[Note: Let the regression equation of y on x be
$y = a + bx$    (i)
Summing over the n observations: $\sum y = na + b\sum x$
Dividing by n: $\frac{\sum y}{n} = a + b\frac{\sum x}{n}$
$\bar{y} = a + b\bar{x}$    (ii)
Subtracting (ii) from (i), with $b = b_{yx}$:
$y - \bar{y} = b_{yx}(x - \bar{x})$    (iii)
This equation (iii) is the regression equation of y on x.]
3) The regression coefficient ($b_{yx}$) of y on x is:
$b_{yx} = \frac{n\sum xy - \sum x \cdot \sum y}{n\sum x^2 - (\sum x)^2}$    (iii)
4) Regression equation of x on y:
$x - \bar{x} = b_{xy}(y - \bar{y})$    (iv)
5) The regression coefficient ($b_{xy}$) of x on y is:
$b_{xy} = \frac{n\sum xy - \sum x \cdot \sum y}{n\sum y^2 - (\sum y)^2}$    (v)
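
The regression formulas above can be put together in a few lines of Python. This is a sketch on illustrative data with hypothetical function names; it computes both regression coefficients from raw sums, recovers r from their product, and uses the y on x line to estimate y for a new x.

```python
import math

def regression_coefficients(xs, ys):
    """Return (byx, bxy) computed from the raw sums."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    syy = sum(y * y for y in ys)
    num = n * sxy - sx * sy                # shared numerator
    byx = num / (n * sxx - sx ** 2)        # regression coefficient of y on x
    bxy = num / (n * syy - sy ** 2)        # regression coefficient of x on y
    return byx, bxy

def estimate_y(x, xs, ys):
    """Estimate y from the line y - y_bar = byx (x - x_bar)."""
    byx, _ = regression_coefficients(xs, ys)
    x_bar, y_bar = sum(xs) / len(xs), sum(ys) / len(ys)
    return y_bar + byx * (x - x_bar)

xs = [1, 2, 3, 4, 5]                       # illustrative paired data
ys = [2, 4, 5, 4, 5]
byx, bxy = regression_coefficients(xs, ys)
# r is the square root of byx * bxy, carrying the sign of the coefficients
r = math.copysign(math.sqrt(byx * bxy), byx)
y_hat = estimate_y(6, xs, ys)
print(byx, bxy, r, y_hat)
```

For this data the y on x line is y = 2.2 + 0.6x and the x on y line is x = y - 1; both pass through the point of means (3, 4).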