In the preceding chapter, the tests were based on the assumption that the samples were drawn from a normally or approximately normally distributed population. Such tests are called parametric tests. In some situations, however, it is not possible to assume a particular type of population distribution from which the samples are drawn; instead, it is required to test the frequency of objects falling in specified ranges. For example, in socio-economic studies we count the number of families in different income levels, in production management studies the number of defective products, in market research the number of favored and disfavored products, and so on. These types of tests are carried out with the help of a non-parametric test known as the Chi-square test. It is denoted by the Greek letter χ² and was developed by Karl Pearson in 1900. The test statistic is defined as
χ² = Σ [(O − E)² / E]
Where
O = Observed frequency
E = Expected frequency
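To make the formula concrete, here is a minimal Python sketch that computes χ² from a set of observed and expected frequencies; the numbers are invented purely for illustration.

```python
# Minimal sketch: computing the chi-square statistic by hand.
# The observed and expected frequencies are made-up illustrative values.
observed = [18, 22, 30, 30]   # O: frequencies observed in the sample
expected = [25, 25, 25, 25]   # E: frequencies expected under the null hypothesis

chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
print(chi_sq)  # Σ (O − E)² / E  →  4.32
```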
The degrees of freedom are simply the number of frequencies, arranged in a row, a column, or a contingency table, that can be assigned independently. It is denoted by v. The degrees of freedom of χ² are determined in two ways:
Case I:
If the observed frequencies are presented in a series (i.e., in the form of a single row or column), the degrees of freedom are given by v = n − 1, where 'n' is the number of frequencies in the series.
Case II:
If the observed frequencies are presented in the form of a contingency table (i.e., in the form of rows as well as columns), the degrees of freedom are given by v = (r − 1)(c − 1), where 'r' and 'c' represent the numbers of rows and columns respectively.
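For example, a hypothetical contingency table with 3 rows and 4 columns has v = (3 − 1)(4 − 1) = 2 × 3 = 6 degrees of freedom.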
Note:
The standard form of the degrees of freedom in the χ² distribution is given by v = n − 1 − k1 − k2.
Here,
(i) 1 d.f. is lost due to the linear constraint ΣO = ΣE = N.
(ii) k1 d.f. are lost for the parameters estimated from the data when the parameters of the binomial or Poisson distribution are not given. If the parameters are given, we take k1 as zero.
(iii) k2 d.f. are lost due to the pooling of theoretical frequencies which are less than 5. If no frequency is less than 5, we take k2 as zero.
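The pooling rule in (iii) can be illustrated with a minimal Python sketch. The function below is an illustration, not a standard library routine, and the frequencies are invented for the example; each merge performed reduces the degrees of freedom by one (contributing to k2).

```python
# Minimal sketch of pooling: merge any class whose expected frequency is
# below 5 into a neighbouring class until every class has E >= 5.
def pool(observed, expected, minimum=5):
    obs, exp = list(observed), list(expected)
    i = 0
    while i < len(exp):
        if exp[i] < minimum and len(exp) > 1:
            # merge class i with its succeeding class (or preceding, at the end)
            j = i + 1 if i + 1 < len(exp) else i - 1
            obs[j] = obs[j] + obs[i]
            exp[j] = exp[j] + exp[i]
            del obs[i], exp[i]
            i = 0          # re-check from the start after each merge
        else:
            i += 1
    return obs, exp

obs, exp = pool([6, 12, 20, 8, 3, 1], [4, 10, 22, 9, 3, 2])
print(obs, exp)   # [18, 20, 8, 4] [14, 22, 9, 5]; every E is now >= 5
```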
Before applying the χ² test, the following precautions are necessary:
1. The constraints on the cell frequencies should be linear, such as ΣO = ΣE = N.
2. Sample observations must be drawn randomly from the population.
3. The observations should be expressed in original units, rather than in percentage or ratio form. This precaution helps in comparing the attributes of interest.
4. Each cell (a group of results) should contain at least 5 observations. If a cell contains fewer than 5, the value of χ² will be overestimated, resulting in the rejection of the null hypothesis. Hence, if any theoretical frequency is less than 5, we cannot apply the chi-square test directly. In that case, we use the technique of pooling (sketched above), in which frequencies less than 5 are added to the preceding or succeeding frequency/frequencies so that the resulting sum is at least 5, and the degrees of freedom are adjusted accordingly.
5. All the individual observations in a sample should be independent.
Properties of Chi-Square Distribution
1. The value of χ² lies between 0 and ∞.
2. Since χ² is a sum of squares, its value cannot be negative.
3. The value of χ² will be zero if the difference O − E is zero for every pair.
4. For different degrees of freedom, the shape of the curve will be different, as shown in the following figure.
5. The χ² test is always a one-tailed test based on the right-hand tail of the χ² curve.
6. The χ² distribution is always positively skewed [since d.f. ≥ 1].
7. For a chi-square distribution with v d.f., we have mean = v, variance = 2v, and mode = v − 2 (for v ≥ 2).
8. The median of the χ² distribution divides the total area under the curve into two equal parts.
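As a quick numerical check of property 7 (assuming SciPy is available; the choice v = 10 is arbitrary):

```python
# Verify mean = v and variance = 2v for the chi-square distribution.
from scipy.stats import chi2

v = 10  # degrees of freedom, chosen arbitrarily for illustration
mean, var = chi2.stats(v, moments="mv")
print(mean, var)          # 10.0 20.0  →  mean = v, variance = 2v
print(chi2.median(v))     # ≈ 9.34, slightly below the mean (positive skew)
```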
Application of Chi-Square Distribution
The basic applications of the χ2 test are as follows:
1. Test of goodness of fit
2. Test of independence of attributes
3. Yates correction for continuity
4. Test for the population variance
5. Test for homogeneity
Here we will discuss only the test of goodness of fit, the test of independence of attributes, and Yates' correction for continuity.
Test of Goodness of Fit
If a researcher needs to know whether an observed sample frequency distribution coincides with a theoretical frequency distribution, the χ² goodness-of-fit test enables us to compare the observed and expected frequency distributions. The observed frequencies come from the sample collected in the field, and the expected frequencies come from the hypothesized theoretical distribution. The goodness of fit describes the differences between the observed and expected frequency distributions. Small differences between the two are assumed to result from sampling error. On the other hand, large differences between them throw doubt on the assumption that the hypothesized theoretical frequency distribution is correct.
The test of goodness of fit is also used to test the significance of the difference between the observed frequency distribution and an expected binomial, Poisson, normal, or other theoretical distribution.
Basic steps for the goodness of fit are as
follows:
Step 1: Null hypothesis: There is no significant difference between observed and expected frequency distributions.
Step 2: Alternative hypothesis: There is a significant difference between observed and expected frequency distributions.
Step 3: Test statistic: Under H0, the test statistic is
χ² = Σ [(O − E)² / E]
Where, O = Observed frequency (from the field)
E = Expected frequency, which under H0 is obtained as:
(i) E = ΣO / n (for equal proportions)
(ii) E = N × proportion (for unequal proportions)
Here, N = Total frequency = Total observed data = ΣO
(iii) E = N × nCr p^r q^(n−r) (for the binomial distribution)
(iv) E = N × e^(−m) m^r / r! (for the Poisson distribution)
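As an illustration of case (iv), here is a minimal Python sketch that computes the expected Poisson frequencies from invented observed counts; the parameter m is estimated from the sample, so one extra degree of freedom (k1 = 1) would be lost.

```python
# Expected frequencies E = N * e^(-m) * m^r / r! under a fitted Poisson
# distribution. The observed counts are invented for illustration.
from math import exp, factorial

observed = [35, 40, 16, 7, 2]                        # frequencies of r = 0..4
N = sum(observed)                                    # total frequency, N = ΣO
m = sum(r * f for r, f in enumerate(observed)) / N   # sample mean estimates m

expected = [N * exp(-m) * m**r / factorial(r) for r in range(len(observed))]
print([round(e, 2) for e in expected])   # [36.42, 36.79, 18.58, 6.25, 1.58]
# Note: the last class (E ≈ 1.6) is below 5 and would be pooled before testing.
```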
Step 4: Level of significance: α (commonly 5%)
Step 5: Degrees of freedom: v = n − 1
Step 6: Critical value: We determine the tabulated value of χ² at the α% level of significance for (n − 1) degrees of freedom from the χ² table.
Step 7: Decision: If the calculated value of χ² ≤ the tabulated value of χ², we accept the null hypothesis; otherwise, we reject it.
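Putting the steps together, here is a minimal sketch of a complete goodness-of-fit run, assuming SciPy is available; the die-roll counts are invented for illustration.

```python
# Goodness-of-fit test following Steps 1-7: is a die fair?
from scipy.stats import chisquare, chi2

observed = [22, 17, 20, 26, 22, 13]      # O: counts of faces 1-6 in 120 rolls
expected = [sum(observed) / 6] * 6       # E = ΣO / n (equal proportions)

stat, p_value = chisquare(observed, expected)   # Step 3: χ² = Σ (O − E)² / E

alpha = 0.05                             # Step 4: level of significance
df = len(observed) - 1                   # Step 5: v = n − 1 = 5
critical = chi2.ppf(1 - alpha, df)       # Step 6: tabulated value ≈ 11.07

# Step 7: decision rule
if stat <= critical:
    print(f"chi2 = {stat:.2f} <= {critical:.2f}: accept H0 (die is fair)")
else:
    print(f"chi2 = {stat:.2f} > {critical:.2f}: reject H0")
```

Here χ² = 102/20 = 5.10, which is below the critical value of about 11.07, so the null hypothesis of equal proportions is accepted.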