Friday, December 10, 2010

Statistics Tutorial: Chi-Square Distribution

Suppose we conduct the following statistical experiment. We select a random sample of size n from a normal population, having a standard deviation equal to σ. We find that the standard deviation in our sample is equal to s. Given these data, we can define a statistic, called chi-square, using the following equation:
$$\chi^2 = \frac{(n - 1)\, s^2}{\sigma^2}$$
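As a rough illustration, the statistic can be computed directly from a sample. The sketch below assumes that s is the usual sample standard deviation (with n - 1 in the denominator); the function name `chi_square_statistic`, the sample values, and the value of σ are made up for the example.

```python
import statistics


def chi_square_statistic(sample, sigma):
    """Return (n - 1) * s^2 / sigma^2 for one sample."""
    n = len(sample)
    # statistics.variance uses the n - 1 denominator, i.e. it returns s^2.
    s_squared = statistics.variance(sample)
    return (n - 1) * s_squared / sigma ** 2


# Hypothetical data: three observations, assumed population SD of 2.
sample = [4.0, 6.0, 8.0]
sigma = 2.0

chi2 = chi_square_statistic(sample, sigma)
print(chi2)  # s^2 = 4, so (3 - 1) * 4 / 4 = 2.0
```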
If we repeated this experiment an infinite number of times, we could obtain a sampling distribution for the chi-square statistic. The chi-square distribution is defined by the following probability density function:
$$Y = Y_0 \cdot (\chi^2)^{(v/2 - 1)} \cdot e^{-\chi^2 / 2}$$
where $Y_0$ is a constant that depends on the number of degrees of freedom, $\chi^2$ is the chi-square statistic, $v = n - 1$ is the number of degrees of freedom, and $e$ is the base of the natural logarithm (approximately 2.71828). $Y_0$ is defined so that the area under the chi-square curve is equal to one.
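To make the role of the constant concrete, here is a minimal sketch of the density in code. The tutorial does not spell out Y0; the sketch uses the standard normalizing constant 1 / (2^(v/2) * Γ(v/2)) and then checks numerically that the area under the curve is close to one. The function names, the integration limit, and the step count are arbitrary choices for the example.

```python
import math


def chi_square_pdf(x, v):
    """Density Y = Y0 * x**(v/2 - 1) * exp(-x/2), for x > 0."""
    # Y0 is the standard normalizing constant (not given in the text above).
    y0 = 1.0 / (2 ** (v / 2) * math.gamma(v / 2))
    return y0 * x ** (v / 2 - 1) * math.exp(-x / 2)


def area_under_curve(v, upper=100.0, steps=100_000):
    """Midpoint-rule approximation of the area between 0 and `upper`."""
    width = upper / steps
    total = sum(chi_square_pdf((i + 0.5) * width, v) for i in range(steps))
    return total * width


# Degrees of freedom matching samples of size 3, 5 and 11.
for v in (2, 4, 10):
    print(v, area_under_curve(v))  # each area should be very close to 1
```

The upper limit of 100 stands in for infinity; for these degrees of freedom the tail beyond it is negligible.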
In the figure below, the red curve shows the distribution of chi-square values computed from all possible samples of size 3, where the number of degrees of freedom is n - 1 = 3 - 1 = 2. Similarly, the green curve shows the distribution for samples of size 5 (degrees of freedom equal to 4), and the blue curve shows it for samples of size 11 (degrees of freedom equal to 10).
The chi-square distribution has the following properties:
  • The mean of the distribution is equal to the number of degrees of freedom: μ = v.
  • The variance is equal to two times the number of degrees of freedom: $\sigma^2 = 2v$.
  • When the degrees of freedom are greater than or equal to 2, the maximum value for Y occurs when $\chi^2 = v - 2$.
  • As the degrees of freedom increase, the chi-square curve approaches a normal distribution.
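The first two properties can be checked with a small simulation. The sketch below draws many samples from a normal population, computes the chi-square statistic for each, and compares the mean and variance of those values with v and 2v. The function name `simulate_chi_square`, the number of trials, the seed, and the choice of a standard normal population are assumptions made for the example.

```python
import random
import statistics


def simulate_chi_square(n, sigma=1.0, trials=20_000, seed=0):
    """Return chi-square statistics from `trials` normal samples of size n."""
    rng = random.Random(seed)
    values = []
    for _ in range(trials):
        sample = [rng.gauss(0.0, sigma) for _ in range(n)]
        mean = sum(sample) / n
        # The sum of squared deviations equals (n - 1) * s^2.
        sum_sq = sum((x - mean) ** 2 for x in sample)
        values.append(sum_sq / sigma ** 2)
    return values


for n in (3, 5, 11):
    v = n - 1
    values = simulate_chi_square(n)
    # The simulated mean should be near v and the variance near 2 * v.
    print(v, statistics.fmean(values), statistics.variance(values))
```

Because the results come from random draws, they only approximate v and 2v; more trials give closer agreement.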
Cumulative Probability and the Chi-Square Distribution
The chi-square distribution is constructed so that the total area under the curve is equal to 1. The area under the curve between 0 and a particular chi-square value is a cumulative probability associated with that chi-square value. For example, in the figure below, the shaded area represents a cumulative probability associated with a chi-square statistic equal to A; that is, it is the probability that the value of a chi-square statistic will fall between 0 and A.
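A cumulative probability of this kind can be approximated by integrating the density from 0 to A. The sketch below does this with a simple midpoint rule and, for 2 degrees of freedom, compares the result with the closed form 1 - e^(-A/2) that holds in that special case. The function names, the value A = 4, and the normalizing constant 1 / (2^(v/2) * Γ(v/2)) are assumptions for the example rather than part of the tutorial.

```python
import math


def chi_square_pdf(x, v):
    """Chi-square density with v degrees of freedom, for x > 0."""
    y0 = 1.0 / (2 ** (v / 2) * math.gamma(v / 2))
    return y0 * x ** (v / 2 - 1) * math.exp(-x / 2)


def cumulative_probability(a, v, steps=100_000):
    """Approximate the area under the density between 0 and a.

    Uses the midpoint rule, which is adequate for v >= 2; for v = 1 the
    density is unbounded near 0 and a better method would be needed.
    """
    width = a / steps
    total = sum(chi_square_pdf((i + 0.5) * width, v) for i in range(steps))
    return total * width


A = 4.0
approx = cumulative_probability(A, v=2)
exact = 1.0 - math.exp(-A / 2)  # closed form, valid only for v = 2
print(approx, exact)  # the two values should agree closely
```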


