"No one is harder on a talented person than the person themselves" - Linda Wilkinson ; "Trust your guts and don't follow the herd" ; "Validate direction not destination" ;

November 21, 2015

Stats - Chi-Square Goodness of Fit Test

Purpose - Test whether the observed frequencies of a categorical variable match a specified (expected) distribution

The chi-square test is defined for the hypothesis:
H0: The data follow a specified distribution
Ha: The data do not follow the specified distribution
This means that if the p-value is less than the significance level (0.05 here), you reject the null hypothesis; if the p-value is greater than or equal to 0.05, you don't reject the null hypothesis

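Formula is χ² = Σ (Oi - Ei)² / Ei, where Oi is the observed frequency and Ei is the expected frequency for category i
I liked the example mentioned in the notes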

Problem - Testing an octahedral die to see if it is biased

Score 1 2 3 4 5 6 7 8
Frequency 7 10 11 9 12 10 14 7 (Observed)

Degrees of Freedom = Number of categories - 1. Here it is 8 - 1 = 7
Test the hypothesis H0 - The Die is Fair
H1: Die is not fair
Significance level alpha = 0.05

Under H0 the expected frequencies follow a uniform distribution: Ei = Sum of all observed frequencies / 8 (Number of scores)
= 80/8 = 10

The expected values will be
Score 1 2 3 4 5 6 7 8
Frequency 10 10 10 10 10 10 10 10 (Expected)
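
The expected counts can also be reproduced in R. This is only a small sketch; the variable names `observed` and `expected` are my own choice and not from the notes.

```r
# Observed frequencies for scores 1..8 (from the table above)
observed <- c(7, 10, 11, 9, 12, 10, 14, 7)

# Under H0 (fair die) every score is equally likely, so each
# expected count is total rolls / number of scores
expected <- rep(sum(observed) / length(observed), length(observed))

print(sum(observed))  # total number of rolls
print(expected)       # expected count for each score
```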

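To compute the test statistic we need, for each score, the values of (Oi - Ei) and ((Oi - Ei)*(Oi - Ei)) / Ei, pairing each observed frequency with its expected frequency

The squared differences are 9, 0, 1, 1, 4, 0, 16, 9, which sum to 40, so the chi-square statistic is 40/10 = 4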
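
The element-by-element step could be sketched in R roughly like this. The names `contrib` and `chi_sq` are just illustrative; the data are the observed and expected frequencies from the tables above.

```r
observed <- c(7, 10, 11, 9, 12, 10, 14, 7)
expected <- rep(10, 8)

# Contribution of each score: (Oi - Ei)^2 / Ei
contrib <- (observed - expected)^2 / expected
print(contrib)

# The chi-square statistic is the sum of the contributions
chi_sq <- sum(contrib)
print(chi_sq)
```

R works on the two vectors element-wise, so no explicit loop is needed.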


Compute the p-value for the chi-square statistic of 4 (R Command)
1-pchisq(4,df=7)
0.7797774
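
As a cross-check, base R's `chisq.test()` should give the same statistic and p-value in one call. This is a sketch, not the method from the notes; when it is given a single vector and no `p` argument, it tests against equal probabilities, which is the fair-die hypothesis here.

```r
observed <- c(7, 10, 11, 9, 12, 10, 14, 7)

# Goodness-of-fit test against a uniform distribution
# (the default when only one vector is supplied)
result <- chisq.test(observed)

print(result$statistic)  # chi-square statistic
print(result$parameter)  # degrees of freedom
print(result$p.value)    # same quantity as 1 - pchisq(statistic, df)
```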

This p-value (about 0.78) is above the significance level of 0.05, so we cannot reject the null hypothesis
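
The same decision can be reached by comparing the statistic with the critical value instead of the p-value. A minimal sketch, assuming alpha = 0.05 and df = 7 as in this example (the names `alpha`, `critical` and `chi_sq` are mine):

```r
alpha <- 0.05
chi_sq <- 4  # statistic for the die example

# Critical value: the point that leaves alpha in the upper tail
critical <- qchisq(1 - alpha, df = 7)
print(critical)

# TRUE would mean reject H0; FALSE means we cannot reject H0
print(chi_sq > critical)
```

The critical value comes out close to 14.07, well above the statistic of 4, so both approaches agree.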

Answer - There is no evidence that the die is biased, so we treat the die as fair

Happy Learning!!!
