It took a couple of months to fully analyse hypothesis testing and arrive at the learnings below:
- Formulating and Identifying NULL Hypothesis and Alternate Hypothesis
- Choosing which tail of the Normal Distribution to test (Left-Tailed, Right-Tailed or Two-Tailed Tests)
- Identifying the Area under the curve (Using pnorm in R language)
- Computing the T value or Z value
- Computing the P value
- If p value < 0.05 then reject the Null Hypothesis
- If p value >= 0.05 then fail to reject the Null Hypothesis (we never "accept" it. Or, to put it another way, if the p-value is high, the null will fly)
Finding P-Values
Here we use the pnorm function.
Usage: P-value = pnorm(z, lower.tail = TRUE or FALSE), where z is the z score of the sample mean.
- Left-Tailed Tests: P-value = pnorm(z, lower.tail=TRUE)
- Right-Tailed Tests: P-value = pnorm(z, lower.tail=FALSE)
- Two-Tailed Tests: P-value = 2 * pnorm(abs(z), lower.tail=FALSE)
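The three cases above can be wrapped in one small function. This is only a sketch: the name z_p_value and its tail argument are made up for illustration and are not part of base R; only pnorm is.

```r
# Hypothetical helper (not a base R function): p-value for a z score.
# tail = "left", "right" or "two" picks which of the three rules to apply.
z_p_value <- function(z, tail = c("left", "right", "two")) {
  tail <- match.arg(tail)
  switch(tail,
    left  = pnorm(z, lower.tail = TRUE),
    right = pnorm(z, lower.tail = FALSE),
    two   = 2 * pnorm(abs(z), lower.tail = FALSE)
  )
}

z_p_value(1.96, "two")   # close to 0.05
z_p_value(-1, "left")    # same area as z_p_value(1, "right")
```

Taking abs(z) in the two-tailed case means the function gives the same answer whether the sample mean falls above or below the claimed mean.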
R and Hypothesis Tests
The two problems below apply the logic above.
Problem #1 - Z Test Case
A rental car company claims the mean time to rent a car on their website is 60 seconds with a standard deviation of 30 seconds. A random sample of 36 customers attempted to rent a car on the website. The mean time to rent was 75 seconds. Is this enough evidence to contradict the company's claim? What is the p-value?
H0: mean time = 60 seconds (the company's claim holds)
Ha: mean time <> 60 seconds (the company's claim is contradicted)
Population Mean = 60
Population SD = 30
Sample Mean = 75
Sample Size = 36
Using Population Mean = 60 and Population SD = 30:
Standard Error = Population SD / sqrt(Sample Size)
Standard Error = 30 / sqrt(36)
Standard Error = 30 / 6 = 5
Z score = (Sample Mean - Population Mean) / Standard Error
Z score = (75 - 60) / 5 = 3
This is a two-tailed test, since the question asks whether the mean differs from 60 seconds in either direction (the <> symbol)
2*pnorm(75, mean=60, sd=5, lower.tail=FALSE)
p value = 0.002699796
Since the p value is less than 0.05, we reject the null hypothesis
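As a cross-check, the same p value can be reached from the Z score of 3 instead of from the raw sample mean. The variable names below are chosen just for this sketch; the numbers come from the problem statement.

```r
# Values from the problem statement; variable names are illustrative only.
pop_mean    <- 60
pop_sd      <- 30
sample_mean <- 75
n           <- 36

se <- pop_sd / sqrt(n)                 # standard error of the mean
z  <- (sample_mean - pop_mean) / se    # z score

# Route 1: two-tailed p value from the z score (standard normal).
p_from_z <- 2 * pnorm(abs(z), lower.tail = FALSE)

# Route 2: same area on the original scale, as in the text above.
# This form relies on the sample mean being above the population mean.
p_from_raw <- 2 * pnorm(sample_mean, mean = pop_mean, sd = se, lower.tail = FALSE)

all.equal(p_from_z, p_from_raw)        # TRUE

alpha    <- 0.05
decision <- if (p_from_z < alpha) "reject H0" else "fail to reject H0"
decision
```

Both routes measure the same tail area; standardising first just moves the question onto the standard normal curve.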
Problem #2
An outbreak of Salmonella related illness was attributed to ice cream produced at a certain factory. Scientists measured the level of Salmonella in 9 randomly sampled batches of ice cream. The levels (in MPN/g) were: 0.593 0.142 0.329 0.691 0.231 0.793 0.519 0.392 0.418. Is there evidence that the mean level of Salmonella in the ice cream is greater than 0.3 MPN/g? What is the p-value?
H0: mean = 0.3 MPN/g
Ha: mean > 0.3 MPN/g
Option #1
Using R's t.test function
x = c(0.593, 0.142, 0.329, 0.691, 0.231, 0.793, 0.519, 0.392, 0.418)
t.test(x, alternative="greater", mu=0.3)
p-value = 0.02927. The p value is < 0.05, so we can reject the null hypothesis
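Rather than reading the p value off the printed output, the individual numbers can be pulled from the object that t.test returns. A short sketch, where result is just a name picked for this example:

```r
x <- c(0.593, 0.142, 0.329, 0.691, 0.231, 0.793, 0.519, 0.392, 0.418)

# t.test returns a list-like "htest" object; store it instead of printing it.
result <- t.test(x, alternative = "greater", mu = 0.3)

result$statistic   # the t value
result$parameter   # the degrees of freedom
result$p.value     # the p value

result$p.value < 0.05   # TRUE, so reject the null hypothesis
```

These are the same t value and degrees of freedom that a manual calculation has to produce, which makes them handy for checking hand-worked steps.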
Option #2
Computing the t value and p value by hand
9 random samples, degrees of freedom = 9 - 1 = 8
collectedsample = c(0.593,0.142,0.329,0.691,0.231,0.793,0.519,0.392,0.418)
samplemean = mean(collectedsample)            # 0.4564444
standarddeviation = sd(collectedsample)       # 0.2128439
populationmean = 0.3
n = length(collectedsample)                   # 9
sdx = standarddeviation / sqrt(n)             # standard error of the mean
t = (samplemean - populationmean) / sdx
t
t value is 2.205058
df = n - 1                                    # 8
pvalue = pt(t, df = df, lower.tail = FALSE)   # right-tailed test, since Ha is "greater"
pvalue
pvalue = 0.0292652
Since the sample size is < 30 and the population standard deviation is unknown, we use the t distribution (pt) here instead of the pnorm function
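To see why the choice matters, the sketch below feeds the same t value of 2.205058 to both pt and pnorm. The variable names are illustrative, and the pnorm line is shown only for comparison, not as a valid test for this data.

```r
t_stat <- 2.205058   # t value from Problem #2
df     <- 8          # 9 samples - 1

p_t    <- pt(t_stat, df = df, lower.tail = FALSE)   # t distribution, right tail
p_norm <- pnorm(t_stat, lower.tail = FALSE)         # normal curve, for comparison only

p_t
p_norm
p_t > p_norm   # TRUE: the t distribution has heavier tails
```

With only 8 degrees of freedom the normal curve understates the tail area, so using pnorm here would make the evidence look stronger than it is.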
Happy Learning!!!