Introduction to Inferential Statistics
Hypothesis testing one sample (t-test)
Examples
Testing New Teaching Methods:
Background: It is hypothesized that a new teaching method improves student test scores. Historical data suggest that the average score is 70 points.
Data: A trial of the new method with 25 students resulted in an average score of 74 points. The sample standard deviation was 12 points.
Significance Level $\alpha$: A significance level of $\alpha = 0.05$ is chosen, representing a 5% risk of concluding that a difference exists when there is no actual difference (a Type I error).
Hypotheses:
- $H_0$: The new method does not improve test scores $(\mu = 70)$.
- $H_A$: The new method improves test scores $(\mu > 70)$.
Test Statistic Calculation: $$ t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{74 - 70}{12/\sqrt{25}} = \frac{4}{2.4} \approx 1.67 $$
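As a minimal sketch, the calculation above can be reproduced from the summary statistics alone. The helper name `t_statistic` and its parameter names are illustrative assumptions, not part of the original notebook.

```python
import math

def t_statistic(sample_mean, pop_mean, sample_std, n):
    """One-sample t statistic from summary statistics (hypothetical helper)."""
    # Standard error of the mean: s / sqrt(n)
    standard_error = sample_std / math.sqrt(n)
    return (sample_mean - pop_mean) / standard_error

# Values from the teaching-method example
t_example = t_statistic(sample_mean=74, pop_mean=70, sample_std=12, n=25)
print(f'T-statistic: {t_example:.2f}')  # T-statistic: 1.67
```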
P-value Calculation: Because $H_A$ is one-sided, a one-tailed test is used. The p-value is the upper-tail area beyond the t-statistic under a t-distribution with $n - 1 = 24$ degrees of freedom, found from a t-distribution table or with statistical software.
In [1]:
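import scipy.stats as stats
# Rounded t statistic from the calculation above
t_stat = 1.67
# One-tailed p-value: upper-tail area of the t-distribution with n - 1 = 24 degrees of freedom
p_value = stats.t.sf(t_stat, df=24)
print(f'T-statistic: {t_stat}')
print(f'P-value: {p_value:.5f}')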
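T-statistic: 1.67
P-value: 0.05396
Conclusion: Since the p-value (about 0.054) is greater than $\alpha = 0.05$, we fail to reject $H_0$; this sample does not provide sufficient evidence that the new method improves test scores.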
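The table-lookup route mentioned in the p-value step can also be sketched in code: rather than computing a p-value, compare the t-statistic with the critical value for $\alpha = 0.05$ and 24 degrees of freedom. The variable names here are illustrative assumptions; `stats.t.ppf` is SciPy's inverse CDF for the t-distribution.

```python
import scipy.stats as stats

alpha = 0.05
df = 24          # n - 1 for the 25 students
t_stat = 1.67    # rounded t statistic from the example

# Upper-tail critical value: the point with area alpha to its right
t_critical = stats.t.ppf(1 - alpha, df=df)
reject_h0 = t_stat > t_critical

print(f'Critical value: {t_critical:.3f}')
print(f'Reject H0: {reject_h0}')
```

Both routes give the same decision: the statistic falls in the rejection region exactly when the one-tailed p-value is below $\alpha$.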
Section 2
Activity 1: We have data that follows a normal distribution with unknown mean $\mu$ and unknown variance $\sigma^2$. Suppose we collect the following sample: 1, 2, 3, 6, −1
$H_0: \mu = 0$
$H_A: \mu > 0$
At a significance level of α = 0.05, should we reject the null hypothesis?
In [10]:
import numpy as np
In [9]:
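data = [1, 2, 3, 6, -1]
n = len(data)
# Use the sample standard deviation (ddof=1); np.std defaults to ddof=0
t = (np.mean(data) - 0) / (np.std(data, ddof=1) / np.sqrt(n))
# One-tailed p-value with n - 1 = 4 degrees of freedom
p_value = stats.t.sf(t, df=n - 1)
print(f'T-statistic: {t:.4f}')
print(f'P-value: {p_value:.4f}')
T-statistic: 1.9005
P-value: 0.0651
Since the p-value (about 0.065) is greater than 0.05, we fail to reject $H_0$.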
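As a cross-check, the Activity 1 test can be run with SciPy's built-in one-sample t-test. This is a sketch that assumes SciPy 1.6 or later, where `ttest_1samp` accepts the `alternative` argument; the function uses the sample standard deviation (`ddof=1`) internally.

```python
from scipy import stats

data = [1, 2, 3, 6, -1]

# alternative='greater' requests the one-sided test H_A: mu > 0
result = stats.ttest_1samp(data, popmean=0, alternative='greater')

print(f'T-statistic: {result.statistic:.4f}')
print(f'P-value: {result.pvalue:.4f}')
```

A p-value above 0.05 here means the null hypothesis is not rejected at the 5% level.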
Activity 2: Suppose data is drawn from a normal distribution with unknown mean $\mu$ and unknown standard deviation.
There are 110 data points with a sample mean of 3 and a sample variance of 26. For a one-sample t-test, let $H_0: \mu = 2$ and $H_A: \mu > 2$.
- What is the value of the t statistic?
In [15]:
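n = 110
# The standard error uses the sample standard deviation, i.e. sqrt(variance)
t = (3 - 2) / (np.sqrt(26) / np.sqrt(n))
print(f'T-statistic: {t:.4f}')
# One-tailed p-value with n - 1 = 109 degrees of freedom
p_value = stats.t.sf(t, df=n - 1)
print(f'P-value: {p_value:.3f}')
T-statistic: 2.0569
P-value: 0.021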
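When only a sample mean, sample variance, and sample size are given, as in Activity 2, the whole test can be wrapped in a small helper. The sketch below is one possible way to do this; the name `one_sample_t_from_summary` and its parameters are hypothetical and not part of the original notebook. It takes the variance (not the standard deviation) and uses $n - 1$ degrees of freedom.

```python
import math
from scipy import stats

def one_sample_t_from_summary(sample_mean, sample_var, n, mu0):
    """One-tailed (greater) one-sample t-test from summary statistics.

    Hypothetical helper: returns the t statistic and the upper-tail p-value.
    """
    # Standard error: sqrt(s^2 / n), where s^2 is the sample variance
    standard_error = math.sqrt(sample_var / n)
    t_stat = (sample_mean - mu0) / standard_error
    # Upper-tail area under the t-distribution with n - 1 degrees of freedom
    p_value = stats.t.sf(t_stat, df=n - 1)
    return t_stat, p_value

# Activity 2: n = 110, sample mean 3, sample variance 26, H0: mu = 2
t2, p2 = one_sample_t_from_summary(sample_mean=3, sample_var=26, n=110, mu0=2)
print(f'T-statistic: {t2:.4f}')
print(f'P-value: {p2:.3f}')
```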