The t-Test — Comparing Means and Making Statistical Inferences

🧠 Introduction

In statistics, we often want to know whether two groups differ significantly in their means.
For example:

Do male and female students score differently in mathematics?
Does a new teaching method improve test performance compared to the traditional one?
Is the average lifetime of a new bulb different from 1000 hours as claimed?

To answer such questions, we use the t-test, one of the most commonly used inferential tests.

⚙️ What is a t-Test?

A t-test compares a sample mean with a reference value, or the means of two groups with each other, to determine whether the observed difference is statistically significant or could plausibly have arisen by random chance.

It is based on the Student’s t-distribution, introduced by William Sealy Gosset under the pseudonym “Student” in 1908.

📘 When to Use a t-Test

Use a t-test when:

The dependent variable is continuous (e.g., marks, weight, time).
The data are approximately normal.
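The sample size is small (commonly n < 30). The test is still valid for larger samples, where its results approach those of the z-test.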
The population standard deviation (σ) is unknown.

🧾 Types of t-Tests

Type	Purpose	Data Condition
1. One-sample t-test	Compare the sample mean with a known or claimed population mean.	One sample
2. Independent two-sample t-test	Compare the means of two independent groups.	Two unrelated samples
3. Paired t-test (Dependent)	Compare the means of two related groups (e.g., before-after).	Two related samples

⚡ Formulae

1️⃣ One-Sample t-Test

$$t = \frac{\bar{X} - \mu_0}{s / \sqrt{n}}$$

where:

$\bar{X}$: sample mean
$\mu_0$: hypothesized population mean
$s$: sample standard deviation
$n$: sample size
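
The formula above can be sketched directly in NumPy. This is a minimal illustration, not a replacement for a library routine; the helper name `one_sample_t` is made up for this example, and the data are the ten scores from Example 1.

```python
import numpy as np

def one_sample_t(sample, mu0):
    """Return the one-sample t statistic and its degrees of freedom."""
    x = np.asarray(sample, dtype=float)
    n = x.size
    mean = x.mean()
    s = x.std(ddof=1)  # sample standard deviation (divides by n - 1)
    t = (mean - mu0) / (s / np.sqrt(n))
    return t, n - 1

scores = [75, 78, 72, 70, 69, 82, 80, 68, 74, 77]
t_stat, df = one_sample_t(scores, 70)
print(f"t = {t_stat:.2f}, df = {df}")
```

Note the `ddof=1` argument: NumPy's default divides by n, which gives the population standard deviation rather than the sample standard deviation the formula calls for.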

2️⃣ Independent Two-Sample t-Test

If variances are assumed equal:

$$t = \frac{\bar{X}_1 - \bar{X}_2}{s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$$

where the pooled standard deviation $s_p$ is:

$$s_p = \sqrt{\frac{(n_1-1)s_1^2 + (n_2-1)s_2^2}{n_1 + n_2 - 2}}$$
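
As a rough sketch, the pooled formula translates to a few lines of NumPy. The helper name `pooled_t` is hypothetical, and the two groups are the scores from Example 2.

```python
import numpy as np

def pooled_t(a, b):
    """Return (t, pooled SD, degrees of freedom) for the equal-variance t-test."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    n1, n2 = a.size, b.size
    var1, var2 = a.var(ddof=1), b.var(ddof=1)  # sample variances
    sp = np.sqrt(((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2))
    t = (a.mean() - b.mean()) / (sp * np.sqrt(1 / n1 + 1 / n2))
    return t, sp, n1 + n2 - 2

group_a = [75, 78, 82, 80]
group_b = [68, 72, 70, 74]
t_stat, sp, df = pooled_t(group_a, group_b)
print(f"s_p = {sp:.2f}, t = {t_stat:.2f}, df = {df}")
```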

3️⃣ Paired t-Test

$$t = \frac{\bar{d}}{s_d / \sqrt{n}}$$

where:

$\bar{d}$: mean of differences
$s_d$: standard deviation of differences
$n$: number of pairs
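
A minimal sketch of the paired formula, assuming the differences are taken as second measurement minus first. The helper name `paired_t` is illustrative only, and the marks are those from Example 3.

```python
import numpy as np

def paired_t(first, second):
    """Return the paired t statistic and df, using d = second - first."""
    d = np.asarray(second, dtype=float) - np.asarray(first, dtype=float)
    n = d.size
    t = d.mean() / (d.std(ddof=1) / np.sqrt(n))
    return t, n - 1

before = [70, 68, 72, 71, 69]
after = [75, 74, 76, 73, 70]
t_stat, df = paired_t(before, after)
print(f"t = {t_stat:.2f}, df = {df}")
```

Swapping the two arguments flips the sign of t but not its magnitude, so the direction of subtraction matters only for one-sided alternatives and for how the result is read.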

🎯 Hypotheses

Type	Null Hypothesis (H₀)	Alternative Hypothesis (H₁)
One-sample	μ = μ₀	μ ≠ μ₀, μ > μ₀, or μ < μ₀
Independent	μ₁ = μ₂	μ₁ ≠ μ₂, μ₁ > μ₂, or μ₁ < μ₂
Paired	μ_d = 0	μ_d ≠ 0, μ_d > 0, or μ_d < 0
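
The three alternatives in each row correspond to two-sided, right-tailed, and left-tailed tests. One way to request them in code is SciPy's `alternative` keyword; this sketch assumes a SciPy version that supports it (1.6 or later) and reuses the Example 1 scores.

```python
import numpy as np
from scipy import stats

scores = np.array([75, 78, 72, 70, 69, 82, 80, 68, 74, 77])

# Run the same one-sample test under each alternative hypothesis
results = {
    alt: stats.ttest_1samp(scores, popmean=70, alternative=alt)
    for alt in ("two-sided", "greater", "less")
}

for alt, res in results.items():
    print(f"{alt:>9}: t = {res.statistic:.2f}, p = {res.pvalue:.4f}")
```

The t statistic is identical in all three cases; only the p-value changes. When t is positive, the "greater" p-value is half the two-sided one.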

📊 Example 1 — One-Sample t-Test

A class of 10 students scored as follows in a test:
75, 78, 72, 70, 69, 82, 80, 68, 74, 77

Test whether the mean score is different from 70 at a 5% significance level.

Step 1:

$$\bar{X} = 74.5, \quad s = 4.77, \quad n = 10$$

Step 2:

$$t = \frac{74.5 - 70}{4.77 / \sqrt{10}} = 2.99$$

Step 3:

df = 9, $t_{crit} = 2.262$ (two-tailed, α = 0.05)

✅ Since 2.99 > 2.262 → Reject H₀
The mean score is significantly different from 70.
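
The critical value in Step 3 normally comes from a t-table, but it can also be looked up in code. The sketch below uses `scipy.stats.t` to recover the critical value and the matching two-tailed p-value; the variable names are illustrative.

```python
from scipy import stats

t_stat = 2.99   # test statistic from Step 2
df = 9          # n - 1
alpha = 0.05

# Two-tailed critical value: the point leaving alpha/2 in the upper tail
t_crit = stats.t.ppf(1 - alpha / 2, df)

# Two-tailed p-value: probability of a |t| at least this large under H0
p_value = 2 * stats.t.sf(abs(t_stat), df)

reject = abs(t_stat) > t_crit
print(f"t_crit = {t_crit:.3f}, p = {p_value:.4f}, reject H0: {reject}")
```

Comparing |t| with the critical value and comparing the p-value with α always lead to the same decision.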

📊 Example 2 — Independent Two-Sample t-Test

Group	Scores	Mean	SD	n
A	75, 78, 82, 80	78.75	2.99	4
B	68, 72, 70, 74	71.0	2.58	4

Step 1:

$$s_p = \sqrt{\frac{(3)(2.99)^2 + (3)(2.58)^2}{6}} = 2.79$$

$$t = \frac{78.75 - 71.0}{2.79 \sqrt{\frac{1}{4} + \frac{1}{4}}} = 3.93$$

Step 2:

df = n₁ + n₂ − 2 = 6, $t_{crit} = 2.447$ (two-tailed, α = 0.05)

✅ Since 3.93 > 2.447 → Reject H₀.
The two group means differ significantly.

📊 Example 3 — Paired t-Test

Students’ marks before and after special coaching:

Student	Before	After	Difference (d)
1	70	75	5
2	68	74	6
3	72	76	4
4	71	73	2
5	69	70	1

$$\bar{d} = 3.6, \quad s_d = 2.07, \quad n = 5$$

$$t = \frac{3.6}{2.07 / \sqrt{5}} = 3.88$$

df = 4, $t_{crit} = 2.776$ (two-tailed, α = 0.05)

✅ Since 3.88 > 2.776 → Reject H₀.
Coaching significantly improved scores.
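
A paired t-test is the same calculation as a one-sample t-test on the differences against a mean of 0. The following sketch checks that equivalence with SciPy on the table above.

```python
import numpy as np
from scipy import stats

before = np.array([70, 68, 72, 71, 69])
after = np.array([75, 74, 76, 73, 70])

# Paired test on the two columns (after first, so d = After - Before)
paired = stats.ttest_rel(after, before)

# One-sample test on the differences against a mean of 0
one_sample = stats.ttest_1samp(after - before, popmean=0)

print(f"paired:     t = {paired.statistic:.2f}, p = {paired.pvalue:.4f}")
print(f"one-sample: t = {one_sample.statistic:.2f}, p = {one_sample.pvalue:.4f}")
```

Both calls return the same statistic and p-value, which is why the paired test has n − 1 degrees of freedom, where n is the number of pairs.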

💻 t-Test in Python

from scipy import stats
import numpy as np

# One-sample t-test: is the mean score different from 70?
data = np.array([75, 78, 72, 70, 69, 82, 80, 68, 74, 77])
t_stat, p_val = stats.ttest_1samp(data, 70)
print("t-statistic:", t_stat, "p-value:", p_val)

# Independent two-sample t-test (pooled variance; equal_var=True is the default)
group1 = np.array([75, 78, 82, 80])
group2 = np.array([68, 72, 70, 74])
t_stat, p_val = stats.ttest_ind(group1, group2)
print("t-statistic:", t_stat, "p-value:", p_val)

# Paired t-test: pass `after` first so that t has the sign of d = After - Before
before = np.array([70, 68, 72, 71, 69])
after = np.array([75, 74, 76, 73, 70])
t_stat, p_val = stats.ttest_rel(after, before)
print("t-statistic:", t_stat, "p-value:", p_val)

📈 Assumptions of t-Test

Data are continuous (interval/ratio).
Independence of observations.
The data are approximately normally distributed.
For independent samples: equal variances (for the pooled version).

If variances are unequal → use Welch’s t-test.
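
One possible way to check these assumptions and switch to Welch's test in SciPy is sketched below. The choice of Shapiro-Wilk for normality and Levene for equal variances is an assumption of this example, not something the text prescribes, and with samples this small both checks have little power.

```python
import numpy as np
from scipy import stats

group_a = np.array([75, 78, 82, 80])
group_b = np.array([68, 72, 70, 74])

# Shapiro-Wilk: a small p-value suggests a departure from normality
shapiro_a = stats.shapiro(group_a)
shapiro_b = stats.shapiro(group_b)

# Levene: a small p-value suggests the group variances differ
levene = stats.levene(group_a, group_b)

# Pooled (equal variances) versus Welch (unequal variances)
pooled = stats.ttest_ind(group_a, group_b, equal_var=True)
welch = stats.ttest_ind(group_a, group_b, equal_var=False)

print(f"Shapiro p-values: {shapiro_a.pvalue:.3f}, {shapiro_b.pvalue:.3f}")
print(f"Levene p-value:   {levene.pvalue:.3f}")
print(f"Pooled: t = {pooled.statistic:.2f}, p = {pooled.pvalue:.4f}")
print(f"Welch:  t = {welch.statistic:.2f}, p = {welch.pvalue:.4f}")
```

When the two groups have the same size, as here, the pooled and Welch t statistics coincide and only the degrees of freedom (and therefore the p-values) differ.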

⚠️ Common Mistakes

🚫 Using a t-test on a categorical outcome variable.
🚫 Ignoring normality and equal variance assumptions.
🚫 Confusing paired and independent samples.
🚫 Interpreting a non-significant result as “no difference at all” (could be due to small sample size).

🧭 Final Thoughts

The t-test is one of the most fundamental and widely applied statistical tools for comparing means.
It connects descriptive summaries (means and standard deviations) to inferential conclusions and lays the groundwork for more complex methods such as ANOVA, regression, and the statistical comparison of machine learning models.

Understanding which t-test to apply, and interpreting its results correctly, is an essential skill for any data analyst, researcher, or student of applied statistics.
