The Chi-Squared Goodness of Fit Test — Testing How Well Data Fit a Theoretical Model

🧠 Introduction

In real-world statistics, we often want to know whether our observed data follow a theoretical or expected pattern.

For example:

  • Does a die give all faces with equal probability?
  • Are colors chosen by customers equally preferred?
  • Do the grades of students follow a normal or expected distribution?

To answer such questions, statisticians use the Chi-Squared Goodness of Fit Test, one of the most important non-parametric hypothesis tests.


📘 What is the Chi-Squared Goodness of Fit Test?

The Goodness of Fit test checks whether the observed frequency distribution of a categorical variable matches an expected (theoretical) frequency distribution.

It’s based on the Chi-Squared (χ²) distribution and measures the discrepancy between what you observe and what you expect.


🧮 Formula

$$\chi^2 = \sum_{i} \frac{(O_i - E_i)^2}{E_i}$$

where:

  • $O_i$: Observed frequency in category i
  • $E_i$: Expected frequency in category i

The greater the difference between $O_i$ and $E_i$, the larger the χ² statistic, indicating a poorer fit to the expected model.
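To make the formula concrete, here is a minimal sketch that computes the statistic directly with NumPy. The function name `chi_squared_statistic` and the two small data sets are illustrative assumptions, not part of the original article.

```python
import numpy as np

def chi_squared_statistic(observed, expected):
    """Sum of (O_i - E_i)^2 / E_i over all categories."""
    observed = np.asarray(observed, dtype=float)
    expected = np.asarray(expected, dtype=float)
    return float(np.sum((observed - expected) ** 2 / expected))

# Made-up counts: one set close to the expectation, one far from it.
close_fit = chi_squared_statistic([10, 11, 9], [10, 10, 10])
poor_fit = chi_squared_statistic([2, 20, 8], [10, 10, 10])

print(close_fit)  # small statistic: observed counts are near the expected ones
print(poor_fit)   # large statistic: observed counts deviate strongly
```

The second data set produces a much larger statistic than the first, which is exactly the "poorer fit" behaviour described above.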


🎯 Hypotheses

  • Null hypothesis (H₀): The observed data follow the expected distribution.
  • Alternative hypothesis (H₁): The observed data do not follow the expected distribution.

⚙️ Steps to Perform the Test

  1. State hypotheses.
  2. Collect observed data (O).
  3. Determine expected frequencies (E).
    • Based on a known or theoretical probability model.
  4. Compute the test statistic: $\chi^2 = \sum \frac{(O - E)^2}{E}$
  5. Find degrees of freedom: $df = k - 1 - m$, where:
    • k = number of categories
    • m = number of estimated parameters (often 0 for simple models)
  6. Find the p-value or compare χ² with the critical value from the Chi-Squared table.
  7. Make a decision:
    • If $\chi^2_{calc} > \chi^2_{crit}$ (equivalently, if the p-value is below the significance level α), reject H₀.
    • Otherwise, fail to reject H₀.
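
The steps above can be sketched as a single function. This is one possible arrangement, assuming SciPy is installed; the name `goodness_of_fit`, its parameters, and the sample counts are hypothetical and chosen only for illustration.

```python
import numpy as np
from scipy.stats import chi2

def goodness_of_fit(observed, probabilities, m=0, alpha=0.05):
    """Run the test steps for observed counts and model probabilities.

    m is the number of parameters estimated from the data (often 0).
    """
    observed = np.asarray(observed, dtype=float)
    probabilities = np.asarray(probabilities, dtype=float)

    expected = observed.sum() * probabilities                        # step 3
    statistic = float(np.sum((observed - expected) ** 2 / expected)) # step 4
    df = len(observed) - 1 - m                                       # step 5
    p_value = float(chi2.sf(statistic, df))                          # step 6 (p-value)
    critical = float(chi2.ppf(1 - alpha, df))                        # step 6 (critical value)
    reject_h0 = statistic > critical                                 # step 7

    return {
        "statistic": statistic,
        "df": df,
        "p_value": p_value,
        "critical": critical,
        "reject_h0": reject_h0,
    }

# Made-up example: three categories assumed equally likely.
result = goodness_of_fit([18, 22, 20], [1 / 3, 1 / 3, 1 / 3])
print(result)
```

Comparing the statistic with the critical value and comparing the p-value with α always lead to the same decision, so either form of step 6 can be used.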

📊 Example 1: Testing a Fair Die

A die is rolled 60 times, and results are recorded as:

| Face | 1 | 2 | 3 | 4 | 5 | 6 |
|---|---|---|---|---|---|---|
| Observed (O) | 8 | 10 | 9 | 11 | 12 | 10 |

Step 1:

Expected frequency per face: $E = 60 / 6 = 10$.

Step 2:

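Compute $(O - E)^2 / E$ for each face:

| Face | O | E | (O−E)²/E |
|---|---|---|---|
| 1 | 8 | 10 | 0.4 |
| 2 | 10 | 10 | 0.0 |
| 3 | 9 | 10 | 0.1 |
| 4 | 11 | 10 | 0.1 |
| 5 | 12 | 10 | 0.4 |
| 6 | 10 | 10 | 0.0 |
| Total | | | 1.0 |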

Step 3:

$\chi^2 = 1.0$, with $df = 6 - 1 = 5$.

At the 5% significance level, $\chi^2_{crit}(5, 0.05) = 11.07$.

✅ Since 1.0 < 11.07 → Fail to reject H₀.
There is no significant difference — the die appears fair.
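
The critical value used in this example can also be looked up in software instead of a printed table. A minimal sketch, assuming SciPy is available: `chi2.ppf` is the inverse CDF of the chi-squared distribution, and the variable names are illustrative.

```python
from scipy.stats import chi2

alpha = 0.05      # significance level
df = 5            # six faces minus one
statistic = 1.0   # chi-squared statistic from the die example

# Value that a chi-squared(5) variable exceeds with probability alpha.
critical_value = float(chi2.ppf(1 - alpha, df))

print(round(critical_value, 2))    # 11.07
print(statistic > critical_value)  # False, so H0 is not rejected
```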


📈 Example 2: Unequal Expected Probabilities

Suppose a company claims that customers choose among four product colors in the following proportions:

  • Red: 40%
  • Blue: 30%
  • Green: 20%
  • Yellow: 10%

Out of 200 customers, the observed data are:

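| Color | Observed (O) | Expected % | Expected (E) | (O−E)²/E |
|---|---|---|---|---|
| Red | 90 | 40% | 80 | 1.25 |
| Blue | 60 | 30% | 60 | 0.00 |
| Green | 30 | 20% | 40 | 2.50 |
| Yellow | 20 | 10% | 20 | 0.00 |
| Total | 200 | 100% | 200 | 3.75 |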

$\chi^2 = 3.75, \quad df = 4 - 1 = 3$

At $\alpha = 0.05$, the critical value is $\chi^2_{crit} = 7.815$.
✅ Since 3.75 < 7.815, we fail to reject H₀.
The observed distribution is consistent with the claimed proportions.
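
The same calculation can be reproduced with SciPy when the expected probabilities are unequal. This is a sketch under the assumption that `scipy.stats.chisquare` is used with its `f_exp` argument; the expected counts must be supplied as frequencies that sum to the same total as the observed counts.

```python
import numpy as np
from scipy.stats import chisquare

# Observed counts for Red, Blue, Green, Yellow (200 customers).
observed = np.array([90, 60, 30, 20])

# Claimed proportions, converted to expected counts.
claimed = np.array([0.40, 0.30, 0.20, 0.10])
expected = observed.sum() * claimed

statistic, p_value = chisquare(observed, f_exp=expected)

print("Chi-Squared Statistic:", round(float(statistic), 2))
print("Reject H0 at 5%:", bool(p_value < 0.05))
```

The statistic matches the hand-computed 3.75, and the p-value is well above 0.05, in agreement with the critical-value comparison.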


🧾 Assumptions

  1. Data are frequencies (not percentages or continuous values).
  2. Observations are independent.
  3. Expected frequency ≥ 5 in each category.
  4. Categorical data only.

If any expected frequency is below 5, merge sparse categories or use an exact multinomial test instead of the χ² approximation.
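
The expected-frequency rule is easy to check before running the test. A minimal sketch follows: the helper `small_expected_categories`, the sample size of 30, and the probabilities are all hypothetical, and merging the last two categories is just one way to pool sparse cells.

```python
import numpy as np

def small_expected_categories(expected, minimum=5):
    """Return the indices of categories whose expected count is below minimum."""
    expected = np.asarray(expected, dtype=float)
    return np.flatnonzero(expected < minimum).tolist()

# Hypothetical model: 30 observations spread over four categories.
expected = 30 * np.array([0.50, 0.30, 0.15, 0.05])
flagged = small_expected_categories(expected)
print(flagged)  # indices of the sparse categories

# Pool the two sparse categories into a single combined category.
merged_expected = np.append(expected[:2], expected[2:].sum())
print(small_expected_categories(merged_expected))  # empty list after merging
```

When categories are merged, the observed counts must be pooled in the same way and the degrees of freedom recomputed from the new number of categories.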


💻 Chi-Squared Goodness of Fit in Python

```python
import numpy as np
from scipy.stats import chisquare

# Observed frequencies
observed = np.array([8, 10, 9, 11, 12, 10])

# Expected frequencies
expected = np.array([10, 10, 10, 10, 10, 10])

# Perform test
chi2, p = chisquare(observed, expected)

print("Chi-Squared Statistic:", round(chi2, 3))
print("P-value:", round(p, 4))
```

Output:

Chi-Squared Statistic: 1.0
P-value: 0.9626

✅ Since p > 0.05, we fail to reject H₀.
There is no significant difference between observed and expected frequencies.


⚠️ Common Mistakes

🚫 Using percentages instead of raw frequencies
🚫 Ignoring small expected frequencies (< 5)
🚫 Applying the test to continuous or correlated data
🚫 Misinterpreting p-values (small p-value → evidence against H₀)
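
The first mistake can be demonstrated numerically. The sketch below reuses the color-preference counts from Example 2 and compares the statistic computed from raw counts with the one computed from percentages; the helper name `chi_squared_statistic` is an illustrative assumption.

```python
import numpy as np

def chi_squared_statistic(observed, expected):
    """Sum of (O - E)^2 / E over all categories."""
    observed = np.asarray(observed, dtype=float)
    expected = np.asarray(expected, dtype=float)
    return float(np.sum((observed - expected) ** 2 / expected))

# Raw counts from Example 2 (200 customers).
counts = np.array([90, 60, 30, 20])
expected_counts = np.array([80, 60, 40, 20])

# The same data expressed as percentages (this is the mistake).
percentages = 100 * counts / counts.sum()
expected_percentages = np.array([40, 30, 20, 10])

stat_counts = chi_squared_statistic(counts, expected_counts)
stat_percent = chi_squared_statistic(percentages, expected_percentages)

print(stat_counts)   # statistic from raw frequencies
print(stat_percent)  # different statistic from percentages
```

Percentages behave like a sample of size 100, so the statistic is rescaled by 100/n (here it is halved) and the resulting p-value no longer reflects the real sample size.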


📚 Real-World Applications

  • Quality control: Checking if defect rates follow expected proportions.
  • Elections: Do votes align with expected voter share?
  • Manufacturing: Are production defects evenly distributed across machines?
  • Marketing: Are customer color or product preferences as predicted?
  • Education: Do grade distributions match the expected curve?

🧭 Final Thoughts

The Chi-Squared Goodness of Fit Test provides a powerful, simple way to test how well data match a theoretical expectation.
It’s a cornerstone of inferential statistics, especially when analyzing categorical or frequency data.

For students and analysts, understanding this test not only strengthens your grasp of hypothesis testing but also builds a foundation for more advanced tools like Chi-Squared Tests of Independence, Regression Residual Analysis, and Model Fit Diagnostics.
