The Chi-Squared (χ²) test is used in biology to determine whether the difference between observed and expected frequencies is statistically significant.
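The formula is:
χ² = Σ (O − E)² / E
Where:
- O is the observed frequency for each group
- E is the expected frequency for each group
- Σ means "the sum of", taken over all groups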
The null hypothesis is a default assumption that there is no significant effect or difference in the data being tested. It represents the idea that any observed variation is due to random chance rather than a real underlying cause. In statistical tests, we use the null hypothesis as a baseline to compare against the alternative hypothesis, which suggests that an effect or difference does exist. The goal of hypothesis testing is to determine whether there is enough evidence to reject the null hypothesis in favour of the alternative.
For this example, we're observing the genders of people who caught a disease.
Setting the null hypothesis: "There will be no statistically significant relationship between gender and the number of people who have a certain disease."
Group | Observed (O) | Expected (E) | (O−E) | (O−E)² | (O−E)² / E |
---|---|---|---|---|---|
Males | 435 | 511 | -76 | 5776 | 11.3 |
Females | 543 | 467 | 76 | 5776 | 12.4 |
Sum | | | | | 23.7 |
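The arithmetic in the table can be reproduced with a few lines of Python. This is only a minimal sketch: the counts are copied from the table, while the names `observed`, `expected` and `chi_squared_terms` are made up for this illustration.

```python
# Observed and expected counts copied from the table above.
observed = {"Males": 435, "Females": 543}
expected = {"Males": 511, "Females": 467}


def chi_squared_terms(observed, expected):
    """Return the (O - E)^2 / E contribution of each group."""
    return {
        group: (observed[group] - expected[group]) ** 2 / expected[group]
        for group in observed
    }


terms = chi_squared_terms(observed, expected)
chi_squared = sum(terms.values())  # the "Sum" row of the table

for group, term in terms.items():
    print(f"{group}: {term:.1f}")          # Males: 11.3, Females: 12.4
print(f"chi-squared = {chi_squared:.1f}")  # chi-squared = 23.7
```

Each group contributes one term, and the test statistic is simply the total of those terms.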
The χ² value is calculated to be 23.7, but what does this mean? To determine if there is a statistically significant relationship in our data, we need to compare our χ² value to a critical values table.
A critical values table for χ² allows us to compare our test statistic to a threshold value to determine whether to reject the null hypothesis. This threshold, known as the critical value, depends on the desired significance level (commonly 5%) and the degrees of freedom involved in the test.
The degrees of freedom (df) affect the shape of the Chi-Squared distribution, and therefore the critical value. We calculate the degrees of freedom as the number of groups (the rows of the table, not counting the Sum row) minus 1, so for our example, 2 (number of groups) − 1 = 1.
By comparing the calculated Chi-Squared statistic to the critical value from the table, we can assess whether the observed results differ significantly from what was expected.
Critical values table for the Chi-Squared test:
Degrees of freedom (df) | Critical value (p = 0.05) |
---|---|
1 | 3.841 |
2 | 5.991 |
3 | 7.815 |
4 | 9.488 |
5 | 11.070 |
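Looking up a critical value can be sketched in Python as a simple dictionary lookup. The values are copied from the table above; the names `CRITICAL_VALUES_5_PERCENT`, `degrees_of_freedom` and `critical_value` are illustrative assumptions, and only df from 1 to 5 are covered because that is all the table lists.

```python
# Critical values at the 5% significance level, copied from the table above.
CRITICAL_VALUES_5_PERCENT = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488, 5: 11.070}


def degrees_of_freedom(number_of_groups):
    """df for this test: number of groups (table rows) minus 1."""
    return number_of_groups - 1


def critical_value(df):
    """Look up the 5% critical value; only df 1 to 5 are tabulated here."""
    if df not in CRITICAL_VALUES_5_PERCENT:
        raise ValueError(f"no critical value tabulated for df={df}")
    return CRITICAL_VALUES_5_PERCENT[df]


df = degrees_of_freedom(2)        # two groups: males and females
print(df, critical_value(df))     # 1 3.841
```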
Conclusion: The χ² value is 23.7. With 1 degree of freedom, the critical value at the 5% significance level (p = 0.05) is 3.841. Because 23.7 is much greater than 3.841, there is a statistically significant difference between the observed and expected results, so the null hypothesis is rejected.
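The decision rule itself is a single comparison, sketched below in Python. As an optional cross-check, the sketch also computes the probability of getting a χ² value at least this large by chance. That part relies on an identity that only holds for 1 degree of freedom (the upper-tail probability equals erfc(√(χ²/2))), so `p_value_df1` should not be used for other df. Both function names are assumptions made for this illustration.

```python
import math


def reject_null(chi_squared, critical_value):
    """Reject the null hypothesis when the statistic exceeds the critical value."""
    return chi_squared > critical_value


def p_value_df1(chi_squared):
    """Upper-tail probability of the chi-squared distribution, valid for df = 1 only."""
    return math.erfc(math.sqrt(chi_squared / 2))


chi_squared = 23.7   # test statistic from the worked example
critical = 3.841     # 5% critical value for df = 1

print(reject_null(chi_squared, critical))  # True
print(p_value_df1(critical))               # very close to 0.05
print(p_value_df1(chi_squared))            # far below 0.05
```

The second line of output shows why 3.841 is the threshold for df = 1: a χ² value of 3.841 corresponds almost exactly to a 5% chance of seeing such a difference when the null hypothesis is true.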
Calculate the chi-squared value for the following data: