[Data Analysis]

Chi-Squared Test

 

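The chi-squared test allows us to determine whether the difference between an observed data set and an expected data set is statistically significant, or whether it could plausibly be due to chance.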

When to use the chi-squared test:

a) It may only be used on raw data counts, not on derived values such as percentages or rates.

b) It is used to compare an experimental result with an expected or theoretical outcome.

c) It is more reliable when the sample size is greater than 20. The greater the sample size, the greater the reliability of the chi-squared test!

The chi-squared test is frequently used in genetics to find out whether the observed results differ significantly from the expected results.
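As a rough illustration of the genetics use case, the sketch below computes the chi-squared statistic for a monohybrid cross with an expected 3:1 ratio. The offspring counts and the function name `chi_squared` are hypothetical, chosen only for this example; the calculation itself is the standard sum of (observed - expected)^2 / expected over all categories.

```python
def chi_squared(observed, expected):
    """Return the chi-squared statistic for paired observed and expected counts."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    # Sum (O - E)^2 / E over every category.
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical data: 400 offspring, 290 dominant and 110 recessive phenotypes.
observed = [290, 110]
total = sum(observed)

# Expected raw counts under a 3:1 ratio (counts, not percentages).
expected = [total * 3 / 4, total * 1 / 4]

statistic = chi_squared(observed, expected)
print(round(statistic, 3))  # 1.333
```

Note that the expected values are converted to raw counts before the comparison, in line with point (a) above.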


The significance level is generally shown as a percentage. A 5% significance level means that, if the observed and expected results truly agree, there is only a 5% (1 in 20) chance of seeing a difference this large through chance alone. A difference that passes this threshold is therefore reported with 95% confidence as being a real difference and not due to chance.

Results which show less than 95% confidence are usually deemed unacceptable. In other words, there is too great a probability that the differences are due to chance for the conclusion to be trusted!
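In practice, the 95% rule is usually applied by comparing the calculated chi-squared value with a critical value from a table. The sketch below assumes the standard critical values at the 5% significance level and takes the degrees of freedom to be the number of categories minus one (a concept not introduced in the text above). The names `CRITICAL_5_PERCENT` and `differs_significantly` are hypothetical.

```python
# Standard chi-squared critical values at the 5% significance level,
# keyed by degrees of freedom (assumed here: number of categories minus one).
CRITICAL_5_PERCENT = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488, 5: 11.070}


def differs_significantly(statistic, degrees_of_freedom):
    """Return True if the statistic exceeds the 5% critical value."""
    return statistic > CRITICAL_5_PERCENT[degrees_of_freedom]


# Two categories give one degree of freedom.
print(differs_significantly(1.333, 1))  # False: difference may be due to chance
print(differs_significantly(4.2, 1))    # True: significant at the 5% level
```

A result of `False` does not prove that the observed and expected data match; it only means the difference is not large enough to rule out chance at this level.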


Σ = sum of
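The "sum of" symbol defined above belongs to the chi-squared formula, which does not appear in the text as it stands. For reference, its standard form is given below; the labels O (observed count) and E (expected count) are the conventional ones and are assumed here rather than taken from the original.

```latex
\chi^2 = \sum \frac{(O - E)^2}{E}
```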


Slichter