Suppose we want to answer the age-old question: are women better drivers than men? There are all sorts of ways to start answering the question, but in the end we want a simple yes, no or maybe.
We can perform a hypothesis test, say at the 5% level:
H0: There is no difference in pass rates for men and women
H1: There is a difference in pass rates for men and women
We can draw up a contingency table to show all the outcomes for all the subjects.
This is the OBSERVED table:
| | Pass | Fail | Total |
|---|---|---|---|
| Men | 14 | 43 | 57 |
| Women | 31 | 28 | 59 |
| Total | 45 | 71 | 116 |
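If there were no difference between the pass rates for men and women, we would expect the overall pass rate of 45/116 to apply to both groups. The expected number of men who pass, for example, would then be equal to $\frac{57 \times 45}{116} \approx 22.11$.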
In general, to find the expected number in any cell of the table, given no difference in pass rates for men and women, we find

$$E = \frac{\text{row total} \times \text{column total}}{\text{grand total}}$$
We obtain the EXPECTED table:
| | Pass | Fail | Total |
|---|---|---|---|
| Men | (45 × 57)/116 = 22.11 | (71 × 57)/116 = 34.89 | 57 |
| Women | (45 × 59)/116 = 22.89 | (71 × 59)/116 = 36.11 | 59 |
| Total | 45 | 71 | 116 |
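The expected counts can be checked with a few lines of Python. This is only a sketch: the variable names are illustrative, and the observed counts are typed in from the OBSERVED table.

```python
# Observed counts: rows are Men, Women; columns are Pass, Fail.
observed = [[14, 43],
            [31, 28]]

row_totals = [sum(row) for row in observed]          # [57, 59]
col_totals = [sum(col) for col in zip(*observed)]    # [45, 71]
grand_total = sum(row_totals)                        # 116

# Expected count in each cell = row total * column total / grand total.
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

for label, row in zip(["Men", "Women"], expected):
    print(label, [round(x, 2) for x in row])
# Men [22.11, 34.89]
# Women [22.89, 36.11]
```

Each row and column of the expected table keeps the same totals as the observed table, which is a useful check on the arithmetic.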
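We now find the test statistic, where $O$ is an observed count and $E$ the corresponding expected count:

$$\chi^2 = \sum \frac{(O - E)^2}{E}$$

$$\chi^2 = \frac{(14 - 22.11)^2}{22.11} + \frac{(43 - 34.89)^2}{34.89} + \frac{(31 - 22.89)^2}{22.89} + \frac{(28 - 36.11)^2}{36.11} \approx 9.56$$

(The value 9.56 is obtained when the unrounded expected values are used.)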
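The distribution of $\chi^2$ is a $\chi^2$ distribution with $(2 - 1)(2 - 1) = 1$ degree of freedom.
From the $\chi^2$ tables, the critical value at the 5% level with 1 degree of freedom is 3.841. Our test statistic is larger than this, so we reject H0. From the observed table, women have a higher pass rate (31/59 ≈ 53% against 14/57 ≈ 25%).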
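The whole test can be sketched in plain Python as a check on the hand calculation. Some assumptions are made here: the statistic is the usual Pearson sum of (O − E)²/E with no continuity (Yates) correction, the critical value 3.841 is the standard 5% table value for 1 degree of freedom, and the p-value line uses an identity that holds only for 1 degree of freedom. The variable names are illustrative.

```python
import math

# Observed counts: rows are Men, Women; columns are Pass, Fail.
observed = [[14, 43],
            [31, 28]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

# Pearson chi-squared statistic: sum over all cells of (O - E)^2 / E.
# Assumption: no continuity (Yates) correction is applied.
chi_sq = sum((o - e) ** 2 / e
             for obs_row, exp_row in zip(observed, expected)
             for o, e in zip(obs_row, exp_row))

# Degrees of freedom = (rows - 1) * (columns - 1).
df = (len(observed) - 1) * (len(observed[0]) - 1)

# 5% critical value for 1 degree of freedom, taken from standard tables.
CRITICAL_5_PERCENT_1_DF = 3.841
reject_h0 = chi_sq > CRITICAL_5_PERCENT_1_DF

# Valid for 1 degree of freedom only: a chi-squared variable with 1 df is
# the square of a standard normal, so P(X > x) = erfc(sqrt(x / 2)).
p_value = math.erfc(math.sqrt(chi_sq / 2))

# Observed pass rates, to see the direction of the difference.
pass_rates = {label: row[0] / sum(row)
              for label, row in zip(["Men", "Women"], observed)}

print(f"chi-squared = {chi_sq:.2f}, df = {df}")
print(f"p-value = {p_value:.4f}")
print("reject H0" if reject_h0 else "do not reject H0")
print({k: round(v, 3) for k, v in pass_rates.items()})
```

The test itself only says that the pass rates differ; the direction of the difference comes from comparing the observed pass rates.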