Suppose we have a sample of data. We can calculate the mean easily, but the mean is specific to each sample. If we take another sample and calculate a new mean, the new mean and the old mean may be different. We might want to know how reliable our estimate of the mean is. Are we 90% confident? 95% confident? 99% confident?
To answer this we can find a confidence interval. If, for example, we construct a 90% confidence interval, we can say that if we take many samples and construct an interval from each one, then about 90% of those intervals will contain the true mean.
If it is known that the underlying distribution is normal, and we know the true (population) standard deviation, then we can use the expression for a normal confidence interval:
$$\bar{x} \pm z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$$
where
$\bar{x}$ is the mean of the sample,
$n$ is the sample size, and
$\sigma$ is the population standard deviation.
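One way to see what the confidence level means is to simulate it. The sketch below is only an illustration: the true mean of 3.5, the standard deviation of 1.7, the sample size of 6, the number of trials and the seed are all assumed values, and `coverage` is a hypothetical helper name. It draws many normal samples, builds a 90% interval around each sample mean, and reports the fraction of intervals that contain the true mean.

```python
import random
from statistics import NormalDist, fmean


def coverage(true_mean=3.5, sigma=1.7, n=6, level=0.90, trials=20000, seed=0):
    """Fraction of simulated intervals that contain true_mean.

    All default values are assumptions chosen for illustration.
    """
    rng = random.Random(seed)
    # Two-sided critical value: leaves (1 - level) / 2 in each tail.
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    half_width = z * sigma / n ** 0.5
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        xbar = fmean(sample)
        if xbar - half_width <= true_mean <= xbar + half_width:
            hits += 1
    return hits / trials


print(coverage())
```

The printed fraction should land close to 0.90, and it moves towards the chosen level as the number of trials grows.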
Suppose then we have the sample 2.3, 4.2, 5.3, 2.4, 2.6, 4.7 of lengths of French snails, and we know that the lengths of snails are normally distributed with a standard deviation of 1.7.
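The mean of our sample is $\bar{x} = 21.5/6 \approx 3.583$. We need to find the value of $z_{\alpha/2}$ corresponding to a confidence level of 90%, or 0.9. This means a rejection region of area $\alpha/2 = (1 - 0.9)/2 = 0.05$ at each of the upper and lower ends.
From the Normal tables, for an upper-tail area of 0.05, $z_{0.05} = 1.645$.
The confidence interval is then
$$3.583 \pm 1.645 \times \frac{1.7}{\sqrt{6}} \approx 3.583 \pm 1.142 \approx (2.442, 4.725)$$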
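The calculation for the known standard deviation case can be checked with a few lines of Python. This is a minimal sketch using only the standard library; `z_interval` is a hypothetical helper name, and the critical value is taken from `statistics.NormalDist` rather than from printed tables, so it differs from 1.645 in the fourth decimal place.

```python
from math import sqrt
from statistics import NormalDist, fmean

lengths = [2.3, 4.2, 5.3, 2.4, 2.6, 4.7]  # snail lengths from the text
sigma = 1.7                               # known population standard deviation
level = 0.90                              # confidence level


def z_interval(data, sigma, level):
    """Normal confidence interval for the mean when sigma is known."""
    n = len(data)
    xbar = fmean(data)
    # Critical value leaving (1 - level) / 2 in the upper tail.
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    half_width = z * sigma / sqrt(n)
    return xbar - half_width, xbar + half_width


low, high = z_interval(lengths, sigma, level)
print(f"{low:.3f} to {high:.3f}")
```

Raising `level` to 0.95 or 0.99 makes the interval wider, since a larger critical value is needed to capture the true mean more often.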
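Suppose instead that we didn't know the population standard deviation, but we knew that the underlying distribution of the lengths was normal. We can find an estimate for the standard deviation, called the sample standard deviation, from the original sample. We label it $s$:
$$s = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2} \approx 1.311$$
Now though, since we estimated $\sigma$ by $s$, we must use the t-distribution with $n-1=5$ degrees of freedom, for which $t_{0.05,5}=2.015$. The confidence interval is
$$\bar{x} \pm t_{0.05,5}\frac{s}{\sqrt{n}} = 3.583 \pm 2.015 \times \frac{1.311}{\sqrt{6}} \approx 3.583 \pm 1.078 \approx (2.505, 4.661)$$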
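The same check can be sketched for the case where the standard deviation is estimated from the sample. The Python standard library has no t-distribution quantile function, so this sketch simply hard-codes the table value 2.015 quoted above; `T_CRIT` is a name introduced here for illustration.

```python
from math import sqrt
from statistics import fmean, stdev

lengths = [2.3, 4.2, 5.3, 2.4, 2.6, 4.7]  # snail lengths from the text
T_CRIT = 2.015  # t table value for upper-tail area 0.05, 5 degrees of freedom

n = len(lengths)
xbar = fmean(lengths)
s = stdev(lengths)  # sample standard deviation (divides by n - 1)

half_width = T_CRIT * s / sqrt(n)
low, high = xbar - half_width, xbar + half_width
print(f"s = {s:.3f}, interval = {low:.3f} to {high:.3f}")
```

Note that `statistics.stdev` uses the $n-1$ denominator, which is what the t-interval requires; `statistics.pstdev` divides by $n$ and would give a smaller value.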