An estimator $\hat{\theta}$ for a statistical parameter $\theta$ is said to be biased if $E(\hat{\theta}) \neq \theta$.
Bias is often impossible to avoid in practice and must be taken into account when statistical calculations are performed.
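Example: To estimate the number, $N$, of nesting birds, scientists catch 100, tag them and release them. The fraction of birds with tags is $\frac{100}{N}$.
Later, they catch another 100 and count the number with tags. Suppose $n$ of these birds have tags, so the probability of a randomly picked bird from this sample having a tag is $\frac{n}{100}$.
Equating these two fractions gives $\frac{100}{N} = \frac{n}{100}$, so that $nN = 10000$ and $N = \frac{10000}{n}$.
In fact $n$ is a random variable since it will vary between samples, so $\frac{10000}{n}$ is an estimate for $N$. We write $\hat{N} = \frac{10000}{n}$.
The expected value for $n$ is $\frac{10000}{N}$, but it is possible that $n = 0$, so that $\hat{N}$ is infinite using this estimator. In fact, $N$ is obviously finite, whereas an estimator that can be infinite has no finite expected value, so that $E(\hat{N}) \neq N$ and the estimator $\hat{N}$ is biased.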
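As a rough numerical check of the bird example, the sketch below computes the expected value of the estimator 10000/n exactly instead of simulating it. It assumes the second catch is made without replacement, so that the number n of tagged birds follows a hypergeometric distribution. The function name `tag_count_pmf` and the population sizes 150 and 1000 are illustrative choices, not part of the example above.

```python
from math import comb

def tag_count_pmf(N, tagged=100, caught=100):
    """Return {n: P(n)} for the number n of tagged birds in the second catch.

    Assumption: birds are caught without replacement, so n is hypergeometric.
    """
    total = comb(N, caught)
    smallest = max(0, caught - (N - tagged))  # fewest tagged birds possible
    largest = min(tagged, caught)             # most tagged birds possible
    return {
        n: comb(tagged, n) * comb(N - tagged, caught - n) / total
        for n in range(smallest, largest + 1)
    }

# Hypothetical true population. With N = 150 there are only 50 untagged
# birds, so n is at least 50 and 10000/n is always finite.
N = 150
pmf = tag_count_pmf(N)

mean_n = sum(n * p for n, p in pmf.items())              # E(n)
mean_N_hat = sum(10000 / n * p for n, p in pmf.items())  # E(10000/n)
bias = mean_N_hat - N

# For a larger hypothetical population, n = 0 has positive probability.
p_zero = tag_count_pmf(1000)[0]

print("E(n)       =", mean_n)
print("10000/N    =", 10000 / N)
print("E(N_hat)   =", mean_N_hat)
print("bias       =", bias)
print("P(n = 0) for N = 1000:", p_zero)
```

Under these assumptions E(n) agrees with 10000/N, but E(10000/n) comes out larger than N, so the estimator overestimates on average even when n = 0 cannot occur.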
More subtle examples of bias are given by considering the mode and median as estimators for the mean.
Suppose we have 100 people. 80 of the people are labelled with a 1 and 20 are labelled with a 0 (probably signifying, like me, that their net wealth is zero).
The mode is 1 but the mean is $0.8 \times 1 + 0.2 \times 0 = 0.8$.
The bias of an estimator $\hat{\theta}$ for a parameter $\theta$ is $E(\hat{\theta}) - \theta$.
The bias of the mode as an estimator for the mean is $1 - 0.8 = 0.2$.
The 100 people are lined up in numerical order. First in line are the 20 people labelled with a 0, and then the 80 people labelled with a 1.
The median is obviously 1, but the mean is 0.8, as before.
The bias of the median as an estimator for the mean is $1 - 0.8 = 0.2$.
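The arithmetic for the 100 labelled people can be checked with Python's standard `statistics` module. This is only a small sketch: the list `labels` and the variable names are illustrative, and the bias is taken as the mode or median minus the mean, as in the text.

```python
from statistics import mean, median, mode

# 20 people labelled 0 followed by 80 people labelled 1, in numerical order.
labels = [0] * 20 + [1] * 80

true_mean = mean(labels)        # 0.8 * 1 + 0.2 * 0
mode_value = mode(labels)       # most common label
median_value = median(labels)   # average of the 50th and 51st labels

bias_of_mode = mode_value - true_mean
bias_of_median = median_value - true_mean

print("mean   =", true_mean)
print("mode   =", mode_value, " bias =", round(bias_of_mode, 10))
print("median =", median_value, " bias =", round(bias_of_median, 10))
```

The biases are rounded before printing only to hide floating-point noise in the subtraction.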