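Suppose we have {jatex options:inline}X \sim B(n, p){/jatex} with {jatex options:inline}p > 0.5{/jatex} and we want to find {jatex options:inline}P(X \le r){/jatex}. If there are {jatex options:inline}r{/jatex} or fewer successes in {jatex options:inline}n{/jatex} attempts, then there must be {jatex options:inline}n - r{/jatex} or more failures. We can represent the distribution of the number of failures in n attempts as {jatex options:inline}Y \sim B(n, q){/jatex} with {jatex options:inline}q = 1 - p{/jatex}; then we find {jatex options:inline}P(X \le r) = P(Y \ge n - r) = 1 - P(Y \le n - r - 1){/jatex}. We can look up {jatex options:inline}P(Y \le n - r - 1){/jatex} in the binomial tables to find {jatex options:inline}P(X \le r){/jatex}.
Suppose, with the same distribution, that we want to find {jatex options:inline}P(X \ge r){/jatex}. There are {jatex options:inline}r{/jatex} or more successes, so {jatex options:inline}n - r{/jatex} or fewer failures. As before, model the number of failures by {jatex options:inline}Y \sim B(n, 1 - p){/jatex}; then {jatex options:inline}P(X \ge r) = P(Y \le n - r){/jatex}. We can look up {jatex options:inline}P(Y \le n - r){/jatex} in the binomial tables.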
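The switch from counting successes to counting failures can be checked numerically. The following is a minimal sketch, not part of the original notes: the values n = 10, p = 0.7 and r = 6 are assumed purely for illustration, and the helper names are hypothetical.

```python
from math import comb

def binom_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ B(n, p); returns 0 when k < 0."""
    return sum(binom_pmf(i, n, p) for i in range(0, k + 1))

# Assumed illustrative values (not from the original notes).
n, p, r = 10, 0.7, 6

# Lower tail: r or fewer successes is the same event as n - r or more failures.
direct = binom_cdf(r, n, p)                         # P(X <= r)
via_failures = 1 - binom_cdf(n - r - 1, n, 1 - p)   # P(Y >= n - r)

# Upper tail: r or more successes is the same event as n - r or fewer failures.
upper_direct = 1 - binom_cdf(r - 1, n, p)           # P(X >= r)
upper_via_failures = binom_cdf(n - r, n, 1 - p)     # P(Y <= n - r)

print(direct, via_failures)
print(upper_direct, upper_via_failures)
```

Each pair of printed values should agree up to rounding, which is why tables that only go up to p = 0.5 are enough.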
The Normal Distribution – can be used to model symmetric bell shaped distributions. In theory it cannot be used to model a distribution if there are restrictions on the values the distribution may take. For example, we cannot theoretically use it to model the lengths of snails, because there exists a lower limit of 0, but in practice the normal distribution is often used to model such situations. The normal distribution is continuous, but with a continuity correction it can also be used to approximate a discrete distribution.
The Binomial Distribution – can be used in any situation where the trials are independent and the probability of success is fixed. The binomial distribution models the number of successes in n trials, where n is a fixed number. The binomial distribution is a discrete distribution, since in n trials the number of successes is an integer.
The Poisson Distribution – can be used to model any distribution where events happen at a certain rate per unit time, or misprints happen on average at so many per page. Under some circumstances, where p is small and n is large (greater than 30), the Poisson distribution can be used as an approximation to the binomial distribution. The Poisson distribution is a discrete distribution, since the number of events in each time period is an integer.
The Geometric Distribution – used to model the number of attempts until the first success. The probability p has to be fixed, so this distribution cannot be used to model learning games. The geometric distribution is a discrete distribution, since the number of attempts must be an integer.
The Uniform Distribution – the probability of each outcome is the same. The set of values that may be taken has finite upper and lower limits, meaning that any observed value must be between two numbers. The uniform distribution can be either continuous or discrete.
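The Poisson approximation to the binomial mentioned in the list above can be illustrated by comparing the two sets of probabilities directly. This is a rough sketch only: n = 100 and p = 0.02 are assumed values chosen so that p is small and n is large, and the helper names are hypothetical.

```python
from math import comb, exp, factorial

def binom_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def poisson_pmf(k, lam):
    """P(X = k) for X ~ Po(lam)."""
    return exp(-lam) * lam**k / factorial(k)

# Assumed illustrative values: small p, large n.
n, p = 100, 0.02
lam = n * p  # the approximating Poisson distribution has mean np

for k in range(0, 6):
    print(k, round(binom_pmf(k, n, p), 4), round(poisson_pmf(k, lam), 4))

# Largest difference between the two probabilities over k = 0..10.
max_gap = max(abs(binom_pmf(k, n, p) - poisson_pmf(k, lam)) for k in range(0, 11))
print("largest gap:", max_gap)
```

With these assumed values the two columns differ only in the third decimal place; the gap widens as p grows.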
It is a bell shaped curve, symmetrical about the mean. It has no skew.
It is very unlikely, though possible, that values occur which are more than three standard deviations below or above the mean.
There are no upper or lower cut off points. In theory a random variable which has a normal distribution may take any value from {jatex options:inline}-\infty{/jatex} to {jatex options:inline}\infty{/jatex}.
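To put a number on "very unlikely", the probability of landing more than three standard deviations from the mean can be computed from the standard normal distribution. A small sketch using Python's standard library:

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, standard deviation 1

# Probability of being more than 3 standard deviations from the mean,
# counting both tails.
outside_3sd = 2 * (1 - standard_normal.cdf(3))
print(outside_3sd)
```

The result is roughly 0.0027, or about 1 value in 370.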
Of course, if we take a random sample of some variable we would not expect the sample to be exactly normally distributed, nor exactly bell shaped, and maybe there are some quite extreme values more than three standard deviations below or above the mean. But if the conditions are 'approximately' met, or a histogram of values is 'not too far' from what we would expect from a normal distribution, then we can often take the normal distribution to be suitable.
The histogram on the left indicates a normal distribution is plausible. The histogram on the right indicates positive skew, so a normal distribution is unlikely to be suitable.
We may be able to reject a normal distribution as suitable on theoretical grounds if the set of possible values is limited in some way. Heights and weights may not take values less than zero, so in theory these cannot be modelled by a normal distribution, though in fact they often are.
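Concisely, we solve {jatex options:inline}\int_a^m f(x)\,dx = \frac{1}{2}{/jatex} for {jatex options:inline}m{/jatex}, where {jatex options:inline}a{/jatex} is the lower limit of the range of {jatex options:inline}x{/jatex}.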
Example: Find the median of the distribution given by
The median of a probability distribution is the halfway point. Half the values lie either side of the median.
Example: Find the mode of the probability distribution
We need to solve {jatex options:inline}f'(x) = 0{/jatex}. We expand the brackets to obtain
The mode is the midpoint of the interval [0,1] over which the distribution is defined. This is to be expected since the function {jatex options:inline}f(x){/jatex} is symmetric about {jatex options:inline}x = \frac{1}{2}{/jatex}.
Example: Find the median of the probability distribution
We need to solve {jatex options:inline}\int_0^m f(x)\,dx = \frac{1}{2}{/jatex}. We expand the brackets to obtain
The expression above factorises to give
The median is the midpoint of the interval [0,1] over which the distribution is defined. This is to be expected since the function {jatex options:inline}f(x){/jatex} is symmetric about {jatex options:inline}x = \frac{1}{2}{/jatex}, so that half the area is on either side.
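The two calculations can be sketched numerically. The density used below, f(x) = 6x(1 - x) on [0,1], is an assumption made only for illustration: it is symmetric about 1/2 and integrates to 1, but it is not claimed to be the density of the original examples. The helper `bisect_root` is likewise hypothetical.

```python
def pdf(x):
    """Assumed density f(x) = 6x(1 - x) on [0, 1] (illustrative only)."""
    return 6 * x * (1 - x)

def cdf(x):
    """Integral of the assumed density from 0 to x: 3x^2 - 2x^3."""
    return 3 * x**2 - 2 * x**3

def bisect_root(g, lo, hi, tol=1e-12):
    """Find a root of g in [lo, hi] by bisection; g(lo), g(hi) must differ in sign."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if g(lo) * g(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2

# Median: solve F(m) = 1/2.
median = bisect_root(lambda m: cdf(m) - 0.5, 0.0, 1.0)

# Mode: solve f'(x) = 0, where f'(x) = 6 - 12x for the assumed density.
mode = bisect_root(lambda x: 6 - 12 * x, 0.0, 1.0)

print(median, mode)
```

For this assumed density both values come out at 0.5, the midpoint of [0,1], as the symmetry argument predicts.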
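The mean is given by {jatex options:inline}E(X) = \int_a^b x f(x)\,dx{/jatex}, where {jatex options:inline}a{/jatex} and {jatex options:inline}b{/jatex} are the lower and upper limits of the distribution, the minimum and maximum values the random variable can take respectively.
In general the lower or upper limits of the above integrals may be infinite.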
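As a rough numerical check of the formula for the mean, the integral can be approximated with the midpoint rule. The density f(x) = 6x(1 - x) on [0,1] is again an assumed example, and `integrate` is a hypothetical helper rather than a library function.

```python
def pdf(x):
    """Assumed density f(x) = 6x(1 - x) on [0, 1] (illustrative only)."""
    return 6 * x * (1 - x)

def integrate(g, a, b, steps=10000):
    """Approximate the integral of g from a to b with the midpoint rule."""
    h = (b - a) / steps
    return h * sum(g(a + (i + 0.5) * h) for i in range(steps))

a, b = 0.0, 1.0  # lower and upper limits of the assumed distribution

total_area = integrate(pdf, a, b)               # should be close to 1
mean = integrate(lambda x: x * pdf(x), a, b)    # E(X) = integral of x f(x)

print(total_area, mean)
```

The exact mean for this assumed density is 1/2, which again matches the midpoint of the interval.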
Typically the statement is about the mean of a distribution or the probability of an event occurring. The null hypothesis is the value that we suppose this mean or probability has for a certain probability distribution. This can be because, for example:
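The mean {jatex options:inline}\mu{/jatex} (or probability {jatex options:inline}p{/jatex}) has had this value, {jatex options:inline}\mu_0{/jatex} (or {jatex options:inline}p_0{/jatex}), for a while, and we want to see if the latest set of data indicates a change. In this case, our null hypothesis {jatex options:inline}H_0{/jatex} would be that the mean {jatex options:inline}\mu{/jatex} (or probability {jatex options:inline}p{/jatex}) has this longstanding value, and the alternative hypothesis is {jatex options:inline}\mu \neq \mu_0{/jatex} (or {jatex options:inline}p \neq p_0{/jatex}).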
Some manufacturer has made a claim about the superiority of his product over the product of some other manufacturer. He might claim that 80% of cats prefer the 'Catlove' brand of cat food, manufactured by his company. In this case {jatex options:inline}H_0{/jatex} could be {jatex options:inline}p = 0.8{/jatex} and the alternative hypothesis could be {jatex options:inline}p < 0.8{/jatex}.
When the null and alternative hypotheses are drawn up, there is often a claim that is to be tested. In the first example above, there is no claim of an increase in the mean, so the hypothesis test is conducted merely in order to see if there is evidence that the mean (or probability) has changed, not specifically increased or decreased. Of course, in order to change, the mean (or probability) must either increase or decrease, but the direction of the change is not part of the test. In the second example above, the manufacturer of cat food is making a suspect claim about the love of cats for his company's brand of cat food, and it must be suspected that in fact less than 80% of cats prefer his company's brand. In this case therefore, as stated above, the null hypothesis would be that {jatex options:inline}p = 0.8{/jatex}, tested against the alternative {jatex options:inline}p < 0.8{/jatex}.
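A one tailed test of the cat food claim might be sketched as follows. The sample (20 cats, of which 12 prefer the brand) and the 5% significance level are invented for illustration and do not come from the notes; the helper names are hypothetical.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ B(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(0, k + 1))

# Hypotheses: H0: p = 0.8 against H1: p < 0.8 (one tailed, lower tail).
p0 = 0.8
alpha = 0.05          # assumed significance level

# Assumed sample: 12 of 20 cats prefer the brand.
n, observed = 20, 12

# Probability of a result this low or lower if H0 is true.
p_value = binom_cdf(observed, n, p0)
reject = p_value < alpha

print(round(p_value, 4), reject)
```

Because the alternative is p < 0.8, only the lower tail is used and the whole 5% sits in that tail.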
The numbers 1 – 49 are listed below.
1 
2 
3 
4 
5 
6 
7 
8 
9 
10 
11 
12 
13 
14 
15 
16 
17 
18 
19 
20 
21 
22 
23 
24 
25 
26 
27 
28 
29 
30 
31 
32 
33 
34 
35 
36 
37 
38 
39 
40 
41 
42 
43 
44 
45 
46 
47 
48 
49 

The digits 1 to 4 each appear 15 times, but 1 appears twice in 11, 2 appears twice in 22, 3 appears twice in 33 and 4 appears twice in 44. Counting each of these as one occurrence, the digits 1 to 4 each appear 14 times.
The digits 5 to 9 each appear 5 times.
The digit 0 appears 4 times.
There are 4*14+5*5+4=85 occurrences altogether.
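The digit counts above are easy to confirm by brute force. A short sketch (variable names are my own):

```python
from collections import Counter

occurrences = Counter()         # every digit occurrence, so 11 counts '1' twice
numbers_containing = Counter()  # each number counts a digit at most once

for number in range(1, 50):
    digits = str(number)
    occurrences.update(digits)
    numbers_containing.update(set(digits))

total = sum(numbers_containing.values())

print(occurrences["1"], numbers_containing["1"])   # raw count vs. one per number
print(numbers_containing["5"], numbers_containing["0"])
print(total)
```

Counting a repeated digit once per number gives 14 for each of the digits 1 to 4, 5 for each of 5 to 9, 4 for the digit 0, and 85 in total.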
The probability that 0 will appear just once in six balls is {jatex options:inline}6 \times (\frac{4}{85})(\frac{81}{84})(\frac{80}{83}) (\frac{79}{82})(\frac{78}{81}) (\frac{77}{80}){/jatex}.
Using a Test Statistic
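We have a null distribution {jatex options:inline}X \sim N(\mu, \sigma^2){/jatex} and a single observation {jatex options:inline}x{/jatex}. We find {jatex options:inline}z = \frac{x - \mu}{\sigma}{/jatex}. Suppose we are conducting a 5% test. We find the critical value of the test statistic corresponding to 5%, assuming a two tailed test, by calculating {jatex options:inline}1 - 0.025 = 0.975{/jatex} and looking up the corresponding value of {jatex options:inline}z{/jatex} in the normal distribution table; we find {jatex options:inline}z = 1.96{/jatex}. If the magnitude of the value of z we found using {jatex options:inline}z = \frac{x - \mu}{\sigma}{/jatex} is bigger than this, we reject the null hypothesis.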
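The test statistic method can be sketched with Python's standard library in place of the normal tables. The null mean 50, standard deviation 4 and observation 59 are assumed values chosen only to make the example concrete.

```python
from statistics import NormalDist

# Assumed illustrative values (not from the original notes).
mu0, sigma = 50.0, 4.0   # null distribution N(50, 4^2)
x = 59.0                 # the single observation
alpha = 0.05             # 5% two tailed test

# Test statistic for the observation.
z = (x - mu0) / sigma

# Critical value: the z with cumulative probability 1 - 0.025 = 0.975.
z_crit = NormalDist().inv_cdf(1 - alpha / 2)

reject = abs(z) > z_crit
print(z, round(z_crit, 2), reject)
```

Here z = 2.25 exceeds the critical value of about 1.96, so under these assumed numbers the null hypothesis is rejected.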
Using a Probability
We have a null distribution {jatex options:inline}X \sim N(\mu, \sigma^2){/jatex} and a single observation {jatex options:inline}x{/jatex}. We find {jatex options:inline}z = \frac{x - \mu}{\sigma}{/jatex}. Suppose we are conducting a 5% test. Everything so far is as it was above. But now, instead of finding the critical value corresponding to 5% (assuming a two tailed test), we use the calculated value of z to find a tail probability. If the probability we find is less than 2.5%, since we are conducting a two tailed test, we reject the null hypothesis.
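The probability method can be sketched in the same way. The numbers are the same assumed values as one might use for the test statistic method (null distribution N(50, 4^2), observation 59), chosen purely for illustration.

```python
from statistics import NormalDist

# Assumed illustrative values (not from the original notes).
mu0, sigma = 50.0, 4.0
x = 59.0
alpha = 0.05             # 5% two tailed test, so 2.5% in each tail

z = (x - mu0) / sigma

# Probability of a value at least this far from the mean, in one tail.
tail_prob = 1 - NormalDist().cdf(abs(z))

reject = tail_prob < alpha / 2
print(round(tail_prob, 4), reject)
```

The tail probability is about 0.012, which is below 0.025, so this method reaches the same decision as comparing z with 1.96.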
Things to Remember
Big test statistic implies reject null hypothesis and small test statistic implies do not reject null hypothesis.
Small probability implies reject null hypothesis and big probability implies do not reject null hypothesis.
The test statistic is related to the probability of obtaining a result at least as extreme as the one observed if the null hypothesis is true: if the test statistic is large this probability is small, so reject the null hypothesis; if the test statistic is small this probability is large, so do not reject the null hypothesis.
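Suppose we are conducting a 5% two tailed test. The null hypothesis is that the mean is 9. The test is based on a sample size of 1, so we split the 5% into two parts of 2.5% each. We now have to find the sets of observations corresponding to these two 2.5% tails. Actually we find:
a value {jatex options:inline}c_1{/jatex} such that {jatex options:inline}P(X \le c_1) \le 0.025{/jatex}
a value {jatex options:inline}c_2{/jatex} such that {jatex options:inline}P(X \ge c_2) \le 0.025{/jatex} or {jatex options:inline}P(X \le c_2 - 1) \ge 0.975{/jatex}
From the cumulative tables for the Poisson distribution we find, for the lower critical value {jatex options:inline}c_1{/jatex}, that the closest probability to 0.025 that is also less than 0.025 at the lower end is 0.021, corresponding to {jatex options:inline}P(X \le 3){/jatex}, and, for the upper critical value {jatex options:inline}c_2{/jatex}, that the closest probability to 0.975 that is also greater than 0.975 is 0.978, and this is {jatex options:inline}P(X \le 15){/jatex} since the Poisson tables are cumulative, so we take {jatex options:inline}X \ge 16{/jatex} as the critical region and {jatex options:inline}16{/jatex} as the critical value.
The significance level is the sum of the areas of the upper and lower critical regions: 0.021 + 0.022 = 0.043. The actual significance level is always less than or equal to the required value stated at the start of the hypothesis test.
Hence for a Poisson distribution, we reject the null hypothesis that {jatex options:inline}\lambda = 9{/jatex} if we have a single observation {jatex options:inline}X \le 3{/jatex} or {jatex options:inline}X \ge 16{/jatex}.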
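The table lookup for a Poisson distribution with mean 9 can be reproduced by computing the cumulative probabilities directly. This is a sketch with hypothetical helper names; the search limit of 50 is an arbitrary cap that is ample for a mean of 9.

```python
from math import exp, factorial

def poisson_cdf(k, lam):
    """P(X <= k) for X ~ Po(lam)."""
    return sum(exp(-lam) * lam**i / factorial(i) for i in range(k + 1))

lam = 9        # mean under the null hypothesis
alpha = 0.05   # 5% two tailed test, 2.5% per tail

# Lower critical value: largest c with P(X <= c) <= 0.025.
lower = max(c for c in range(0, 50) if poisson_cdf(c, lam) <= alpha / 2)

# Upper critical value: smallest c with P(X >= c) = 1 - P(X <= c - 1) <= 0.025.
upper = min(c for c in range(1, 50) if 1 - poisson_cdf(c - 1, lam) <= alpha / 2)

# Actual significance level: total probability of the two critical regions.
actual_level = poisson_cdf(lower, lam) + (1 - poisson_cdf(upper - 1, lam))

print(lower, upper, round(actual_level, 3))
```

The computed critical values are 3 and 16, and the actual significance level is about 0.043, in line with the values read from the tables.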
Example. Bulbs are packed in boxes of 20. Over a long period of time it has been observed that too many of the bulbs are faulty. The factory is revamped and the management wants to ensure that such a disgraceful state of affairs never occurs again. They say that the probability of a bulb being faulty must be no higher than 0.10. Find the critical region and comment.
This will be a one tailed test since we wish to protect against the possibility of the proportion of faulty bulbs rising.
From the binomial distribution tables, assuming p=0.1 we find The critical value is
The test is not really useful since we would reject the null hypothesis always. To improve the test, we need to increase {jatex options:inline}n{/jatex} so that the test is based on a sample of more than 20.