## Computer Arithmetic

Consider a computer that uses 20-bit floating point numbers of the form $(-1)^s \times 0.m \times 2^{n-p}$, with a 1-bit sign indicator $s$, a 7-bit exponent $n$, and a 12-bit mantissa $m$, stored as binary numbers. The most significant bit of the mantissa must be 1. $p$ is a bias subtracted from $n$ to represent both positive and negative exponents.

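Note that the sign bit is $s = 0$ for positive numbers and $s = 1$ for negative numbers, and the maximum value of the 7-bit exponent is $n = 1111111_2$, i.e. $2^7 - 1 = 127$. The length of the exponent controls the range of numbers that can be represented. To ensure, however, that numbers with small magnitude can be represented as accurately as numbers with large magnitude, we subtract the bias $p = 63$ from the exponent $n$. Thus, the effective range of the exponent is not $\{0, \dots, 127\}$ but $\{-63, \dots, 64\}$.

The minimum value of the mantissa, read as the binary fraction $0.m$, is $0.100000000000_2 = 0.5$ and its maximum value is $0.111111111111_2 = 1 - 2^{-12}$. Thus, $0.5 \le 0.m < 1$. The absolute value of the largest floating point number that can be stored in the computer is $(1 - 2^{-12}) \times 2^{64} \approx 1.8 \times 10^{19}$. Computations whose results exceed this magnitude produce an overflow error.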

The smallest absolute number that can be stored is $0.5 \times 2^{-63} = 2^{-64} \approx 5.4 \times 10^{-20}$. Similarly, computations whose results are smaller in magnitude than this produce an underflow error.
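The limits quoted above follow directly from the field widths, and can be checked with a few lines of Python. This is only a sketch: the constant names are illustrative, and the bias of 63 is taken from the worked example in this section rather than from any standard.

```python
# Field widths of the toy 20-bit format described in the text.
EXP_BITS = 7
MANT_BITS = 12
BIAS = 63  # assumption: value implied by the worked example (73 - 63 = 10)

# Stored exponent n runs over 0..127; the effective exponent is n - BIAS.
n_max = 2 ** EXP_BITS - 1
exp_min, exp_max = 0 - BIAS, n_max - BIAS

# The leading mantissa bit must be 1, so 0.5 <= 0.m <= 1 - 2**-12.
mant_min = 0.5
mant_max = 1 - 2.0 ** -MANT_BITS

largest = mant_max * 2.0 ** exp_max    # largest storable magnitude
smallest = mant_min * 2.0 ** exp_min   # smallest storable magnitude

print(f"effective exponent range: {exp_min} .. {exp_max}")
print(f"largest  = {largest:.3e}")
print(f"smallest = {smallest:.3e}")
```

Both limits are exactly representable as ordinary double-precision floats, so the printed values are not affected by rounding in Python itself.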

Consider the number represented by

| Sign | Exponent | Mantissa     |
|------|----------|--------------|
| 0    | 1001001  | 110100010011 |

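That is, the sign bit is $s = 0$, so the number is positive. The exponent is $n = 1001001_2 = 73$, so the effective exponent, after subtracting the bias of 63, is $73 - 63 = 10$, i.e. the mantissa is scaled by $2^{10}$. The mantissa gives $0.110100010011_2 = 2^{-1} + 2^{-2} + 2^{-4} + 2^{-8} + 2^{-11} + 2^{-12} = 3347/4096$. So, the machine number represents $+\,3347/4096 \times 2^{10} = 836.75$. The next floating point number that we can store in this machine is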

| Sign | Exponent | Mantissa     |
|------|----------|--------------|
| 0    | 1001001  | 110100010100 |
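Both bit patterns can be decoded mechanically. The following is a minimal sketch, assuming the 20 bits are supplied as a string in the order sign, exponent, mantissa and that the bias is 63; the function name `decode` is hypothetical.

```python
BIAS = 63        # assumption: bias implied by the worked example
MANT_BITS = 12


def decode(bits: str) -> float:
    """Decode a 20-character bit string: 1 sign bit, 7 exponent bits, 12 mantissa bits."""
    if len(bits) != 20 or set(bits) - {"0", "1"}:
        raise ValueError("expected a string of exactly 20 binary digits")
    s = int(bits[0])            # sign indicator
    n = int(bits[1:8], 2)       # stored (biased) exponent
    m = int(bits[8:], 2)        # mantissa bits read as an integer
    fraction = m / 2 ** MANT_BITS          # the binary fraction 0.m
    return (-1) ** s * fraction * 2.0 ** (n - BIAS)


print(decode("0" + "1001001" + "110100010011"))  # 836.75
print(decode("0" + "1001001" + "110100010100"))  # 837.0
```

The sketch does not enforce the rule that the leading mantissa bit must be 1; a stricter version could reject patterns that violate it.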

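Compared with the first bit pattern, the sign and the exponent remain unchanged and we simply add 1 to the least significant bit of the mantissa. The new number is $3348/4096 \times 2^{10} = 837$, so our primitive computer would be unable to store exactly any number between 836.75 and 837, leading to a relative uncertainty equal to $0.25/836.75 \approx 3 \times 10^{-4}$. At worst, the relative uncertainty in the value of floating point numbers that this primitive computer can store is equal to $2^{-12}/2^{-1} = 2^{-11} \approx 4.9 \times 10^{-4}$, which occurs when the mantissa takes its minimum value of 0.5.

Suppose that we perform a calculation to which the answer is not a machine number, for instance a value lying strictly between 836.75 and 837. There are two ways to approximate this: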

1. The most accurate is rounding to the nearest floating point number.
2. Many computers simply chop off the expression at the bit length of the mantissa and ignore the extra digits, so that any answer lying strictly between 836.75 and 837 is stored as 836.75.
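The difference between the two strategies can be illustrated with a short sketch. It assumes a 12-bit mantissa normalised to $0.5 \le 0.m < 1$, ignores the exponent limits (no overflow or underflow check), and uses the hypothetical test value 836.9, which is not taken from the text. The function name `to_machine` is likewise illustrative. Note that Python's built-in `round` resolves exact ties to the even neighbour.

```python
import math

MANT_BITS = 12  # mantissa length of the toy format


def to_machine(x: float, mode: str = "round") -> float:
    """Approximate x by a number with a 12-bit mantissa, by rounding or chopping."""
    if x == 0.0:
        return 0.0
    # frexp returns x = frac * 2**exp with 0.5 <= |frac| < 1,
    # the same normalisation as the toy format.
    frac, exp = math.frexp(x)
    scaled = frac * 2 ** MANT_BITS      # mantissa bits now sit left of the point
    if mode == "chop":
        kept = math.trunc(scaled)       # drop the extra digits
    else:
        kept = round(scaled)            # nearest representable mantissa
    return math.ldexp(kept, exp - MANT_BITS)


x = 836.9  # hypothetical result lying between 836.75 and 837
print(to_machine(x, "chop"))   # 836.75
print(to_machine(x, "round"))  # 837.0
```

In both cases the relative error stays below the worst-case bound of $2^{-11}$; rounding halves the largest possible error compared with chopping.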