The numbers that we deal with daily are usually written in decimal notation. In decimal notation, the value of a number is a sum of small multiples of powers of 10. For example. the decimal number 5703 is
5703 = 5*103 + 7*102 + 0*101 + 3*100.
The multipliers (5, 7, 0, and 3 in the example) are called digits. They must be between 0 and 9.
There is nothing special about the number 10 except that we have 10 fingers. Numbers can be expressed in a similar form using powers of any integer greater than 1. For example, the decimal number 75 can be expressed as a sum
75 = 1*43 + 0*42 + 2*41 + 3*40.
Here, the multipliers (1, 0, 2, and 3) are called the base 4 digits of the decimal number 75. They must be between 0 and 3 (one less than the base 4). If you worked with base 4 on a daily basis, you would want to shorten the notation for numbers, so that decimal 75 is written as 1023. Then 1023 is the base 4 (or radix 4) notation for the decimal number 75.
If you work with several bases, this gets confusing. To avoid the confusion, the bases can be indicated with a subscript on the number. Then you can make sense of equating numbers with different bases. For example, we write 7510 = 10234.
The ability to use different bases is important in computer science because computers do not work efficiently with decimal numbers. Computers circuits are better at working with data that has two possible values, so they do arithmetic with base 2 numbers, which are also called binary numbers. For reasons that will become clear later, using any base that is a power of 2 gives us a representation for numbers that is closely related to binary. Because binary notation is so bulky and hard to read, most computer scientists prefer base 16 (hexadecimal or hex), or occasionally, base 8 (octal). For hexadecimal numbers you need digits with values from 0 to 1510. The letters A through F are used to represent the numbers bigger than 9 as shown in the following table.
| hex | binary | decimal | hex | binary | decimal | |
|---|---|---|---|---|---|---|
| 0 | 0000 | 0 | 8 | 1000 | 8 | |
| 1 | 0001 | 1 | 9 | 1001 | 9 | |
| 2 | 0010 | 2 | A | 1010 | 10 | |
| 3 | 0011 | 3 | B | 1011 | 11 | |
| 4 | 0100 | 4 | C | 1100 | 12 | |
| 5 | 0101 | 5 | D | 1101 | 13 | |
| 6 | 0110 | 6 | E | 1110 | 14 | |
| 7 | 0111 | 7 | F | 1111 | 15 |
For any base r>1, a nonnegative integer n can be written in the form
n = dk*rk + dk-1*rk-1 + ... + d0*r0, with each di satisfying 0 <= di < r - 1.
Factoring out r from all of the terms on the right except the last, we get
n = r*(dk*rk-1 + dk-1*rk-2 + ... + d1*r0) + d0.
This shows that if we divide n by r we get d0, the last digit of n in base r, as a remainder. Furthermore, the quotient for the division is
dk*rk-1 + dk-1*rk-2 + ... + d1*r0.
Thus we can divide again by r to get another base r digit, d1, as a remainder, and a new quotient. We can get all of the base r digits of n by repeating the division process as in the following algorithm.
do
divide n by r
replace n by the quotient
the remainder is the next base r digit
while n > 0
Since you do arithmetic in you favorite base, the remainders will be numbers in that base. You may need to convert them to base r digits. For example, if your favorite base is ten and you are converting to hexadecimal then you need to convert remainder values for 0 to 15 to hexadecimal digits. To do this, you can use the table Converting hexadecimal digits to binary and decimal in reverse.
The repeated division algorithm works no matter what your favorite base is. For humans, the favorite base is usually ten, so humans can use this algorithm to convert from decimal to base r.
Since, computers do arithmetic in binary, they use this algorithm to convert numbers to strings of decimal characters. This conversion requires converting each remainder to a character. If digits characters are consecutive and in order in the character coding (this is true in ASCII) then the conversion can be obtained by adding the character '0' to each remainder.
222 13 0
/---- /--- /--
16 / 3555 16 / 222 16 / 13
32 16 0
-- -- --
35 62 13 (D)
32 48
-- --
3 (3) 14 (E)
Reading the digits from right to left gives 355510 = DE316.
You can convert from base r to your favorite base by reversing the steps of the repeated division algorithm.
In this algorithm, the base r digits are taken from left to right.
set n to 0
while there are more base r digits
multiply n by r
add the next base r digit to n
end while
The answer n appears in whatever base you use to do arithmetic. As in the repeated division algorithm, you may need to convert base r digits to your favorite base.
Humans will normally use this algorithm to convert a base r number to decimal. Computers can use this algorithm to convert a string of decimal digits into binary form. To do this, '0' is subtracted from each digit character to yield its numerical value.
16 16 16
x 0 x 13 x 222
-- -- ---
0 48 32
+ 13 (D) 16 32
-- --- 32
13 208 ----
+ 14 (E) 3552
--- + 3 (3)
222 ----
3555
Or, in short,
16 x 0 + 13 = 13
16 x 13 + 14 = 222
16 x 222 + 3 = 3555
Comparing the arithmetic in this example with the arithmetic in the repeated division algorithm example will illustrate how the repeated multipication algorithm reverses the steps of the repeated division algorithm.
Note that when multiplying by 16, the 16 is placed on top. If you do this, the partial sums will be 16 times a decimal digit (0 to 9). After doing several examples, you will start remembering these multiples. They are also useful for the repeated division algorithm.
Using the fact that 8 = 23 and 16 = 24, we can convert binary numbers to octal or hexadecimal by forming groups of 3 (for octal) or 4 (for heaxadecimal) bits, starting with the rightmost bit. Then convert each group of 3 or 4 bits to an octal or hexadecimal digit. For example,
101101101000112 = 10 1101 1010 0011 = 2DA316,
and
101101101000112 = 10 110 110 100 011 = 266438.
To convert from octal or hexadecimal to binary, you expand each octal digit to 3 bits, or each hexadecimal digit to 4 bits. You can drop leading 0 bits. For example,
3A4F16 = 0011 1010 0100 1111 = 111010010011112,
and
647028 = 110 100 111 000 010 = 1101001110000102.
When you need to convert large integers from binary and decimal or from decimal to binary, consider using hexadecimal as an intermediate form. This reduces the number of multipications or divisions by a factor of 4, and may reduce the number of mistakes. Try out numbers of different sizes both ways to get a feel for what works best.
Computers generally use fixed length representations for numbers. For example, in C and C++, there are three types for unsigned integers: unsigned int (usually 32 bits), unsigned short (usually 16 bits), and unsigned char (8 bits). Recent compilers also support unsigned long long (64 bits).
At times, it is necessary to coerce an unsigned integer from one type to another, resulting in a change in the number of bits used to represent the number. When an unsigned integer is coerced from a larger length to a smaller length, high order (leftmost) bits are discarded. If any of the duscarded bits are 1 then the coerced value is incorrect. Compilers will generally warn you of this possibility.
When an unsigigned integer is coerced from a smaller length to a larger length the original bits are padded on the left with 0 bits. This is called zero extension.