Binary

From LQWiki
Jump to navigation Jump to search

Binary (also known as base 2) is the representation of numbers with a base--or radix--of two. Binary numbers are therefore represented with two symbols, usually written as a string of 1's and 0's e.g. 0001010011101.

In the more familiar world of decimal counting, there are ten values that a single symbol can represent (0-9, giving it a radix of 10). To express higher numbers, a "positional notation" is used. For example, the 1 in 21936 represents a quantity of 1 * 10^3, or 1000.

Binary works in much the same way, with the position of a bit representing its magnitude. A number like 10110 represents:

    1 * 2^4 (16)
  + 0 * 2^3 (8)
  + 1 * 2^2 (4)
  + 1 * 2^1 (2)
  + 0 * 2^0 (1)

  16 + 4 + 2 = 20 

In case this is unclear, here is the same process for a decimal number 87,103:

    8 * 10^4 (10,000)
  + 7 * 10^3 (1000)
  + 1 * 10^2 (100)
  + 0 * 10^1 (10)
  + 3 * 10^0 (1)

  80,000 + 7000 + 100 + 3 = 87,103.

One limitation of the system as it is currently described: There is no way to represent negative numbers. To overcome this limitation, it is standard practice to use the highest visible bit to represent the sign of the number. There are four different ways to represent negative numbers: sign-and-magnitude, one's complement, two's complement, and biased notation. Two's complement dominates the field of computing today, because it makes it easy to perform arithmetic operations. However, biased notation has some very useful properties, and makes an appearance in the discussion of floating point numbers.

sign and magnitude

While sign-and-magnitude is a lousy way to implement a computer, it's the easiest notation to understand. After all, it's the way we represent decimal numbers in daily life. The highest bit (going from left to right) tells you which way to travel from zero. The rest of the number is read as one would read an unsigned binary number. For example, 0101 would translate to 5, and 1101 would be -5.

Notice that 0000 and 1000 both represent zero, much as if I wrote +0 and -0 in decimal. One's complement also has two representations of zero.

one's complement

The complement of a number is the number of the same magnitude but opposite sign (the number times -1). In one's complement, the complement is found by simply swapping all the zeros with ones, and vice versa. So the complement of 01101 (13) is 10010 (-13). Here are a few examples around zero:

   00011 +3
   00010 +2
   00001 +1
   00000  0
   11111 -0
   11110 -1
   11101 -2
   11100 -3

It has rarely been implemented in hardware, but did appear in the [Control Data 160 series], perhaps in others.

One's complement most easily implemented with subtraction being the basic operation, rather than addition as is usual in two's complement. Remember that subtraction is minuend - subtrahend. Addition was performed by complementing all the bits to form the negative of the subtrahend and subtracting it from the minuend.

But there is a catch... subtraction involves "borrowing" from the column to the left, where addition involves carries to the left. In binary, it is usual to throw away carries out of the left, and unless there's an overflow, this does the right thing. You can't do that with borrows in one's complement.

Examples:

1 + 1 is

     00001
   + 00001

But we complement in order to allow subtraction

     00001
   - 11110
     -----
    b0011 where the b means a borrow.  Note that the result at this point is wrong: 3 instead of 2.

The solution is an "end-around borrow", and the "b" is borrowed from the low-order bit.

     00001
   - 11110
     ----
    b00011
     00010 is the right answwer.
===============================

But does that really work? Let's try 1 - 3

     00001 1
   - 00011 3
     =====
    b11110
     11101 -2

How about -1 + 1?

     11110 -1
   + 00001 1
     ====

becomes

     11110
   - 11110
     =====
     00000 no borrow, so the right answer

How about 1 + -1

     00001 1
     11110 -1
     =====

becomes

     00001
   - 00001
     =====
     00000

This all looks pretty normal. But what about that bit pattern 11111 that shows up as -1 in 2's complement? Here it's the complement of zero, and is treated as such if it ever shows up, but it rarely does. In fact it is never ever the result of 1's complement arithmetic. It only shows up ephemerally, in the case of complementing zero.

Let's try 1+0

     00001
   + 00000
     =====

becomes

     00001
   - 11111 here is the -0, in the only place it shows up, as an intermediary form
     =====
    b00010 remember the end-around borrow?
     00001

And the result is 1 as it should be.

In our 5-bit world here, 01111 is the most positive number (15) and 10000 is the most negative (-15) so, unlike 2's complement, every number has a complement including the most negative one. There's a representation for -0 but it never shows up as the result of any arithmetic, just logical operations like complement.

two's complement

In two's complement notation, getting the complement of a number requires putting ones where the zeros are and zeros where the ones are, and then adding 1 to the result. The complement of 01001 (9) would be 10110 + 1, or 10111 (-9). While that may seem like a bit of needless complexity when compared to the one's complement, from a computer's-eye view it amounts to a great simplification.

Under this notation, the highest visible bit of a number implicitly defines all the higher bits. 01001 is simply a shorthand way of saying ...00000000000001001. For negative numbers, 10111 means ...11111111111110111. This should be slightly confusing because, as I mentioned earlier, binary numbers are represented in positional notation. The question is, what can an infinite string of 1's represent?

For a negative number, the positional notation works as follows: The highest bit represents -1 * 2^(position). Each subsequent bit represents 1 * 2^(position). Here's the breakdown for 111 (-1), and for 111111 (-1).

  -1 * 2^2 (-4)
 + 1 * 2^1 (2)
 + 1 * 2^1 (1)

 -4 + 2 + 1 = -1


  -1 * 2^5 (-32)
 + 1 * 2^4 (16)
 + 1 * 2^3 (8)
 + 1 * 2^2 (4)
 + 1 * 2^1 (2)
 + 1 * 2^1 (1)

 -32 + 16 + 8 + 4 + 2 + 1 = -1

Adding and subtracting is fairly straightforward in two's complement. For addition, simply use the grade school algorithm of adding the bits in a position, and carrying the overflow to the next higher position:

       1 11    <-- carry
    01001001 (73)
  + 00101011 (43)
  ----------
    01110100 (116)


Subtraction simply requires taking the first number and adding it to the twos complement of the second. 0101 - 0110 (5 - 6) would be the same as 5 + (-6):

  Two's complement of 0110:
      1   <--carry
    1001
  +    1
  ------
    1010
    0101 (5)
  + 1010 (-6)
  ------
    1111 (-1)

This is very similar to the method by which modern processors perform the same operations.

biased notation

In biased notation, a string of zeros represents the smallest possible number the system can express, while 1111 represents the largest.


One of the consequences you may have noticed is that the same string of bits can represent several different numbers, depending on whether the number is interpreted as an unsigned, ones complement, or twos-complement number. There is yet another interpretation of a given series: as a floating point number.


floating point

Floating point math on computers has a long and painful history of conflicting representations and implementations. Today sanity reigns, after the industry converged on a single standard called IEEE754.



Colloquially, a binary is a compiled computer program. They are referred to as such, because they are represented in a form that a computer (which speaks binary machine code) understands, rather than source, which a human does.


A binary format is also, in UNIX speak, any file format which is not text based. Of course, text is itself stored using binary digits, but is not thought of as being binary because it can be human read and edited.

See also