*For a brief introduction to this blog series, start here.*

Let’s start at the beginning. Not the Big Bang, or even the first electronic computer, but rather at the most basic element of digital computing: the binary switch.

## Binary switches

What is a binary switch? It’s a switch that has two settings: on and off. An on/off switch is the basis of everything in computers. Everything a computer does is created by a series of binary switches… a *lot* of binary switches.

In the early days of computers, these binary switches were vacuum tubes. But vacuum tubes are big and, just like light bulbs, they can burn out. A computer with enough vacuum tubes to do useful calculations could fill a warehouse and would need constant maintenance to replace the burned-out vacuum tubes.

Eventually the vacuum tubes were replaced with transistors, which are much smaller and far more reliable. Over time the transistors became smaller and smaller, and today more transistors can fit on a silicon chip the size of your fingernail than the number of vacuum tubes that would fill up a warehouse 50 or 60 years ago.

## Bits and bytes

The smallest unit in computing is called a **bit.** A bit is a very small piece of information: the state of a binary switch. Is it on or is it off? Now combine trillions of these switches and you have the capabilities of a modern computer: storing huge amounts of data, displaying photographs, showing videos, making music, playing games, writing software.

But a huge collection of bits by itself is hard to work with. So we combine bits into larger groups. The next unit is called a **byte**, and it consists of 8 bits. While a bit can store only 2 possible states, on or off, a byte can store any of the possible combinations of those 8 on/off switches:

`2 x 2 x 2 x 2 x 2 x 2 x 2 x 2 = 256`

Just 8 bits taken together can produce *256* possible values for a byte. Bytes are the smallest units of information we typically work with in computers. In the early days of personal computers, before graphical displays, a set of 128 characters called ASCII was used to represent all of the upper and lowercase letters in the Latin alphabet, along with numbers, punctuation marks and special characters like spaces and line breaks. Various 256-character “extended ASCII” sets used the rest of the byte for accented letters and other symbols. Today’s computers use “multi-byte” character encodings such as UTF-8, which can represent over a million different characters: letters from nearly every alphabet and language, plus a huge number of other symbols and even emoji.
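If you have Python handy, you can check this arithmetic yourself. This is only a small sketch, and the function names are my own invention rather than anything standard. It counts the combinations for a given number of bits, then shows how many bytes UTF-8 needs for a few sample characters.

```python
def possible_values(bits: int) -> int:
    """Each extra on/off switch doubles the number of combinations."""
    return 2 ** bits


def utf8_byte_count(character: str) -> int:
    """Number of bytes UTF-8 uses to store one character."""
    return len(character.encode("utf-8"))


for bits in (1, 2, 4, 8):
    print(bits, "bit(s):", possible_values(bits), "possible values")

# Sample characters: "A", e with an acute accent, the euro sign,
# and a grinning-face emoji (written as escapes so the file stays plain ASCII).
for character in ("A", "\u00e9", "\u20ac", "\U0001F600"):
    print(f"U+{ord(character):04X} takes {utf8_byte_count(character)} byte(s)")
```

The plain letter fits in a single byte, while the emoji needs four.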

## Bases

When most humans do math, we use **decimal** numbers, also known as “base 10”. It is believed that we settled on base 10 numbers because we have 10 fingers! We have 10 different digits we use to represent numbers:

`0 1 2 3 4 5 6 7 8 9`

And we combine those digits to make bigger numbers:

`10 100 1,234,567,890`

In elementary school we learn about “places” in numbers that have more than one digit. From right to left, we add digits to represent ten of the next smaller unit.

| Thousands | Hundreds | Tens | Ones | | Total |
|---|---|---|---|---|---|
| | | 5 | 1 | = | 51 |
| | 1 | 2 | 3 | = | 123 |
| 1 | 0 | 0 | 0 | = | 1000 |

But base 10 is not the only way of handling numbers. Remember that everything in computers is based on binary switches, so computers use **binary numbers**, also known as “base 2”.

In base 10 we have ten digits. In base 2 we have two digits:

`0 1`

The 0 and 1 correspond to “off” and “on” for the binary switches. You can make binary numbers have multiple digits just like decimal numbers, but remember that you only have 0 and 1 to work with, so the places are different.

In decimal numbers, each new place equals ten times the value of the place before it, because it represents ten of that unit. Likewise in binary numbers, each new place equals *two* times the value of the place before it:

| Eights | Fours | Twos | Ones | | Total | Decimal Equivalent |
|---|---|---|---|---|---|---|
| | | 1 | 0 | = | 10 | 2 |
| | 1 | 0 | 1 | = | 101 | 5 |
| 1 | 0 | 0 | 0 | = | 1000 | 8 |

You can see that it takes many more digits in binary to represent the same value from decimal numbers, and very quickly it becomes hard for humans to read a binary number. For instance, what is the decimal equivalent of this binary number?

`1001011010`

It is only `602` in decimal numbers.
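To see where that answer comes from, here is a short Python sketch that adds up the place values by hand. The function name `binary_to_decimal` is just something I made up for illustration; Python's built-in `int(text, 2)` does the same job.

```python
def binary_to_decimal(binary_digits: str) -> int:
    """Add up the place value of every 1 in a string of binary digits."""
    total = 0
    place_value = 1  # the rightmost place is the ones place
    for digit in reversed(binary_digits):
        if digit == "1":
            total += place_value
        place_value *= 2  # each place is worth twice the one before it
    return total


print(binary_to_decimal("1001011010"))  # 602
print(int("1001011010", 2))             # 602, using the built-in conversion
print(bin(602))                         # 0b1001011010, going the other way
```

Written out, the 1s sit in the 512s, 64s, 16s, 8s and 2s places, and 512 + 64 + 16 + 8 + 2 = 602.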

## Exponents

You might have noticed that in both binary and decimal numbers, each place in a multi-digit number is worth the number of possible digits times the place to its right, which is in turn worth the number of digits times the place to *its* right, and so on. In decimal numbers it works this way:

| Place | Unit | Multiplied |
|---|---|---|
| 1 | Ones | 1 |
| 2 | Tens | 10 x 1 |
| 3 | Hundreds | 10 x 10 x 1 |
| 4 | Thousands | 10 x 10 x 10 x 1 |

All of those 1’s are not really necessary for the multiplication, so let’s just remove them, and while we’re at it, knock down the place numbers by one each as well:

| Exponent | Value | Multiplied |
|---|---|---|
| 0 | 1 | 1 |
| 1 | 10 | 10 |
| 2 | 100 | 10 x 10 |
| 3 | 1000 | 10 x 10 x 10 |
| 4 | 10,000 | 10 x 10 x 10 x 10 |

In math, multiplying a number by itself a certain number of times is called raising it to a power, and the number of times is called the **exponent.** You may have done squares or cubes in math class, which are written like this:

10^{2} = 100

10^{3} = 1000

Note that the exponent — the small number to the right of the main number — is the same as the number in the first column of the table above. We talk about “squares” and “cubes” but in general with exponents you will say “to the *nth* power”, so 10 squared is also “10 to the 2nd power” or 10 cubed is “10 to the 3rd power”.

Powers of two work in the same way, with the results being equal to the place names in a binary number. Because the powers of two are so commonly used in so many aspects of computing, it can be useful to memorize as many of them as possible:

| Exponent | Binary Value | Decimal Value | Bytes |
|---|---|---|---|
| 2^{0} | 1 | 1 | 1 B |
| 2^{1} | 10 | 2 | 2 B |
| 2^{2} | 100 | 4 | 4 B |
| 2^{3} | 1000 | 8 | 8 B |
| 2^{4} | 10000 | 16 | 16 B |
| 2^{5} | 100000 | 32 | 32 B |
| 2^{6} | 1000000 | 64 | 64 B |
| 2^{7} | 10000000 | 128 | 128 B |
| 2^{8} | 100000000 | 256 | 256 B |
| 2^{9} | 1000000000 | 512 | 512 B |
| 2^{10} | 10000000000 | 1024 | 1 KB |
| 2^{11} | 100000000000 | 2048 | 2 KB |
| 2^{12} | 1000000000000 | 4096 | 4 KB |
| 2^{13} | 10000000000000 | 8192 | 8 KB |
| 2^{14} | 100000000000000 | 16,384 | 16 KB |
| 2^{15} | 1000000000000000 | 32,768 | 32 KB |
| 2^{16} | 10000000000000000 | 65,536 | 64 KB |

You might have noticed that in decimal notation, each power of 10 is equal to 1 followed by that number of 0s. (For example, `100` is 10^{2}, or a 1 followed by two 0s.) As you can see here, the same is true for the powers of two in binary notation.
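If memorizing isn't your thing, a few lines of Python can rebuild the powers of two whenever you need them. Treat this as a quick sketch; `power_of_two_row` is a name I picked for the example.

```python
def power_of_two_row(exponent: int) -> tuple[str, int]:
    """Return a power of two as (binary digits, decimal value)."""
    value = 2 ** exponent
    return format(value, "b"), value


for exponent in range(17):
    binary, decimal = power_of_two_row(exponent)
    # The binary form is always a 1 followed by `exponent` zeros.
    print(f"2^{exponent:<2}  {binary:>17}  {decimal:>6,}")
```

Each row's binary value is a 1 followed by as many 0s as the exponent, just as each power of 10 is in decimal.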

You might also notice that in the column showing the number of bytes, I switched from **B** (for “bytes”) to **KB** (for “kilobytes”) at 1024. One kilobyte equals 1024 bytes. (Even though “kilo” means one thousand, in computing the prefix is applied at 1024 instead, to stay consistent with the binary numbers.) One kilobyte is 1024 bytes; one megabyte is 1024 kilobytes (or 1024 x 1024 = 1,048,576 bytes); one gigabyte is 1024 megabytes (1,048,576 kilobytes or 1,073,741,824 bytes), and so on.
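Here is a rough Python sketch of that unit arithmetic, using the 1024-based units described above. The constant names and the `describe_bytes` helper are my own, made up for this example.

```python
KB = 1024        # bytes in a kilobyte
MB = 1024 * KB   # bytes in a megabyte
GB = 1024 * MB   # bytes in a gigabyte


def describe_bytes(count: int) -> str:
    """Express a byte count in the largest unit that fits."""
    for unit_name, unit_size in (("GB", GB), ("MB", MB), ("KB", KB)):
        if count >= unit_size:
            return f"{count / unit_size:g} {unit_name}"
    return f"{count} B"


print(f"{MB:,} bytes in a megabyte")   # 1,048,576
print(f"{GB:,} bytes in a gigabyte")   # 1,073,741,824
print(describe_bytes(512))             # 512 B
print(describe_bytes(65536))           # 64 KB
```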

## Hexadecimals

Reading binary numbers is easy for computers. In fact, it’s pretty much all they do! But since it’s so difficult for humans, we convert these binary numbers into a base that is easier to read. Unfortunately, binary numbers don’t correspond nicely with decimal numbers (there’s no “tens” place in binary), so we use “base 16” numbers, also known as **hexadecimal.**

But there’s a problem for us in writing hexadecimal numbers. We’ve run out of digits! Our decimal system only includes ten digits, so to go beyond 9, we switch to letters. The 16 digits available to us in hexadecimal are:

`0 1 2 3 4 5 6 7 8 9 A B C D E F`

Since 16 is a power of two, there’s a 16s place in binary numbers (in binary, 16 translates to `10000`), and each hexadecimal digit stands for exactly four binary digits. That means we can easily convert binary to hexadecimal. More easily than to decimal, anyway.

As we learned earlier, one byte includes 256 possible combinations of 0 or 1. In binary numbers, that’s a range from `00000000` to `11111111`. In decimal numbers that’s a range of `0` to `255`, kind of a weird place to stop (and note that it’s 255 instead of 256, because of zero!), but in hexadecimal, 255 is written as `FF`. We’ve “filled up” both places. 256 written in hexadecimal is `100`.
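The reason hexadecimal lines up so neatly is that four binary digits have exactly 16 possible values, so every group of four bits becomes one hexadecimal digit. Here is a small Python sketch of that idea; `byte_to_hex` is a name I invented for the example.

```python
HEX_DIGITS = "0123456789ABCDEF"


def byte_to_hex(bits: str) -> str:
    """Convert 8 binary digits into 2 hexadecimal digits, 4 bits at a time."""
    high, low = bits[:4], bits[4:]
    return HEX_DIGITS[int(high, 2)] + HEX_DIGITS[int(low, 2)]


print(byte_to_hex("00000000"))  # 00
print(byte_to_hex("10100101"))  # A5 (1010 is A, 0101 is 5)
print(byte_to_hex("11111111"))  # FF

# Python's built-in formatting agrees, and shows that 256 needs a third place.
print(format(255, "X"), format(256, "X"))  # FF 100
```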

## 32-bit addressing

If you’re familiar with video game systems, you may know about the “8-bit” systems from the 1980s, the “16-bit” systems from the 1990s, and the “32-bit” and “64-bit” systems that followed. These “bit” numbers describe the size of the chunks of bits the system’s processor works with at one time, and they often also correspond to “memory addressing”: basically, a map that lets the system’s processor find information that is currently stored in the system’s memory. It’s how a computer (or a video game system) remembers what it’s doing!

32-bit addressing allows for 2^{32} possible “locations” in memory. That’s a big number! 4,294,967,296 to be exact. Since each location holds one byte, that’s the same as 4 gigabytes. There are a lot of implications for this number: it means that 4 GB is the largest amount of memory a 32-bit system can address directly (although some clever tricks were devised to get around this limitation). It also means the largest number (integer) that fits in 32 bits is 4,294,967,295, which is one less than 2^{32} because counting starts at zero.

But remember, a 32-bit system has to be able to work with *negative* numbers, too. One of the 32 bits is used for remembering whether the number is positive or negative, which leaves 31 bits for the number itself. That means the biggest number a 32-bit system can *really* handle is 2,147,483,647. (If you do the math, you might think I’m off by 1, since 2^{31} is 2,147,483,648, but don’t forget that zero is a number too!)
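Here is a quick Python sketch of those limits. One assumption to flag: the `wrap_signed_32` helper (my own name) imitates "two's complement", the scheme nearly all modern processors use for negative numbers, which this post doesn't otherwise get into. Python's own integers never overflow, so the helper has to fake it.

```python
BITS = 32

unsigned_max = 2 ** BITS - 1         # all 32 bits used for the number
signed_max = 2 ** (BITS - 1) - 1     # one bit set aside for the sign
signed_min = -(2 ** (BITS - 1))


def wrap_signed_32(value: int) -> int:
    """Keep only the low 32 bits and read them as a signed number
    (assumes two's complement, as described above)."""
    value &= 0xFFFFFFFF
    return value - 2 ** 32 if value >= 2 ** 31 else value


print(f"{unsigned_max:,}")   # 4,294,967,295
print(f"{signed_max:,}")     # 2,147,483,647
print(f"{signed_min:,}")     # -2,147,483,648

# Adding 1 to the biggest signed number "wraps around" to the smallest one.
print(wrap_signed_32(signed_max + 1))   # -2147483648
```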

Here’s something kind of fun to know about 32-bit numbers: UNIX-based computers calculate dates and times as a number of seconds before or after the “UNIX epoch”. UNIX was developed around 1970, so its designers decided that 0 should represent midnight (UTC) on January 1, 1970.

Sometimes you might come across a strange date and time on a computer: December 31, 1969 11:59:59 PM. Where’d *that* come from? Well… if time is stored in seconds, and 0 is midnight on January 1, 1970, then what do you think -1 would equal? (Programs often use -1 to mean “something went wrong.” A stored 0 can also show up as a time on December 31, 1969 if your time zone is behind UTC.)

The downside of storing dates and times as number of seconds before or after midnight on the first day of 1970 is that it doesn’t really let you go very far. Specifically, it gets you a bit more than 68 years in either direction.

## The Y2K38 bug

Many computer programs in the 20th century were written using 2-digit years instead of 4-digit years. This was to save space when memory and hard disks were *really* expensive. They’re not anymore. But as the year 2000 approached, a lot of computer programs were still only using 2 digits for the year, and everyone became worried about the “Y2K” bug. What would happen when the year rolled over from 99 to 00? People worried that computer systems would think it was 1900. Lots of people spent lots of weeks and months rewriting computer software to avoid the bug, and, largely thanks to that work, in the end it turned out not to be much of a problem at all.

But UNIX systems have a *different* problem. 32-bit UNIX systems, which store the time as a signed 32-bit number of seconds, can only calculate dates for about 68 years after 1970 before they run out of numbers. The exact moment when 32-bit dates will break is 3:14:07 AM (UTC) on January 19, 2038.
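You can work out the limits yourself with Python's standard `datetime` module. This is only a sketch: the constant and function names are mine, and it assumes the time is stored as a signed 32-bit count of seconds measured in UTC.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
LARGEST = 2 ** 31 - 1    # biggest signed 32-bit number
SMALLEST = -(2 ** 31)    # most negative signed 32-bit number


def seconds_to_date(seconds: int) -> datetime:
    """Turn a count of seconds before or after the epoch into a date."""
    return EPOCH + timedelta(seconds=seconds)


print(seconds_to_date(LARGEST))    # 2038-01-19 03:14:07+00:00
print(seconds_to_date(SMALLEST))   # 1901-12-13 20:45:52+00:00
```

One second past the first of those dates, a 32-bit counter has nowhere left to go.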

Fortunately, many if not most UNIX systems are now running on 64-bit processors. These processors can handle numbers up to 2^{64}. You might think that’s only buying us another 68 years, but remember how quickly exponents grow. 2^{32} is a little over 4 billion, but 2^{64} is so big it’s usually written in *scientific notation,* as 1.84 x 10^{19}. That’s a little over 18 *quintillion.* It’s enough to allow a 64-bit processor to address 16 exabytes (over 17 *billion* gigabytes) of memory. And even with one bit set aside for negative numbers, it means UNIX won’t run out of seconds to count for about *292 billion years!*
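A few lines of Python show just how much bigger these numbers are. Take it as a back-of-the-envelope sketch: `years_of_seconds` is a made-up helper, and it assumes an average year of 365.25 days.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60  # assumed average year length


def years_of_seconds(bits: int) -> float:
    """How many years you can count if you have 2**bits seconds."""
    return 2 ** bits / SECONDS_PER_YEAR


print(f"{2 ** 64:,}")                  # 18,446,744,073,709,551,616
print(f"{2 ** 64 // 2 ** 30:,} GB")    # 17,179,869,184 GB (1024-based)

print(years_of_seconds(31))   # about 68 years (signed 32-bit)
print(years_of_seconds(63))   # about 292 billion years (signed 64-bit)
print(years_of_seconds(64))   # about 584 billion years (all 64 bits)
```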

## IP addresses

32-bit addressing is used somewhere else that anyone who’s spent much time online has encountered: IP addresses. Every device connected to the Internet is assigned an IP address. Much like memory addressing within a computer, IP addresses allow different devices connected to the Internet to communicate with each other.

Since an IP address is a 32-bit number, there are a little over 4 billion possible IP addresses available. With over 7 billion people in the world, we’re at risk of running out! IPv4, the version of IP addresses we are all most familiar with, has this limitation, but a newer protocol called IPv6 allows for 2^{128} addresses. That’s an incomprehensibly huge number. But since IPv4 and IPv6 are not compatible with each other, the transition has been slow. Just like with the 4 GB memory limitation of 32-bit computers, some tricks have been developed to get around this limitation, namely *Network Address Translation,* which allows all devices connected to a local network to share the same “external” IP address, while having different “internal” IP addresses within the network. A router is a specialized computer that creates a bridge between these internal and external networks.

An IP address could be represented in many different ways. Since it is really a 32-digit binary number, it could be represented as a 32-digit string of 0s and 1s. Or it could be represented as eight hexadecimal digits. But in practice we represent an IP address as a group of four decimal numbers, separated by periods. 32 bits is equal to 4 bytes, and each of the four numbers in an IP address is one of those bytes. That’s why each of the numbers in an IP address is in the range of 0 to 255.
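To make those different representations concrete, here is a Python sketch using the standard `ipaddress` module. The address `192.168.1.10` is just an example I picked, and `dotted_to_int` is a hypothetical helper that does the conversion by hand.

```python
import ipaddress


def dotted_to_int(dotted: str) -> int:
    """Combine the four bytes of a dotted address into one 32-bit number."""
    value = 0
    for part in dotted.split("."):
        value = value * 256 + int(part)  # shift left one byte, add the next
    return value


example = "192.168.1.10"  # example address, chosen only for illustration
as_integer = dotted_to_int(example)

print(as_integer)                    # 3232235786
print(format(as_integer, "032b"))    # the same address as 32 binary digits
print(format(as_integer, "08X"))     # C0A8010A, eight hexadecimal digits

# The standard library reaches the same number and the same four bytes.
address = ipaddress.IPv4Address(example)
print(int(address) == as_integer)    # True
print(list(address.packed))          # [192, 168, 1, 10]
```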

But in practice, not every combination is available to devices, because certain addresses have special roles in IP addressing. Something called the *subnet mask* is used by routers to determine whether a given IP address is within their own networks or not. Subnet masks tell the router which part of the IP address identifies the network and which part identifies a device on it. A subnet mask of `255.255.255.0` tells the router that any IP address that has its first three numbers the same as the router’s own IP address is on the local network, and only the last number is used to separate the devices on the network. On such a network, devices use 1 to 254 for that last number, because 0 is reserved to identify the network itself and 255 is reserved for broadcasting to every device at once. (That also means that at most the router’s network can support 254 devices, including the router itself.)
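Under the hood, applying a subnet mask is a bitwise AND: the router keeps the bits where the mask is 1 and ignores the rest. Here is a Python sketch of the idea. The router address and the `on_local_network` function are hypothetical, made up for this example.

```python
import ipaddress

# Hypothetical home network: router at 192.168.1.1, mask 255.255.255.0
ROUTER = ipaddress.IPv4Address("192.168.1.1")
MASK = ipaddress.IPv4Address("255.255.255.0")


def on_local_network(address: str) -> bool:
    """Compare only the bits that the subnet mask keeps (bitwise AND)."""
    candidate = ipaddress.IPv4Address(address)
    return (int(candidate) & int(MASK)) == (int(ROUTER) & int(MASK))


print(on_local_network("192.168.1.77"))  # True: first three numbers match
print(on_local_network("192.168.2.77"))  # False: third number differs

# The standard library can also list the addresses available to devices.
network = ipaddress.ip_network("192.168.1.0/255.255.255.0")
hosts = list(network.hosts())
print(len(hosts), hosts[0], hosts[-1])   # 254 192.168.1.1 192.168.1.254
```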

Some IP address “blocks” are reserved for use within internal networks. This allows huge numbers of devices to use the same IP addresses — not conflicting with each other because they’re only using them internally — and is the main way we can manage to have far more than 4 billion devices connected to the Internet at once around the world.

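The IP blocks reserved for internal network use are:

`10.x.x.x`

`172.16.x.x` through `172.31.x.x`

`192.168.x.x`

A fourth block, `169.254.x.x`, is reserved for “link-local” addresses that a device assigns to itself when it can’t get an address any other way.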
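Here is a Python sketch that checks whether an address falls inside one of these reserved blocks. The `classify` function and its labels are my own. The block sizes use the standard “slash” notation, where the number after the slash says how many leading bits are fixed; note that the 172 block officially runs from `172.16.x.x` through `172.31.x.x`.

```python
import ipaddress

# Reserved blocks and what each one is for.
RESERVED_BLOCKS = {
    "10.0.0.0/8": "private",         # 10.x.x.x
    "172.16.0.0/12": "private",      # 172.16.x.x through 172.31.x.x
    "192.168.0.0/16": "private",     # 192.168.x.x
    "169.254.0.0/16": "link-local",  # 169.254.x.x
}


def classify(address: str) -> str:
    """Say which kind of reserved block an address belongs to, if any."""
    ip = ipaddress.ip_address(address)
    for block, label in RESERVED_BLOCKS.items():
        if ip in ipaddress.ip_network(block):
            return label
    return "public"


for sample in ("10.1.2.3", "172.20.0.1", "192.168.1.10", "169.254.10.10", "8.8.8.8"):
    print(sample, "is", classify(sample))
```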

## Domain names

One final thought relating to IP addresses, but a bit off topic from the “bits” where we started:

IP addresses are hard to remember.

In the early days of the Internet, someone (Paul Mockapetris) realized this and invented **DNS** — the Domain Name System.

Domain names are easy to remember, like “google.com”. The underlying system that makes them work is a network of domain name *servers* located all around the world, which keep a set of “zone files” that store information about which IP addresses each of these domains correspond to. Any time you type a URL in your web browser, the first thing that happens is your computer looks up the domain name portion of the URL.

It first checks its own cache of DNS information, and if you haven’t already visited this domain recently, it looks it up with a DNS server that you’re configured to use. Your DNS server may be the “authority” on some domains, and it also keeps its own cache of domains that you or anyone else who’s using it has recently looked up. If the domain name is in its cache, it tells your computer. If not, the DNS server sends a request to the authority server for the domain (which it may first have to locate by asking other DNS servers), gets the IP, stores it in its cache, and sends it to your computer, which also stores it in *its* cache.

All of this happens in a tiny fraction of a second. And once it’s done, your computer knows the IP address of the computer it’s trying to talk to, and it sends a request for the web page (for example) to that IP address, and the computer on the other end responds with the requested information.
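The caching described above can be modeled in a few lines of Python. To be clear, this is a toy: the `CachingResolver` class, the `authority_lookup` function and the zone data are all invented for illustration, no real network requests are made, and real DNS caches also expire their entries after a while.

```python
class CachingResolver:
    """Answers from its own cache when it can, otherwise asks upstream."""

    def __init__(self, upstream):
        self.upstream = upstream      # function that takes a name, returns an IP
        self.cache = {}
        self.upstream_queries = 0

    def lookup(self, name: str) -> str:
        if name in self.cache:
            return self.cache[name]   # cache hit: no need to ask anyone
        self.upstream_queries += 1
        ip_address = self.upstream(name)
        self.cache[name] = ip_address
        return ip_address


# Made-up zone data; 192.0.2.x is a block set aside for documentation examples.
ZONE_DATA = {"example.com": "192.0.2.10"}


def authority_lookup(name: str) -> str:
    """Stand-in for the authority server for a domain."""
    return ZONE_DATA[name]


dns_server = CachingResolver(authority_lookup)
your_computer = CachingResolver(dns_server.lookup)

print(your_computer.lookup("example.com"))   # 192.0.2.10 (asks all the way up)
print(your_computer.lookup("example.com"))   # 192.0.2.10 (answered from cache)
print(dns_server.upstream_queries)           # 1
```

The second lookup never leaves your computer, which is why revisiting a site skips this step entirely.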