The Computer Course #1: Bits

For a brief introduction to this blog series, start here.


Let’s start at the beginning. Not the Big Bang, or even the first electronic computer, but rather at the most basic element of digital computing: the binary switch.

Binary switches

What is a binary switch? It’s a switch that has two settings: on and off. An on/off switch is the basis of everything in computers. Everything a computer does is created by a series of binary switches… a lot of binary switches.

In the early days of computers, these binary switches were vacuum tubes. But vacuum tubes are big and, just like light bulbs, they can burn out. A computer with enough vacuum tubes to do useful calculations could fill a warehouse and would need constant maintenance to replace the burned-out vacuum tubes.

Eventually the vacuum tubes were replaced with transistors, which are much smaller and far more reliable. Over time the transistors became smaller and smaller, and today more transistors can fit on a silicon chip the size of your fingernail than the number of vacuum tubes that would fill up a warehouse 50 or 60 years ago.

Bits and bytes

The smallest unit in computing is called a bit. A bit is a very small piece of information: the state of a binary switch. Is it on or is it off? Now combine trillions of these switches and you have the capabilities of a modern computer, to store huge amounts of data, display photographs, show videos, make music, play games, write software.

But a huge collection of bits by itself is hard to work with. So we combine bits into larger groups. The next unit is called a byte, and it consists of 8 bits. While a bit can only store 2 possible states (on or off), a byte can store any of the possible combinations of those 8 on/off switches:

2 x 2 x 2 x 2 x 2 x 2 x 2 x 2 = 256

Just 8 bits taken together can produce 256 possible values for a byte. Bytes are the smallest units of information we typically work with in computers. In the early days of personal computers, before graphical displays, a character set called ASCII used 7 bits (128 values) to represent the upper and lowercase letters of the Latin alphabet, along with numbers, punctuation marks and special characters like spaces and line breaks. Various 8-bit “extended ASCII” sets used the other 128 values of a byte for accented letters and additional symbols. Today’s computers mostly use Unicode, usually stored in a “multi-byte” encoding such as UTF-8, which has room for over a million characters: letters from every alphabet and language, plus a huge number of other symbols and even emoji.
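To make the arithmetic concrete, here is a small sketch in Python. The language choice, the function name and the sample characters are my own assumptions for illustration, not part of the original lecture. It counts the values a group of bits can hold, and shows how many bytes UTF-8 spends on a few different characters.

```python
def possible_values(bits: int) -> int:
    """Each extra on/off switch doubles the number of combinations."""
    return 2 ** bits

byte_values = possible_values(8)

# UTF-8 uses between 1 and 4 bytes per character.
samples = ["A", "é", "€", "😀"]
utf8_lengths = {ch: len(ch.encode("utf-8")) for ch in samples}

print(byte_values)     # 256
print(utf8_lengths)    # {'A': 1, 'é': 2, '€': 3, '😀': 4}
```

Plain Latin letters still fit in a single byte, which is why old ASCII text is also valid UTF-8.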

Bases

When most humans do math, we use decimal numbers, also known as “base 10”. It is believed that we settled on base 10 numbers because we have 10 fingers! We have 10 different digits we use to represent numbers:

0 1 2 3 4 5 6 7 8 9

And we combine those digits to make bigger numbers:

10 100 1,234,567,890

In elementary school we learn about “places” in numbers that have more than one digit. From right to left, we add digits to represent ten of the next smaller unit.

Thousands  Hundreds  Tens  Ones     Total
                     5     1     =  51
           1         2     3     =  123
1          0         0     0     =  1000

But base 10 is not the only way of handling numbers. Remember that everything in computers is based on binary switches, so computers use binary numbers, also known as “base 2”.

In base 10 we have ten digits. In base 2 we have two digits:

0 1

The 0 and 1 correspond to “off” and “on” for the binary switches. You can make binary numbers have multiple digits just like decimal numbers, but remember that you only have 0 and 1 to work with, so the places are different.

In decimal numbers, each new place equals ten times the value of the place before it, because it represents ten of that unit. Likewise in binary numbers, each new place equals two times the value of the place before it:

Eights  Fours  Twos  Ones     Total  Decimal Equivalent
               1     0     =  10     2
        1      0     1     =  101    5
1       0      0     0     =  1000   8

You can see that it takes many more digits in binary to represent the same value from decimal numbers, and very quickly it becomes hard for humans to read a binary number. For instance, what is the decimal equivalent of this binary number?

1001011010

It is only 602 in decimal numbers.
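If you would rather not add up the places by hand, a few lines of Python can do it. This is only a sketch (the function name is made up for this example), but it follows the same "each place is worth twice the last" rule described above.

```python
def binary_to_decimal(digits: str) -> int:
    """Add up the place value of every 1 in a binary string."""
    total = 0
    place_value = 1                  # the rightmost place is the "ones" place
    for digit in reversed(digits):
        if digit == "1":
            total += place_value
        place_value *= 2             # each new place is worth twice the last
    return total

print(binary_to_decimal("1001011010"))   # 602
print(int("1001011010", 2))              # Python's built-in conversion agrees: 602
print(bin(602))                          # and back again: 0b1001011010
```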

Exponents

You might have noticed that in both binary and decimal numbers, each place in a multi-digit number is worth the number of possible digits times the place to its right, which in turn is worth the number of digits times the place to its right, and so on. In decimal numbers it works this way:

Place  Unit       Multiplied
1      Ones       1
2      Tens       10 x 1
3      Hundreds   10 x 10 x 1
4      Thousands  10 x 10 x 10 x 1

All of those 1’s are not really necessary for the multiplication, so let’s just remove them, and while we’re at it, knock down the place numbers by one each as well:

Exponent  Value   Multiplied
0         1       1
1         10      10
2         100     10 x 10
3         1000    10 x 10 x 10
4         10,000  10 x 10 x 10 x 10

In math, multiplying a number by itself a certain number of times is called raising it to a power, and the count of how many times is called the exponent. You may have done squares or cubes in math class, which are written like this:

10² = 100
10³ = 1000

Note that the exponent — the small number to the right of the main number — is the same as the number in the first column of the table above. We talk about “squares” and “cubes” but in general with exponents you will say “to the nth power”, so 10 squared is also “10 to the 2nd power” or 10 cubed is “10 to the 3rd power”.

Powers of two work in the same way, with the results being equal to the place names in a binary number. Because the powers of two are so commonly used in so many aspects of computing, it can be useful to memorize as many of them as possible:

Exponent  Binary Value       Decimal Value  Bytes
2⁰        1                  1              1 B
2¹        10                 2              2 B
2²        100                4              4 B
2³        1000               8              8 B
2⁴        10000              16             16 B
2⁵        100000             32             32 B
2⁶        1000000            64             64 B
2⁷        10000000           128            128 B
2⁸        100000000          256            256 B
2⁹        1000000000         512            512 B
2¹⁰       10000000000        1024           1 KB
2¹¹       100000000000       2048           2 KB
2¹²       1000000000000      4096           4 KB
2¹³       10000000000000     8192           8 KB
2¹⁴       100000000000000    16,384         16 KB
2¹⁵       1000000000000000   32,768         32 KB
2¹⁶       10000000000000000  65,536         64 KB

You might have noticed that in decimal notation, each power of 10 is equal to 1 followed by that number of 0s. (For example, 100 is 10², or a 1 followed by two 0s.) As you can see here, the same is true for the powers of two in binary notation.

You might also notice that in the column showing the number of bytes, I switched from B (for “bytes”) to KB (for “kilobytes”) at 1024. Even though “kilo” means one thousand, in computing the prefix is traditionally applied at 1024 instead, to stay consistent with the binary numbers. (Strictly speaking, the standards bodies now call the 1024-byte unit a “kibibyte”, or KiB, but “kilobyte” is still what most people say.) One kilobyte is 1024 bytes; one megabyte is 1024 kilobytes (or 1024 x 1024 = 1,048,576 bytes); one gigabyte is 1024 megabytes (1,048,576 kilobytes or 1,073,741,824 bytes), and so on.
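If memorizing the table sounds tedious, you can always regenerate it. The following Python sketch (variable names are mine, picked for this example) prints the powers of two in binary and decimal, and works out the kilobyte, megabyte and gigabyte sizes using the 1024-based convention described above.

```python
KB = 2 ** 10      # 1,024 bytes
MB = KB * 1024    # 1,048,576 bytes
GB = MB * 1024    # 1,073,741,824 bytes

for exponent in range(17):
    value = 2 ** exponent
    # format(value, "b") gives the binary digits without Python's "0b" prefix
    print(f"2^{exponent:<2}  {format(value, 'b'):>17}  {value:>6,}")

print(f"{KB:,} {MB:,} {GB:,}")   # 1,024 1,048,576 1,073,741,824
```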

Hexadecimals

Reading binary numbers is easy for computers. In fact, it’s pretty much all they do! But since it’s so difficult for humans, we convert these binary numbers into a base that is easier to read. Unfortunately, binary numbers don’t correspond nicely with decimal numbers (there’s no “tens” place in binary), so we use “base 16” numbers, also known as hexadecimal.

But there’s a problem for us in writing hexadecimal numbers. We’ve run out of digits! Our decimal system only includes ten digits, so to go beyond 9, we switch to letters. The 16 digits available to us in hexadecimal are:

0 1 2 3 4 5 6 7 8 9 A B C D E F

Since there’s a 16s place in binary numbers (in binary, 16 translates to 10000), we can easily convert binary to hexadecimal. More easily than to decimal, anyway. Every group of four binary digits corresponds to exactly one hexadecimal digit.

As we learned earlier, one byte includes 256 possible combinations of 0 or 1. In binary numbers, that’s a range from 00000000 to 11111111. In decimal numbers that’s a range of 0 to 255 — kind of a weird place to stop (and note that it’s 255 instead of 256, because of zero!) — but in hexadecimal, 255 is written as FF. We’ve “filled up” both places. 256 written in hexadecimal is 100.
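Here is a quick Python sketch of why hexadecimal is so convenient for bytes. The helper name is hypothetical; the idea is simply that a byte splits into two groups of four bits, and each group becomes one hexadecimal digit.

```python
def byte_to_hex(bits: str) -> str:
    """Convert an 8-digit binary string to two hex digits, four bits at a time."""
    high, low = bits[:4], bits[4:]
    return format(int(high, 2), "X") + format(int(low, 2), "X")

print(byte_to_hex("11111111"))               # FF
print(byte_to_hex("01011010"))               # 5A
print(format(255, "X"), format(256, "X"))    # FF 100
```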

32-bit addressing

If you’re familiar with video game systems, you may know about the “8-bit” systems from the 1980s, the “16-bit” systems from the 1990s, and the “32-bit” and “64-bit” systems that followed. These “bit” numbers represent the size of the chunks of bits the system’s processor works with at one time, and they also correspond to “memory addressing”: basically, a map that lets the system’s processor find information that is currently stored in the system’s memory. It’s how a computer (or a video game system) remembers what it’s doing!

32-bit addressing allows for 2³² possible “locations” in memory. That’s a big number! 4,294,967,296 to be exact. With one byte stored at each location, it’s the same as 4 gigabytes. There are a lot of implications for this number: it means that 4 GB is the largest amount of memory a 32-bit system can support (although some clever tricks were devised to get around this limitation). It also sets the largest whole number (integer) a 32-bit system can work with in one piece: counting up from zero, that’s 4,294,967,295.

But remember, a 32-bit system has to be able to work with negative numbers, too. One of the 32 bits is used to remember whether the number is positive or negative, which leaves 31 bits for the value itself. That means the biggest number a 32-bit system can really handle is 2³¹ − 1, or 2,147,483,647. (If you do the math, you might think I’m off by 1, but don’t forget that zero is a number too!)
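These limits are easy to check for yourself. A short Python sketch follows; the constant names are my own, and the negative limit assumes the usual “two’s complement” way of storing signed numbers, which the text above doesn’t go into.

```python
BITS = 32

locations = 2 ** BITS                  # distinct values 32 bits can hold
largest_unsigned = locations - 1       # counting starts at zero
largest_signed = 2 ** (BITS - 1) - 1   # one bit is spent on the sign
smallest_signed = -(2 ** (BITS - 1))   # assumes two's complement storage

print(f"{locations:,}")          # 4,294,967,296
print(f"{largest_unsigned:,}")   # 4,294,967,295
print(f"{largest_signed:,}")     # 2,147,483,647
print(f"{smallest_signed:,}")    # -2,147,483,648
print(locations // 2 ** 30)      # 4 (gigabytes, at one byte per location)
```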

Here’s something kind of fun to know about 32-bit numbers: UNIX-based computers calculate dates and times as a number of seconds before or after the “UNIX epoch”. UNIX was invented in the early 1970s, so they decided that 0 should represent midnight on January 1, 1970, in UTC (the time zone of Greenwich, England).

Sometimes you might come across a strange date on a computer: December 31, 1969. Where’d that come from? Well… if time is stored in seconds, and 0 is midnight UTC on January 1, 1970, then a missing or zeroed-out time displayed anywhere west of Greenwich (all of the Americas, for example) lands on the evening of December 31, 1969. And −1, a value programs often use to signal an error, works out to 11:59:59 PM that same night.

The downside of storing dates and times as number of seconds before or after midnight on the first day of 1970 is that it doesn’t really let you go very far. Specifically, it gets you a bit more than 68 years in either direction.

The Y2K38 bug

Many computer programs in the 20th century were written using 2-digit years instead of 4-digit years. This was to save space when memory and hard disks were really expensive. They’re not anymore. But as the year 2000 approached, a lot of computer programs were still only using 2 digits for the year, and everyone became worried about the “Y2K” bug. What would happen when the year rolled over from 99 to 00? People worried that computer systems would think it was 1900. Lots of people spent lots of weeks and months rewriting computer software to avoid the bug, and in the end it turned out not to be much of a problem at all.

But UNIX systems have a different problem. Those that store the time as a 32-bit number can only count seconds for 68 years after 1970 before they run out of numbers. The exact date when 32-bit dates will break is January 19, 2038.
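You can find the exact breaking point by adding the largest signed 32-bit number of seconds to the start of 1970. The sketch below uses Python’s standard datetime module; the constant names are just labels I chose for this example.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
MAX_32 = 2 ** 31 - 1     # largest signed 32-bit value
MIN_32 = -(2 ** 31)      # smallest signed 32-bit value

last_moment = EPOCH + timedelta(seconds=MAX_32)
first_moment = EPOCH + timedelta(seconds=MIN_32)

print(last_moment.isoformat())    # 2038-01-19T03:14:07+00:00
print(first_moment.isoformat())   # 1901-12-13T20:45:52+00:00
```

One second after that last moment, a 32-bit counter has nowhere left to go.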

Fortunately, many if not most UNIX systems are now running on 64-bit processors. These processors can handle numbers up to 2⁶⁴. You might think that’s only buying us another 68 years, but remember how quickly exponents grow. 2³² is a little over 4 billion, but 2⁶⁴ is so big it’s usually written in scientific notation, as 1.84 x 10¹⁹. That’s a little over 18 quintillion. It’s enough to allow a 64-bit processor to access 18 exabytes (18 billion gigabytes) of memory. And 2⁶⁴ seconds is about 584 billion years, so even with half of that range set aside for dates before 1970, UNIX won’t run out of seconds to count for roughly 292 billion years!
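Here is the back-of-the-envelope math behind those year counts, as a Python sketch. It assumes an average year of 365.25 days, which is my own simplification and close enough for numbers this large.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60   # average year, counting leap days

years_32 = 2 ** 31 / SECONDS_PER_YEAR          # signed 32-bit range, one direction
years_64_total = 2 ** 64 / SECONDS_PER_YEAR    # every value a 64-bit number can hold
years_64_signed = 2 ** 63 / SECONDS_PER_YEAR   # signed 64-bit range, one direction

print(f"{years_32:.2f} years")                          # about 68
print(f"{years_64_total / 1e9:.1f} billion years")      # about 584
print(f"{years_64_signed / 1e9:.1f} billion years")     # about 292
```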

IP addresses

32-bit addressing is used somewhere else that anyone who’s spent much time online has encountered: IP addresses. Every device connected to the Internet is assigned an IP address. Much like memory addressing within a computer, IP addresses allow different devices connected to the Internet to communicate with each other.

Since an IP address is a 32-bit number, there are a little over 4 billion possible IP addresses available. With over 7 billion people in the world, we’re at risk of running out! IPv4, the version of IP addresses we are all most familiar with, has this limitation, but a newer protocol called IPv6 allows for 2¹²⁸ addresses. That’s an incomprehensibly huge number. But since IPv4 and IPv6 are not compatible with each other, the transition has been slow. Just like with the 4 GB memory limitation of 32-bit computers, some tricks have been developed to get around this limitation, namely Network Address Translation, which allows all devices connected to a local network to share the same “external” IP address while having different “internal” IP addresses within the network. A router is a specialized computer that creates a bridge between these internal and external networks.

An IP address could be represented in many different ways. Since it is really a 32-digit binary number, it could be represented as a 32-digit string of 0s and 1s. Or it could be represented as eight hexadecimal digits. But in practice we represent an IP address as a group of four decimal numbers, separated by periods. 32 bits is equal to 4 bytes, and each of the four numbers in an IP address is one of those bytes. That’s why each of the numbers in an IP address is in the range of 0 to 255.
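To see all of those representations side by side, here is a sketch using Python’s standard ipaddress module. The address 192.168.1.10 is an arbitrary example I picked; any IPv4 address would work the same way.

```python
import ipaddress

addr = ipaddress.IPv4Address("192.168.1.10")   # arbitrary example address

as_int = int(addr)                    # the address as one 32-bit number
as_binary = format(as_int, "032b")    # 32 binary digits
as_hex = format(as_int, "08X")        # 8 hexadecimal digits
octets = list(addr.packed)            # the 4 bytes behind the dotted numbers

print(as_int)      # 3232235786
print(as_binary)   # 11000000101010000000000100001010
print(as_hex)      # C0A8010A
print(octets)      # [192, 168, 1, 10]
```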

But not every one of those values can be assigned to a device, because certain numbers have special roles in IP addressing. Something called the subnet mask is used by routers to determine whether a given IP address is within their own networks or not. Subnet masks tell the router which part of an IP address identifies the network and which part identifies a device on it. A subnet mask of 255.255.255.0 tells the router that any IP address that has its first three numbers the same as the router’s own IP address is on the local network, and only the last number is used to separate the devices on the network. On such a network the last number can’t be 0 (which names the network itself) or 255 (the “broadcast” address that reaches every device), so at most the router’s network can support 254 devices, including the router itself.
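The subnet mask check boils down to a bitwise AND. Here is a sketch with Python’s ipaddress module; the network 192.168.1.0, the router address 192.168.1.1 and the function name are all assumptions made up for this example.

```python
import ipaddress

network = ipaddress.ip_network("192.168.1.0/255.255.255.0")
mask = int(network.netmask)

def on_local_network(address: str, router: str = "192.168.1.1") -> bool:
    """Keep only the bits the mask covers, then compare with the router's."""
    a = int(ipaddress.IPv4Address(address))
    r = int(ipaddress.IPv4Address(router))
    return (a & mask) == (r & mask)

usable = list(network.hosts())   # leaves out the network and broadcast addresses

print(network.network_address, network.broadcast_address, len(usable))
# 192.168.1.0 192.168.1.255 254
print(on_local_network("192.168.1.77"), on_local_network("192.168.2.77"))
# True False
```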

Some IP address “blocks” are reserved for use within internal networks. This allows huge numbers of devices to use the same IP addresses — not conflicting with each other because they’re only using them internally — and is the main way we can manage to have far more than 4 billion devices connected to the Internet at once around the world.

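The IP blocks reserved for internal network use are:

10.x.x.x
172.16.x.x through 172.31.x.x
192.168.x.x

A fourth block, 169.254.x.x, is reserved for “link-local” addresses, which a device assigns to itself when it can’t get an address from the network.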
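If you want to check which kind of address you are looking at, a sketch like this one will do it. The block lists below follow the standard private ranges (10.0.0.0/8, 172.16.0.0/12 and 192.168.0.0/16) plus the 169.254.0.0/16 link-local range; the function name and labels are my own.

```python
import ipaddress

PRIVATE_BLOCKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),    # 172.16.x.x through 172.31.x.x
    ipaddress.ip_network("192.168.0.0/16"),
]
LINK_LOCAL = ipaddress.ip_network("169.254.0.0/16")

def classify(address: str) -> str:
    """Label an IPv4 address as private, link-local or public."""
    ip = ipaddress.IPv4Address(address)
    if any(ip in block for block in PRIVATE_BLOCKS):
        return "private"
    if ip in LINK_LOCAL:
        return "link-local"
    return "public"

for example in ["10.1.2.3", "172.20.5.1", "172.32.0.1", "169.254.10.10", "8.8.8.8"]:
    print(example, classify(example))
```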

Domain names

One final thought relating to IP addresses, but a bit off topic from the “bits” where we started:

IP addresses are hard to remember.

In the early days of the Internet, someone (Paul Mockapetris) realized this and invented DNS — the Domain Name System.

Domain names are easy to remember, like “google.com”. The underlying system that makes them work is a network of domain name servers located all around the world, which keep a set of “zone files” that store information about which IP addresses each of these domains correspond to. Any time you type a URL in your web browser, the first thing that happens is your computer looks up the domain name portion of the URL.

It first checks its own cache of DNS information, and if you haven’t already visited this domain recently, it looks it up with a DNS server that you’re configured to use. Your DNS server may be the “authority” on some domains, and it also keeps its own cache of domains that you or anyone else who’s using it has recently looked up. If the domain name is in its cache, it tells your computer. If not, the DNS server sends a request to the authority server for the domain, gets the IP, stores it in its cache, and sends it to your computer, which also stores it in its cache.

All of this happens in a tiny fraction of a second. And once it’s done, your computer knows the IP address of the computer it’s trying to talk to, and it sends a request for the web page (for example) to that IP address, and the computer on the other end responds with the requested information.
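The “check the cache first, ask a server only if you have to” idea can be sketched in a few lines of Python. This is a toy illustration, not how any real operating system or DNS server is written: the function names are hypothetical, real caches also expire their entries after a while, and the demonstration below uses a stand-in resolver with a documentation-only IP address so that it runs without touching the network.

```python
import socket
from typing import Callable, Dict

def make_cached_lookup(resolve: Callable[[str], str] = socket.gethostbyname):
    """Return a lookup function that remembers every answer it has seen."""
    cache: Dict[str, str] = {}

    def lookup(domain: str) -> str:
        if domain not in cache:              # not visited recently: go ask
            cache[domain] = resolve(domain)
        return cache[domain]                 # otherwise answer from memory

    return lookup

# Stand-in resolver so the example does not depend on a network connection.
questions_asked = []

def pretend_dns_server(domain: str) -> str:
    questions_asked.append(domain)
    return "203.0.113.7"                     # address reserved for documentation

lookup = make_cached_lookup(pretend_dns_server)
print(lookup("example.com"))   # 203.0.113.7 (asked the "server")
print(lookup("example.com"))   # 203.0.113.7 (answered from the cache)
print(len(questions_asked))    # 1
```

Calling make_cached_lookup() with no argument would use the real system resolver instead.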

The Computer Course: An Introduction

This summer, my 11-year-old son is starting an informal internship here at Room 34 Creative Services, where I will be giving him some hands-on experience building websites, and also giving a series of — hopefully interesting — lectures on the basics of various aspects of computing.

Last Friday was his first day, and also the first lecture. Entitled “Bits”, it was an introduction to the most basic elements of computing — binary switches, or bits — and following through the mathematical implications of binary numbers to the specific applications of hex code colors in HTML/CSS and 32-bit IP addresses — along with a supplementary discussion of domain names and how they relate to IPs. (That supplementary discussion being the original intended topic of the whole lecture, but I digressed… or, I guess, regressed.)

I have decided to begin a companion series of blog posts, grouped under the new category The Computer Course, not replicating word for word the lectures themselves (since the lectures are mostly off the cuff), but using the same basic outlines as the lectures, and adapting the content in a way that suits the medium.

The Outside Scoop: Thoughts on Android Wear and a possible iWatch

The big news in tech today is Google’s announcement of Android Wear, a version of their Android OS specifically optimized for “wearables” like watches.

The tech media is erupting with ridiculously titled blog posts that refer to this as Google’s “answer” to the iWatch, a product that Apple has not announced, nor even acknowledged working on.

Surprisingly, for the first time I actually found one of these wearables mildly interesting, the Moto 360. But I am still skeptical of wearables in general, smart watches in particular, and especially the idea that Apple is working on one. But I’ve learned from my past mistakes, when I was convinced Apple was neither working on a smartphone in late 2006 nor a tablet in late 2009. So, in my world at least, my adamant belief that Apple is not developing a watch should probably be my biggest clue that they are.

So where is Apple’s “iWatch”? Aren’t all of these competitors eating Apple’s lunch (before it’s even cooked)? Perhaps. But consider this:

Remember the original iPod. It came into a market that already existed (but sucked), and delivered a radically superior user experience, and was a huge hit. Remember the iPhone. Once again, it came into a market that already existed (but sucked) and totally revolutionized it.

The thing is… a smart watch market doesn’t really exist (or didn’t when rumors of an “iWatch” first started to circulate). It almost seems like Apple got the wheels of the rumor mill turning deliberately, to goad their competition into creating the market, thinking they were beating Apple to the punch but in fact creating the exact environment of suck Apple needs to release a product into.

What’s that close paren doing after my video embed in WordPress?

Working on a new client site that has a lot of YouTube video embeds, I was alarmed this morning to find a stray close paren ) character in all of the posts right after the videos.

Knowing I had recently been tampering with the embed_handler_html and embed_oembed_html filters in the site theme, I figured it was something I had created. So I set about debugging my code but couldn’t get anywhere.

I decided to see if it was in fact a new problem in WordPress itself, by setting up a test post on this site with a YouTube video embed (this, of course). Sure enough, even on my unadulterated theme, the stray close paren appears.

Look at it!!! I mean, just look at it!!!

Anyway, I hope/assume this will get fixed in the WordPress core soon, but in the meantime if you are running into this problem and want a quick fix, and you’re not afraid of editing the functions.php file in your theme, have a go at this little addition that will strip out the offending punctuation:

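// Strip the stray ")" that follows the embedded video's closing </iframe> tag.
function embed_fix_stray_parens($content) {
    return str_replace('</iframe>)', '</iframe>', $content);
}
// Run the fix on post content before it is displayed.
add_filter('the_content', 'embed_fix_stray_parens');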

Update #1: I went to submit a bug report to the WordPress development team and found my report was a duplicate of this one. If I understand correctly, the close paren is actually being delivered by YouTube itself, not WordPress, via the oEmbed request. Isn’t the Internet fun?

Update #2: This really is YouTube’s problem… it even shows up in the embed code you get on their own site:

[Screenshot: YouTube embed code, March 12, 2014]

This issue is also showing up on StackOverflow now, including a more efficient temporary workaround for WordPress sites than my own hasty solution.

Last, This, Next

As I was folding a week’s subset of my embarrassingly large collection of printed t-shirts, I reflected momentarily on the history of my pixelated Minnesota t-shirt. I bought that t-shirt last summer and wore it each time I went to the Minnesota State Fair last year, as my symbol of “Minnesota pride”.

Then I started thinking about sharing this story, and about referring to the Minnesota State Fair that took place in 2013 as the “last” Minnesota State Fair, and how the one that will take place “this” year, in 2014, is “this” State Fair, and so on.

Frequently conversations between SLP and me have resulted in confusion based on the different possible interpretations of “last”, “this” and “next” when referring to days, weeks, months, years or events. I tend to use “this” when I’m referring to any unit of time that occurs within the same larger unit of time, whether before or after the current one, although I may be likely to omit “this”.

For example, today is Thursday. The Super Bowl (or, if you prefer, the Superb Owl) is happening in 3 days. It’s happening “this Sunday”. But what if today was (or is it “were”? I never get that right, either) already “Superb Owl Sunday” and I was (“were”?) talking about the 5K race I’m running in 7 days later? “This” Saturday seems a bit far off in that case. But “next” Saturday doesn’t feel right to me either. Or does it? Is it better for “next” Saturday to refer to a day that’s 6 days away, or 13?

As for my confusion with SLP, the fact that she lived her life on the September-to-June academic calendar for much longer than I did only exacerbated the situation. I’ve always been a stickler (to the point of ridiculousness) for precision in dates. The first day out of school isn’t the beginning of summer; the solstice is. The first day back in school isn’t the beginning of fall; the equinox is. And the first day back in school in late August or early September most definitely is not the beginning of the new year. (Although yes, Rosh Hashanah usually does occur in September so depending on the calendar you use, there’s an argument to be made.)

Ironically, it was only after SLP stopped organizing her life around the academic year that I embraced calling any of the days in early-to-mid June when our kids are out of school (but which are still technically in spring) “summer”, but I will never give up the idea that “this year” refers to the 4-digit number starting with a “2” that comes at the end of the current date. “This year,” to me, means January 1 to December 31. Period.

But what do I mean when I say “this winter”? Sure, winter technically only starts about 10 days before the new year, so it’s almost entirely in 2014. But let’s be honest. In Minnesota, “winter” usually starts in early December, or sometimes as early as October. By my logic, “winter” in Minnesota begins on whatever day snow falls and doesn’t melt away. We had a few light snows in November, but “this winter” began on December 2, 2013.

My point is: language is fuzzy. Assigning vague labels like “last”, “this” and “next” to our days and events relies on a great deal of tacit agreement between ourselves over meaning. This particular quirk of our language has been causing me trouble since I was a kid. Back then I had a lot of time, sitting around bored in school (which I didn’t even realize was the case until much later in life), to ponder and obsess over and get annoyed by things like this. I was trying to create in my mind a world of precision and clarity that didn’t, and couldn’t, exist. Our minds don’t work that way, the world doesn’t work that way, and language, a product of our minds used to help us understand and communicate with each other about the world, necessarily can’t work that way.

I didn’t understand that then, and I only barely do now. Each of us carries around an entire universe in our mind. It’s built on a foundation laid by our genes and constructed around our experiences — and our interpretations of those experiences. Our language can only achieve an approximation of a fraction of that universe, and we have to rely on the assumption that our own version of the language we use is a close enough approximation of the same things in our own mental universe as the language, and the mental universe it represents, of the others around us.

It’s a wonder we can communicate at all.