From the Stupid PHP Tricks files: rounding numbers and creeping inaccuracy

This morning as I walked to the studio I was doing what geeks do best: pondering a slightly esoteric mathematical quandary.

Glass Half Full by S_nova

Ingraining the American spirit of optimism at a young age, and under dubious circumstances, our schools always taught rounding numbers in a peculiar way. You always round your decimal values to the nearest integer. That part makes sense. But what if the decimal is .5, exactly half? In my education, at least until late in high school (or was it college?), we were always taught to round up! The glass is half full. Optimism.

Eventually — far later than it should have been, I think — the concept was introduced that always rounding .5 up is not really that accurate, statistically speaking. It might be nice in the case of a single number to be an optimist and think a solid half is good as a whole, but in aggregate this thinking introduces a problem.

If you have a whole lot of numbers, and you’re always rounding your halves up, eventually your totals are going to be grossly inaccurate.

Of course, the same would happen if you were ever the pessimist and always rounded down.
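To put a rough number on that drift, here is a minimal sketch in Python (not the PHP script described later in this post). It assumes values with one decimal place whose last digit is equally likely to be anything from 0 to 9; the function name is made up for illustration.

```python
from fractions import Fraction

def mean_rounding_error(half_goes_up: bool) -> Fraction:
    """Average of (rounded - original) over the ten possible tenths digits."""
    total = Fraction(0)
    for digit in range(10):
        original = Fraction(digit, 10)      # fractional part: 0.0 through 0.9
        if digit > 5 or (digit == 5 and half_goes_up):
            rounded = Fraction(1)           # rounds up to the next integer
        else:
            rounded = Fraction(0)           # rounds down
        total += rounded - original
    return total / 10

always_up = mean_rounding_error(half_goes_up=True)
always_down = mean_rounding_error(half_goes_up=False)
print(always_up, always_down)  # prints: 1/20 -1/20
```

Under that assumption, each number adds an average of 0.05 to the total when halves always go up, and subtracts the same amount when they always go down. Across 10,000 numbers that is a drift of roughly 500 in one direction or the other.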

The solution, I later learned, was to round halves up or down, depending upon the integer value that precedes them. Which way you go doesn’t really matter, as long as you’re consistent, but as it happens, I learned it as such: if the integer is odd, round up; if it is even, round down.
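The four ways of handling a half can be sketched as follows. This is an illustration in Python, not the code behind this post: `round_tenths` and its rule names are hypothetical, and values are assumed to be non-negative and passed as whole tenths (25 means 2.5) so that floating-point representation never decides what counts as "exactly half."

```python
def round_tenths(tenths: int, rule: str) -> int:
    """Round a non-negative value given in tenths to the nearest integer.

    rule decides what happens at exactly .5:
    "up", "down", "even" (odd integers go up, even stay), or "odd" (the reverse).
    """
    whole, digit = divmod(tenths, 10)
    if digit < 5:
        return whole
    if digit > 5:
        return whole + 1
    # Exactly half: apply the chosen tie-breaking rule.
    if rule == "up":
        return whole + 1
    if rule == "down":
        return whole
    if rule == "even":
        return whole + 1 if whole % 2 == 1 else whole
    if rule == "odd":
        return whole if whole % 2 == 1 else whole + 1
    raise ValueError(f"unknown rule: {rule}")

for rule in ("up", "down", "even", "odd"):
    # 2.5 and 3.5 are the interesting cases; 2.4 and 2.6 never depend on the rule.
    print(rule, [round_tenths(t, rule) for t in (24, 25, 26, 35)])
```

The rule I learned is the "even" one here: 2.5 becomes 2 and 3.5 becomes 4, so the result of rounding a half is always an even integer.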

In my work, I write a lot of PHP code. Most of it is of the extremely practical variety; I’m building websites for clients, after all. But every once in a while I like to indulge my coding abilities in a bit of frivolous experimentation, and so today I produced a little PHP script that generates 10,000 random numbers between 1 and 100, each with one decimal place. It shows the actual sum and average of those numbers, along with the sum and average you get if you round all 10,000 numbers to whole integers by each of the methods described above. Try it for yourself!

Any time the rounded average is different from the “precise” (and I use that term somewhat loosely) average, it is displayed in red. Interestingly, and not at all surprisingly, when you always round halves in one direction or the other, at least one of those directions will (almost) always yield an incorrect average. Yet if you use the “even or odd” methods, both of those methods will almost always yield a correct average.
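For anyone who wants to repeat the experiment without the PHP script, here is a rough Python equivalent. It is a sketch under assumptions, not a port: the seed is arbitrary, the numbers are kept as whole tenths (10 to 1000, meaning 1.0 to 100.0), and `round_tenths` is a hypothetical helper, not anything from the original script.

```python
import random

RULES = ("up", "down", "even", "odd")

def round_tenths(tenths, rule):
    """Round a value given in integer tenths (25 means 2.5) to a whole number."""
    whole, digit = divmod(tenths, 10)
    if digit != 5:
        return whole + (1 if digit > 5 else 0)   # not a half: nearest integer
    if rule == "up":
        return whole + 1
    if rule == "down":
        return whole
    if rule == "even":
        return whole + whole % 2                 # odd goes up, even stays
    if rule == "odd":
        return whole + 1 - whole % 2             # even goes up, odd stays
    raise ValueError(f"unknown rule: {rule}")

rng = random.Random(34)  # arbitrary seed, only so reruns give the same numbers
values = [rng.randint(10, 1000) for _ in range(10_000)]
halves = sum(1 for v in values if v % 10 == 5)
exact_sum = sum(values) / 10
sums = {rule: sum(round_tenths(v, rule) for v in values) for rule in RULES}

print(f"exact sum {exact_sum:.1f}, average {exact_sum / len(values):.4f}, halves {halves}")
for rule in RULES:
    total = sums[rule]
    print(f"half {rule:<4}: sum {total}, average {total / len(values):.4f}, "
          f"drift {total - exact_sum:+.1f}")
```

The "up" and "down" sums always differ by exactly the number of halves in the list, while the "even" and "odd" sums land somewhere in between, which is why their averages tend to stay closer to the exact one.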

It’s all about the aggregate.

Fun with site usage stats, part two

Back in February, I wrote about web browser usage by visitors to my site. Some of the discussion over my recent redesign has prompted me to do it again. Here we go!

Web Browsers


Compare to last time: Firefox has jumped from 34% to 47%. That gain has come at the expense of both Safari and IE, which have dropped from 33% to 27% and from 28% to 17%, respectively. (Note, of course, that I’m rounding the actual percentages to whole numbers because talking about “16.88%” makes me feel like Spock on Star Trek, and I’m enough of a geek without that.)

Also worth noting: Chrome. It is stuck in fourth place, but its share has jumped by 4.1 percentage points, from 1.44% to 5.54%. (OK, in this instance I needed to Spock it up a bit.)
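A change in share can be stated two ways, and they are easy to mix up: the difference in percentage points, and the relative growth of the share itself. A small Python sketch using the Chrome figures above (the function name is hypothetical):

```python
def share_change(old_pct, new_pct):
    """Return (change in percentage points, relative change in percent)."""
    points = new_pct - old_pct
    relative = points / old_pct * 100
    return points, relative

points, relative = share_change(1.44, 5.54)
print(f"{points:.2f} percentage points, {relative:.0f}% relative increase")
# prints: 4.10 percentage points, 285% relative increase
```

So Chrome gained about 4.1 points of share, which means its slice of the traffic is nearly four times what it was in February.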

Operating Systems


Once again, as a Mac user who also (unfortunately, despite my feeble efforts at self-promotion) represents a hugely disproportionate amount of the total traffic, I’m skewing the results here a bit. Still, I have not significantly altered my own usage of the site since February, but in that time Windows has nonetheless dropped from 56% to just under 50% of my total traffic, while the Mac has gone from 29% to 43%. Interestingly, in February, iPhone/iPod represented over 12% of the traffic but now they’re just over 4%. Linux has stayed pretty even, in between 2 and 3%.

OS/Browser Combinations


In February, IE/Windows was the dominant combination, at 28%. Now it has dropped to fourth place, at 17%. Firefox/Windows has gone from #2 to the top spot, even though it just inched up from 25% to 26%. Safari/Mac and Firefox/Mac each went up a spot as well, moving into second and third, and going from 21% to 24% and from 8% to 18%, respectively.


This is far too small and skewed a sample to say a whole lot about trends on the Internet as a whole, but what I’m seeing here overall is that Mac usage vs. Windows is up, and Firefox usage vs. anything else is also way up. Specifically I’m seeing a significant surge in Firefox/Mac… which may suggest, I suppose, that I have been visiting the site a lot more lately than I did in February. Or maybe not.

It’s also worthwhile to look at the raw total numbers in the traffic. In the time between then and now I’ve split up into a number of separate sites. The totals back in February were across the board on; for October we’re looking at stats strictly from The date range is the same: 30 days. (The original data was from January 19 to February 18; the new data is from September 20 to October 20.)

Back in February, the data I analyzed represented 2,845 unique visits to my site. This month’s data represents 3,810 visits, an increase of 965, or 34%. Since the old stats included visits to a lot of pages that are now parts of other sites, the increase in blog traffic is even greater. So while it’s probably true that I’ve been spending more time looking at the blog myself in the past month, vs. February (considering I just did a redesign this weekend), the majority of the traffic increase is most likely not from me. In fact, it’s quite likely that my own percentage of the total traffic is quite a bit less than it was in February.

Traffic here spiked on October 13-14, when I posted a reply to Derek Powazek’s blog on SEO. Visits to that single page, just on October 13, represent more than 10% of the total traffic the entire site saw all month.

Let’s take a look at the OS/browser breakdown for just that one day, October 13, 2009:


The traffic from this one date was likely responsible for some overall skewing of the totals. Derek Powazek’s blog appeals most strongly to Mac users, which would explain why the Mac/Safari combination is in the top spot (Safari being far more popular in general on Macs than Firefox, for the same reason IE dominates Windows: it comes with the OS).

Lessons to be learned? Well, if I want traffic, I should write about SEO. The SEO bots (both human and software) seem to love it. But beyond that, I think there probably is some valid evidence here that there’s some real movement in the directions of both Mac and Firefox. Something that sits just fine with me!

Final Thought

What’s the deal with this “Mozilla Compatible Agent” on iPhone and iPod? I haven’t seen that before, but I assume it’s one of two things:

1. A Mozilla-derived alternative to Mobile Safari, available only on “jailbroken” iPhones.
2. An embedded client in an app like Facebook, which allows you to view web pages without leaving the app.

I’m inclined to guess that #1 is correct. I’d be surprised if any Apple-approved apps were running a Mozilla-based web browser; it seems it would be far easier and more logical to develop legit apps using the official WebKit/Mobile Safari engine. I haven’t seen any hard numbers (nor do I think it would be possible to obtain them) on the percentage of iPhones in use that are jailbroken, but if this assumption is correct, and we can assume that the ratio of “Mozilla Compatible Agent” to Safari on the iPhone/iPod platform represents at least the percentage of iPhones that are jailbroken (since I’d assume some jailbroken iPhone users still use Mobile Safari), then the numbers are staggering indeed.

However… given the fact that over 8% of the total traffic on October 13 came from this user agent, and I myself visited the site numerous times on that day from my (non-jailbroken) iPhone, to monitor and respond to comments, I suspect a much more innocuous explanation. But a brief yet concerted effort to find an explanation on Google turns up nothing. Anyone in the know out there care to shed some light on the situation?

Yes, it has been colder in Minneapolis this summer… except when it wasn’t

There’s a bit of a brouhaha afoot with regard to our weather in Minnesota this summer, and whether it proves or disproves climate change.

A good summary of the “debate” appeared yesterday on Alas!

It started with a Minneapolis-based wingnut blogger relying on anecdotal evidence to prove… something.

Statistics guru Nate Silver responded with a bunch of boring old facts that dispel the argument of a colder-than-normal summer.

I just have a few comments to add to the fray:

1. If climate change is real (and it’s pretty much impossible for an honest, rational person to deny at this point), anecdotal evidence of a chilly month of July in one city doesn’t do anything to disprove it. And if you’re not looking at hard numbers, it’s easy to endure this cold July and forget just how hot it really was at the end of June.

2. Rising global temperatures associated with climate change emphatically do not mean that the resulting weather change in any particular location will manifest as a simple 2-3 degree temperature increase, and identical weather as before. In fact what it means is that global weather patterns will change significantly, and unpredictably, with some parts of the globe experiencing significantly hotter temperatures, some cooler, and more severe weather events occurring in more places than before.

Forget red state/blue state: it’s really red browser/blue browser

Sean Tevis browser stats

Anyone who’s read this blog for any period of time knows my political leanings pretty well. I’m about as liberal as they come in this country (which means I’m probably middle-of-the-road anywhere else). And the same reader(s) probably also know(s) how I feel about Internet Explorer 6.

Well it’s interesting to see that there seems to be a correlation between political viewpoint and web browser usage. As (almost) always, this comes from Daring Fireball. We’re looking at the decidedly non-traditional campaign blog of Kansas Democrat Sean Tevis. His campaign did a survey that, among other things, discovered that users of outdated browsers like Internet Explorer 6, AOL, “Don’t Know” and “No Internet” preferred, strongly, his Republican opponent, while users of Firefox, Chrome, Opera and Safari preferred Tevis. Interestingly, IE 7/8 users slightly favored Tevis.

It would be interesting to see the raw numbers, rather than just percent deviation, to get a sense of the relative proportions of the electorate who fell into each category, especially considering that Tevis apparently lost, by a small margin.

It’s also interesting to look at the strength of each group’s leanings. Those who most strongly favored the Republican candidate were the AOL users and non-Internet users, a.k.a. the Luddites. Chrome users (all on Windows) were the strongest Tevis supporters, followed by Safari (presumably all or nearly all Mac) users. Firefox users were slightly weaker supporters of Tevis. This makes sense to me in that I suspect there’s a high correlation between “average” Mac users (who almost all use Safari, just like most “average” Windows users run IE) and Democratic leanings, whereas users of Firefox (and of open source software in general) are as likely (or more so) to be libertarian as liberal. Opera… well… I don’t know. Contrarians?

That IE 7/8 users slightly favored Tevis is most interesting to me. IE 7/8 represent by far the largest percentage of the Internet-using population. And the country as a whole moved slightly in the Democrats’ direction in the 2008 election. But Kansas is far more conservative than the US populace as a whole; combine that with the “No Internet” crowd, and a small margin of victory in favor of the Republican candidate makes sense.

P.S. Sean Tevis for President 2016.

A shout out to my international visitors, or at least their automaton surrogates

It’s been fun to study the data collected by Google Analytics about visitors to my site. It’s not terribly surprising when looking at the world map that the United States is dark green and all of the rest of the countries are either light gray (no visits) or very pale green (a few visits). Frankly, I’m quite surprised though that most of the countries are the pale green. Pretty much the only gray on the map is the majority of Africa (all but six countries), the cluster of former Soviet republics between Russia and Pakistan, Mongolia, a couple of smaller South American countries, and, vastly over-represented by Google’s use of Mercator projection, Greenland.

That’s pretty amazing. Nearly 2,500 (68%) of the slightly more than 3,700 visits my site received in the last month were from the United States, with the fairly obvious (for language reasons, if nothing else) U.K. and Canada following at 200 (5%) and 160 (4%) visits, respectively, and Australia in fifth place with 63 visits.

Among non-Anglophone nations, France was first, and fourth overall. Again, not terribly surprising. What is surprising is the sixth-place country: Poland, ahead of Germany by 10 visits. I’ve been to Poland. I enjoyed my visit; it’s a fine place; but I just didn’t expect much site traffic from there. Brazil, Italy and Spain round out the top ten countries with a combined total of 120 visits.


Looking at the top ten cities was even more surprising, to some extent. Well, OK, the top five cities were not surprising at all: Minneapolis, New York, Chicago, London and San Francisco. (I guess London was a little surprising, as the fourth most frequent source of visitors to my site. But, you know, it’s a big city.)

It was cities number eight and nine that really surprised me: La Victoria, Peru and Kissimmee, Florida. What? Well, OK. I wouldn’t be surprised if the Kissimmee visits were entirely due to this, but I’m at a loss as to what it might be about my site that is so uniquely appealing to the residents of a district in Lima. If you live in Peru, please share!

No one city jumps out from Poland in the same way. My popularity there is far broader! But no more easily explained.


Sadly, though, all of this enthusiasm over my burgeoning international popularity fizzled when I took a close look at the stats for a particular country: China. I had a mere 7 visits from the world’s most populous nation. But given that country’s reported restrictions on access to the Internet, the low number is not so surprising. What is revealing, though, is the duration of the visits. All but one of them were for precisely the same amount of time: 0 seconds. One determined soul in Shanghai did actually spend 19 minutes on 3 of my pages, but the rest were blips too small to measure. Which suggests to me that either Chinese web surfers are experts at fearfully clicking away in an instant from questionable online subject matter, or these visits were not from humans at all, but spider bots.
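The same check is easy to run on any list of visit durations. A minimal Python sketch using only the numbers above (six visits of 0 seconds and one of 19 minutes); the function name is hypothetical, and a zero duration is treated here as a hint of automated traffic, not proof.

```python
def zero_duration_share(durations_seconds):
    """Fraction of visits whose recorded duration is exactly 0 seconds."""
    zero = sum(1 for d in durations_seconds if d == 0)
    return zero / len(durations_seconds)

# The seven visits from China described above: six blips and one 19-minute visit.
china_visits = [0, 0, 0, 0, 0, 0, 19 * 60]
share = zero_duration_share(china_visits)
print(f"{share:.0%} of visits lasted 0 seconds")  # prints: 86% of visits lasted 0 seconds
```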

I suspect if I were to dig deeper into these international visits (as well as some in the U.S., particularly from San Francisco), I would find that many if not most of them are from search engine spiders simply undertaking the thankless task of indexing my site for the benefit of Internet users in their countries who are thoroughly indifferent to my unengaging drivel.