From kottke.org: The country’s new robots.txt file.
Well, it’s probably easy to read too much into this. But the short story of it is that President Obama’s new whitehouse.gov website is blocking a lot less content from search engine “spiders” than ex-President Bush’s did (oh, that has a sweet ring to it).
Now there are plenty of reasons for putting things into your robots.txt file, and most of them have nothing to do with trying to withhold information from the public.
It’s rather odd, though, the set of directories Bush’s site was blocking from the spiders. I find “help” especially amusing. The others aren’t quite so funny. What exactly about “omb” (Office of Management and Budget) did they need to hide? And… uh… well… “911” kind of goes without saying.
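For anyone who hasn’t poked at one of these files before, here is a rough sketch of how a well-behaved spider reads Disallow rules, using Python’s standard urllib.robotparser module. The rules below are made up for illustration: they borrow the three directory names mentioned above, but the exact paths, the test URLs, and the “ExampleBot” name are my assumptions, not a copy of the real whitehouse.gov file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules, loosely modeled on the directory names mentioned
# in the post. This is NOT the actual whitehouse.gov robots.txt.
SAMPLE_RULES = """\
User-agent: *
Disallow: /help
Disallow: /omb
Disallow: /911
"""


def blocked_paths(rules_text, paths, agent="ExampleBot"):
    """Return the subset of `paths` that `agent` is asked not to crawl."""
    parser = RobotFileParser()
    # parse() accepts an iterable of lines, so no network access is needed.
    parser.parse(rules_text.splitlines())
    return [p for p in paths if not parser.can_fetch(agent, p)]


if __name__ == "__main__":
    # The page names here are invented purely for the demonstration.
    candidates = ["/help/index.html", "/omb/budget.html", "/news/"]
    print(blocked_paths(SAMPLE_RULES, candidates))
```

Keep in mind that robots.txt is only a request. It tells cooperative crawlers what to skip, and it does nothing to stop a person from loading the page directly.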
Why block these pages from being indexed by search engines? Good question. And here, I think, is the answer: to make it harder for the average citizen to keep track of changes made to those pages by comparing them against Google’s cached versions (or, perhaps even more damning, the indefinitely archived snapshots on the Wayback Machine).
But it’s a new day. President Obama has promised a much more open and transparent White House, and if the visible underbelly of its website is any indication, he intends to keep his promise.
Also of interest: Here’s a comparison of the old and new whitehouse.gov sites.