Google is no longer a search engine

This is old news, but it’s a useful demonstration of what absolute garbage Google has become as a search engine. It is now an ad engine.

The scenario: I need to set up WordPress Multisite. I’ve done this several times, but since I only have to do the initial setup once every 2-3 years, it’s not something I have memorized. So… I google it! That’s what you do in the 21st century.

So, I went to Google and typed:

'WordPress Multisite installation' Google search

Now, what a smart search engine, one designed for maximum usefulness as a search engine, would do here is return a link to the official WordPress documentation on the topic.

Is that what it returned? Of course not, silly! It returned four ads, which, depending on your window size, could take up the entire screen:

Google ad results

But then, the first “organic” result should be the official documentation, right?

Wrong!

The first organic result is a page from the dreadful wpbeginner.com, a site overflowing with verbose, poorly written, surface-level articles designed not to be genuinely useful but to ensure that Google’s search algorithm places them exactly where it did in these results.

Yes, of course, I clicked the wpbeginner.com link, because I always do, and then got annoyed with myself for falling into their trap. And multilingualpress.org is not much better… and it’s also always near the top of the results.

Then, of course, before we finally get to the page I was really looking for, Google makes one last-ditch effort to keep me from going where I want to go by inserting its “People also ask” block, with quick answers scraped from real websites, designed specifically to keep you from venturing any deeper than Google’s search results page itself.

Thanks, Google, for doing your part to make the Internet suck.

P.S. What do you think happened when I clicked “I’m Feeling Lucky”?

Fun with recycled IP addresses

OK, well that title kind of gives away the end of the story, but it’s still a good one.

So…

Earlier this week I launched a new site for a client. As part of the usual process, I submitted their sitemap.xml file to Google Search Console and Bing Webmaster Tools. Usually that’s all it takes for a new site to get indexed within 1-3 days.

But indexing seemed to be taking longer than usual for this client, so I decided to investigate.

I should note that we did a private “soft launch” of the site about a week prior to the official launch. During that time I had a robots “noindex” directive turned on so it wouldn’t start showing up in search engines prematurely.
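(As an aside, if you ever want to confirm that a noindex directive is actually being served, you can check the response for either an X-Robots-Tag header or a robots meta tag. Here’s a rough sketch of that check in Python; the URL is just a placeholder, not the client’s site, and this isn’t meant as anything more than a sketch.)

```python
# Rough sketch: check whether a URL is currently serving a "noindex" directive,
# either via an X-Robots-Tag response header or a robots meta tag in the HTML.
# The URL below is a placeholder, not the actual client site.
import urllib.request

def has_noindex(url):
    with urllib.request.urlopen(url) as response:
        # Header form, e.g. "X-Robots-Tag: noindex, nofollow"
        header = response.headers.get("X-Robots-Tag", "")
        html = response.read().decode("utf-8", errors="replace").lower()

    header_noindex = "noindex" in header.lower()
    # Crude string check; a proper check would parse the HTML for
    # <meta name="robots" content="noindex">
    meta_noindex = 'name="robots"' in html and "noindex" in html
    return header_noindex or meta_noindex

print(has_noindex("https://example.com/"))  # placeholder URL
```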

I went into Google Search Console to request a re-crawl. And that’s when I noticed this…

Excluded due to 'noindex'

Well, that’s… weird. Not so much that it had read a “noindex” directive when it, unfortunately, crawled the URL just a day before we launched (although it was a bit weird that it had crawled it at all), but that the Referring page was a totally different site, one that had no business linking to us yet.

So then I did what anyone (?) would naturally do: I visited that URL. And much to my surprise, it redirected to our site. What??

Next I used mxtoolbox to do a DNS lookup, and suddenly it all made sense.

We’re hosting the site at Linode. And as it happens, the DNS entry for the referring site points to the same IP address as our site’s. Ours is a virtual private server, so we’re now the only ones using this IP address.
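Here’s roughly what that lookup boils down to: resolve both hostnames and compare the IP addresses. A minimal sketch in Python, with placeholder domain names standing in for the real client and referring sites:

```python
# Minimal sketch of the check: resolve both hostnames and compare the
# resulting IPv4 addresses. Both domain names are placeholders, not the
# actual client site or the old referring site.
import socket

our_site = "example.com"        # placeholder for our client's domain
referring_site = "example.org"  # placeholder for the old referring domain

our_ip = socket.gethostbyname(our_site)
referring_ip = socket.gethostbyname(referring_site)

print(f"{our_site} -> {our_ip}")
print(f"{referring_site} -> {referring_ip}")
print("Same IP address?", our_ip == referring_ip)
```

With the real domains plugged in, both lines come back with the same Linode IP, which is what gave it away.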

But there are a finite number of possible IP addresses, especially IPv4 addresses (2^32 of them, or about 4.3 billion), so they naturally get reused. This particular site was for a limited-use product that was only relevant in 2015, so it’s not too surprising that the owners of the domain took down their Linode server and relinquished the IP address. It’s unfortunate, though, that they didn’t think to remove the DNS entry from their zone file.

At this point, we could (a) contact them and ask them to update their DNS, which could be convoluted and time-consuming for no real benefit to us; (b) set up a rewrite on our server that shunts traffic trying to reach their old product site over to their main site, which would take less time but also wouldn’t really benefit us in any way; or (c) leave it as-is, and let the few randos who are still looking for a product that was last relevant during the Obama administration wonder why they’re seeing our site instead.

I’m going with (c).

I’m also going with submitting re-crawl requests to both Google and Bing so we can get in the priority queue, and hopefully by this time tomorrow the site will be showing up in search results.