How to purge fake/bot WooCommerce customer accounts directly in the MySQL database

DANGER! If you don’t know the havoc one can wreak with a DELETE statement in MySQL, stop right here. I take no responsibility for what you might do with the information that follows.


Bots like to create fake customer accounts on WooCommerce (WordPress) sites, apparently. What they’re attempting to do, I don’t know. But if you don’t stay on top of things, you might find you have thousands of fake customer accounts in your site. Chances are they haven’t, won’t, and can’t actually cause any damage, but they’re cluttering things up, and any unnecessary user account in a WordPress database represents a potential future security risk.

On a particular client’s heavy-traffic WooCommerce site, I discovered that over the course of the site’s 7-year lifespan it had accumulated nearly 8,000 such accounts, and I wanted to be rid of them.

After carefully exploring the data surrounding a few of these obviously fake accounts, I determined a pattern, and came up with a fairly cautious set of conditions that, to me, indicated a customer was fake:

  1. They had the customer role.
  2. Their user account had nothing in the First Name or Last Name fields.
  3. Likewise, their user account had nothing in the Billing First Name or Billing Last Name fields. (If you’re feeling extra draconian, you might skip this one.)
  4. They had never placed an order while logged in — their user ID did not have an _order_count entry in the wp_usermeta table. Which is perhaps an obvious condition because…
  5. They had never logged in at all — their user ID did not have a wfls-last-login entry in the wp_usermeta table. This condition will only apply if your site uses WordFence.

You can test all of those conditions with a single SQL query:

SELECT DISTINCT `user_id`
FROM `wp_usermeta`
WHERE
`user_id` IN (SELECT DISTINCT `user_id` FROM `wp_usermeta` WHERE `meta_key` = 'wp_capabilities' AND `meta_value` = 'a:1:{s:8:"customer";b:1;}') AND
`user_id` IN (SELECT DISTINCT `user_id` FROM `wp_usermeta` WHERE `meta_key` = 'first_name' AND `meta_value` = '') AND
`user_id` IN (SELECT DISTINCT `user_id` FROM `wp_usermeta` WHERE `meta_key` = 'last_name' AND `meta_value` = '') AND
`user_id` NOT IN (SELECT DISTINCT `user_id` FROM `wp_usermeta` WHERE `meta_key` = 'billing_first_name') AND
`user_id` NOT IN (SELECT DISTINCT `user_id` FROM `wp_usermeta` WHERE `meta_key` = 'billing_last_name') AND
`user_id` NOT IN (SELECT DISTINCT `user_id` FROM `wp_usermeta` WHERE `meta_key` = '_order_count' AND meta_value > 0) AND
`user_id` NOT IN (SELECT DISTINCT `user_id` FROM `wp_usermeta` WHERE `meta_key` = 'wfls-last-login');

You may want to spot check some of these IDs in the wp_users table, or directly in the site admin, just to be sure everything looks right. Then you can turn the above into a subquery that will delete all of the matching users. Be sure to make a full backup of your database before doing this!

DELETE FROM `wp_users` WHERE `ID` IN (
SELECT DISTINCT `user_id`
FROM `wp_usermeta`
WHERE
`user_id` IN (SELECT DISTINCT `user_id` FROM `wp_usermeta` WHERE `meta_key` = 'wp_capabilities' AND `meta_value` = 'a:1:{s:8:"customer";b:1;}') AND
`user_id` IN (SELECT DISTINCT `user_id` FROM `wp_usermeta` WHERE `meta_key` = 'first_name' AND `meta_value` = '') AND
`user_id` IN (SELECT DISTINCT `user_id` FROM `wp_usermeta` WHERE `meta_key` = 'last_name' AND `meta_value` = '') AND
`user_id` NOT IN (SELECT DISTINCT `user_id` FROM `wp_usermeta` WHERE `meta_key` = 'billing_first_name') AND
`user_id` NOT IN (SELECT DISTINCT `user_id` FROM `wp_usermeta` WHERE `meta_key` = 'billing_last_name') AND
`user_id` NOT IN (SELECT DISTINCT `user_id` FROM `wp_usermeta` WHERE `meta_key` = '_order_count' AND meta_value > 0) AND
`user_id` NOT IN (SELECT DISTINCT `user_id` FROM `wp_usermeta` WHERE `meta_key` = 'wfls-last-login')
);

OK… so the users are gone. But each one has a bunch of records in the wp_usermeta table. Now that we’ve gotten rid of the users themselves, it’s easy to purge their associated meta data:

DELETE FROM `wp_usermeta` WHERE `user_id` NOT IN (SELECT `ID` FROM `wp_users`);

WordPress development in the Gutenberg era: Threading the needle of Block Theme development

As described in several of my recent posts here, I have been working for the past month or so on building my first “all-in” Block Theme for WordPress.

After nearly 4 years of adamantly resisting “Gutenberg” and the new Block Editor revolution — not because I disliked the block concept, but because I disagreed philosophically with the core team’s approach (to what constitutes a block, which types of blocks are important, and which technologies are used to manage the UI of the editing screen) — I am finally accepting that if I am to continue making a living primarily as a WordPress developer, I need to give up on my Classic Editor, Advanced Custom Fields “Flexible Content” approach, and embrace that the Block Editor is now The Way.

One of numerous challenges I’ve faced in this process (on top of the learning curve of a completely unfamiliar method of constructing themes, the dearth of adequate and up-to-date documentation, and the core team’s willingness to allow very unfinished versions of functionality to roll out in public WordPress releases) is figuring out the best way to approach some of the more complex design structures I am used to dealing with via ACF’s Flexible Content fields.

My biggest hurdle is recognizing that what I think of as a “block” is not what the WordPress core team thinks of as a “block”

Here I will admit this is a shortcoming of my own approach. I have been “opinionated” in my development approach (well, about everything, really), and created large-scale and complex “blocks” that, in Gutenberg/Block Editor terms, would really be “groups” or “block patterns,” not blocks. Gutenberg blocks are more granular.

Gutenberg blocks are also static, in that they generally do not interact with database content, or if they do, it is in very limited “bloggy” ways that don’t align with my use of WordPress as a general-purpose CMS.

So I’ve found myself falling back on ACF. I like its server-side approach. I’m more accustomed to dealing with PHP and MySQL. I use JavaScript (mainly jQuery) a fair bit for front-end interactive elements, but I don’t build complex functionality in JavaScript and I avoid AJAX if I can help it.

You can see I’m destined for a strained relationship with the Block Editor.

A concrete example: “Tiles”

One of the most idiosyncratic elements of my old ACF approach is the block I call “Tiles.” It’s a way of creating a set of small blurbs to link to other pages/posts. There are numerous options for appearance: number of tiles per row, relationship between the image and text in a tile, etc. And there are also numerous options for the content source: a specific page or post (with the title, featured image and excerpt automatically pulled in), a dynamic feed from the blog or a specific category (likewise with the info pulled in automatically, except this one auto-updates when there’s a new post), or a completely custom-built tile, where you manually select the image and enter the text and link.

Here are screenshots of the ACF-based editing tools for my Tiles block.

I tried recreating this entire setup as a new ACF-based Block field group. It worked, to be sure, but the complex ACF editing layout really did not feel right in the new Block Editor interface, either in the sidebar (eek) or in the main content area. It felt like a cop-out.

Then I considered creating a block pattern. I knew this would lack some of the benefits of my ACF-based Tiles block, but one in particular: the option to dynamically pull in the details of another page/post, rather than manually entering the text. But as a starting point, I decided to recreate the “custom” tile type.

That, also, worked. But it was finicky and didn’t apply well in a lot of different places. So I realized that instead of creating a Tiles block pattern, I needed to create a Tile block pattern. Just one tile. Instead of a monolithic block that was really a group, users would insert each individual tile into whatever larger structure they want (e.g. columns).

The end result was a block pattern that looked like this (screenshot instead of live code because, well, you really don’t want to use this):

I was really proud of myself for using the new “lock” feature to force the elements to stay in a particular order, but to allow the user to remove elements they don’t need, such as the image, lead-in text, or CTA button.

Still… I didn’t like it. And it didn’t allow for dynamic sources.

Along the way I also finally came to grasp another fundamental limitation of the Block Editor and Block Patterns. Since all of the parameters of the blocks are stored as HTML comments right in the post content, you (the theme developer) can’t update the design of a block pattern once it’s inserted into a page/post. The Block Editor isn’t dynamically inserting content into a template when the page is rendered. It’s making a one-off copy of the template and storing it right along with the content at the point when the content is inserted into the page/post.

This seems, to me, to be a fundamental flaw in the entire Gutenberg/Block Editor approach. It’s bad enough that if I were building it myself, I’d have stopped right there and taken a completely different direction. Maybe there’s a long-term plan to address this limitation, but for now it appears to be here to stay. Which led me back to ACF.

Threading the needle

And that was when I had my insight into the true nature of the problem. It was the fact that what I think of as a “block” is larger-scale than what the Block Editor treats as a block.

So I went back to my Tiles ACF field group, stripped out the repeater functionality and created a Tile ACF field group. Now you’re building one tile at a time, it has all of the previous benefits of ACF’s dynamic integration of content with template functionality, it seems to correctly fit the definition of a “block” in the WordPress sense, and you can still have a flexible presentation of multiple tiles in a group… just using Block Editor “groups” to achieve that.

I still have more to learn and questions to answer (followed by more questions to ask and then answer), but I feel like this was a major step forward in finding a way to merge the benefits of ACF with the… inevitability, I guess, of the Block Editor.

Fun with recycled IP addresses

OK, well that title kind of gives away the end of the story, but it’s still a good one.

So…

Earlier this week I launched a new site for a client. As part of the usual process, I submitted their sitemap.xml file to Google Search Console and Bing Webmaster Tools. Usually that’s all it takes for a new site to get indexed within 1-3 days.

But it seemed to be taking longer than usual for this client, and I decided to investigate the situation.

I should note that we did a private “soft launch” of the site about a week prior to the official launch. During that time I had a robots “noindex” directive turned on so it wouldn’t start showing up in search engines prematurely.

I went into Google Search Console to request a re-crawl. And that’s when I noticed this…

Excluded due to 'noindex'

Well, that’s… weird. Not so much that it had read a “noindex” directive when it, unfortunately, had crawled the URL just a day before we launched — although it was a bit weird that it had crawled it at all — but that the Referring page was a totally different site that should have had no business linking to us, yet.

So then I did what anyone (?) would naturally do, I visited that URL. And much to my surprise, it redirected to our site. What??

Next I used mxtoolbox to do a DNS lookup, and suddenly it all made sense.

We’re hosting the site at Linode. And as it happens, the DNS entry for the referring site is set to the same IP address as our site. This is a virtual private server, so we’re the only people now using this IP address.

But there are a finite number of possible IP addresses, especially IPv4 addresses (about 4 billion). So they naturally get reused. This particular site was for a limited-use product that was only relevant in 2015, so it’s not too surprising that the owners of the domain took down their Linode server and relinquished the IP address. It’s unfortunate though that they didn’t think to remove the DNS entry from their zone file.

At this point, we could (a) contact them and ask them to update their DNS, but that could be convoluted and time-consuming, for no real benefit to us, (b) set up a rewrite in our server that shunts traffic that’s trying to access their product site back over to their main site, which would take less time but also wouldn’t really benefit us in any way, or (c) leave it as-is, and let the few randos who are still looking for a product that was last relevant during the Obama administration wonder why they’re instead seeing our site.

I’m going with (c).

I’m also going with submitting re-crawl requests to both Google and Bing so we can get in the priority queue, and hopefully by this time tomorrow the site will be showing up in search results.

“It doesn’t suck.”

Is it possible to be excited about a text editor? Well… yes. When it’s good enough to be an essential tool for decades.

Back in 1994 when I was a junior in college and the web was emerging, I wanted to learn how to build web pages. The somewhat helpful person in the Gustavus computer store (why did I go there? I guess because I wasn’t taking any CS classes so I didn’t know any of the faculty) told me to download BBEdit, so I did. That didn’t help me to learn HTML, but it became the program I used to write it as I did learn.

I’ve dabbled with other apps over the years — including PageSpinner in 1996, at the recommendation of my boss at my first professional “webmaster” job*, and HomeSite in the brief period when I mainly worked on Windows 2000 — but guess which program is open on my Mac right now; which program I was busy writing code in moments before I read this Daring Fireball article.

* Yes, that really was my title. I gave up on PageSpinner and switched back to BBEdit shortly thereafter, because although I liked PageSpinner’s color coding (before BBEdit supported it, or at least before I figured out that it did), it bugged me endlessly that it used proportional fonts.

How to make the WooCommerce main shop page show featured products only in 2022

Sorry for the click-bait-y title (especially the “in 2022” part), but I searched fruitlessly for way too long and found too many woefully outdated answers to this. Ultimately what I found still wasn’t quite the complete answer, so I modified it a bit myself to arrive at the following.

The goal here is, as the title suggests, to get the WooCommerce main shop page to only show your featured products. Why this isn’t just a checkbox option in WooCommerce is beyond me. But then a lot of the decisions made by the WooCommerce dev team are beyond me. (Excuse me, Professor Brainiac, but I’ve built e-commerce platforms from scratch and, uh, I think I know how a proton accelerator… oh wait, never mind.)

Anyway, this is it:

add_action('woocommerce_product_query', function($query) {
    if (is_shop()) {
        $query->set('tax_query', array(array(
            'taxonomy' => 'product_visibility',
            'field' => 'name',
            'terms' => 'featured',
            'operator' => 'IN',
        )));
    }
});

What exactly is happening here? Well, as noted by the most helpful resource I found, since WooCommerce 3 (currently on 6.1.1), the “featured” status has been handled by the product_visibility taxonomy, and not by the _featured post meta field. So this needs a tax_query and not a meta_query.

Beyond that, we’re making an extra check that we’re on the shop main page — so this doesn’t affect category archive pages. And we’re using the woocommerce_product_query hook, not pre_get_posts as some other examples suggest, so it only runs on WooCommerce queries and we can skip adding extra conditionals for pre_get_posts to run on, you know, every single post query on every page of the site, including admin.

That’s all there is to it. Now your main WooCommerce shop page will only display featured products, and nothing else changes.