How to purge fake/bot WooCommerce customer accounts directly in the MySQL database

DANGER! If you don’t know the havoc one can wreak with a DELETE statement in MySQL, stop right here. I take no responsibility for what you might do with the information that follows.


Bots like to create fake customer accounts on WooCommerce (WordPress) sites, apparently. What they’re attempting to do, I don’t know. But if you don’t stay on top of things, you might find you have thousands of fake customer accounts in your site. Chances are they haven’t, won’t, and can’t actually cause any damage, but they’re cluttering things up, and any unnecessary user account in a WordPress database represents a potential future security risk.

On a particular client’s heavy-traffic WooCommerce site, I discovered that over the course of the site’s 7-year lifespan it had accumulated nearly 8,000 such accounts, and I wanted to be rid of them.

After carefully exploring the data surrounding a few of these obviously fake accounts, I determined a pattern, and came up with a fairly cautious set of conditions that, to me, indicated a customer was fake:

  1. They had the customer role.
  2. Their user account had nothing in the First Name or Last Name fields.
  3. Likewise, their user account had nothing in the Billing First Name or Billing Last Name fields. (If you’re feeling extra draconian, you might skip this one.)
  4. They had never placed an order while logged in — their user ID did not have an _order_count entry in the wp_usermeta table. Which is perhaps an obvious condition because…
  5. They had never logged in at all — their user ID did not have a wfls-last-login entry in the wp_usermeta table. This condition will only apply if your site uses WordFence.

You can test all of those conditions with a single SQL query:

SELECT DISTINCT `user_id`
FROM `wp_usermeta`
WHERE
`user_id` IN (SELECT DISTINCT `user_id` FROM `wp_usermeta` WHERE `meta_key` = 'wp_capabilities' AND `meta_value` = 'a:1:{s:8:"customer";b:1;}') AND
`user_id` IN (SELECT DISTINCT `user_id` FROM `wp_usermeta` WHERE `meta_key` = 'first_name' AND `meta_value` = '') AND
`user_id` IN (SELECT DISTINCT `user_id` FROM `wp_usermeta` WHERE `meta_key` = 'last_name' AND `meta_value` = '') AND
`user_id` NOT IN (SELECT DISTINCT `user_id` FROM `wp_usermeta` WHERE `meta_key` = 'billing_first_name') AND
`user_id` NOT IN (SELECT DISTINCT `user_id` FROM `wp_usermeta` WHERE `meta_key` = 'billing_last_name') AND
`user_id` NOT IN (SELECT DISTINCT `user_id` FROM `wp_usermeta` WHERE `meta_key` = '_order_count' AND meta_value > 0) AND
`user_id` NOT IN (SELECT DISTINCT `user_id` FROM `wp_usermeta` WHERE `meta_key` = 'wfls-last-login');

You may want to spot check some of these IDs in the wp_users table, or directly in the site admin, just to be sure everything looks right. Then you can turn the above into a subquery that will delete all of the matching users. Be sure to make a full backup of your database before doing this!

DELETE FROM `wp_users` WHERE `ID` IN (
SELECT DISTINCT `user_id`
FROM `wp_usermeta`
WHERE
`user_id` IN (SELECT DISTINCT `user_id` FROM `wp_usermeta` WHERE `meta_key` = 'wp_capabilities' AND `meta_value` = 'a:1:{s:8:"customer";b:1;}') AND
`user_id` IN (SELECT DISTINCT `user_id` FROM `wp_usermeta` WHERE `meta_key` = 'first_name' AND `meta_value` = '') AND
`user_id` IN (SELECT DISTINCT `user_id` FROM `wp_usermeta` WHERE `meta_key` = 'last_name' AND `meta_value` = '') AND
`user_id` NOT IN (SELECT DISTINCT `user_id` FROM `wp_usermeta` WHERE `meta_key` = 'billing_first_name') AND
`user_id` NOT IN (SELECT DISTINCT `user_id` FROM `wp_usermeta` WHERE `meta_key` = 'billing_last_name') AND
`user_id` NOT IN (SELECT DISTINCT `user_id` FROM `wp_usermeta` WHERE `meta_key` = '_order_count' AND meta_value > 0) AND
`user_id` NOT IN (SELECT DISTINCT `user_id` FROM `wp_usermeta` WHERE `meta_key` = 'wfls-last-login')
);

OK… so the users are gone. But each one has a bunch of records in the wp_usermeta table. Now that we’ve gotten rid of the users themselves, it’s easy to purge their associated meta data:

DELETE FROM `wp_usermeta` WHERE `user_id` NOT IN (SELECT `ID` FROM `wp_users`);

I spent 5 hours troubleshooting this WordPress problem so you don’t have to (starring: WooCommerce Action Scheduler)

Sorry for that “clickbaity” headline. I added the parenthetical so it might be at least marginally useful. Since my WordPress-related posts are always about how I solved a particularly weird or obscure WP issue, I usually consider their titles carefully. “What would I have googled to find a solution to this problem?” But honestly, I spent 5 hours on this yesterday partly because I wasn’t sure what to google. (And I use lowercase “google” as a generic for “conduct an Internet search”; I normally use DuckDuckGo.)

OK, so here’s the situation. This particular site is — among my normally very-low-traffic clients — one of the busiest I work on. It’s a WooCommerce site with hundreds of products and 20+ daily orders. (Yeah, 20+ orders a day is not huge, but on the scale I normally deal with, it’s a lot.)

This site runs on its own virtual private server, with 8 GB RAM and 4 vCPUs. Pretty substantial for a single site. And yet, for weeks it has been maxing out RAM and CPU resources. Not to the point where the site was in crisis mode that demanded my immediate attention, but it was frustratingly slow. Just slightly below the threshold of me dropping work on other new projects to try to fix this. (At this point I feel obliged to note that I did not actually build or maintain this site for its first couple of years of existence, so I don’t know its inner workings as well as I normally would. I just know it’s running way too many plugins and desperately needs some TLC I have not had time to give it.)

Yesterday things finally got to the breaking point. For me, at least. The client had contacted me about an unrelated issue, but as I was dealing with that, I got frustrated by seeing all of this inexplicable resource usage, so I had to address it.

As it happens, this post is actually a bit of a sequel to my last post, about getting Apache’s mod_status and mod_rewrite to play nicely on a WordPress site. About three weeks ago I finally got mod_status working on this site, and had planned to come back, when I had a chance, to investigate this issue.

If you are not familiar with mod_status, you should check it out. Apache is generally a bit of a “black box” but this lets you see exactly what’s happening with each thread — the requested URL, the requesting IP address, connection time, resource usage, etc.

I noticed an absurd number of threads were coming from the localhost and were requesting wp-admin/admin-ajax.php with a query string referencing WooCommerce’s Action Scheduler. But what to do with that information?

I’ll admit, this is where I wasted a bunch of time in fruitless searches, because I don’t know a lot about Action Scheduler. I read a few threads on the WordPress support forums and StackOverflow that kind of danced around the problem I was having but never really got at it.

Eventually I ended up in phpMyAdmin, scrutinizing the wp_actionscheduler_actions table, and trying to figure out where all of the wc_facebook_regenerate_feed actions were coming from. I used my old favorite Search-Replace-DB to try to find any instances of “facebook” in the database. (This was an utter failure, for reasons I can’t explain. But that failure was critical to why this took me so long to resolve.)

I went to Tools > Scheduled Actions in WP admin and discovered there were over 200,000 actions, although there were only about 70,000 (only!) showing up in wp_actionscheduler_actions. Mystery!

I went to wp_actionscheduler_actions again, saw that those wc_facebook_regenerate_feed actions had all been scheduled weeks ago, and decided to just chuck out the lot. I truncated the table, but within seconds it started filling up again with hundreds of wc_facebook_regenerate_feed actions, with the same weeks-old scheduled dates. Where were they coming from???

What was especially maddening to me about all of this was that I had already, weeks ago, determined that the plugin that had created these — Facebook for WooCommerce — had been causing some kind of trouble, and I had deactivated it. Yesterday I even went so far as to delete it. I scoured the theme code for references to Facebook. I looked in the file system for stray files that might be responsible. And as I mentioned before, I tried to search the database for any references to Facebook. I was getting nowhere.

Eventually I realized Search-Replace-DB was having problems, so I dove into phpMyAdmin directly and started searching individual fields, in individual tables, for “Facebook”. And that’s where I finally figured it out.

WordPress puts everything in wp_posts, and that’s a problem.

I’ve complained over the years about the database architecture in WordPress. Having built multiple custom CMSes in the years prior to when I finally, fully embraced WordPress in 2014, I have a fair bit of experience designing databases. And two things I learned in that experience were: 1) clearly define what your data tables are for, and 2) indexes make databases efficient. WordPress is awesome for many things, but it has far, far outgrown its original conception as blogging software. Custom Post Types and Custom Fields make it super-flexible, but shoving everything they create into wp_posts and wp_postmeta can create a disastrous situation.

Case in point, WooCommerce scheduled actions. In earlier iterations, those were custom posts! (As are, still, WooCommerce orders, which is totally f***ed up, if you ask me.) At some point Woo or Automattic realized scheduled actions don’t belong in wp_posts, so they created four new tables specifically for managing them. Plugins that use scheduled actions had to create new scheduled actions for migrating the old wp_posts scheduled actions into the new tables.

And that’s where I found myself. Through some curious set of circumstances with this particular site, which probably at some point included someone other than me disabling WP-Cron to try to fix some other problem, 200,000+ scheduled actions from the Facebook for WooCommerce plugin (in the wp_posts table) got queued up for migration to the new tables. And as quickly as I was deleting them from the new tables, Action Scheduler (which runs once a minute!!!) was dutifully refilling them.

(And obviously they were never actually running… perhaps because I had deactivated the plugin? Or because they were simply timing out? Who knows? But here’s something I see as a flaw with Action Scheduler: it should check to see if the plugin that scheduled an action is currently active, and if not, purge the action immediately.)

At last here was the fix. I had to run this SQL query in phpMyAdmin. (Proceed with caution! Don’t just use this code… look in your database for exactly what is causing problems and adjust accordingly.)

DELETE FROM `wp_posts` WHERE `post_title` = 'wc_facebook_regenerate_feed';

Note: I’m doing this from memory — and a glance back at my browser history from yesterday. I didn’t keep notes on exactly what the title was.

For a more generalized — and drastic — approach, you could also do this:

DELETE FROM `wp_posts` WHERE `post_type` = 'scheduled-action';

I’ll just conclude here with a nice little graph of the site’s CPU and RAM usage over the past 24 hours. It was 6 PM when I finally figured this out!

Progress…

I realized what a daunting task it was going to be to use the built-in WordPress functionality to manually copy over all of the articles from my old site database, so I dug under the hood and used the power of SQL to just more-or-less dump everything from one database into the other. So now all (well, most) of the stuff is moved over, and I can begin to disassemble the old site (currently lingering in the link set as “Room 34 Cold Storage”).

If you care to look back at the old stuff (and why wouldn’t you?), you may notice some oddities… that’s mainly due to inconsistencies between my idiosyncratic HTML and the fancy formatting/clean-up WordPress does. I’ll gradually fix that stuff… maybe. (Right after I get around to organizing the basement.)

I’m also tinkering more with the themes. I’ve downloaded several interesting themes that I plan to try out for a while, to get a feel for both their visual appeal as well as the ways they handle certain technical features of WordPress. Once I find one I like, I’ll use it as a basis to build my own, one that incorporates my photography as well. (Will the excitement never end?!)

Oh, I almost forgot to mention… some of the old stuff didn’t have dates, so I hastily and more-than-somewhat arbitrarily assigned them all to January 1, 2004.