How to purge fake/bot WooCommerce customer accounts directly in the MySQL database

DANGER! If you don’t know the havoc one can wreak with a DELETE statement in MySQL, stop right here. I take no responsibility for what you might do with the information that follows.


Bots like to create fake customer accounts on WooCommerce (WordPress) sites, apparently. What they’re attempting to do, I don’t know. But if you don’t stay on top of things, you might find you have thousands of fake customer accounts in your site. Chances are they haven’t, won’t, and can’t actually cause any damage, but they’re cluttering things up, and any unnecessary user account in a WordPress database represents a potential future security risk.

On a particular client’s heavy-traffic WooCommerce site, I discovered that over the course of the site’s 7-year lifespan it had accumulated nearly 8,000 such accounts, and I wanted to be rid of them.

After carefully exploring the data surrounding a few of these obviously fake accounts, I determined a pattern, and came up with a fairly cautious set of conditions that, to me, indicated a customer was fake:

  1. They had the customer role.
  2. Their user account had nothing in the First Name or Last Name fields.
  3. Likewise, their user account had nothing in the Billing First Name or Billing Last Name fields. (If you’re feeling extra draconian, you might skip this one.)
  4. They had never placed an order while logged in — their user ID did not have an _order_count entry in the wp_usermeta table. Which is perhaps an obvious condition because…
  5. They had never logged in at all — their user ID did not have a wfls-last-login entry in the wp_usermeta table. This condition will only apply if your site uses WordFence.

You can test all of those conditions with a single SQL query:

SELECT DISTINCT `user_id`
FROM `wp_usermeta`
WHERE
`user_id` IN (SELECT DISTINCT `user_id` FROM `wp_usermeta` WHERE `meta_key` = 'wp_capabilities' AND `meta_value` = 'a:1:{s:8:"customer";b:1;}') AND
`user_id` IN (SELECT DISTINCT `user_id` FROM `wp_usermeta` WHERE `meta_key` = 'first_name' AND `meta_value` = '') AND
`user_id` IN (SELECT DISTINCT `user_id` FROM `wp_usermeta` WHERE `meta_key` = 'last_name' AND `meta_value` = '') AND
`user_id` NOT IN (SELECT DISTINCT `user_id` FROM `wp_usermeta` WHERE `meta_key` = 'billing_first_name') AND
`user_id` NOT IN (SELECT DISTINCT `user_id` FROM `wp_usermeta` WHERE `meta_key` = 'billing_last_name') AND
`user_id` NOT IN (SELECT DISTINCT `user_id` FROM `wp_usermeta` WHERE `meta_key` = '_order_count' AND meta_value > 0) AND
`user_id` NOT IN (SELECT DISTINCT `user_id` FROM `wp_usermeta` WHERE `meta_key` = 'wfls-last-login');

You may want to spot check some of these IDs in the wp_users table, or directly in the site admin, just to be sure everything looks right. Then you can turn the above into a subquery that will delete all of the matching users. Be sure to make a full backup of your database before doing this!

DELETE FROM `wp_users` WHERE `ID` IN (
SELECT DISTINCT `user_id`
FROM `wp_usermeta`
WHERE
`user_id` IN (SELECT DISTINCT `user_id` FROM `wp_usermeta` WHERE `meta_key` = 'wp_capabilities' AND `meta_value` = 'a:1:{s:8:"customer";b:1;}') AND
`user_id` IN (SELECT DISTINCT `user_id` FROM `wp_usermeta` WHERE `meta_key` = 'first_name' AND `meta_value` = '') AND
`user_id` IN (SELECT DISTINCT `user_id` FROM `wp_usermeta` WHERE `meta_key` = 'last_name' AND `meta_value` = '') AND
`user_id` NOT IN (SELECT DISTINCT `user_id` FROM `wp_usermeta` WHERE `meta_key` = 'billing_first_name') AND
`user_id` NOT IN (SELECT DISTINCT `user_id` FROM `wp_usermeta` WHERE `meta_key` = 'billing_last_name') AND
`user_id` NOT IN (SELECT DISTINCT `user_id` FROM `wp_usermeta` WHERE `meta_key` = '_order_count' AND meta_value > 0) AND
`user_id` NOT IN (SELECT DISTINCT `user_id` FROM `wp_usermeta` WHERE `meta_key` = 'wfls-last-login')
);

OK… so the users are gone. But each one has a bunch of records in the wp_usermeta table. Now that we’ve gotten rid of the users themselves, it’s easy to purge their associated meta data:

DELETE FROM `wp_usermeta` WHERE `user_id` NOT IN (SELECT `ID` FROM `wp_users`);

How to sort empty values last in WordPress

For the past several days I’ve been hammering my head against a conundrum: how to get WordPress to sort a set of posts in ascending order, but with empty values at the end of the list instead of the beginning.

This seems like it should be a simple option in the query. But MySQL doesn’t offer a straightforward way to do this. There are some fairly simple MySQL tricks that will accomplish it, but there’s no way to apply those tricks within the context of WP_Query because they require manipulating either the SELECT or ORDER BY portions of the SQL query in ways WP_Query doesn’t allow. (I mean, you can write custom SQL for WP_Query, but if you’re trying to alter the output of the main query, good luck.)

I tried everything I could possibly think of yesterday with the pre_get_posts hook, but it all went nowhere, other than discovering a very weird quirk of MySQL that I don’t fully understand and won’t bother explaining here.

Sleep on it

I woke up this morning with an idea! I resigned myself to the fact that this ordering can’t happen before the query runs, but I should be able to write a pretty simple function to do it after the query has run.

Bear one key thing in mind: This is not going to work properly with paginated results. I mean, it’ll sort of work. The empty values will get sorted to the end of the list, but they’ll stay on the same “page” they were on before the query was run. In other words, they’ll be sorted to the bottom of page one, not of the last page. Anyway… consider this most useful in cases where you’re setting posts_per_page to -1 or some arbitrarily large number (e.g. 999).

The function

This simple (and highly compact) function accepts a field name (and a boolean for whether or not it’s a custom field [meta data]), then takes the array of posts in the main query ($wp_query), splits them into two separate arrays — one with the non-empty values for your selected field, one with the empty values — and then merges those arrays back together, with all of the non-empty values first. (Other than shifting empties to the back, it retains the same post order from the original query.)

function sort_empty_last($field, $is_meta=false) {
  global $wp_query;
  if (!$wp_query->is_main_query()) { return; }
  $not_empty = $empty = array();
  foreach ((array)$wp_query->posts as $post) {
    $field_value = !empty($is_meta) ? get_post_meta($post->ID, $field) : $post->{$field};
    if (empty(implode((array)$field_value))) { $empty[] = $post; }
    else { $not_empty[] = $post; }
  }
  $wp_query->posts = array_merge($not_empty,$empty);
}

Calling the function

As I said, this function is designed to work directly on the main query. You just need to call the function right before if (have_posts()) in any archive template where you want it to apply. Because of the way it works — especially the posts_per_page consideration — I thought calling it directly in the template was the most clear-cut way to work with it. Here’s an example of the first few lines of a really basic archive template that uses it, looking for a custom field (meta data) called deadline:

<?php

get_header();

sort_empty_last('deadline', true);

if (have_posts()) {

Noted for future reference (by me): How to reset the MySQL root password in Ubuntu 18.04

As a bleeding-edge early adopter, here in June 2018 I am already using Ubuntu 18.04 LTS for new sites I’m setting up for clients. (How daring!)

I ran aground this afternoon with a new server setup, because I couldn’t log into phpMyAdmin as root (or, therefore, as anyone, since I hadn’t set up my own user yet).

All of the old familiar ways I had been trying, and the tutorials I had referred to for at least the past two years, were not working. So then I specifically searched for a solution for Ubuntu 18.04 and found this excellent tutorial.

First off, mysql_secure_installation wasn’t working. That was one of the “old familiar ways” I had already tried thrice. THRICE I TELL YOU!

The key, I think, was these two unexpected lines:

$ sudo mkdir -p /var/run/mysqld
$ sudo chown mysql:mysql /var/run/mysqld

Because that folder didn’t exist and/or didn’t have the proper permissions, some other steps were failing silently or giving error messages that failed to point me in the direction of the actual problem.

The other thing to note is that with the version of MySQL included in Ubuntu 18.04, these are the proper MySQL commands to run:

USE mysql;
UPDATE user SET authentication_string=PASSWORD("linuxconfig.org") WHERE User='root';
UPDATE user SET plugin="mysql_native_password" WHERE User='root';

And it is actually important to run both of those UPDATE commands, because in the tutorial the results displayed show that the first one updated a record for them, while the second didn’t. I had already run the first command (but not the second) in one of my failed updates. So when I ran these, the first one didn’t update any records and the second one did.

When WordPress Treats an Administrator Like a Contributor

The first sign that something was wrong was when I tried to create a new page on the client’s site. The blue Publish button I normally see was replaced with Submit for Review. What the…? That’s what WordPress users with the lowly Contributor role usually see. But I’m an Administrator — the most mighty role known to the world of (single-site) WordPress. (Yes, multi-site installations also confer the fearsome title of Super Admin upon a select few.)

Worse still, if I tried to click Submit for Review, it wouldn’t actually save!

Other problems abounded — I tried to create a new user with Administrator privileges, just to see if my own user account was corrupt. Couldn’t save that, either.

I had Debug Bar installed, and I noticed it was giving an error:

WARNING: wp-admin/includes/post.php:641 - Creating default object from empty value
get_default_post_to_edit

Well, that’s not good. Googling the error didn’t lead to anything immediately helpful, besides this comment that led me to explore the database structure in phpMyAdmin for any problems.

Yes, there were problems. Many of the tables, including wp_options, wp_posts, wp_postmeta and wp_users were missing their primary keys. A bit more digging into the WordPress core showed that, for complex reasons (i.e. I don’t totally get it), without primary keys on these tables, WordPress can’t determine the post type of a new post, and if it can’t determine the post type, it can’t determine the user’s capabilities with regard to that post type, which all comes back to…

WARNING: wp-admin/includes/post.php:641 - Creating default object from empty value
get_default_post_to_edit

Googling on the matter of WordPress tables missing their primary keys (or, perhaps more pertinently, their auto-increments), led me to a solution!!

Fixing WordPress indexes, foreign keys and auto_increment fields

Well, a partial solution. Because the database I was working with was not damaged in exactly the same way as the one the OP was working with, I couldn’t use the sample code directly. I had to go through the database and manually create a few primary keys, delete a bunch of auto-draft posts that all had an ID of 0, etc. Then I had to skip a few lines of the OP’s SQL code because they referred to tables that hadn’t lost their keys in my case, for whatever reason. But this is the… key… to solving the problem.

Now then, how did the database get this way? Well, the site lives on a fairly creaky old Fatcow (ugh, that name) shared hosting account, running an old version of MySQL and an almost unrecognizably ancient version of phpMyAdmin. We were undertaking major content changes on the site, so I copied it over to my own sleek, modern staging server running the latest and greatest of everything. The idea was that we’d get all of our changes in place just the way we wanted on the staging server, rather than mess up the live site for 2-3 weeks, and when we were done, we’d just copy everything back over.

Slick. Right? Sure, if both servers are running reasonably identical software versions. Which of course is never the case. Ever.

Apparently when I copied the site back to Fatcow, due to the older MySQL (or possibly phpMyAdmin) version, certain things like the primary keys and auto-increments — and, I’d be willing to bet, but I’m not sure it matters, the collation as well — got lost along the way.

WordPress challenge of the day: sorting by meta value, including posts WITHOUT that meta value set

I dug around quite a bit for a solution to this today, and eventually I found one, even though it’s a bit ugly.

The problem here is in setting up a WordPress query that sorts posts based on a meta value. I wanted to sort a list of pages by template, but I wanted to include all of the pages, even ones that don’t explicitly have a template set. But the default query was only showing the ones that did have the template value.

Several dead ends almost led me to give up, until I realized it was a JOIN issue. Specifically, the need to change an INNER JOIN to a LEFT JOIN. I just needed to figure out how to do that in the context of WP_Query.

Cut to the chase, here’s what I ended up with.

add_action('pre_get_posts', function($query) {
  if (!is_admin()) { return; }
  $orderby = $query->get('orderby');
  if ('_wp_page_template' == $orderby) {
    $query->set('meta_key','_wp_page_template');
    $query->set('orderby','meta_value');
    // Workaround to include items without this meta key
    // Based on: https://core.trac.wordpress.org/ticket/19653#comment:11
    add_filter('get_meta_sql', function($clauses) {
      $clauses['join'] = str_replace('INNER JOIN','LEFT JOIN',$clauses['join']) . $clauses['where'];
      $clauses['where'] = '';
      return $clauses;
    });
  }
});

I don’t like doing a str_replace() on part of the pre-built query, but sometimes you gotta do what you gotta do. Also note that this is part of a larger function I am writing that is only for use in the admin side; you could remove that is_admin() check if you want this to work everywhere.

I haven’t had a chance to dig into the details of the query to figure out why the original source post included moving $clauses['where'] into $clauses['join'], but it’s essential. I tried skipping it, and it didn’t work.

Sorry I can’t provide any more context here… but I hope it’s helpful to anyone who finds themselves in a similar situation!