Find the mode of an array in PHP

For those of you who don’t remember studying statistics in math (and I barely do), the mode refers to the value that occurs most frequently in a set of data. That contrasts with the mean — what most of us call the “average” — and the median, which is the “middle” value if you sort all of the values in order.

My daughter was recently studying all of this and it brought it back to my mind. These are really not things I use often. But, as it happens, right now in my work I have a need for a PHP function that determines the mode in a set of data.

In this case, it’s not actually numbers; it’s dates. In short, I have an array of dates, and I need to know which date occurs the most often in the array. You’d think there would be a built-in PHP function for this, likely called array_mode() or else something long and completely illogical, or short and incomprehensible. But alas, there is no array_mode() function.

Fortunately, it’s pretty damn easy to write one. I found some examples on other sites, but they weren’t pithy enough for my tastes, so I rolled my own. Now you don’t have to:

function array_mode($arr) {
  $count = array();
  foreach ((array)$arr as $val) {
    if (!isset($count[$val])) { $count[$val] = 0; }
    $count[$val]++;
  }
  arsort($count);
  return key($count);
}

Perhaps it’s excessive even to bother casting $arr as an array, but it’s a habit I picked up a long time ago and can’t seem to shake. Anyway, there you have it. (Of course, this probably breaks if $val isn’t scalar, but I’ll leave that to you to fix.)

Shut off and lock down comments on your WordPress site with 5 lines of SQL

Comments are kind of passé. Well, OK, they’re still everywhere, but they’re almost universally garbage. Meaningful discussion happens on social media these days, even if it’s prompted by a blog post. And if you’re using WordPress as a general-purpose CMS rather than just as a blogging tool, then you probably have no use for comments whatsoever.

Yet, they’re built in, and they’re a spam magnet. Even if your templates aren’t actually showing comments anywhere, the default WordPress settings allow comments to come in, cluttering up your database and nagging you with a disconcertingly large number in a bright red circle in the WordPress admin bar.

Yuck.

Fortunately, if you have direct database access and the fortitude to run a few simple lines of SQL, you can quickly accomplish the following:

  1. Purge all queued spam and pending comments (while safely retaining any old, approved comments for archival purposes
  2. Prevent any new comments from appearing on any of your existing posts/pages
  3. Prevent comments from ever being accepted on future posts/pages

The last of those is a simple setting. In WP admin, you can go to Settings > Discussion and uncheck the second and third boxes under Default article settings at the top of the page. Actually, uncheck all three of those. If you’re going to turn off incoming pings, you should turn off pingbacks. But my SQL code below doesn’t.

screen-shot-2016-09-09-at-12-22-19-pm

If you’re just starting a brand new WordPress site and you don’t ever intend to allow comments, just go and uncheck those boxes and you’re done. But if you’re trying to rescue a long-suffering WordPress site from drowning in spam, read on.


Here then in all of its glory is the magic SQL you’ve been looking for:

DELETE FROM `wp_comments` WHERE `comment_approved` != 1;
DELETE FROM `wp_commentmeta` WHERE `comment_id` NOT IN (SELECT `comment_ID` FROM `wp_comments`);
UPDATE `wp_posts` SET `comment_status` = 'closed', `ping_status` = 'closed';
UPDATE `wp_options` SET `option_value` = 'closed' WHERE option_name = 'default_comment_status';
UPDATE `wp_options` SET `option_value` = 'closed' WHERE option_name = 'default_ping_status';

Want to dissect what each of these lines is doing? Sure…

Line 1

DELETE FROM `wp_comments` WHERE `comment_approved` != 1;

This is going to delete all “pending” and “spam” comments. It leaves approved comments untouched. Note: you may have spam comments that are approved; one site I was just working on had thousands that were “approved” because the settings were a little too generous. I can’t give a catch-all SQL statement to address that problem, unfortunately. It requires analyzing the content of the comments to some extent.

You’d think maybe `comment_approved` = 0 would be better, but I found as I poked around that the possible values aren’t just 0 or 1. It may also be spam. It may be something else. (I haven’t researched all of the possibilities.)

Line 2

DELETE FROM `wp_commentmeta` WHERE `comment_id` NOT IN (SELECT `comment_ID` FROM `wp_comments`);

There’s a separate table that stores miscellaneous meta data about comments. There’s a good chance there’s nothing in here, but you may as well delete any meta data corresponding to the comments you just deleted, so here you go.

Line 3

UPDATE `wp_posts` SET `comment_status` = 'closed', `ping_status` = 'closed';

This is going through all of the existing posts — which don’t include just “posts”… “pages” are posts, “attachments” are posts… anything in WordPress is a post, really — and setting them to no longer accept comments or pingbacks. This doesn’t delete any comments on the posts that were already approved; it just prevents any new ones.

screen-shot-2016-09-09-at-12-21-58-pm

It’s the equivalent of going into every single post and unchecking the two boxes in the screenshot above. But it only takes a couple of seconds. FEEL THE AWESOME POWER OF SQL!!!

Line 4

UPDATE `wp_options` SET `option_value` = 'closed' WHERE option_name = 'default_comment_status';

Remember that screenshot near the beginning of this post, showing the three checkboxes under Settings > Discussion? Well this is the equivalent of unchecking the third one.

Line 5

UPDATE `wp_options` SET `option_value` = 'closed' WHERE option_name = 'default_ping_status';

And this is the equivalent of unchecking the second one.

So there you have it. No more comments, no more spam, no need for an Akismet account.

Postscript to my first experience running WordPress on PHP 7.0

This is a companion piece to my post from earlier today, When switching servers breaks code: a WordPress mystery.

It’s not much of a mystery anymore, but since this was my first, and somewhat unplanned, foray into working with PHP 7.0, I felt it was worthwhile to spend a bit more time today finally exploring which features from earlier versions are gone in PHP 7.0, so I’ll know if I am going to encounter any more challenges in the eventual, inevitable move to 7.0.

The best source I can find on the removals is this RFC from two years ago (!) that lists deprecated features under consideration for removal, all of which did eventually end up being removed. (I assume this is a complete list of the removals, since I can find no other more current list. But suffice to say it is clear that everything listed here was removed, even if it’s not all that was removed.)

I’m happy to see that the only thing in here that I have been using on any recent projects that was removed was the /e modifier for preg_replace(), which is what I uncovered as the source of the “mystery” in my last post. Of course, before this morning I’m not sure I even would have remembered that I was using it.

That said, I’m relatively confident that I am not using any of the other features, because while I had forgotten about the /e modifier, I at least know I use preg_replace() on a regular basis. Most of the other features in the list are things I don’t recall ever having seen before. Such is the nature of being a self-taught user of a language with over 5,000 built-in functions.

There’s only one other removed feature that I know I have used. A lot. It’s the mysql extension. It has been deprecated in favor of the mysqli extension for several years. But my own CMS, cms34, built originally in 2008 on the then-current version 1.2 of CakePHP, has mysql functions everywhere. I did eventually manage to update the platform to version 1.3.21 of CakePHP, but the leap to 2.0 was too arduous, and reworking hundreds of files to use mysqli functions was, likewise, something that never seemed justified. I stopped new development on cms34 in early 2014, but we still have many client sites running on it. I have committed to supporting them as long as necessary, but I am actively encouraging those clients to make the switch to WordPress.

The absolute incompatibility of cms34 with PHP 7.0 makes the merits of that switch a lot more obvious. And now the clock is ticking.

Bottom line for anyone who, like me, is still supporting legacy PHP 5 (or earlier!) code: change is coming.


Update! Well, OK, I was just searching in the wrong way. The official documentation for migrating from PHP 5.6 to 7.0 has lists of both removed extensions and SAPIs and changed functions.

When switching servers breaks code: a WordPress mystery

Earlier this week I launched a brand new WordPress site for a long-time client. Break out the champagne! But of course it’s never that simple, is it?

The client’s live server is a newly configured VPS running Ubuntu 16.04 LTS and PHP 7.0; meanwhile, our staging server is still chugging away on Ubuntu 14.04 LTS and PHP 5.5. So, clearly, a difference there. But I was pleased to find that, for the most part, the site functions perfectly on the new server.

But then the client discovered a problem: on one page, content from a custom post type query wasn’t displaying.

Here’s a short version of the pertinent code:

$people = new WP_Query(array(
  ‘order’ => ‘ASC’,
  ‘orderby’ => ‘menu_order’,
  ‘posts_per_page’ => -1,
  ‘post_type’ => ‘person’,
));

if ($people->have_posts()) {
  while ($people->have_posts()) {
    $people->the_post();
    ?>
    <article>

      <header><h2><?php the_title(); ?></h2></header>
      <div><?php the_content(); ?></div>

    </article>
    <?php
  }
}

Strangely, the_title() was working fine, but the_content() wasn’t. It had been — still is, in fact — working on our staging server, all other things within the WordPress context being equal. (Identical, up-to-date versions of the theme files and all plugins, and WP itself.) And the client confirmed that the content was present in WP admin.

I found, confusingly, that get_the_content() works, even though the_content() doesn’t. But of course you don’t get all of the proper formatting (like paragraph breaks) without some WP filters that the_content() applies, so I tried this:

<?php echo apply_filters(‘the_content’, get_the_content()); ?>

Still didn’t work. After a bit more research I was reminded that the pertinent function that filter runs is wpautop(), so I just called that directly:

<?php echo wpautop(get_the_content()); ?>

Now I have the content displaying nicely, but this is clumsy and I really do not get what might be different. I know the new server is running PHP 7.0 and our staging server is running PHP 5.5… but I’m struggling to understand what kind of changes in PHP could cause this specific problem.

Since get_the_content() works, and the_content() doesn’t, the problem has to lie in something that’s happening with the filters on the_content(). Why? Because the_content() calls get_the_content() right up front. In fact, there’s not a lot to the_content() at all. This function lives in wp-includes/post-template.php (beginning at line 230 in WP 4.6). Here it is in its entirety (reformatted slightly for presentation here):

function the_content( $more_link_text = null, $strip_teaser = false) {
  $content = get_the_content( $more_link_text, $strip_teaser );

  /**
  * Filters the post content.
  *
  * @since 0.71
  *
  * @param string $content Content of the current post.
  */
  $content = apply_filters( ‘the_content’, $content );
  $content = str_replace( ‘]]>’, ‘]]>’, $content );
  echo $content;
}

As you can see, it’s really just 4 lines of actual code. It calls get_the_content() to retrieve the content, applies filters, does an obscure string replacement (which I think I understand but is not really pertinent here), and then echoes the results out to the page.

It’s pretty clear to me that the problem has to lie in one (or more) of the filters in the 'the_content' stack. I have to admit that even after years of working with it, I only have a rather nebulous understanding of how hooks work, so I’m not even sure where to begin dissecting the filter stack here to pinpoint the source of the trouble.

Whenever I know something works in one place and doesn’t work in another place, the first course of action in troubleshooting is to try to identify all of the differences between the two environments. Obviously we have some big differences here as I noted at the top of this post. But I am going to assume that the problem does not lie at the OS layer. Most likely it’s either a difference between PHP 5.5 and 7.0, or, even more likely, a difference between the PHP configurations on the two servers… specifically, modules that are or are not active. See my previous post on The Hierarchy of Coding Errors for my rationale here. Also keep in mind that I personally was responsible for installing LAMP on the server and configuring PHP, and it’s pretty obvious that we’re looking at the sysadmin equivalent of #1 or #2 in that list.

The next step, were I to care to pursue it much further (and if I didn’t have 200 other more important things to do, now that I have the problem “fixed”), would be to run phpinfo() on both servers and identify all of the differences.

That’s one possible path, at least. Another thing to consider is that the_content() actually is working just fine in other parts of the site, so maybe it would be worth digging into that WordPress filter stack first.

At this point, because as I said I have a few other more important things to work on, I will probably leave the mystery unresolved here. But I’d welcome any ideas from readers as to an explanation for all of this.


Update! I just couldn’t leave well enough alone, so a few minutes after I published this post, with the client’s permission, I restored the old version of the template, turned on WP_DEBUG and installed the Debug Bar plugin. Jackpot! Debug Bar returned the following error message when I was calling the_content(), but not when I had my “fixed” code in place:

Screen Shot 2016-09-01 at 9.24.16 AM

Well, how about that? As it turns out, the problem is due to a filter I myself had added, using a previously written function. (That’s #3 on the hierarchy list.) Combine that with deprecated functionality that was removed in PHP 7.0, and problem solved. And I even figured out why the problem is only occurring on this page and not site-wide… because my filter only runs if there’s an email address link in the content.

Web developers: learn how to Google. If no one else has the same problem, the problem is you.

OK, maybe not you per se. This is not a judgment of your merits as a developer, or as a human being. But it does mean the problem is almost certainly something specific to the code you’ve written.

The Hierarchy of Coding Errors

If your code isn’t working, the source of the problem is one of the following, in order from most likely to least likely:

  1. New procedural code you’ve just written
  2. New object-oriented code you’ve just written*
  3. Custom functions or objects you built, but have used before
  4. Third-party/open source add-ons to the software platform you’re using (e.g. WordPress plugins)
  5. Standard functions or objects in the software platform (e.g. WordPress core)
  6. Public code libraries that are included in your chosen software platform (e.g. jQuery)
  7. Browser bugs
  8. OS bugs
  9. Internet protocol bugs
  10. Quantum fluctuations in the fabric of spacetime
  11. Gremlins

You may have guessed correctly at this point that this blog post is not just idle Friday afternoon musings. I’ve spent the majority of the day today troubleshooting a very strange issue with a website I’m currently building. I fixed the problem, but not after being forced to — once again — confront this humbling reality. If something’s not working, it’s probably your own fault. Especially if you’re the only person with the problem.

Googling the issue got me (almost) nowhere… which was the most obvious clue that it was my own fault

Aside from the natural human inclination to deflect blame, the tools we have for troubleshooting these types of problems are not necessarily well suited to forcing us to be honest with ourselves. It’s too easy to blame external forces.

Here’s my scenario. I found out last week while presenting work-in-progress to a client at their office that there was a JavaScript-related problem with the website. It only affected Internet Explorer (and Edge), which I had not yet tested the site in, and, weirdest of all, it didn’t always happen. I’d say maybe 10-20% of the time, the page loaded normally. But the rest of the time, it got an error.

Since this was only affecting one browser, my natural inclination was to start all the way at number 7 on the list, blaming Internet Explorer. But I’ve learned that as much as I want to blame it, issues with IE usually just shine a light on something in my own code that other browsers are more forgiving about. So it was time to walk backwards down the list. (Again… not really, but this is how it played out.)

The error that the browser reported was a “security problem” with jQuery Migrate. First I had to figure out what the hell jQuery Migrate was and why it was being loaded. (Turns out, it’s a place the jQuery team dumped deprecated code it pulled from version 1.9. It’s loaded by default by WordPress.)

With that in mind, this should be affecting every site I’ve built recently, since they’re all in WordPress. But it was only affecting this one site. So I had to try to narrow down where the problem exists. With WordPress, there are two main “variables” in the implementation: themes and plugins. When in doubt, try switching your theme and disabling the plugins you’re using. I started by disabling all of the plugins, one by one. No change. I found the error didn’t occur if I disabled Advanced Custom Fields, but that’s because half of the page didn’t load without it! (That’s another error on my part but let’s ignore that for now, shall we?)

OK, so it’s not a plugin. Next I swapped in the standard Twenty Sixteen theme in place of my custom theme. Not surprisingly, the error didn’t occur, but that didn’t help much because none of my Advanced Custom Fields content was in the pages. I still couldn’t rule out ACF as the culprit. But I tend to reuse field groups from site to site, so once again, if this were attributable to an ACF issue — even something specific to my field groups — it would’ve cropped up on another site.

So now I had little left to do but selectively comment out elements of the theme so I could narrow down where the problem was. (I make this all sound like a logical progression; in fact my debugging process is a lot more chaotic than this description — I actually did this commenting-out process haphazardly and repetitively throughout the afternoon.)

Eventually I pinpointed the troublesome block of code. Yes, it was #1 from the list. But as is usually the case with hard-to-diagnose problems, the complete picture here is that #1 included a combination of #3 and #5, which triggered an error message generated by #6, but only in the context of #7.

Yes. That’s what happened.

In the footer of the page, I had a link to the client’s email address. As is my standard (but by now probably outmoded) practice, I have a custom-built function I wrote years ago to obfuscate the email address by randomly converting most (but not all) of the characters in the string into HTML ampersand entities. My problem was not that function itself, which is tried and true. It’s that in this particular instance I called it on a string that included the mailto: pseudo-protocol, not just the email address itself.

I think the colon in mailto: is particularly significant to the problem, as evidenced by the fact that around 10-20% of the time the problem didn’t occur, and the page loaded normally. Since my obfuscation function randomly leaves characters in the string alone, that’s about how often the colon would have been kept untouched.

But even then, what difference should it make? Browsers decode those entity strings and can handle them in the href attribute of links just fine. However in this particular case I didn’t just use my obfuscation function. Without giving it much thought, in this particular site I had decided to wrap the obfuscated string in the standard WordPress esc_url() function. Trying to properly sanitize things, like a good developer. Right? Except — and I took a quick look at the source code to confirm it — there’s special handling in esc_url() for strings that don’t contain a colon. So the roughly 86% of the time that my string didn’t contain a colon, esc_url() was prepending http:// onto the string.

This situation was causing a particular piece of code in jQuery Migrate to barf… but only in Internet Explorer and Edge, for reasons I still don’t understand, but it has to do with how the different browsers handle security warnings in JavaScript. I found along the way (but before I had pinpointed the real problem) that if I commented out a particular segment of code in jQuery Migrate pertaining to the handling of selectors containing hashtags (see, the HTML ampersand entities again) I could get the page to load normally.

So, like I said, my newly written procedural code (#1), which itself included calls to both an existing custom function I wrote (#3) and a function baked into the WordPress core (#5), caused jQuery Migrate to issue an error (#6) but it was one that only a particular browser (Internet Explorer/Edge) cared to acknowledge (#7).

No wonder it took all afternoon to figure it out.

* The only reason I break out OO from procedural code is that OO has more structured patterns that are less likely to result in sloppy mistakes. Slightly.