Postscript to my first experience running WordPress on PHP 7.0

This is a companion piece to my post from earlier today, When switching servers breaks code: a WordPress mystery.

It’s not much of a mystery anymore, but since this was my first, and somewhat unplanned, foray into working with PHP 7.0, I felt it was worthwhile to spend a bit more time today finally exploring which features from earlier versions are gone in PHP 7.0, so I’ll know if I am going to encounter any more challenges in the eventual, inevitable move to 7.0.

The best source I can find on the removals is this RFC from two years ago (!) that lists deprecated features under consideration for removal, all of which did eventually end up being removed. (I assume this is a complete list of the removals, since I can find no other more current list. But suffice to say it is clear that everything listed here was removed, even if it’s not all that was removed.)

I’m happy to see that the only thing in here that I have been using on any recent projects that was removed was the /e modifier for preg_replace(), which is what I uncovered as the source of the “mystery” in my last post. Of course, before this morning I’m not sure I even would have remembered that I was using it.

That said, I’m relatively confident that I am not using any of the other features, because while I had forgotten about the /e modifier, I at least know I use preg_replace() on a regular basis. Most of the other features in the list are things I don’t recall ever having seen before. Such is the nature of being a self-taught user of a language with over 5,000 built-in functions.

There’s only one other removed feature that I know I have used. A lot. It’s the mysql extension. It has been deprecated in favor of the mysqli extension for several years. But my own CMS, cms34, built originally in 2008 on the then-current version 1.2 of CakePHP, has mysql functions everywhere. I did eventually manage to update the platform to version 1.3.21 of CakePHP, but the leap to 2.0 was too arduous, and reworking hundreds of files to use mysqli functions was, likewise, something that never seemed justified. I stopped new development on cms34 in early 2014, but we still have many client sites running on it. I have committed to supporting them as long as necessary, but I am actively encouraging those clients to make the switch to WordPress.

The absolute incompatibility of cms34 with PHP 7.0 makes the merits of that switch a lot more obvious. And now the clock is ticking.

Bottom line for anyone who, like me, is still supporting legacy PHP 5 (or earlier!) code: change is coming.


Update! Well, OK, I was just searching in the wrong way. The official documentation for migrating from PHP 5.6 to 7.0 has lists of both removed extensions and SAPIs and changed functions.

Maintaining session between SSL and non-SSL pages in CakePHP

It’s funny, in a way, that cms34 has been around for nearly five years now, and it’s only just become a real issue that we were not maintaining sessions between SSL and non-SSL pages. This is somewhat understandable: practically speaking, the only time it really matters to carry over the session between the two is when a user is logged in. (This might not be the case with all web applications, but for us, at least, there’s rarely enough happening in the session when a user is not logged in for it to matter.)

As it happens, not that many cms34 sites use SSL; not that many cms34 sites use the user login capability on the front end. And very few use both. But we’ve had a couple of new sites come online lately that do use both, and it’s become a bit of an issue.

The issue was exacerbated by the fact that I recently modified the Users controller to require SSL on the login page, if the site has an SSL certificate. Consequently there were issues with trying to access login-restricted, but non-SSL pages… redirect loops and other such fun.

What’s the problem?

The problem is simple: for obvious security reasons, sessions and cookies cannot be shared directly between two different domains. It’s possible (although less secure) to share them between both SSL and non-SSL on the same domain, and it’s also relatively easy to set them up to work between different subdomains. But if your SSL pages use a different domain name than the non-SSL pages, even if they’re on the same server, there’s no way to get them to automatically use the same session.

The solution (though still not ideal, as it can introduce the risk of session hijacking), as you’ll find in lots of places, is to pass the session ID as a query string variable. Then you can use that to restore the same session ID, even if it’s on another domain — as long as it’s on the same physical server.

Some improvements

There are two key improvements I made to the basic “pass the session ID in the query string” scenario.

First, when the session is created I am writing the user’s IP address (using $_SERVER['REMOTE_ADDR']) as a session variable. Then, when I am attempting to restore the session with the ID passed as a query string variable, I read the session file on the server first, and make sure the IP address in the file matches still matches the user’s current IP address. Only then do I restore the session.

Second, and this is an aesthetic issue almost as much as a security one, once the session has been re-established, and before any response has been sent, I strip the session ID out of the requested URL and redirect to that new URL. It’s all invisible to the user, and the session ID never actually appears in the browser’s address bar.

A look at the code

There’s a lot going on in the cms34 code, much of which is highly specific to this application. But in short the keys to making this work happen in two places:

UsersController::login()

I have a login() action in UsersController that handles all of the special functionality that needs to happen when a user logs in. The actual login itself happens “automagically” via AuthComponent, but Auth doesn’t know everything I need to have happen when a user logs in, so after Auth does its work, my login() action takes it from there.

Honestly not a lot needs to happen here to make this work. Just two things: you have to write the user’s IP address to the session as I noted above, and you have to pass the session ID in a query string variable on the redirect that happens when login is successful. My code looks a little something like this (note that I have an array in the session called Misc that I use for… miscellaneous stuff like this):

class UsersController extends AppController {

  var $name = 'Users'
  // Other controller variables go here, of course.

  function login() {

    // All of this should only run if AuthComponent has already logged the user in.
    // Your session variable names may vary.
    if ($this->Session->read('Auth.User')) {

      // Various session prep stuff happens here.

      // Write IP address to session (used to verify user when restoring session).
      $this->Session->write('Misc.remote_addr(',$_SERVER['REMOTE_ADDR']);

      // Some conditionals for special redirects come here but we'll skip that.

      // Redirect user to home page, with session ID in query string.
      // Make up a query string variable that makes sense for you.
      $this->redirect('/?cms34sid=' . session_id());

    }
  }
}

So far, so good. The rest of the excitement happens in…

AppController::beforeFilter()

Ah yes, the magical beforeFilter() method. There’s a whole lot of stuff going on in AppController::beforeFilter() in cms34, most of which is highly specific to our application. But this is where you will need to put your code to retrieve the session ID from the query string and restore the session… this function runs at the beginning of every page load on your site.

I’ve put this logic almost at the beginning of beforeFilter(), because we really do want that session restored as soon as possible.

Here’s a look…

class AppController extends Controller {

  function beforeFilter() {

    // Additional code specific to your app will likely come before and after this.

    // Only run if session ID query string variable is passed and different from the existing ID.
    // Be sure to replace cms34sid with your actual query string variable name.
    if (!empty($this->params['url']['cms34sid']) && $this->params['url']['cms34sid'] != session_id()) {

      // Verify session file exists.
      // I am using CakeSession; your session files may be elsewhere.
      $session_file = TMP.DS.'sessions'.DS.'sess_'.$this->params['url']['cms34sid'];

      if (file_exists($session_file)) {
        $session_contents = file_get_contents($session_file);

        // Find user's IP address in session file data (to verify identity).
        // The CakePHP session file stores a set of serialized arrays; we're reading raw serialized data.
        // If you used a different session variable name than remote_addr, change it AND the 11 to its string length.
        $session_match = 's:11:"remote_addr";s:'.strlen($_SERVER['REMOTE_ADDR']).':"'.$_SERVER['REMOTE_ADDR'] .'";';

        // User's IP address is in session file; so we can continue.
        if (strpos($session_contents,$session_match) !== false) {

          // Set session ID to restore session
          $this->Session->id($this->params['url']['cms34sid']);

          // Redirect to this same URL without session ID in query string
          $current_url = rtrim(preg_replace('/cms34sid=[^&]+[&]?/','',current_url()),'?&');
          $this->redirect($current_url);
        }
      }
    }
  }
}

A few final thoughts

I didn’t really discuss cookies at all here, but suffice to say there’s a cookie containing the session ID that gets written. If you’re only using cookies for the session ID (which is probably a good idea), then you don’t really need to do anything else with them. But if you’re writing certain cookies when a user logs in (like I do), you’ll need to write additional logic to restore them in AppController::beforeFilter(). In my case, the relevant cookies are all duplicates of session data, but are intended for a few edge cases where I need to access that information through JavaScript or in .htaccess files that are protecting login-restricted downloadable files — in other words, places where I can’t use PHP to look at session data.

You may also notice near the end of the code in the AppController::beforeFilter() example above that I am calling a function called current_url(). This is not a built-in part of PHP or CakePHP; it’s a simple little function I have in my config/functions.php file. Here it is:

function current_url() {
  return (!empty($_SERVER['HTTPS']) ? 'https://' : 'http://') . $_SERVER['SERVER_NAME'] . $_SERVER['REQUEST_URI'];
}

CakePHP at the command line: it’s cron-tastic!

I’m kind of surprised it’s taken this long. cms34 has been around for almost three years now, and this is the first time I’ve had a client need a cron job that relied on CakePHP functionality. (The system has a couple of backup and general-purpose cleanup tools that can be configured as cron jobs, but they’re just simple shell scripts.)

And so it was today that I found myself retrofitting my CakePHP-based web application to support running scripts from the command line. I found a great post in the Bakery that got me about 85% of the way there, but there were some issues, mostly due to the fact that the post was written in late 2006, and a lot has happened in CakePHP land over the last 4 1/2 years.

There are two big differences in app/webroot/index.php in CakePHP 1.3 compared to the version that existed at the time of that original post: first, the calls to the Dispatcher object are now wrapped in a conditional, so the instruction to replace everything below the require line should now be something closer to “replace everything within the else statement at the bottom of the file.”

The other big change is that this file defines some constants for directories within the application, and those paths are all wrong if you move the file into the app directory as instructed.

Below is my revised version of the code. Note that I also reworked it slightly so the command can accept more than two arguments as well. There’s a space before Dispatcher is called where you can insert any necessary logic for handling those arguments. (I also removed all of the comments that appear in the original version of the file.)

<?php
if (!defined('DS')) {
    define('DS', DIRECTORY_SEPARATOR);
}
if (!defined('ROOT')) {
    define('ROOT', dirname(dirname(__FILE__)));
}
if (!defined('APP_DIR')) {
    define('APP_DIR', basename(dirname(__FILE__)));
}
if (!defined('CAKE_CORE_INCLUDE_PATH')) {
    define('CAKE_CORE_INCLUDE_PATH', ROOT);
}
if (!defined('WEBROOT_DIR')) {
    define('WEBROOT_DIR', APP_DIR . DS . 'webroot');
}
if (!defined('WWW_ROOT')) {
    define('WWW_ROOT', WEBROOT_DIR . DS);
}
if (!defined('CORE_PATH')) {
    if (function_exists('ini_set') && ini_set('include_path', CAKE_CORE_INCLUDE_PATH . PATH_SEPARATOR . ROOT . DS . APP_DIR . DS . PATH_SEPARATOR . ini_get('include_path'))) {
        define('APP_PATH', null);
        define('CORE_PATH', null);
    } else {
        define('APP_PATH', ROOT . DS . APP_DIR . DS);
        define('CORE_PATH', CAKE_CORE_INCLUDE_PATH . DS);
    }
}
if (!include(CORE_PATH . 'cake' . DS . 'bootstrap.php')) {
    trigger_error("CakePHP core could not be found. Check the value of CAKE_CORE_INCLUDE_PATH in APP/webroot/index.php. It should point to the directory containing your " . DS . "cake core directory and your " . DS . "vendors root directory.", E_USER_ERROR);
}
if (isset($_GET['url']) && $_GET['url'] === 'favicon.ico') {
    return;
} else {
    // Dispatch the controller action given to it
    // eg php cron_dispatcher.php /controller/action
    define('CRON_DISPATCHER',true);
    if($argc >= 2) {

        // INSERT ANY LOGIC FOR ADDITIONAL ARGUMENT VALUES HERE

        $Dispatcher= new Dispatcher();
        $Dispatcher->dispatch($argv[1]);
    }
}

After implementing this version of cron_dispatcher.php, I was able to get CakePHP scripts to run at the command line without being fundamentally broken, but there were still a few further adjustments I needed to make (mostly in app_controller.php). Those were specific to my application. You’ll probably find yourself in the same boat.

A couple of other things worth noting: if your server is anything like mine, you’ll need to specify the full path of the PHP command line executable when running your scripts. In my case it was /usr/local/bin/php. And, the one sheepish confession I have to make: I know the goal with the way the file path constants are defined is to avoid any literal naming, but I couldn’t find a good way to get around that for WEBROOT_DIR, with the file no longer residing in webroot itself. I’ll leave fixing that as an exercise for the reader.

Good luck!

Download the Script

Download zip file... cron_dispatcher.php
cron_dispatcher.php.zip • 1.2 KB

Sendmail not working? Maybe your server’s IP is on a block list

This is pretty arcane, even for me, but since I spent several hours troubleshooting this problem this week — and the solution was nowhere to be found on Google — I figured it was worth sharing.

My CMS, cms34, as I’ve mentioned a few times before, is built on CakePHP. Some features of cms34 include automatically generated email messages. CakePHP has a nice email component that facilitates a lot of this work. It can be configured to use an SMTP server, but by default it sends mail directly from the web server using whatever you have installed on the server, either the ubiquitous sendmail or the more powerful (and capitalized) Postfix. Don’t unleash a deluge of flame comments on me, but I’m using sendmail. So be it.

All was working well until a few weeks ago, when suddenly none of the mails were being sent. There were no errors on the website; the messages just wouldn’t go through. What was more confusing was that messages being sent to my own domain did go through, but for those being sent to my clients’ domains, nothing.

Nothing except log entries, that is. Specifically, the mail log was filling up with lines like this:

Sep 13 13:45:56 redacted sm-mta[28158]: o8DIjsx0028156:
to=<redacted@redacted.com>, ctladdr=<redacted@redacted.com>
(33/33), delay=00:00:02, xdelay=00:00:01,
mailer=esmtp, pri=120799, relay=redacted.com.
[123.456.789.000], dsn=5.7.1, stat=Service unavailable

(Note that I’ve removed the real email and IP addresses to protect the innocent, namely myself.)

“Service unavailable,” huh? I researched that error extensively, without finding much. Eventually I was led to believe it may be an issue with my hostname, hosts, hosts.allow and/or hosts.deny files.

A few relevant points: 1) my hosts.allow file only contains one (uncommented) line: sendmail: LOCAL and 2) likewise, the hosts.deny file only contains: ALL: PARANOID. I’ll save you some time right here: the problem I had ended up having nothing whatsoever to do with any of these host-related files. Leave ’em alone.

After following a number of these dead ends, I was inspired to check the mail file on the server for the user Apache runs as, in my case www-data. On Ubuntu Linux (and probably other flavors), these mail files can be found in /var/mail. Indeed, there were some interesting things to be found there, namely, a number of references to this URL:

http://www.spamhaus.org/query/bl?ip=123.456.789.000

(Again, the IP address has been changed… and yes, I know that’s not a valid IP address. That’s the point.)

I was not previously aware of The Spamhaus Project, but perhaps I should have been. The reason my messages weren’t getting through was because my server’s IP address was on the PBL: Policy Block List. Essentially, that is a list of all of the IP addresses (or IP ranges) in the world that, according to a well-defined set of rules, have no business acting like SMTP (Simple Mail Transfer Protocol*) servers — the servers that send mail out.

It stands to reason that my server was on this list; technically it’s not an SMTP server. But it’s perfectly legitimate for a web server to be running sendmail or Postfix or something of that nature, and sending messages out from the web applications it runs. Fortunately, it’s easy to get legitimate servers removed from the PBL. Simply fill out a form, verify your identity (via a code sent to you in an email message), and within about an hour, the changes will propagate worldwide.

Success! So if you’re in the same kind of situation I was in, where everything seems to be configured properly but your messages just aren’t going out for some reason, try checking Spamhaus to see if your IP is on the PBL.

* If you made it this far in the post, I shouldn’t have to explain the acronym. But I will anyway, as is my wont.

Making CakePHP’s TreeBehavior work with scope

We’re not talking mouthwash here. We’re talking code.

CakePHP’s TreeBehavior is cool, but tree traversal is a pretty arcane concept, even for developers, and MPTT is not something that is easy to digest mentally, or to develop around. Unfortunately, it’s also not incredibly efficient on large data sets. Even not-so-large data sets, on the order of a few hundred items, make certain actions — reordering, in particular — really processor intensive.

I’m using a JavaScript drag-and-drop tool in cms34 to allow admins to manage the page tree on their sites, and the data gets stored using CakePHP’s TreeBehavior, which is MPTT-based. The problem is, it really wasn’t working. So I completely rebuilt the code that assigns the über-critical “left” and “right” values to each node in the tree. My new way is much faster, even though it may break a few rules, and requires bypassing TreeBehavior’s callbacks.

Once I got that working, I was delighted, but then I discovered some other problems, namely pertaining to… scope. My CMS supports multiple sites in one installation, which means multiple trees, which means scope. The problem is, I was having a hell of a time figuring out just how to make scope work with TreeBehavior. Finally I found a link to a succinct and effective solution.

The upshot here is that you can’t define your $actsAs in the model, because… well… to be honest, I only have a vague understanding of why not, but essentially it’s due to the split roles of the model-view-controller framework. You’re building a rule that requires access to specific data values, which is something that needs to happen in the controller; the model is strictly for the abstract structure of the data. I understand it just enough to agree that it makes sense not to do it in the model. Which means you have to do it in the controller. The sample code from the link above goes a little something like this:

function add() {
    $this->Task->Behaviors->attach('Tree', array(
         'scope' => "Task.schedule_id = {$this->data['Task']['schedule_id']}"
    ));
    $this->Task->create();
    $this->Task->save($this->data);
}

That didn’t quite work in my situation, but it got me far enough along that I could figure out what to do from there.

I still need to do a little more testing to make sure my solution to the more efficient tree reordering is rock solid, and then I’ll post a tutorial. But for now, I hope this helps spread the word that scope does work on TreeBehavior… if you do it right.

Update (September 22, 2010): Although this didn’t really seem to be breaking anything, just throwing up a warning (when debugging was turned on), I discovered a minor issue with this code yesterday. Turns out CakePHP expects the value of scope to be an array. Just taking the string it was defined as and wrapping it in array() did the trick.