How are open source CMSes like Microsoft enterprise software?

Aside from the fact that both topics would put the average blog reader to sleep before the end of the first…

OK, now that they’re asleep, let’s talk. Throughout most of my career, open source software and Microsoft’s (or, really, any software behemoth’s) enterprise “solutions” have seemed diametrically opposed. But the more I think about the situation, the more startling similarities I find, at least in their implementation (and the reasons for said implementation), if not in their actual structure and licensing.

If you’re the one person (besides me) who’s spent any significant amount of time reading this blog, you probably know two things: 1) I don’t like Microsoft, and 2) I don’t like Drupal. So these are the objects of my scorn in today’s post as well, although the problems I’m describing can be generalized, I think, to the broader sectors of the software industry that they represent.

When I worked in the corporate world, I resented Microsoft’s dominance across the board, from operating systems to desktop software to enterprise systems. It just seemed that most of their tools weren’t really that good, and eventually I began to realize that the reason they were successful was that Microsoft’s customers were not the end users, but rather the IT managers who made purchasing decisions. Those decisions were largely based on the managers’ own knowledge of and experience with Microsoft’s software (to the detriment of other, possibly superior options), but I also believe, cynically, that they were made to preserve the managers’ own jobs and those of their staffs. Microsoft’s systems require(d?) constant maintenance and support. Not only did this mean bigger IT staffs on the corporate payroll, but it meant lots of highly paid “consulting” firms whose sole job was to promote and then support the sales and implementation of Microsoft products.

In the indie developer world, where I now reside, the culture and software platforms are different, but perhaps not as different as they seem. Apple’s computers dominate the desktops in small studios, and the tabletops in coffeehouses where freelancers can frequently be spotted hunched over their MacBooks hard at work while sipping lattes and meeting (usually a little too loudly) with clients. And open source software dominates at the server level.

But just like Microsoft’s platforms, I think most open source software just isn’t really very good. And the problem, once again, is the customer (or… well… whatever you call the person who makes the decisions when selecting a free product). It seems that the end user experience is rarely given much priority when most open source software is being designed and developed. Part of the problem is a lack of direct contact between the development teams and those end users (or, to be honest, even between the geographically scattered members of the development teams themselves). Devs don’t really know what end users want or need. They only know what they want or need, along with what’s been submitted to their bug trackers.

It’s not that these devs are bad people, or bad at what they do. There’s just a disconnect between coder and user, and as a result the goal of building good software isn’t met.

So, why do independent developers still use tools that are not really the best for their clients? Again, cynically, I wonder sometimes if job security isn’t a factor. It’s a lot easier to build something that works, but that requires indefinite, ongoing attention and support, than to build something that is flawless, that you can hand off to your client and never touch again. It’s easier… and it provides built-in job security.

Now, I’m not perfect, and I’m not above all of this. There is no such thing as flawless software, and I have ongoing support contracts with some of my bigger clients. But I’m proud to say that’s mostly because I’m constantly building new sites for them, or building functional enhancements onto the sites they already have, rather than doing endless bug fixes and technical support because the tools I’ve sold them are too confusing or simply don’t work right. Sure, the bug fixes and tech support do happen. But the tools — primarily WordPress and cms34, my own CMS — are built much more with the end user in mind, and have managed to avoid the pitfalls that mean a guaranteed job for me at the expense of a mediocre user experience for my clients.

That’s harder, and riskier. But it’s better. I’m delivering a higher quality product to the clients, and I’m keeping my own work interesting and moving forward.

On the ugly history of early open source CMSes (or why, surprisingly, I did not enjoy listening to Merlin Mann and John Gruber together in a podcast)

It was a perfect idea, or so I thought. Two of my favorite personalities from 5by5 podcasts were going to do a show together: John Gruber (of The Talk Show and Daring Fireball) was going to appear on Merlin Mann’s Back to Work podcast!

Unfortunately, when Merlin Mann’s fire met John Gruber’s ice, the result, like Derek Smalls, was lukewarm water.

Or maybe it was just because they spent the first several minutes of the podcast — all I managed to get through in one sitting (though to be sure, it’s all that’s relevant for this post) — talking about the early days of open source content management systems (CMSes). Maybe it wasn’t really their fault. I reflexively dry heave when I hear words like PostNuke or Plone. They’re names I haven’t thought about in years, and haven’t thought about positively… well… ever. I dabbled with both of them in the early, early days of CMSes, and quickly ran away.

But then Gruber and Mann (sounds like a comedy duo from the ’60s) got into something that really stings: they started professing love for Drupal (Merlin) and Movable Type (John). I cannot tell you how strongly I dislike both Drupal and Movable Type.

Gruber loves him some Movable Type

It’s been a point of pride for me since becoming a loyal Daring Fireball reader that I do not like Movable Type. The fact that Gruber uses Movable Type for Daring Fireball and is a vocal supporter of the platform, and the fact that I loathe Movable Type, is something I use to prove that I have not just become a devoted Gruber acolyte. I still think independently; I just happen to agree with him on almost everything he writes about.

One of Gruber’s biggest reasons for liking Movable Type, apparently, is the fact that it doesn’t serve content from the database; it publishes the entire site out to static HTML files. This is great for server performance under heavy loads, and has probably allowed him to continue to run Daring Fireball on a much lower-powered server than he’d need if he were using a database-intensive CMS. But in my own experience it also makes the process of using Movable Type a chore. Perhaps that’s because I’ve never actually used it as an end user; my work with it has been limited to setting up templates/themes/whatever-they-call-them-in-MT for other people. MT’s tools for working with stylesheets and templates are cumbersome, MT’s proprietary scripting language is tedious for “non-natives” like me, and having to republish the entire site to see changes to the templates is a real pain in the ass.

On the other hand, Gruber loves to rip on my personal favorite open source CMS (though surely he doesn’t know, nor would he care, that it’s my favorite), WordPress. And it is precisely because WordPress doesn’t publish static pages the way MT does that he hates it so much. He commonly refers to sites that crash under the weight of his referral traffic as being “fireballed” (clever), and sites running a stock installation of WordPress are notorious for this. But there are several caching plugins available for WordPress that can greatly boost its ability to handle peak traffic, in a much less intrusive way than MT’s static HTML publishing process. My plugin of choice these days is Quick Cache. The plugin’s tagline says it all: “Speed Without Compromise.” (I’m not-so-secretly hoping that Gruber links to this post on Daring Fireball, but then again, I’m not sure I want to risk being proven wrong.)

Merlin and Drupal, sittin’ in a tree

And then there’s Merlin, and Drupal. Oh, Merlin. Again, my experience with Drupal is fairly limited. I tried using it to run this site for a brief period (which you can probably find if you dig in the archives… I believe it was around early 2006). I very quickly gave up. Drupal’s admin interface was clunky and unintuitive, but it was the fact that my site was almost instantly drowning in comment spam that killed the deal for me. In the time since then, the only work I’ve done with Drupal has been helping people move their sites off of it; the fact that so many people want to do that says enough for me.

I find two major issues with a lot of open source CMSes (like Joomla and MODx, to name two others I’ve tried in the past) that are perfectly exemplified by Drupal and Movable Type.

First, unless you really know what you’re doing and put a lot of effort into it, your Drupal site is going to look like a Drupal site. (Yes, it’s true that as I write this I’m using a more-or-less stock WordPress template, so my site looks like WordPress, but that’s not the point; it’s easy to build a completely custom WordPress theme from scratch with little more than basic HTML/CSS skills, as the sketch after my second point illustrates.)

Second, and I touched on this with Movable Type, many of these systems have their own custom scripting languages or other idiosyncrasies to be learned, such that learning to use these systems is not substantially easier than just learning to code HTML/CSS.
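To illustrate the contrast with WordPress on both points: a complete, if minimal, custom theme is just two files: a style.css whose comment header contains nothing more than a “Theme Name:” line, and an index.php wrapping The Loop in whatever HTML you like. Here’s a bare-bones sketch (the markup is hypothetical, but the template tags are standard WordPress):

<?php // index.php: the entire template for a minimal custom theme ?>
<html>
<head><title><?php bloginfo('name'); ?></title></head>
<body>
<?php if (have_posts()) : while (have_posts()) : the_post(); ?>
    <article>
        <h2><?php the_title(); ?></h2>
        <?php the_content(); ?>
    </article>
<?php endwhile; endif; ?>
</body>
</html>

No proprietary scripting language, no republishing step; if you can write HTML, you can drop these tags into it.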

Of course, I know how to code HTML and CSS. That’s my job. I have “visual” mode disabled in the WordPress editor. For me, then, a CMS exists not just to shield you from writing code, but to make it easy to manage, organize, and reorganize large amounts of site content while maintaining a consistent look and feel. But a good CMS goes beyond that: it puts the power to manage the website in the hands of people who don’t know, and don’t want to know, how to write code.

A little background

As I listened to Merlin Mann and John Gruber wistfully recall the glory days of crappy CMSes, I wondered why they had devoted so much time and attention to learning and working with those systems, while at the same time I had rejected them out of hand. (I also wondered why they were at roughly the same place in their careers as I was at the time, and now they’re “famous,” but that’s beside the point… or maybe not… read on.) Then it hit me: I didn’t like those CMSes because I had already built a few of my own and I liked them better.

Now, it’s true that I liked my CMSes better partly because, well, I had built them. And I think Merlin was spot-on in the podcast when he said that most systems worked great if you thought just like the developer. (Specifically, he said that Basecamp works great if you think like Jason Fried.) That’s a great point, and one I should not overlook. But there’s another aspect to this. Most of these open source CMSes are developed by online communities of… well… geeks. Sure, they take feature requests from users (I assume), but ultimately the main people giving the developers feedback on how the systems should work are other developers.

I’ve built a number of CMSes over my career. They’ve all been built to meet specific client/user needs, and always in direct consultation with those clients/users. Has that made them perfect, or made me impervious to casting my CMSes in my own image? Of course not. But it’s made it harder to hide away in my geek cave and crank out systems that only other geeks can use.

The first CMS I built was in the heady days of early 2000, just before the dot-com bust. I was working for a certain big box retailer’s dot-com subsidiary. (This was back in the days when certain big box retailers believed they could spin off dot-com subsidiaries as independent companies that issued their own stock and everyone got rich yay!!!)

The company had invested 7 figures in some colossal enterprise CMS that was going to take several months and thousands of hours of consultant time to customize to our needs. In the meantime, our staff of writers (yes, we had writers producing informative weekly articles for each “department” of the online store) would deliver their content to me and to the one other front-end developer who was good at HTML (I know, right?), and we were to manually convert them into HTML and put them into the static pages of the site. (As front-end developers, we were strictly forbidden any database access. Because, you know, we were dangerous. We might put HTML in it.)

After a few weeks, I decided this was a ridiculous arrangement, so over the weekend I scraped together a quick-and-dirty, database-free CMS that would allow the writers to enter their content directly. The system would merge their content with prebuilt page templates, and save the output as static HTML pages on the server.
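That original code is long gone (and was my employer’s IP anyway), but the core idea fits in a dozen lines of modern PHP. Everything here (file names, placeholders, form fields) is hypothetical:

<?php
// Sketch of the approach: merge a writer's submitted content into a
// prebuilt template and publish the result as a static HTML page.
// No database involved; the filesystem is the data store.
$template = file_get_contents('templates/department.html'); // contains {{title}} and {{content}} placeholders

$html = str_replace(
    array('{{title}}', '{{content}}'),
    array(htmlspecialchars($_POST['title']), $_POST['content']),
    $template
);

// basename() keeps a writer-supplied slug from escaping the target directory
file_put_contents('htdocs/departments/' . basename($_POST['slug']) . '.html', $html);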

And so a career as a CMS developer was born.

At my next several jobs, I built more CMSes, honing the process each time (and always starting from scratch, since my previous work was the IP of my former employers). And eventually, when I went out on my own in 2008 (not to mention my first failed attempt at full-time freelancing, in 2003, which will be the subject of a future post), it was natural that I’d create my own CMS. This time, since I own the IP, I can keep building on it and expanding its capabilities.

Does my CMS have the polish and ease-of-use of WordPress? Probably not, although I’ve had more than one client (more than two, even!) tell me they prefer my CMS over WordPress. But the stock version of WordPress has a fairly limited scope of capabilities (you’ve got your pages and you’ve got your posts and, hey, why would you need anything else?), and my CMS is more modular, with a number of other capabilities (built in event calendar with registration, ecommerce, custom forms, etc.) that can be turned on or off to suit the needs of the client. WordPress is highly extensible with plugins, of course, but I find that, in short, if a client needs a custom solution, they need a custom solution. My CMS is a shortcut to a custom solution.

So that’s why!

Along this mental journey through the littered landscape of dead (or dying) content management systems, I learned a few things. I learned why I never fell in love with those early CMSes. And I think I probably also learned why I’m not “famous” in the field of tech bloggers/podcasters: I’ve been too busy reinventing my own wheel for much of the past decade to devote the time to the self-promotion necessary to achieve that level of recognition (and I don’t mean that as a slight against Gruber and Mann).

CakePHP at the command line: it’s cron-tastic!

I’m kind of surprised it’s taken this long. cms34 has been around for almost three years now, and this is the first time I’ve had a client need a cron job that relied on CakePHP functionality. (The system has a couple of backup and general-purpose cleanup tools that can be configured as cron jobs, but they’re just simple shell scripts.)

And so it was today that I found myself retrofitting my CakePHP-based web application to support running scripts from the command line. I found a great post in the Bakery that got me about 85% of the way there, but there were some issues, mostly due to the fact that the post was written in late 2006, and a lot has happened in CakePHP land over the last 4 1/2 years.

There are two big differences in app/webroot/index.php in CakePHP 1.3 compared to the version that existed at the time of that original post: first, the calls to the Dispatcher object are now wrapped in a conditional, so the instruction to replace everything below the require line should now be something closer to “replace everything within the else statement at the bottom of the file.”

The other big change is that this file defines some constants for directories within the application, and those paths are all wrong if you move the file into the app directory as instructed.

Below is my revised version of the code. Note that I also reworked it slightly so the command can accept more than two arguments. There’s a marked spot before the Dispatcher is called where you can insert any logic needed to handle those additional arguments. (I also removed all of the comments that appear in the original version of the file.)

<?php
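// This file lives in app/ rather than app/webroot/, so ROOT is one
// directory up and APP_DIR is the directory containing this script.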
if (!defined('DS')) {
    define('DS', DIRECTORY_SEPARATOR);
}
if (!defined('ROOT')) {
    define('ROOT', dirname(dirname(__FILE__)));
}
if (!defined('APP_DIR')) {
    define('APP_DIR', basename(dirname(__FILE__)));
}
if (!defined('CAKE_CORE_INCLUDE_PATH')) {
    define('CAKE_CORE_INCLUDE_PATH', ROOT);
}
if (!defined('WEBROOT_DIR')) {
    define('WEBROOT_DIR', APP_DIR . DS . 'webroot');
}
if (!defined('WWW_ROOT')) {
    define('WWW_ROOT', WEBROOT_DIR . DS);
}
if (!defined('CORE_PATH')) {
    if (function_exists('ini_set') && ini_set('include_path', CAKE_CORE_INCLUDE_PATH . PATH_SEPARATOR . ROOT . DS . APP_DIR . DS . PATH_SEPARATOR . ini_get('include_path'))) {
        define('APP_PATH', null);
        define('CORE_PATH', null);
    } else {
        define('APP_PATH', ROOT . DS . APP_DIR . DS);
        define('CORE_PATH', CAKE_CORE_INCLUDE_PATH . DS);
    }
}
if (!include(CORE_PATH . 'cake' . DS . 'bootstrap.php')) {
    trigger_error("CakePHP core could not be found. Check the value of CAKE_CORE_INCLUDE_PATH in APP/webroot/index.php. It should point to the directory containing your " . DS . "cake core directory and your " . DS . "vendors root directory.", E_USER_ERROR);
}
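// Holdover from webroot/index.php: $_GET['url'] is never set at the
// command line, so CLI invocations always fall through to the else.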
if (isset($_GET['url']) && $_GET['url'] === 'favicon.ico') {
    return;
} else {
    // Dispatch the controller action given to it
    // eg php cron_dispatcher.php /controller/action
    define('CRON_DISPATCHER', true);
    if ($argc >= 2) {

        // INSERT ANY LOGIC FOR ADDITIONAL ARGUMENT VALUES HERE

        $Dispatcher = new Dispatcher();
        $Dispatcher->dispatch($argv[1]);
    }
}

After implementing this version of cron_dispatcher.php, I had CakePHP scripts running at the command line, but there were still a few further adjustments I needed to make (mostly in app_controller.php). Those were specific to my application; you’ll probably find yourself in the same boat.

A couple of other things worth noting: if your server is anything like mine, you’ll need to specify the full path of the PHP command line executable when running your scripts. In my case it was /usr/local/bin/php. And, the one sheepish confession I have to make: I know the goal with the way the file path constants are defined is to avoid any literal naming, but I couldn’t find a good way to get around that for WEBROOT_DIR, with the file no longer residing in webroot itself. I’ll leave fixing that as an exercise for the reader.
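To put it all together, here’s what a crontab entry using that full path might look like, running a hypothetical controller action nightly at 2am (the /maintenance/cleanup route is made up; substitute whatever controller/action you need dispatched):

# min hour day month weekday  command
0 2 * * * /usr/local/bin/php /path/to/app/cron_dispatcher.php /maintenance/cleanup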

Good luck!

Download the Script

cron_dispatcher.php.zip (1.2 KB)

Sendmail not working? Maybe your server’s IP is on a block list

This is pretty arcane, even for me, but since I spent several hours troubleshooting this problem this week — and the solution was nowhere to be found on Google — I figured it was worth sharing.

My CMS, cms34, as I’ve mentioned a few times before, is built on CakePHP. Some features of cms34 include automatically generated email messages. CakePHP has a nice email component that facilitates a lot of this work. It can be configured to use an SMTP server, but by default it sends mail directly from the web server using whatever you have installed on the server, either the ubiquitous sendmail or the more powerful (and capitalized) Postfix. Don’t unleash a deluge of flame comments on me, but I’m using sendmail. So be it.
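For reference, switching CakePHP 1.3’s EmailComponent between its two delivery modes looks roughly like this from any controller that includes the component (the SMTP values below are placeholders):

// Assumes the controller declares: var $components = array('Email');

// Default: hand the message to the local MTA (sendmail or Postfix)
// via PHP's mail() function.
$this->Email->delivery = 'mail';

// Or route everything through a real SMTP server instead:
$this->Email->delivery = 'smtp';
$this->Email->smtpOptions = array(
    'host'     => 'smtp.example.com', // placeholder
    'port'     => 25,
    'username' => 'someuser',         // placeholder
    'password' => 'secret',           // placeholder
);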

All was working well until a few weeks ago, when suddenly none of the messages were being sent. There were no errors on the website; the messages just wouldn’t go through. More confusing still, messages sent to my own domain did go through, but those sent to my clients’ domains never arrived.

Nothing except log entries, that is. Specifically, the mail log was filling up with lines like this:

Sep 13 13:45:56 redacted sm-mta[28158]: o8DIjsx0028156:
to=<redacted@redacted.com>, ctladdr=<redacted@redacted.com>
(33/33), delay=00:00:02, xdelay=00:00:01,
mailer=esmtp, pri=120799, relay=redacted.com.
[123.456.789.000], dsn=5.7.1, stat=Service unavailable

(Note that I’ve removed the real email and IP addresses to protect the innocent, namely myself.)

“Service unavailable,” huh? I researched that error extensively, without finding much. Eventually I was led to believe it might be an issue with my hostname, hosts, hosts.allow, and/or hosts.deny files.

A few relevant points: 1) my hosts.allow file contains only one (uncommented) line, “sendmail: LOCAL”; and 2) likewise, the hosts.deny file contains only “ALL: PARANOID”. I’ll save you some time right here: the problem I had ended up having nothing whatsoever to do with any of these host-related files. Leave ’em alone.

After following a number of these dead ends, I was inspired to check the mail file on the server for the user Apache runs as, in my case www-data. On Ubuntu Linux (and probably other flavors), these mail files can be found in /var/mail. Indeed, there were some interesting things to be found there, namely, a number of references to this URL:

http://www.spamhaus.org/query/bl?ip=123.456.789.000

(Again, the IP address has been changed… and yes, I know that’s not a valid IP address. That’s the point.)

I was not previously aware of The Spamhaus Project, but perhaps I should have been. The reason my messages weren’t getting through was because my server’s IP address was on the PBL: Policy Block List. Essentially, that is a list of all of the IP addresses (or IP ranges) in the world that, according to a well-defined set of rules, have no business acting like SMTP (Simple Mail Transfer Protocol*) servers — the servers that send mail out.

It stands to reason that my server was on this list; technically it’s not an SMTP server. But it’s perfectly legitimate for a web server to be running sendmail or Postfix or something of that nature, and sending messages out from the web applications it runs. Fortunately, it’s easy to get legitimate servers removed from the PBL. Simply fill out a form, verify your identity (via a code sent to you in an email message), and within about an hour, the changes will propagate worldwide.

Success! So if you’re in the same kind of situation I was in, where everything seems to be configured properly but your messages just aren’t going out for some reason, try checking Spamhaus to see if your IP is on the PBL.

* If you made it this far in the post, I shouldn’t have to explain the acronym. But I will anyway, as is my wont.

Making CakePHP’s TreeBehavior work with scope

We’re not talking mouthwash here. We’re talking code.

CakePHP’s TreeBehavior is cool, but tree traversal is a pretty arcane concept, even for developers, and MPTT (Modified Preorder Tree Traversal, the storage scheme TreeBehavior uses) is not easy to digest mentally, or to develop around. Unfortunately, it’s also not terribly efficient on large data sets. Even not-so-large data sets, on the order of a few hundred items, make certain actions — reordering, in particular — really processor intensive.
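For the unfamiliar: MPTT stores each node’s position as a pair of “left” and “right” numbers assigned by walking the tree in preorder, so a node’s descendants are exactly the rows whose values fall between its own. A four-page site tree might be numbered like this (sample data, obviously):

id | title   | lft | rght
---+---------+-----+-----
 1 | Home    |   1 |   8
 2 | About   |   2 |   5
 3 | Team    |   3 |   4
 4 | Contact |   6 |   7

Reading a subtree becomes a single BETWEEN query, which is the appeal; the catch is that inserting, deleting, or moving a node forces big swaths of the table to be renumbered, which is exactly why reordering gets expensive.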

I’m using a JavaScript drag-and-drop tool in cms34 to allow admins to manage the page tree on their sites, and the data gets stored using CakePHP’s TreeBehavior, which is MPTT-based. The problem is, at real-world sizes it just wasn’t working; reordering bogged down badly. So I completely rebuilt the code that assigns the über-critical “left” and “right” values to each node in the tree. My new way is much faster, even though it may break a few rules, and it requires bypassing TreeBehavior’s callbacks.

Once I got that working, I was delighted, but then I discovered some other problems, namely pertaining to… scope. My CMS supports multiple sites in one installation, which means multiple trees, which means scope. The problem is, I was having a hell of a time figuring out just how to make scope work with TreeBehavior. Finally I found a link to a succinct and effective solution.

The upshot here is that you can’t define this scope in your $actsAs in the model, because… well… to be honest, I only have a vague understanding of why not, but essentially it’s due to the split roles in the model-view-controller framework. The scope condition depends on a specific runtime data value (here, the schedule_id of the record being saved), and that value only exists once the controller is handling a request; the model describes only the abstract structure of the data. I understand it just enough to agree that it makes sense not to do it in the model. Which means you have to do it in the controller. The sample code from the link above goes a little something like this:

function add() {
    $this->Task->Behaviors->attach('Tree', array(
         'scope' => "Task.schedule_id = {$this->data['Task']['schedule_id']}"
    ));
    $this->Task->create();
    $this->Task->save($this->data);
}

That didn’t quite work in my situation, but it got me far enough along that I could figure out what to do from there.

I still need to do a little more testing to make sure my solution to the more efficient tree reordering is rock solid, and then I’ll post a tutorial. But for now, I hope this helps spread the word that scope does work on TreeBehavior… if you do it right.

Update (September 22, 2010): Although this didn’t really seem to be breaking anything, just throwing up a warning (when debugging was turned on), I discovered a minor issue with this code yesterday. Turns out CakePHP expects the value of scope to be an array. Just taking the string it was defined as and wrapping it in array() did the trick.
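In other words, using the same hypothetical Task example from above, the attach call becomes:

$this->Task->Behaviors->attach('Tree', array(
    'scope' => array("Task.schedule_id = {$this->data['Task']['schedule_id']}")
));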