How are open source CMSes like Microsoft enterprise software?

Aside from the fact that both topics would put the average blog reader to sleep before the end of the first…

OK, now that they’re asleep, let’s talk. Throughout most of my career, open source software and Microsoft’s (or, really, any software behemoth’s) enterprise “solutions” have seemed diametrically opposed. But the more I think about the situation, I begin to find some startling similarities, at least in their implementation (and reasons for said implementation), if not in their actual structure and licensing.

If you’re the one person (besides me) who’s spent any significant amount of time reading this blog, you probably know two things: 1) I don’t like Microsoft, and 2) I don’t like Drupal. So these are the objects of my scorn in today’s post as well, although the problems I’m describing can be generalized, I think, to the broader sectors of the software industry that they represent.

When I worked in the corporate world, I resented Microsoft’s dominance across the board from operating systems to desktop software to enterprise systems. It just seemed that most of their tools weren’t really that good, and eventually I began to realize that the reason they were successful was that Microsoft’s customers were not the end users, but rather the IT managers who made purchasing decisions. These decisions were largely based on their own knowledge and experience with Microsoft’s software (to the detriment of other, possibly superior options), but also (I believe cynically) to preserve their own jobs and those of their staffs. Microsoft’s systems require(d?) constant maintenance and support. Not only did this mean bigger IT staffs on the corporate payroll, but it meant lots of highly paid “consulting” firms whose sole job was to promote and then support the sales and implementation of Microsoft products.

In the indie developer world, where I now reside, the culture and software platforms are different, but perhaps not as different as they seem. Apple’s computers dominate the desktops in small studios, and the tabletops in coffeehouses where freelancers can frequently be spotted hunched over their MacBooks hard at work while sipping lattes and meeting (usually a little too loudly) with clients. And open source software dominates at the server level.

But just like Microsoft’s platforms, I think most open source software just isn’t really very good. And the problem, once again, is the customer (or… well… whatever you call the person who makes the decisions when selecting a free product). It seems that the end user experience is rarely given much priority when most open source software is being designed and developed. Part of the problem is a lack of direct contact between the development teams and those end users (or, to be honest, even between the geographically scattered members of the development teams themselves). Devs don’t really know what end users want or need. They only know what they want or need, along with what’s been submitted to their bug trackers.

It’s not that these devs are bad people, or bad at what they do. There’s just a disconnect between coder and user, and as a result the goal of building good software isn’t met.

So, why do independent developers still use tools that are not really the best for their clients? Again, cynically, I wonder sometimes if job security isn’t a factor. It’s a lot easier to build something that works, but that requires indefinite, ongoing attention and support, than to build something that is flawless, that you can hand off to your client and never touch again. It’s easier… and it provides built-in job security.

Now, I’m not perfect, and I’m not above all of this. There is no such thing as flawless software, and I have ongoing support contracts with some of my bigger clients. But I’m proud to say that’s mostly because I’m constantly building new sites for them, or building functional enhancements onto the sites they already have, rather than doing endless bug fixes and technical support because the tools I’ve sold them are too confusing or simply don’t work right. Sure, the bug fixes and tech support do happen. But the tools — primarily WordPress and cms34, my own CMS — are built much more with the end user in mind, and have managed to avoid the pitfalls that mean a guaranteed job for me at the expense of a mediocre user experience for my clients.

That’s harder, and riskier. But it’s better. I’m delivering a higher quality product to the clients, and I’m keeping my own work interesting and moving forward.

On the ugly history of early open source CMSes (or why, surprisingly, I did not enjoy listening to Merlin Mann and John Gruber together in a podcast)

It was a perfect idea, or so I thought. Two of my favorite personalities from 5by5 podcasts were going to do a show together: John Gruber (of The Talk Show and Daring Fireball) was going to appear on Merlin Mann’s Back to Work podcast!

Unfortunately, when Merlin Mann’s fire met John Gruber’s ice, the result, like Derek Smalls, was lukewarm water.

Or maybe it was just because they spent the first several minutes of the podcast — all I managed to get through in one sitting (though to be sure, it’s all that’s relevant for this post) — talking about the early days of open source content management systems (CMSes). It maybe wasn’t really their fault. I reflexively dry heave when I hear words like PostNuke or Plone. They’re names I haven’t thought about in years, and haven’t thought about positively… well… ever. I dabbled with both of them in the early early days of CMSes, and quickly ran away.

But then Gruber and Mann (sounds like a comedy duo from the ’60s) got into something that really stings: they started professing love for Drupal (Merlin) and Movable Type (John). I cannot tell you how strongly I dislike both Drupal and Movable Type.

Gruber loves him some Movable Type

It’s been a point of pride for me since becoming a loyal Daring Fireball reader that I do not like Movable Type. The fact that Gruber uses Movable Type for Daring Fireball and is a vocal supporter of the platform, and the fact that I loathe Movable Type, is something I use to prove that I have not just become a devoted Gruber acolyte. I still think independently; I just happen to agree with him on almost everything he writes about.

One of Gruber’s biggest reasons for liking Movable Type, apparently, is the fact that it doesn’t serve content from the database; it publishes the entire site out to static HTML files. This is great for server performance under heavy loads, and has probably allowed him to continue to run Daring Fireball with a much lower-powered server than he’d need if he were using a database-intensive CMS. But in my own experience it also makes the process of using Movable Type a chore. Perhaps that’s because I’ve never actually used it directly myself; my work with it has been limited to setting up templates/themes/whatever-they-call-them-in-MT for other people. MT’s tools for working with stylesheets and templates are cumbersome, MT’s proprietary scripting language is tedious for “non-natives” like me, and having to republish the entire site to see changes to the templates is a real pain in the ass.

On the other hand, Gruber loves to rip on my personal favorite open source CMS (though surely he doesn’t know, nor would he care, that it’s my favorite), WordPress. And it is precisely because WordPress doesn’t publish static pages the way MT does that he hates it so much. He commonly refers to sites that crash under the weight of his referral traffic as being “fireballed” (clever), and sites running a stock installation of WordPress are notorious for this. But there are several caching plugins available for WordPress that can greatly boost its ability to handle peak traffic, in a much less intrusive way than MT’s static HTML publishing process. My plugin of choice these days is Quick Cache. The plugin’s tagline says it all: “Speed Without Compromise.” (I’m not-so-secretly hoping that Gruber links to this post on Daring Fireball, but then again, I’m not sure I want to risk being proven wrong.)

Merlin and Drupal, sittin’ in a tree

And then there’s Merlin, and Drupal. Oh, Merlin. Again my experience with Drupal is fairly limited. I tried using it to run this site for a brief period (which you can probably find if you dig in the archives… I believe it was around early 2006). I very quickly gave up. Drupal’s admin interface was clunky and unintuitive, but it was the fact that my site was almost instantly drowning in comment spam that killed the deal for me. In the time since then the only work I’ve done with Drupal has been focused on helping people move their sites off of it, but the fact that so many people want to do that says enough for me.

I find two major issues with a lot of open source CMSes (like Joomla and MODx, to name two others I’ve tried in the past) that are perfectly exemplified by Drupal and Movable Type.

First, unless you really know what you’re doing and put a lot of effort into it, your Drupal site is going to look like a Drupal site. (Yes, it’s true that as I am writing this I’m using a more-or-less stock WordPress template, so my site looks like WordPress, but that’s not the point; it’s easy to build a completely custom WordPress theme from scratch with little more than basic HTML/CSS skills.)

Second, and I touched on this with Movable Type, many of these systems have their own custom scripting languages or other idiosyncrasies to be learned, such that learning to use these systems is not substantially easier than just learning to code HTML/CSS.

Of course, I know how to code HTML and CSS. That’s my job. I have “visual” mode disabled in the editor in WordPress. A CMS exists not just to shield you from writing code, but to make it easy to manage, organize, and re-organize large amounts of site content, while maintaining a consistent look and feel. But a good CMS goes beyond that. A good CMS puts the power to manage the website in the hands of people who don’t know, and don’t want to know, how to write code.

A little background

As I listened to Merlin Mann and John Gruber wistfully recall the glory days of crappy CMSes, I wondered why they had devoted so much time and attention to learning and working with those systems, while at the same time I had rejected them out of hand. (I also wondered why they were at roughly the same place in their careers as I was at the time, and now they’re “famous,” but that’s beside the point… or maybe not… read on.) Then it hit me: I didn’t like those CMSes because I had already built a few of my own and I liked them better.

Now, it’s true that I liked my CMSes better partly because, well, I had built them. And I think Merlin was spot-on in the podcast when he said that most systems worked great if you thought just like the developer. (Specifically, he said that Basecamp works great if you think like Jason Fried.) That’s a great point, and one I should not overlook. But there’s another aspect to this. Most of these open source CMSes are developed by online communities of… well… geeks. Sure, they take feature requests from users (I assume), but ultimately the main people giving the developers feedback on how the systems should work are other developers.

I’ve built a number of CMSes over my career. They’ve all been built to meet specific client/user needs, and always in direct consultation with those clients/users. Has that made them perfect, or made me impervious to casting my CMSes in my own image? Of course not. But it’s made it harder to hide away in my geek cave and crank out systems that only other geeks can use.

The first CMS I built was in the heady days of early 2000, just before the dot-com bust. I was working for a certain big box retailer’s dot-com subsidiary. (This was back in the days when certain big box retailers believed they could spin off dot-com subsidiaries as independent companies that issued their own stock and everyone got rich yay!!!)

The company had invested 7 figures in some colossal enterprise CMS that was going to take several months and thousands of hours of consultant time to customize to our needs. In the meantime, our staff of writers (yes, we had writers producing informative weekly articles for each “department” of the online store) would deliver their content to me and to the one other front-end developer who was good at HTML (I know, right?), and we were to manually convert them into HTML and put them into the static pages of the site. (As front-end developers, we were strictly forbidden any database access. Because, you know, we were dangerous. We might put HTML in it.)

After a few weeks, I decided this was a ridiculous arrangement, so over the weekend I scraped together a quick-and-dirty, database-free CMS that would allow the writers to enter their content directly. The system would merge their content with prebuilt page templates, and save the output as static HTML pages on the server.

And so a career as a CMS developer was born.

At my next several jobs, I built more CMSes, honing the process each time (and always starting from scratch, since my previous work was the IP of my former employers). And eventually, when I went out on my own in 2008 (not to mention my first failed attempt at full-time freelancing, in 2003, which is going to be the subject of a future post), it was natural that I’d create my own CMS, and this time since I own the IP, I can keep building on it and expanding its capabilities.

Does my CMS have the polish and ease-of-use of WordPress? Probably not, although I’ve had more than one client (more than two, even!) tell me they prefer my CMS over WordPress. But the stock version of WordPress has a fairly limited scope of capabilities (you’ve got your pages and you’ve got your posts and, hey, why would you need anything else?), and my CMS is more modular, with a number of other capabilities (built in event calendar with registration, ecommerce, custom forms, etc.) that can be turned on or off to suit the needs of the client. WordPress is highly extensible with plugins, of course, but I find that, in short, if a client needs a custom solution, they need a custom solution. My CMS is a shortcut to a custom solution.

So that’s why!

Along this mental journey through the littered landscape of dead (or dying) content management systems, I learned a few things. I learned why I never fell in love with those early CMSes. And I think I probably also learned why I’m not “famous” in the field of tech bloggers/podcasters. I’ve been too busy reinventing my own wheel for much of the past decade to have the time to devote to self-promotion (and I don’t mean that as a slight against Gruber and Mann) necessary to achieve that level of recognition.

How I migrated a client’s blog from Drupal to WordPress 3.0

I just finished migrating a client’s blog from Drupal to WordPress 3.0. When I took on the project, I assumed there was a straightforward way to migrate content from Drupal to WordPress, but I was mistaken. Fortunately, there is a way, but it involves directly mucking about with the Drupal and WordPress databases. Never fear! There are some tools to help.

After a bunch of research, I found the most comprehensive and up-to-date solution on Mike Smullin’s blog. It wasn’t that up-to-date, though, having been written over a year ago and targeted at WordPress 2.7. Some helpful comments on that post offered a few refinements, and I added a few of my own to accommodate the particular quirks of this client’s site.

Since Mike Smullin (and D’Arcy Norman and Dave Dash before him) was so kind to share his work, I thought I should do the same. And so, I present my further refined SQL script for migrating Drupal data to a WordPress 3.0 site.

# DRUPAL-TO-WORDPRESS CONVERSION SCRIPT

# Changelog

# 07.29.2010 – Updated by Scott Anderson / Room 34 Creative Services http://blog.room34.com/archives/4530
# 02.06.2009 – Updated by Mike Smullin http://www.mikesmullin.com/development/migrate-convert-import-drupal-5-to-wordpress-27/
# 05.15.2007 – Updated by D’Arcy Norman http://www.darcynorman.net/2007/05/15/how-to-migrate-from-drupal-5-to-wordpress-2/
# 05.19.2006 – Created by Dave Dash http://spindrop.us/2006/05/19/migrating-from-drupal-47-to-wordpress/

# This assumes that WordPress and Drupal are in separate databases, named 'wordpress' and 'drupal'.
# If your database names differ, adjust these accordingly.

# Empty previous content from WordPress database.
TRUNCATE TABLE wordpress.wp_comments;
TRUNCATE TABLE wordpress.wp_links;
TRUNCATE TABLE wordpress.wp_postmeta;
TRUNCATE TABLE wordpress.wp_posts;
TRUNCATE TABLE wordpress.wp_term_relationships;
TRUNCATE TABLE wordpress.wp_term_taxonomy;
TRUNCATE TABLE wordpress.wp_terms;

# If you're not bringing over multiple Drupal authors, comment out these lines and the other
# author-related queries near the bottom of the script.
# This assumes you're keeping the default admin user (user_id = 1) created during installation.
DELETE FROM wordpress.wp_users WHERE ID > 1;
DELETE FROM wordpress.wp_usermeta WHERE user_id > 1;

# TAGS
# Using REPLACE prevents script from breaking if Drupal contains duplicate terms.
REPLACE INTO wordpress.wp_terms
    (term_id, `name`, slug, term_group)
    SELECT DISTINCT
        d.tid, d.name, REPLACE(LOWER(d.name), ' ', '_'), 0
    FROM drupal.term_data d
    INNER JOIN drupal.term_hierarchy h
        USING(tid)
    INNER JOIN drupal.term_node n
        USING(tid)
    WHERE (1
         # This helps eliminate spam tags from import; uncomment if necessary.
         # AND LENGTH(d.name) < 50
    )
;

INSERT INTO wordpress.wp_term_taxonomy
    (term_id, taxonomy, description, parent)
    SELECT DISTINCT
        d.tid `term_id`,
        'post_tag' `taxonomy`,
        d.description `description`,
        h.parent `parent`
    FROM drupal.term_data d
    INNER JOIN drupal.term_hierarchy h
        USING(tid)
    INNER JOIN drupal.term_node n
        USING(tid)
    WHERE (1
         # This helps eliminate spam tags from import; uncomment if necessary.
         # AND LENGTH(d.name) < 50
    )
;

# POSTS
# Keeps private posts hidden.
INSERT INTO wordpress.wp_posts
    (id, post_author, post_date, post_content, post_title, post_excerpt,
    post_name, post_modified, post_type, `post_status`)
    SELECT DISTINCT
        n.nid `id`,
        n.uid `post_author`,
        FROM_UNIXTIME(n.created) `post_date`,
        r.body `post_content`,
        n.title `post_title`,
        r.teaser `post_excerpt`,
        IF(SUBSTR(a.dst, 11, 1) = '/', SUBSTR(a.dst, 12), a.dst) `post_name`,
        FROM_UNIXTIME(n.changed) `post_modified`,
        n.type `post_type`,
        IF(n.status = 1, 'publish', 'private') `post_status`
    FROM drupal.node n
    INNER JOIN drupal.node_revisions r
        USING(vid)
    LEFT OUTER JOIN drupal.url_alias a
        ON a.src = CONCAT('node/', n.nid)
    # Add more Drupal content types below if applicable.
    WHERE n.type IN ('post', 'page', 'blog')
;

# Fix post type; http://www.mikesmullin.com/development/migrate-convert-import-drupal-5-to-wordpress-27/#comment-17826
# Add more Drupal content types below if applicable.
UPDATE wordpress.wp_posts
    SET post_type = 'post'
    WHERE post_type IN ('blog')
;

# Set all pages to "pending".
# If you're keeping the same page structure from Drupal, comment out this query
# and the new page INSERT at the end of this script.
UPDATE wordpress.wp_posts SET post_status = 'pending' WHERE post_type = 'page';

# POST/TAG RELATIONSHIPS
INSERT INTO wordpress.wp_term_relationships (object_id, term_taxonomy_id)
    SELECT DISTINCT nid, tid FROM drupal.term_node
;

# Update tag counts.
UPDATE wp_term_taxonomy tt
    SET `count` = (
        SELECT COUNT(tr.object_id)
        FROM wp_term_relationships tr
        WHERE tr.term_taxonomy_id = tt.term_taxonomy_id
    )
;

# COMMENTS
# Keeps unapproved comments hidden.
# Incorporates change noted here: http://www.mikesmullin.com/development/migrate-convert-import-drupal-5-to-wordpress-27/#comment-32169
INSERT INTO wordpress.wp_comments
    (comment_post_ID, comment_date, comment_content, comment_parent, comment_author,
    comment_author_email, comment_author_url, comment_approved)
    SELECT DISTINCT
        nid, FROM_UNIXTIME(timestamp), comment, thread, name,
        mail, homepage, ((status + 1) % 2)
    FROM drupal.comments
;

# Update comments count on wp_posts table.
UPDATE wordpress.wp_posts
    SET `comment_count` = (
        SELECT COUNT(`comment_post_id`)
        FROM wordpress.wp_comments
        WHERE wordpress.wp_posts.`id` = wordpress.wp_comments.`comment_post_id`
    )
;

# Fix images in post content; uncomment if you're moving files from "files" to "wp-content/uploads".
# UPDATE wordpress.wp_posts SET post_content = REPLACE(post_content, '"/files/', '"/wp-content/uploads/');

# Fix taxonomy; http://www.mikesmullin.com/development/migrate-convert-import-drupal-5-to-wordpress-27/#comment-27140
UPDATE IGNORE wordpress.wp_term_relationships, wordpress.wp_term_taxonomy
    SET wordpress.wp_term_relationships.term_taxonomy_id = wordpress.wp_term_taxonomy.term_taxonomy_id
    WHERE wordpress.wp_term_relationships.term_taxonomy_id = wordpress.wp_term_taxonomy.term_id
;

# OPTIONAL ADDITIONS — REMOVE ALL BELOW IF NOT APPLICABLE TO YOUR CONFIGURATION

# CATEGORIES
# These are NEW categories, not in Drupal. Add as many sets as needed.
INSERT IGNORE INTO wordpress.wp_terms (name, slug)
    VALUES
    ('First Category', 'first-category'),
    ('Second Category', 'second-category'),
    ('Third Category', 'third-category')
;

# Set category names to title case (in case term already exists [as a tag] in lowercase).
UPDATE wordpress.wp_terms SET name = 'First Category' WHERE name = 'first category';
UPDATE wordpress.wp_terms SET name = 'Second Category' WHERE name = 'second category';
UPDATE wordpress.wp_terms SET name = 'Third Category' WHERE name = 'third category';

# Add categories to taxonomy.
INSERT INTO wordpress.wp_term_taxonomy (term_id, taxonomy)
    VALUES
    ((SELECT term_id FROM wp_terms WHERE slug = 'first-category'), 'category'),
    ((SELECT term_id FROM wp_terms WHERE slug = 'second-category'), 'category'),
    ((SELECT term_id FROM wp_terms WHERE slug = 'third-category'), 'category')
;

# Auto-assign posts to category.
# You'll need to work out your own logic to determine strings/terms to match.
# Repeat this block as needed for each category you're creating.
INSERT IGNORE INTO wordpress.wp_term_relationships (object_id, term_taxonomy_id)
    SELECT DISTINCT p.ID AS object_id,
        (SELECT tt.term_taxonomy_id
        FROM wordpress.wp_term_taxonomy tt
        INNER JOIN wordpress.wp_terms t USING (term_id)
        WHERE t.slug = 'enter-category-slug-here'
        AND tt.taxonomy = 'category') AS term_taxonomy_id
    FROM wordpress.wp_posts p
    WHERE p.post_content LIKE '%enter string to match here%'
    OR p.ID IN (
        SELECT tr.object_id
        FROM wordpress.wp_term_taxonomy tt
        INNER JOIN wordpress.wp_terms t USING (term_id)
        INNER JOIN wordpress.wp_term_relationships tr USING (term_taxonomy_id)
        WHERE t.slug IN ('enter','terms','to','match','here')
        AND tt.taxonomy = 'post_tag'
    )
;

# Update category counts.
UPDATE wp_term_taxonomy tt
    SET `count` = (
        SELECT COUNT(tr.object_id)
        FROM wp_term_relationships tr
        WHERE tr.term_taxonomy_id = tt.term_taxonomy_id
    )
;

# AUTHORS
INSERT IGNORE INTO wordpress.wp_users
    (ID, user_login, user_pass, user_nicename, user_email,
    user_registered, user_activation_key, user_status, display_name)
    SELECT DISTINCT
        u.uid, u.mail, NULL, u.name, u.mail,
        FROM_UNIXTIME(created), '', 0, u.name
    FROM drupal.users u
    INNER JOIN drupal.users_roles r
        USING (uid)
    WHERE (1
        # Uncomment and enter any email addresses you want to exclude below.
        # AND u.mail NOT IN ('test@example.com')
    )
;

# Assign author permissions.
# Sets all authors to "author" by default; next section can selectively promote individual authors
INSERT IGNORE INTO wordpress.wp_usermeta (user_id, meta_key, meta_value)
    SELECT DISTINCT
        u.uid, 'wp_capabilities', 'a:1:{s:6:"author";s:1:"1";}'
    FROM drupal.users u
    INNER JOIN drupal.users_roles r
        USING (uid)
    WHERE (1
        # Uncomment and enter any email addresses you want to exclude below.
        # AND u.mail NOT IN ('test@example.com')
    )
;
INSERT IGNORE INTO wordpress.wp_usermeta (user_id, meta_key, meta_value)
    SELECT DISTINCT
        u.uid, 'wp_user_level', '2'
    FROM drupal.users u
    INNER JOIN drupal.users_roles r
        USING (uid)
    WHERE (1
        # Uncomment and enter any email addresses you want to exclude below.
        # AND u.mail NOT IN ('test@example.com')
    )
;

# Change permissions for admins.
# Add any specific user IDs to IN list to make them administrators.
# User ID values are carried over from Drupal.
UPDATE wordpress.wp_usermeta
    SET meta_value = 'a:1:{s:13:"administrator";s:1:"1";}'
    WHERE user_id IN (1) AND meta_key = 'wp_capabilities'
;
UPDATE wordpress.wp_usermeta
    SET meta_value = '10'
    WHERE user_id IN (1) AND meta_key = 'wp_user_level'
;

# Reassign post authorship.
UPDATE wordpress.wp_posts
    SET post_author = NULL
    WHERE post_author NOT IN (SELECT DISTINCT ID FROM wordpress.wp_users)
;

# VIDEO – READ BELOW AND COMMENT OUT IF NOT APPLICABLE TO YOUR SITE
# If your Drupal site uses the content_field_video table to store links to YouTube videos,
# this query will insert the video URLs at the end of all relevant posts.
# WordPress will automatically convert the video URLs to YouTube embed code.
UPDATE IGNORE wordpress.wp_posts p, drupal.content_field_video v
    SET p.post_content = CONCAT_WS('\n',post_content,v.field_video_embed)
    WHERE p.ID = v.nid
;

# IMAGES – READ BELOW AND COMMENT OUT IF NOT APPLICABLE TO YOUR SITE
# If your Drupal site uses the content_field_image table to store images associated with posts,
# but not actually referenced in the content of the posts themselves, this query
# will insert the images at the top of the post.
# HTML/CSS NOTE: The code applies a "drupal_image" class to the image and places it inside a <div>
# with the "drupal_image_wrapper" class. Add CSS to your WordPress theme as appropriate to
# handle styling of these elements. The <img> tag as written assumes you'll be copying the
# Drupal "files" directory into the root level of WordPress, NOT placing it inside the
# "wp-content/uploads" directory. It also relies on a properly formatted <base href="" /> tag.
# Make changes as necessary before running this script!
UPDATE IGNORE wordpress.wp_posts p, drupal.content_field_image i, drupal.files f
    SET p.post_content =
        CONCAT(
            CONCAT(
                '<div class="drupal_image_wrapper"><img src="files/',
                f.filename,
                '" class="drupal_image" /></div>'
            ),
            p.post_content
        )
    WHERE p.ID = i.nid
    AND i.field_image_fid = f.fid
    AND (
        f.filename LIKE '%.jpg'
        OR f.filename LIKE '%.jpeg'
        OR f.filename LIKE '%.png'
        OR f.filename LIKE '%.gif'
    )
;

# Fix post_name to remove paths.
# If applicable; Drupal allows paths (i.e. slashes) in the dst field, but this breaks
# WordPress URLs. If you have mod_rewrite turned on, stripping out the portion before
# the final slash will allow old site links to work properly, even if the path before
# the slash is different!
UPDATE wordpress.wp_posts
    SET post_name =
    REVERSE(SUBSTRING(REVERSE(post_name),1,LOCATE('/',REVERSE(post_name))-1))
;

# Miscellaneous clean-up.
# There may be some extraneous blank spaces in your Drupal posts; use these queries
# or other similar ones to strip out the undesirable tags.
UPDATE wordpress.wp_posts
    SET post_content = REPLACE(post_content,'<p> </p>','')
;
UPDATE wordpress.wp_posts
    SET post_content = REPLACE(post_content,'<p class="italic"> </p>','')
;

# NEW PAGES – READ BELOW AND COMMENT OUT IF NOT APPLICABLE TO YOUR SITE
# MUST COME LAST IN THE SCRIPT AFTER ALL OTHER QUERIES!
# If your site will contain new pages, you can set up the basic structure for them here.
# Once the import is complete, go into the WordPress admin and copy content from the Drupal
# pages (which are set to "pending" in a query above) into the appropriate new pages.
INSERT INTO wordpress.wp_posts
    (`post_author`, `post_date`, `post_date_gmt`, `post_content`, `post_title`,
    `post_excerpt`, `post_status`, `comment_status`, `ping_status`, `post_password`,
    `post_name`, `to_ping`, `pinged`, `post_modified`, `post_modified_gmt`,
    `post_content_filtered`, `post_parent`, `guid`, `menu_order`, `post_type`,
    `post_mime_type`, `comment_count`)
    VALUES
    (1, NOW(), NOW(), 'Page content goes here, or leave this value empty.', 'Page Title',
    '', 'publish', 'closed', 'closed', '',
    'slug-goes-here', '', '', NOW(), NOW(),
    '', 0, 'http://full.url.to.page.goes.here', 1, 'page', '', 0)
;

A few notes and suggestions:

  • Use at your own risk. I’m just sharing the code I used. In fact, it’s not even exactly the code I used, since I had to generalize a few things for mass consumption. I haven’t even tested this modified version. I make no guarantees as to its reliability or applicability to your particular implementation.
  • Download the script below; don’t just copy-paste the code from the block above. I had to apply some HTML formatting to make it look right, and this might result in bad characters that would break the script if run as-is above.
  • Always work with a test copy of your database before altering live site content. In theory this script can be run over and over again cleanly, but it will erase whatever data was already in the WordPress database, and there will be no way to get it back unless you have a backup.
  • Read through the code carefully, especially all comments before running the script. Most likely you will need to make some changes to the code.
  • Good luck! If you find bugs or develop further improvements, please comment below!

Download the Script

Download zip file... Drupal-to-WordPress Migration Script
drupal-to-wordpress.sql.zip • 4.8 KB