Download those PDFs!

Wow, I really like these 512x512 icons in Snow Leopard...This post is strictly for web developers/server administrators. The rest of you can resume your daily activities, confident that nothing that was even remotely relevant to you transpired here.

PDFs. Web browsers. Both are a daily, or at least frequent, part of the lives of most computer users. But not all web browsers are created equal when it comes to the matter of handling PDFs. Some browsers (say, the ones developed by commercial OS makers) take the approach of trying to manage everything for the user. They include PDF readers that are embedded right into the browser, and PDFs load almost like another web page. Other browsers (most notably Firefox) treat PDFs as downloadable files, and when the user clicks a link to one, the file gets downloaded to their hard drive to be opened in a “helper app” — usually Adobe Reader.

Most website owners prefer the latter approach, and I suspect most users do as well. PDFs in general are not much like web pages, and it does not seem logical that they should be viewed within a web browser. Generally when people are accessing a PDF, it’s because they want to download the document to their computer to be used offline or to print. It is illogical for these actions to take place in a web browser. Sure, savvy users can right-click (or on Mac, Ctrl-click) and select “Save linked file as…” or some such nonsense from the contextual menu. But a lot of Windows users don’t even know their mouse has a right button, a lot of Mac users have no idea that you can press keys and click the mouse button simultaneously to perform special actions, and a lot of both would be confused by this entire process.

So we come to the matter of web developers (such as myself) trying to find ways to force the web browser to download the file instead of loading it directly. A trick I have used often is to link not to the file directly, but to a special PHP script that reads the file into the server’s memory, changes various aspects of the response (such as the MIME type), and then streams the content out to the browser. This is especially useful when you want to restrict access to files, say only to logged-in users, or only to users who have entered a special passcode. The PHP script is perfect for that, because it allows you to execute some code before sending the file to the browser. It even lets you store your files in a directory on the server that web browsers cannot access directly, ensuring (more or less… this article isn’t about hacking) the security of your files.

But what if your files aren’t in a protected area? What if you don’t want to bother with the mucky-muck of the PHP wrapper — you just want to link directly to the (browser-accessible) file, but you still want to force the download? Well, if you’re using Apache, you’re in luck. I found this great explanation of a small block of code you need to add to your httpd.conf file to achieve the same effect.

Ultimately, what you want to do is change the MIME type of the response. Browsers that are inclined to load PDFs internally perform this magic by seeking out files that are sent with the application/pdf MIME type. But there’s a very handy, “generic” MIME type for binary files, which all browsers treat as files to be downloaded rather than displayed directly: application/octet-stream. It may sound like a group of 8 singers standing in a small river, but it really just means… a generic binary file.

Here’s the complete block of code to put into your httpd.conf file, or into the appropriate virtual host configuration, however it’s stored on your particular server. I placed the code just within the virtual host configuration for the client in question, so the change applies only to their site, and not to any others that are also running on the server:

<FilesMatch "\.(?i:pdf)$">
ForceType application/octet-stream
Header set Content-Disposition attachment
</FilesMatch>

If you’re not the server admin, you should also be able to put this in a .htaccess file in your site’s root directory, but I haven’t actually tested that to see if it works.