Issues converting from Drupal to Hugo

I have had to relocate this blog a couple of times over the past few months, and every time I try to make it better. Sure, I have not actually posted anything since 2009 but there have been a few appreciative comments telling me at least some of the content is still useful, and a few things I have been struggling with have inspired me to reboot the blog.

Drupal 8 was released a few weeks ago, which means Drupal 6 (which drove my blog) is officially no longer supported. Having basically lost all interest in Drupal, migrating servers just to keep a database-driven blog running feels like an incredible waste of time. I recently discovered Hugo, a static site generator with a number of nice themes and an impressive list of features, so I figured it was time to ditch Drupal and return to publishing content as a static site. This post details the issues I had in the process of converting my site from Drupal to Hugo.

Exporting content from Drupal 6

Hugo is written in Go, a language I have mostly ignored for the past 5 years but which is becoming more interesting given the number of high-profile DevOps tools written in it. Think Docker and CoreOS’ suite of supporting tools like etcd and fleet, or the GitHub/GitLab clone Gogs.

One of the nice things about Go is it’s quite easy to learn and code is generally well structured. So luckily there is a tool written in Go to help with exporting content out of a Drupal database: drupal2hugo. Unfortunately it is designed for Drupal 7 sites with fields and entities which weren’t in core Drupal 6, but all I really want out of it is the core content and its metadata. Someone else had already tried it with a D6 site and hit a wall, so I wrote a patch to solve the problem.

drupal2hugo dumped the node content into a bunch of Markdown files with mostly correct metadata. I just had to add URL aliases, tags and categories by hand (my patch may not be perfect ;-))
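Adding the aliases by hand gets tedious with more than a few posts. The following is only a rough sketch of how it could be scripted, assuming the exported files use YAML front matter delimited by `---` lines (I have not checked that this matches every drupal2hugo output) and do not already contain an `aliases` key. `aliases` is the front matter key Hugo uses for old URL paths; the `add_aliases` helper and the sample title are hypothetical.

```python
def add_aliases(text, aliases):
    """Insert an `aliases` list just before the closing '---' of YAML front matter.

    Assumes the file starts with a '---' line and has no existing `aliases` key.
    """
    lines = text.split("\n")
    if not lines or lines[0].strip() != "---":
        raise ValueError("no YAML front matter found")
    # Locate the line that closes the front matter block.
    end = next((i for i in range(1, len(lines)) if lines[i].strip() == "---"), None)
    if end is None:
        raise ValueError("front matter is not closed")
    block = ["aliases:"] + ['  - "{}"'.format(a) for a in aliases]
    return "\n".join(lines[:end] + block + lines[end:])


if __name__ == "__main__":
    sample = '---\ntitle: "Example"\n---\nBody text\n'
    print(add_aliases(sample, ["/node/16", "/content/replicating-zfs-root-disks"]))
```

Tags and categories could be handled the same way, but those need the term data from the Drupal database, which this sketch does not touch.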

Fixing markup

Of course the content came out as full HTML, and some of it needed a bit of tweaking to remove old classes and unnecessary tables of contents. I left some files as HTML and cleaned others up into proper Markdown.

None of the links to images and downloads worked, so they all had to be fixed up. Drupal stores its files in /sites/default/files/ and/or /sites/_domain_/files/ by default, so a fairly simple sed command was enough to fix all the links in place.

$ sed -i '' -E -e 's,/sites/(default|waddles\.org)/files/,/,g' *.md
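For anyone who would rather test the substitution before letting it loose on every file, here is the same idea as a small Python sketch. The pattern mirrors the sed expression above; the `fix_links` name and the sample file names are made up for illustration.

```python
import re

# Matches the Drupal files prefix for either the default or the per-domain site directory.
FILES_PREFIX = re.compile(r"/sites/(default|waddles\.org)/files/")


def fix_links(text):
    """Rewrite every Drupal files path in text so it points at the site root."""
    return FILES_PREFIX.sub("/", text)


if __name__ == "__main__":
    print(fix_links('<img src="/sites/default/files/example.png">'))
    print(fix_links("[download](/sites/waddles.org/files/example.tar.gz)"))
```

Paths under any other site directory are left alone, which makes it easy to spot anything that still needs manual attention.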

Then of course move the site’s artifacts to Hugo’s static/ directory.

Cloudflare Flexible SSL

A few months ago I started using Cloudflare as my CDN mainly for their offline caching of the site in case my server had to be rebooted for some reason. Cloudflare offers a flexible SSL option that secures content from the user to the CDN but uses plaintext from the CDN to the actual site.

Disqus not working at all

I found that enabling SSL caused the Disqus section to fail to load but eventually settled on the following configuration which got the Disqus section working, albeit without any historical comments showing:

  1. Edit Hugo’s config.toml and set

    baseurl = "https://waddles.org/"
    
  2. Set Flexible SSL in Cloudflare’s Security settings

This gave me the problem that icons were displaying as square boxes.

Font Awesome Icons displaying as boxes

The Hugo theme I am using is Icarus which includes Font Awesome Icons, a clever CSS trick to display scalable icons using fonts. Unfortunately they only displayed correctly when SSL was off.

The themer chose to install Font Awesome locally instead of using a CDN, but since I’m using Cloudflare anyway, I thought it made more sense to use Cloudflare’s CDN version of the library, which was an upgrade anyway. A simple change to the templates was all it took to get icons working again.

diff -u themes/hugo-icarus-theme/layouts/partials/head.html layouts/partials/head.html
--- themes/hugo-icarus-theme/layouts/partials/head.html 2015-12-17 17:20:12.000000000 +1000
+++ layouts/partials/head.html  2015-12-19 17:56:58.000000000 +1000
@@ -14,15 +14,14 @@
     <link rel="icon" href="{{ .Site.BaseURL }}favicon.ico">
     <link rel="apple-touch-icon" href="{{ .Site.BaseURL }}apple-touch-icon.png" />
     <link rel="stylesheet" href="{{ .Site.BaseURL }}css/style.css">
-    <link rel="stylesheet" href="{{ .Site.BaseURL }}css/font-awesome.min.css">
+    <link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/4.5.0/css/font-awesome.min.css">
     <link rel="stylesheet" href="{{ .Site.BaseURL }}css/monokai.css">
     <link rel="stylesheet" href="{{ .Site.BaseURL }}fancybox/jquery.fancybox.css">
     {{ template "_internal/opengraph.html" . }}

Disqus comments not appearing

Drupal’s Disqus module was clever enough to keep track of comments by using the /node/nid URL path regardless of how the node was referenced, i.e. users may click a link to //waddles.org/content/replicating-zfs-root-disks and Drupal would render the content of /node/16 along with the Disqus comments associated with /node/16.

In Hugo, although I added URL aliases for the Drupal paths, Disqus was using the content’s new path (e.g. https://waddles.org/2009/11/17/replicating-zfs-root-disks). Fortunately Disqus lets you remap your URL paths by uploading a CSV file of the old and new paths using their URL remapper. This is also how to “merge” discussions, since Disqus provides no way to delete them and Hugo’s inbuilt server for local development creates a new discussion thread for every draft page it renders.
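The CSV for the remapper is just one "old URL, new URL" pair per line, so it is easy to generate. The sketch below is a minimal example of that, not the exact script I used: `build_url_map` is a hypothetical helper, and the old base URL (plain http) is an assumption about how the threads were originally registered with Disqus.

```python
import csv
import io


def build_url_map(old_base, new_base, mapping):
    """Return CSV text with one 'old URL,new URL' row per mapping entry."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    for old_path, new_path in mapping.items():
        writer.writerow([old_base + old_path, new_base + new_path])
    return buf.getvalue()


if __name__ == "__main__":
    # Old Drupal node path -> new Hugo path, using the example from this post.
    mapping = {"/node/16": "/2009/11/17/replicating-zfs-root-disks"}
    # The http:// scheme for the old site is an assumption.
    print(build_url_map("http://waddles.org", "https://waddles.org", mapping), end="")
```

The same file format covers the draft threads created by the local server: map each unwanted URL onto the real one and the discussions get merged.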
