Migrating Content to WordPress

Tips for migrating content from a static site.

Motivation

A migration is always a ton of work. I’ve done a lot of them, and I never go into them lightly. To make it easier, I find ways to automate as much as I can at the start of the process, and take advantage of any automation I can do at the end of the process. That way, I minimize the manual labor that has to go on in the middle.

As to why I converted my site from DreamWeaver to WordPress, let’s just say that my site, created in the late ’90s, had become ancient. And since so many people were using mobile devices, search engines were penalizing sites like mine that only worked well on a large screen.

So WordPress gave me the chance to use a “responsive” theme that works well on mobile devices, which would help to restore my pages to their former prominence. It also gave me a site-search capability that had been missing, true menus, and themes with terrific graphic layouts that I could spend an entire career trying to create myself.

Perhaps even more importantly, it provides for a fair degree of interaction. People can comment on articles, and they can share articles they like using Twitter and Facebook, which makes it possible for the really important pages to “go viral” as people share them with others.

Learn More: Converting to WordPress: Benefits and Drawbacks

Prep

As always, it is helpful to do some prep work before you start adding posts, to create pages that more closely approximate what you want for the conversion.

In my case, I was coming from a DreamWeaver site, so I made a temporary copy of the site and modified the templates to remove the page adornments that weren’t needed in the new WordPress site (horizontal bars, for example, and footer text).

I also took the opportunity to reorganize things a bit, so I needed to create new folders in the temporary DreamWeaver site, and move things into them. That way, DreamWeaver’s CMS capabilities could be used to automatically adjust the links. (DreamWeaver is much more of a CMS than WordPress, frankly. Were it not for the many other advantages that WordPress provides, I might never have converted!)

Another advantage to having a temporary (TMP) copy of the site is that I could delete articles as they were migrated, so I always knew how much was left to do.

Migration

Keep Notes

At the end of the process, you can use a search-and-replace plugin like Search RegEx for the final cleanup steps. But the many things that become apparent as you work with the files need to be noted.

Small individual problems that apply to the current article should of course be fixed. After all, it’s just as easy to fix it as it is to take a note.

But you will also note generic problems that apply to articles that have yet to be converted — problems that are best fixed with an automated search and replace, after the articles have been moved. Each such problem needs to become an item in the “Cleanup list”, so it isn’t forgotten.

Articles

If you’re migrating from a static site, you’ll then create posts one-by-one for each HTML page in the old site. Copy the title into the title field, and copy the remainder of the HTML into the content field. Then you’ll manually add a <!--more--> tag, to tell WordPress where the summary information ends. (Unfortunately, you can’t use automation to put that tag into your source files. It doesn’t survive when you paste in the content.)
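The copying described above was done by hand, but the split between title and content can be illustrated with a short script. This is only a minimal sketch, assuming each old page has an ordinary <title> element and a <body> element. The function name split_page is hypothetical, and a regular expression is a shortcut that will not cope with every page.

```python
import re

def split_page(html: str) -> tuple[str, str]:
    """Return (title, body_html) for one static HTML page.

    Assumes a conventional <title>...</title> and <body>...</body>.
    If no <body> is found, the whole input is treated as content.
    """
    flags = re.IGNORECASE | re.DOTALL
    title_match = re.search(r"<title[^>]*>(.*?)</title>", html, flags)
    body_match = re.search(r"<body[^>]*>(.*?)</body>", html, flags)
    title = title_match.group(1).strip() if title_match else ""
    body = body_match.group(1).strip() if body_match else html.strip()
    return title, body
```

The title would go into the post's title field and the body into the content field. The more tag still has to be added in the editor.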

Of course, all of the links would still have to change, since an old DreamWeaver link that went to /Wheat.html became /whats-wrong-with-wheat in the new site. So it made sense to change as many of those as I could, as the articles were migrated. (But clearly, a good LinkCheck step is needed afterward.)
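If you keep a list of old paths and their new slugs as you go, the link changes can be sketched as a lookup. The example below is an illustration rather than the procedure used here. LINK_MAP and remap_links are hypothetical names, the single entry comes from the example above, and the code assumes links are written as href="..." with double quotes.

```python
import re

# Hypothetical mapping from old DreamWeaver paths to new WordPress slugs.
LINK_MAP = {
    "/Wheat.html": "/whats-wrong-with-wheat",
}

HREF = re.compile(r'href="([^"]*)"')

def remap_links(html: str, link_map: dict[str, str]) -> str:
    """Replace each href found in link_map; leave all other hrefs alone."""
    def swap(match: re.Match) -> str:
        old = match.group(1)
        return f'href="{link_map.get(old, old)}"'
    return HREF.sub(swap, html)
```

Links that are not in the map pass through unchanged, so they are left for the link check to catch.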

Files

Image files that were liable to be referenced from anywhere (or from top-level pages, at least) went into the standard media library. But with several hundred text files, spreadsheets, image files, and image-construction source files that were only referenced in subcategories, upload to the Media Library would have been an onerous task.

In the end, I decided that the best way to proceed was to create a “files” subdomain in the site, and replicate the category hierarchy. So an image that was previously in {site}/health/images moved to files.{site}/health. Then, after uploading the files via FTP, and migrating the articles, the myriad of broken links could be fixed using fairly straightforward regular-expression replacements.
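The kind of regular-expression replacement involved might look like the sketch below. It assumes the old links are absolute URLs, uses example.com as a placeholder for {site}, and keeps whatever scheme (http or https) the old link had. The names SITE and move_to_files_subdomain are made up for the illustration.

```python
import re

SITE = "example.com"  # placeholder for {site}

# Matches {site}/<category path>/images/ and captures the scheme and path.
OLD_IMAGE_URL = re.compile(
    r"(https?)://(?:www\.)?" + re.escape(SITE) + r"/([\w/-]+?)/images/"
)

def move_to_files_subdomain(html: str) -> str:
    """Rewrite {site}/<category>/images/<file> to files.{site}/<category>/<file>."""
    return OLD_IMAGE_URL.sub(rf"\1://files.{SITE}/\2/", html)
```

The same pattern and replacement could be adapted for a search-and-replace plugin, though the exact syntax the plugin expects may differ.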

Cleanup

Convert to “Semantic” Tags

HTML <i> tags work fine in the editor, but they don’t appear (at least, not in the theme I’m using). So one update is to replace <i> and </i> with their “semantic” equivalents, <em> and </em>. Similarly, <tt> works in the editor, but only <code> works when the page is displayed. (Surprisingly, <b> displays just fine, so you’re not forced to use <strong>.)
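As an illustration of that replacement, here is a minimal sketch in Python. TAG_MAP and to_semantic are hypothetical names. The pattern is written so that tags which merely start with the same letter, such as <img>, are not touched.

```python
import re

# Old presentational tag -> replacement tag.
TAG_MAP = {"i": "em", "tt": "code"}

# Opening or closing <i>/<tt>, with optional attributes.
TAG = re.compile(r"<(/?)(i|tt)(\s[^>]*)?>", re.IGNORECASE)

def to_semantic(html: str) -> str:
    """Swap <i> for <em> and <tt> for <code>, keeping any attributes."""
    def swap(match: re.Match) -> str:
        slash = match.group(1)
        name = match.group(2).lower()
        attrs = match.group(3) or ""
        return f"<{slash}{TAG_MAP[name]}{attrs}>"
    return TAG.sub(swap, html)
```

Since <b> is not in the map, bold text is left as it is.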

Note:
I am in vehement disagreement with HTML’s designers on this subject. There is such a thing as a true “semantic” tag, where the tag has a meaning. Such tags are very useful. But “em” is no more or less meaningful than “i”, “code” is no more or less meaningful than “tt”. They’re just longer!

Adjust External Links

Where there are links that take a visitor away from your site (external links), they need to open in a new tab, so the visitor keeps their place in your site. You do that by adding target="_blank" to the link. (See the Search RegEx post for tips on how to do that.)
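For readers who prefer to see the logic spelled out, here is one way it could be sketched outside WordPress. This is not the Search RegEx recipe. The function name open_external_in_new_tab and the site_host parameter are assumptions made for the example, and links on subdomains of your own site are treated as internal.

```python
import re
from urllib.parse import urlparse

OPEN_ANCHOR = re.compile(r"<a\s[^>]*>", re.IGNORECASE)
HREF = re.compile(r'href="([^"]*)"', re.IGNORECASE)

def open_external_in_new_tab(html: str, site_host: str) -> str:
    """Add target="_blank" to links that point away from site_host."""
    def fix(match: re.Match) -> str:
        tag = match.group(0)
        href = HREF.search(tag)
        if href is None or "target=" in tag.lower():
            return tag  # no link, or a target is already set
        host = urlparse(href.group(1)).hostname
        if host is None or host == site_host or host.endswith("." + site_host):
            return tag  # relative link, or a link within the site
        return tag[:-1] + ' target="_blank">'
    return OPEN_ANCHOR.sub(fix, html)
```

Relative links have no host name, so they are always left alone.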

Check Links

When you’re done with the manual conversion effort, you’ll have fixed many of the links — especially links to articles, which could be inserted as you were editing. However, some of those will be missed. And most of the links to images and various kinds of files will probably have been missed.

So now is the time to run a good Link Check program, find everything that’s broken, and fix it.
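A dedicated link checker is the right tool for this step, but the core idea is simple enough to sketch: collect every href and src in a page, then test each one. In the example below, LinkCollector and find_broken_links are hypothetical names, and exists is a callback you would supply yourself (for instance, one that requests the URL and checks the response). The sketch does no network access of its own.

```python
from html.parser import HTMLParser

class LinkCollector(HTMLParser):
    """Gathers the href and src values found in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                self.links.append(value)

def find_broken_links(html, exists):
    """Return the links in html for which exists(link) is false."""
    collector = LinkCollector()
    collector.feed(html)
    collector.close()
    return [link for link in collector.links if not exists(link)]
```

For a quick offline pass, exists can be as simple as a membership test against the set of URLs you know are valid.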

Go Live

Fix Internal Links

WordPress stores absolute links to posts and static pages, instead of relative links. (MediaWiki is much better, in that regard. There is a lesson to be learned, there!) The WordPress policy has several unfortunate consequences:

  1. When you move a site, all of the site-links in all of the posts and static pages need to change.
  2. So if you create a temporary site to migrate content to, you have to change those links when the site goes live, and takes on its final URL.
  3. When you visit a post listed on the “Featured Articles” page, for example, the URL becomes https://{tmpURL}/~{username}/wp-admin/{category}/{yourPost}, instead of simply https://{yourSite}/{yourPost}, as you would like.
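The link change in items 1 and 2 amounts to swapping one base URL for another. Here is a minimal sketch that works on post content as plain text. It is not a database migration tool, the function name retarget_site_links is hypothetical, and the URLs in the usage note are placeholders.

```python
import re

def retarget_site_links(content: str, tmp_base: str, live_base: str) -> str:
    """Replace the temporary site's base URL with the live one.

    The lookahead keeps a base such as .../~user from matching inside
    a longer name such as .../~user2.
    """
    tmp = tmp_base.rstrip("/")
    live = live_base.rstrip("/")
    pattern = re.compile(re.escape(tmp) + r"(?=[/\"'?#\s<]|$)")
    # A function is used as the replacement so that live is inserted literally.
    return pattern.sub(lambda match: live, content)
```

Called with "https://tmp.example.net/~user/" as the temporary base and "https://example.com" as the live one, it rewrites every link under the temporary base and leaves everything else alone.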

Copyright © 2017, TreeLight PenWorks
