XML Import using WP-CLI

The built-in XML import and export is often overlooked, and for good reason. It’s not meant to migrate an entire website. It won’t migrate anything from the customizer, menus or widgets. What it does handle is migrating page and post content. It works well in situations where you want to pull an existing blog into a brand new WordPress site.

Even in ideal situations, importing feels a bit glitchy.

In order to successfully import content with images, the original website needs to be healthy and accessible. Then there is dealing with errors caused by author accounts not being in sync between the two websites. Lastly, large imports tend to fail due to PHP limits on the web server. That’s pretty much any blog with images.

The solution is to use XML import over WP-CLI.

Using XML import over WP-CLI is significantly better. With the command line you have direct access to the server. That means large imports run much faster and are far more likely to complete. Since imports can easily take hours to run, I highly recommend starting the import inside a screen -R session so it keeps running even if your SSH connection drops.

wp import wordpress.2020-04-14.000.xml --authors=usermap.csv

If you can use the argument --authors=create, then do. However, that option can fail. The workaround is to define a user mapping CSV file, which helps keep the import from failing. Here is a sample of what that CSV file looks like.

old_user_login,new_user_login
randomuser1,anchorhost
randomuser2,anchorhost
oldsiteadmin,anchorhost
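
Before kicking off a long import, it can be worth checking that the mapping file is shaped the way the sample above shows. The following is a minimal sketch, assuming the header is exactly old_user_login,new_user_login and that logins contain no commas. check_author_map is a hypothetical helper name, not part of WP-CLI.

```shell
#!/usr/bin/env bash
# Sketch: sanity-check a user mapping CSV before running the import.
# check_author_map is a hypothetical helper, not a WP-CLI command.
check_author_map() {
  csv_file="$1"   # path to the mapping file, e.g. usermap.csv
  awk -F',' '
    # The first line must be the header used in the sample above.
    NR == 1 && $0 != "old_user_login,new_user_login" { print "line 1: unexpected header"; bad = 1 }
    # Ignore blank lines, such as a trailing newline from a spreadsheet export.
    $0 == "" { next }
    # Every remaining line needs exactly two non-empty columns.
    NF != 2 || $1 == "" || $2 == "" { print "line " NR ": expected two non-empty columns"; bad = 1 }
    END { exit (bad ? 1 : 0) }
  ' "$csv_file"
}
```

Running check_author_map usermap.csv prints nothing and returns 0 when the file looks fine. Otherwise it lists the offending lines and returns a non-zero status.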

For a website with many users, creating the authors CSV can be tricky.

However, a few smart CLI commands can help significantly. You can extract a unique list of authors from the XML file, then manually create a CSV.

# Scan XML file for authors
authors=$( grep -Po "(?<=<dc:creator>).*(?=</dc:creator>)" wordpress.2020-04-14.000.xml )

# Print out unique list of authors (sorted alphabetically)
printf '%s\n' "$authors" | sort -u

Take that unique list and put it into a two-column Google spreadsheet with the columns old_user_login and new_user_login. Make any manual mapping changes, then export as a CSV file.
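
If most authors end up mapped to the same account, the spreadsheet step can be skipped by generating a starter CSV directly. This is a rough sketch, assuming each dc:creator element sits on its own line in the export and that some exports wrap the login in a CDATA section. build_author_map is a hypothetical helper name, and the default login passed as the second argument is whatever account exists on the new site.

```shell
#!/usr/bin/env bash
# Sketch: build a starter user mapping CSV from a WordPress XML export.
# build_author_map is a hypothetical helper, not a WP-CLI command.
build_author_map() {
  local xml_file="$1"    # path to the XML export
  local new_login="$2"   # login on the new site that every old author maps to by default

  echo "old_user_login,new_user_login"
  # Pull out each <dc:creator> value, strip an optional CDATA wrapper,
  # then de-duplicate and emit one CSV row per old author.
  sed -n 's|.*<dc:creator>\(.*\)</dc:creator>.*|\1|p' "$xml_file" \
    | sed -e 's/^<!\[CDATA\[//' -e 's/]]>$//' \
    | sort -u \
    | while IFS= read -r old_login; do
        printf '%s,%s\n' "$old_login" "$new_login"
      done
}
```

For example, build_author_map wordpress.2020-04-14.000.xml anchorhost > usermap.csv writes a file that maps every author to anchorhost. Edit any rows that should point to a different account before running the import.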