Migrating Commento comments to native Ghost comments

How to migrate comments from Commento (or other comments platform) to native Ghost member comments.

Migrating Commento comments to native Ghost comments

Now that Ghost has its own native membership and comments capabilities and Commento Open Source project is dead, it is time to migrate.

This tip is part of my more complete outline of Trio multi-volume Ghost Theme I created to consolidate each of the "Just Enough Architecture" Ghost instances (just, industry, tips & soapbox) into a single multi-volumes site. The final part of the process was to migrate the comments out of Commento into Ghost.

Status: May 2023 - Migration completed and moving this tip into its own blog, as need to move comments into Ghost is more generic than just Commento.


The Generic How

Some basic things about Commento and Ghost:

  • Commento has a comments export function, but rather than providing a download dump it generates a report that is sent via email. This did not work for me.
  • Commento uses a PostgreSQL relational database
  • Ghost has a member import function but not a comments import function. The member import format is via csv (comma seperated values) and the format for this is documented and downloadable as a template file.
  • Ghost uses a MySQL relational database
  • Commento and Ghost use highly normalised database schems which rely on objectids as keys to map the relationships between commentors/members, posts, comments and emails.
  • Commento uses unique "hex" identifier for objectids
  • Ghost uses MongoDB style 24 character hex ids, for which there are a number of generators for JavaScript, PostgreSQL and MySQL.

Here is outline of the process:

  1. The first thing required to migrate comments from one commenting platform to another is to create a denormalised migration table/s which traces the relationships and adds the referenced values into the migration table. This needs to be done for both the commenters/members and the comments.
  2. Generate the member migration csv using PostgreSQL command interface \copy command. Once commenter table is dumped it trivial to load into Ghost
  3. To extract the comments from Commento you should create a new migration table, which adds the extra columns needed to hold the denormalised value data. Using SQL you can do SQL update with joins to add in data values.
  4. Dump out the comments migration table and impot into Ghost's MySQL database.
  5. Now you have all the data collected in a single migration table, in MySQL you need to add Ghost style ObjectIds to each comment.
  6. Using email ids and slugs find the Ghost ObjectId for the posts and members and also use the prior Commento hex ids to trace and reapply the comment parent relationship using the Ghost style ObjectIds.
  7. You should now have all the data required to load the comments into the Ghost comments table. As the comments table has automatically managed foreign key constraints, disable these prior to loading.
  8. Use SQL inserts to add data into Ghost.
  9. Re-enable the SQL constraints

Once complete check and verify all is ok. The SQL required to complete these steps is provided in the "Trio" theme util directory, with seperate PostgreSQL and MySQL  SQL.

As this SQL has only been tested for my migration I would recommend to run each step seperately and verify using selects or with UI base SQL tool before proceeding to next step.

More details follows with "Specific Migration".

Specific Migration Details

Commento has an "export" button, but rather than giving dump via browser, this starts a job and is then supposed to send the export result via email. I am still waiting for the postman to arrive ... ;-) .

The Commento table structure has indirect association from "comments" to "commentor" via the "commentorhex" (rather then email address). So to find the email of address from comments requires lookup of db to get the email address to be able to bring these into Ghost. Lets see how this goes.

Before doing a dump of Commento I first did some direct database updates to move the historical commentary into the new site, that took new Trio multi-volume structure into account.

This required:

  • Update "pages" table so "domain" == main volume domain (just.graphica.com.au - for me) and "path"== prefixed with the sub-volume Ghost route prefix (for example 'post' -> '/tips/post')
  • Update the "comments" table as per pages ... "domain" == main volume domain, "path" == prefixed with the sub-volume Ghost route prefix

So for now the "Trio" multi-volume theme is:

  • rendering the existing Commento commetry and
  • has well as Ghost native member commentry

To load the members id from Commento is simple as Ghost has a CVS loader for this. So you can use psql /copy command to extract data:

# psql -d commento -U XXXXXX -W
Password for user XXXXXX: 
psql (10.10)
Type "help" for help.

all_commento=> \copy (select * from commenters) to '/tmp/commenters.csv' with csv header
COPY XXX

Having downloaded as CVS it is trivial to create Ghost member load CSV using the email and create date from Commento.

Looking at the Ghost native commenting table  ("comments") this uses uids for the comment "id" and foreign keys which reference: post, member emails (id) and parent comment. Commento has a similar structure using hexid to each comment and having hierachy of comments through hexid foreign keys and using hexid for user, but the posts are referenced by "domain", "post id" (== Ghost Slug).

So to "stuff" data into the Ghost native comments table you need to have:

  • Map from "slug" to post uid (or do query to DB)
  • Map from commenterhex to commento email to member uid (or do query to DB)
  • UID generator to create your own UID that is consistent and non-conflicting with Ghost ids (Ghost appears to use "bson-objectid" node library for id generation)

Doing some testing, Commento and Ghost Comments work differently, in how they managed a comment deletion. When a Commento comment is deleted then the commenter and the comment are replaced by "deleted" and "anonymous" and the logical delete flag is set. When a Ghost comment is deleted the result is to mark its status as deleted, while the record of the original commenter is maintained. To accommodate the Ghost behavior I added an "anonymous" placeholder account into the Ghost members database. All deleted comments where then set to point to this member.

Having imported the members, I decided the simplest way to migrate the comments was to, use SQL to bang the data into the Ghost DB. As the existing comments where in PostgreSQL and Ghost uses MySQL this done in two stages.

First the PostgreSQL stage:

--
-- From: pgsql (Commento)
--   copy comments to denormalised migrate table
--
select * into comments_migrate from comments;
--
-- add extra columns for migration data
--
alter table comments_migrate
  add column email text, 
  add column ghost_postid text, 
  add column ghost_commentid text,
  add column ghost_parentid text ;
--
--  get the email address for the comments that have commenterhex
--    note: this might only work with postgresql
--          which is what Commento uses...
--
update comments_migrate
  set email = commenters.email
  from commenters
    where commenters.commenterhex = comments_migrate.commenterhex ;
--
-- dump the table and load into ghost database
--  this can be done by
--   1. get table definition and edit these
--      to remove timezone qualifcation from timestamp and public
--      qualifier from name
--   2. dump table as sql and edit to remove the 'public.' qualifier
--      prior to loading
--

Now you should have SQL "insert" code from PostgreSQL. This has a bit of SQL preamble, which is not needed to import data into MySQL and PostgreSQL prefixes its tables with "public.", which must be removed.

Create the table using generated SQL "create" statement and then load data with generated "insert" statements.

Now you complete the migration process in MySQL:

--
-- dump the table and load into ghost database
--  this can be done by
--   1. get table definition and edit these
--      to remove timezone qualifcation from timestamp and public
--      qualifier from name
--   2. dump table as sql and edit to remove the 'public.' qualifier
--      prior to loading
--
-- From: mysql (Ghost)
--   load the ghost post ids using path/slug as join point
--
update comments_migrate
  inner join posts on locate(posts.slug, comments_migrate.path) > 0
  set comments_migrate.ghost_postid = posts.id ;
--
--
-- load ghost member id using email as the join point
--
update comments_migrate
  inner join members on comments_migrate.email = members.email
  set comments_migrate.ghost_memberid = members.id ;
--
--
-- Update the deleted comments to point to "anonymous" member
--  added via Ghost UI
--
update comments_migrate
  set comments_migrate.ghost_memberid = 'XXXXXXXXXXXX'
  where comments_migrate.commenterhex = 'anonymous' ;
--
--
-- Now need to allocate ghost comment ids and re-establish
-- parent relationship
--
-- To generate an MongoDB like UID I installed mysql extension
--  this will generate uid using: hex(xid_bin())
--  so adding hexid to comments is just a simple:
--
update comments_migrate
 set comments_migrate.ghost_commentid = hex(xid_bin());
--
--
-- Now re-establish parent relationshsips.
--
-- First create join that finds the missing ghost_parentid
--   via the available Commento parenthex
--
select
  child.commenthex as child_hex,
  parent.commenthex as parent_hex,
  child.ghost_commentid as child_id,
  parent.ghost_commentid as parent_id
from comments_migrate child 
  join comments_migrate parent on child.parenthex = parent.commenthex ;
--
-- This gives Commento: hex / parent hex 
--            Ghost:    id / parent id
--
-- Reducing this to return the set of missing data (ghost parent id)
--   This is just to validate that SQL statements works...
--
select
  child.ghost_commentid as child_id,
  parent.ghost_commentid as parent_id
from comments_migrate child 
  join comments_migrate parent on child.parenthex = parent.commenthex ;
--
-- ok lets squirrel that away in temp table ...
--
--
create temporary table comments_migrate_temp
  select * from (select child.ghost_commentid as
    child_id,parent.ghost_commentid as parent_id
  from comments_migrate child
    join comments_migrate parent
      on child.parenthex = parent.commenthex ) ids ;
--
-- and join and set ...
--
--
update comments_migrate
  inner join comments_migrate_temp
    on (comments_migrate.ghost_commentid = comments_migrate_temp.child_id) 
   set comments_migrate.ghost_parentid = comments_migrate_temp.parent_id ;
--
--
-- And now insert the Commento comments...
--
-- First disable foreign key contraints
--
set foreign_key_checks = 0;
--
-- copy Commento comments over..
--
insert into comments
  (id, post_id, member_id, parent_id, status, html, edited_at,
   created_at, updated_at)
select ghost_commentid, ghost_postid,
  ghost_memberid, ghost_parentid, if(deleted=1, 'deleted', 'published'),
  html, if(deleted=1, deletiondate, null), creationdate, creationdate
from comments_migrate;
--
-- verify..
--
--

If you got this far without issue then congratulations, if you found problems use commentry and if you change/extend SQL then please fork and push updates.

Summary

The king (prince.. toad that tried... but didn't get kissed) is dead ... long live the (ghost) king !

NOTE: See below for possible approach for Ghost SaaS users


References & Links:

xid for mysql - this my sql version of MongoDB 12 bit binary uid generator. After installing you can use "select hex(xid_bin()) to creat Ghost like VARCHAR(24) hex Id.

trio github repository - migration sql is in "utils", please fork and push improvements


Appendix A - Ghost Software-as-a-Service Migration (idea)

Let me qualify this note by saying I have not tired this... as I have self hosted Ghost rather then Ghost SaaS like Ghost Pro.

My suggestion for Ghost SaaS users is:

  1. Export out your Ghost site via "Labs" export
  2. Spin up a self hosted Ghost sandbox environment
  3. Import your side into sand box
  4. Extract your comments from what ever comments platform you have
  5. Import Comments into a comments_migration table in your Ghost Sandbox
  6. Run the comments migraiton process as per process above
  7. Once you have verified that comments migrated ok
  8. Export your sandbox site, which should now also include member comments (need to confirm this ...)
  9. Create new Ghost SaaS instance
  10. Import all your sandbox blog + comments into this new Ghost deployment
  11. Swap your site URLs from old Ghost SaaS site to new Ghost SaaS site

This is just a thought at this point ....

Firsty step is to confirm an export of member Ghost site with comments exports everything.

NOTE: I did export of site with members and comments and none of the members or comments appear to be in the JSON backup file...