
How to migrate content to Strapi?

  • Date: Jun. 13, 2023

Content migration is a common task when replacing one CMS with another. In this article, we’ll see how to migrate content to Strapi.
For any data migration, we first need to define the source and the destination. The source may be a database, an API, or files (CSV, JSON, …), depending on where your data comes from. In our case, the destination is Strapi.
As the source differs for every data migration, we won’t address it in this article and will focus on the destination: Strapi.

How can we migrate content to Strapi?

The first step is to define the Strapi content types that will host the content we want to migrate.
In our example, we’ll use the FoodAdvisor demo, which defines several content types including restaurant and article.
The next step is to find a technical way to create content in Strapi in bulk.
 

How To Create Content In Strapi?

We could import data directly into the database, but that would require knowing its structure, which Strapi manages itself based on the declared content types. While that structure is simple for basic content types, it gets complex once relations, components and dynamic zones are involved. We won’t use this approach here.
We could use the Strapi Data Import feature (available since v4.6.0), but it wasn’t available when we started looking for a solution. Also, as it was originally designed for export/import from one Strapi instance to another, we’d need to reverse-engineer the structure of the export (JSON Lines files) to make it work. The main difficulty would be handling IDs manually (creating them and keeping them in memory to create the relations between contents).
We could also use the Query Engine API or the Entity Service API, but that would require coding the migration script in a Strapi-aware runtime, for example through a plugin. That seemed too complicated, especially for something that would be removed from Strapi once the migration is done.
We therefore decided to use the REST API. After all, Strapi is a headless CMS and is hence designed API-first. So now, which API client can we use, and what payload should we put in the body of those API calls?
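
As a rough idea of what such a call looks like, here is a minimal sketch of the request that creates one entry. It assumes Strapi v4 conventions: a POST to /api/<plural API ID>, the fields wrapped in a data object, and the API token sent as a Bearer header. The buildCreateEntryRequest helper and its parameters are hypothetical names used for illustration; they are not part of Strapi or of any client library.

```typescript
// Shape of the HTTP request we want to send (hypothetical helper type).
interface CreateEntryRequest {
  url: string;
  init: { method: 'POST'; headers: Record<string, string>; body: string };
}

// Builds (but does not send) the request that creates one entry.
function buildCreateEntryRequest(
  baseUrl: string,
  pluralApiId: string,
  apiToken: string,
  data: Record<string, unknown>,
): CreateEntryRequest {
  return {
    url: `${baseUrl}/api/${pluralApiId}`,
    init: {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
        Authorization: `Bearer ${apiToken}`,
      },
      // Strapi v4 expects the entry fields under a "data" key
      body: JSON.stringify({ data }),
    },
  };
}

const createRequest = buildCreateEntryRequest(
  'http://127.0.0.1:1337',
  'restaurants',
  'token',
  { name: 'Epicure - Le Bristol Paris', price: 'p4' },
);
console.log(createRequest.url, createRequest.init.body);
```

The resulting object can be passed to fetch(createRequest.url, createRequest.init); keeping the construction separate from the network call makes the payload easy to inspect while developing.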
 

Strapi REST API Client?

While Strapi REST API clients exist, like strapi-client-js or strapi-sdk-js, none of them had all the features we needed: content and media creation, with the possibility to create media directories and move media into them.
While media creation uses the traditional API, creating a media directory and moving media into directories require a special API provided by the upload plugin, which uses a different authorization method (the usual API token doesn’t work in that case and a JWT token has to be used instead).
For that purpose, we developed @smile/strapi-client. Its usage is fairly simple:

import {StrapiClient} from '@smile/strapi-client';
const strapiClient = new StrapiClient('http://127.0.0.1:1337', 'token', 'admin_token');


The constructor accepts 2 tokens:

  • The first one is the usual API token generated in Strapi > Settings > API Tokens.
  • The second one is optional (useful for creating media folders or moving media in the Media Library) and should be the JWT token used by the Strapi admin web app. This token can be found in the browser’s local or session storage after authenticating to the Strapi admin.

Then, the Strapi client allows:

  • creating an entry: strapiClient.createEntry('articles', {...});
  • updating an entry: strapiClient.updateEntry('articles', articleId, {...});
  • creating a media: strapiClient.addMediaAsset('pathOrUrl', 'alt', 'caption');
    • if a URL is provided, the file is automatically downloaded
  • creating a media folder: strapiClient.createMediaFolder('folderName', parentId);
  • moving several media to a folder: strapiClient.moveMedia(folderId, [mediaId1, mediaId2, …]);

For entries, the data passed to the API must match Strapi content types. But how can we help the developer to pass the correct data?


Type generation

To pass the Strapi API a payload matching the expected content types, we should create TypeScript interfaces that describe the fields and their types for every content type. Writing them by hand is laborious and error-prone, so we’d like to automate it.
Strapi provides a command to generate types (strapi ts:generate-types), but the generated types do not match those expected by API calls.
We therefore developed our own tool (strapi-content-type-to-ts). For the restaurant content type of the FoodAdvisor demo, it generates something like this (simplified):

export interface Restaurant {
  name: string;
  slug?: string;
  images: { id: number }[];
  price: (`p1` | `p2` | `p3` | `p4`);
  ...
}

Note that this tool can be extended to handle custom fields provided by Strapi plugins.
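
To illustrate what such an interface brings during development, here is a small sketch based on a trimmed-down copy of the Restaurant interface shown above. The media id and the invalid value are made up for the example.

```typescript
// Trimmed-down copy of the generated interface (for illustration only).
interface Restaurant {
  name: string;
  slug?: string;
  images: { id: number }[];
  price: 'p1' | 'p2' | 'p3' | 'p4';
}

// Accepted: required fields are present and price is one of the allowed values.
const validRestaurant: Restaurant = {
  name: 'Epicure - Le Bristol Paris',
  images: [{ id: 12 }], // hypothetical media id
  price: 'p4',
};

// Rejected by the compiler: 'expensive' is not part of the price union.
// @ts-expect-error
const invalidRestaurant: Restaurant = { name: 'Oops', images: [], price: 'expensive' };

console.log(validRestaurant.price, invalidRestaurant.name);
```

The mistake is reported by the compiler, before any API call is made, instead of surfacing as an error response from Strapi in the middle of a migration run.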

Hands on

Introduction

As explained above, our example is based on the FoodAdvisor demo. Make sure you’ve cloned that repository first. As the repository is fairly big, you may clone only the last commit with: git clone --depth 1 git@github.com:strapi/foodadvisor.git


This demo has 2 main directories:

  • api, corresponding to the Strapi part
  • client, corresponding to a Next.js web application for the demo


Make sure that you’ve installed both parts (api and client) and that you’ve run the seed in the api part. Also, start Strapi (and Next.js if you want to have a frontend demo).


For our hands-on, we’re going to create a new Node.js project outside of the FoodAdvisor demo. You may also develop it inside the FoodAdvisor demo (with some changes in configuration).

 

Tips for development in the Strapi directory

If you decide to develop inside the FoodAdvisor Strapi project, note that Strapi, when started in development mode, automatically rebuilds and restarts when it detects file changes. To prevent unwanted restarts while developing the migration script, disable file watching for the directory that contains it.


Modify the config/admin.js file:

module.exports = ({ env }) => ({
  ...
  watchIgnoreFiles: ['**/yourMigrationDirectory/**'],
});

Also, if your Strapi project uses TypeScript (which is not the case by default in the FoodAdvisor demo), add the migration directory to the exclude list in tsconfig.json so that the parent Strapi project does not compile its TypeScript files:

{
  "extends": "@strapi/typescript-utils/tsconfigs/server",
  ...
  "exclude": [
    ...
    "yourMigrationDirectory/"
  ]
}

Init migration project

Create a new directory (for example foodadvisor-migration), go into that directory and run:

npm init -y

Add the following dependencies (respectively for TypeScript types generation script from Strapi content types and for a Strapi client API):

npm i @smile/strapi-content-type-to-ts @smile/strapi-client

Then, create a tsconfig.json file as we’re going to code in TypeScript modules:

{
  "compilerOptions": {
    "target": "ESNext",
    "module": "ESNext",
    "moduleResolution": "node",
    "strict": true,
    "skipLibCheck": true
  }
}

Create a migrate.mts file (we’ll add code later).
Then add the following scripts to package.json:

{
  "name": "foodadvisor-migration",
  "scripts": {
    "generate-content-types": "strapi-content-type-to-ts -s ../foodadvisor/api -e custom-field -o strapi-content-types.mts",
    "migrate": "tsc && node migrate.mjs"
  },
  ...
}

Make sure to change the -s parameter in generate-content-types script to match the relative path to your FoodAdvisor api directory from your foodadvisor-migration project.

 

Generate types

Run the following script:

npm run generate-content-types

This should create a strapi-content-types.mts file with all the interfaces corresponding to the Strapi content types.
It also logs a warning because the ckeditor.CKEditor field type is not handled:

Missing custom field plugin for ckeditor.CKEditor.
Create a /.../foodadvisor-migration/custom-field/ckeditor.CKEditor.js file with the following signature:
module.exports = function (options) {
  return '...';
}

Indeed, by default, the script only handles Strapi native field types. For unhandled types, it generates an any TypeScript type:

export interface Article {
  title: string;
  ...
  ckeditor_content: any //FIXME: missing custom field plugin for ckeditor.CKEditor;
}

But ckeditor.CKEditor is just a nice UI on top of a string field. So we can add a plugin to @smile/strapi-content-type-to-ts by following the instructions in the warning log.
Create a custom-field/ckeditor.CKEditor.js file with the following content:

module.exports = function (options) {
  return 'string';
}

Run the script again (npm run generate-content-types): the warning is gone and the type has changed from any to string in the generated interface:

export interface Article {
  title: string;
  ...
  ckeditor_content: string;
}

In our demo, we’ll only import restaurants. Here is the (simplified) version of the interfaces we’ll use:

export interface Restaurant {
  name: string;
  slug?: string;
  images: { id: number }[];
  price: (`p1` | `p2` | `p3` | `p4`);
  information?: RestaurantInformation;
  place?: number;
}

export interface RestaurantInformation {
  location?: RestaurantLocation;
}

export interface RestaurantLocation {
  address?: string;
}

Now that we have types for Strapi content types, let’s prepare our source input data.

Prepare source input data

For our demo, the data to migrate will come from a JSON file, restaurants.json, with the following format:

[
  {
    "name": "Epicure - Le Bristol Paris",
    "slug": "epicure-le-bristol-paris",
    "mainPhotoSrc": "https://res.cloudinary.com/tf-lab/image/upload/restaurant/8c5df4b2-34be-4eba-bc36-e08e73665325/c171c70e-8b70-4d83-a9ed-dfe8643da943.jpg",
    "priceRange": 451,
    "address": {
      "street": "112 Rue du Faubourg Saint-Honoré",
      "postalCode": "75008",
      "locality": "Paris",
      "country": "France"
    }
  },
  ...
]

Create the restaurants.json file and fill it with data. You can use that JSON.
To type this data, let’s create an input-types.mts file describing the format of the input data:

export interface RestaurantInput {
    name: string;
    slug: string;
    mainPhotoSrc?: string;
    priceRange: number;
    address?: {
        street?: string;
        postalCode?: string;
        locality?: string;
        country?: string;
    }
}

We’ll also have to transform this source data. For this purpose, let’s create a utils.mts file:

import {RestaurantInput} from './input-types.mjs';
import {Restaurant} from './strapi-content-types.mjs';

/**
 * Returns the Strapi price value ('p1', 'p2', 'p3' or 'p4') for a numeric price
 */
export function getPrice(price: number): NonNullable<Restaurant['price']> {
    return price > 200
        ? 'p4'
        : price > 100
            ? 'p3'
            : price > 50
                ? 'p2'
                : 'p1';
}

/**
 * Returns a single-string address from a structured address object
 */
export function getAddress(restaurant: RestaurantInput): string {
    return [
        restaurant.address?.street,
        restaurant.address?.postalCode,
        restaurant.address?.locality,
        restaurant.address?.country,
    ]
        .filter(v => !!v)
        .join(' ');
}

The source input data gives a numeric price. We have to deduce the Strapi price value (one of 'p1', 'p2', 'p3' and 'p4') from that number. The getPrice function does so with the following mapping: up to 50 is p1, above 50 and up to 100 is p2, above 100 and up to 200 is p3, and above 200 is p4.

The source input data gives an address as a structured object with street, postalCode, locality and country. But our target Strapi content type only has a string address (in information.location.address). The getAddress function builds that string from a restaurant input object, skipping any missing part.
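
To check the mapping at its boundaries, here is a self-contained sketch that repeats both helpers (with local, simplified types instead of the generated ones) and runs them on a few values. The sample inputs other than the Epicure entry are made up for the example.

```typescript
type Price = 'p1' | 'p2' | 'p3' | 'p4';

// Simplified local input type (only the part getAddress needs).
interface AddressHolder {
  address?: { street?: string; postalCode?: string; locality?: string; country?: string };
}

function getPrice(price: number): Price {
  return price > 200 ? 'p4' : price > 100 ? 'p3' : price > 50 ? 'p2' : 'p1';
}

function getAddress(restaurant: AddressHolder): string {
  return [
    restaurant.address?.street,
    restaurant.address?.postalCode,
    restaurant.address?.locality,
    restaurant.address?.country,
  ]
    .filter(v => !!v)
    .join(' ');
}

// Values on each side of every threshold
const mapped = [50, 51, 100, 101, 200, 201, 451].map(p => `${p} -> ${getPrice(p)}`);
console.log(mapped.join(', '));
// 50 -> p1, 51 -> p2, 100 -> p2, 101 -> p3, 200 -> p3, 201 -> p4, 451 -> p4

// Missing parts are skipped, and a missing address gives an empty string
console.log(getAddress({ address: { street: '112 Rue du Faubourg Saint-Honoré', locality: 'Paris' } }));
// 112 Rue du Faubourg Saint-Honoré Paris
console.log(getAddress({}) === ''); // true
```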

We now have everything we need to write the migration script.

 

Write the migration script

We first need to get the required API tokens. Make sure Strapi is running and connect with an admin account, then:

  • generate a token in Settings > API Tokens:
    • click “Create new API Token”
    • fill in a name, choose “Unlimited” duration and “Full access” token type
    • click “Save”
    • copy the token
  • get your admin JWT token:
    • open your browser developer tools (F12 or CTRL+SHIFT+i)
    • execute in your console the following command:
      JSON.parse(sessionStorage.jwtToken || localStorage.jwtToken)
    • copy the token

Open the migrate.mts file and create a StrapiClient (fill in the tokens retrieved in the previous step):

import {StrapiClient} from '@smile/strapi-client';

const token = '...';
const adminToken = '...';
const strapiClient = new StrapiClient('http://127.0.0.1:1337', token, adminToken);

Continue editing migrate.mts file to load the restaurants from the JSON file:

import {StrapiClient} from '@smile/strapi-client';
import * as fs from 'fs';
import {RestaurantInput} from './input-types.mjs';

...

const restaurants: RestaurantInput[] = JSON.parse(fs.readFileSync('./restaurants.json').toString());

Then write a function creating a restaurant and the corresponding image:

import {StrapiClient} from '@smile/strapi-client';
import * as fs from 'fs';
import {RestaurantInput} from './input-types.mjs';
import {Restaurant} from './strapi-content-types.mjs';
import {getAddress, getPrice} from './utils.mjs';

...

const mediaIds: number[] = [];

async function createRestaurant(restaurant: RestaurantInput) {
  const mediaCreationResponse = await strapiClient.addMediaAsset(restaurant.mainPhotoSrc!);
  const mediaId = mediaCreationResponse[0].id;
  mediaIds.push(mediaId);
  await strapiClient.createEntry<Restaurant>('restaurants', {
    name: restaurant.name,
    slug: restaurant.slug,
    images: [{id: mediaId}],
    price: getPrice(restaurant.priceRange),
    information: {
      location: {
        address: getAddress(restaurant)
      }
    },
    place: 1
  });
}

N.B.:

  • we create a global mediaIds to store all the ids of the created media. The goal is to be able to move them all in one directory later
  • we first create the media before creating the restaurant in order to have the media id for the images attribute
  • the media is automatically downloaded by strapiClient.addMediaAsset function
  • we use getPrice and getAddress from our utils.mts
  • we set place to 1. It should match the already existing place “Paris” that has been created by the seed
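
Note that mainPhotoSrc is optional in RestaurantInput, while the function above assumes it is always set. If some of your source entries have no photo, one possible way to handle it is sketched below. The uploadMainPhoto helper and the UploadFn type are hypothetical; the uploader's return shape is only inferred from how addMediaAsset is used above, and whether an empty images list is acceptable depends on your content type.

```typescript
// Hypothetical uploader signature, modelled on how the article uses
// strapiClient.addMediaAsset (it resolves to an array of created media).
type UploadFn = (pathOrUrl: string) => Promise<{ id: number }[]>;

/**
 * Uploads the main photo when there is one and returns the value to put in
 * the images attribute; returns an empty list when the source has no photo.
 */
async function uploadMainPhoto(
  mainPhotoSrc: string | undefined,
  upload: UploadFn,
): Promise<{ id: number }[]> {
  if (!mainPhotoSrc) {
    return [];
  }
  const created = await upload(mainPhotoSrc);
  // Keep only the ids, which is what the images attribute expects
  return created.map(({ id }) => ({ id }));
}
```

In createRestaurant, it could be called as uploadMainPhoto(restaurant.mainPhotoSrc, src => strapiClient.addMediaAsset(src)), with the returned ids also pushed to mediaIds.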


We now have to loop over the restaurants and call the createRestaurant function:

...

for (let i = 0; i < restaurants.length; i++){
  const restaurant = restaurants[i];
  try {
    await createRestaurant(restaurant);
    console.log(`Created restaurant ${i + 1}/${restaurants.length} (${Math.round((i + 1) * 100 / restaurants.length)}%)`);
  } catch (e) {
    console.error(`Failed creating restaurant`, JSON.stringify(restaurant, null, 2), e);
  }
}

N.B.: we add some logs to detect potential errors and visualize progression.

To finish, we create a media directory and move every imported media to that directory:

...

const mediaFolderCreation = await strapiClient.createMediaFolder(`Migration ${new Date().toISOString()}`);
await strapiClient.moveMedia(mediaFolderCreation.data.id, mediaIds);

Run the script:

npm run migrate

You should see the following logs:

Created restaurant 1/20 (5%)
Created restaurant 2/20 (10%)
...
Created restaurant 19/20 (95%)
Created restaurant 20/20 (100%)

 

And you should see your restaurants in the Strapi admin interface (and even in the Next.js web application if you launched it).
That’s all! 


Sources

You can find the complete sources of this demo here.

Performance considerations

Some good practices can improve performance. By following them on a real data migration project, we imported around 37,000 content entries and 5,500 images (2.5 GB) in less than 20 minutes on a development laptop.

Here are some of our recommendations.


Use a real database

Use a database server (like PostgreSQL) instead of SQLite, which noticeably slows down the import.

 

Pre-download media

If you need to import media from a URL, write a script that downloads them all locally first, then make your migration script use the local copies. When you run the import again and again during development, this greatly improves performance, as each media is downloaded only once.
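
A minimal sketch of such a download cache is shown below, assuming Node.js 18+ (for the global fetch). The localPathFor and ensureDownloaded helpers, the cache directory name and the hashing scheme are all choices made for this example, not something provided by Strapi or by @smile/strapi-client.

```typescript
import { createHash } from 'node:crypto';
import { existsSync, mkdirSync } from 'node:fs';
import { writeFile } from 'node:fs/promises';
import * as path from 'node:path';

// Deterministic local file name for a URL: hash of the URL + original extension.
function localPathFor(url: string, cacheDir: string): string {
  const hash = createHash('sha1').update(url).digest('hex');
  const extension = path.extname(new URL(url).pathname); // e.g. ".jpg"
  return path.join(cacheDir, `${hash}${extension}`);
}

// Downloads the file only if it is not already in the cache,
// and returns the local path in both cases.
async function ensureDownloaded(url: string, cacheDir: string): Promise<string> {
  const target = localPathFor(url, cacheDir);
  if (!existsSync(target)) {
    const response = await fetch(url);
    if (!response.ok) {
      throw new Error(`Download failed (${response.status}): ${url}`);
    }
    mkdirSync(cacheDir, { recursive: true });
    await writeFile(target, new Uint8Array(await response.arrayBuffer()));
  }
  return target;
}
```

As addMediaAsset accepts either a path or a URL, the migration script can then pass the path returned by ensureDownloaded(restaurant.mainPhotoSrc, 'media-cache') instead of the remote URL.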

 

Configure the responsive image feature

Importing an image takes quite long because of the Strapi responsive image feature, which creates multiple sizes of each image. Make sure it’s configured with only the sizes you need, and disable the feature if you don’t need it.
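
As an illustration, the sizes can be restricted through the upload plugin configuration. The sketch below assumes Strapi v4 and a TypeScript project (config/plugins.ts); in a JavaScript project such as FoodAdvisor, the same object would be exported with module.exports from config/plugins.js. The breakpoint names and widths are example values to adapt to your frontend. The feature itself is, to our knowledge, toggled in the admin under Settings > Media Library.

```typescript
// config/plugins.ts (assumed location for a TypeScript Strapi v4 project)
const pluginsConfig = () => ({
  upload: {
    config: {
      // Responsive formats generated for each uploaded image.
      // Names and widths (in pixels) are illustrative values.
      breakpoints: {
        medium: 750,
        small: 500,
      },
    },
  },
});

export default pluginsConfig;
```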

 

Parallelize API calls

To make the most of multi-core CPUs, you should parallelize your API calls. This will greatly improve migration performance. The right number of parallel calls is empirical (test different values to find the best one for your case), but 20 is a reasonable default.


Here is an example of a function that can be used to handle multiple API calls in parallel:

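/**
 * Runs task on every item, with at most limit tasks in flight at once.
 * task should catch its own errors: rejections are not handled here.
 */
async function batchTasks<T>(items: T[], task: (context: {
  item: T,
  items: T[],
  index: number,
  size: number
}) => Promise<any>, limit: number) {
  const activeTasks: (Promise<any>)[] = [];
  // Wrap each call in a function so that no task starts before its turn
  const tasks = items.map((item, index) => () => task({item, items, index, size: items.length}));
  for (const startTask of tasks) {
    // Pool is full: wait until at least one running task settles
    if (activeTasks.length >= limit) {
      await Promise.race(activeTasks);
    }
    // Each task removes itself from the pool once settled
    const activeTask = startTask().finally(() => activeTasks.splice(activeTasks.indexOf(activeTask), 1));
    activeTasks.push(activeTask);
  }
  // Wait for the tasks that are still running
  return Promise.all(activeTasks);
}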

It could be used like this to have 10 restaurants imported in parallel:

await batchTasks(
  restaurants,
  async ({item: restaurant, index, size}) => {
    try {
      await createRestaurant(restaurant);
      console.log(`Created restaurant ${index + 1}/${size} (${Math.round((index + 1) * 100 / size)}%)`);
    } catch (e) {
      console.error(`Failed creating restaurant`, JSON.stringify(restaurant, null, 2), e);
    }
  },
  10
);

N.B.: a concurrency bug currently exists in Strapi when uploading several media in parallel: Strapi tries to create an “API Uploads” directory for each concurrent upload without an appropriate lock. To work around it, use one of the following options: make sure the “API Uploads” directory exists before importing media in parallel, import media sequentially, or wait for the first media upload to finish before uploading the others in parallel.
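
The last option can be sketched as follows: the first item is processed alone, and the remaining ones are only started once it has finished. The firstThenParallel helper is a hypothetical name for this example; for brevity it starts all remaining items at once, so on a real migration the second step would rather go through a bounded helper such as batchTasks.

```typescript
/**
 * Runs task on the first item alone, then on all remaining items in parallel.
 */
async function firstThenParallel<T>(
  items: T[],
  task: (item: T) => Promise<void>,
): Promise<void> {
  const head = items.slice(0, 1); // empty when there is no item
  const tail = items.slice(1);
  for (const item of head) {
    // Nothing else runs until the first task is done
    await task(item);
  }
  await Promise.all(tail.map(item => task(item)));
}
```

With the migration script above, it could be called as firstThenParallel(restaurants, createRestaurant), keeping in mind that createRestaurant should then catch its own errors.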

Conclusion

In retrospect, for our real data migration project, using the REST API was a good approach, as it met all our expectations in terms of features and performance. Note that we built a proof of concept (POC) to validate the performance of this approach before developing the entire migration.
This approach results from the data migration needs of our use case. It is not a silver bullet and may not suit your needs.
For larger volumes (several hundred thousand or millions of contents), the performance of the REST API may not be enough. In that case, build a POC to validate (or rule out) the REST API approach. If the REST API is not fast enough, consider using lower-level APIs, the Strapi Data Import feature or a direct database import.
As it wasn’t necessary for our use case, the @smile/strapi-client and @smile/strapi-content-type-to-ts packages do not currently handle the Strapi internationalization plugin (which makes it possible to save different values for different languages on some content type fields). If you need that feature for your migration, it could easily be added, as both packages are open source. You could even implement it yourself and open a pull request.

Maxime Robert


Expert technique