Migrations

From time to time, you might wish to adjust the schema of your database objects. The Pulp platform provides a migration system to assist you with this process. In this section of the guide, we will discuss how to configure your project’s migrations.

Registration

In order to write migrations for your Pulp plugin, you will need to register your plugin’s migrations package with the Pulp server. It is advantageous to register even a brand-new plugin that has no migrations; doing so will enable future users to fast-forward past your future migrations when doing new installations.

How to Register

There are a few steps you will need to perform in order to configure your project to advertise itself to Pulp’s migration system. First you will need to create a migrations Python package in your project’s plugin space. For example, the Pulp RPM project has its migrations at pulp_rpm.migrations. You don’t have to call it “migrations”, but that’s a reasonable choice of name.

Second, you will need to use the Python entry points system to advertise your migration package to Pulp. To do that, add an entry_points argument to the setup() call in your setup.py file, like this:

    setup(
        <other_arguments>,
        entry_points={
            <other_entry_points>,
            'pulp.server.db.migrations': [
                '<your_project_name> = <path.to.migrations.package>'
            ]
        }
    )

It’s important that the entry point group name “pulp.server.db.migrations” be used here. To clarify this with an example, the Pulp RPM project’s setup.py has this as its entry_points setup argument:

    entry_points = {
        'pulp.distributors': [
            'distributor = pulp_rpm.plugins.distributors.iso_distributor.distributor:entry_point',
        ],
        'pulp.importers': [
            'importer = pulp_rpm.plugins.importers.iso_importer.importer:entry_point',
        ],
        'pulp.server.db.migrations': [
            'pulp_rpm = pulp_rpm.migrations'
        ]
    }

Once you have that in your setup() function, you will need to install your package using your setup.py file. This will advertise your package’s migrations to Pulp, and you will be registered with Pulp’s migration system. Once you have installed your package, you should run pulp-manage-db as the same user that apache runs as, and you should see some output that mentions your migration package:

$ sudo -u apache pulp-manage-db
Beginning database migrations.
Migration package pulp.server.db.migrations is up to date at version 2
Migration package pulp_rpm.migrations is up to date at version 4
Migration package <path.to.migrations.package> is up to date at version 0
Database migrations complete.
Loading content types.
Content types loaded.

It should say that your package is at version 0, because you haven’t written any migrations yet. We’ll talk about that next.
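If your package does not show up in that output, it can help to confirm that the entry point itself was installed. The following is a minimal sketch, assuming Python 3.8 or newer and the standard library's importlib.metadata; the function name registered_migration_packages is hypothetical and is not part of Pulp. On older Python 2 installations, pkg_resources.iter_entry_points offers a comparable lookup.

```python
from importlib.metadata import entry_points


def registered_migration_packages(group='pulp.server.db.migrations'):
    """Return {entry point name: target} for everything advertised in `group`.

    Hypothetical helper for debugging only; Pulp does its own discovery.
    """
    eps = entry_points()
    if hasattr(eps, 'select'):
        # Python 3.10+: entry points are filtered with select().
        selected = eps.select(group=group)
    else:
        # Python 3.8/3.9: entry_points() returns a dict keyed by group name.
        selected = eps.get(group, [])
    return {ep.name: ep.value for ep in selected}


if __name__ == '__main__':
    for name, target in sorted(registered_migration_packages().items()):
        print('%s = %s' % (name, target))
```

An empty result means that no installed distribution advertises anything in that group, which usually points at a typo in the group name or a package that has not been reinstalled since setup.py was edited.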

Creating Migrations

In the event that you need to make an adjustment to your data in Pulp, you should write a migration script. There are a few rules to follow for migration scripts, and if you follow them carefully, nobody gets hurt. Here are the rules:

  1. Migration scripts should be modules in your migrations package.
  2. Each migration module should be named starting with a version number.
  3. Your migration version numbers are significant. Pulp tracks which version each install has been migrated to. It requires your migration versions to be positive integers that increase in value. For example, 0001_my_first_migration.py, 0002_my_second_migration.py, 0003_add_email_addresses_to_users.py, etc. You don’t have to use leading zeros in the names, as the number is processed with a regular expression that interprets it as an integer. However, the advantage to using leading zeros is that programs like ls will display your migrations in order when you inspect the contents of your migration package.
  4. Each migration module should have a function called “migrate” with this signature: def migrate(*args, **kwargs).
  5. Inside your migrate() function, you can perform the necessary work to change the data in the Pulp install.
  6. Your migrate() function must be written in such a way that it will not fail for a new installation. New installations will start at migration version 0, and all migrations up to the most recent migration will be applied by the system. Therefore, you must not assume that there is data in the database, or on the filesystem. Your migration script should detect what work, if any, needs to be done before performing any operations.

For example, your migrations package might look like this:

migrations
|
|-- __init__.py
|-- 0001_rename_user_to_username.py
|-- 0002_remove_spaces_from_username.py
|-- 0003_recalculate_unit_hashes.py
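To make rule 3 concrete, here is a rough sketch of how a leading version number can be read from a module name and used for ordering. The regular expression and the function names are assumptions made for illustration; Pulp's own pattern may differ.

```python
import re

# Assumed pattern: one or more digits at the very start of the module name.
_VERSION_RE = re.compile(r'^(\d+)')


def migration_version(module_name):
    """Return the leading version number of a migration module as an int."""
    match = _VERSION_RE.match(module_name)
    if match is None:
        raise ValueError('%r does not start with a version number' % module_name)
    return int(match.group(1))


def in_migration_order(module_names):
    """Sort module names by their integer version, not alphabetically."""
    return sorted(module_names, key=migration_version)
```

Because the number is converted to an integer, 0003_recalculate_unit_hashes and 3_recalculate_unit_hashes both map to version 3. Without leading zeros, though, a plain alphabetical listing would put 10_... before 2_..., which is the inconvenience rule 3 mentions.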

Here’s what the first migration, 0001_rename_user_to_username.py, might look like:

# Getting the db handle is left as an exercise for the reader
from somewhere import initialize_db

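def migrate(*args, **kwargs):
    """
    We want to rename the 'user' attribute in our users collection to 'username' for clarity.
    """
    db = initialize_db()
    # Assuming a PyMongo 2.x/3.x collection: multi=True applies the rename to
    # every document; without it, only the first match is updated.
    db.users.update({}, {'$rename': {'user': 'username'}}, multi=True)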
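Rule 6 asks that a migration detect what work needs to be done, so that it is harmless on an empty install. The sketch below illustrates that idea with a plain list of dictionaries standing in for the users collection, so it can run without a database. The helper names are hypothetical, and a real migration would apply the same check through its database handle.

```python
def needs_rename(document):
    """True only if the old 'user' field is present and not yet renamed."""
    return 'user' in document and 'username' not in document


def rename_user_field(documents):
    """Rename 'user' to 'username' in place; return how many documents changed."""
    changed = 0
    for document in documents:
        if needs_rename(document):
            document['username'] = document.pop('user')
            changed += 1
    return changed


def migrate(*args, **kwargs):
    # Stand-in data: a new installation has no documents at all.
    documents = kwargs.get('documents', [])
    return rename_user_field(documents)
```

Calling migrate() with no documents does nothing, and running it a second time over already-migrated data changes nothing, which is the behavior you want when every migration is applied to a fresh install.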

Enabling Fast-Forward for New Installations

When a user installs your plugin for the first time, Pulp can optionally skip the migrations. This saves a small amount of time, but more importantly, facilitates the removal of old migrations as described below.

By default, your migrations will not be skipped. To enable fast-forward, set the variable allow_fast_forward in your migrations package’s __init__.py. The simplest example:

allow_fast_forward = True

This example assumes that you registered your migrations package in the first release of your plugin. If that is not true, you will need to do some work to determine dynamically whether it is safe to fast-forward. A simple and recommended approach is to check for the existence of data:

allow_fast_forward = MyModel.objects.count() == 0

If your plugin has no data in the database, it is probably safe to skip the migrations.

Removing Old Migrations

You may not want to support all of your migrations forever. When you are ready to remove old migrations, follow these steps.

  1. Choose the oldest version of your plugin whose migrations you want to keep. We will use 1.4.0 as an example.
  2. Identify which migrations were introduced prior to that version, and delete them. For this example, imagine that migrations 0001-0005 were introduced prior to 1.4.0.
  3. Create a new migration with the highest number that was removed, which in this example is 0005. From that migration’s migrate() function, raise pulp.server.db.migrate.models.MigrationRemovedError.

The new migration could be named 0005_migrations_removed.py and would look like this:

from pulp.server.db.migrate.models import MigrationRemovedError

def migrate(*args, **kwargs):
    raise MigrationRemovedError('0005', '1.2.0', '1.4.0', 'my_plugin_name')
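To see why removal and fast-forward go together, consider this simplified model of a migration runner. It is only a sketch of the behavior described in this section, not Pulp's implementation: the MigrationRemoved class is a local stand-in for MigrationRemovedError, and the upgrade function and MIGRATIONS table are hypothetical.

```python
class MigrationRemoved(Exception):
    """Local stand-in for Pulp's MigrationRemovedError (illustration only)."""


def removed_placeholder(*args, **kwargs):
    # Plays the role of 0005_migrations_removed.py.
    raise MigrationRemoved('migrations 0001-0005 have been removed')


def noop(*args, **kwargs):
    # Plays the role of any migration that is still supported.
    pass


# Version -> migrate function. 0001-0004 were deleted; 0005 is the placeholder.
MIGRATIONS = {5: removed_placeholder, 6: noop, 7: noop}


def upgrade(current_version, new_install=False, allow_fast_forward=False):
    """Return the version reached after applying all pending migrations."""
    if new_install and allow_fast_forward:
        # Fast-forward: jump to the latest version without running anything.
        return max(MIGRATIONS)
    for version in sorted(MIGRATIONS):
        if version > current_version:
            MIGRATIONS[version]()
            current_version = version
    return current_version
```

In this model, an install that is already at version 5 or later upgrades normally, and a new install with fast-forward enabled jumps straight to version 7. An install still at version 3 hits the placeholder and gets the error, and so does a new install without fast-forward, which is why enabling fast-forward matters before you remove old migrations.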