Database schema evolution

Note

The Sqlite database itself does not support migration of existing data very well, and as a result migration is restricted to PostgreSQL and MySQL databases, even though MySQL also has some migration quirks. See these posts for more information on the issue: one of the last bullets of goals of alembic and Christopher Webber’s rant about the issue.

Once your application is running in production you may want to develop a new version. If code changes in the new version require changes to the database schema, you need to migrate the current database to preserve its data while matching the new schema.

A migration example

The example migration is quite elementary so the default sqlite database can be used to illustrate the concept.

class Address(Base):
    __tablename__ = 'migrationbootstrap_address'

    id            = Column(Integer, primary_key=True)
    email_address = Column(UnicodeText)
    name          = Column(UnicodeText)
    added_date    = 'TODO'
#    added_date    = Column(DateTime)

    @exposed
    def fields(self, fields):
        fields.name = Field(label='Name', required=True)
        fields.email_address = EmailField(label='Email', required=True)

    def save(self):
        self.added_date = datetime.now()
        Session.add(self)
        
    @exposed
    def events(self, events):
        events.save = Event(label='Save', action=Action(self.save))

To try it out, do:

reahl example tutorial.migrationexamplebootstrap
cd migrationexamplebootstrap
python -m pip install --no-deps -e .
reahl createdbtables etc
python migrationexamplebootstrap_dev/create_demo_data.py etc/

Doing all of this simulates an application that ran somewhere for a while, with some data in its database.

Now change the application to a newer version:

  • comment out the ‘TODO’ version of added_date in the Address class, and uncomment the version with the Column (this simulates a change in schema)

  • edit the setup.cfg file and add a new version table for 0.2 which includes a migration. Also create a version table for 0.1 (the previous version) to keep track of what that version used to depend on:

  [versions."0.2"]
    migrations = [
      "reahl.doc.examples.tutorial.migrationexamplebootstrap.migrationexamplebootstrap:AddDate"
    ]
  [versions."0.1"]
    install_requires = [
      "reahl-web>=6.0,<6.1",
      "reahl-component>=6.0,<6.1",
      "reahl-sqlalchemysupport>=6.0,<6.1",
      "reahl-sqlitesupport>=6.0,<6.1",
      "reahl-web-declarative>=6.0,<6.1"
    ]





  • edit the setup.cfg file and increase the version of the component to 0.2:

[metadata]
name = migrationexamplebootstrap
version = 0.2

Note

Your component had version 0.1 at first. In order to trigger a migration, you need a new, higher version. Database schema changes require new major or minor version; patch versions are ignored.

To simulate installing the newer version, run:

python -m pip install --no-deps -e .

This command regenerates setuptools metadata that is derived from your setup.cfg. Only after running it will the setuptools machinery pick up the changed version number and the added Migration.

Now that a new version of your component has been installed, run the following in order to migrate the old database:

reahl -l INFO migratedb etc

Migration basics

Create a Migration subclass for each logical change that needs to be made to the schema (and perhaps data) of the previous version.

In your AddDate Migration, override schedule_upgrades() with code that makes the schema changes:

class AddDate(Migration):
    def schedule_upgrades(self):
        print('scheduling upgrades for AddDate')
        self.schedule('alter', op.add_column, 'migrationbootstrap_address', Column('added_date', DateTime))

Register each of your Migrations in the ‘setup.cfg’ file, in the version table each migration is for:

    migrations = [
      "reahl.doc.examples.tutorial.migrationexamplebootstrap.migrationexamplebootstrap:AddDate"
    ]

The migratedb command checks to see which version of your component the current database schema corresponds with. It then runs only those Migrations needed to bring the existing schema up to date with your new code.

Writing a schedule_upgrades()

Schema changes are written using the alembic.op module of SqlAlchemy’s migration tool: Alembic

A given application can consist of many components, and each of these may have its own Migrations. For this reason, you do not actually call functions of alembic.op in your schedule_upgrades(). You just schedule such calls to be run at the appropriate time using schedule(). Actual execution of these calls happen only once all components had a chance to schedule their migration calls.

Execution of these calls happen in a number of predefined phases. You schedule a call to run during a particular phase.

Phases, in order:

drop_fk

Foreign keys are dropped first, because they refer to other columns.

drop_pk

Primary keys are dropped next, they may also prevent other actions from completing.

pre_alter

Sometimes some code needs to be executed before tables are altered – saving some data in a temporary table, for example, or disabling some other constraints.

alter

Now that all possible constraints have been disabled, tables and columns may be altered.

create_pk

Then, primary keys can be created again.

indexes

Followed by indexes dependent on those primary keys.

data

With a schema mostly fixed, data can be inserted or moved to new locations.

create_fk

A last chance to recreate foreign keys to possible newly moved data in the new schema.

cleanup

Use this phase if any cleanup is needed of temporary tables, etc.

Dependency management

Declare a dependency on component B in the setup.cfg of component A:

  • If component A declares a foreign key to a table that belongs to component B

  • If component A imports code from component B

The migration machinery computes a complicated order in which Migrations are scheduled and run. This ordering relies on correct dependencies among components.