Database schema evolution

Note

The Sqlite database itself does not support migration of existing data very well, and as a result migration is only possible on PostgreSQL databases. See these posts for more information on the issue: one of the last bullets of goals of alembic and Christopher Webber’s rant about the issue.

Once your application is running in production you may want to develop a new version. If code changes in the new version require changes to the database schema, you need to migrate the current database to preserve its data.

A migration example

In order to simulate a program that changes over time, the tutorial.migrationexamplebootstrap example contains an extra added_date column in Address. This code is commented out to make it possible to run the application with a database schema that does not include added_date at first. A new schema will be needed when the actual added_date is uncommented.

class Address(Base):
    __tablename__ = 'migrationbootstrap_address'

    id            = Column(Integer, primary_key=True)
    email_address = Column(UnicodeText)
    name          = Column(UnicodeText)
#    added_date    = Column(DateTime)
    added_date    = 'TODO'

    @exposed
    def fields(self, fields):
        fields.name = Field(label='Name', required=True)
        fields.email_address = EmailField(label='Email', required=True)

    def save(self):
        self.added_date = datetime.now()
        Session.add(self)
        
    @exposed
    def events(self, events):
        events.save = Event(label='Save', action=Action(self.save))

To try it out, do:

reahl example tutorial.migrationexamplebootstrap
cd migrationexamplebootstrap
reahl setup -- develop -N
reahl createdbtables etc
reahl demosetup

Doing all of this simulates an application that ran somewhere for a while, with some data in its database.

Now change the application:

  • comment out the ‘TODO’ version of added_date, and uncomment the version with the Column
  • edit the .reahlproject file and increase the version of the component to 0.1

Note

Your component has version 0.0 by default. In order to trigger a migration, you need a new, higher version.

To simulate installing the newer version, run:

reahl setup -- develop -N

After installing a new version of your component, run the following in order to migrate the old database:

reahl -l INFO migratedb etc

Migration basics

Create a Migration subclass for each change that needs to be made to the schema (and perhaps data) of the previous version.

In your AddDate Migration, override schedule_upgrades() with code that makes the schema changes. AddDate needs a class attribute version which states which version of your component it is for:

class AddDate(Migration):
    version = '0.1'
    def schedule_upgrades(self):
        print('scheduling upgrades for AddDate')
        self.schedule('alter', op.add_column, 'migrationbootstrap_address', Column('added_date', DateTime))

Register all your Migrations in the .reahlproject file:

<migrations>
    <class locator="reahl.doc.examples.tutorial.migrationexamplebootstrap.migrationexamplebootstrap:AddDate"/>
</migrations>

The migratedb command checks to see which version of your component the current database schema corresponds with. It then runs only those Migrations needed to bring the existing schema up to date with your new code.

Writing a schedule_upgrades()

Schema changes are written using the alembic.op module of SqlAlchemy’s migration tool: Alembic

A given application can consist of many components, and each of these may have its own Migrations. For this reason, you do not actually call functions of alembic.op in your schedule_upgrades(). You just schedule such calls to be run at the appropriate time using schedule(). Actual execution of these calls happen only once all components had a chance to schedule their migration calls.

Execution of these calls happen in a number of predefined phases. You schedule a call to run during a particular phase.

Phases, in order:

drop_fk
Foreign keys are dropped first, because they refer to other columns.
drop_pk
Primary keys are dropped next, they may also prevent other actions from completing.
pre_alter
Sometimes some code needs to be executed before tables are altered – saving some data in a temporary table, for example, or disabling some other constraints.
alter
Now that all possible constraints have been disabled, tables and columns may be altered.
create_pk
Then, primary keys can be created again.
indexes
Followed by indexes dependent on those primary keys.
data
With a schema mostly fixed, data can be inserted or moved to new locations.
create_fk
A last chance to recreate foreign keys to possible newly moved data in the new schema.
cleanup
Use this phase if any cleanup is needed of temporary tables, etc.