Database schema evolution
SQLite itself does not support migrating existing data very well, so migration is restricted to PostgreSQL and MySQL databases, although MySQL also has some migration quirks. See these posts for more information on the issue: one of the last bullets of the goals of Alembic, and Christopher Webber’s rant about the issue.
Once your application is running in production you may want to develop a new version. If code changes in the new version require changes to the database schema, you need to migrate the current database to preserve its data while matching the new schema.
A migration example
The example migration is quite elementary, so the default SQLite database can be used to illustrate the concept.
    class Address(Base):
        __tablename__ = 'migrationbootstrap_address'

        id = Column(Integer, primary_key=True)
        email_address = Column(UnicodeText)
        name = Column(UnicodeText)
        added_date = 'TODO'
        # added_date = Column(DateTime)

        fields = ExposedNames()
        fields.name = lambda i: Field(label='Name', required=True)
        fields.email_address = lambda i: EmailField(label='Email', required=True)

        def save(self):
            self.added_date = datetime.now()
            Session.add(self)

        events = ExposedNames()
        events.save = lambda i: Event(label='Save', action=Action(i.save))
To try it out, do:
    reahl example tutorial.migrationexamplebootstrap
    cd migrationexamplebootstrap
    python -m pip install --no-deps -e .
    reahl createdbtables etc
    python migrationexamplebootstrap_dev/create_demo_data.py etc/
Doing all of this simulates an application that ran somewhere for a while, with some data in its database.
Now change the application to a newer version:
comment out the ‘TODO’ version of added_date in the Address class, and uncomment the version with the Column (this simulates a change in schema)
edit the pyproject.toml file and add a new version table for 0.2 which includes a migration. Also create a version table for 0.1 (the previous version) to keep track of what that version used to depend on:
    [tool.reahl-component.versions."0.2"]
    migrations = [
      "reahl.doc.examples.tutorial.migrationexamplebootstrap.migrationexamplebootstrap:AddDate"
    ]

    [tool.reahl-component.versions."0.1"]
    dependencies = [
      "reahl-web>=6.0,<6.1",
      "reahl-component>=6.0,<6.1",
      "reahl-sqlalchemysupport>=6.0,<6.1",
      "reahl-sqlitesupport>=6.0,<6.1",
      "reahl-web-declarative>=6.0,<6.1"
    ]
edit the pyproject.toml file and increase the version of the component to 0.2:
    [project]
    name = "migrationexamplebootstrap"
    version = "0.2"
Your component had version 0.1 at first. To trigger a migration, you need a new, higher version. Database schema changes require a new major or minor version; patch versions are ignored.
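The rule that only major and minor versions count can be illustrated with a small sketch. This is not Reahl's implementation; the function names `schema_version` and `triggers_migration` are hypothetical and exist only for this illustration.

```python
def schema_version(version):
    """Return the (major, minor) part of a dotted version string.

    Any patch component is dropped, since patch versions are ignored
    when deciding whether the schema may have changed.
    """
    parts = version.split(".")
    major = int(parts[0])
    minor = int(parts[1]) if len(parts) > 1 else 0
    return (major, minor)


def triggers_migration(installed, new):
    """True if moving from `installed` to `new` changes major or minor."""
    return schema_version(new) > schema_version(installed)


print(triggers_migration("0.1", "0.2"))      # minor version increased
print(triggers_migration("0.1.0", "0.1.5"))  # only the patch version changed
```

Under these assumptions the first call reports that a migration is needed, while the second does not.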
To simulate installing the newer version, run:
python -m pip install --no-deps -e .
This command regenerates the setuptools metadata that is derived from your pyproject.toml. Only after running it will the setuptools machinery pick up the changed version number and the added migration.
Now that a new version of your component has been installed, run the following in order to migrate the old database:
reahl -l INFO migratedb etc
Write a Migration subclass for each logical change that needs to be made to the schema (and perhaps data) of the previous version.
    class AddDate(Migration):
        def schedule_upgrades(self):
            print('scheduling upgrades for AddDate')
            self.schedule('alter', op.add_column, 'migrationbootstrap_address', Column('added_date', DateTime))
    [tool.reahl-component.versions."0.2"]
    migrations = [
      "reahl.doc.examples.tutorial.migrationexamplebootstrap.migrationexamplebootstrap:AddDate"
    ]
The migratedb command checks which version of your component the current database schema corresponds with. It then runs only those Migrations needed to bring the existing schema up to date.
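As a rough illustration of "runs only those Migrations needed", the sketch below selects the migrations registered for versions newer than the one the schema is at. It is a simplified model, not the logic of migratedb itself. The `VERSIONS` registry, the `pending_migrations` function and the `RenameEmail` migration are all made up for this example.

```python
# Hypothetical registry mapping (major, minor) versions to the names of
# the migrations registered for them, loosely mirroring the version
# tables in pyproject.toml. 'RenameEmail' is invented for illustration.
VERSIONS = {
    (0, 1): [],
    (0, 2): ["AddDate"],
    (0, 3): ["RenameEmail"],
}


def pending_migrations(schema_version, versions=VERSIONS):
    """Collect migrations for every version newer than the schema's version."""
    pending = []
    for version in sorted(versions):
        if version > schema_version:
            pending.extend(versions[version])
    return pending


print(pending_migrations((0, 1)))  # schema is still at 0.1
print(pending_migrations((0, 3)))  # schema is already current
```

A database whose schema is already at the newest version yields an empty list, so nothing is run.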
Schema changes are written using the alembic.op module of Alembic, SQLAlchemy’s migration tool.
A given application can consist of many components, and each of these may have its own Migrations. For this reason, you do not actually call functions of alembic.op in your schedule_upgrades method; you just schedule such calls to be run at the appropriate time using schedule. Execution of these calls happens only once all components have had a chance to schedule their migration calls.
These calls are executed in a number of predefined phases. You schedule each call to run during a particular phase.
Phases, in order:
Foreign keys are dropped first, because they refer to other columns.
Primary keys are dropped next, because they may also prevent other actions from completing.
Sometimes some code needs to be executed before tables are altered – saving some data in a temporary table, for example, or disabling some other constraints.
Now that all possible constraints have been disabled, tables and columns may be altered.
Then, primary keys can be created again.
Followed by indexes dependent on those primary keys.
With a schema mostly fixed, data can be inserted or moved to new locations.
A last chance to recreate foreign keys to possibly newly moved data in the new schema.
Use this phase if any cleanup is needed of temporary tables, etc.
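The idea of scheduling calls first and executing them later, phase by phase, can be modelled with a small self-contained sketch. This is a toy model, not Reahl's migration machinery: the `Plan` class is hypothetical, and of the phase names used here only 'alter' appears in the example above. The other names are assumptions chosen to match the nine descriptions in order.

```python
# Phase names in execution order. Only 'alter' is taken from the example
# above; the remaining names are assumed for this sketch.
PHASES = ["drop_fk", "drop_pk", "pre_alter", "alter", "create_pk",
          "indexes", "data", "create_fk", "cleanup"]


class Plan:
    """Collects scheduled calls per phase and runs them later, in phase order."""

    def __init__(self):
        self.calls = {phase: [] for phase in PHASES}

    def schedule(self, phase, to_call, *args, **kwargs):
        if phase not in self.calls:
            raise ValueError("unknown phase: %s" % phase)
        # Nothing is executed here; the call is only recorded.
        self.calls[phase].append((to_call, args, kwargs))

    def execute(self):
        for phase in PHASES:
            for to_call, args, kwargs in self.calls[phase]:
                to_call(*args, **kwargs)


log = []
plan = Plan()
# Two imaginary components schedule their changes in arbitrary order.
plan.schedule("create_fk", log.append, "B: recreate foreign key")
plan.schedule("alter", log.append, "A: add column")
plan.schedule("drop_fk", log.append, "B: drop foreign key")

plan.execute()
print(log)
```

Although the foreign key was scheduled for recreation first, it is dropped before the column is added and recreated only afterwards, because execution follows the phase order rather than the scheduling order.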
Declare a dependency on component B in the pyproject.toml of component A:
If component A declares a foreign key to a table that belongs to component B
If component A imports code from component B
The migration machinery computes a complicated order in which Migrations are scheduled and run. This ordering relies on correct dependencies among components.
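To see why correct dependencies matter, consider a much simpler ordering than the one Reahl actually computes: a plain topological sort over declared dependencies, using `graphlib` from the Python standard library (Python 3.9 or later). The component names below are hypothetical.

```python
from graphlib import TopologicalSorter

# Each component maps to the set of components it declares a dependency on.
# component_a depends on component_b (for example, because it has a foreign
# key to one of B's tables), and component_c depends on component_a.
dependencies = {
    "component_a": {"component_b"},
    "component_b": set(),
    "component_c": {"component_a"},
}

# static_order() yields each component only after all of its dependencies.
order = list(TopologicalSorter(dependencies).static_order())
print(order)
```

If component_a failed to declare its dependency on component_b, nothing in such an ordering would guarantee that B's tables are handled before A's changes that refer to them.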