Flask Biography Tutorial Part XI : Managing Database Migration in Production Environment Using Alembic
Although I never marked our live Bio application as published, in its current state it shouldn't be treated as a production-quality application. The main reason is simple: up until now, our database has been recreated every time the application starts, discarding any data already entered by our users. We did this so we could focus on perfecting application features rather than managing our own database migration scripts.
Today we are going to look at how a database migration technique is actually implemented. It will allow our application to be stamped as a production-quality application, as users will be able to register safely. Whatever development changes we introduce, our users' data will not simply be thrown into the void: the database schema will be upgraded and existing data will be merged into the new schema.
In our beloved Python world there are plenty of database migration tools available. But since we use SQLAlchemy, don't you think the author of SQLAlchemy is the legitimate author of its database migration tool as well? Well then, introducing Alembic!
What is Data Migration?
The main concept of database migration is to track how our database changes are gradually introduced as versions, and to let us move back and forth between those versions (although at some point you surely won't want to keep every previous version all the way back to the starting revision). Those who have worked on Ruby projects will understand how cool this feature is. The same feature is available in our Python application, where we are going to use Alembic to manage database migration. Let's see how easy it is to add data migration to our application!
I am trying to read your mind here... hm..
Got it. You're correct: use pip (phew, that wasn't so hard to read...)
pip install alembic
Remember that you must first activate your virtualenv environment, so that alembic is installed into your local env directory instead of the global site-packages directory. And don't forget to add it to your setup.py file, so that alembic will also be available in your Openshift-hosted application. This is how our new setup.py looks:
```python
from setuptools import setup

setup(name='Bio',
      version='1.0',
      description="Lets you have a fully interactive Curriculum Vitae",
      author='Eko S. Wibowo',
      author_email='firstname.lastname@example.org',
      url='http://www.python.org/sigs/distutils-sig/',
      install_requires=[
          'Flask==0.10.1',
          'Flask-SQLAlchemy==1.0',
          'Flask-Login==0.2.7',
          'Flask-WTF==0.9.2',
          'alembic'
      ],
)
```
With the alembic package installed, you can start issuing the alembic command (which is installed in env/bin/alembic) and begin managing database migration in your site.
Let's first initialize alembic for our application as follows:
alembic init migration
This command will create a migration directory in our application root (at the same level as the setup.py file). Of course, you can always use another directory name besides migration, such as alembic. Now try issuing the alembic current command to show which migration step our database is currently at. What do you get? Right, a connection error message.
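If you are curious what alembic init actually generated, the layout looks roughly like this (taken from Alembic's default "generic" template; minor differences between Alembic versions are possible):

```
alembic.ini          # main configuration; the database URL lives here
migration/
    env.py           # runtime environment; we will edit this shortly
    README
    script.py.mako   # template used to generate new revision scripts
    versions/        # individual migration scripts will end up here
```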
To configure alembic so that it knows our database location, simply open the alembic.ini file in the root directory and modify the line below:
sqlalchemy.url = driver://user:pass@localhost/dbname
Please configure it to match your development environment. On my Ubuntu box, I configure it as follows:
sqlalchemy.url = postgresql://swdev@localhost/bio
I omitted the password field, as my PostgreSQL is already configured to always trust localhost connections. In case you forgot how to do that, please have a read here.
If you run the alembic current command again, you will find that alembic now gives you this information:
Now our Alembic installation is correct!
It displays our current database revision as None, as we haven't issued any database migration/revision yet. How does Alembic know which revision our database is currently at? Simple: it creates an alembic_version table in our database and stores its revision data there.
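To make this bookkeeping concrete, here is a small illustrative sketch of what that table holds. Alembic creates and maintains the table itself; this snippet only simulates its shape using an in-memory SQLite database (our real back-end is PostgreSQL), with a revision id borrowed from later in this article:

```python
import sqlite3

# Simulate the single-row bookkeeping table Alembic keeps in our database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE alembic_version (version_num VARCHAR(32) NOT NULL)")
conn.execute("INSERT INTO alembic_version VALUES ('5145ac2c72c4')")

# 'alembic current' essentially reads this one row:
row = conn.execute("SELECT version_num FROM alembic_version").fetchone()
print(row[0])  # the revision our database is currently at
```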
Now that everything is in order, let us work on our first database migration: adding a currently_live_in field, so that our fans (ahem!) will know where we live. Just make sure they don't stalk us!
But before that, let us see what types of data migration are available in alembic.
Working with Data Migration
There are two types of data migration: automatic and manual. Before continuing, remember that when we work on a Python application using SQLAlchemy, we don't care much about the database structure in the back-end (served by PostgreSQL). Instead, we concentrate on building and restructuring our application model (which currently still lives in a single main.py file). We can simply drop and recreate all the tables while our application is still in the development phase, but that approach is not acceptable once our application goes live! Hence the need for data migration arises, or more importantly, for automatic data migration.
Automatic data migration: adding currently_live_in field
An automatic data migration tool lets us work intensely on our application models, forgetting completely about the database living in the back-end. When we are satisfied with our model changes, we issue a command that inspects them and automatically generates a script to upgrade the back-end database. Later on, we can inspect the upgrade script and alter it as necessary (for example, to merge data into a new column). Let's see how easily it's done with alembic.
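In shell terms, the whole loop we are about to walk through looks like this (a sketch; the revision message and generated file name will differ in your run):

```
alembic revision --autogenerate -m "describe your model change"
# ...review and edit the generated migration/versions/<revision>_*.py...
alembic upgrade head
```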
First, to allow automatic data migration, you have to modify the migration/env.py file and make our application model visible to alembic. Actually, this is the not-so-trivial obstacle you will face if you are still new to Python programming. Find the commented line that says # target_metadata = mymodel.Base.metadata. You have to uncomment it and change it to target_metadata = main.db.Model.metadata. But that alone will generate an error, as we store our application model inside the wsgi/main.py file and we haven't included the /wsgi path in the Python module path. Here is how to overcome this situation:
```python
import os, sys
sys.path.append(os.getcwd() + "/wsgi")

import main

target_metadata = main.db.Model.metadata
```
Now that we have the alembic code changes required for automatic data migration, our next step is to modify our current application model, adding the currently_live_in field as follows:
```python
class Users(db.Model, object):
    ...
    currently_live_in = db.Column(db.String(300))
```
And run the alembic command to do the automigration:

alembic revision --autogenerate -m 'Adding currently_live_in field'
Alembic will give you output similar to this:
Creating automigration script
We can see that alembic gives us more than just the addition of the currently_live_in field, so let us open the generated migration script in the migration/versions/514**** file and focus on these two methods:
```python
def upgrade():
    ### commands auto generated by Alembic - please adjust! ###
    op.drop_index('portfolio_title_key', 'portfolio')
    op.add_column('users', sa.Column('currently_live_in', sa.String(length=300), nullable=True))
    op.drop_index('users_username_key', 'users')
    op.drop_index('users_email_key', 'users')
    ### end Alembic commands ###


def downgrade():
    ### commands auto generated by Alembic - please adjust! ###
    op.create_index('users_email_key', 'users', [u'email'], unique=True)
    op.create_index('users_username_key', 'users', [u'username'], unique=True)
    op.drop_column('users', 'currently_live_in')
    op.create_index('portfolio_title_key', 'portfolio', [u'title'], unique=True)
    ### end Alembic commands ###
```
We don't care much about the dropping and creation of indexes, so let us streamline it a bit into this:
```python
def upgrade():
    op.add_column('users', sa.Column('currently_live_in', sa.String(length=300), nullable=True))


def downgrade():
    op.drop_column('users', 'currently_live_in')
```
Now let us run our first database upgrade using this alembic command:
alembic upgrade head
We are telling alembic to upgrade our database to the latest version (alembic is able to do this by looking at the current revision in the alembic_version table). You'll encounter a result similar to this:
Output of alembic upgrade command
You can see that the users table now has the new field currently_live_in. Pretty cool, isn't it?
You can also downgrade the current database into the previous state as follows:
alembic downgrade -1
Alembic downgrade command
Having issued the alembic downgrade command, we have our database revision back in its previous state.
Try running the alembic current command before and after each upgrade/downgrade command and see the result!
Manual data migration: merging firstname and lastname into fullname
If you have tried automatic data migration and have a grasp of how it works, you will find that it is actually very easy to write your own migration script, right? It is just a simple database structure change using the familiar DDL (Data Definition Language). Also, automatic data migration has its own limitations (which you can read about here), in which case you must create your data migration script manually.
Here, let us use the manual data migration technique to introduce a new field (fullname) by merging two existing fields (firstname and lastname). To prepare a new migration script, issue the following command:
alembic revision -m "Merging firstname and lastname into fullname"
You'll be presented with something similar to this:
Preparing an empty migration script
Let's see what this new migration script contains:

```python
# revision identifiers, used by Alembic.
revision = '1f922efc9109'
down_revision = '5145ac2c72c4'

from alembic import op
import sqlalchemy as sa


def upgrade():
    pass


def downgrade():
    pass
```
It tells us what the current revision is and what its previous/down revision is, along with empty upgrade and downgrade methods. As we already know what to do, let's see what these two methods should contain:
```python
def upgrade():
    print "Adding fullname column"
    op.add_column('users', sa.Column('fullname', sa.String(101)))

    print "Merging firstname + lastname into fullname"
    connection = op.get_bind()
    connection.execute("update users set fullname = subquery.newfullname "
                       "from (select id, concat(firstname, ' ', lastname) as newfullname "
                       "from users) as subquery "
                       "where users.id = subquery.id")

    print "Dropping firstname and lastname column"
    op.drop_column('users', 'firstname')
    op.drop_column('users', 'lastname')


def downgrade():
    connection = op.get_bind()
    op.add_column('users', sa.Column('firstname', sa.String(101)))
    op.add_column('users', sa.Column('lastname', sa.String(101)))

    print "Simply save fullname into firstname"
    connection.execute("update users set firstname = fullname")

    print "Dropping fullname column"
    op.drop_column('users', 'fullname')
```
The main responsibility of this code is to make sure that although we are dropping firstname and lastname, the data contained in those fields will still be there, inside the fullname field. Besides that, when downgrading (dropping fullname and bringing back firstname and lastname), we make sure that fullname is simply stored back into firstname (of course you can split it back into firstname and lastname if you like!). This ensures that our database upgrades and downgrades are always reversible.
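As a quick sanity check of the merge logic in upgrade(), here is the same idea run against an in-memory SQLite database. Note that SQLite concatenates strings with ||, whereas the migration above uses PostgreSQL's concat(); the sample row is made up for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, firstname TEXT, lastname TEXT)")
conn.execute("INSERT INTO users (firstname, lastname) VALUES ('Eko', 'Wibowo')")

# add the new column, then merge the old columns into it
conn.execute("ALTER TABLE users ADD COLUMN fullname VARCHAR(101)")
conn.execute("UPDATE users SET fullname = firstname || ' ' || lastname")

print(conn.execute("SELECT fullname FROM users").fetchone()[0])  # Eko Wibowo
```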
Remember, as this is a manual migration, we must also manually update our application model to match: adding the fullname field and removing firstname and lastname.
Try both of these data migration techniques to fully leverage the Python data migration features that Alembic introduces!
Doing Migration in Openshift
You have done your database migration in your local system. That's great. But unfortunately, our application lives in Openshift. How can we upgrade the database there? Veteran system administrators will know the answer: use ssh to connect to the Openshift remote Linux box and issue the upgrade command there, as detailed below:
- SSH login to your Openshift application
rhc ssh --app <applicationname>
- Go to our application repo
- We need to modify the alembic.ini file and change its sqlalchemy.url value to the correct Openshift setting. Let's first inspect our Openshift DB_URL by issuing this command:
env | grep DB_URL
It will output the complete database URL for your application, but all you need from it is the username and the host IP address. You don't need the password (trusted connection) or the port (defaulting to 5432). Edit the file using vim:
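Picking the username and host out of that URL by eye is error-prone, so here is a small sketch that does it with Python's urlparse. The db_url value below is a made-up example; substitute the actual URL printed by the env command above:

```python
try:
    from urllib.parse import urlparse   # Python 3
except ImportError:
    from urlparse import urlparse       # Python 2

db_url = "postgresql://admin:secret@127.4.5.6:5432/bio"  # example value only

parts = urlparse(db_url)
print(parts.username)  # admin
print(parts.hostname)  # 127.4.5.6

# the alembic.ini line we want (no password, default port):
print("sqlalchemy.url = postgresql://%s@%s/bio" % (parts.username, parts.hostname))
```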
vi alembic.ini

Now change sqlalchemy.url to the format postgresql://<username>@<host-ip>/<dbname>. Save this file and exit.
- Everything is set, just run the usual upgrade command,
alembic upgrade head
It will give you output similar to the below:
The observant among you will notice something problematic: alembic.ini is still controlled by git. So if a subsequent git push is issued, your changes in the Openshift repo will be overridden. In our local environment, we have to add alembic.ini to .gitignore. This will let each system (local and Openshift) keep its own specific alembic.ini.
One last note: in your production environment, it is also better to back up your database before running any upgrade/downgrade command. It will be a great safeguard in case any unexpected failure occurs.
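For PostgreSQL, such a backup can be as simple as the sketch below (the database name and user are from my local setup earlier in this article; adjust them to yours):

```
pg_dump --username=swdev bio > bio-before-upgrade.sql

# ...run 'alembic upgrade head' here...

# if anything goes wrong, restore with:
# psql --username=swdev bio < bio-before-upgrade.sql
```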
So, are you having fun upgrading and downgrading your database? Great :)
But don't forget that you also need to adjust the application code for the current state of the application model. This is a trivial matter and will differ case by case for each database migration scheme, so I have omitted a full explanation of the code changes needed; I'll let you inspect them in the downloaded application.
Code download : bio-part-11.zip.
Live demo : http://bio-ekowibowo.rhcloud.com
This series will last for another two articles: one implementing a settings page where you can edit your profile and upload an avatar, and a final article where we dissect our single main.py into separate files to hold routes/views, models, forms, and the main application.
I currently have two topics at my disposal: one talking about how to use Django in a web application, and the other exploring how to integrate Python into a game engine, or maybe even how to develop a game using Python (Cocos2D or PyGame, anyone?)