Flask Biography Tutorial Part XI: Managing Database Migration in a Production Environment Using Alembic

   

Our OpenShift application's current state, using the latest revision

Although I never marked our live Bio application as published, in its current state it shouldn't be treated as a production-quality application. The main reason is simple: up until now, our database has been recreated every time the application starts, discarding any data already entered by our users. We did this so we could focus on perfecting application features rather than managing our own database migration scripts.

Today we are going to look at how exactly a database migration technique is implemented. It will allow our application to be stamped as production quality, since users will be able to register safely: whatever development changes we introduce, their data will not just be thrown into the void. The database schema will be upgraded, and existing data will be migrated into the new schema.

In our beloved Python world there are plenty of database migration tools available. But since we use SQLAlchemy, don't you think the author of SQLAlchemy is the most legitimate person to write its database migration tool? Well then, introducing Alembic!

What is Data Migration?

The main concept of database migration is to track how changes to our database are gradually introduced as a series of versions, and to let us move back and forth between those versions (although at some point you probably won't want to keep every previous version all the way back to the starting revision). Those who have worked on a Ruby on Rails project will understand how cool this feature is. The same feature is available in our Python application, where we are going to use Alembic to manage it. Let's see how easy it is to add database migration to our application!
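
To give you a taste of what this looks like in practice, here are the kinds of commands we will meet later in this article (run from the application root, once Alembic is set up):

alembic history          # list every revision, oldest first
alembic upgrade head     # move the database to the newest revision
alembic downgrade -1     # step back one revision
alembic upgrade +1       # ...and forward again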

Preparing Alembic

I am trying to read your mind here... hmm...
Got it. You're correct: use pip! (Phew, that wasn't so hard to read.)

pip install alembic

Remember that you must first activate your virtualenv, so that alembic is installed into your local env directory instead of the global site-packages directory. And don't forget to add it to your setup.py file, so that alembic will also be available in your OpenShift-hosted application. This is what our new setup.py looks like:

from setuptools import setup
setup(name='Bio',
      version='1.0',
      description="Let's you have a fully interactive Curriculum Vitae",
      author='Eko S. Wibowo',
      author_email='swdev.bali@gmail.com',
      url='http://www.python.org/sigs/distutils-sig/',
      install_requires=
      [
        'Flask==0.10.1',
        'Flask-SQLAlchemy==1.0',
        'Flask-Login==0.2.7',
        'Flask-WTF==0.9.2',
        'alembic'
      ],
     )

With the alembic package installed, you can start issuing the alembic command (which lives in env/bin/alembic) and begin managing database migrations for your site.

Initializing Alembic

Let's first initialize alembic for our application as follows:

alembic init migration

This command creates a migration directory in our application root (at the same level as the setup.py file). Of course, you can always use another directory name besides migration, such as alembic. Now try issuing the alembic current command to show which migration step our database is currently at. What do you get? Right, a connection error message. Sorry...
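
The connection error is expected: alembic doesn't yet know where our database lives. Before we fix that, here is roughly the layout the init command just created (the exact files may vary slightly between Alembic versions):

alembic.ini           <- main configuration file, placed in the root directory
migration/
    env.py            <- runtime setup; we will hook our models in here shortly
    script.py.mako    <- template used to generate new revision scripts
    versions/         <- will hold one script per database revision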

To configure alembic so that it knows our database location, simply open the alembic.ini file in the root directory and modify the line below:

sqlalchemy.url = driver://user:pass@localhost/dbname

Please configure it to match your development environment. On my Ubuntu box, I configure it as follows:

sqlalchemy.url = postgresql://swdev@localhost/bio

 

I omitted the password field, as my PostgreSQL is already configured to always trust localhost connections. In case you forgot how to do that, please have a read here.

If you run the alembic current command again, you will find that alembic now gives you this information:

 

Now our Alembic installation is correct!

 

It displays our current database revision as None, as we haven't created any migration revision yet. How does Alembic know which revision our database is currently at? Simple: it creates an alembic_version table in our database and stores the current revision there.
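
You can peek at this bookkeeping table yourself; it holds a single version_num column and stays empty until the first upgrade is applied (bio is the database name we configured above):

psql bio -c 'select * from alembic_version;'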

Now that everything is in order, let's work on our first database migration: adding a currently_live_in field, so that our fans (ahem!) will know where we live. Just make sure they don't stalk us!

But before that, let's see what types of data migration are available in alembic.

Working with Data Migration

There are two types of data migration: automatic and manual. Before continuing, remember that when we work with a Python application using SQLAlchemy, we don't care much about the database structure in the back-end (served by PostgreSQL). Instead, we concentrate on building and restructuring our application models (which currently still live in a single main.py file). We could simply drop and recreate all the tables while our application is still in the development phase, but that approach is not acceptable once our application goes live! Hence the need for data migration arises, and more importantly, for automatic data migration.

Automatic data migration: adding currently_live_in field

An automatic data migration tool lets us work intensively on our application models while forgetting completely about the database living in the back-end. When we are satisfied with our model changes, we can issue a command that inspects them and automatically generates a script to upgrade the back-end database. Later on, we can inspect the upgrade script and alter it as necessary (for example, to merge data into a new column). Let's see how easily this is done with alembic.

First, to allow automatic data migration, you have to modify the migration/env.py file so that alembic can find our application models. This is actually the one not-so-trivial obstacle you will face if you are still new to Python programming. Find the commented-out line that says # target_metadata = mymodel.Base.metadata. You have to change it to target_metadata = main.db.Model.metadata. But that alone will raise an error, because our application models live in the wsgi/main.py file and we haven't added the wsgi/ path to the Python module path. Here is how to overcome this:

import os, sys

# Make the wsgi/ directory importable so that env.py can see our models
sys.path.append(os.getcwd() + "/wsgi")

import main
target_metadata = main.db.Model.metadata

Now that we have made the alembic changes required for automatic data migration, our next step is to modify our application model, adding the currently_live_in field as follows:

class Users(db.Model, object):
    ...
    currently_live_in = db.Column(db.String(300))

And run the alembic command to generate the migration automatically:

alembic revision --autogenerate -m 'Adding currently_live_in field'

Alembic will give you output similar to this:

Creating the automigration script

We can see that alembic gives us more than just the currently_live_in field, so let's open the generated migration script in the migration/versions/514**** file and focus on the upgrade() and downgrade() methods.

def upgrade():
    ### commands auto generated by Alembic - please adjust! ###
    op.drop_index('portfolio_title_key', 'portfolio')
    op.add_column('users', sa.Column('currently_live_in', sa.String(length=300), nullable=True))
    op.drop_index('users_username_key', 'users')
    op.drop_index('users_email_key', 'users')
    ### end Alembic commands ###
 
 
def downgrade():
    ### commands auto generated by Alembic - please adjust! ###
    op.create_index('users_email_key', 'users', [u'email'], unique=True)
    op.create_index('users_username_key', 'users', [u'username'], unique=True)
    op.drop_column('users', 'currently_live_in')
    op.create_index('portfolio_title_key', 'portfolio', [u'title'], unique=True)

We don't care much about the dropping and creation of indexes here, so let's streamline it a bit:

def upgrade():
    op.add_column('users', sa.Column('currently_live_in', sa.String(length=300), nullable=True))
 
def downgrade():
    op.drop_column('users', 'currently_live_in')

Great!

Now let's run our first database upgrade using this alembic command:

alembic upgrade head

We are telling alembic to upgrade our database to the latest version (alembic is able to do this by looking at the current revision stored in the alembic_version table). You'll see a result similar to this:

Output of the alembic upgrade command

You can see that the users table now has the new currently_live_in field. Pretty cool, isn't it?
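
If you'd like to verify this from the database side as well, a quick look at the table from psql (again assuming the bio database and a trusted local connection) should now list the new column:

psql bio -c '\d users'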

You can also downgrade the database to its previous state as follows:

alembic downgrade -1

Alembic downgrade command

Having issued the alembic downgrade command, our database revision is back in its previous state.

Try running the alembic current command before and after each upgrade/downgrade command and see the result!
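
A typical session looks something like this (the revision hash is illustrative; yours will differ):

$ alembic current
5145ac2c72c4 (head)
$ alembic downgrade -1
$ alembic current
(no revision: the database is back at its pre-migration state)
$ alembic upgrade head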

Manual data migration: merging firstname and lastname into fullname

If you have tried automatic data migration and have a grasp of how it works, you will find that it is actually quite easy to write your own migration script, right? It is just a set of simple database structure changes using the familiar DDL (Data Definition Language). Besides, automatic data migration has its own limitations (which you can read about here); in those cases you must create your data migration script manually.

Here, let's use the manual data migration technique to introduce a new field (fullname) by merging two existing fields (firstname and lastname). To prepare a new migration script, issue the following command:

alembic revision -m "Merging firstname and lastname into fullname"

You'll be presented with something similar to this:

Preparing an empty migration script

Let's see what this new migration/versions/1f9***.py file contains:

# revision identifiers, used by Alembic.
revision = '1f922efc9109'
down_revision = '5145ac2c72c4'
 
from alembic import op
import sqlalchemy as sa
 
def upgrade():
    pass
 
def downgrade():
    pass

It tells us the current revision and its previous (down) revision, plus empty upgrade and downgrade methods. Since we already know what we want to do, let's see what these two methods should contain:

def upgrade():
    print("Adding fullname column")
    op.add_column('users', sa.Column('fullname', sa.String(101)))

    print("Merging firstname + lastname into fullname")
    connection = op.get_bind()
    connection.execute(
        "update users set fullname = subquery.newfullname "
        "from (select id, concat(firstname, ' ', lastname) as newfullname "
        "from users) as subquery "
        "where users.id = subquery.id")

    print("Dropping firstname and lastname columns")
    op.drop_column('users', 'firstname')
    op.drop_column('users', 'lastname')

def downgrade():
    connection = op.get_bind()
    op.add_column('users', sa.Column('firstname', sa.String(101)))
    op.add_column('users', sa.Column('lastname', sa.String(101)))

    print("Saving fullname back into firstname")
    connection.execute("update users set firstname = fullname")

    print("Dropping fullname column")
    op.drop_column('users', 'fullname')

The main responsibility of this code is to make sure that although we are dropping firstname and lastname, the data contained in those fields is preserved in the fullname field. Likewise, on downgrade (which drops fullname and brings back firstname and lastname), we make sure fullname is simply stored back into firstname (you could split it into firstname and lastname too, if you like, as sketched below). This keeps our database upgrades and downgrades reversible.
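
If you do want the downgrade to split the name back into its two parts, here is a sketch using PostgreSQL's split_part function in place of the firstname update above (naive on purpose: it assumes a single space separates the two names):

    connection.execute(
        "update users set firstname = split_part(fullname, ' ', 1), "
        "lastname = split_part(fullname, ' ', 2)")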

Remember, as this is a manual migration, we must also update the model by hand: add the fullname field to the Users class and remove firstname and lastname.
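
A minimal sketch of the updated model (the ellipsis stands for the other columns from earlier parts of this series):

class Users(db.Model, object):
    ...
    fullname = db.Column(db.String(101))
    # firstname and lastname columns are gone; the migration preserved their data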

Try out both data migration techniques to fully leverage the data migration features Alembic brings to Python!

Doing Migration on OpenShift

You have run your database migration on your local system. That's great. But our application lives on OpenShift; how can we upgrade the database there? Veteran system administrators will know the answer: use SSH to connect to the remote OpenShift Linux box and issue the upgrade command there, as detailed below:

  1. SSH into your OpenShift application:
     rhc ssh --app <applicationname>
  2. Go to our application repo:
     cd app-root/runtime/repo
  3. We need to modify the alembic.ini file and change its sqlalchemy.url value to the correct OpenShift setting. First, inspect our OpenShift DB_URL with this command:
     env | grep DB_URL
     It will output the complete database URL for your application, but all you need from it is the username and host IP address. You don't need the password (trusted connection) or the port (it defaults to 5432). Edit the file using vim:
     vi alembic.ini
     Now change sqlalchemy.url to the format postgresql://<username>@<hostaddress>/<databasename>
     Save the file and exit with :wq.
  4. Everything is set; just run the usual upgrade command:
     alembic upgrade head

     It will give you output similar to the one below:
     Upgrading the OpenShift database
  5. DONE

Or is it?

The more observant among you will notice something problematic: alembic.ini is still controlled by git. So if a subsequent git push is issued, your changes in the OpenShift repo will be overridden. In our local environment, we will have to add alembic.ini to .gitignore. This lets each system (local and OpenShift) keep its own specific alembic.ini configuration.
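
One way to do this locally (git rm --cached only untracks the file, it does not delete your local copy; note that the copy already deployed on OpenShift may need to be recreated after your next push):

git rm --cached alembic.ini
echo "alembic.ini" >> .gitignore
git commit -m "Keep alembic.ini out of version control"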

One last note: in your production environment, it's also wise to back up your database before running any upgrade/downgrade command. It will be a great safeguard in case any unexpected failure occurs.
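
For our PostgreSQL setup, a minimal routine with the standard pg_dump and psql tools might look like this (bio is the database name assumed throughout this article):

pg_dump bio > bio-backup.sql     # snapshot first
alembic upgrade head             # then migrate
# if anything goes wrong: psql bio < bio-backup.sql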

Conclusion

So, are you having fun upgrading and downgrading your database? Great :)

Just don't forget that you need to adjust your application code to match the current state of the application model. This is a trivial matter and will differ case by case for each database migration scheme, so I have omitted a full explanation of the code changes needed; I'll let you inspect them in the downloadable application.

Code download : bio-part-11.zip

Live demo : http://bio-ekowibowo.rhcloud.com

This series will last for another two articles: one implementing a settings page where you can edit your profile and upload an avatar, and a final article where we will dissect our single main.py into separate files for routes/views, models, forms, and the main application.

I currently have two topics at my disposal: one on how to use Django for a web application, and another exploring how to integrate Python into a game engine, or maybe even how to develop a game in Python (Cocos2D or PyGame, anyone?).

Stay tuned! 



