Manage versions of your database schema!

Share on FacebookShare on Google+Tweet about this on TwitterShare on LinkedIn

In software development where the system persists data in a relational database, it is important to keep track of changes to the versions of the schema.

The importance comes from that you always have several database instances to keep track of. There is the production database, the database for system tests, the database for acceptance test, the database for performance test, the database for development team and the database each developer has.

All these will be at different versions and aligned with different versions of the code.

In this blog I will describe how you come about to control your schema versions. I have tried it in two completly different settings.

What you do is to create a table called, say, “version”. Here you store all versions that have affected the database schema and preloaded control data. Thus, when in doubt you can always check the table to see which version a particular database is in.

major minor
0.1 1
0.1 2

You may use any notation and number of columns to describe your versions, but each must be unique and guranteed through a unique index.

Furthermore, all changes to the schema are described in SQL script files. For every version or change, there is one file. You can never change the file once it has come to use. There must be no doubt how the file relates to the contents of the version table.

S_00.01_1.sql
S_00.01_2.sql

The file should start with a begin and end with a commit to make running of the script atomic. The first line should be an insert into the version table. Since the lines are unique, you are guarded against applying the same SQL script twice.

begin
insert into version values (‘0,1’, 2);

commit

The files are named after the version. You should use numbers with leading zeroes padded so they line up nicely in correct order when you list them.

Order matters, since some changes depend on earlier ones.

By this simple technique you get a database where you know which version of the schema it has and you are never afraid of running the database script too many times.

And, besides, at Spa Franchorchamps, keep your foot down through Eau Rouge… ūüėČ

7 Comments

  • 1
    2008-09-07 - 10:03 | Permalink

    Maybe of interest: my colleague Dan Bergh Johnsson blogged about this subject recently: "Version of data in database"

    /ET

     

  • 2
    2008-09-07 - 10:44 | Permalink

    I use this scheme myself with the addition of one column: db_version_info.

    In the db_version_info column I let Subversion (or whatever version control system being used) fill in the repository version and other information about the SQL file.

    UPDATE admin.db_version SET db_major_version = 22;
    UPDATE admin.db_version SET db_minor_version = 0;
    UPDATE admin.db_version SET db_version_info = '$Id$'; /* don't touch, let Subversion fill this in */

    The db_version_info column might in this example contain something like:

    UPDATE admin.db_version SET db_version_info = '$Id: R22-delta.sql 9261 2008-09-05 07:31:28Z helfra $'; /* don't touch, let Subversion fill this in */ 

    This way I will always know exactly which version of a SQL script that has been run.

  • 3
    Patrik Linderl
    2008-09-08 - 08:30 | Permalink

    It seems simple enough to actually work:-)
    I’m interested in if anyone has come up with a good solution for managing schema evolution and upgrading existing databases with existing data.
    In a project I  worked in we used to have one SQL script that was version stamped that handled creation and initialization of the database. This was used for setting up new databases. Then we had another script that was used for upgrading and migration existing databases. The upgrade script was only valid betwen two consecutive versionsi.e. 1.3 to 1.4. So if you wanted to upgrade from say 1.1 to .1.4 you had to run three scripts 1.1->1.2->1.3->1.4.

  • 4
    Torbjörn Gyllebring
    2008-09-08 - 10:03 | Permalink

    A really nice easy to use no-nonsense tool for exactly this sort of versioning is DBDeploy available in both Java and .NET flavours.
    I’ve used it on multiple projects with very nice results.

  • 5
    Mattias Skarin
    2008-09-08 - 10:29 | Permalink

    And at SPA – if you keep your foot down Eau Rouge I will show within a meter where you will hit the wall  ūüôā

  • 6
    2009-08-25 - 03:48 | Permalink

    hi,

    the very same idea is used in “deltasql” (implemented in php and mysql as a web server), available at
    http://gpu-grid.net/wiki/index.php/Deltasql

    Basically, you enter the scripts on the webserver (numbering is done automatically) and the webserver computes for you the needed set of scripts, if you give it the current schema number.
    bye ūüôā

  • 7
    2013-10-02 - 23:14 | Permalink

    […] a technical recipe for accomplishing this when using Flyway in combination with Spring. I wrote in 2008 about schema versions¬†¬†and at that time it was not evident that you should control which schema changes had been […]

  • Leave a Reply

    Your email address will not be published. Required fields are marked *

    This site uses Akismet to reduce spam. Learn how your comment data is processed.