Friday, January 13, 2012

Deploy Schemata Like a Boss

One of the many new features in Postgres 9.1 is Extensions. In their simplest form they are a collection of database objects. Think of it as package management for your database. It lets you add, remove and upgrade a collection of objects. All contrib modules shipped with Postgres now use this method.

This is just the beginning though. Normally when you think of contrib modules for postgres you think of some new functions or data types. With extensions you can just as easily manage relations. Create a table in one version of the extension and then add a column in another version. Be sure which versions of your functions and relations exist on a database by managing them as a group instead of as individuals. You can break up your schema into logical chunks and give responsibility of each chunk to an appropriate team in your development group to distribute the workload.


One extension usually has multiple versions. The version identifiers don't need to be numeric. It doesn't understand the concept that version 2 might be newer than version 1. So rather than upgrading, you are merely transitioning between versions. You can think of this like a directed graph. Each node is a version and each edge is an update path. You can also chain paths if there is no direct route. In the example to the right, there is no direct path from β → γ but you can go β → δ → γ. Below you can see all the permutations listed from the database.


example=# SELECT * FROM pg_extension_update_paths('myext') ORDER BY source,target;
 source | target |    path  
--------+--------+------------
 α      | β      | α--β
 α      | γ      | α--δ--γ
 α      | δ      | α--δ
 β      | α      | β--δ--γ--α
 β      | γ      | β--δ--γ
 β      | δ      | β--δ
 γ      | α      | γ--α
 γ      | β      | γ--β
 γ      | δ      | γ--α--δ
 δ      | α      | δ--γ--α
 δ      | β      | δ--γ--β
 δ      | γ      | δ--γ
(12 rows)
There is also a special source called 'unpackaged' that you can use to take an existing schema and put it into an extension. It is simply a script that takes every object in your database and adds it to your extension. This way you don't have to do a big migration to get your schema working out of an extension.

So you've decided you want to try this? Great! Here is what you will need:

A control file (<extname>.control)
default_version = 'β'
comment = 'my extension'
relocatable = true
default_version is the version that will be installed unless otherwise specified. relocatable means that it can be put in an arbitrarily named schema and also move from one schema to another. The only reason that wouldn't be true is if you explicitly reference a schema name in your scripts. There are some other settings you can use as well, such as requires which adds a dependency to another extension. For example if your extension requires the pgcrypto extension.

A Makefile:
EXTENSION = myext
DATA = myext--α.sql myext--α--β.sql myext--α--δ.sql myext--β.sql myext--β--δ.sql myext--γ.sql myext--γ--α.sql myext--γ--β.sql myext--δ.sql myext--δ--γ.sql
PG_CONFIG = /usr/pgsql-9.1/bin/pg_config
PGXS := $(shell $(PG_CONFIG) --pgxs)
include $(PGXS)
EXTENSION is the name of your extension. DATA is the list of all the install and update SQL scripts. Every time you add a new version you need to append to this line. PG_CONFIG is the path to your pg_config binary for the install you want to use. The last two lines are just standard Postgres Makefile code for pulling in all the install specific information it needs.

The other pieces you need are the SQL scripts to do all the work. The install scripts (<extname>--<version>.sql) are basically just a bunch of create statements. Something like you would get from a pg_dump. They create all the objects for a particular version. The update scripts (<extname>--<source>--<target>.sql) consist of a combination of create, alter and drop statements. This is all the SQL that you would need to modify a schema of the source version to convert it to a schema of the target version. This will usually be a create table or alter table or a create or replace function. If you need to remove an object you will first need to disassociate it from the extension.
ALTER EXTENSION myext DROP TABLE foo;
Then you can run a standard drop table. If you want to be able to "update" to the previous version, you will need a separate script for that as well. I would suggest not trying to make a graph as complex as the one above. Most likely yours will be completely linear.

So how easy is it to install the extension? Run make install in your extension directory, then:
example=# CREATE SCHEMA myext;
CREATE SCHEMA
example=# CREATE EXTENSION myext WITH SCHEMA myext;
CREATE EXTENSION
That easy. And to update?
example=# ALTER EXTENSION myext UPDATE TO 'γ';
ALTER EXTENSION
You can use \dx to list installed extensions:
example=# \dx
                 List of installed extensions
  Name   | Version |   Schema   |         Description        
---------+---------+------------+------------------------------
 myext   | γ       | myext      | my extension
 plpgsql | 1.0     | pg_catalog | PL/pgSQL procedural language
(2 rows)
As you can see from this brief introduction this is pretty powerful stuff. It's still a little rough around the edges, but it will be getting better. I recommend taking a look at the documentation to get a more complete understanding of all this can do.