Migrating from Microsoft SQL Server to PostgreSQL

One of the servers I manage at work is still running Microsoft Windows 2000. This system is hosting a few old forgotten web sites and it runs Microsoft SQL Server containing a few databases still in use. This server was already there when I started working at the university. Fortunately I never had to do much work on it and there never was a serious crash: it would have been a serious headache for me restoring this system because I do not have much knowledge about Windows and SQL Server.

During some spare time at work I decided taking a look at migrating the MS SQL Server databases to PostgreSQL. And all in all it was not very complicated. Fortunately, the databases itself are not very large and have a rather simple structure. The first thing I did, was recreating the structure of the database in PostgreSQL. With a tool like pgAdmin3 this is pretty straightforward. An alternative is to export the database structure in SQL format and adapt the resulting file so that it contains valid PostgreSQL table creation statements. The latter method is explained on the PostgreSQL wiki.

To migrate the data itself, I wrote a little Perl script. Actually I do not have much knowledge about Perl but it is a fairly straightforward language, so this did not prove to be too difficult. I connect to the MS SQL Server database using DBI, the Perl Database Interface, with the ODBC and FreeTDS drivers. The script gets all records from the tables you define, and inserts them in the PostgreSQL database using the PostgreSQL DBI driver.

These are the required packages on a Debian Squeeze system:

libdbd-odbc-perl
libdbd-pg-perl
tdsodbc
unixodbc
tdsodbc

Define the FreeTDS ODBC driver in /etc/odbcinst.ini:

[FreeTDS]
Description	= TDS driver (Sybase/MS SQL)
Driver		= /usr/lib/odbc/libtdsodbc.so
Setup		= /usr/lib/odbc/libtdsS.so
UsageCount	= 2

Then define the database in odbc.ini:

[database]
Driver = FreeTDS
Trace = No
Server = hostname
Port = 1433
Database = databasename

The script assumes that all field names in the PostgreSQL database are using lowercase names and that the MS SQL database is using ISO8859-15 encoding and the PostgreSQL database is using UTF-8 encoding. For every table, I make a migrateTable call with 3 arguments: the table name, an array containing all field names and the name of the primary key field for which you created a sequence (named fieldname_seq).

Once the database itself was migrated, the applications using the database had to be modified to use PostgreSQL. This was not too difficult, because the applications were PHP scripts using ODBC. These additional packages were needed:

odbc-postgresql
php5-odbc

To configure the PostgreSQL ODBC driver, add this in /etc/odbcinst.ini:

[PostgreSQL Unicode]
Description	= PostgreSQL ODBC driver (Unicode version)
Driver		= /usr/lib/odbc/psqlodbcw.so
Setup		= /usr/lib/odbc/libodbcpsqlS.so
Debug		= 0
CommLog		= 0
UsageCount	= 0

Then define the databse in odbc.ini:

[database]
Driver              = PostgreSQL Unicode
ServerName          = localhost
Database            = dbname
Username            = username
sslmode             = require

Then be sure not to call odbc_connect with the SQL_CUR_USE_ODBC option (which was actually needed with FreeTDS to fix some weird errors), because it causes segfaults.

The result? WISE’s publications pages is now using PostgreSQL as a back-end!