Archive
Tags
android (3)
ant (2)
beautifulsoup (1)
debian (1)
decorators (1)
django (9)
dovecot (1)
encryption (1)
fix (4)
gotcha (2)
hobo (1)
htmlparser (1)
imaplib (2)
java (1)
json (2)
kerberos (2)
linux (7)
lxml (5)
markdown (4)
mechanize (6)
multiprocessing (1)
mysql (2)
nagios (2)
new_features (3)
open_source (5)
optparse (2)
parsing (1)
perl (2)
postgres (1)
preseed (1)
pxe (4)
pyqt4 (1)
python (41)
raid (1)
rails (1)
red_hat (1)
reportlab (4)
request_tracker (2)
rt (2)
ruby (1)
scala (1)
screen_scraping (7)
shell_scripting (8)
soap (1)
solaris (3)
sql (2)
sqlalchemy (2)
tips_and_tricks (1)
twitter (2)
ubuntu (1)
vmware (2)
windows (1)
zimbra (2)

Want to have fun? Try migrating an existing web application between different database technologies! With Django and SQLAlchemy, it actually isn't that hard! I used the following procedure to migrate both deathcat.org and this blog to Postgres. I'm assuming your know you to use Postgres and you are doing this as a Postgres superuser. All of this assumes ident authentication for Postgres, but should be easily tweaked for other configurations.

Make a directory in your Django application to store these scripts, like scripts/. Make sure this directory resides at the same level as manage.py. Now, get the code for my SQLAlchemy table copier and put it in a new file called puller.py. Comment out the line that reads table.metadata.create_all(dengine).

Now put this in a file called migrate2pg.sh:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
#!/bin/bash

database=my_django_db
mysql_user=django_user
mysql_pass=django_passwd
mysql_connection_string="mysql://$mysql_user:$mysql_pass@localhost/$database?charset=utf8"
postgres_connection_string="postgres:///$database"

tables=$(echo 'show tables' | mysql -u $mysql_user -p"$mysql_pass" $database | xargs echo | cut -d ' ' -f 2-)

echo $tables

echo "Dropping old postgres database, if any"
dropdb $database

echo "Creating new database"
createdb $database

echo "Setting up Django schema"
../manage.py syncdb --noinput

echo "Removing initial data"
echo 'DELETE FROM auth_permission' | psql $database
echo 'DELETE FROM django_content_type' | psql $database

echo "Importing data from MySQL"
python puller.py \
    -f $mysql_connection_string \
    -t $postgres_connection_string \
    $tables

echo "Fixing sequences"
for table in $tables
do
    echo Fixing "${table}'s sequence"
    echo "select setval('${table}_id_seq', max(id)) from ${table};" | psql $database
done

Tweak the variables at the top as necessary for your case. Run a bash migrate2pg.sh and read the messages. One error you will see is a during the Fixing sequences phase when the script attempts to fix django_session_id_seq sequence. Ignore this error.

The final part is to give permissions or ownership to the user who will be accessing the data. I'm assuming you can do this, but if you are using Postgres ident authentication and apache, then here's a helpful script for you.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
#!/bin/bash

database='my_django_db'

echo "Granting apache rights to ${database}"
echo "GRANT ALL ON DATABASE ${database} TO apache;" | psql $database

tables=$(echo '\dt' | psql $database | awk -F '|' '/table/ {print $2}')
sequences=$(echo '\ds' | psql $database | awk -F '|' '/sequence/ {print $2}')

echo "Tables:" $tables
echo

echo "Sequences:" $sequences
echo

tablesql=$(for table in $tables; do echo "ALTER TABLE $table OWNER TO apache;"; done)
seqsql=$(for seq in $sequences; do echo "ALTER TABLE $seq OWNER TO apache;"; done)

echo "Table Alteration SQL:" $tablesql
echo
echo "Sequence Alteration SQL:" $seqsql
echo

echo $tablesql $seqsql | psql $database
Posted by Tyler Lesmann on September 4, 2009 at 16:47 and commented on 3 times
Tagged as: django mysql postgres python sql sqlalchemy

SQLAlchemy is one of the best pieces of software I have had the pleasure to use. We needed an in-office mirror of a few tables on our Oracle server for Crystal Reports. We did not need full replication, so I decided to write something to make table copying easy. SQLAlchemy made this easy and fun.

This uses the latest version of SQLAlchemy at this time, 5.3. SQLAlchemy only abstracts the database interaction. You will still need to install the database driver, like MySQLdb and cx_Oracle.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
#!/usr/bin/env python

import getopt
import sys
from sqlalchemy import create_engine, MetaData, Table
from sqlalchemy.orm import sessionmaker
from sqlalchemy.ext.declarative import declarative_base

def make_session(connection_string):
    engine = create_engine(connection_string, echo=False, convert_unicode=True)
    Session = sessionmaker(bind=engine)
    return Session(), engine

def pull_data(from_db, to_db, tables):
    source, sengine = make_session(from_db)
    smeta = MetaData(bind=sengine)
    destination, dengine = make_session(to_db)

    for table_name in tables:
        print 'Processing', table_name
        print 'Pulling schema from source server'
        table = Table(table_name, smeta, autoload=True)
        print 'Creating table on destination server'
        table.metadata.create_all(dengine)
        NewRecord = quick_mapper(table)
        columns = table.columns.keys()
        print 'Transferring records'
        for record in source.query(table).all():
            data = dict(
                [(str(column), getattr(record, column)) for column in columns]
            )
            destination.merge(NewRecord(**data))
    print 'Committing changes'
    destination.commit()

def print_usage():
    print """
Usage: %s -f source_server -t destination_server table [table ...]
    -f, -t = driver://user[:password]@host[:port]/database

Example: %s -f oracle://someuser:PaSsWd@db1/TSH1 \\
    -t mysql://root@db2:3307/reporting table_one table_two
    """ % (sys.argv[0], sys.argv[0])

def quick_mapper(table):
    Base = declarative_base()
    class GenericMapper(Base):
        __table__ = table
    return GenericMapper

if __name__ == '__main__':
    optlist, tables = getopt.getopt(sys.argv[1:], 'f:t:')

    options = dict(optlist)
    if '-f' not in options or '-t' not in options or not tables:
        print_usage()
        raise SystemExit, 1

    pull_data(
        options['-f'],
        options['-t'],
        tables,
    )

The main event is in pull_data. I create several objects to access both sides. My make_session function creates two of these, a Session, which I use for table transactions, and an Engine, which I use for database transactions. The MetaData is used to pull table schema here.

For every table, I pull the schema of the source table, Table(table_name, smeta, autoload=True), and create that table in destination database, table.metadata.create_all(dengine). The coolest thing about this is that SQLAlchemy does all of the hard work. Oracle fields become MySQL fields, or whatever is appropriate. If the table exists, nothing is done.

The destination system is ready for data now. My quick_mapper spits out ORM for a given table, which makes new records a breeze. I loop through every record from the source, create a dictionary of the column names and data, and make new records on the destination server, using the dictionary as keyword arguments. One point to note is that merge will update records that have the same primary key as the new record.

At the end, I commit the changes. If anything goes wrong, nothing will be written. I have added command-line functionality as well, works like so assuming the script is called puller.py:

Usage: puller.py -f source_server -t destination_server table [table ...]
    -f, -t = driver://user[:password]@host[:port]/database

Example: puller.py -f oracle://someuser:PaSsWd@db1/TSH1 \
    -t mysql://root@db2:3307/reporting table_one table_two
Posted by Tyler Lesmann on April 27, 2009 at 16:28 and commented on 18 times
Tagged as: python sqlalchemy