Database migration can be a complex process, especially when moving from legacy systems to a modern framework like Django. This guide will walk you through the process step by step, covering everything from initial assessment to final deployment.
Initial Assessment and Planning
Before diving into the migration process, you need to:
- Analyze your legacy database structure
- Map existing relationships between tables
- Document data types and constraints
- Identify potential data quality issues
- Plan for handling legacy data that doesn't fit Django's conventions
Setting Up Django Models
First, create Django models that correspond to your legacy database structure. Here's an example of how to handle common legacy database patterns:
from django.db import models class Customer(models.Model): # Map legacy 'customer_id' to Django's automatic primary key legacy_id = models.IntegerField(unique=True) name = models.CharField(max_length=100) # Handle legacy VARCHAR(1) status fields status = models.CharField(max_length=1, choices=[ ('A', 'Active'), ('I', 'Inactive'), ('P', 'Pending') ]) # Convert legacy datetime fields created_at = models.DateTimeField() class Meta: db_table = 'legacy_customers' # Use existing table name
Creating a Migration Script
The most reliable way to migrate data is using Django's inspectdb command and custom management commands. Here's a complete example:
from django.core.management.base import BaseCommand from django.db import connections, transaction from myapp.models import Customer import datetime class Command(BaseCommand): help = 'Migrate legacy customer data to Django models' def handle(self, *args, **kwargs): with transaction.atomic(): with connections['legacy_db'].cursor() as cursor: # Fetch data from legacy database cursor.execute(""" SELECT customer_id, customer_name, status, created_timestamp FROM old_customers """) rows = cursor.fetchall() # Process and insert data for row in rows: Customer.objects.create( legacy_id=row[0], name=row[1], status=row[2], created_at=datetime.datetime.fromtimestamp(row[3]) )
Handling Common Challenges
1. Different Database Encodings
If your legacy database uses a different encoding (common with older systems), handle it in your settings:
DATABASES = { 'legacy_db': { 'ENGINE': 'django.db.backends.mysql', 'OPTIONS': { 'charset': 'latin1', } } }
2. Data Transformation
Often, you'll need to transform data during migration. Create utility functions for complex transformations:
def transform_legacy_status(old_status): status_map = { '0': 'I', # Inactive '1': 'A', # Active 'PENDING': 'P' # Pending } return status_map.get(old_status, 'U') # Unknown as default
3. Handling Relations
For complex relationships, migrate tables in the correct order to maintain referential integrity:
# First, migrate customers Customer.objects.bulk_create(customer_objects) # Then, migrate orders with customer references Order.objects.bulk_create(order_objects)
Testing the Migration
Always test your migration thoroughly:
- Create a copy of your legacy database
- Run the migration on the copy
- Verify data integrity with validation queries:
def validate_migration(): with connections['legacy_db'].cursor() as cursor: # Check record counts cursor.execute("SELECT COUNT(*) FROM old_customers") legacy_count = cursor.fetchone()[0] django_count = Customer.objects.count() assert legacy_count == django_count, \ f"Count mismatch: Legacy={legacy_count}, Django={django_count}"
Performance Optimization
For large datasets, consider these optimization techniques:
- Use bulk_create for batch insertions: ```python batch_size = 1000 objects_to_create = []
for row in rows: objects_to_create.append(Customer( legacy_id=row[0], name=row[1] ))
if len(objects_to_create) >= batch_size: Customer.objects.bulk_create(objects_to_create) objects_to_create = []
2. Disable autocommit and indexes during migration: ```python from django.db import connection with connection.cursor() as cursor: cursor.execute('SET autocommit = 0;') cursor.execute('ALTER TABLE customers DISABLE KEYS;') # ... perform migration ... cursor.execute('ALTER TABLE customers ENABLE KEYS;') cursor.execute('COMMIT;') cursor.execute('SET autocommit = 1;')
Post-Migration Tasks
After completing the migration:
- Verify data integrity
- Update sequences/auto-increment values
- Rebuild indexes
- Run your application's test suite
- Perform a test deployment
- Create a rollback plan
Conclusion
Migrating legacy data to Django requires careful planning and execution. By following this guide and adapting the code examples to your specific needs, you can ensure a smooth transition to your new Django application. Remember to always backup your data and test thoroughly before performing the migration in production.
Additional Resources
- Django Documentation on database migrations
- Django's inspectdb command documentation
- Database-specific migration tools
- Django debug toolbar for verification