Often a startup tosses out a database to get a web site up or product out of the door. This is reasonable when there is neither the time(-to-market) nor the money (for SQL Data Modelers and developers).
For a web site there is usually one of two patterns of evolution:
- Use the production database for development.
- There’s no problem of matching database to next generation code base.
- Side-effects can be horrible from miscoding or security breaches.
- Use a clone of the production database and then evolve it with the next generation code.
For a product database, 2. above is the typical option. We will consider this type of option alone in this post. The second pattern we need to look at is a non-issue for a one-man development shop, but arises with there are multiple developers:
- A single shared development database
- Any problems impacts the entire development team
- Each developer has their own development database
- Each developer evolves the database to suit their own needs
Even with ‘get-out-of-jail (= SQL skill sets)’ free cards of dbml / Linq-to-Sql, problems arise.
The problem is simple: What TSQL is needed to convert old DB to new DB while maintaining existing (customer) data. For the rest of this post we will call the databases ‘Prod’ and ‘Dev’. In reality this could be Prod –> Dev1 –> Dev2 –> Dev3 –> Dev4—> NextRelease.
The process can be simplified by identifying the differences between Prod and Dev. This is actually simple to do via code. Below we identify changes of objects between the two. There are two statements (what is in Prod that is not in Dev, what is in Dev that is not in Prod, and what has changed (i.e. Table –> Updatable View).
Select name,type from dev.sys.objects EXCEPT Select name,type from prod.sys.objects Select name,type from prod.sys.objects EXCEPT Select name,type from dev.sys.objects
The above uses the EXCEPT to remove everything that is the same. The same logic may be applied to other sys.tables, for example
Select name,type_desc,is_unique, ignore_dup_key, is_primary_key from dev.sys.indexes EXCEPT Select name,type_desc,is_unique, ignore_dup_key, is_primary_key from prod.sys.indexes Select name,type_desc,is_unique, ignore_dup_key, is_primary_key from prod.sys.indexes EXCEPT Select name,type_desc,is_unique, ignore_dup_key, is_primary_key from dev.sys.indexes
There compares can also be on done on other system objects and views:
Select Table_Name, Column_name, Column_Default, Is_Nullable, Data_Type, Character_Maximum_Length, Numeric_Precision, Numeric_Precision_Radix, Numeric_Scale, Collation_Name from prod.Information_Schema.Columns EXCEPT Select Table_Name, Column_name, Column_Default, Is_Nullable, Data_Type, Character_Maximum_Length, Numeric_Precision, Numeric_Precision_Radix, Numeric_Scale, Collation_Name FROM dev.Information_Schema.Columns
The one thing that is needs to be remembered is that some columns are expected to be different and thus should NOT be included in the compare, for example, db_id, object_id etc.
Once you know the difference, you know what needs to be altered in get the schema between prod and dev to match. 95% of the time this can be done by just finding the object in SSMS and right click, with Drop and Create:
and a few ALTER Tables.
You still have the issue of data migration and the issue of not being able to immediately add a Not Null column or new Foreign Key constraint until after the data issues are resolved. You also may have sequencing issues (which can become very hairy when there are cyclical foreign keys)
In the next post, I will look at patterns for putting the above TSQL into a build or installation script.
Buy or Roll Your Own or Contract Out?
There are a variety of commercial tools available (see wikipedia). The problem with these tools is that there is often some art involved – some analysis that needs to be done by a human. Rolling your own script depends on two items: do you have people with the skill set and experience to do it? and is this the best use of this resource? Contracting out means finding someone or some firm that has the experience and willing to take it on as an ad-hoc jobs with deadlines (when you need it, you need it promptly!). I should put a shameless plug in for myself here because I usually have enough spare hours each week to do this type of ad-hoc work – email me if you need such services. IMHO, the cheapest solution is the contract out – existing development resources can keep focus on more important things; there is no time lost on learning commercial tools and often better quality results because of a fresh-pair of eyes on the data and migration issues (i.e. less risk of assumptions that are false).