Sometime ago, in reducing the scope of my community web site, I deleted a bunch of SQL Server tables no longer in use with names like Classifieds, Banners, Posts, and Weather. However, I didn’t clean up the stored procedures that called these tables. I have made a decision recently to move to SQL Azure and want to reduce the amount of “dead” code I am uploading into the cloud.
I embarked on the journey of figuring out which stored procedures called tables that no longer exist. This is next to impossible to determine because SQL Server does a poor job of tracking the dependencies of stored procedures. In fairness, SQL Server doesn’t really need to keep track of the dependencies – all it needs to do is to be able to run. The running of the stored procedure generates a runtime error if the table does not exist. It is the DBA that needs to track the validity of the stored procedure in the scope of the project.
If you run the following on your favorite testing database in SQL Server 2005/2008 what do you expect to happen?
CREATE PROC spTest AS SELECT * FROM asdfasdfasdfasdfasdfasdfasdf
If you said run-time error because the table does not exist, you are wrong. This command executes correctly, stored procedures can be created that reference nonexistent tables.
Tracking My Test
The next thing I wanted to do was look at the stored procedure dependency listing in the sys.sql_dependencies. To do this, I need to get the object id of the stored procedure that I just created:
SELECT * FROM sys.objects WHERE name = 'spTest'
This gave me the object id of 893714732, which I could use to determine the dependencies with this query:
SELECT * FROM sys.sql_dependencies WHERE object_id = 893714732
This returned me no rows – which mean that SQL Server had tracked no dependencies on the stored procedure – i.e. no tables. Now I was on to something, I could determine all objects that had no rows in the sys.sql_dependencies catalog view and output them. This would significantly reduce the amount of time I need to review the 900 stored procedure by hand.
I came up with this query that returns all the objects which are stored procedures and have no rows in the sys.sql_dependencies catalog view.
SELECT * FROM sys.objects WHERE object_id NOT IN ( SELECT object_id FROM sys.sql_dependencies ) AND type = 'P' ORDER BY name
It returned a large number of stored procedures that referenced tables that did exist. These procedures I wanted to keep (and didn’t want returned in the results), so these query results were false negatives that I was not expecting. That meant that SQL Server was not keeping track of the dependencies in sys.sql_dependencies as I expected. In fact, for some of my stored procedures it did not know about the dependencies at all – I confirmed this by running query 2 and 3.
I found that if I dropped and recreated the stored procedures or altered the stored procedures without changing code, the sys.sql_dependencies catalog view updated with the correct dependencies. In testing, it appears sys.sql_dependencies catalog view is only updated when the object referenced in object_id is updated, not when the dependent objects are updated (or removed).
To remove all the false negatives, I need to drop and create all 900 stored procedures in my database. Fortunately, SQL Server Management Studio has an easy way to do this:
- Right Click on the Database node in SQL Server Management Studio
- Choose Tasks | Generate Scripts…
- Choose All the stored procedures
- In the advanced settings choose to DROP and CREATE the stored procedures. You have to drop them since they already exist.
- Generate the script to the query window.
Executing the generated script will take the stored procedure offline for a very small amount of time that might affect a production server – do this during downtime or on a staging database. However, once you have executed this Query #4 will return with much better results.
This is not a fully automated process, because you cannot use the results of Query #4 without checking each query visually. It just reduces the number of queries that you have to check by hand.
Some hidden gotchas:
- Queries that don’t reference objects (like string manipulation queries).
- Queries that use the EVAL function to reference dependencies. Sometimes developers will concatenate Transact-SQL as strings inside the stored procedure and then call the EVAL. The Transact-SQL that references the table is not compiled until runtime and will not have any dependencies in the sys.sql_dependencies catalog view.
I am using sys.sql_dependencies catalog view because I use SQL Server 2005. This has been depreciated for sys.sql_expression_dependencies catalog view introduced in SQL Server 2008. I did not test with the sys.sql_expression_dependencies catalog view.
There are other solutions out there, however they require running every stored procedure with a test harness – I was unwilling to do this, since some of my procedures take variables, and some implement data deletes.
I spent a couple of days figuring this out and could find no better solution. However, I am open for any thoughts or better ideas – post them in the comments.