The addition of native XML support to SQL Server has been one of the smartest database moves that Microsoft has done and has moved it ahead of all competitors. There is a challenge because many database developers are Xml-illiterate or marginally literate so it often is under utilized or misused.
A common “trick” that I used to judge a Sql developer skills with XML is to ask them this question:
- You have a Class table with ClassId, ClassName etc
- A Student table with StudentId, Student Name etc
- A Registration table containing only ClassId and StudentId
Please write a TSQL statement to return all students with the name of the classes they are in as a XmlDocument.
What I looked for is if the person asks if “Do you want two nodes (Student, Class) or three nodes (Student, Registration, Class)” – Answer I want two: Student, Class. I don’t want to see the Role table.
If they seem confused about what I want as output (I tend to look for folks that will fill in the dots automatically), then I sketch this on the board:
<School> <Student StudentName="John Adams"> <Class ClassName="SQL 101" /> </Student> <Student StudentName="John Smith"> <Class ClassName="SQL 102" /> </Student> <Student StudentName="John Tams"> <Class ClassName="SQL 101" /> <Class ClassName="SQL 102" /> </Student> </School>
If they asked how do you get the <School> in it, it tells me they should go back to school. Or worst, they try concatenating <school> + … + </school> to the query. The solution is of course:
Select StudentName, ClassName From Student Join (Select StudentId,ClassName From Class A Join Registration B On A.ClassId=B.CLassid) Class ON Student.StudentId=Class.StudentId Order by StudentName for Xml Auto, Root('School')
Of course, Order by StudentName, ClassName would also be fine. The next question is simple… in SQL Management Studio, how many rows and columns will be displayed (using the default settings).
Of course, it’s one row and one column – if they get that wrong… to assume zero real experience is likely a reasonable assumption.
If the developer asserts that they are also C# or .Net developers (or more dangerously, that .Net is their primary strength, but they are strong in SQL), then I hit them with my .Net/SQL XML differentiating questions. You have the above query and it returns a 30K XmlDocument. Show me how you would get this data from SQL Server into C#. I may remind them that we get exactly one row and one column back from the above query.
- The reminder is actually an attempt to leave them down the garden path to the cliff of lemmings. A developer who has not done it, would immediately jump to ExecuteScalar() assuming this was a simple pro-forma question. It’s not. The Xml will be truncated at 2033 characters… oops we are 28K short and have invalid Xml. (see KB article Q310378: XML Data Is Truncated When You Use SqlDataReader)
Now there are two reasonable solutions to this.
- Use the ExecuteXmlReader properly (there are a lot of examples on the web that will work for XML < 2033 characters ONLY, including on Microsoft.com)
Or convert it to a nvarchar(max) in TSQL which you can get with an ExecuteScalar. The TSQL would be:
Select cast( (Select StudentName, ClassName From Student Join (Select StudentId,ClassName From Class A Join Registration B On A.ClassId=B.CLassid) Class ON Student.StudentId=Class.StudentId Order by StudentName for Xml Auto, Root('School')) as nvarchar(max))
You must remember that the pattern is:
SELECT CAST( (select …. xml auto) AS NVARCHAR(MAX))
The extra ( ) around the select is essential.
The question of course, is which performs better. I will address that in a future blog.
OK, why do this???? The answer is simple – if you are using web services or WCF, the most effective (performance wise) is to avoid any serialization/de-serialization of .Net objects and just pass a string from the database to WCF to the client. XML is sweet because it can express hierarchy (and reduce data volume by eliminating redundant data). More in future blogs.