Tuesday, October 07, 2008

How big of an allocation does max represent in the case of varchar(max) and varbinary(max)?

2 ^ 31 - 1 bytes or just about 2gb.

Can I specify a varchar(9000) or varbinary(12000)?
No.  If you are going to specify an actual number instead of (max) then your limit is 8000.  The size can be 1 to 8000, or max, there is nothing between 8000 and max.

Is a varchar(max) or varbinary(max) column stored in the data row?
If the size of the varchar or varbinary item is less than 8000 then it is stored in the data row. 
If it is greater than 8000 then the item is moved out of the data row into a special storage location for large objects.

Should varchar(max) be used to replace the TEXT column type?

Yes.  varchar(max) can handle larger strings than the text type can.  varchar(max) is easier to update than text with standard SQL statements.


Tuesday, October 07, 2008 6:01:23 PM (Pacific Standard Time, UTC-08:00)  #    Disclaimer  |  Comments [0]  | 
 Thursday, September 25, 2008

This tip applies to both Microsoft SQL Server and to Oracle, and probably several other databases.

When you use the asterisk (aka star) in a select statement it requires all columns of data from the result set to be accessed.

Why not SELECT *?


If you are querying a small table with 3 or 4 columns, then the following two statements won’t be very different:

SELECT * FROM

Or

SELECT column1, column2, column3, column4 FROM

But when your query gets more complicated, and you then join in two more tables, which combined with the original table have a total of 40 columns, then all 40 columns are returned in your record set.

SELECT * FROM Table1, Table2, Table3 WHERE …   

will return all 40 columns

SELECT column1, column2, column3, column4
FROM Table1, Table2, Table3 WHERE …   

will only return 4 columns.

Recently I was optimizing a query using the SELECT * syntax which returned about 25 columns, one of which was a VARBINARY(MAX) column.  This query was taking about 7 seconds to run.  I then substituted the 6 column names that I needed for the select *, and with no other changes, that same query ran in about .1 seconds.

Additionally assume that you had the following query:

SELECT column1, column2, column3, column4 FROM Table1

And you had an index that contained column1, column2, column3, and column4.  This would be called a covered index, and when your query was run, it would never hit any of the data pages, and instead pull all of your results from the index.  This can much quicker than having to access the data pages.

So my recommendation is to never use the SELECT * syntax, specify what you want in your result sets.  Even if your table is small, specify what you want, because over time your table may grow to have more columns.

Why not COUNT(*)

Instead of COUNT(*), you should use COUNT(1).   Either way, you will get the same results. 

The COUNT(*) counts the number of rows in your result set by bringing all of the columns into memory, but using COUNT(1) doesn't require any of these columns to come into memory.  This also applies when your query includes a GROUP BY with COUNT(1).

My recommendation is to NEVER use COUNT(*), and instead use COUNT(1).  They will both give you the same number of rows in your result set, but there is far less work required to get the count when using COUNT(1), and COUNT(*) is just wasteful.




Thursday, September 25, 2008 7:34:19 PM (Pacific Standard Time, UTC-08:00)  #    Disclaimer  |  Comments [0]  | 
 Wednesday, September 17, 2008
The quick and easy way to determine how much memory your SQL Server is using is to look at the Task Manager on the server and find “sqlservr.exe” in the process list, and look at the “Mem Usage” column.

But what if you are not on the server, or if you want to grab this information programmatically from an application, it is not so obvious.  After 15 different Google queries, and many articles saying to just use DBCC MEMORYSTATUS, I finally tracked it down.

The problem with DBCC MEMORYSTATUS, is that it returns nearly 500 values about every aspect of memory possible on the SQL Server.

I finally discovered that if I take all of the output from DBCC MEMORYSTATUS, and insert it into a temporary table, I can then query for the one line of information that I need.  The total memory used, that matches what the task manager shows is called “Working Set”.   The following query returns the same memory usage numbers as the task manager shows.

CREATE TABLE #tmp

   ( row_id INT IDENTITY PRIMARY KEY, name VARCHAR(100), value BIGINT );

INSERT INTO #tmp EXEC ( 'DBCC MEMORYSTATUS' );

SELECT top 1 value / 1024 'KBUsed'

  FROM #tmp

 WHERE RTRIM(LTRIM(name)) = 'Working Set';

DROP TABLE #tmp;

From here I was able to call this from my C# code as follows:  (conn is of type SqlConnection and was already open and connected).

// get the total memory used by the SQL Server

SQLStatement = @"CREATE TABLE #tmp

   ( row_id INT IDENTITY PRIMARY KEY, name VARCHAR(100), value BIGINT );

INSERT INTO #tmp EXEC ( 'DBCC MEMORYSTATUS' );

SELECT top 1 value / 1024 'KBUsed'

  FROM #tmp

 WHERE RTRIM(LTRIM(name)) = 'Working Set';

DROP TABLE #tmp;";

 

SQLDataAdapter = new SqlDataAdapter(SQLStatement, conn);

dtResult = new DataTable();

SQLDataAdapter.Fill(dtResult);

 

foreach (DataRow drRow in dtResult.Rows)

{

    tbServerMemory.Text = drRow["KBUsed"].ToString();

}

 

SQLDataAdapter.Dispose();

 

This gives you a quick way to check the amount of memory being used by the SQL Server without being on the server or using the Task Manager.


C# | SQL Server | Steve | T-SQL
Wednesday, September 17, 2008 4:24:08 PM (Pacific Standard Time, UTC-08:00)  #    Disclaimer  |  Comments [0]  | 
 Tuesday, September 16, 2008
I recently have had the opportunity to work on performance tuning on Microsoft SQL Server 2005.  My background has been more with performance tuning on Oracle.  The following is something that I have discovered to be very similar between Oracle and Microsoft SQL Server, but the names are different between the two systems.

So this brings me back to the question of how much procedure cache memory does one SQL Statement take up?   The answer is either very little memory, or all of the memory available to the cache, depending on how you do it.

Take the following query as an example.

SELECT column1 FROM table1 WHERE id = ???;

Where ??? represents the actual id number being passed into a query.  For instance:

SELECT column1 FROM table1 WHERE id = 7;
SELECT column1 FROM table1 WHERE id = 928;
SELECT column1 FROM table1 WHERE id = 12341;


Now compare the above queries to the following parameterized query:

SELECT column1 FROM table1 WHERE id = @ID_TO_FIND;

Where @ID_TO_FIND is a parameter that gets set at runtime.

To start with we need to understand how SQL Server handles the processing of queries.  There are 2 steps to running a query, the first is to compile the query (also known as creating the plan), and the second is to run the query.   The process of creating the plan (compiling) is very expensive, so SQL Server keeps a cache of these compiled queries around.  If SQL Server finds an already compiled query, it uses that, otherwise it is forced to recompile the query.

The difference between these two calling conventions is astounding.   In the first calling convention where you pass hard coded strings in each time, the SQL procedure cache will have n unique copies of the first query format cached, where n is the number of unique id parameters that are passed in.
In the second calling convention where the SQL statement is parameterized and is not changing each time, then we will have 1 copy of the SQL statement in the procedure cache, and this one cached copy is reused over and over again.

Given that the amount of memory for the procedure cache is limited (which it is), and that you could have hundreds or more queries running on your SQL server, and for each of those queries was not parameterized, you would multiply it by the number of unique parameters passed into it, you could easily in a day have hundreds of thousands of un-necessary plan compiling being done on your SQL Server.

Without using parameterized queries, you are just wasting the Procedure Cache.  If you are using parameterized queries, then your system will run faster, and scale better.

If you are using Oracle, you have the same concept, but it is called Bind Variables instead of Parameterized Queries.

Tuesday, September 16, 2008 7:44:44 PM (Pacific Standard Time, UTC-08:00)  #    Disclaimer  |  Comments [0]  |