Friday, July 30, 2010

Time to get SCIENCE into Computer Science

Yesterday while hiking I listened to BBC Discovery, which covered the first controlled experiments on humans, done by a British Navy surgeon, James Lind, on sailors suffering from scurvy. The closing comments about how hard it is for people to stay scientific (basing decisions on hard, good data rather than popular belief, religious belief, anecdotes, etc.) struck home with me.


Often I have had discussions about which technology path to take, and I usually end up using the following criteria for my recommendations:

  • Number of lines of code to produce (lower is better) intended results
  • Application performance – how fast is it going to run
  • Application scalability – can we handle bigger load
  • Expected time to completion – expected in the mathematical sense:
    • Estimated Time × Probability of being correct
  • Maintainability of code base, which breaks down into:
    • Availability on the market of people skilled enough to do maintenance and development
    • Trends on the market (growing or decreasing PERCENTAGE of people with those skill sets on the market).
    • Ramp-up time for learning needed skills
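The expected-time criterion can be made concrete with a small calculation. Below is a Python sketch applying the formula above – Estimated Time × Probability of being correct – to two hypothetical technology options; the numbers are invented purely for illustration:

```python
# The post's expected-time criterion, applied literally:
# weighted time = estimated time x probability the estimate is correct.
# All numbers below are hypothetical.
options = {
    "Option A": {"estimate_days": 30, "p_correct": 0.9},
    "Option B": {"estimate_days": 20, "p_correct": 0.5},
}

weighted = {name: o["estimate_days"] * o["p_correct"] for name, o in options.items()}
for name, days in sorted(weighted.items()):
    print(f"{name}: {days:.1f} weighted days")
```

Whatever weighting model you prefer, the point is that writing the numbers down forces the decision to be argued with data instead of anecdotes.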

Often I find that the other side will not address these criteria in our discussion. The discussion can turn religious (or into a power play): “This is the way we should do it…”; or it runs on anecdotes: “everyone I talked to says it’s a problem” (when all of your friends are Java/Oracle developers).


A lot of bad software decisions come from not being scientific. QUANTIFY the criteria and then make the decision. Get the SCIENCE back into Computer Science.

Sunday, July 25, 2010

SQL Server and Rich-Sparse Data (Real Estate) – Part I

Over the last two years I have been involved with two clients that deal with real estate (different market segments, so no conflict of interest for me). I would describe real estate data as sparse, rich data. The typical scenario is that the data on one house consists of MLS data and County Assessor data. In the case of SQL Server 2008 R2, the number of fields involved quickly exceeds the number of columns available in a regular SQL Server table (1,024), but stays below the limit of a wide table (30,000). Wide tables were introduced in SQL Server 2008, so if you are forced to run on older versions of SQL Server, the options change.


In the case of wide tables, you are restricted to 4,096 columns in any single Insert, Select, or Update statement (despite having 30,000 columns available). The wide table has a column set, which is essentially an untyped XML column that combines all of the sparse columns [more info]. This means that you could fall back to a plain XML column if you are running SQL Server 2005. There is a bit of a dilemma here: using XML gives you the ability to update all 30,000 attributes by updating one XML column, instead of having to parse the data and issue up to 8 statements; the other side of the coin is that you would need to generate the computed columns etc. from the XML.


For displaying data, I may need up to 8 selects against a wide table, but I can get all 30K attributes in a single column with an XML column type.


Before diving into these issues, I should cite the operational needs that I often encountered, namely:

  • A need to retrieve the original/earliest record in full
  • A need to list what changed between updates --- only what changed.
  • A need to query the current record.

For discussion purposes, we will assume that the data coming in is either XML or some format that may be converted to XML, with all of the properties expressible as attributes in XML (i.e. a flattened structure – in some cases, this means doing cross-products of child data). The logic becomes:

  • Get the key from the Xml
  • If the key exists in the table, go to Update.
  • Insert:
    • Insert into [OriginalXml]
    • Insert into [CurrentXml]
    • Insert record with no attributes into [DeltaXml] with:
      • Received/Processed date
      • VersionNo  =0
      • End of Processing
  • Update:
    • Retrieve record from [CurrentXml]
    • Update [CurrentXml]
    • Walk all attributes in currentRecord and newRecord and retain only those attributes that have changed. Use an XmlTextReader for performance.
      • Insert into [DeltaXml] the resulting Xml with:
        • VersionNo=Max(VersionNo)+1 for key
    • End of Processing.
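The attribute walk in the Update branch above can be sketched compactly. The post's production code is C# using an XmlTextReader; the following is a hypothetical Python sketch that models the flattened records as attribute dictionaries:

```python
def delta(current_attrs, new_attrs, key="key"):
    """Retain only the attributes of new_attrs that changed; carry the record key."""
    changed = {k: v for k, v in new_attrs.items() if current_attrs.get(k) != v}
    if changed:
        changed[key] = new_attrs[key]  # a delta record always identifies its property
    return changed

# Invented sample data: only the bathroom count differs between versions.
current  = {"key": "3", "AP": "209000", "BTH": "4"}
incoming = {"key": "3", "AP": "209000", "BTH": "3"}
print(delta(current, incoming))  # → {'BTH': '3', 'key': '3'}
```

An empty delta (nothing changed) stays empty, so unchanged updates cost almost no storage.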

Now if a user wants to see the history of a property, it is easy to show, just do a select on DeltaXml and spit out the items, for example:

  1. Dec 11, 2009 <key="3" AP="209000" BTH="4" />
    1. Asking Price Changed: $209,000
    2. Bathrooms Changed: 4
  2. Dec 31, 2009 <key="3" BTH="3" />
    1. Bathrooms Changed: 3
  3. Jan 10, 2010 <key="3" AP="199000" />
    1. Asking Price Changed: $199,000

In the above example, the original bathroom count may have been 3; it was changed to 4 (clerical error?) and then changed back to 3. You get performance in retrieving the data because you do not have to walk 30K fields. Additionally, you have the complete history available (update by update) by just applying each deltaRecord to the originalRecord, without sucking up massive storage.

In theory, you could discard the original (saving space) and work backwards from the currentRecord, but users want to see the originalRecord – so I would rather avoid the overhead and performance hit of doing a reconstruction. Second, having both [OriginalXml] and [CurrentXml] allows audits to be done:

  • [OriginalXml] +[DeltaXml] = [CurrentXml]
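That audit identity can be checked mechanically: replaying every delta record on top of the original must reproduce the current record. A Python sketch under the same flat-attribute assumption (the original asking price of 215000 is invented for illustration):

```python
def apply_deltas(original, deltas):
    """Replay delta records, in version order, on top of the original record."""
    record = dict(original)
    for d in deltas:
        record.update(d)  # each delta carries only the attributes that changed
    return record

original = {"key": "3", "AP": "215000", "BTH": "3"}  # hypothetical original record
deltas = [
    {"key": "3", "AP": "209000", "BTH": "4"},  # version 1
    {"key": "3", "BTH": "3"},                  # version 2
    {"key": "3", "AP": "199000"},              # version 3
]
current = apply_deltas(original, deltas)
# [OriginalXml] + [DeltaXml] = [CurrentXml]
assert current == {"key": "3", "AP": "199000", "BTH": "3"}
```

Stopping the replay early gives the record as it looked at any intermediate version, which is exactly how the update-by-update history is produced.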

In Part II I will look at the issues in searching records and performance differences between raw XML and wide tables.

Wednesday, July 21, 2010

#if and [Conditional]

Old-school folks who have done Fortran, C, C++, etc. know our old friend, #if. .NET introduced a [Conditional] attribute and gives examples of its use. On occasion I have seen newer developers voice a dislike of seeing #if in code and advocate the use of [Conditional] as a standard. This does not fly for me – simply because it results in code becoming twisted.


Often the reason for the resistance is simply unfamiliarity with it or, in some cases, past experience with horrible usage that has conditioned their psyche (I reviewed the code base of a major (non-Microsoft) provider of tax software and saw nightmare usage patterns with #if).


For example, the following code WILL NOT COMPILE

Code Snippet

public void Init(HttpApplication context)
{
    context.EndRequest += EndRoutine;
}

[Conditional("TRACE")]
private void EndRoutine(object o, EventArgs e)
{


Error    1    Cannot create delegate with 'ErrorTracking.EndRoutine(object, System.EventArgs)' because it has a Conditional attribute  


This variation will compile.

Code Snippet

public void Init(HttpApplication context)
{
#if TRACE
    context.EndRequest += EndRoutine;
#endif
}

private void EndRoutine(object o, EventArgs e)
{


It is still possible to use [Conditional] by chaining routines – but I dislike it severely because it adds an extra call and complicates code reviews (i.e. I believe in KISS).

Code Snippet

public void Init(HttpApplication context)
{
    context.EndRequest += EndRoutine;
}

private void EndRoutine(object o, EventArgs e)
{
    EndRoutine2(o, e);
}

[Conditional("TRACE")]
private void EndRoutine2(object o, EventArgs e)
{

Another consequence of [Conditional] is that code has to be restructured: since a conditional method can neither return a value nor use out parameters, you are forced to pass results through instance properties or instance methods. Both of these approaches end up obfuscating the code.


Example of error when using out

Code Snippet

[Conditional("TRACE")]
private void EndRoutine2(object o, EventArgs e, out string x)


Error    1    Conditional member 'ErrorTracking.EndRoutine2(object, System.EventArgs, out string)' cannot have an out parameter.


The use of #if and similar directives cannot always be replaced by the [Conditional] attribute. The attribute is ideal for tracing and logging, but for a lot of other uses it fails.

Tuesday, July 20, 2010

Tracking user navigation on a web site

Typically you put up a website and hope everything goes well and is used. It is often a good idea to track the pages that users actually go to, and from where. A novice developer would likely add code to every page to record this information. A better developer may add code to a master page to do the same. A better (and simpler) solution is to just drop an IHttpModule onto the website and have it record. If you produce many websites, then just compile it to a DLL and add it to each site.


The code is very simple, as shown below.

Code Snippet

using System;
using System.Web;
using System.Configuration;
using System.Data;
using System.Data.SqlClient;

public class RequestTracking : IHttpModule
{
    private HttpApplication httpApp;
    private string _Connection;

    public void Init(HttpApplication httpApp)
    {
        this.httpApp = httpApp;
        httpApp.BeginRequest += new EventHandler(httpApp_BeginRequest);
        _Connection = ConfigurationManager.ConnectionStrings["logdb"].ConnectionString;
    }

    void httpApp_BeginRequest(object sender, EventArgs e)
    {
        using (var sp = new SqlCommand("sp_LogRequest", new SqlConnection(_Connection)) { CommandType = CommandType.StoredProcedure })
        {
            sp.Parameters.AddWithValue("RelPath", httpApp.Context.Request.AppRelativeCurrentExecutionFilePath);
            if (httpApp.Context.Session != null)
            {
                sp.Parameters.AddWithValue("Session", httpApp.Context.Session.SessionID);
            }
            sp.Parameters.AddWithValue("IPAddress", httpApp.Request.UserHostAddress);
            try
            {
                sp.Connection.Open();
                sp.ExecuteNonQuery();
            }
            catch { } // logging must never take down the site
            finally { sp.Connection.Close(); }
        }
    }

    public void Dispose()
    { }
}


The amount of information captured is sparse:

  • The requested file as a relative path
  • The SessionId (if a session exists)
  • The Client IP Address

You can add a lot more information if you wish. The information is inserted into a SQL database with three tables: the page reference table:

  • PageId Int Identity(1,1)
  • PageUrl varchar(255)

The Session reference table:

  • SId Int Identity(1,1)
  • SessionId varchar(255)
  • ClientIP  varchar(22)

And into the log table:

  • LogId int Identity (1,1)
  • PageId
  • Sid 
  • ReceivedTime Datetime
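For illustration, the three tables can be sketched as DDL. This is a hypothetical SQLite rendering of the layout above (the post targets SQL Server, where Identity(1,1) would be used instead of AUTOINCREMENT; the table names Page, Session, and Log are my own):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE Page    (PageId INTEGER PRIMARY KEY AUTOINCREMENT, PageUrl VARCHAR(255));
CREATE TABLE Session (SId    INTEGER PRIMARY KEY AUTOINCREMENT, SessionId VARCHAR(255),
                      ClientIP VARCHAR(22));
CREATE TABLE Log     (LogId  INTEGER PRIMARY KEY AUTOINCREMENT, PageId INT, SId INT,
                      ReceivedTime DATETIME);
""")
con.execute("INSERT INTO Page (PageUrl) VALUES ('default.aspx')")
rows = con.execute("SELECT PageId, PageUrl FROM Page").fetchall()
print(rows)  # → [(1, 'default.aspx')]
```

Normalizing pages and sessions into reference tables keeps the high-volume log table narrow.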

This allows you to see how long the client spent on each page, what page they went to next, etc. A useful exercise is to create a chart of all of the pages and the percentage of transitions between each pair. For example, you can use Visio to generate a site map and then add the percentage of transitions to each link. The results may cause you to restructure the website or identify pages that are not being used as expected.
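The per-link percentages for such a chart are a small aggregation over the log. A hypothetical Python sketch over (SessionId, PageUrl) rows, already ordered by ReceivedTime as the log table would return them (invented sample data):

```python
from collections import Counter, defaultdict

# (SessionId, PageUrl) rows ordered by ReceivedTime; invented sample data.
rows = [
    ("s1", "default.aspx"), ("s1", "about.aspx"), ("s1", "default.aspx"),
    ("s2", "default.aspx"), ("s2", "about.aspx"),
]

# Group page visits per session, preserving order.
by_session = defaultdict(list)
for sid, page in rows:
    by_session[sid].append(page)

# Count page-to-page transitions within each session.
transitions = Counter()
for pages in by_session.values():
    transitions.update(zip(pages, pages[1:]))

total = sum(transitions.values())
for (src, dst), n in transitions.items():
    print(f"{src} -> {dst}: {100 * n / total:.0f}%")
```

Each percentage is what you would write on the corresponding link of the site map.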

Friday, July 9, 2010

The Art of Application UI Design

One of my projects involves a client that has an excellent idea but no experience in UI design. This is often the beginning of conflicts between “this is what I envision” and “this is what best design practice says”. There can be a lot of head-bumping, for example over each dialog/page title bar:

  • Customer wants the firm logo and service mark on all of them
    • Wants to really sell this motto
  • Developer wants the name of the dialog/functionality there so the user knows where they are
    • For a support call, it makes it easy to help: ask for the title at the top
    • It allows the user to scroll through dialog titles to select where they want to go (depending on what a dialog is, and the environment)

I do not know the solution, my recommendation is to ask the customer to read some design books as a start.  Some examples of items on my shelf dealing with web design:

  • Web Design in a Nutshell, Jennifer Niederst, O’Reilly
  • Web Navigation, Designing the User Experience, Jennifer Fleming, O’Reilly
  • Creating Killer Web Sites, David Siegal, Hayden Books
  • The Art & Science of Web Design, Jeffrey Veen, New Riders
  • Experience design, Nathan Shedroff, New Riders
  • User-Centered Web Design, John Cato, Addison-Wesley
  • Homepage Usability, 50 Websites Deconstructed, Jakob Nielsen & Marie Tahir
  • Eyetracking Web Usability, Jakob Nielsen and Kara Pernice
  • Prioritizing Web Usability, Jakob Nielsen and Hoa Loranger
  • Designing Web Usability by Jakob Nielsen
  • Train of Thoughts, Designing the effective web experience, John C. Lenker, Jr

There are equivalent books for Windows applications. For new tech items like the iPhone and Android there are no design books out yet – however, in the case of Android, with the markup being pseudo-XHTML, a lot of the same design principles apply.


Design is like drawing or painting: there are fads and there are fundamentals. There’s a need to have an ‘eye’ and to be a commercial artist – that is, accepting what the trends are and adapting to them instead of doing your own thing.


Unfortunately today, many developers think they know how to design but fundamentally they have no interest in the art of UI design.  They love to push bytes, not layout.


For myself, before the days of personal computers, I did calligraphy and illumination as a hobby (and won some prizes). The skills for laying out a decorative calligraphic page have been recycled into UI design. There can be another dimension to UI design: accessibility requirements (the Americans with Disabilities Act, and Section 508 of the Rehabilitation Act), which in artistic terms are equivalent to working with a limited palette. It means that you have to be creative to make things work well. This is where the commercial artist usually does well: they tend to put less of themselves into the project (while remaining fully engaged – “less of themselves” means not sticking to their own whims).


So where do you go?

  • If you are looking at hiring a graphic artist, ask them to name some of the design books and authors that influence them… IMHO, if Jakob Nielsen is not mentioned, be wary.
    • A printed-page designer may not be a good application designer. They may be inexperienced with the workflow design issues on a page, accelerators, etc.
    • A video designer may not be a good application designer for the same reason, plus they are accustomed to dynamic change always happening, catching the user’s eye on some aspect of the screen.
  • If you are hiring a developer, inquire about any artistic hobbies (do they have an eye for, or interest in, art?), as well as asking the question above.
  • If you are working for a boss as an employee or for equity, then you need to make sure that someone is charged with, and is competent at, UI design (meaning they have done this for a few products, read appropriately, taken courses). In some cases, it may be you – because no one else is interested (a big mistake).
    • If it is you, hit Amazon or AbeBooks and get reading!!!
  • If you are working for a client who has definite ideas of what they want – you should suggest the above books, but the bottom line is “the customer is always right”; if the work turns out horrible, never list this client as a reference. Just nod and take the money!

Remember: development is an art! It is far more than just pushing bytes… IMHO

Extending Sitemaps to provide rich page features and controls

On one web project there was a need to turn on and off the ability to print on individual pages. The print control is on the master page, and the administrator wishes to easily adjust which pages may print. The solution is actually simple:


  • Add a new attribute to the sitemap node; for illustration we will use an XmlSiteMap, as shown below, with @mayPrint:

<?xml version="1.0" encoding="utf-8" ?>
<siteMap xmlns="" >
    <siteMapNode url="" title=""  description="">
        <siteMapNode url="default.aspx" title="Home Page"  description="" mayPrint="true" />
        <siteMapNode url="about.aspx" title="About Us"  description=""  mayPrint="false"/>

  • Add a static class and extension method to the application:

public static class Utility
{
    public static bool MayPrint(this SiteMapNode node)
    {
        return node[ConfigurationManager.AppSettings["mayPrint"] ?? "mayPrint"] != "false";
    }
}

  • In the master page, just read the value!

public partial class SiteMaster : System.Web.UI.MasterPage
{
    protected void Page_Load(object sender, EventArgs e)
    {
        PrintControl.Visible = System.Web.SiteMap.Provider.CurrentNode.MayPrint();
    }
}

You will notice that I have not actually hard-coded the attribute name; I have provided a default value which I may alter from web.config as needed. Items like this I usually add to a Web Server Control Library so I can reuse the code everywhere.


This same pattern may be used for any other information that you wish to associate with specific pages. The mechanism is simple and can often eliminate a stack of spaghetti coding (which may be hard-coded to make it worse).

Friday, July 2, 2010

The sweetest HTML/Web Site Validator around -- Qualidator

I came across this tool last week and have been using the free version (and am likely to upgrade soon to get more features). As the name implies, it not only evaluates technical conformity, but also evaluates the quality of the HTML and CSS on the page.


Validation separates HTML workmen from HTML professionals/craftsmen.

  • Workmen simply get the coarse job done. You want a door installed – it’s installed. There may be misalignments, gaps, missing cosmetic hardware, etc. The workman may see them, but unless nagged (or threatened) will not do anything more.
  • Craftsmen take pride in their craft – the door will fit, be aligned, look good, open and close smoothly; the trim will be well cut and fitted.

On the web you will hear voices saying that validation is not needed. You will hear fewer voices saying it is absolutely essential for any professional site. Often the difference comes down to a lack of skills, maturity, or discipline in one of these groups: “If you don’t have the bandwidth, attack it as unnecessary!”


One of the sweet features is detection of spaghetti markup – items that may become a royal pain when maintaining or rebranding a site.


Try it! The URL is:

Thursday, July 1, 2010

Doing Web-Based Standards Validation

There are a lot of sites, especially the W3C, that provide site-wide validation. On the other side are web sites that require logons or are internal-only – never the twain shall meet. Or can they?


I have a reasonable solution:

  • To Web.Config, add an AppSetting “StandardsReview” that indicates if this site is in review mode.
  • Automate the login with a specific account whenever a page is called and the user is not logged in.
    • I use Master Pages, so this is easy.
  • Put an if/else around any code that you want to protect from accidental calls. Since you are using a specific account above, that account may be a safe testing account, and this coding is then not needed. It really should be such a safe account.

The following assumes that the validators will walk all of the links from the home page. On the home page I drop this simple code:

<div style="display: none">
    <uc:SiteValidationSitemap runat="server" ID="SiteValidation" Visible="false" />
</div>

Note that it is hidden visually and defaults to be not visible. If you are in StandardsReview, then Visible is set to true.


What is in this user control? The page is trivial:

<%@ Control Language="C#" AutoEventWireup="true" CodeBehind="StandardValidationSiteMap.ascx.cs" Inherits="StandardValidationSiteMap" %>
<asp:Panel runat="server" ID="HRefLinks" />


And behind the scene we have:

public partial class StandardValidationSiteMap : System.Web.UI.UserControl
{
    protected void Page_Load(object sender, EventArgs e)
    {
        HyperLink hl;
        DirectoryInfo di = new DirectoryInfo(Server.MapPath("~/"));
        foreach (FileInfo fi in di.GetFiles("*.aspx", SearchOption.AllDirectories))
        {
            HRefLinks.Controls.Add(hl = new HyperLink() { Text = fi.Name, Visible = true });
            hl.NavigateUrl = String.Format("~/{0}", fi.FullName.Substring(di.FullName.Length));
        }
    }
}

What do we get? In normal operation:

<div style="display: none">

and in StandardsReview every page is listed:

<div style="display: none">
    <div id="SiteValidation_HRefLinks">
        <a href="Account.aspx">Account.aspx</a>
        <a href="Account.Deposit.aspx">Account.Deposit.aspx</a>
        <a href="Account.Deposit.Confirmation.aspx">Account.Deposit.Confirmation.aspx</a>

Now we can just expose the website on the real Internet. Set up IIS to run the site. If you are like me, with a dynamic IP address, then just find that address, use it as the site address, and test away. Some of my favorite validation tools are:

In the extreme case of an internal system, you can likely VPN into it and then, by doing port forwarding, expose it for validation (just don’t tell IT Security what you are doing…).