Tuesday, July 20, 2010

Tracking user navigation on a web site

Typically you put up a website and hope that everything goes well and that it actually gets used. It is often a good idea to track which pages users visit, and where they came from. A novice developer would likely add code to every page to record this information. A better developer might add the code to a master page instead. A better (and simpler) solution is to drop an IHttpModule onto the website and have it do the recording. If you produce many websites, just compile it to a DLL and add it to each one.


The code is very simple, as shown below.

    using System;
    using System.Configuration;
    using System.Data;
    using System.Data.SqlClient;
    using System.Web;

    public class RequestTracking : IHttpModule
    {
        private string _Connection;

        public void Init(HttpApplication httpApp)
        {
            // Session state is not attached yet during BeginRequest, so hook
            // PostAcquireRequestState, where Context.Session can be non-null.
            httpApp.PostAcquireRequestState += new EventHandler(httpApp_PostAcquireRequestState);
            _Connection = ConfigurationManager.ConnectionStrings["logdb"].ConnectionString;
        }

        void httpApp_PostAcquireRequestState(object sender, EventArgs e)
        {
            // The sender is the HttpApplication instance handling this request.
            var context = ((HttpApplication)sender).Context;

            using (var conn = new SqlConnection(_Connection))
            using (var sp = new SqlCommand("sp_LogRequest", conn) { CommandType = CommandType.StoredProcedure })
            {
                sp.Parameters.AddWithValue("@RelPath", context.Request.AppRelativeCurrentExecutionFilePath);
                if (context.Session != null)
                {
                    sp.Parameters.AddWithValue("@Session", context.Session.SessionID);
                }
                sp.Parameters.AddWithValue("@IPAddress", context.Request.UserHostAddress);
                try
                {
                    conn.Open();
                    sp.ExecuteNonQuery();
                }
                catch
                {
                    // Deliberately swallowed: a logging failure must never break the page request.
                }
            }   // disposing the connection also closes it
        }

        public void Dispose()
        { }
    }
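
The module only runs once it is registered in web.config. A minimal sketch of that registration follows. It assumes the class sits in App_Code with no namespace, so the type is just RequestTracking; if you compile it into a DLL, the type attribute would instead take the form "RequestTracking, YourAssemblyName", where the assembly name is a placeholder. The server and database names in the connection string are placeholders too; only the name "logdb" comes from the code above.

```xml
<configuration>
  <connectionStrings>
    <!-- "logdb" is the name the module looks up; server and database are placeholders -->
    <add name="logdb"
         connectionString="Data Source=YOUR_SERVER;Initial Catalog=YOUR_LOG_DB;Integrated Security=True"
         providerName="System.Data.SqlClient" />
  </connectionStrings>

  <!-- Used by IIS 6, and by IIS 7 in Classic mode -->
  <system.web>
    <httpModules>
      <add name="RequestTracking" type="RequestTracking" />
    </httpModules>
  </system.web>

  <!-- Used by IIS 7 in Integrated mode -->
  <system.webServer>
    <!-- Lets the system.web/httpModules entry above coexist with this section -->
    <validation validateIntegratedModeConfiguration="false" />
    <modules>
      <add name="RequestTracking" type="RequestTracking" />
    </modules>
  </system.webServer>
</configuration>
```

If you only ever host in one mode, you can keep just the matching section and drop the validation element.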


The amount of information captured is sparse:

  • The requested file as a relative path
  • The SessionId (if a session exists)
  • The Client IP Address

You can add a lot more information if you wish. The information is inserted into a SQL database with three tables. The page reference table:

  • PageId Int Identity(1,1)
  • PageUrl varchar(255)

The Session reference table:

  • SId Int Identity(1,1)
  • SessionId varchar(255)
  • ClientIP  varchar(22)

And into the log table:

  • LogId int Identity (1,1)
  • PageId int (references the page reference table)
  • SId int (references the session reference table)
  • ReceivedTime Datetime
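
The post does not show the tables or the stored procedure, so here is one way they might look in T-SQL. This is only a sketch: the table names (PageRef, SessionRef, RequestLog) are invented for illustration, and the body of sp_LogRequest is an assumption about what such a procedure would do. The parameter names match the ones the module passes.

```sql
-- Table names are assumed; the columns follow the lists above.
CREATE TABLE PageRef (
    PageId  int IDENTITY(1,1) PRIMARY KEY,
    PageUrl varchar(255) NOT NULL
);

CREATE TABLE SessionRef (
    SId       int IDENTITY(1,1) PRIMARY KEY,
    SessionId varchar(255) NULL,
    ClientIP  varchar(22)  NULL   -- size as in the post; IPv6 text can need up to 45 characters
);

CREATE TABLE RequestLog (
    LogId        int IDENTITY(1,1) PRIMARY KEY,
    PageId       int NOT NULL REFERENCES PageRef(PageId),
    SId          int NOT NULL REFERENCES SessionRef(SId),
    ReceivedTime datetime NOT NULL DEFAULT GETDATE()
);
GO

CREATE PROCEDURE sp_LogRequest
    @RelPath   varchar(255),
    @IPAddress varchar(22),
    @Session   varchar(255) = NULL   -- the module omits this when no session exists
AS
BEGIN
    SET NOCOUNT ON;
    DECLARE @PageId int, @SId int;

    -- Look up the page, adding it the first time it is seen.
    SELECT @PageId = PageId FROM PageRef WHERE PageUrl = @RelPath;
    IF @PageId IS NULL
    BEGIN
        INSERT INTO PageRef (PageUrl) VALUES (@RelPath);
        SET @PageId = SCOPE_IDENTITY();
    END

    -- Look up the session/IP pair, adding it the first time it is seen.
    SELECT @SId = SId
    FROM SessionRef
    WHERE ClientIP = @IPAddress
      AND (SessionId = @Session OR (SessionId IS NULL AND @Session IS NULL));
    IF @SId IS NULL
    BEGIN
        INSERT INTO SessionRef (SessionId, ClientIP) VALUES (@Session, @IPAddress);
        SET @SId = SCOPE_IDENTITY();
    END

    INSERT INTO RequestLog (PageId, SId, ReceivedTime)
    VALUES (@PageId, @SId, GETDATE());
END
```

The sketch ignores concurrency: two simultaneous first requests for the same page could each insert a PageRef row. For a low-traffic site that may be acceptable; otherwise a unique constraint plus retry logic would be needed.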

This allows you to see how long the client spent on each page, what page they went to next, and so on. A useful exercise is to create a chart of all the pages showing the percentage of times users move from each page to each other page. For example, you can use Visio to generate a site map, then label each link with the percentage of times users followed it. The results may lead you to restructure the website, or to identify pages that are not being used as expected.
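
The percentages for such a chart can be pulled straight from the log. The query below is a sketch under assumptions: it supposes the log table is called RequestLog and the page reference table PageRef (both hypothetical names, since the post does not name its tables), with the columns listed earlier. It pairs each request with the next request in the same session, then reports how often each page-to-page move happened and the average time before moving on.

```sql
WITH Ordered AS (
    -- Number each session's requests in the order they arrived.
    SELECT SId, PageId, ReceivedTime,
           ROW_NUMBER() OVER (PARTITION BY SId ORDER BY ReceivedTime, LogId) AS Seq
    FROM RequestLog
),
Transitions AS (
    -- Pair every request with the one that followed it in the same session.
    SELECT cur.PageId AS FromPageId,
           nxt.PageId AS ToPageId,
           DATEDIFF(second, cur.ReceivedTime, nxt.ReceivedTime) AS SecondsOnPage
    FROM Ordered AS cur
    JOIN Ordered AS nxt
      ON nxt.SId = cur.SId
     AND nxt.Seq = cur.Seq + 1
)
SELECT f.PageUrl AS FromPage,
       t.PageUrl AS ToPage,
       COUNT(*)  AS Hits,
       -- Share of all recorded moves away from FromPage that went to ToPage.
       100.0 * COUNT(*) / SUM(COUNT(*)) OVER (PARTITION BY tr.FromPageId) AS PercentOfTransitions,
       AVG(CAST(tr.SecondsOnPage AS float)) AS AvgSecondsOnPage
FROM Transitions AS tr
JOIN PageRef AS f ON f.PageId = tr.FromPageId
JOIN PageRef AS t ON t.PageId = tr.ToPageId
GROUP BY tr.FromPageId, tr.ToPageId, f.PageUrl, t.PageUrl
ORDER BY FromPage, PercentOfTransitions DESC;
```

Note that the last page of each session has no following request, so it contributes no transition and no time-on-page figure.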

1 comment:

  1. Or if you have access to your raw log files you could use this...