Thursday, December 9, 2010

Running Multiple Threads on Windows Azure

The default worker role created by Visual Studio 2010 generates code that uses only a single thread in your compute instance. With that generated code, if you make synchronous network requests to SQL Azure or to a web service (for example via REST), the instance's dedicated core sits underutilized while it waits for the network response. One technique is to use the asynchronous methods in ADO.NET and the HttpWebRequest class so that the thread is not blocked while the request is in flight. For more information about asynchronous calls read: Asynchronous Programming Design Patterns. Another technique, which I cover in this blog post, is to start multiple threads, each dedicated to a task. For this purpose I have coded a multi-threaded framework to use in your worker role.

 

Goals of the framework:

  • Remain true to the design of the RoleEntryPoint class, the main class called by the Windows Azure instance, so that you don’t have to redesign your code.
  • Gracefully deal with unhandled exceptions in the threads.

RoleEntryPoint

The RoleEntryPoint class includes methods that are called by Windows Azure when it starts, runs, or stops a web or worker role. You can optionally override these methods to manage role initialization, role shutdown sequences, or the execution thread of the role. A worker role must extend the RoleEntryPoint class. For more information about how the Windows Azure fabric calls the RoleEntryPoint class read: Overview of the Role Lifecycle.

 

The included framework provides the ThreadedRoleEntryPoint class, which inherits from RoleEntryPoint and is used as a substitute for RoleEntryPoint in WorkerRole.cs. Here is an example of using the framework:

 

public class WorkerRole : ThreadedRoleEntryPoint
{
    public override void Run()
    {
        // This is a sample worker implementation. Replace with your logic.
        Trace.WriteLine("Worker Role entry point called", "Information");

        base.Run();
    }

    public override bool OnStart()
    {
        List<WorkerEntryPoint> workers = new List<WorkerEntryPoint>();

        workers.Add(new ImageSizer());
        workers.Add(new ImageSizer());
        workers.Add(new ImageSizer());
        workers.Add(new HouseCleaner());
        workers.Add(new TurkHandler());
        workers.Add(new Crawler());
        workers.Add(new Crawler());
        workers.Add(new Crawler());
        workers.Add(new Gardener());
        workers.Add(new Striker());

        return base.OnStart(workers.ToArray());
    }
}

 

Inside the OnStart() method we create a list of class instances that are passed to the ThreadedRoleEntryPoint class; each instance is given its own thread. Also note that some of the classes are listed more than once. Multiple instances of the same class mean that identical work is performed simultaneously on several threads, which also lets us balance the work. In the example above, crawling gets three threads while house cleaning gets one.

 

It would be nice to reuse the RoleEntryPoint class to create subclasses for each thread that we want to start; however, Windows Azure requires that there be one and only one class that subclasses RoleEntryPoint in the cloud project. Because of this restriction I have created an abstract class called WorkerEntryPoint, which all the thread classes must inherit from. This class has the same lifecycle methods as the RoleEntryPoint class:

  • OnStart Method
  • OnStop Method
  • Run Method
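As a rough sketch of what such a base class might look like, the following assumes that Run is abstract and that OnStart and OnStop have do-nothing defaults. The class in the framework download may differ, and HelloWorker and Program are hypothetical names used only for illustration.

```csharp
using System;

// Hypothetical sketch of the base class; the class shipped in the
// framework download may be declared differently.
public abstract class WorkerEntryPoint
{
    // Called once before the worker's thread is started.
    // Returning false signals that initialization failed.
    public virtual bool OnStart()
    {
        return true;
    }

    // The worker's main body; the only method a subclass must implement.
    public abstract void Run();

    // Called when the role is stopping so the worker can release resources.
    public virtual void OnStop()
    {
    }
}

// Hypothetical worker showing that Run is the single required override.
internal class HelloWorker : WorkerEntryPoint
{
    public override void Run()
    {
        Console.WriteLine("HelloWorker running");
    }
}

public static class Program
{
    public static void Main()
    {
        // Walk through the lifecycle in the order the role would.
        WorkerEntryPoint worker = new HelloWorker();
        if (worker.OnStart())
        {
            worker.Run();
        }

        worker.OnStop();
    }
}
```

Mirroring the RoleEntryPoint method names keeps the move from a single-threaded role to a threaded worker mostly mechanical: the body of the old Run method moves into the new class unchanged.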

To create a threaded worker you override these methods in just the same way as if you were inheriting from the RoleEntryPoint class. However, the only method you have to override is the Run method. Typically it would look something like this:

internal class Striker : WorkerEntryPoint
{
    public override void Run()
    {
        while (true)
        {
            // Do Some Work

            Thread.Sleep(100);
        }
    }
}

Handling Exceptions

One thing that a Windows Azure worker role does nicely is to try to stay up and running regardless of errors. If an unhandled exception occurs within the Run method, the process is terminated, and Windows Azure creates and starts a new instance. When an unhandled exception occurs, the stopping event is not raised, and the OnStop method is not called. For more information about how the Windows Azure fabric handles exceptions read: Overview of the Role Lifecycle. The reason .NET terminates the process is that there are many system exceptions, such as OutOfMemoryException, that you can’t recover from without terminating the process.

When an exception is thrown in one of the created threads we want to simulate the Windows Azure behavior as closely as we can. The framework allows Windows Azure to terminate the process on unhandled exceptions; however, it tries to gracefully shut down all threads before it does. Here are the goals for exception handling within the framework:

  • Gracefully restart all threads that throw an unhandled non-system exception without terminating the other threads or process space.
  • Use the managed thread exception handling to terminate the process on unhandled system exceptions.
  • Leverage Windows Azure role recycling to restart threads when the role is restarted.

When there is an unhandled exception on a thread created with the Start method of the Thread class, the unhandled exception causes the application to terminate. For more information about exception handling in threads read: Exceptions in Managed Threads. What this means is that we need to build some exception handling into our threads so that the framework can exit gracefully. The ProtectedRun method accomplishes this:

 

/// <summary>
/// This method prevents unhandled exceptions from being thrown
/// from the worker thread.
/// </summary>
public void ProtectedRun()
{
    try
    {
        // Call the Workers Run() method
        Run();
    }
    catch (SystemException)
    {
        // Exit Quickly on a System Exception
        throw;
    }
    catch (Exception exception)
    {
        // Perform Error Logging or Diagnostic
    }
}

 

The ThreadedRoleEntryPoint class, which manages all the threads, loops across them and restarts any that have terminated (see code below). A thread terminates when it exits the ProtectedRun method, as opposed to an unhandled exception, which terminates the process space.

while (!EventWaitHandle.WaitOne(0))
{
    // WWB: Restart Dead Threads
    for (Int32 i = 0; i < Threads.Count; i++)
    {
        if (!Threads[i].IsAlive)
        {
            // Restart through ProtectedRun so the new thread keeps its exception handling
            Threads[i] = new Thread(Workers[i].ProtectedRun);
            Threads[i].Start();
        }
    }

    EventWaitHandle.WaitOne(1000);
}

 

The while loop terminates when the Windows Azure fabric requests a stop, which sets the EventWaitHandle to a signaled state. The EventWaitHandle is safe to use across threads, which matters because the stop request is made on a different thread.
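To see the restart loop and the stop signal working together outside of Windows Azure, here is a small console sketch. It is not the framework's code: FlakyWorker, WorkFailedException, and RunDemo are hypothetical names, a ManualResetEvent stands in for the framework's EventWaitHandle, and a timer thread stands in for the fabric's stop request. The sketch restarts workers through ProtectedRun so that a restarted thread keeps its exception handling.

```csharp
using System;
using System.Threading;

// Hypothetical exception type. It derives directly from Exception, so the
// catch (SystemException) filter in ProtectedRun does not rethrow it.
internal class WorkFailedException : Exception
{
    public WorkFailedException(string message) : base(message) { }
}

// Hypothetical worker: fails on its first start, then idles until stopped.
internal class FlakyWorker
{
    private readonly ManualResetEvent stopSignal;
    private int starts;

    public FlakyWorker(ManualResetEvent stopSignal) { this.stopSignal = stopSignal; }

    public int Starts { get { return starts; } }

    public void Run()
    {
        if (Interlocked.Increment(ref starts) == 1)
        {
            throw new WorkFailedException("simulated failure");
        }

        stopSignal.WaitOne(); // Stand-in for real work.
    }

    // Same shape as the ProtectedRun method shown earlier.
    public void ProtectedRun()
    {
        try { Run(); }
        catch (SystemException) { throw; }
        catch (Exception exception)
        {
            Console.WriteLine("Worker failed: " + exception.Message);
        }
    }
}

public static class Program
{
    // Runs the supervisor loop and returns how many times the worker was started.
    public static int RunDemo()
    {
        ManualResetEvent stopSignal = new ManualResetEvent(false);
        FlakyWorker[] workers = { new FlakyWorker(stopSignal) };
        Thread[] threads = new Thread[workers.Length];

        for (int i = 0; i < workers.Length; i++)
        {
            threads[i] = new Thread(workers[i].ProtectedRun);
            threads[i].Start();
        }

        // Stand-in for the fabric's stop request, made from another thread.
        Thread stopper = new Thread(() => { Thread.Sleep(2500); stopSignal.Set(); });
        stopper.Start();

        while (!stopSignal.WaitOne(0))
        {
            // Restart any worker thread that has exited.
            for (int i = 0; i < threads.Length; i++)
            {
                if (!threads[i].IsAlive)
                {
                    threads[i] = new Thread(workers[i].ProtectedRun);
                    threads[i].Start();
                }
            }

            stopSignal.WaitOne(1000);
        }

        foreach (Thread thread in threads)
        {
            thread.Join();
        }

        stopper.Join();
        return workers[0].Starts;
    }

    public static void Main()
    {
        Console.WriteLine("Worker starts: " + RunDemo());
    }
}
```

The worker should be started at least twice: once when it fails and once when the loop restarts it. The demo uses a custom exception type because many common .NET exceptions, including InvalidOperationException and ArgumentException, derive from SystemException and would therefore be rethrown by ProtectedRun and end the process.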

Error Reporting

The Windows Azure Diagnostics assembly has classes for collecting and logging data. These classes can be used inside a worker role to report information and errors about the state of the running role. For more information about diagnostics in Windows Azure see: Collecting Logging Data by Using Windows Azure Diagnostics.

 

When using the threaded role framework you need to call the diagnostic classes from the threaded class (which inherits from the WorkerEntryPoint class) when you have something to report. The reason is that unhandled system exceptions in the threaded class will cause the process space to terminate, so your last chance to catch and log error information is in the threaded class. Generally I use a try/catch in the Run method to catch all exceptions, report them to diagnostics, and then re-throw the exception, which is no different from what I would implement in the default single-threaded worker role.
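A minimal sketch of that catch, report, and re-throw pattern follows. It assumes errors are reported through System.Diagnostics.Trace, as in the worker role sample above; ReportingWorker and DoWork are hypothetical names, and the console listener is only there so the sketch runs outside of Windows Azure.

```csharp
using System;
using System.Diagnostics;

// Hypothetical worker; in the framework it would inherit from WorkerEntryPoint.
internal class ReportingWorker
{
    public void Run()
    {
        try
        {
            DoWork();
        }
        catch (Exception exception)
        {
            // Last chance to record what went wrong before the exception
            // leaves this class.
            Trace.TraceError("Worker failed: {0}", exception);

            // A bare throw re-throws the same exception and keeps its stack trace.
            throw;
        }
    }

    // Stand-in for real work that fails.
    private void DoWork()
    {
        throw new Exception("simulated failure");
    }
}

public static class Program
{
    public static void Main()
    {
        // Outside of Windows Azure, send trace output to the console.
        Trace.Listeners.Add(new TextWriterTraceListener(Console.Out));

        try
        {
            new ReportingWorker().Run();
        }
        catch (Exception exception)
        {
            Console.WriteLine("Re-thrown to caller: " + exception.Message);
        }

        Trace.Flush();
    }
}
```

Using a bare throw rather than throw exception matters here: the former preserves the original stack trace, which is usually the most useful part of the logged error.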

Summary

The included framework enables multi-threaded Windows Azure worker roles within the same context of the Windows Azure Service Runtime. Using multiple threads can increase your processor utilization when your application is waiting for network requests, such as calls to SQL Azure, to complete.

 

Download the Framework.

 


12 comments:

  1. Thanks for this code example and well written post.

    I get this error when I try to start it in Debug mode:

    Microsoft.WindowsAzure.ServiceRuntime Critical: 201 : Role entrypoint could not be created:
    System.MissingMethodException: Cannot create an abstract class.

    Any idea, what can be the problem?

    /Thomas

  2. I ran into the same problem and it's partially our fault. If we had included these classes as a separate assembly, everything would have worked fine, but we pasted them directly into the worker role project. Azure searches the worker role assembly for the first class that derives from RoleEntryPoint, and tries to load the abstract class in this library.

    That said, Azure really shouldn't try to load an abstract class, but once you've moved it out into a class library project and referenced it, things will be fine.

  3. I have this working under local dev mode... awesome! I deployed it into Azure... and NOTHING happens...! I even added a tracer/dblog entry in the code under the Run() and OnStart().. still nothing.. Any ideas?

  4. I am also facing similar problem, it works in local dev mode, but on azure, its not working, services, keeps retrying

  5. It is not working. Looks good but its just not working :(

  6. So why aint it working on Azure but only in local dev mode?

  7. After downloading the source code, I realized that the ThreadedRoleEntryPoint is an abstract class and the WorkerEntryPoint isn't. However, the author wrote in his post that the WorkerEntryPoint should be an abstract class. This is the cause of the MissingMethodException thrown when implementing this. Changing this has solved the problem for me.

  8. Any thoughts on how make the web.config settings accessible to the worker threads?

  9. Hello,
    What is a worker role?
    Can you give me some examples of worker roles in Windows Azure, and also explain how to deploy a worker role to Windows Azure?

    Please help me, I am new to the cloud.

  10. When you restart dead threads you're calling Run() instead of ProtectedRun()

    if (!Threads[i].IsAlive)
    {
    Threads[i] = new Thread(Workers[i].Run); <----
    Threads[i].Start();
    }

  11. There's something I don't get from this code. Inside the OnStop() method of the ThreadedRoleEntryPoint class, the threads on which the ProtectedRun() method is running (i.e. where the Run() method of the threaded roles is running) are aborted, and after that the OnStop() method is called for each threaded role. The way I understand the worker role life cycle, OnStop() should be called first, then the framework should wait for some time, and only if the Run() thread is still running should it be aborted. Is this correct?
