Running Multiple Threads on Windows Azure
The default worker role created by Visual Studio 2010 generates code that uses only a single thread in your compute instance. With that generated code, if you make synchronous network requests to SQL Azure or to a web service (for example via REST), the dedicated core for your instance sits underutilized while it waits for the response from the network. One technique is to use the asynchronous functions in ADO.NET and the HttpWebRequest class to offload the work to the background. For more information about asynchronous calls read: Asynchronous Programming Design Patterns. Another technique, which I will cover in this blog post, is to start up multiple threads, each dedicated to a task. For this purpose I have coded a multi-threaded framework to use in your worker role.
Goals of the framework:
- Remain true to the design of the RoleEntryPoint class, the main class called by the Windows Azure instance, so that you don’t have to redesign your code.
- Gracefully deal with unhandled exceptions in the threads.
RoleEntryPoint
The RoleEntryPoint class includes methods that are called by Windows Azure when it starts, runs, or stops a web or worker role. You can optionally override these methods to manage role initialization, role shutdown sequences, or the execution thread of the role. A worker role must extend the RoleEntryPoint class. For more information about how the Windows Azure fabric calls the RoleEntryPoint class read: Overview of the Role Lifecycle.
The included framework provides the ThreadedRoleEntryPoint class, which inherits from RoleEntryPoint and is used as a substitute for RoleEntryPoint in WorkerRole.cs. Here is an example of using the framework:
public class WorkerRole : ThreadedRoleEntryPoint
{
    public override void Run()
    {
        // This is a sample worker implementation. Replace with your logic.
        Trace.WriteLine("Worker Role entry point called", "Information");

        base.Run();
    }

    public override bool OnStart()
    {
        List<WorkerEntryPoint> workers = new List<WorkerEntryPoint>();

        workers.Add(new ImageSizer());
        workers.Add(new ImageSizer());
        workers.Add(new ImageSizer());
        workers.Add(new HouseCleaner());
        workers.Add(new TurkHandler());
        workers.Add(new Crawler());
        workers.Add(new Crawler());
        workers.Add(new Crawler());
        workers.Add(new Gardener());
        workers.Add(new Striker());

        return base.OnStart(workers.ToArray());
    }
}
Inside the OnStart() method we create a list of class instances that are passed to the ThreadedRoleEntryPoint class; each instance is given its own thread. Also note that some of the classes are listed more than once. Listing a class several times means that identical work is performed simultaneously on several threads, which also lets us balance the work. In the example above crawling gets three threads and house cleaning gets one, so crawling is treated as three times more important than house cleaning.
It would be nice to reuse the RoleEntryPoint class to create subclasses for each thread that we want to start; however, Windows Azure requires that there be one and only one class that subclasses RoleEntryPoint in the cloud project. Because of this restriction I have created an abstract class called WorkerEntryPoint, which all the thread classes must inherit from. This class has the same lifecycle methods as the RoleEntryPoint class:
- OnStart Method
- OnStop Method
- Run Method
To create a threaded worker you override these methods in just the same way as if you were inheriting from the RoleEntryPoint class. However, the only method you have to override is the Run method. Typically it would look something like this:
internal class Striker : WorkerEntryPoint
{
    public override void Run()
    {
        while (true)
        {
            // Do Some Work
            Thread.Sleep(100);
        }
    }
}
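For orientation, here is a minimal sketch of what a base class with those three lifecycle methods could look like. It is an assumption reconstructed from the description above, not the framework's actual source: the default return value of OnStart and the CountingWorker class are hypothetical and exist only for illustration.

```csharp
// Hypothetical sketch of the abstract base class described in the post;
// the downloadable framework's real source may differ.
public abstract class WorkerEntryPoint
{
    // Called before the worker's thread starts. Returning true by default is
    // an assumed convention that mirrors RoleEntryPoint.OnStart.
    public virtual bool OnStart()
    {
        return true;
    }

    // The worker's main body; the only member a subclass must override.
    public abstract void Run();

    // Called when the role is stopping; the default does nothing.
    public virtual void OnStop()
    {
    }
}

// Hypothetical worker used only to show the override pattern.
public class CountingWorker : WorkerEntryPoint
{
    public int Iterations { get; private set; }

    public override void Run()
    {
        // A real worker would loop until stopped; three passes keep the sketch finite.
        for (int i = 0; i < 3; i++)
        {
            Iterations++;
        }
    }
}
```

Making Run abstract while leaving OnStart and OnStop virtual is one way to get the "only Run is required" behavior described above.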
Handling Exceptions
One thing that a Windows Azure worker role does nicely is to try to stay up and running regardless of errors. If an exception occurs within the Run method, the process is terminated, and a new instance is created and started by Windows Azure. When an unhandled exception occurs, the stopping event is not raised, and the OnStop method is not called. For more information about how the Windows Azure fabric handles exceptions read: Overview of the Role Lifecycle. The reason .NET terminates the process is that there are many system exceptions, such as OutOfMemoryException, that you can’t recover from without terminating the process.
When an exception is thrown in one of the created threads we want to simulate the Windows Azure process as closely as we can. The framework allows Windows Azure to terminate the process on unhandled exceptions; however, it tries to gracefully shut down all threads before it does. Here are the goals for exception handling within the framework:
- Gracefully restart all threads that throw an unhandled non-system exception without terminating the other threads or process space.
- Use the managed thread exception handling to terminate the process on unhandled system exceptions.
- Leverage Windows Azure role recycling to restart threads when the role is restarted.
When there is an unhandled exception on a thread created with the Start method of the Thread class, the unhandled exception causes the application to terminate. For more information about exception handling in threads read: Exceptions in Managed Threads. What this means is that we need to build some exception handling into our threads so that the framework can exit gracefully. The ProtectedRun method accomplishes this:
/// <summary>
/// This method prevents unhandled exceptions from being thrown
/// from the worker thread.
/// </summary>
public void ProtectedRun()
{
    try
    {
        // Call the Workers Run() method
        Run();
    }
    catch (SystemException)
    {
        // Exit Quickly on a System Exception
        throw;
    }
    catch (Exception exception)
    {
        // Perform Error Logging or Diagnostic
    }
}
The main ThreadedRoleEntryPoint class, which manages all the threads, loops across them and restarts any that have terminated (see code below). A thread terminates when it exits the ProtectedRun method, as opposed to an unhandled exception, which terminates the process space.
while (!EventWaitHandle.WaitOne(0))
{
    // WWB: Restart Dead Threads
    for (Int32 i = 0; i < Threads.Count; i++)
    {
        if (!Threads[i].IsAlive)
        {
            Threads[i] = new Thread(Workers[i].Run);
            Threads[i].Start();
        }
    }

    EventWaitHandle.WaitOne(1000);
}
The while loop terminates when a stop is requested from the Windows Azure Fabric, by tripping the EventWaitHandle to a signaled state. The EventWaitHandle provides thread protection, since the stopping request is made on a different thread.
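The interplay between the restart loop and the stop signal can be tried outside of Windows Azure. The following is a minimal, self-contained sketch under stated assumptions: SupervisorDemo, ShortLivedWork, and the timings are hypothetical, and a plain timer thread stands in for the stop request that the fabric would make.

```csharp
using System.Threading;

public static class SupervisorDemo
{
    // Manual-reset handle, initially unsignaled. Set() plays the part of the stop request.
    private static readonly EventWaitHandle StopRequested =
        new EventWaitHandle(false, EventResetMode.ManualReset);

    private static int runCount;

    // Stand-in worker body that returns immediately, so its thread dies quickly.
    private static void ShortLivedWork()
    {
        Interlocked.Increment(ref runCount);
    }

    // Supervises one worker until the stop signal arrives and
    // returns how many times the worker body was started.
    public static int Supervise(int stopAfterMilliseconds)
    {
        StopRequested.Reset();
        runCount = 0;

        Thread worker = new Thread(ShortLivedWork);
        worker.Start();

        // Simulates the stop request arriving on a different thread.
        Thread stopper = new Thread(() =>
        {
            Thread.Sleep(stopAfterMilliseconds);
            StopRequested.Set();
        });
        stopper.Start();

        while (!StopRequested.WaitOne(0))
        {
            if (!worker.IsAlive)
            {
                // A Thread object cannot be started twice, so a new one is created.
                worker = new Thread(ShortLivedWork);
                worker.Start();
            }

            // Sleeps between passes, but wakes immediately once the handle is signaled.
            StopRequested.WaitOne(20);
        }

        worker.Join();
        stopper.Join();
        return runCount;
    }
}
```

Calling SupervisorDemo.Supervise(200) should start the worker body several times before the loop exits. Waiting on the handle instead of calling Thread.Sleep is what lets the loop react to a stop request without finishing the full delay.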
Error Reporting
The Windows Azure Diagnostics assembly has classes for collecting and logging data. These classes can be used inside a worker role to report information and errors about the state of the running role. For more information about diagnostics in Windows Azure see: Collecting Logging Data by Using Windows Azure Diagnostics.
When using the threaded role framework you need to call the diagnostic classes from the threaded class (which inherits from the WorkerEntryPoint class) when you have something to report. The reason behind this is that unhandled system exceptions in the threaded class will cause the process space to terminate. Your last chance to catch and log error information is in the threaded class. Generally I use a try/catch in the Run method to catch all exceptions, report them to diagnostics, and then re-throw the exception, which is no different from what I would implement in the default single-threaded worker role.
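The catch, report, and re-throw pattern can be sketched as follows. This is an illustration under assumptions rather than the framework's code: ReportingWorker and DoWork are hypothetical names, the class is kept stand-alone so the sketch compiles on its own (in the framework it would inherit from WorkerEntryPoint and Run would be an override), and whether the Trace output reaches Windows Azure Diagnostics depends on the trace listeners configured for the role.

```csharp
using System;
using System.Diagnostics;

// Hypothetical stand-alone worker; in the framework this class would inherit
// from WorkerEntryPoint and Run() would be an override.
public class ReportingWorker
{
    public void Run()
    {
        try
        {
            DoWork();
        }
        catch (Exception exception)
        {
            // Report through System.Diagnostics.Trace; where the message ends up
            // depends on the configured trace listeners.
            Trace.TraceError("ReportingWorker failed: {0}", exception);

            // "throw;" rather than "throw exception;" keeps the original stack trace.
            throw;
        }
    }

    // Placeholder for the real work; it fails on purpose to exercise the catch block.
    private void DoWork()
    {
        throw new InvalidOperationException("Simulated failure");
    }
}
```

Because the exception is re-thrown, the caller still sees the failure and can decide whether to restart the thread or let the process terminate.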
Summary
The included framework enables multi-threaded Windows Azure worker roles within the same context of the Windows Azure Service Runtime. Using multiple threads can increase your processor utilization when your application is waiting for network requests, such as calls to SQL Azure, to complete.
Thanks for this code example and well written post.
I get this error when I try to start it in Debug mode:
Microsoft.WindowsAzure.ServiceRuntime Critical: 201 : Role entrypoint could not be created:
System.MissingMethodException: Cannot create an abstract class.
Any idea, what can be the problem?
/Thomas
I ran into the same problem and it's partially our fault. If we had included these classes as a separate assembly, everything would have worked fine, but we pasted them directly into the worker role project. Azure searches the worker role assembly for the first class that derives from RoleEntryPoint, and tries to load the abstract class in this library.
That said, Azure really shouldn't try to load an abstract class, but once you've moved it out into a class library project and referenced it, things will be fine.
Perfect!
I have this working under local dev mode... awesome! I deployed it into Azure... and NOTHING happens...! I even added a tracer/dblog entry in the code under the Run() and OnStart().. still nothing.. Any ideas?
I am also facing a similar problem: it works in local dev mode, but on Azure it's not working, and the service keeps retrying.
It is not working. Looks good but it's just not working :(
So why isn't it working on Azure, but only in local dev mode?
After downloading the source code, I realized that ThreadedRoleEntryPoint is an abstract class and WorkerEntryPoint isn't. However, the author wrote in his post that WorkerEntryPoint should be an abstract class. This is the cause of the MissingMethodException thrown when implementing this. Changing this has solved the problem for me.
Any thoughts on how to make the web.config settings accessible to the worker threads?
Hello,
What is a worker role?
Can you give me some examples of worker roles in Windows Azure, and also explain how to deploy this worker role to Windows Azure?
Please help me, I am new to the cloud.
When you restart dead threads you're calling Run() instead of ProtectedRun()
if (!Threads[i].IsAlive)
{
Threads[i] = new Thread(Workers[i].Run); <----
Threads[i].Start();
}
There's something I don't get from this code. Inside the OnStop() method of the ThreadedRoleEntryPoint class, the threads on which the ProtectedRun() method is running (i.e. where the Run() method of the threaded roles is running) are aborted, and after that the OnStop() method is called for each threaded role. The way I understand the worker role life cycle, OnStop() should be called first, then there should be a wait for some time, and only if the Run() thread is still running should it be aborted. Is this correct?
I found the code you wrote here useful. I would like to know if you are OK with me using it without restriction, for example, under MIT licensing? If not, what licensing applies?
Links to Overview of the Role Lifecycle are no longer working.