Microsoft CCR: clean way to write parallel code in .Net

Since we have to parallelize almost every bit of code here at Delver, some common patterns emerged while we wrote a lot of back-end code.
I remember reading about a new framework from Microsoft's Robotics Division called "Microsoft CCR" (Concurrency and Coordination Runtime) a few months ago in the "Concurrent Affairs" column of MSDN Magazine, but I didn't pay much attention to it at the time. Two weeks ago it jumped back to my mind, so I revisited the article and started diving a little deeper into it, thinking about what sort of problems it could solve in my world and, if so, where I could use it to our benefit. If you don't know anything about the CCR, there is great public content published already, like the CCR User Guide, but I'll try to give you a 2-minute intro to the general architecture.

The way the CCR is built is very much like the SOA world, using messages to communicate between "services". The major components are the Dispatcher, which is essentially an array of OS threads, and the DispatcherQueue, which holds a queue of delegates to run; the Dispatcher "looks" at the DispatcherQueue and, when it has a free OS thread available, pulls one delegate out of the queue and runs it. So far – a classic ThreadPool. There are some differences, but I'll let you read about them in the User Guide. The third component, which is the most important one, is the Port. Think about a Port as a pipe that can receive messages (FIFO style – first in, first out) and hold them until someone knows what to do with them. The last component is the "manager", the Arbiter; the Arbiter exposes a set of methods that allow you to listen to a given pipe and, if some conditions are met on the messages the pipe contains, take the message(s) and transform them into a runnable delegate placed in the DispatcherQueue.


One of the goals of this library is to make sure you've got far fewer places to go wrong, by exposing a set of common async patterns you can easily use to guarantee clean(er) code that is easier to read. Think about sending messages from one pipe to another, creating flow-based code via messages rather than spaghetti code with a lot of messy indentation. This is a very powerful architecture.
Obviously, the entire CCR framework is thread-safe by design, so there is no need to protect it with locks of your own. A simple example:



using (Dispatcher dispatcher = new Dispatcher(5, "my_dispatcher")) // 5 OS threads
using (DispatcherQueue queue = new DispatcherQueue("my_queue", dispatcher))
{
    Port<Uri> urlsPort = new Port<Uri>();

    Arbiter.Activate(queue,
        Arbiter.Receive(true, urlsPort, delegate(Uri uri) {
            // some code (runs async!): for example, we can fetch the uri content (HTML) and persist it to your hard disk...
         })
    );

    urlsPort.Post(new Uri("http://www.lnbogen.com"));
    urlsPort.Post(new Uri("http://www.delver.com"));
}


There is no need to create an array of Threads and some sort of Queue<Uri> in order to pull the items. The “ThreadPool” is implemented as part of the CCR.
So far, no big deal, right? Well, it turns out that you can easily write common patterns with much less complexity: almost no locks, less spaghetti code and much less code in general.


One of the patterns we (all of us) use a lot is the "execute-and-wait-till-finish" pattern, where you've got a list of independent items you want to run in parallel, but you want your main thread to wait for them to finish before continuing. The simplest way to achieve it is by creating an array of Thread, activating them with a delegate and then calling Join() on each one of the threads. Let's define a few more requirements for this pattern:



  1. We want to be able to know about all the errors that occurred during the execution.
  2. We want to be able to set a timeout so each operation (inside a thread) won't take more than a sensible amount of time.
  3. We want to be able to know which items were timed out and when.
  4. We want to be able to know which items were completed successfully.
  5. BONUS: We want to avoid writing the obvious. 

Well, in order to implement these requirements from scratch, we need an outer timer with some sort of List (for example) so each thread can "register" with it when it begins and "unregister" when it's done. The timer should be able to interrupt a thread, and should be optimized to "wake up" as soon as possible (determined by the registered threads and which thread is next in line to time out). In addition, we need some sort of List of exceptions to collect all the exceptions that occurred, and we must make sure we lock the shared objects. We'll also need a Thread[] and some sort of Queue to enqueue\dequeue items to\from. A lot of code for a simple pattern.
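Just to ground the comparison, here is a minimal sketch of that naive Thread[]/Join skeleton (the NaiveSpawnAndWait helper is hypothetical, and it deliberately ignores the timeout and error-collection requirements above):


using System;
using System.Collections.Generic;
using System.Threading;

// A minimal sketch of the naive approach: one thread per item, then Join() on each.
public static class NaiveSpawnAndWait
{
   public static void Execute<T>(IList<T> items, Action<T> action)
   {
      Thread[] threads = new Thread[items.Count];
      for (int i = 0; i < items.Count; i++)
      {
         T item = items[i]; // capture the current item, not the loop variable
         threads[i] = new Thread(delegate() { action(item); });
         threads[i].Start();
      }

      // Block the calling thread until every worker is done.
      foreach (Thread t in threads)
         t.Join();
   }
}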

With Microsoft CCR it’s much easier.
Assuming that we want to handle a list of strings:


List<string> messagesToWorkOn = new List<string>();
for (int i = 0; i < 10; i++)
   messagesToWorkOn.Add("message " + i);


Here is the final API I’ve implemented on top of the CCR:


using (SpawnAndWaitPattern<string> saw = new SpawnAndWaitPattern<string>(5, "mythreadspool"))
{
   AggregatedResult<string> result = saw.Execute(messagesToWorkOn, TimeSpan.FromMilliseconds(500),
                                             delegate(string msg)
                                             {
                                                if (msg.Contains("1"))
                                                   Thread.Sleep(2000); // simulate time-out
                                                else if (msg.Contains("5"))
                                                   throw new ArgumentException("5 is a very bad value…"); // simulate exception

                                                Console.WriteLine("The message: " + msg + " processed successfully.");
                                             });

   Console.WriteLine("Done!");
   Console.WriteLine("Summarized Report:\n completed results: "
      + result.SuccessfulItems.Count + "\n exceptions occurred: " + result.FaultedItems.Count
      + "\n timed-out results: " + result.TimedoutItems.Count);
}


We’ve got 5 OS Threads, we’re waiting for up to 0.5 second per item and we’ve got a full result object, holding all the requirements from above.
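To give a sense of what Execute returns, here is a plausible shape for AggregatedResult<T> (this is my guess at the contract; the real class lives in the attached solution):


using System;
using System.Collections.Generic;

// A plausible shape for the result object (assumption - the real implementation ships in the attached CcrPatterns solution).
public class AggregatedResult<T>
{
   public readonly List<T> SuccessfulItems = new List<T>();           // completed without error
   public readonly List<KeyValuePair<T, Exception>> FaultedItems =
      new List<KeyValuePair<T, Exception>>();                         // item + the exception it threw
   public readonly List<T> TimedoutItems = new List<T>();             // exceeded the per-item timeout
}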


The code of the SpawnAndWaitPattern class is quite neat and contains 0 locks on my behalf (the CCR manages its own locks). The CCR schedules the tasks for me; combine that with the thread-safe Port and we've got a very powerful yet simple framework on our hands. I decided to attach the entire solution (with A LOT of nice-green-comments) including the Ccr.Core.dll file so you could play with it:


CcrPatterns.rar (162.64 KB)


Have fun.

 

Making WCF Proxy useable

The way WCF proxies are designed is to live until shi* happens.

Let's assume that we have a CalculatorService with one method named Divide(int a, int b). Sasha, a cool programmer-dude, trying to produce some useful software, writes:


public class MyCalculatorForm : Form
{
   private CalculatorProxy _calc = new CalculatorProxy();

   Calc_Click(…) {
      _calc.Divide(firstNumber, secondNumber);
   }
}


What is the first error you can think of that could happen? Yep, DivideByZeroException.
Once the proxy gets an exception, it enters into a "Faulted" state, which makes the proxy unusable (= you cannot use it again).
The quickest solution is to work “by the book” and create a new instance each and every time we need to call the service:


Calc_Click(…) {
   using (CalculatorProxy calc = new CalculatorProxy())
      calc.Divide(firstNumber, secondNumber);
}


But what’s bad in this solution?



  1. Performance – you pay (not a lot, but not a little either) for each creation of the proxy. Sure, it will probably not be your bottleneck, but heck, why pay it at all? Most of the time the proxy will not throw an exception, and yet we need to create it every time just to avoid the faulted-state scenario. 
  2. Design – If we declare this exception BY CONTRACT, I would expect the proxy to still be usable afterwards. Do we really want to return an Enum\int\string as a status instead of throwing an exception, just because of poor design?
  3. TDD – you know that I'm in love with it. Now imagine Dependency Injection. Component A receives ICalculatorProxy and uses it to… calculate something. Working "by the book" is no good, as we want to receive an instance of the proxy from the outside in order to mock it. Right, so we inject a proxy from the outside (got to love Windsor) and life is pretty sweet. Darn! Wait! One poor (even by-design) exception and our proxy goes dead. Very un-TDDish of Microsoft.

I had to come up with a solution, as no one will take TDD away from me. I present to you ProtectedProxy: this little IL-code-at-the-end-of-the-day will enable you to recover from the faulted state by creating a new proxy on each exception, thus making your proxy… usable (I couldn't think of a better word to describe it). Think about a situation where your proxy is trying to call the service but the service is down; in Semingo, we decided that we want to keep trying until the service is up. Via ProtectedProxy, you can determine how many times you want to recover and when you should finally kill the proxy. Oh yea, ProtectedProxy uses Windsor in order to create new proxies when needed and logs messages via log4net. Good stuff.


In the example above, all Sasha had to do was to:
1). Initialize the _calc field by:
        ProtectedProxy<ICalculatorProxy> _calc = new ProtectedProxy<ICalculatorProxy>(new CalculatorProxy());
2). call _calc via:
        _calc.Instance.Divide(firstNumber, secondNumber);

But enough said, code please:


// Written by Oren Ellenbogen (07.08.07) – trying to protect our proxies so they could recover from:
// (A) The service is not up yet, but we want to try again later.
// (B) The service throws (ANY) exception, we still want our proxy to function (bubble the exception, but still keep on working).
// Microsoft intended to use a NEW proxy per call, but for TDD this is not ideal as we would like to inject proxies from outside as mocks
// in order to simulate multiple scenarios.

#region using

using System;
using System.Reflection;
using System.ServiceModel;
using Castle.Core.Resource;
using Castle.Windsor;
using Castle.Windsor.Configuration.Interpreters;
using log4net;

#endregion

namespace Semingo.Services.Proxies.Helpers
{
   
   public interface IProxy : ICommunicationObject { 
      bool Ping();
   }


   /// <summary>
   /// Protect the proxy from entering the Faulted state by re-creating the proxy via the Windsor Container on Faulted.
   /// IMPORTANT: note that even if a fatal exception is raised by the service (for example, the service is not up yet), the proxy will be re-created.
   /// Use it wisely (TIP: you CAN determine the number of 'recovery' attempts).
   /// </summary>
   /// <typeparam name="I">The proxy interface to protect</typeparam>
   /// <remarks>
   /// The way WCF works is that ANY exception on the service will cause the proxy to enter the "faulted" state, which means you can not use it anymore.
   /// Imagine a CalculatorService that exposes the method float Divide(int a, int b). Sending b=0 will raise an exception in the service
   /// and the proxy will get into the faulted state. This is not ideal, as the proxy itself should be usable again.
   /// </remarks>
   public class ProtectedProxy<I> : IDisposable
      where I : IProxy
   {
      private static readonly ILog _logger = LogManager.GetLogger(MethodBase.GetCurrentMethod().DeclaringType); 

      private I _instance;
      private readonly IWindsorContainer _container;
      private int _faultedCounter = 0;
      private bool _disposed = false;
      private const int AlertableNumberOfFaultedTimes = 10; 

      public ProtectedProxy(I instance)
         : this(CreateXmlBasedWindsorContainer(), instance)
      {   
      } 

      public ProtectedProxy(IWindsorContainer container, I instance)
      {
         _container = container;
         ShieldInstance(instance);
      } 

      /// <summary>
      /// Returns the number of faults this proxy had so far
      /// </summary>
      public int NumberOfFaults
      {
         get { return _faultedCounter; }
      } 

      public I Instance
      {
         get
         {
            ThrowIfInstanceAlreadyDisposed(); 

            if (_instance.State == CommunicationState.Faulted || _instance.State == CommunicationState.Closed || _instance.State == CommunicationState.Closing)
            {
               _logger.Warn("Notice: The proxy state is invalid (" + _instance.State + "). The Faulted event should have been raised and handled this state - this needs to be checked.");
               HandleFaultedInstance();
            }

            return _instance;
         }
      } 

      public void Close()
      {
         Dispose();
      } 

      private void ThrowIfInstanceAlreadyDisposed()
      {
         if (_disposed)
            throw new ObjectDisposedException("The protected proxy for the type: " + _instance.GetType().FullName + " was closed. Cannot return a live instance of this type.");
      } 

      private void ShieldInstance(I instance)
      {
         _instance = instance; 
         _instance.Faulted += delegate { HandleFaultedInstance(); };
      } 

      private void HandleFaultedInstance()
      {
         ThrowIfInstanceAlreadyDisposed(); 

         _faultedCounter++; 

         if (_faultedCounter >= AlertableNumberOfFaultedTimes)
            _logger.Warn("ALERT! The proxy for the type " + _instance.GetType().FullName + " got faulted for the " + _faultedCounter + " time. Recreating the proxy but we must verify if this is valid.");
         else if (_logger.IsDebugEnabled)
            _logger.Debug("Proxy for type " + _instance.GetType().FullName + " got faulted (current state: " + ((ICommunicationObject)_instance).State + ") - recreating the proxy. Number of faulted instances so far: " + _faultedCounter + ".");

         ProxyHelper.CloseProxy(_instance); // close current proxy
         ShieldInstance(_container.Resolve<I>()); // re-create the proxy, faulted proxies are no good for further use.
      } 

      private static IWindsorContainer CreateXmlBasedWindsorContainer()
      {
         try
         {
            return new WindsorContainer(new XmlInterpreter(new ConfigResource("castle")));
         }
         catch (Exception err)
         {
            _logger.Warn("Unable to create xml based windsor container, using empty one.", err);
            return new WindsorContainer(); // for testing (the proxy will be mocked anyway).
         }
      } 

      #region IDisposable Members 

      ///<summary>
      ///Performs application-defined tasks associated with freeing, releasing, or resetting unmanaged resources.
      ///</summary>
      public void Dispose()
      {
         Dispose(true);
         GC.SuppressFinalize(this);
      } 

      protected virtual void Dispose(bool disposing)
      {
         _logger.Info("Attempting to dispose the protected proxy for the type: " + _instance.GetType().FullName + ", disposed already? " + _disposed);

         if (_disposed) return;

         if (disposing)
         {
            ProxyHelper.CloseProxy(_instance);
         } 

         _disposed = true;
      } 

      #endregion
   }


   public static class ProxyHelper
   {
      private static readonly ILog _logger = LogManager.GetLogger(MethodBase.GetCurrentMethod().DeclaringType); 

      public static void CloseProxy(object proxy)
      {
         if (proxy == null) return;
         CloseProxy(proxy as ICommunicationObject);
      } 

      /// <summary>
      /// Close the proxy in a safe manner (will not throw an exception)
      /// </summary>
      /// <param name="proxy">The proxy to close</param>
      public static void CloseProxy(ICommunicationObject proxy)
      {
         if (proxy == null) return;

         try
         {
            if (proxy.State == CommunicationState.Closing || proxy.State == CommunicationState.Closed || proxy.State == CommunicationState.Faulted)
               proxy.Abort();
            else
               proxy.Close();
         }
         catch (CommunicationException)
         {
            proxy.Abort();
         }
         catch (TimeoutException)
         {
            proxy.Abort();
         }
         catch (Exception err)
         {
            _logger.Error(err);
            proxy.Abort();
         }
         finally
         {
            proxy = null;
         }
      }
   }
}


Hours of joy…


Almost forgot, on the next post – “How to TDD WCF code” – stay tuned…

 

Bring lock back until 12, it has a busy day tomorrow

One of the downsides of using lock is obviously performance. While an object is locked, any other thread trying to acquire the lock on that object will wait in line. This can open up a deep performance hole. A rule of thumb when working with locks is to acquire them as late as possible and release them as soon as possible. To demonstrate by what order of magnitude bad usage of locks can affect your performance, I decided to write a little demo. 
So let’s assume we have a component that is responsible for executing tasks while getting new ones in the process (on different threads). I tried to make this example as simple as possible. Let’s start with our “task” class:


public class Task
{
   private int _id;
   private string _name;

   public Task(int id, string name) {
      _id = id;
      _name = name;
   }

   public int Id      { get { return _id; }   set { _id = value; } }
   public string Name { get { return _name; } set { _name = value; } }
}


We have a TasksRunner that's responsible for getting new tasks, saving them to an internal list, and executing the current tasks every X milliseconds (via a timer). In order to simulate a real-life process, I've made sure that executing a single task is expensive. Let's start with the non-optimized solution:


public class TasksRunner
{
   private List<Task> _tasks;
   private System.Timers.Timer _handleTasksTimer;

   public TasksRunner()
   {
      _tasks = new List<Task>();

      _handleTasksTimer = new System.Timers.Timer(200);
      _handleTasksTimer.Elapsed += new System.Timers.ElapsedEventHandler(_handleTasksTimer_Elapsed);
      _handleTasksTimer.Start();
   }


   public void AddTask(Task t)
   {
      lock (_tasks)
      {
         _tasks.Add(t);
         Console.WriteLine("Task added, id: " + t.Id + ", name: " + t.Name);
      }
   }

   //Execute the (delta) tasks in a thread from the ThreadPool
   private void _handleTasksTimer_Elapsed(object sender, System.Timers.ElapsedEventArgs e)
   {
      ExecuteCurrentTasks();
   }

   public void ExecuteCurrentTasks()
   {
      lock (_tasks)
      {
         foreach (Task t in _tasks)
            ExecuteSingleTask(t);
         
         _tasks.Clear();
      }
   }

   private void ExecuteSingleTask(Task t)
   {
      Console.WriteLine("Handling task, id: " + t.Id + ", name: " + t.Name);
      Thread.Sleep(1000); //simulate long run
   }

   // (The TasksAdded / TasksExecuted counters and the Shutdown() method used by the
   //  test harness below are omitted here for brevity.)
}


AddTask will acquire the lock on _tasks and add the new task to the list, while ExecuteCurrentTasks will acquire the lock (on _tasks) and simulate real execution of each task. Notice that during the execution, calling AddTask will wait until the current execution is finished. Using Roy's ThreadTester, we can run the following in order to observe the behavior so far:


static void Main(string[] args)
{
   TasksRunner runner = new TasksRunner();
   ThreadTester threadTester = new ThreadTester();
   threadTester.RunBehavior = ThreadRunBehavior.RunUntilAllThreadsFinish;

   Stopwatch watch = new Stopwatch();
   watch.Start();

   int numberOfTasksToCreate = 100;
   threadTester.AddThreadAction(delegate
      {
         for (int j = 0; j < numberOfTasksToCreate; j++) 
         {
            runner.AddTask(new Task(j, "job " + j));
            Thread.Sleep(100);
         }
      });

   threadTester.StartAllThreads(int.MaxValue); //wait, no matter how long

   Console.WriteLine("Total time so far (milliseconds): " + watch.ElapsedMilliseconds);
   Console.WriteLine("Tasks added so far: " + runner.TasksAdded);
   Console.WriteLine("Tasks executed so far: " + runner.TasksExecuted);
   Console.WriteLine("Waiting for tasks to end…");

   while (runner.TasksExecuted < numberOfTasksToCreate)
      Thread.Sleep(1000);

   runner.Shutdown();

   Console.WriteLine("done!");
   Console.WriteLine("Total time so far (milliseconds): " + watch.ElapsedMilliseconds);
   Console.WriteLine("Tasks added so far: " + runner.TasksAdded);
   Console.WriteLine("Tasks executed so far: " + runner.TasksExecuted);
}


Running this test gives us a very poor result: adding & executing 100 tasks takes around ~99 seconds.


No doubt, holding the lock on _tasks while executing each and every task in the list is too expensive, as it depends on ExecuteSingleTask (which is expensive by itself). This way, each new task we're trying to add must wait until the current execution is finished. An elegant solution to this problem, suggested by my teammate Tomer Gabel, is to use a temporary reference to point to the current tasks, thus freeing the lock much quicker. So here is an optimized version of ExecuteCurrentTasks:


public void ExecuteCurrentTasks()
{
   List<Task> copyOfTasks = null;
   lock (_tasks)
   {
      copyOfTasks = _tasks;
      _tasks = new List<Task>();
   }

   foreach (Task t in copyOfTasks)
      ExecuteSingleTask(t);
}


This little refactoring gives us around ~11 seconds for adding & executing 100 tasks.


Smoking!

 

How to set a free TDD-enabled environment



Download and install the following:

 

Testing framework:


   NUnit (version: 2.4.1 or newer)

 

Code Coverage tool:

 

   NCover (version 1.5.8 or newer)

 

Visual Studio .Net integration tool:

 

   TestDriven.Net (version 2.7.* or newer)

 



Making it play together:

 

First of all, let me apologize for the lame example, it’s kinda late for me to get creative.

Let’s create a Class Library named “Calculator.Library.Tests” and write the following test in it:


[TestFixture]
public class CalculatorTests
{
   [Test]
   public void Divide_TwoValidNumbers()
   {
      Calculator c = new Calculator();

      int expected = 3;
      int actual = c.Divide(6, 2);

      Assert.AreEqual(expected, actual);
   }
}


 

Now we’ll create another project(Class Library) named “Calculator.Library” and write the following code in it:


public class Calculator
{
   public int Divide(int a, int b)
   {
      if (b == 0)
         throw new DivideByZeroException("err");

      return a / b;
   }
}




Quick run of our test (print to console output):


Put the cursor in the test method (Divide_TwoValidNumbers) or in the test class (CalculatorTests) -> right-click -> Run Test(s).


Run tests and see results in a “nice”(depends on your definition for nice) UI:

 

Right-click on the project “Calculator.Library.Tests” -> Test With -> NUnit 2.4

 

View Code Coverage:

 

Right-click on the test-method\test-class\test-project -> Test With -> Coverage. A new application named NCoverExplorer will open – there you can explore the coverage of the code. By default you'll see the coverage of your tests as well, which is not interesting. This can easily be fixed by changing the settings in NCoverExplorer -> View -> Options… -> Exclusions (tab) -> pattern: "*.Tests" -> Add.

 

You can see that we have 75% coverage at the moment:

 

[Screenshot: ncoverexplorer.JPG – NCoverExplorer showing the coverage results]

 

 

We can now add the following test to check the missed path (and then simply call Test With -> Coverage again):


[Test]
[ExpectedException(typeof(DivideByZeroException))]
public void Divide_TryToDivideByZero_ThrowsException()
{
   Calculator c = new Calculator();
   c.Divide(6, 0);
}


 

Easy, Fast, Free and (almost) Fully Integrated in my development environment.
Life is pretty good.

 

How to mock static class or static member for testing

One of the problems with static members or static classes is that you can’t mock them for proper unit-testing. Instead of taking this fact for granted, let’s demonstrate it. Assume that we have the following classes (please note, this is just an example, not production code or anything that I’ll be proud of later on):


public class CsvDataExtractor
{
    private string _documentPath;

    public CsvDataExtractor(string documentPath)
    {
        _documentPath = documentPath;
    }

    public string ExtractFullName()
    {
        string content = CsvDocumentReader.GetContent(_documentPath);
        string fullName = null; // extract full name logic (omitted)
        return fullName;
    }
}


[Test]
public void ExtractFullName_DocumentHasFirstNameAndLastNameOnly()
{
    CsvDataExtractor extractor = new CsvDataExtractor(@"c:\test.csv");

    string result = extractor.ExtractFullName();

    Assert.AreEqual("ellenbogen", result);
}


 

This test is no good, as we can't test ExtractFullName by itself – we're also testing that CsvDocumentReader.GetContent works. In addition, we depend on an external file, because that is what CsvDocumentReader.GetContent expects to receive.

 


Here are our options to solve this dependency so we could test ExtractFullName method by itself:

 

1). Refactor the static class into an "instance" class and implement some sort of IDocumentReader interface.


Now CsvDataExtractor can get an IDocumentReader in its constructor and use it in ExtractFullName. We can mock the interface and determine the result we want to get from the mocked object. Here is the refactored version:


public interface IDocumentReader
{
   string GetContent(string documentPath);
}

public class CsvDataExtractor
{
   private string _documentPath;
   private IDocumentReader _reader;

   public CsvDataExtractor(IDocumentReader reader, string documentPath)
   {
      _documentPath = documentPath;
      _reader = reader;
   }

   public string ExtractFullName()
   {
      string content = _reader.GetContent(_documentPath);
      string fullName = null; // extract full name logic (omitted)
      return fullName;
   }
}

[Test]
public void ExtractFullName_DocumentHasFirstNameAndLastNameOnly()
{
   DocumentReaderStub reader = new DocumentReaderStub();
   reader.ContentToReturn = "oren,ellenbogen";

   CsvDataExtractor extractor = new CsvDataExtractor(reader, "not important document path");

   string result = extractor.ExtractFullName();

   Assert.AreEqual("ellenbogen", result);
}

internal class DocumentReaderStub : IDocumentReader
{
   public string ContentToReturn;
   public string GetContent(string documentPath) { return ContentToReturn; }
}


 

Pros: (1) We can use a mocking framework (Rhino Mocks, for example, is a great choice, and I'm not getting paid for it. I swear) to create stubs\mocks really fast and with almost no code – see the sketch right after the Cons below. (2) Static classes and static members can not be easily tested, so we can save our teammates a few painful minutes\hours by refactoring now. (3) This one relates to reason 1 – if we need to mock several methods of our static class CsvDocumentReader, this is a better solution, as we can achieve it easily with a mocking framework.
Cons: (1) It's not always possible to refactor the original static class. For example, we can't change Microsoft's Guid type to control the static NewGuid() method, assuming that we need to mock it.
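For reference, here is roughly what the same test could look like with Rhino Mocks instead of the hand-rolled stub (record/replay style; treat it as a sketch rather than a copy-paste recipe):


// A rough Rhino Mocks (record/replay) version of the same test - a sketch, not code from the original post.
[Test]
public void ExtractFullName_DocumentHasFirstNameAndLastNameOnly_WithRhinoMocks()
{
   MockRepository mocks = new MockRepository();
   IDocumentReader reader = mocks.CreateMock<IDocumentReader>();

   // Record: whatever path is asked for, return a known CSV line.
   Expect.Call(reader.GetContent(null)).IgnoreArguments().Return("oren,ellenbogen");
   mocks.ReplayAll();

   CsvDataExtractor extractor = new CsvDataExtractor(reader, "not important document path");
   string result = extractor.ExtractFullName();

   Assert.AreEqual("ellenbogen", result);
   mocks.VerifyAll();
}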

 

2). Wrap the static class with an instance class and delegate calls:


We can create an interface that expose all the functionality of the static class. Then all we need to do is wrap it with a simple class that will delegate the calls to the static class:


// the interface should look exactly like the static class we want to wrap !
public interface IDocumentReader
{
   string GetContent(string documentPath);
}

// This class will simply delegate calls to the static class
public class CsvDocumentReaderWrapper : IDocumentReader
{
    public string GetContent(string documentPath)
    {
        return CsvDocumentReader.GetContent(documentPath); // call the original static class
    }
}



The CsvDataExtractor implementation and the tests are exactly the same as option 1.
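In production code, the wiring is then trivial – something along these lines (the path below is just an illustration):


// Production wiring (illustrative path): the real wrapper goes in,
// while the tests keep injecting a stub or a mock.
CsvDataExtractor extractor =
   new CsvDataExtractor(new CsvDocumentReaderWrapper(), @"c:\data\contacts.csv");
string fullName = extractor.ExtractFullName();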

 

Pros: You can enjoy all the Pros of option 1.

Cons: (1) Wrapping has a cost, in performance and especially in maintenance. In addition, you suffer from all of the Cons of option 1.


 

3). Use “protected virtual” method to call the static member and override it:


public class CsvDataExtractor
{
   private string _documentPath;

   public CsvDataExtractor(string documentPath)
   {
      _documentPath = documentPath;
   }

   public string ExtractFullName()
   {
      string content = GetDocumentContent();
      string fullName = null; // extract full name logic (omitted)
      return fullName;
   }

    protected virtual string GetDocumentContent()
    {
        return CsvDocumentReader.GetContent(_documentPath);
    }
}

[Test]
public void ExtractFullName_DocumentHasFirstNameAndLastNameOnly()
{
   TestableCsvDataExtractor extractor = new TestableCsvDataExtractor("not important document path");
   extractor.ContentToReturn = "oren,ellenbogen";

   string result = extractor.ExtractFullName();

   Assert.AreEqual("ellenbogen", result);
}

public class TestableCsvDataExtractor : CsvDataExtractor
{
    public string ContentToReturn;
    
    public TestableCsvDataExtractor(string documentPath) : base(documentPath)
    {
    }
    
    protected override string GetDocumentContent()
    {
        return ContentToReturn;
    }
}


Pros: (1) If we can't refactor the original static class, it's a very fast & simple solution to use (vs. wrapping it).

Cons: (1) Not a perfect solution from an "elegant code" perspective.

 

 

 

I love to use option 3 if refactoring is too hard (or not possible) to achieve, but no doubt option 1 will give you the best results if you're looking a few steps ahead.

 

Writing Thread Safety tests for instance data

In my last post, I wrote about Implementing a simple multi-threaded TasksQueue. This post will concentrate on how to test the queue for thread safety. Reminder: our queue is used by multiple consumers, which means that I must make sure that before each Enqueue\Dequeue\Count, a lock is obtained on the queue. Imagine that I have 1 item in the queue and 2 consumers trying to dequeue this item at the same time from different threads: the first dequeue will work just fine, but the second will throw an exception (dequeue from an empty queue). We're actually trying to make sure that this queue works as expected in a multi-threaded environment. So much for our goal.

So how can we test it?
Testing for the queue's thread safety through the TasksQueue, the way it's written now, can be quite hard and misleading. The ConsumeTask method calls Dequeue inside a lock, but what if we had a thread-safety-related bug there? Do we test only that the dequeue works as expected? Not really. ConsumeTask (1) dequeues an item and then (2) "consumes" it. We're actually testing 2 behaviors\logics – this way, it's really hard to test only for the queue's thread safety. We should always test a single method for a specific behavior and eliminate dependencies. Only when we cover our bases can we check for integration between multiple components (the underlying queue and the TasksQueue).


One way of allegedly achieving this goal is to create a decorator around the queue – let's call it SafeQueue – which will encapsulate a queue and wrap it with thread-safe forwarding of the calls (it will lock some inner object and call the original queue); a minimal sketch of such a decorator follows. The SafeQueue could then be tested on its own and used by our TasksQueue.
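Here is roughly what such a SafeQueue decorator could look like (my illustration; it is not code from the original TasksQueue):


using System.Collections.Generic;

// A minimal SafeQueue decorator (illustration only): every call locks an inner
// object and forwards to the wrapped queue, so each *single* call is atomic.
public class SafeQueue<T>
{
   private readonly object _sync = new object();
   private readonly Queue<T> _inner = new Queue<T>();

   public int Count
   {
      get { lock (_sync) { return _inner.Count; } }
   }

   public void Enqueue(T item)
   {
      lock (_sync) { _inner.Enqueue(item); }
   }

   public T Dequeue()
   {
      lock (_sync) { return _inner.Dequeue(); }
   }
}


This would "enable" us to remove the locking from the TasksQueue itself and use Set\WaitOne (a wait handle) instead of Pulse\Wait in order to notify our consumers on the arrival of a new task – and the consumer side would end up looking like this: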


while (_safeQueue.Count == 0)
   _newTaskSignal.WaitOne(); // _newTaskSignal: some wait handle (e.g. AutoResetEvent) that EnqueueTask would Set

// NOTICE: by the time we get here, someone could have pulled the last item from the queue on another thread!
string item = _safeQueue.Dequeue();


WATCH OUT: This is a deadly solution that will make our TasksQueue break in a multi-threaded environment. The check (Count == 0) and the Dequeue() are two separate atomic operations, and another consumer can sneak in between them and steal the last item. Just like that, our code is not thread-safe anymore, even though we're using a SafeQueue that exposes (atomic) thread-safe methods\properties. This is exactly why instance state should not be thread-safe by default (more details at Joe Duffy's post).


The locking of the queue should remain in our TasksQueue, but we should separate the dequeue part from the handling part and check each one on its own. We'll check the dequeue part for thread-safety (assuming the underlying queue was tested by itself) and the handling part for pure logic. We can then test that for X calls to enqueue we get the same X calls to dequeue.


Here is the refactored code:


private void ConsumeTask()
{
   while (true)
   {
      string task = WaitForTask();

      if (task == null) return; // This signals our exit

      try
      {
         // work on the task
      }
      catch (Exception err)
      {
        // log err & eat the exception, we still want to resume consuming other tasks.
      }
   }
}


protected virtual string WaitForTask()
{
   lock (_locker)
   {
      // if no tasks available, wait up till we’ll get something.
      while (_queue.Count == 0)
         Monitor.Wait(_locker);

      // try to put it outside of the lock statement and run the test (below)
      return _queue.Dequeue();
   }
}


public virtual void EnqueueTask(string task)
{
   lock (_locker)
   {
      _queue.Enqueue(task);
      Monitor.Pulse(_locker);
   }
}


Now we can create a simple test for the thread safety by overriding both of the enqueue\dequeue methods:


internal class TestableTasksQueue : TasksQueue
{
   private static int _dequeueCount = 0;
   private static int _enqueueCount = 0;

   public TestableTasksQueue(int workerCount) : base(workerCount) {}

   protected override string WaitForTask()
   {
      string item = base.WaitForTask();
      Interlocked.Increment(ref _dequeueCount);
      return item;
   }

   public override void EnqueueTask(string task)
   {
      base.EnqueueTask(task);
      Interlocked.Increment(ref _enqueueCount);
   }

   public static int DequeueCount
   {
      get { return _dequeueCount; }
   }

   public static int EnqueueCount
   {
      get { return _enqueueCount; }
   }
}


The tricky part here is the test itself. Because of subtle multi-threading issues, we can't actually know when 2 (or more) threads will try to dequeue at the same time, so we have to run this test enough times in order to detect bugs. Here is a little sample:


[TestFixture]
public class TasksQueueTests
{
   [Test]
   public void Counting_DequeueAndEnqueueCountsShouldBeEqual()
   {
      for (int j = 0; j < 1000; j++)
      {
         using (TestableTasksQueue queue = new TestableTasksQueue(5))
         {
            for (int i = 0; i < 100; i++)
               queue.EnqueueTask("test" + i);
         }

         Assert.AreEqual(TestableTasksQueue.DequeueCount, TestableTasksQueue.EnqueueCount);
      }
   }
}


Well, it's not that elegant, I know, but thread safety is hard to test.
I would love to hear some suggestions from you regarding this issue.

 

Implementing a simple multi-threaded TasksQueue

In one of my tasks, I had to create a simple mechanism that would enable my team to enqueue tasks while multiple threads work on them in the background. I've used the Wait\Pulse pattern in order to signal the workers when a new task is available. Here it goes:


public class TasksQueue : IDisposable
{
   private readonly object _locker = new object();
   private Thread[] _workers;
   private Queue<string> _tasksQueue = new Queue<string>();


Nothing fancy so far. Just notice that we're holding a private lock object and an array of Thread that will represent our workers.
The constructor simply initializes the threads and activates them:


public TasksQueue(int workerCount)
{
   if (workerCount <= 0) throw new ArgumentException("The number of working threads can not be equal or less than 0", "workerCount");

   _workers = new Thread[workerCount];
   for (int i = 0; i < workerCount; i++) { // Create and start a separate thread for each worker
      Thread workerThread = new Thread(ConsumeTask);
      workerThread.Name = "WorkerThread #" + i;
      _workers[i] = workerThread;
      workerThread.Start();
   }
}


I've bolded the ConsumeTask method that each thread runs. Let's see what we have there:


private void ConsumeTask()
{
   while (true)
   {
      string task;
      lock (_locker)
      {
         // if no tasks available, wait up till we’ll get something.
         while (_tasksQueue.Count == 0) {
            // Monitor.Wait is equal to:
            // (1) Monitor.Exit(locker); (give others a chance to lock _locker) 
            // (2) wait for pulse 
            // (3) on pulse – Monitor.Enter(locker);
            Monitor.Wait(_locker); 
         }

         // The first working thread that will be pulsed will reacquire the lock on “locker”, thus
         // it’s impossible that 2 working threads will try to dequeue the same task.
         task = _tasksQueue.Dequeue();
      }

      if (task == null) return; // This signals our exit

      try
      {
         Console.WriteLine(DateTime.Now + " Start Processing task: " + task + " on thread: " + Thread.CurrentThread.Name);
         // Simulate a time-consuming task
         Console.WriteLine(DateTime.Now + " Done Processing task: " + task + " on thread: " + Thread.CurrentThread.Name);
      }
      catch (Exception err)
      {
         // log err & eat the exception, we still want to resume consuming other tasks.
         //Debug.Fail(“Exception occurred while trying to consume the job (thread#: ” + Thread.CurrentThread.Name + “): ” + err.Message);
      }
   }
}


We loop (infinitely) and check whether our queue is empty – if so, we wait for a new task to arrive. Until then, the thread is blocked (inactive). When we have a new task (or the queue is not empty), we dequeue a task and "process" it. Let's see the Enqueue method:


public void EnqueueTask(string task)
{
   lock (_locker)
   {
      if (task != null)
         Console.WriteLine(DateTime.Now + " Enqueue task: " + task);

      _tasksQueue.Enqueue(task);
      Monitor.Pulse(_locker); // wake 1 worker (thanks Alan) to handle the new task
   }
}


This leaves us with disposing the whole thing (killing our threads gracefully):


public void Dispose()
{
   EndCurrentWorkingThreads();
}

private void EndCurrentWorkingThreads()
{
   if (_workers == null || _workers.Length == 0)
      return;

   // Signal the threads to shut down gracefully
   for (int i = 0; i < _workers.Length; i++)
      EnqueueTask(null); // will shut down the worker that receives a null task.

   // Wait up…
   foreach (Thread worker in _workers)
      worker.Join();
}


That’s it, but how can we use it you might ask – here it goes:


using (TasksQueue q = new TasksQueue(2))
{
   for (int i = 0; i < 10; i++)
      q.EnqueueTask("Task " + i);

   Console.WriteLine("Waiting for completion…");
}

Console.WriteLine("All tasks done!");

 

Application structure

With every new application we're trying to raise from scratch, especially a big one, we're drawn to the basic questions of how to structure our code so it will be easy to maintain, easy to extend and easy on the eyes (= it makes sense). This post is meant for teams with more than 4 programmers working on the source of a 2+ (human) year project. If you work alone and the client doesn't really care, heck, you can do it in one big assembly and name it [your_name]Rules.


I've discovered over the years that it really bothers me to see unorganized solutions or bad naming. I call it "structure smell" and, as you might have guessed, I'm a sensitive guy. I've structured my thoughts about the way I see things so I can use them later on as a reference for myself and for my team. Before I continue, keep in mind that most of these questions are philosophical, so there is no one holy answer; it's just a matter of point of view. I tried to point out best practices based on my experience. In addition, instead of writing user-story\feature\requirement\bug fix\UI change\you-name-it, I will use the term "task" instead. I'll even go one step further and say that a given task should be limited to 0.5-1.5 days, so it will be easy to see progress over time (if you're on the agile boat as I am) and help us focus on the domain\context we're working in during the task.


Enough said, let’s get going:


“Should we build one big solution?”


The immediate answer on this one is absolutely not.
The quick reason behind it is that no matter what you do, while working on a task you usually don't need all of the projects at the same time. I see no reason to compile so many projects if you're working on only 2-3 (or even 5-6) of them at a time. I know that Visual Studio .Net is smart enough to avoid needless compilation of projects that were not changed, but keep in mind that John, your teammate, is working on different tasks than you are, which means that he can make some changes, check them in, and your next "Get Latest Version" might cause unnecessary compilation on your side. If you haven't noticed (who am I kidding), VS.NET can become a heavy memory consumer for big solutions; add to it our beloved ReSharper (which must analyze all of the projects in the solution to give you smart ideas), and it can get quite messy.


The second reason is simplicity. Why look at 40 projects when you only need a few? Sure, you can collapse them or even organize them in Solution Folders (in VS.NET 2005), but it's much easier to keep the noise out.


“So How should we split our solutions and projects?”


On this scale of projects, it would not be a great idea to create projects based on layers (a DataAccess project, a Business layer project, a UI project, etc). This way, each layer (= class library) would be filled with too many classes, and in time it would be hard to find your way around a project with more than 200 files in it. Another bad side effect of splitting the projects by layers is that it narrows the way you think about solutions (to problems). Instead of trying to create pure OO components, you'll immediately start breaking one piece into "this is UI, this is BL, this is DAL" and possibly branch your code into the wrong assemblies by cold 0-or-1 decisions. Life is one big gray CLR.


So I'll try to define the way I see solutions, projects and namespaces, and how we should use them:


1). Solution represent a domain in your application.

Domain is a complete sub-system in the application. It's much bigger than a single component; it usually binds a list of components into one large sub-system that we can address as one big black box. The sub-system exposes interfaces to other sub-systems in the application.

If I had to develop Lnbogen's Calendar, for example, I would consider these sub-systems: Common, Engine, DataStorage, Site, Widgets. Each one of these sub-systems deserves its own solution.

2). Project is a component in that domain or a mini-sub-system in the application.

A component is an all-around solution in a specific domain. The consumer of the component expects it to perform its task from A to Z, even if that requires some interaction with other objects – this should be transparent to the component's consumer. Let's say we have a Calendar component; I would like to be able to call myCalendar.CreateNewMeeting(user, [meeting details]…) without taking care of inserting it into the database, updating some sort of cache (if one exists) or triggering alerts manually in case of a collision. I expect the component to provide a full solution to my problem. Obviously, we don't expect the Calendar to save the meeting to the data storage on its own, but rather to receive some sort of IDataSource that will take care of it – but that should happen behind the scenes, as the purpose is to expose complete functionality.
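To make it concrete, here is a small hypothetical sketch of that idea (the types and names below are illustrations, not a proposed real API):


// Hypothetical sketch: the component exposes complete functionality, while the
// data-storage concern is injected and stays behind the scenes.
public class User { /* ... */ }
public class Meeting { /* ... */ }

public interface IDataSource
{
   void SaveMeeting(Meeting meeting);
}

public class Calendar
{
   private readonly IDataSource _dataSource;

   public Calendar(IDataSource dataSource)
   {
      _dataSource = dataSource;
   }

   // The consumer calls one method; persistence, caching and collision alerts
   // all happen inside the component.
   public void CreateNewMeeting(User user, Meeting meetingDetails)
   {
      // collision detection / alerts / caching omitted
      _dataSource.SaveMeeting(meetingDetails);
   }
}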

In addition, a project might be "Entities" or "Utilities", where in these scenarios the project represents a mini-sub-system.


3). Namespaces group components and types under the same domain or "logic context"

Namespaces allow us to group types that logically belong to the same domain, and to create a proper hierarchy so the programmer can easily find his way around the available types.


“What about naming?”


Naming is crucial for a few reasons: (A) it enables us, as consumers, to quickly understand the purpose of an assembly\class\method, (B) good naming of classes\methods => less documentation => more 1:1 between your docs & your code, and (C) it helps you keep the most important principle of coding – be proud of your (and your team's) code. It's a beautiful thing to see Semingo.[…]. I'm loving it!

Naming rules:
1). Name your solutions by the domain they represent.
2). Name your projects by the components or mini-sub-system they represent. Template: [CompanyName].[Application].[ComponentName\MiniSubSystem]
3). Name your namespaces by the domain they group (the types) by.


Example (Lnbogen’s Calendar):


Directories tree:

– Lnbogen
 – Calendar (root Directory)
   – build
      – v1.0
      – v1.1
      (etc)
   – tools
      (list of assemblies, exe or other 3rd party components you might use)
   – documents
   – db
      (maybe backup of database files for easy deployment)
   – src
      – Common
         | Common.sln
         – Lnbogen.Calendar.Entities 
         – Lnbogen.Calendar.Entities.Tests 
         – Lnbogen.Calendar.Utilities            
         – Lnbogen.Calendar.Utilities.Tests 
      – Engine
         | Engine.sln
         – Lnbogen.Calendar.Framework
         – Lnbogen.Calendar.Framework.Tests
         – Lnbogen.Calendar.TimeCoordinator
         – Lnbogen.Calendar.TimeCoordinator.Tests
         – Lnbogen.Calendar.RulesEngine
         – Lnbogen.Calendar.RulesEngine.Tests
         – Lnbogen.Calendar.Service (*1)
         – Lnbogen.Calendar.Service.Tests
      – DataStorage
         | DataStorage.sln
         – Lnbogen.Calendar.DataStorage.Framework
         – Lnbogen.Calendar.DataStorage.Framework.Tests
         – Lnbogen.Calendar.DataStorage.HardDiskPersistenceManager
         – Lnbogen.Calendar.DataStorage.HardDiskPersistenceManager.Tests
         – Lnbogen.Calendar.DataStorage.WebPersistenceManager
         – Lnbogen.Calendar.DataStorage.WebPersistenceManager.Tests
         – Lnbogen.Calendar.DataStorage.DatabasePersistenceManager
         – Lnbogen.Calendar.DataStorage.DatabasePersistenceManager.Tests
         – Lnbogen.Calendar.DataStorage.Service (*1)
         – Lnbogen.Calendar.DataStorage.Service.Tests
      – Site
         – Lnbogen.Calendar.UI
         – Lnbogen.Calendar.UI.AdminSite
         – Lnbogen.Calendar.UI.UserSite
      – Widgets
         – Lnbogen.Calendar.Widgets.Framework
         – Lnbogen.Calendar.Widgets.Interfaces (for plug-ins support)
         – Lnbogen.Calendar.Widgets.Service
         (more directories per widget)
      – Integration
         – Lnbogen.Calendar.Integration.InternalWorkflow.Tests 
         – Lnbogen.Calendar.Integration.ExternalWorkflow.Tests (test that the services we expose to the world work as expected)
      – References
         (here you should put all the dlls that you use as “file reference” in the various solutions)
         
*1: for example, this could be WCF wrapper of the underlying engine that enable other internal components to talk with the CalendarEngine\DataStorage as one complete component.


You can notice that I've chosen to drop "Engine" or "Common" from the project names. "Common" is not really a domain, but rather a logical separation of things that belong to many domains (usually all of them). "Engine" is the real deal – there is no Calendar without the engine, right? So in this case I feel comfortable dropping the obvious (Lnbogen.Calendar.Framework wouldn't sound better as Lnbogen.Calendar.Engine.Framework).


Solution structure:


In VS.NET 2005, there is a nice feature named "Solution Folder" (right-click on the solution -> Add -> New Solution Folder), which is a lovely way to group projects. A Solution Folder is a virtual folder (you won't see it on your HD), so you don't have to worry about too much nesting. 

Here is the pattern I love to use, demonstrated on the Engine.sln:

Engine (Solution)
   _Core (Solution Folder) (*2)
      – Lnbogen.Calendar.Framework
      – Lnbogen.Calendar.TimeCoordinator
      – Lnbogen.Calendar.RulesEngine
      – Lnbogen.Calendar.Service
   Tests (Solution Folder)
      – Lnbogen.Calendar.Framework.Tests
      – Lnbogen.Calendar.TimeCoordinator.Tests
      – Lnbogen.Calendar.RulesEngine.Tests
      – Lnbogen.Calendar.Service.Tests
   ExternalComponents (Solution Folder)
      – Lnbogen.Calendar.Entities (via “Add existing project”)
      – Lnbogen.Calendar.Utilities (via “Add existing project”)
   3rdPartyComponents (Solution Folder)
      – (Open Source projects that I might use will go here)
   Solution Items
      (add any dll that you use as file reference in this solution)


*2: The reason I'm using "_" is to make sure it's the first Solution Folder. I just think it's a more productive way of looking at your projects. I use the same thing for my interfaces and call the file that contains them _Interfaces.cs.



On the next post, I’ll try to focus on strong naming and versioning of assemblies.

 

ClientSideExtender, version 0.0.0.1

Damn, it was so much fun to play a little with TDD and abstract the lousy API given by Microsoft to register client-side script. I'll write about the process and the design changes I've made for testability reasons. TDD is a great design tool; it's amazing to witness the "before" and "after" of your code, all because of the requirement to test things separately.


Here are a few API samples, taken from the Demo project (you can play with it and see the results):


public partial class _Default : Page
{
   protected void Page_Load(object sender, EventArgs e)
   {
      ClientSideExtender script = ClientSideExtender.Create(this);

      script.RegisterMethodCall("alert").WithParameters("hello world!").ToExecuteAt(Target.EndOfPage);

      script.RegisterVariable<string>("myStringVar").SetValue("test").ToExecuteAt(Target.EndOfPage);
      script.RegisterVariable<int>("myIntegerVar").SetValue(5); // Target.BeginningOfPage as default

      script.RegisterScriptBlock("alert('proof of concept - state:' + document.readyState);").ToExecuteAt(Target.PageLoaded);
   }
}


Keep in mind that I'm only supplying a different API (an abstraction) over Microsoft's implementation. In order to accomplish that, I'm using Windsor to wire the ClientSideExtender to the new ajaxian ScriptManager (supports UpdatePanel), which is actually responsible for registering the script under the hood. You can look at the web.config (under the <castle> element) for more details.


Source:
Lnbogen.Web.UI.zip (253.56 KB)

 

Using Rhino Mocks to Unit Test Events on Interfaces, take 2

Phil Haack gives an excellent example of how to test registration to an event while implementing the MVP pattern (via Rhino Mocks). If you have not read it yet, take a pause for 2 minutes and give it a look; it is worth it, I promise.


The only thing that bothers me in the example is that the need to assert the triggering of the event kind of spoiled the code. I think that in those scenarios, it's much better to use some sort of stub and override the class under test in order to keep it cleaner. So instead of having this code:


public class Presenter
{
   IView view;
   public Presenter(IView view)
   {
      this.view = view;
      this.view.Load += new EventHandler(view_Load);
   }

   public bool EventLoaded
   {
      get { return this.eventLoaded; }
      set { this.eventLoaded = value; }
   }

   bool eventLoaded;

   void view_Load(object sender, EventArgs e)
   {
      this.eventLoaded = true;
   }
}



I would have created something like this:


public class Presenter
{
   IView view;
   public Presenter(IView view)
   {
      this.view = view;
      this.view.Load += new EventHandler(view_Load);
   }

   protected virtual void view_Load(object sender, EventArgs e)
   {
       // production code here
   }
}


// This will go in the PresenterTests class
public class TestablePresenter : Presenter
{
    public bool WasEventLoaded = false;

    public TestablePresenter(IView view) : base(view)
    {
    }

    protected override void view_Load(object sender, EventArgs e)
    {
        WasEventLoaded = true;
    }
}



Now we can create an instance of TestablePresenter and assert on the WasEventLoaded field; a quick sketch of such a test follows below.
My guess is that Phil actually did something like this in his project and merely wanted to show an example, but I still thought it was important enough to demonstrate this separation, as I firmly believe we must make sure that our need for tests does not actually hurt the design.
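Here is a quick sketch of what that test could look like (the StubView helper and the exact IView shape are my assumptions – the post only relies on IView exposing a Load event):


// Assumed shape of the view interface (the Presenter constructor only needs the Load event):
public interface IView
{
   event EventHandler Load;
}

// A hand-rolled stub that lets the test raise the Load event on demand.
internal class StubView : IView
{
   public event EventHandler Load;

   public void RaiseLoad()
   {
      if (Load != null)
         Load(this, EventArgs.Empty);
   }
}

[Test]
public void ViewLoad_IsHandledByPresenter()
{
   StubView view = new StubView();
   TestablePresenter presenter = new TestablePresenter(view);

   view.RaiseLoad();

   Assert.IsTrue(presenter.WasEventLoaded);
}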