How to store billions of tasks?

Imagine that you’re living with one hell of a crazy wife. Every day she gives you a bunch of tasks: “mow the grass”, “water the plants”, “take out the garbage”, “replace the light in the kitchen”, “build a fence” etc. For every task you complete, she goes bananas and creates 10,000 more on the spot, all related to the task itself (“build a fence” leads to “paint the fence”, “put a nice sign on the fence”, “practice your wax-on, wax-off” … you get the drift). Reluctant to perform all of these tasks but smart enough to know that you’ll lose at least 50% of your property otherwise (divorces are nasty), you start collecting these tasks by writing each one on a piece of paper. You take one paper (a single task), perform it and return to your wife with a big fat smile on your face. She, in return, creates 10,000 new tasks related to the one you’ve just completed, writes them on papers and puts them in a BIG box. Every once in a while she doesn’t wait for you and adds “main” tasks to the box by herself. After performing one task, you pick another one from the box (FIFO) without knowing whether it’s a main task or not, and go on your way “eager” to perform it as requested (as if you had a choice). This goes on and on and on…


The box is the tricky part here. Can one box hold billions of papers? Hardly. So you start collecting boxes, and now it gets harder: you need to add new boxes when needed, find the “right” box to pull tasks from and make sure these boxes won’t break with time (maintenance).


Here are a few assumptions you can take as is:



  1. Your wife is looking for perfection, but only in the “main” tasks. This means that if you were to build the fence, you have to do it perfectly, so each mini task related to it is crucial.
  2. Your wife tends to forget things, so you can assume that occasionally she’ll add new “main” tasks that were already performed or that exist in a different box.
  3. You can’t throw away tasks “just because”, as you don’t know if a “mini task” will be thrown away by mistake (= your wife will be pissed off. 50% is gone.).
  4. Be cool – you won’t perform the same task twice. This one is on me.
  5. Drinking RedBull (or XL or whatever energy drink you’re familiar with) 24-7-365, you don’t need to rest. You don’t need to sleep. Think robot (funny combination, for 2007).

How to store billions of tasks?


I kinda like the “green feeling” of a forest. Oh right, we also need forests in order to breathe (Al Gore is more convincing than I am. Thank God). Most importantly, it costs a lot of money to buy so many papers! And the boxes! You’ll need a lot of green ones ($, not trees, but we can’t “breathe” without them either). Oh well… You start building a “Boxing mechanism”, hire a few guys to maintain it, get a VC to give you some extra $$$ and after a few months\years you’ve got it covered!


What do you do if you don’t have the extra $$$ or, more importantly, the extra time to develop this kind of storage system?


How to store billions of tasks? You don’t. You can’t.
In most scenarios, when things seem too difficult to accomplish (within the given limits), try a different angle: “If you don’t like the answer, ask a different question”.


We know that each task creates a lot of new related tasks, right? We also know that keeping those new tasks is the tricky part (the BIG box problem), so what else can be done? Let’s change the question. “How do I make my wife happy?” seems like a smarter question. If it’s expensive to save those related tasks, why not do them on the spot instead of storing them in the box(es)? How is this going to help us? The box now holds only “main” tasks, and we can throw those away because they will be added again later on (assumption #2. Thank God your wife is not the robot). Assuming that we can save about 1,000,000 papers in one big box, we’re all set. If the box is full, we’ll simply throw away new “main” tasks, feeling good as we KNOW that we’ll get to them later on (again, assumption #2). Now all we need is one simple box with a limited amount of papers. Less $$$ to waste and a much simpler storage system to develop.
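The “one limited box” idea can be sketched roughly like this. It is only a minimal illustration, not the storage system we actually run: the names BoundedTaskBox, TryAddMainTask and TakeNext are made up for this example, and the capacity is whatever you decide one box can hold.

```csharp
using System.Collections.Generic;

// Hypothetical sketch: a single FIFO box with a hard limit on papers.
public class BoundedTaskBox
{
    private readonly Queue<string> _tasks = new Queue<string>();
    private readonly object _locker = new object();
    private readonly int _capacity;

    public BoundedTaskBox(int capacity)
    {
        _capacity = capacity;
    }

    // Returns false (and drops the task) when the box is full.
    // Dropping is acceptable only because of assumption #2:
    // a forgotten "main" task will be handed to us again later.
    public bool TryAddMainTask(string task)
    {
        lock (_locker)
        {
            if (_tasks.Count >= _capacity)
                return false;

            _tasks.Enqueue(task);
            return true;
        }
    }

    // Returns the oldest task, or null when the box is empty.
    public string TakeNext()
    {
        lock (_locker)
        {
            return _tasks.Count == 0 ? null : _tasks.Dequeue();
        }
    }

    public int Count
    {
        get { lock (_locker) { return _tasks.Count; } }
    }
}
```

The related “mini tasks” never reach the box in this sketch; the worker that takes a main task is expected to perform them on the spot.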



Crap, it just hit me. I’m doing some cool sh$% at Semingo! Join us!

Writing Thread Safety tests for instance data

In my last post, I wrote about implementing a simple multi-threaded TasksQueue. This post will concentrate on how to test the queue for thread safety. Reminder: our queue is used by multiple consumers, which means that I must make sure a lock is obtained on the queue before each Enqueue\Dequeue\Count. Imagine that I have 1 item in the queue and 2 consumers trying to dequeue this item at the same time from different threads: the first dequeue will work just fine but the second will throw an exception (dequeue from an empty queue). What we’re actually trying to make sure is that this queue works as expected in a multi-threaded environment. So much for our goal.

So how can we test it?
Testing for the queue’s thread safety through testing of TasksQueue, the way it’s written now, can be quite hard and misleading. The ConsumeTask method calls dequeue inside a lock, but what if we had a thread-safety-related bug there? Would we be testing only that the dequeue works as expected? Not really. ConsumeTask (1) dequeues an item and then (2) “consumes” it. We’re actually testing 2 behaviors\logics, which makes it really hard to test only for the queue’s thread safety. We should always test a single method for a specific behavior and eliminate dependencies. Only when we’ve covered our bases can we check the integration between multiple components (the underlying queue and the TasksQueue).


One way of allegedly achieving this goal is to create a decorator around the queue, let’s call it SafeQueue, which will encapsulate a queue and wrap it with thread-safe forwarding of the calls (it will lock some inner object and call the original queue). The SafeQueue could then be tested on its own and used by our TasksQueue. This will “enable” us to remove the locking in the TasksQueue and use Set\WaitOne (on a wait handle) instead of Pulse\Wait in order to notify our consumers on the arrival of a new task:


// _waitHandle: hypothetical field, e.g. an AutoResetEvent that EnqueueTask signals via Set()
while (_safeQueue.Count == 0)
   _waitHandle.WaitOne();

// NOTICE: by the time we get here, someone could have pulled the last item from the queue on another thread!
string item = _safeQueue.Dequeue();


WATCH OUT: This is a deadly solution that will make our TasksQueue break in a multi-threaded environment. Just like that, our code is not thread-safe anymore, although we’re using a SafeQueue that exposes (atomic) thread-safe methods\properties: the Count check and the Dequeue call are each safe on their own, but nothing stops another thread from slipping in between them. This is exactly why instance state should not be thread-safe by default (more details at Joe Duffy’s post).


The locking of the queue should remain in our TasksQueue, but we should separate the dequeue part from the handling part and check each one on its own. We’ll check the dequeue part for thread safety (assuming that the underlying queue was tested by itself) and the handling part for pure logic. We can now test that for X calls to enqueue we get the same X calls to dequeue.


Here is the refactored code:


private void ConsumeTask()
{
   while (true)
   {
      string task = WaitForTask();

      if (task == null) return; // This signals our exit

      try
      {
         // work on the task
      }
      catch (Exception err)
      {
        // log err & eat the exception, we still want to resume consuming other tasks.
      }
   }
}


protected virtual string WaitForTask()
{
   lock (_locker)
   {
      // if no tasks available, wait up till we’ll get something.
      while (_queue.Count == 0)
         Monitor.Wait(_locker);

      // try to move the Dequeue outside of the lock statement and run the test (below)
      return _queue.Dequeue();
   }
}


public virtual void EnqueueTask(string task)
{
   lock (_locker)
   {
      _queue.Enqueue(task);
      Monitor.Pulse(_locker);
   }
}


Now we can create a simple test for the thread safety by overriding both of the enqueue\dequeue methods:


internal class TestableTasksQueue : TasksQueue
{
   private static int _dequeueCount = 0;
   private static int _enqueueCount = 0;

   public TestableTasksQueue(int workerCount) : base(workerCount) {}

   protected override string WaitForTask()
   {
      string item = base.WaitForTask();
      Interlocked.Increment(ref _dequeueCount);
      return item;
   }

   public override void EnqueueTask(string task)
   {
      base.EnqueueTask(task);
      Interlocked.Increment(ref _enqueueCount);
   }

   public static int DequeueCount
   {
      get { return _dequeueCount; }
   }

   public static int EnqueueCount
   {
      get { return _enqueueCount; }
   }
}


The tricky part here is the test itself. Because of subtle multi-threading issues, we can’t actually know when 2 (or more) threads will try to dequeue at the same time, so we have to run this test enough times in order to detect bugs. Here is a little sample:


[TestFixture]
public class TasksQueueTests
{
   [Test]
   public void Counting_DequeueAndEnqueueCountsShouldBeEqual()
   {
      for (int j = 0; j < 1000; j++)
      {
         using (TestableTasksQueue queue = new TestableTasksQueue(5))
         {
            for (int i = 0; i < 100; i++)
               queue.EnqueueTask("test" + i);
         }

         Assert.AreEqual(TestableTasksQueue.DequeueCount, TestableTasksQueue.EnqueueCount);
      }
   }
}


Well, it’s not that elegant, I know, but thread safety is hard to test.
I would love to hear your suggestions regarding this issue.

Application structure

With every new application we build from scratch, especially a big one, we’re drawn to the basic questions of how to structure our code so it will be easy to maintain, easy to extend and easy on the eyes (= it makes sense). This post is meant for teams of more than 4 programmers working on the source of a project of 2+ (human) years. If you work alone and the client doesn’t really care, heck, you can do it in one big assembly and name it [your_name]Rules.


I’ve discovered over the years that it really bothers me to see unorganized solutions or bad naming. I call it “structure smell” and, as you might have guessed, I’m a sensitive guy. I’ve structured my thoughts about the way I see things so I could use them later on as a reference for myself and for my team. Before I continue, keep in mind that most of these questions are philosophical, so there is no one holy answer; it’s just a matter of point of view. I tried to point out best practices based on my experience. In addition, instead of writing user-story\feature\requirement\bug fix\UI change\you-name-it, I’ll use the term “task”. I’ll even go one step further and say that a given task should be limited to 0.5-1.5 days, so it would be easy to see progress over time (if you’re on the agile boat as I am) and to help us focus on the domain\context we’re working in during the task.


Enough said, let’s get going:


“Should we build one big solution?”


The immediate answer on this one is absolutely not.
The quick reason behind it is that no matter what you do, while working on a task you usually don’t need all of the projects at the same time. I see no reason to compile so many projects if you’re working on only 2-3 (or even 5-6) of them at a time. I know that Visual Studio .NET is smart enough to avoid needless compilation of projects that weren’t changed, but keep in mind that John, your teammate, is working on different tasks than you are, which means that he can make some changes and check them in, and your next “Get Latest Version” might cause unnecessary compilation on your side. If you haven’t noticed (who am I kidding), VS.NET can become a heavy memory consumer for big solutions; add to it our beloved ReSharper (that must analyze all of the projects in the solution to give you smart ideas) and it can get quite messy.


The second reason is simplicity. Why look at 40 projects when you need just a few? Sure, you can collapse them or even organize them in Solution Folders (in VS.NET 2005), but it’s much easier to keep the noise out.


“So How should we split our solutions and projects?”


On this scale of projects, it would not be a great idea to create projects based on layers (DataAccess project, Business layer project, UI project etc). This way, each layer (= ClassLibrary) would be filled with too many classes and, in time, it will be hard to find your way around one project with more than 200 files in it. Another bad side effect of splitting the projects by layers is that it will narrow the way you think about solutions (to problems). Instead of trying to create pure OO components, you’ll immediately start breaking one piece into “this is UI, this is BL, this is DAL” and possibly branch your code into the wrong assemblies by cold 0 or 1 decisions. Life is one big gray CLR.


So I’ll try to define the way I see solutions, projects and namespaces and how we should use them:


1). A solution represents a domain in your application.

A domain is a complete sub-system in the application. It’s much bigger than a single component, and it usually binds a list of components into one large sub-system that we can address as one big black box. The sub-system exposes interfaces to other sub-systems in the application.

If I had to develop Lnbogen’s Calendar, for example, I would consider these sub-systems: Common, Engine, DataStorage, Site, Widgets. Each one of these sub-systems deserves its own solution.

2). A project is a component in that domain or a mini-sub-system in the application.

A component is an all-around solution in a specific domain. The consumer of the component expects it to perform its task from A to Z, even if that requires some interaction with other objects; this should be transparent to the component’s consumer. Let’s say that we have a Calendar component. I would like to be able to call myCalendar.CreateNewMeeting(user, [meeting details]…) without taking care of inserting it into the database, updating some sort of cache (if one exists) or triggering alerts manually in case of a collision. I expect the component to provide a full solution to my problem. Obviously, we don’t expect the Calendar to save the meeting to the data storage on its own but rather to receive some sort of IDataSource that will take care of it, but that should happen behind the scenes, as the purpose is to expose complete functionality.

In addition, a project might be “Entities” or “Utilities”; in these scenarios, the project represents a mini-sub-system.


3). A namespace groups components and types under the same domain or “logic context”.

Namespaces allow us to group types that logically belong to the same domain and create a proper hierarchy, so the programmer can easily find his way around the available types.


“What about naming?”


Naming is crucial for a few reasons: (A) it enables us, as consumers, to quickly understand the purpose of an assembly\class\method, (B) good naming of classes\methods => less documentation => more 1:1 between your docs & your code, and (C) it helps you keep the most important principle of coding: be proud of your (and your team’s) code. It’s a beautiful thing to see Semingo.[...]. I’m loving it!

Naming rules:
1). Name your solutions by the domain they represent.
2). Name your projects by the components or mini-sub-system they represent. Template: [CompanyName].[Application].[ComponentName\MiniSubSystem]
3). Name your namespaces by the domain they group (the types) by.


Example (Lnbogen’s Calendar):


Directories tree:

- Lnbogen
 - Calendar (root Directory)
   - build
      - v1.0
      - v1.1
      (etc)
   - tools
      (list of assemblies, exe or other 3rd party components you might use)
   - documents
   - db
      (maybe backup of database files for easy deployment)
   - src
      - Common
         | Common.sln
         - Lnbogen.Calendar.Entities 
         - Lnbogen.Calendar.Entities.Tests 
         - Lnbogen.Calendar.Utilities
         - Lnbogen.Calendar.Utilities.Tests 
      - Engine
         | Engine.sln
         - Lnbogen.Calendar.Framework
         - Lnbogen.Calendar.Framework.Tests
         - Lnbogen.Calendar.TimeCoordinator
         - Lnbogen.Calendar.TimeCoordinator.Tests
         - Lnbogen.Calendar.RulesEngine
         - Lnbogen.Calendar.RulesEngine.Tests
         - Lnbogen.Calendar.Service (*1)
         - Lnbogen.Calendar.Service.Tests
      - DataStorage
         | DataStorage.sln
         - Lnbogen.Calendar.DataStorage.Framework
         - Lnbogen.Calendar.DataStorage.Framework.Tests
         - Lnbogen.Calendar.DataStorage.HardDiskPersistenceManager
         - Lnbogen.Calendar.DataStorage.HardDiskPersistenceManager.Tests
         - Lnbogen.Calendar.DataStorage.WebPersistenceManager
         - Lnbogen.Calendar.DataStorage.WebPersistenceManager.Tests
         - Lnbogen.Calendar.DataStorage.DatabasePersistenceManager
         - Lnbogen.Calendar.DataStorage.DatabasePersistenceManager.Tests
         - Lnbogen.Calendar.DataStorage.Service (*1)
         - Lnbogen.Calendar.DataStorage.Service.Tests
      - Site
         - Lnbogen.Calendar.UI
         - Lnbogen.Calendar.UI.AdminSite
         - Lnbogen.Calendar.UI.UserSite
      - Widgets
         - Lnbogen.Calendar.Widgets.Framework
         - Lnbogen.Calendar.Widgets.Interfaces (for plug-ins support)
         - Lnbogen.Calendar.Widgets.Service
         (more directories per widget)
      - Integration
         - Lnbogen.Calendar.Integration.InternalWorkflow.Tests 
         - Lnbogen.Calendar.Integration.ExternalWorkflow.Tests (test that the services we expose to the world work as expected)
      - References
         (here you should put all the dlls that you use as “file reference” in the various solutions)
         
*1: for example, this could be a WCF wrapper of the underlying engine that enables other internal components to talk with the CalendarEngine\DataStorage as one complete component.


You may notice that I’ve chosen to drop “Engine” and “Common” from the project names, even though the directories carry them. “Common” is not really a domain but rather a logical separation of things that belong to many domains (usually all of them). “Engine” is the real deal; there is no Calendar without the engine, right? So in this case I feel comfortable dropping the obvious (Lnbogen.Calendar.Framework wouldn’t sound better as Lnbogen.Calendar.Engine.Framework).


Solution structure:


In VS.NET 2005, there is a nice feature named “Solution Folder” (right-click on the solution->Add->New Solution Folder) which is a lovely way to group projects. The Solution Folder is a virtual folder(you won’t see it on your HD) so you don’t have to get worried about too much nesting. 

Here is the pattern I love to use, demonstrated on the Engine.sln:

Engine (Solution)
   _Core (Solution Folder) (*2)
      - Lnbogen.Calendar.Framework
      - Lnbogen.Calendar.TimeCoordinator
      - Lnbogen.Calendar.RulesEngine
      - Lnbogen.Calendar.Service
   Tests (Solution Folder)
      - Lnbogen.Calendar.Framework.Tests
      - Lnbogen.Calendar.TimeCoordinator.Tests
      - Lnbogen.Calendar.RulesEngine.Tests
      - Lnbogen.Calendar.Service.Tests
   ExternalComponents (Solution Folder)
      - Lnbogen.Calendar.Entities (via “Add existing project”)
      - Lnbogen.Calendar.Utilities (via “Add existing project”)
   3rdPartyComponents (Solution Folder)
      - (Open Source projects that I might use will go here)
   Solution Items
      (add any dll that you use as file reference in this solution)


*2: The reason I’m using “_” is to make sure it’s the first Solution Folder. I just think it’s a more productive way of looking at your projects. I use the same trick for my interfaces and call the file that contains them _Interfaces.cs.



On the next post, I’ll try to focus on strong naming and versioning of assemblies.

ClientSideExtender, version 0.0.0.1

Damn, it was so much fun to play a little with TDD and abstract the lousy API given by Microsoft to register client-side script. I’ll write about the process and design changes I’ve made due to testability reasons. TDD is a great design tool, it’s amazing to witness the “before” and “after” of your code, all because of the requirements to test things separately.


Here are a few API samples, taken from the Demo project (you can play with it and see the results):


public partial class _Default : Page
{
   protected void Page_Load(object sender, EventArgs e)
   {
      ClientSideExtender script = ClientSideExtender.Create(this);

      script.RegisterMethodCall("alert").WithParameters("hello world!").ToExecuteAt(Target.EndOfPage);

      script.RegisterVariable<string>("myStringVar").SetValue("test").ToExecuteAt(Target.EndOfPage);
      script.RegisterVariable<int>("myIntegerVar").SetValue(5); // Target.BeginningOfPage as default

      script.RegisterScriptBlock("alert('proof of concept - state:' + document.readyState);").ToExecuteAt(Target.PageLoaded);
   }
}


Keep in mind that I’m only supplying a different API (abstraction) over Microsoft’s implementation. In order to accomplish that, I’m using Windsor to wire the ClientSideExtender with the new ajaxian ScriptManager (supports UpdatePanel), which is actually responsible for registering the script under the hood. You can look at the web.config (under the <castle> element) for more details.


Source:
Lnbogen.Web.UI.zip (253.56 KB)

Fluent Interfaces – Let the API tell the story

In my last post about Creating a decent API for client side script registration, Eran raised a few great comments about the readability and proper usage of this style of coding. I decided to answer his questions with a post, as my comment started to fill enough paper to clean a Brazilian forest or two (well, in terms of a response).


Introduced by Martin Fowler, Fluent Interfaces enable the programmer to supply an API that can be used to build a genuine use-case in the system or just a complete logical query\request from a service. This coding style is quite different from the traditional 101 lessons in OOP school. The biggest benefit of a Fluent Interface, in my opinion, is that you can read the code out loud as if the customer were talking to you instead of the programmer who wrote it. Sometimes it gets even better: you can read someone else’s code as if she\he were next to you, explaining what she\he meant to do. My take is that using a method to describe a use-case\action\query\request will (almost) always be better, in terms of readability, than using parameter(s), as you’ll need IntelliSense to understand the latter. Here is a simple API; the first version is traditional OOP while the second one applies Fluent Interfaces. Please bear in mind that these samples were written just to set the ground for the difference between these two coding techniques:


// take 1 – traditional style
public class ClientSideExtender
{
    public void CallMethod(string methodName, RunAt runScriptAt, bool ignoreExceptions, params object[] parameters);
}



// take 2 – Fluent Interfaces


public class ClientSideExtender
{
   public ScriptCommand CallMethod(string methodName);
}

public class ScriptCommand
{
     public ScriptCommand WithParameters(params object[] parameters);
     public ScriptCommand When(RunAt runScriptAt);
     public ScriptCommand IgnoreExceptions();
}


Assuming that we have a JavaScript method with the signature “markRow(rowId, shouldDisableOtherRows)”, here is how one can use these APIs to register a client-side method call (respectively):


clientSideExtender.CallMethod("markRow", RunAt.AfterPageLoaded, true, "5", true);

clientSideExtender.CallMethod("markRow").WithParameters("5", true).When(RunAt.AfterPageLoaded).IgnoreExceptions();



Obviously, both APIs will eventually create the same code: <script …>markRow("5", true);</script>.
What I really love about Fluent Interfaces is that I don’t need the freakin’ IntelliSense in order to understand what the first “true” means as a parameter: in the traditional call it’s a bare boolean, while in the fluent call it’s spelled out as IgnoreExceptions(). It enables me to read it out loud – I want to call a client-side method named “markRow”, with 2 parameters, execute it after the page is loaded and wrap the entire thing to swallow exceptions (assume that someone else will take care of them). If you want to call a method that doesn’t take any parameters, don’t call the WithParameters method. You can always change the order of the calls if you see fit (maybe calling IgnoreExceptions before When).
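To make the chaining mechanics concrete, here is a minimal sketch of how the fluent ScriptCommand could be put together. This is not the actual ClientSideExtender source: ToScript, the RunScriptAt property and the RunAt.Immediately value are assumptions made up for the example, the try/catch wrapper is just one possible way to “ignore exceptions”, and string parameters are not escaped.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

// RunAt.AfterPageLoaded comes from the sample above; Immediately is an assumed default.
public enum RunAt { Immediately, AfterPageLoaded }

public class ClientSideExtender
{
    public ScriptCommand CallMethod(string methodName)
    {
        return new ScriptCommand(methodName);
    }
}

public class ScriptCommand
{
    private readonly string _methodName;
    private object[] _parameters = new object[0];
    private RunAt _runAt = RunAt.Immediately;
    private bool _ignoreExceptions;

    public ScriptCommand(string methodName) { _methodName = methodName; }

    // Each step records one decision and returns the same command,
    // which is what lets the calls be chained in any order.
    public ScriptCommand WithParameters(params object[] parameters)
    {
        _parameters = parameters;
        return this;
    }

    public ScriptCommand When(RunAt runScriptAt) { _runAt = runScriptAt; return this; }

    public ScriptCommand IgnoreExceptions() { _ignoreExceptions = true; return this; }

    public RunAt RunScriptAt { get { return _runAt; } }

    // Hypothetical helper: renders the JavaScript call as text.
    public string ToScript()
    {
        List<string> args = new List<string>();
        foreach (object parameter in _parameters)
            args.Add(Format(parameter));

        string call = _methodName + "(" + string.Join(", ", args.ToArray()) + ");";
        return _ignoreExceptions ? "try { " + call + " } catch (e) { }" : call;
    }

    // Strings are quoted (without escaping), booleans become JavaScript literals.
    private static string Format(object value)
    {
        if (value is string) return "\"" + value + "\"";
        if (value is bool) return (bool)value ? "true" : "false";
        return Convert.ToString(value, CultureInfo.InvariantCulture);
    }
}
```

With this sketch, new ClientSideExtender().CallMethod("markRow").WithParameters("5", true).When(RunAt.AfterPageLoaded).IgnoreExceptions().ToScript() yields try { markRow("5", true); } catch (e) { }, and swapping the order of When and IgnoreExceptions gives the same result.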


One of the complaints I hear (again and again) about Fluent Interfaces is that they “allow” programmers to abuse the code. “You can change the order of the calls or forget to call one and make a big mess” is a common response to the concept. To be totally honest, I don’t buy it, as programmers can make a mess of pretty much anything. We’ve all been there, right? I agree that it requires a somewhat different way of thinking about creating & using an API, but then again, so does learning a new programming principle, a design pattern or a coding technique. It took several years until people started to chew on TDD and accept the advantages of using it. My guess is that in ~1-2 years, Fluent Interfaces will be much more common in the way we’re writing and using code (LINQ rings a bell? well OK, leaving the “sql-like” syntactic sugar aside).



This leads me to my belief about designing a Fluent Interface. I say: when appropriate, why not allow the programmer to choose?
I would create two overloads for CallMethod, as shown above, and let the programmer decide which one she\he would like to use.


I would use Fluent Interface.

Errors from dummies

One of the key notes that my manager took from a client tour in the US jumped out at me: “100% of the users want to take action in order to solve problems with the product. They are mad at us for giving them poor, general error messages”. He followed this with “they don’t know what to do after seeing a message like ‘two ways communication failure’ or ‘unable to retrieve results for X’”. I paused and really started to think about how we can write better errors for our clients – errors that will ENABLE the users to solve bugs on their own!


After spending about 2 hours throwing ideas on the whiteboard at the office, I came to the simple conclusion that it’s…. completely feasible!

Software contains bugs. You can hire the best programmers, the greatest QA guys, automate your testing, you name it. At the end of the day, your release will contain (more than one or two) bugs. So the problem is that the users can NOT do anything in order to solve those code-related bugs, right? Well, I really hope so, because I really like my job. But how can we, the developers, give the user helpful error messages that will guide him to solve the bug that just happened, or at least make him feel like one of the R&D team?


I decided to split the bugs into two categories:



  • Bugs that are solely code related: I’m sorry, but unless the user has access to my source, he won’t be able to make the fix. Now, who the heck wants to give the users access to their code? No matter what you show the user, you know that he can’t actually help you solve the problem right now. I believe, though, that you can make him feel like he’s helping to solve it. Here is my solution for that scenario:

    • Show the user a simple “We’re sorry, but it seems that an error occurred in our engine. We managed to log the error and our best guys are working on it! BUT, we need your help in order to make this bug go away FAST. Will you help us?”.
    • After this message, show a little “form” with the title “Want to help us fix this bug quicker?” that contains the following fields:

      • A short description of the form – just a label – “Please help us reproduce the actions you took so we will be able to fix this bug as soon as possible. We know that you have little time, if any, so we did most of the job for you. It won’t take more than 30 seconds, starting now…”
      • Bug title (write the action the users tried to do as default – for example “Edit user’s account”).
      • How serious is this bug for you: a simple select-list from 1 (it bothers me but it’s not that important) to 5 (I can’t work!).
      • Bug description (leave it empty) – let the user know he can leave it blank.
      • How to reproduce this bug (here is the catch: you’ll need to record the user’s actions in order to be able to do this) – write out all the steps that the user performed before he saw this bug. Let him edit the text of the flow (maybe he can add some value). Now, it’s critical that you supply a good recording mechanism, as the user doesn’t want to work for you. If filling in this form takes more than half a minute, forget it, he won’t help you.
      • BIG button – “I’m done, send it away”.

    • The message will be sent to the system administrator if the network is local, or to the support service if the network is connected to the Internet. You can attach your own log info to the user’s message (that depends on the application architecture of course; if you have an Internet application you can log everything on the server anyway).
    • IMPORTANT: If the user sends the bug “help” form, and the bug is solved quicker because of his good “how to reproduce”, and it was important to many users of your software - reward him and tell the rest of your software’s community about it. We want to give our “QA users” the incentives they deserve. I believe that letting the user “take some action” in order to help fix bugs is better than just showing him “an error occurred, please contact your system administrator for further details”. (btw – who came up with that fucked up idea that the world adopted so quickly?!)

  • Bugs that are configuration related or business related: Hey! We have a chance here to make our users’ dream come true! There is a good chance that they CAN fix the problem with their own two hands. If the error happened due to Foreign Key issues, for example the user is trying to delete a record that is referenced by other records, tell him that he needs to delete “X, Y and Z” (give him links to those records) before deleting the current one, and explain why (the user knows nothing about databases or software “logic”). If it’s something the user forgot to do, meaning the business flow is incomplete, tell him what he needs to do NOW and direct him (supply links or automatically take him to the right place\position). If the user’s session has ended so you cannot save his form, try to save the user’s time: save the form into the cache, let him log in, then direct him to the page he was at and fill it from the cache for him. Can you imagine how grateful he will be? If the user made a configuration mistake and gave bad data so now he can’t send e-mails or run some process, analyze what went wrong and give him a proper message. Avoid general messages in those scenarios, like “SMTP error…” or “unable to delete the row”, just because he forgot to fill in some non-nullable field. What you are required to do is analyze the StackTrace of the exception, match it with the recorded flow (as I mentioned in the previous category) and try to provide really good action items for the users. Let me tell you this: users will appreciate your effort in giving them full details about what went wrong and how they can play the part of the software developers and make the software better.
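The two categories above could be wired together along these lines. Treat it as a rough sketch under my own assumptions: RecordInUseException and UserErrorMessages are hypothetical names, and the data layer is assumed to translate a foreign-key failure into that exception together with the names of the blocking records.

```csharp
using System;

// Hypothetical exception: thrown by the data layer when a delete is blocked
// by records that still reference the one being deleted.
public class RecordInUseException : Exception
{
    private readonly string _recordName;
    private readonly string[] _referencingRecords;

    public RecordInUseException(string recordName, string[] referencingRecords)
        : base("Delete blocked by a foreign key constraint.")
    {
        _recordName = recordName;
        _referencingRecords = referencingRecords;
    }

    public string RecordName { get { return _recordName; } }
    public string[] ReferencingRecords { get { return _referencingRecords; } }
}

public static class UserErrorMessages
{
    // Turns an exception into a message the user can act on.
    public static string For(Exception error)
    {
        RecordInUseException inUse = error as RecordInUseException;
        if (inUse != null)
        {
            // Business-related: the user CAN fix this, so tell him exactly how.
            return "\"" + inUse.RecordName + "\" can't be deleted yet because it is still referenced by: "
                + string.Join(", ", inUse.ReferencingRecords)
                + ". Delete those records first, then try again.";
        }

        // Code-related: the user can't fix it, so log it and ask for his help instead.
        return "We're sorry, but it seems that an error occurred in our engine. "
            + "Want to help us fix this bug quicker?";
    }
}
```

In a real application the record names would be rendered as links, and the fallback branch would open the “help us fix this bug” form described in the first category.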

Most importantly, don’t show developer errors to the user just so he could call you(or your service) and say “Hi, I’m a customer of yours and I’ve got a strange error ‘SqlException: bad syntax in line 291 near ;’” ;-).

Growing organism – based on a real life story

While I was sitting in McDonald’s today, eating some junk-burger, I looked across my table and noticed a beautiful girl. Man, she was something else. A true beauty. I watched her walk to another table, 20 feet from mine, and sit down. In front of her was a woman. Try to guess on your own what the woman looked like:


a. A true super-hot-mama, even more beautiful than her daughter(?).
b. Good looking woman.
c. An OK woman.
d. My GOD! What a monster ?!


I’ll let you ponder for a few seconds…


Now that I’m sure you’ve got the right answer, have you ever thought about these questions:


In 15 years, will this beautiful girl look like her mother(?) ?
Is there any chance that god is just messing with my head and she’ll be a fine woman even if her mother(?) is a….
What about statistics? Does anyone know the odds of a beautiful girl becoming a beautiful woman ?


And of course, a sick question to recap this self-debate:


What does her father think about this situation ?


For some strange reason, I started to think about applications. I know, I’m sick, no doubt, but bear with me: how many of the small, beautiful applications you developed became monstrous Enterprise Applications that eat their programmers for breakfast ? Go back to my questions and replace the word “girl” with “small application” and the word “mother” with “Enterprise Application”. I think that this symptom of a “growing organism” exists in developing real applications, and we encounter it every day when we maintain legacy code or patch up a system that shouldn’t have survived the prototype phase.


I guess that just like in life, with good monitoring of our state and good activities, this transformation, this growing, could still be graceful.


I just wonder what the odds are…

Safe events with .Net

This is another brilliant idea from Juval Lowy. I got to see him speak about unifying the ASP.NET and WinForms security models last week, and during the session (which was really great by the way, but that’s for another day) I noticed this genius line:


public event EventHandler<EventArgs> Click = delegate {};


Isn’t it a beauty ?!


I know, it doesn’t ring a bell yet, so I’ll make the picture a bit clearer. But before we move forward, let’s take a few steps backward.


Starting from scratch, a reminder:


We want to raise our Click event.
I’m sure you are familiar with these lines:


private void OnClick(EventArgs e)
{
   if (this.Click != null) // check if the invocation list is empty
      this.Click(this, e);
}


The reason we do it is that we don’t want to raise an event with an empty invocation list: the event field is null in that case, so invoking it throws a NullReferenceException, which is a big no-no if we can avoid it. Simple, eh ?


This implementation is problematic: between the check (the if statement) and the invocation, the last listener can be removed (by another thread, for example), which leaves the event null again. A quick fix looks like this:


private void OnClick(EventArgs e)
{
   EventHandler<EventArgs> handler = this.Click;
   if (handler != null) // check if the invocation list is empty
      handler(this, e);
}



Ok, so it’s a little safer invocation now (I remember a version with locking, but the above will do for now).


Rewind:


Now let’s get back to Juval’s genius line I showed before:


public event EventHandler<EventArgs> Click = delegate {};


Attaching delegate {} to the invocation list makes the compiler generate (behind the scenes) an empty method, with a compiler-chosen name, that matches the EventHandler<EventArgs> signature.


Why is this so brilliant ? Because now we have made sure that the event invocation list
always holds *at least* one (empty) listener.


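Subscribers can’t remove the empty listener as they have neither its instance (it was generated for us) nor its name. Only code inside the declaring class could still wipe it, by assigning null to the event field, so don’t do that.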


The checks for not-empty invocation list can be thrown to the garbage. We now have a safe event, guaranteed.


This is certainly a best practice for working with events (following Framework Design Guidelines structure):


   Do: consider attaching an empty delegate to your event.



Recap


public class MyClass
{
    public event EventHandler Click = delegate {};
    
    private void OnClick(EventArgs e)
    {
        // no need to check if (this.Click!=null)
        this.Click(this, e);
    }
}


nice and easy.
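If you want to convince yourself, here is a small sketch that raises the event with no subscribers, with one subscriber, and again after the subscriber has left. Button and SafeEventDemo are names I made up for the demo; only the event declaration itself comes from Juval’s line.

```csharp
using System;

public class Button
{
    // The empty anonymous method is the first entry in the invocation list.
    public event EventHandler<EventArgs> Click = delegate { };

    public void RaiseClick()
    {
        // No null check: Click is never null as long as
        // nobody inside this class assigns null to it.
        Click(this, EventArgs.Empty);
    }
}

public static class SafeEventDemo
{
    // Returns how many times the real listener was called.
    public static int Run()
    {
        int calls = 0;
        EventHandler<EventArgs> listener = delegate { calls++; };
        Button button = new Button();

        button.RaiseClick();        // no subscribers: nothing runs, nothing throws
        button.Click += listener;
        button.RaiseClick();        // the listener runs once
        button.Click -= listener;
        button.RaiseClick();        // only the empty delegate is left, still safe

        return calls;
    }
}
```

Removing the last real listener does not bring the event back to null, because the empty delegate is still sitting in the list.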



Extra: one more step for safe events:


If we’re talking about safe events, there is an advanced scenario in which you want to “eat” an exception raised by one of the listeners, so that the other listeners will still be triggered. In the code above, any exception in one of our listeners stops the rest of the invocation list from being triggered.


The solution is quite simple also:


private void OnClick(EventArgs e)
{
   EventHandler<EventArgs> handler = this.Click;
   if(handler != null)
   {
      Delegate[] list = handler.GetInvocationList();

      foreach (Delegate del in list)
      {
         try
         {
            EventHandler<EventArgs> listener = (EventHandler<EventArgs>)del;

            listener(this, e);
         }
         catch { }
      }
   }
}


Juval even wrote a cool generic utility class named EventsHelper which you can get here.
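To be clear, the following is not Juval’s EventsHelper, just my own minimal sketch of the same idea under a made-up name (SafeRaise). The one design change I allowed myself: instead of an empty catch, it hands the exceptions back to the caller, so they can at least be logged.

```csharp
using System;
using System.Collections.Generic;

public static class SafeRaise
{
    // Invokes every listener, even if an earlier one throws.
    // Returns the exceptions that were thrown instead of hiding them.
    public static List<Exception> Raise<TArgs>(EventHandler<TArgs> handler, object sender, TArgs e)
        where TArgs : EventArgs
    {
        List<Exception> errors = new List<Exception>();
        if (handler == null)
        {
            return errors; // nobody is listening
        }

        foreach (Delegate del in handler.GetInvocationList())
        {
            try
            {
                ((EventHandler<TArgs>)del)(sender, e);
            }
            catch (Exception ex)
            {
                errors.Add(ex); // keep going so later listeners still run
            }
        }
        return errors;
    }
}
```

Inside OnClick the call would look like SafeRaise.Raise(this.Click, this, e), and the returned list goes to whatever logging you already have.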


Brilliant.

Developing SEE Infrastructure: Take a ride with me and practice your TDD

I have 1-on-1 coaching on TDD with Roy Osherove coming up in the next 2-3 weeks. My goal is to practice a real TDD work process and determine whether our department can benefit from this development methodology (or design tool).


I’ve read a big bunch of articles and blogs (Jeremy D. Miller, Sam Gentile, Scott Bellware, Roy Osherove and others) about TDD and even practiced it for a bit during the last two years, but to be sincere, it wasn’t real Test Driven Development. I stopped TDD-ing in the middle and moved back to write-with-haste, switched to TDD again and so forth. I was lazy and I had no one to guide me through. I had to “guess” the right process and read a big set of articles to see if I was on the right track. I was lacking some good feedback.


This 1-on-1 with Roy should give me clear insight into the process, plus immediate feedback. If the process proves itself, we’ll arrange a 3-day course for my department and try to fit TDD into our development process (where and how I’m still not sure, but I’ve got the feeling that I’ll be smarter in a few weeks).


What is SEE Infrastructure all about ?


SEE stands for Simple Expression Engine, which I’ve written about before.


I came up with SEE as I wanted to SEE what sort of filter the GUI requests from our Data Services.


I’ll give you an example of a client’s request, a solution based on our old infrastructure, and how SEE changed the picture.


Example:


Let’s imagine we have a screen with one GridView. Our goal is to show all the (1) active orders (2) from today with (3) a price greater than 1000 NIS. It should be as easy as counting 1, 2, 3, right ?


– OLD –


With our old infrastructure the code will look something like this:


(1) Orders.aspx:


OrdersFilter filter = new OrdersFilter();
filter.IsActive = true;
filter.Date = DateTime.Today;
filter.Price = 1000;

EntityCollection<Order> orders = OrdersService.Instance.Get(filter);
// … bind orders to our GridView …


(2) OrdersDal.cs:


In our Data Access Object for the Orders entity, we’re required to override a method which builds the dynamic SQL based on the given filter:


if (this.Filter.IsActive != null)
{
   query.Append(" AND Orders.IsActive = @IsActive");
   paramaters.Add(DbServices.CreateParameter("IsActive", SqlDbType.Bit, this.Filter.IsActive));
}

if (this.Filter.Date != null)
{
   query.Append(" AND Orders.OrderDate = @Date");
   paramaters.Add(DbServices.CreateParameter("Date", SqlDbType.DateTime, this.Filter.Date));
}

if (this.Filter.Price != null)
{
   query.Append(" AND Orders.Price > @Price");
   // SqlDbType has no Double member; Float is the one that maps to a .NET double
   paramaters.Add(DbServices.CreateParameter("Price", SqlDbType.Float, this.Filter.Price));
}


Now, look at the two Price lines. The filter at the GUI sends Price = 1000 while the DAL object looks for Price > 1000 (remember, this is the client’s request).


Not only have we got a mismatch, the coding was neither trivial nor “easy”. We had to know (= remember) which method to override in our OrdersDal.


– NEW –


Orders.aspx:


FilterExpression filter = new FilterExpression();
filter.Where(
   Where.EqualTo(Order.Field.IsActive, true),
   Operator.And(),
   Where.EqualTo(Order.Field.Date, DateTime.Today),
   Operator.And(),
   Where.GreaterThan(Order.Field.Price, 1000)
);

EntityCollection<Order> orders = OrdersService.Instance.GetByExpression(filter);
// … bind orders to our GridView …



That’s it. No mismatch: the GUI programmer can now see exactly what results the Orders Service will return, with no need to look at OrdersDal.
Coding was short and fun.
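How could an expression like this end up as SQL? The following is not the actual SEE code, only a minimal sketch of the idea with hypothetical names (Condition, WhereClauseBuilder): each condition carries its own operator, so the GUI and the DAL can no longer disagree about “=” versus “>”, and the values travel as parameters.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// One comparison such as "Orders.Price > @p1". Hypothetical type, not part of SEE.
public class Condition
{
    public readonly string Column;
    public readonly string SqlOperator;
    public readonly object Value;

    private Condition(string column, string sqlOperator, object value)
    {
        Column = column;
        SqlOperator = sqlOperator;
        Value = value;
    }

    public static Condition EqualTo(string column, object value)
    {
        return new Condition(column, "=", value);
    }

    public static Condition GreaterThan(string column, object value)
    {
        return new Condition(column, ">", value);
    }
}

public static class WhereClauseBuilder
{
    // Joins the conditions with AND and collects the values as named parameters,
    // so values are never concatenated into the SQL text.
    // Column names are assumed to come from code constants, not from user input.
    public static string Build(IList<Condition> conditions, IDictionary<string, object> parameters)
    {
        StringBuilder sql = new StringBuilder();
        for (int i = 0; i < conditions.Count; i++)
        {
            string name = "@p" + i;
            sql.Append(i == 0 ? "WHERE " : " AND ");
            sql.Append(conditions[i].Column).Append(' ')
               .Append(conditions[i].SqlOperator).Append(' ')
               .Append(name);
            parameters[name] = conditions[i].Value;
        }
        return sql.ToString();
    }
}
```

The sketch only knows AND and two operators; mapping entity fields to column names and resolving db functions are left out on purpose.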



I thought that this small but important infrastructure would be a nice platform to practice “Real-World” work with TDD. The infrastructure in its current form is working quite well, so I know how the API should look in general and what my “big” problems are (mapping, resolving db function names and a few more).


Is SEE revolutionary ?


Hardly!
Expression-like languages are all over the place lately:



  1. LINQ – to be honest, this is a really great query language but it ruined my Visual Studio .Net 2005!! The product requires an installation that is simply a disaster for a developer station.
  2. HQL – NHibernate. This is actually very nice, but I get no compile-time errors while writing the queries.
  3. eSql – Looks great. Is it safe for production? I’m not so sure… Anyway, it all comes with C# 3.0 and suffers from symptom (1).

I could add another 2-3 infrastructures to the list, but you get the picture.


So why do I\you still need SEE ?


SEE is a home-made infrastructure which was developed following the KISS (keep it simple, stupid) principle. I know that there are many folks out there who still write\generate custom Data Access Objects. Integrating SEE into your custom DAO objects will be very simple, as the infrastructure solves a very narrow problem domain. There is no need to learn a new “quotation marks prisoner” language. The developer can enjoy the VS.NET IntelliSense and, as you could see in my previous example, the API is very easy to understand.


Moving toward one of the other languages\technologies can take some time as the learning curve can be quite steep.

You could SEE with a very small time investment on your side.



Where do you come along ?


I will upload any code I write during these coaching lessons so you can see our progress, bit by bit. In addition, I’m going to write a long post after each lesson to share my insights with you. This is the interesting part though: my infrastructure will change according to your requests (well, some of them anyway, and only the “good” and “simple” ones ;)). Feel free to suggest new features or to change method\class names\relations. This will enable me to practice changes to our SEE infrastructure as part of the TDD practice.


I hope that we’ll enjoy the process and learn new things on the way,
Oren.

Who Am I ?

I’ve just created an interface named:


/// <summary>
/// Defines an object which supports xml representation in the system.
/// </summary>
public interface ISupportXmlFormat
{
   /// <summary>
   /// Returns the object representation as an xml string.
   /// </summary>
   /// <returns>Xml string</returns>
   string ToXml();
}


I know, this is a silly name but it made me laugh quite a bit so I thought to share with you my geeky sense of humor ;-)


It is sometimes hard to think of a good name for an interface. In general, I follow Roy Osherove’s interface naming guidelines and name my interfaces by their purpose or “what can be done to them”.


Do you have a better name to suggest ?