The Bulletin Board Pattern

A common problem in software system design is organizing background work in a robust, reliable, and scalable way. For example, incoming queries need to be processed, messages need to be sent, or remote systems need to be called. Individual work tasks emerge but are not to be done at the time and place they are generated. Instead, the work is to be performed asynchronously in the background, possibly on infrastructure separate from where it originated.

Messaging for Work Distribution

Suitable orchestration of distributed work is not trivial, though, and there are a number of pitfalls. Driven by non-functional requirements such as reliability, robustness, and asynchronicity, people often turn to messaging systems, such as Apache ActiveMQ, as a convenient mechanism to announce work to the system and to distribute it to the system’s processing elements.

However, message-oriented middleware inherently has no understanding of a message’s meaning. In a human analogy, the transmission of a message corresponds to the delivery of a letter via the mail service. Apart from quality-of-service aspects such as express delivery or the requirement to produce a return receipt, the delivery is completely oblivious to the letter’s content. Once the letter is on its way, it has no relationship to other pending letters, and once delivery is completed, the mail service is out of the picture and whatever must happen next is up to the receiver.

Hence, work assignment built on such an approach will inherently be ignorant of work details, of pending work, and of pretty much any particular state the designated processor is in.

How about Bulletin Boards?

Instead, consider a bulletin board that holds a table of pending tasks. Rather than receiving isolated work tasks, an interested task processor may use a rule set to select one or more tasks depending on its own state as well as the overall system’s current workload (as seen on the bulletin board).

Since the bulletin board is a design element of the solution, we may decide to record highly specific business attributes with the tasks to allow sophisticated task selection rules. For example, a worker may process similar tasks much more efficiently than random sequences of tasks. Or business rules may imply a time-of-day-specific prioritization based on related business data such as a customer status.
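To make the idea of a selection rule set a little more tangible, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the Task fields (kind, customer_status), the business hours of 9 to 17, and the select_tasks function are hypothetical and not part of any particular product. The board is just an in-memory list here.

```python
from dataclasses import dataclass


@dataclass
class Task:
    task_id: int
    kind: str             # hypothetical business attribute, e.g. "invoice"
    customer_status: str  # hypothetical business attribute, e.g. "premium"


def select_tasks(board, last_kind, hour, limit=2):
    """Pick up to `limit` tasks from the board using two illustrative rules."""

    def rank(task):
        # Rule 1 (assumed): during business hours, premium customers go first.
        in_business_hours = 9 <= hour < 17
        premium_first = 0 if (in_business_hours and task.customer_status == "premium") else 1
        # Rule 2 (assumed): prefer tasks similar to the one just processed.
        same_kind = 0 if task.kind == last_kind else 1
        # Fall back to the task id so that older tasks win ties.
        return (premium_first, same_kind, task.task_id)

    return sorted(board, key=rank)[:limit]


# A tiny example board.
board = [
    Task(1, "invoice", "standard"),
    Task(2, "email", "premium"),
    Task(3, "invoice", "premium"),
    Task(4, "email", "standard"),
]
```

A worker that has just finished an invoice would call select_tasks(board, "invoice", hour) and get a different selection at 10 in the morning than at 10 at night. The point is only that the worker sees the whole board and decides for itself, which a single delivered message cannot offer.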

In other words: Messaging is stateless. A bulletin board can be arbitrarily stateful.

But How?

So it seems there are some advantages in using a bulletin board and a “pick your work” approach rather than a generic work assignment. But how are we going to build that?

The correct answer is, of course, to use a relational database system (RDBMS). The whole setup asks for it! In its simplest incarnation, the bulletin board is just a database table that holds all pending tasks, some attributes we need for management, and whatever business data we deem useful for smart work organization. How about reliability and robustness? After all, we just decided to build it ourselves.
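Before turning to that question, here is what such a table might look like. This is a sketch only, using Python's built-in sqlite3 module as a stand-in for whatever RDBMS you run. The table name, all column names, and the helper functions are assumptions chosen for illustration.

```python
import sqlite3


def create_board(conn):
    """Create a hypothetical bulletin board table."""
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS tasks (
            id              INTEGER PRIMARY KEY AUTOINCREMENT,
            payload         TEXT NOT NULL,   -- what needs to be done
            kind            TEXT NOT NULL,   -- business data (assumed)
            customer_status TEXT,            -- business data (assumed)
            created_at      TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
            owner           TEXT,            -- management: who checked it out
            finished_at     TEXT             -- management: when it was done
        )
        """
    )
    conn.commit()


def post_task(conn, payload, kind, customer_status=None):
    """Pin a new task to the board and return its id."""
    cur = conn.execute(
        "INSERT INTO tasks (payload, kind, customer_status) VALUES (?, ?, ?)",
        (payload, kind, customer_status),
    )
    conn.commit()
    return cur.lastrowid


def pending_by_kind(conn):
    """Summarize the current workload as seen on the board."""
    rows = conn.execute(
        "SELECT kind, COUNT(*) FROM tasks "
        "WHERE finished_at IS NULL GROUP BY kind ORDER BY kind"
    ).fetchall()
    return dict(rows)
```

Note that the workload overview is a plain query. Any worker, or any human with a SQL console, can look at the board at any time.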

Whatever RDBMS you are using, most likely an approach for backup/restore and replication/fail-over will be available. Typically we will also want a recovery feature: after an outage or an application crash, we will want the application to retry whatever was running last. That is, tasks should be picked up “at least once”, possibly multiple times (and consequently task execution should be idempotent).

A simple recovery implementation would work like this: When picking one or a set of tasks, the processor leaves a marker on each “checked out” task that identifies the processor. It leaves another marker when finishing a task. In a recovery situation, the processor can use these markers to discover any previously started but not yet finished work and to pick it up and process it first. It’s not complex, and in contrast to many other “advanced” approaches, it is highly transparent and simple to work with when things go bad.
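The check-out and recovery steps just described could be sketched as follows. Again this is an illustration under assumptions, not a definitive implementation: it uses Python's sqlite3 module with an in-memory database, and the table layout (owner, finished) and the function names are hypothetical. A production setup would rely on the locking features of its own RDBMS.

```python
import sqlite3


def open_board():
    """Create an in-memory board with a minimal, hypothetical layout."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE tasks ("
        " id INTEGER PRIMARY KEY,"
        " payload TEXT NOT NULL,"
        " owner TEXT,"                          # marker left at check-out
        " finished INTEGER NOT NULL DEFAULT 0"  # marker left when done
        ")"
    )
    return conn


def check_out(conn, worker_id, limit=1):
    """Claim up to `limit` unowned tasks for `worker_id`; return their ids."""
    claimed = []
    with conn:  # commits on success, rolls back on an exception
        rows = conn.execute(
            "SELECT id FROM tasks WHERE owner IS NULL AND finished = 0 "
            "ORDER BY id LIMIT ?",
            (limit,),
        ).fetchall()
        for (task_id,) in rows:
            # The extra "owner IS NULL" guard keeps a task from being
            # claimed twice if another worker got there first.
            cur = conn.execute(
                "UPDATE tasks SET owner = ? WHERE id = ? AND owner IS NULL",
                (worker_id, task_id),
            )
            if cur.rowcount == 1:
                claimed.append(task_id)
    return claimed


def finish(conn, worker_id, task_id):
    """Leave the second marker: the task is done."""
    with conn:
        conn.execute(
            "UPDATE tasks SET finished = 1 WHERE id = ? AND owner = ?",
            (task_id, worker_id),
        )


def recover(conn, worker_id):
    """After a restart: find work this worker started but never finished."""
    rows = conn.execute(
        "SELECT id FROM tasks WHERE owner = ? AND finished = 0 ORDER BY id",
        (worker_id,),
    ).fetchall()
    return [row[0] for row in rows]
```

On startup, a worker would first process whatever recover returns and only then check out new tasks. Since a task found this way may already have been partially executed, this is exactly where idempotent task execution matters.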

Summarizing

Getting work done is the reason to build software systems. Knowing and organizing its work is a central aspect of a system’s design. It should not be delegated to external tools as an afterthought but should, and easily can, be implemented as a fundamental built-in feature.
