The Linda Problem of Distributed Computing

Suppose an important function of your solution is pricing calculation for a trading good.

Which is the more appropriate solution approach?

  a) You develop a software module that implements pricing computation
  b) You develop a REST server that returns pricing computation results

I am convinced that more than a few developers would intuitively choose b).

Taking a step back and thinking about it some more (waking your lazy “System 2”), it should become clear that choice a) is much stronger. If you need to integrate pricing computation into a user interface, need a single-process deployment, AND need a REST interface, each of these is a simple adaptation of a). Having b), on the other hand, gives little hope of getting a). So why choose b)?

This, I believe, is an instance of the “conjunction fallacy”. Because b) is more specific, more tangible, and more representative of a complete solution to the problem, it seems more probable to your intuition.
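To make the asymmetry between a) and b) concrete, here is a minimal Python sketch using only the standard library. The pricing rule, the function name `compute_price`, the query parameter names, and the port are all made up for illustration; the point is only that the REST variant can be a thin adapter that parses input and delegates to the module.

```python
import json
from decimal import Decimal
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs, urlparse


# Option a): the pricing logic as a plain function in a module.
# The rule itself (cost * quantity plus a margin) is a hypothetical placeholder.
def compute_price(unit_cost: Decimal, quantity: int, margin: Decimal) -> Decimal:
    if quantity < 0:
        raise ValueError("quantity must be non-negative")
    total = unit_cost * quantity * (Decimal(1) + margin)
    return total.quantize(Decimal("0.01"))


# Option b), derived from a): an HTTP adapter that only parses and delegates.
class PricingHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        params = parse_qs(urlparse(self.path).query)
        try:
            price = compute_price(
                Decimal(params["unit_cost"][0]),
                int(params["quantity"][0]),
                Decimal(params["margin"][0]),
            )
        except (KeyError, ValueError, ArithmeticError):
            # Missing or malformed parameters: reject without touching pricing logic.
            self.send_error(400, "bad pricing parameters")
            return
        body = json.dumps({"price": str(price)}).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port: int = 8080) -> None:
    # Blocks forever; a UI or a batch job would call compute_price() directly instead.
    HTTPServer(("127.0.0.1", port), PricingHandler).serve_forever()
```

A user interface or a single-process deployment would simply import and call `compute_price`. Going the other way, from a REST-only implementation back to an embeddable module, usually means untangling the logic from the transport first.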

Back to the observation at hand: similar to the teaser example above, I have seen more than one case where business functions were added to an integration tier (e.g. an ESB) without any technological need (such as truly unmodifiable legacy systems). This is an extremely poor choice, considering that remote coupling is harder to maintain and has far more complex security and consistency requirements. Still, it happens, and it looks good and substantial on diagrams and fools observers into seeing more meaning than is justified.

Truth is:

Distribution is a function of load characteristics, not of functional separation

(or more generally speaking: Non-functional requirements govern distribution).

The prototypical reason to designate boxes for different purposes is that load characteristics differ significantly and some SLA has to be met (formally or informally). For many applications this does not apply at all. For most of the rest, the difference between “processing a user interaction synchronously” and “performing expensive, long-running background work asynchronously” is all that matters. All the rest is load-balancing.

Before concluding this slightly unstructured post, here’s a similar case:

People love deployment pipelines and configuration management tools that push configuration to servers or run scripts. This definitely gives rise to impressive power-plant-mission-control-style charts. In reality, however, any logical hop (human or machine) between the actual system description (code and config) and the execution environment adds to the problem and should be avoided, as the probability of success decreases exponentially with the number of hops.
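The exponential decrease can be illustrated with a small Python sketch. It assumes, purely for illustration, that every hop succeeds independently with the same probability; the value 0.95 is invented and not a measurement.

```python
def pipeline_success_probability(per_hop_success: float, hops: int) -> float:
    """Probability that all `hops` steps succeed, assuming independent,
    identically reliable hops (an illustrative simplification)."""
    if not 0.0 <= per_hop_success <= 1.0:
        raise ValueError("per_hop_success must be between 0 and 1")
    if hops < 0:
        raise ValueError("hops must be non-negative")
    return per_hop_success ** hops


if __name__ == "__main__":
    # Illustrative per-hop reliability of 95 percent.
    for hops in (1, 3, 5, 10):
        print(hops, round(pipeline_success_probability(0.95, hops), 3))
```

Under these assumptions, ten hops that each work 95 percent of the time leave an end-to-end success probability of roughly 0.6, which is why removing a hop tends to help more than polishing one.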

In fact:

The cost of system update is a function of the number of configurable intermediate operations from source to execution

and as an important corollary:

The cost of debugging an execution environment is a function of the number of configurable intermediate operations from source to execution

 –

More on that another time though.

This post was inspired by “Thinking, Fast and Slow” by Daniel Kahneman, which has a lot of eye-opening insights into how our intuitive and non-intuitive cognitive processes work. As the back cover says: “Buy it fast. Read it slowly.”

 


3 thoughts on “The Linda Problem of Distributed Computing”

  1. Hi Henning,
    +1 for “Thinking, Fast and Slow” by Daniel Kahneman, one of the most inspiring books I’ve read in recent years!

    I agree that it seems people these days often just put a service into a separate process without really thinking through whether it makes sense.
    I think it also has to do with http://en.wikipedia.org/wiki/Conway's_law, e.g. it becomes easier to assign a team to the service (fits very well with Scrum, doesn't it?).
    And then there is the argument that it worked for Amazon, so it must be a good idea.
    But in cloud applications, with most runtimes (JVM, Ruby, whatever) there is currently no other way to scale than to have more processes. More processes can also increase fault tolerance, the ABAP approach 😉 In fact, with recent technologies (https://www.docker.io/), startup time of a service becomes a major concern, because starting a container (Linux Container instead of VM) is now immediate. Therefore you might want to have more, smaller services as separate processes because they start faster.
    For cloud applications you probably need to solve the issues with distribution anyway. That’s why for cloud applications there is this tendency to distribute …


    • Absolutely. Thanks for that Conway link! I was actually about to conclude the post by saying that “every gap in the process gets filled automatically – by a problem (usually a new team)”…. which could be considered a corollary to Conway’s law.
      One of the fallacies in thinking (albeit intuitively tempting, hence the Linda Problem) is to mix the issues of “processes for scale” and “functional decomposition” of the application. A process separation definitely happens along some functional boundary, but taking the latter as a reason for the former is a mistake.


  2. Pingback: Microservices Nonsense | z the world
