Any half-decent developer knows that a monolithic code base does not scale. Not even for a single developer. There are many reasons for that, the most important being that humans are simply not made to concentrate on more than a handful of concepts at a time (I have a hard time memorizing more than four items on my shopping list).
We need abstractions that help us isolate conceptual complexity so we can concentrate on a few isolated aspects at a time. One such abstraction is the modularization of a code base into logical units, which keeps the scope and reach of changes bound to a limited set of responsibilities. This is followed by a clean separation of module interfaces/APIs from their implementation details (i.e. information hiding, see as far back as Parnas).
This is the common practice in server-side Java (EE) development: Break the code base into modules, manage cross-module dependencies, have a build tool assemble everything into a deployable WAR or EAR archive. This approach has many striking positive aspects:
- The expectation on the runtime environment in terms of module management is zero. It’s OK that everything is treated as a Web application, although the Web presentation aspect may be the least of it.
- Any build tool can be made to copy JARs around and ZIP them up to yet another archive.
Let’s call this soft modularization. Even without going for extreme cases like BIRT, this approach breaks down as code base size, complexity, and third-party reuse grow. This is because every module adds a debt, both in terms of dependencies and by simply being around. The former is due to conflicting or potentially conflicting use of other libraries. Even if there are no type or version conflicts at some point in time (or none you have noticed yet), any third-party component upgrade might change that at any time.
The latter, i.e. what I called the effect of being around, refers to runtime penalties due to longer code introspection, additions to an already messy Spring configuration, modules spreading out their own configurations… all of it adding noise and complexity to manage.
Get runtime isolation or get stuck
So we really do need (and should want) more runtime isolation that helps us keep those “having too much in one basket” problems under control.
One true success story of code isolation is what all modern operating systems do: Process isolation. Practically speaking, however, (real or pseudo) cross-process integration of software modules simply does not cut it in terms of invocation performance (anybody remember EJBs with remote interfaces as application components?).
So we want sharing of objects and memory while still being able to protect modules from running into type conflicts.
Ah… so classloaders again?
For Java this means class loaders. Class loaders help achieve the following:
- Controlled visibility of types – meaning conscious isolation as well as conscious sharing of types
- Sharing of memory between modules
- Direct, in-process invocation with the ability to pass data by reference
In essence a modularization approach based on the class loading mechanism loads types of a module with a module-specific class loader and has some more or less clever schemes to make types loaded from other class loaders visible to modules based on some configuration. In fact, this is the underpinning of any modular Java runtime (be it open source or proprietary) – that I am aware of at least.
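The visibility half of that scheme can be sketched in a few lines. This is only an illustration under assumptions: `FilteringLoader`, `SharedApi` and `Internal` are hypothetical names, the "configuration" is just a list of exported name prefixes, and the sketch does not load a module's own classes from a separate location the way a real modular runtime would.

```java
import java.util.List;

public class ModuleVisibilitySketch {

    /** Stands in for a type that another module deliberately shares. */
    public static class SharedApi {}

    /** Stands in for an implementation detail that should stay hidden. */
    public static class Internal {}

    /** Hypothetical module loader: delegates to the parent only for exported names. */
    public static final class FilteringLoader extends ClassLoader {
        private final List<String> exportedPrefixes;

        public FilteringLoader(ClassLoader parent, List<String> exportedPrefixes) {
            super(parent);
            this.exportedPrefixes = exportedPrefixes;
        }

        @Override
        protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
            // JDK types are always visible; everything else must be exported explicitly.
            boolean visible = name.startsWith("java.")
                    || exportedPrefixes.stream().anyMatch(name::startsWith);
            if (!visible) {
                throw new ClassNotFoundException(name + " is not visible to this module");
            }
            return super.loadClass(name, resolve);
        }
    }

    /** Returns true if the given loader can resolve the named type. */
    public static boolean canSee(ClassLoader loader, String typeName) {
        try {
            loader.loadClass(typeName);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        ClassLoader shared = ModuleVisibilitySketch.class.getClassLoader();
        // The module is configured to see SharedApi, but not Internal.
        ClassLoader module = new FilteringLoader(shared, List.of(SharedApi.class.getName()));
        System.out.println("SharedApi visible: " + canSee(module, SharedApi.class.getName()));
        System.out.println("Internal visible:  " + canSee(module, Internal.class.getName()));
    }
}
```

Because the exported type is still loaded by the shared parent, both sides see the very same `Class` object, which is what allows instances to be passed by reference between modules.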
Note: This is no way to build an OS. But for building medium sized solutions or the service endpoints of large distributed applications it is a useful compromise between assembly flexibility and reliability and runtime performance.
Plus, with a little care, and at least during development, class loader isolation allows for module reload at runtime and hence shorter round trip cycles compared to restarting (or even redeploying) a complete application (and it is much cleaner and better defined than any hot swap technology or the like).
It does however imply that you lose one big pro from the list above: You need runtime support, one way or the other, that manages your modules and a corresponding class loader topology. The good old Tomcat days are over. Time to grow up!
Now it would seem natural that you could go easily from one level of complexity control measure to the next. But that is not so. The following, admittedly somewhat complicated picture tries to illustrate different levels of achievements over progress (or requirements) in time and scale vs. complexity required to understand with some transitions:
The sequence of transitions while progressing from left to right is not as linear as it seems. While the first transition is the most natural, it gives a model that in reality does not lend itself to a smooth next transition in most cases. As soft modularization is also rather soft in its requirements, stricter isolation – required to make the isolated modules model scale – may be harder to achieve when it is started late (at which point it typically shows that good separation of concerns and APIs existed more on paper than in code).
Given a messy, badly-modularized codebase, it is tempting to try to simplify matters by doing some big cuts and split things up into two or more applications that are used in a distributed or co-located way.
That means ignoring the third model in the picture and jumping directly to the distributed model. Doing that in order to solve the problem of a bad code base is almost always a mistake, due to the operational overhead implied by going distributed. That is, unless you have a convincing scale, security, or technology-driven reason, you should never go functionally distributed. And if you do, try to make this a case of “we are all the same – but we do different things” (i.e. still one code base, with service endpoints differing in what parts are actively used).
But those modular runtimes suck!
Yes… yes… OSGi burnt so much ground. But before getting to that, let’s see what you really want. First of all, it should be possible to keep things the way they are as far as possible. That is, you do not need any dogmatic teaching about “how to do things the right way” (by people that most likely never implemented anything non-technological) that throws you into months-long, certain-to-fail refactorings, while having no experience with the new environment anyway.
So, if everything is softly modularized today, you will want a smooth path into strongly modularized – just as much as you see benefits coming in.
Secondly, third-party libs should of course be naturally usable as is, i.e. without looking too deep, collecting package names, or anything like that.
Then there is the much undervalued and underrated awareness required for the context class loader (again, see the other blog), but otherwise that’s that. And most Java libs learned their lesson a while back, so they typically do work nicely in a multi-classloader, modular environment.
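To make the context class loader point concrete, here is a minimal sketch of the usual pattern: set the calling module's loader as the thread's context class loader before handing control to a shared library, and restore the previous one afterwards. The helper name `withContextLoader` is made up for this illustration; only the `Thread` methods are standard Java API.

```java
import java.util.function.Supplier;

public class ContextLoaderSketch {

    /** Runs the action with the given loader as context class loader, then restores the old one. */
    public static <T> T withContextLoader(ClassLoader moduleLoader, Supplier<T> action) {
        Thread current = Thread.currentThread();
        ClassLoader previous = current.getContextClassLoader();
        current.setContextClassLoader(moduleLoader);
        try {
            return action.get();
        } finally {
            // Always restore, even if the action throws.
            current.setContextClassLoader(previous);
        }
    }

    public static void main(String[] args) {
        // Stand-in for a module-specific loader; a real runtime would supply this.
        ClassLoader moduleLoader = new ClassLoader(ContextLoaderSketch.class.getClassLoader()) {};
        ClassLoader seenInside = withContextLoader(
                moduleLoader, () -> Thread.currentThread().getContextClassLoader());
        System.out.println("Library saw module loader: " + (seenInside == moduleLoader));
    }
}
```

A library that resolves classes by name through the context class loader will then find the types of the module that called it, rather than only those visible to the library's own loader.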
As now types may live in isolation, how do you actually create instances? There obviously needs to be a way to create services and the like from code that is not visible from other modules. So you need some runtime support for that. That sucks, because you need to adapt. In turn, you have a right to expect something that doesn’t try to teach you but rather tries to serve you.
For that purpose OSGi has deep at its heart the service registry. Z2 uses a resource lookup approach (that btw. seamlessly integrates with JNDI and Spring – which is important).
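The underlying idea, getting an instance without ever seeing the implementing class, can be sketched with a tiny in-process registry. This is neither the OSGi service registry API nor Z2's resource lookup; `ServiceRegistrySketch`, `Greeter` and `installGreeterModule` are hypothetical names used only to show the shape of the mechanism.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

public class ServiceRegistrySketch {

    /** The shared API type: the only thing a consuming module needs to see. */
    public interface Greeter {
        String greet(String name);
    }

    /** Implementation detail of the providing module; private, so invisible to consumers. */
    private static final class PoliteGreeter implements Greeter {
        @Override
        public String greet(String name) {
            return "Hello, " + name;
        }
    }

    private final Map<Class<?>, Supplier<?>> providers = new ConcurrentHashMap<>();

    /** Called by the providing module to publish a factory under an API type. */
    public <T> void register(Class<T> api, Supplier<? extends T> provider) {
        providers.put(api, provider);
    }

    /** Called by consuming modules; returns an empty Optional if nothing is registered. */
    public <T> Optional<T> lookup(Class<T> api) {
        Supplier<?> provider = providers.get(api);
        return provider == null
                ? Optional.empty()
                : Optional.ofNullable(api.cast(provider.get()));
    }

    /** Stands in for the providing module's start-up code. */
    public static void installGreeterModule(ServiceRegistrySketch registry) {
        registry.register(Greeter.class, PoliteGreeter::new);
    }

    public static void main(String[] args) {
        ServiceRegistrySketch registry = new ServiceRegistrySketch();
        installGreeterModule(registry);
        // The consumer only knows the Greeter interface.
        registry.lookup(Greeter.class)
                .ifPresent(greeter -> System.out.println(greeter.greet("module world")));
    }
}
```

In a real modular runtime the registry itself would live in the runtime and be reachable from every module, while the implementation class would be loaded by the providing module's own class loader.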
OSGi sucks and fails
Ah, now I said it. OSGi, being the most prominent approach to modular Java, deserves some words. In fact, I have wanted to write a blog article about it for a long time, but never had the energy. And it’s failing anyway as far as I can tell. It’s maybe a little unfortunate to have no standard modularization approach. On the other hand: Quite possibly that would just mean making everybody equally unhappy.
Anyway, there are a lot of reasons for OSGi’s failure – in the Enterprise application space at least. Only about half of them are due to technological aspects. Let’s say that OSGi is far too complex to be useful and far too code-centric, failed to position itself in any useful category, and tries to convince you about things that are not helpful to you (i.e. how to modularize THE RIGHT WAY) without helping you much to actually match that with your code base at hand.
Some concluding words
This article got a little out of control. The subject is probably too wide to make for a good, short blog. One little final, very personal conclusion: I doubt that there is a single runtime approach to modularization in Java. It is one of those attributes that need further specifics on how it’s going to be used and for what kind of applications or solutions, and most importantly, what application life cycle and development approach are associated with it. Most likely it’s all of these aspects that drive the way to implement module support (if any). That’s definitely true for Z2. Here development productivity and application life-cycle requirements go first; its modularization approach is just a necessary conclusion.