I've been musing lately on why we in IT insist on forgetting so much valuable knowledge. I don't know whether it's because of our youth-obsessed culture and our focus on the newest & best, because of our tendency to prioritise on-the-job over traditional learning, or whether there's simply too much in the "architect's book of knowledge" (ABOK), and we all have to focus on the new to keep up.
What I do know is that we seem to use each paradigm shift as an excuse to forget what went before, and we're in danger of losing a lot of knowledge as a result.
Two quite different cases have brought this into focus. Firstly, one of my clients has a complex software stack, in the depths of which is some communications software. The authors of this software have sought, and achieved, an RBCD ("rock-bottom common denominator") by treating the latest 3G modem as if it were a 1970s hardware-controlled job. Some very good, but relatively young, consultants were completely mystified by this, and I got a lot of phone calls whose subtext was "Andrew - you're old. What's a CTS?" (CTS, Clear To Send, is one of the RS232 hardware flow-control signals.)
Now the issue here is a subtle one. I'm not really proposing that RS232 takes precedence in the ABOK over more modern technologies. But you never know when, behind that elegant SOA facade, there's a piece of code written in 8086 assembler rather than VB.NET, and your project needs at least the ability to recognise the signs in that obscure error message, and to take the appropriate corrective or protective action.
The second case is arguably more worrying. A lot of people seem to be musing on how to understand the reliability of service-oriented architectures (see, for example, "SOA Algebra" by Richard Veryard). Many seem to think that because "it's SOA" the problem is necessarily different from that of modelling the reliability of systems constructed using older technologies.
I'm not convinced. I'm not aware of any SOA construct whose reliability cannot be modelled by well-established techniques, such as Fault Tree Analysis. Most compositions of services can be modelled using simple AND and OR gates, but add the VOTE gate concept of my RelQuest tool, and you can model very sophisticated service composition rules, such as "get costs from up to five providers, but fail if we can't get at least two options".
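As a sketch of the underlying probability (not RelQuest itself), here is that "at least two of five providers" rule expressed as a k-out-of-n VOTE gate, assuming for illustration that each provider succeeds independently 90% of the time:

```python
from math import comb

def vote_gate_success(n, k, p_success):
    """Probability that at least k of n independent services succeed
    (a k-out-of-n VOTE gate), each with success probability p_success."""
    return sum(comb(n, i) * p_success**i * (1 - p_success)**(n - i)
               for i in range(k, n + 1))

# "Get costs from up to five providers, but fail if we can't get at least two."
# Illustrative assumption: each provider succeeds 90% of the time.
print(f"{vote_gate_success(n=5, k=2, p_success=0.9):.5f}")  # 0.99954
```

An AND gate is just the special case k = n, and an OR gate (any one success suffices) is k = 1.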
Having modelled the composition structure, you can get at the net reliability analytically, by straightforward application of the probability maths we've understood since the days of Pascal (the French mathematician, not the programming language), or if you're interested in the subtleties of statistical variation you can do a Monte Carlo simulation.
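For example, a quick Monte Carlo check of the same hypothetical "two of five" composition (each provider again assumed to succeed 90% of the time) should converge on the analytic answer of roughly 0.99954:

```python
import random

random.seed(42)

def simulate_vote(n_trials=100_000, n=5, k=2, p_success=0.9):
    """Monte Carlo estimate of a k-out-of-n composition's success rate."""
    wins = 0
    for _ in range(n_trials):
        # Draw each service invocation as an independent Bernoulli trial.
        successes = sum(random.random() < p_success for _ in range(n))
        if successes >= k:
            wins += 1
    return wins / n_trials

print(f"Monte Carlo estimate: {simulate_vote():.4f}  (analytic: ~0.99954)")
```

The simulation route only becomes worth the extra machinery when you care about the distribution of outcomes, not just the point estimate.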
In some ways it's actually simpler to apply these techniques to SOA, because you are dealing with discrete events (service invocations) whose failure probability can in many cases be evaluated directly, rather than indirectly from continuous measures such as MTBF and MTTR.
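The contrast can be made concrete: the continuous route derives steady-state availability from MTBF and MTTR, while the discrete route simply counts failed invocations. The figures below are purely illustrative:

```python
def availability_from_mtbf(mtbf_hours, mttr_hours):
    """Classic steady-state availability from continuous measures."""
    return mtbf_hours / (mtbf_hours + mttr_hours)

def failure_prob_from_counts(failed_calls, total_calls):
    """Direct per-invocation failure probability for a discrete service."""
    return failed_calls / total_calls

# Continuous route: e.g. an MTBF of 500 hours and an MTTR of 2 hours.
print(availability_from_mtbf(500, 2))        # ~0.99602
# Discrete route: e.g. 37 failures observed in 10,000 invocations.
print(failure_prob_from_counts(37, 10_000))  # 0.0037
```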
In answer to Richard's challenge, SOA doesn't need a "new" algebra, it just requires the right application of techniques we should already understand.
If we're serious about SOA, we should be able to build on these established ideas and go much further. Service design tools ought to be able to generate the Fault Tree model directly from a graphical service composition. Even better, a decent ESB should continuously measure the reliability of input services, and use the current composition rules to provide estimates of reliability to the clients and services calling them. However, the tools and WS-* protocols with which I'm familiar don't seem to have any of these concepts.
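No tooling I know of does this, but the measurement side is not hard to sketch. Here is a hypothetical per-service tracker of the kind an ESB might keep (a sliding window of recent invocation outcomes; the window size and figures are illustrative), with two trackers combined per a simple AND composition rule, assuming independent failures:

```python
from collections import deque

class ReliabilityTracker:
    """Hypothetical sketch: sliding window of recent invocation outcomes,
    one tracker per input service."""
    def __init__(self, window=1000):
        self.outcomes = deque(maxlen=window)  # old outcomes age out

    def record(self, success):
        self.outcomes.append(bool(success))

    def success_rate(self):
        if not self.outcomes:
            return None  # no data yet
        return sum(self.outcomes) / len(self.outcomes)

# Two services composed with an AND gate: both must succeed, so
# (assuming independence) the rates multiply.
a, b = ReliabilityTracker(), ReliabilityTracker()
for ok in [True] * 98 + [False] * 2:   # service A: 98% in the window
    a.record(ok)
for ok in [True] * 95 + [False] * 5:   # service B: 95% in the window
    b.record(ok)
print(a.success_rate() * b.success_rate())  # ~0.931
```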
Now RelQuest is almost ten years old, and I freely admit I "borrowed" ideas from much older work when I created it. So why have these ideas been forgotten (or never learned) by those trying to model SOA reliability? Why does SOA represent such a "paradigm shift" that these older concepts appear irrelevant?
Why do we insist on going back to square one all the time?
Richard Veryard responded: I never said SOA needed a new algebra. I said it needed an algebra. I expect (indeed hope) that some (perhaps even all) of the elements of this algebra are already known. I agree that most (perhaps even all) of the problems are older than SOA.
But the reason I use the word algebra is because I want a systematic and coherent approach (not just a random collection of techniques) to address a set of problems (composition and decomposition) that are currently handled very badly. Most of the people in the SOA world seem to be ignoring or fudging these problems. And there is very little work being done in this area.
I think that’s fair. We both acknowledge that many of the techniques we need are already established, and the challenge is to bring them into a coherent structure which serves the SOA world appropriately. However, given the thrust of my article, which was really about knowledge management, I’d disagree with Richard’s use of the word “known”. I think a major problem is that we’re so busy looking for new solutions that the value of existing techniques is in danger of being ignored.
If you'd like to comment on this article, with ideas, examples, or just to praise it to the skies then I'd love to hear from you.