"The law of entropy tells us that the battle for simplicity is never over."
Roger Sessions.
We all know similarly laconic statements about the problem of complexity. Every Enterprise Architect also knows that our goal is a simple architecture. The real challenge behind this mission is that we are trying to optimise something we cannot measure, and the missing measure leaves us in persistent doubt about whether we are heading in the right direction. What is the solution?
There are many approaches to coming to grips with complexity: various classifications and benchmarks, but almost no real measurements - or the measurement is too complex to run. Running a complicated approach is nothing more than extra ammunition for the devil of complexity.
Warm-up thoughts
"Simplicity is the ultimate sophistication."
Leonardo da Vinci
Using other words:
"Simplicity is the highest form of sophistication"
Steve Jobs
"If you can't explain it simply, you don't understand well enough."
Albert Einstein
This is a good lesson: if we build an overly complex architecture, the underlying problem is that we did not understand the requirements and the environment well enough.
"Among competing hypotheses, the hypothesis with the fewest assumptions should be selected."
Occam's razor
A good idea for tender bid selection... Last but not least, one more thought warns us not to over-simplify.
"Things should be made as simple as possible, but not simpler."
Albert Einstein
Now that it's clear that complexity is "bad" and simplicity is "good", it's finally time to say what complexity and simplicity actually are! We will review some of the approaches on the way to a common definition.
The reason for fighting complexity is that it is the main driver of the total cost of ownership. Complex systems are hard to change, hard to operate, and hard to keep safe! You will face large CAPEX or OPEX bills if your architecture is not simple enough. Beyond the costs, complexity is the largest enemy of our most important goal: the ability to change.
Classification
Good complexity? We just established that complexity is bad! The marketing perspective is a bit different. If the market proposition is too simple, customers will be unsatisfied, since they have no options to choose between. Here we call this variation complexity, based on the Wikipedia definition: "Complexity is generally used to characterize something with many parts where those parts interact with each other in multiple ways."
The following figure shows profit in relation to the number of business options, treated here as complexity.
Profit is related to customer emotion about the number of choices. More choices mean more emotion, both positive and negative, and we are looking for the maximum of their sum.
We can say, then, that complexity is optimal when profit is maximised. There is no exact algorithm for the relation between the number of choices and the level of profit, since profitability has many aspects. Still, this categorisation confirms that we have to find the best level of complexity, instead of trying to cut it back to the minimum.
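To make the idea concrete, here is a toy sketch of the profit curve described above. The curve shapes and numbers are my own illustrative assumptions, not measured data; the only point is that the sum of a growing positive effect and a growing negative effect peaks somewhere in between.

```python
# Illustrative only: hypothetical emotion curves, not measured data.
import numpy as np

options = np.arange(1, 51)                # number of business options offered
positive = 100 * np.log(options + 1)      # assumed: satisfaction grows but flattens
negative = -5.0 * options                 # assumed: confusion/handling cost grows linearly
profit = positive + negative              # profit proxy = sum of both effects

print("profit proxy peaks at about", options[np.argmax(profit)], "options")
```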
Anyway, this categorisation only becomes useful once we find a good way to measure complexity!
Cynefin
Cynefin is an interesting concept for classifying situations, helping decision makers find the right decisions. It defines five domains: simple, complicated, complex, chaotic, and a centre of disorder.
Cynefin defines characteristics, the leader's job, danger signals, and responses to those danger signals for all the domains except disorder; it simply states that you are in the disorder domain if you cannot identify which domain you are in. The next table summarizes the essence of the domains.
Beyond the Wikipedia article linked above, I suggest reading this great Harvard Business Review article for all the details! The only limitation of the concept is that it is still far from engineer-ready: there is no exact measure we could monitor. Let's take a step forward!
Counting
People have the ambition to control and measure things; with engineers this is 100% certain. In complex situations, after the first panic, people start to count things: the number of applications, the number of products, the number of people in the organisation, the number of processes, and so on. In some cases it helps, since more applications should mean more complexity. The pity is that this count tells us nothing about the complexity itself. The best example is when you try to optimise this number and start retiring applications to become less complex. The typical issues are:
- Retiring one application does not cost the same as retiring another. What is the reason? The size of the application? And how do we measure size? That is another long story... The number of interfaces? That is yet another count, with the same problems around size, number of fields, volume of data, and so on. A never-ending story.
- If you do not fail and do retire a given application, you can decide to drop its functions, but in the typical case you migrate them to other applications. And if you migrate them, what happens to those applications? No change at all, or do they become larger? The same question as before: how do we measure size?
Take another example. Assume one enterprise has 100 application servers serving similar functionality, while another has 1,000 similar ones. Is the second one 10 times more complex than the first? I would say no. Having more similar stuff is a question of scale, not of complexity. Managing a 10 times larger amount needs 10 times the managing resources, whether those are operating people or automated processes.
The two examples show linear and non-linear dependencies. The linear case is something we have great practices to handle; we call it sizing. The non-linear situations are the ones that feel chaotic. Since they are the crucial part of our life, we have to do something with them, and in what follows you will see the solution.
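A quick numerical contrast of the two cases, as a sketch. The exponent used for the non-linear case is the one introduced in the next section; here it only serves to show how differently the two kinds of growth behave.

```python
# Linear case: 10x more similar servers ~ 10x more operating effort.
servers_small, servers_large = 100, 1000
print("effort factor:", servers_large / servers_small)            # 10.0

# Non-linear case: complexity of interacting functions grows much faster than their count.
EXPONENT = 3.11   # derived in the Measurement section below
functions_small, functions_large = 10, 20
print("complexity factor:",
      round(functions_large ** EXPONENT / functions_small ** EXPONENT, 1))  # ~8.6, not 2
```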
Measurement
Without scientific measures, these decisions are necessarily subjective, resulting in many arguments but few definite answers. After some trials of the approaches above (and I don't even want to mention dead-ends like function point calculation for software complexity) I found a solution that can drive architectural decisions. The methodology was invented by Roger Sessions and is called Simple Iterative Partitions (SIP). The measurement you will see below is the foundation of SIP. Roger is a great guy who wrote interesting articles and books about IT complexity; a must-read is The Mathematics of IT Simplification. Roger has already retired but is still there to help. You can reach him via objectwatch.com.
SIP and its complexity measurement are a simple, clear, and objective approach to calculating the complexity of architectural options. The calculations have the following important characteristics:
- Precise and independent of the person making the measurements.
- Definitive and easily validated.
- Easy to determine and require no tools.
- Can be calculated early in the project lifecycle.
I hope you are eagerly awaiting the details after this long introduction. In simple words (how else!), we collect the number of functionalities in "boxes" and the dependencies/interfaces between them, and substitute those numbers into the following equation:
C(M, N) = M^3.11 + N^3.11
The term calculated from the number of functionalities (M) is called functional complexity, and the other, calculated from the number of dependencies (N), is called coordination complexity. The sum of these two terms gives the complexity count (or, as Roger calls it, the SCU - Standard Complexity Unit) of the "box". The box is usually an application or one of its functional modules, depending on the level at which you want to see the results; in my experience the functional module level works best. The following figure shows a basic example, where the boxes represent functional modules, the blue discs are functions and the arrows are the dependencies.
In this example, all modules have the same functional complexity; the difference is in the coordination complexity.
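A minimal sketch of the calculation in code. The helper name `scu` is mine, and since the figure is not reproduced here, the dependency counts below are placeholders; only the four functions per box come from the example above.

```python
EXPONENT = 3.11  # Glass's exponent; its origin is explained below

def scu(functions: int, dependencies: int) -> float:
    """C(M, N) = M^3.11 + N^3.11 for one box (an application or functional module)."""
    return functions ** EXPONENT + dependencies ** EXPONENT

# Each box in the figure has 4 functions; they differ only in their dependencies.
for deps in (0, 1, 3):   # placeholder dependency counts
    print(f"4 functions, {deps} dependencies -> SCU = {scu(4, deps):.0f}")
```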
Let's jump into the details and describe the assumptions and ideas behind it!
We all agree that the relationship between the number of capabilities or dependencies and the complexity is not linear. On the one hand this is a gut feeling, but it is also confirmed by real-life experience. The question, then, is the exponent. Roger suggests using Robert Glass's number, which is derived from the observation that 1.25 times more functions doubles the complexity. For more details, please read the book Facts and Fallacies of Software Engineering by Robert L. Glass.
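For completeness, the exponent follows directly from that observation: if 1.25 times the functions means double the complexity, the exponent is the solution of 1.25^x = 2.

```python
import math

# 1.25^x = 2  ->  x = ln(2) / ln(1.25)
exponent = math.log(2) / math.log(1.25)
print(round(exponent, 2))   # 3.11
```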
Now we have arrived at the first real challenge: how to collect the functionalities and the dependencies. To feel the size of the challenge, we have to define what functionality and dependency really mean.
Functionality: an atomic function of a given application, so that valid numbers go into the equation.
Dependency: the connections between applications, commonly called interfaces.
Dependency is the simpler topic, since more or less everyone knows the interfaces between applications, while linking those interfaces to functional modules can be a bit more challenging.
Turning to functionality, we face the real issues, since you have to ensure that all the functionalities you or your team collect about the applications are atomic. Let's see an example: assume you have a webshop with a checkout function at the end of an order; you may record checkout as an atomic function. This is fine if the actions running behind it (user identification, invoicing details, delivery setup, payment) are not used separately - if they are, you have to record those underlying actions instead and must not record checkout on top of them!
The other challenge, staying with the webshop example, is when checkout is atomic for web orders but not in the in-shop order handling application. You as the architect have to be aware of this and split webshop checkout into the same atomic parts as in the shop order application! This is crucial for getting comparable SCUs across all applications!
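To show how much the recorded granularity matters, here is a hypothetical comparison for the webshop module. The count of six "other" functions is an assumption for illustration; only the four checkout actions come from the example above.

```python
EXPONENT = 3.11

def functional_complexity(function_count: int) -> float:
    return function_count ** EXPONENT

# Assume the webshop module has 6 other atomic functions besides checkout.
coarse = functional_complexity(6 + 1)   # checkout recorded as one function
atomic = functional_complexity(6 + 4)   # user identification, invoicing details,
                                        # delivery setup, payment recorded separately
print(f"coarse: {coarse:.0f} SCU, atomic: {atomic:.0f} SCU")
```

The same set of business actions yields very different numbers depending on how atomically they are recorded, which is exactly why consistent granularity matters.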
Keeping every function at the same granularity across the scope of the complexity calculation requires strict documentation discipline, the right principles for identifying atomic functions, and, last but not least, quality assurance from the EA team.
Project comparison
Projects can be analysed from two aspects of the complexity perspective: how large they are and how much they change the complexity of the enterprise.
The first aspect is based on the number of items touched by the project: the sum of new, changed, and retired functionalities or connections. With this you can calculate how large a project is. If you have baseline data about the affected applications, the number of changes must be applied on top of that (see the example below). If the baseline is missing, the calculation is still useful, but you have to know that the result carries a strong assumption: that the changes weigh the same on small and large systems!
A missing baseline is normal if you are implementing brand new things in a greenfield project. If, in a brownfield case, you decide not to use a baseline, the calculation is still called the "greenfield approach".
Look back at the figure with the blue discs above! The situation is greenfield, introducing "boxes" with four functions each, so their functional complexity is 75. That is the result of the greenfield calculation even if the box on the left has 10 other, unaffected functions and the box on the right has another 5. The brownfield calculation, using the baseline, gives a different result: (10+4)^3.11 on the left and (5+4)^3.11 on the right, which is 3668 and 928! That is a real difference compared to 75, and it is the reason to consider carefully when to use the greenfield approach!
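The numbers above can be reproduced with the same kind of helper as before; only the baseline figures (10 and 5 untouched functions) and the 4 new functions come from the example.

```python
EXPONENT = 3.11

def functional_complexity(n: int) -> float:
    return n ** EXPONENT

new_functions = 4
print(f"greenfield:       {functional_complexity(new_functions):.0f}")        # ~75 for both boxes
print(f"brownfield left:  {functional_complexity(10 + new_functions):.0f}")   # ~3668
print(f"brownfield right: {functional_complexity(5 + new_functions):.0f}")    # ~928
```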
Evaluate the whole enterprise
The main reason for evaluating all the applications in your enterprise is to have a good baseline for project calculations; it also lets you compare the complexity of applications and start architecture simplification programmes.
The baseline challenge
By now you feel that collecting all the atomic functions (and interfaces) of a large enterprise borders on the impossible. It strongly assumes a good architecture repository where you can store all the information, both the input of the calculation and its results, but that part is fine: you probably have a good repository, or you can introduce a suitable tool. But collecting every atomic function... It seems we have reached a dead-end with an unsolvable issue, but luckily we have not! We can utilise the classification approaches, or we can have experts estimate the number of atomic functions. We really can do this, since the calculation requires only the number of functions, not their descriptions!
I suggest running an evaluation where you take well-documented projects or applications in your architecture, and the calculated complexity becomes the reference point for setting baseline numbers for other applications. Taking the example of the webshop and shop applications, and assuming we know the webshop has 10 functions while the shop application has about double that, you set 20 as the baseline number, and that's all. Once you have defined a baseline, you add it to the number of recorded detailed functions in the course of the calculations.
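A sketch of the baseline mechanics; the 10 and "about the double" figures follow the webshop/shop example, while the project counts and dependency numbers are hypothetical.

```python
EXPONENT = 3.11

def scu(functions: int, dependencies: int) -> float:
    return functions ** EXPONENT + dependencies ** EXPONENT

# Reference application counted from documentation; related ones estimated as multiples.
baselines = {
    "webshop": 10,        # counted from a well-documented application
    "shop":    2 * 10,    # expert guess: "about the double"
}

# Later projects add newly recorded detailed functions on top of the baseline.
recorded_new_functions = {"webshop": 3, "shop": 0}   # hypothetical project result
interfaces = {"webshop": 2, "shop": 1}               # hypothetical dependency counts

for app, base in baselines.items():
    total = base + recorded_new_functions[app]
    print(app, round(scu(total, interfaces[app])))
```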
From then on you can record new functionalities through projects, or reset the baseline numbers periodically, depending on the maturity of the project culture in your company.