In the previous blog post in this series, we found that the Value Stream Mapping (VSM) calculation of real Flow Efficiency and the calculation of flow done for an Agile team (in one sprint) look treacherously similar.
Here are the formulas again [1].
In Value Stream Mapping (VSM), the Flow Efficiency (AR) is calculated on the Critical Path. The Critical Path has been generated iteratively and in a transdisciplinary way with the project team and its sub-teams, common experts, stakeholders, centralised resources etc., i.e. also taking resource constraints for the total organisation into account. The formula, where PT is process time and LT is lead time, looks like this:
Flow Efficiency = Total PT on Critical Path / Total LT on Critical Path
and for Agile development [2]:
Process Cycle Efficiency* = Value Added Time / Total Lead time
One difference is that “Critical Path” is not stated in the latter. What does that detail mean, given that the Critical Path is in focus when removing waste to achieve a better Flow Efficiency? And does Lead time have the same meaning in the two formulas?
But that is actually rather uninteresting, since the first thing we always need to ask ourselves is why we have a problem, or why we need to introduce a method (below: Q – Question, A – Answer):
Q: Why do we need to do Process Cycle Efficiency measurements in Agile software development for every team in every sprint at all?
A: Because there are problems with the Flow Efficiency.
Q: Why do we have problems with the Flow Efficiency?
A: Because activities run out of team members to take care of them, or activities are blocked by other teams or SMEs, which creates queues of activities inside the process, i.e. as shown on the teams’ Kanban boards.
Q: Why do we get queues?
We have already found the queue symptom and its root causes, and it looks like this (see the blog post Finding the root causes to queues – The Prefilled Problem Picture Analysis Map starts to be filled in, for the details).
So, after asking why multiple times, we have found two of our root causes in the Prefilled Problem Picture Analysis Map, the ones that are connected to queue problems. Great!
And we can now answer the question from the last blog post, why Agile teams and teams of teams are never part of a VSM calculation: because it is not possible. There is no plan and no Critical Path to calculate on, only in hindsight. This hindsight time plan for sprints, which we can do a VSM measurement on, will be elaborated on in a later blog post.
As we stated before, queue problems are only a symptom of the root causes above. This means that Process Cycle Efficiency measurements are anti-systemic**: they do not take care of the system, only of a symptom, meaning they will lead to sub-optimisation of the organisation. And that is not good.
What is not good either is that the Process Cycle Efficiency measurements treacherously force the teams to work sequentially, since sequential work will give the highest value***.
Remember also that queues of activities also contain potential problems, which stay hidden if we do not look into a queued activity. Putting WIP Limits on the queues makes us not even consider the activities outside. The risks with introducing WIP Limits are twofold: 1) we do not take care of the problems in activities that are queued within the WIP Limit but not yet handled, which indirectly means that we are queueing our own problems, and 2) we do not know about the problems in activities “outside” that are blocked by the WIP Limit, see this blog post for an elaboration. Hiding problems is very far from Toyota’s thinking of Continuous Improvement****. In Toyota’s production there are no queues at all, only buffers between the processes, called WIP Inventories, which unfortunately have been misunderstood and converted to WIP Limits.
One more very important thing to remember is that an activity that cannot be executed must always be put on whatever is blocking it (internal team competence, external team, SME, tool, etc.), since it must be clear where the bottleneck is. If there are only activities in queues in the respective columns on the teams’ respective Kanban boards, the real root causes will not be solved. One way of doing it is to put a black sticker on the blocking part, to make clear who the owner of the problem is. A red sticker on the team’s internal Kanban board states that the team is blocked by external resources, which many times means that better planning is needed between the teams/SMEs/tools/locations/etc. An orange sticker states internal blocking, where T-shaping or twofold-I-shaping most probably are appropriate solutions. This is elaborated on in more detail in this blog post, where five different scenarios for having queues in Agile due to WIP Limits are shown.
So, instead of Process Cycle Efficiency calculations, all effort must be put on System Collaboration: taking care of our system and solving the two root causes above, in order to get rid of the queue problems. This means in short:
- Planning within the team and between the teams, together with a continuously updated Interdependency Board, is a must in order to have control of the interdependencies between the teams and to the experts, both short-term and long-term.
- Ensure that enough T-shaping is available over time within each team, between the teams and cross-functionally over all teams.
This will for sure remove or shorten the Mean queues, which have been elaborated on in this earlier blog post, so that the team of teams speeds up and makes the total delivery on time, or hopefully even faster. That is to act systemically, and a faster total delivery is a really good measure, since it is made on the whole.
Tomorrow it is time to wrap up. C u then.
*And once again note that for knowledge work, all teams have interdependencies between them including their use of common experts. That means that if we are going to go into the team and its process over a sprint time box, we are really in deep water, with high risk of sub-optimising, since all teams are interwoven.
**sub-optimising the system, see [3] for a film with Prof. Russell Ackoff talking about systems.
***We need to go back to VSM and the Critical Path. In production it is very easy to see the Critical Path; normally we only need to pick a part of the consecutive processes in the value flow. In product development we need to find the activities, and their total lead time, that make up the Critical Path. This Critical Path then naturally sets the time box to the next Integration Event, because this is the time it will take. We need to keep track of the Critical Path to avoid delays, or improve its total lead time to make the flow faster.
We can do a VSM on any value stream, no matter how detailed it is. If we do it on the whole project it can look like this. In the examples below we assume no waiting times on the Critical Path, which is reasonable for software development, where there are, for example, no lead times for prototype deliveries.
In this example we have a Project as the only activity on the Critical Path, and the lead time is 2 weeks.
And this project is very successful and delivers after two weeks according to plan. What is also good is that all the team members in the Project have done 100% good work and have encountered few problems. This gives us a 100% value according to both formulas above, because the work time for the team members and the lead time are both two weeks.
If we make this more detailed, it will look like this. The activities shown make up one story, the story with the highest Business Value. This is the story the team needed to put all effort into finishing within the two weeks lead time. It is therefore on the Critical Path, since the team from the start had estimated the time for each of the different steps (Analyse, Code, Integrate & Test and Build) and it fits nicely into the sprint. And they did it: they delivered the story within 2 weeks, and in hindsight they could put the activities on the time line.
So, only the activities on the Critical Path are shown, and with the formula for VSM’s Flow Efficiency we of course still get 100%, since we do not have any delays between the activities, as mentioned above. All the other activities, which are not shown in the picture, either add up to the Critical Path or not. And if the team members can do all activities within the Project with only small waiting times, the team members are also highly efficient, i.e. we have a high Resource efficiency. We cannot be happier.
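The 100% result for the story on the Critical Path can be reproduced with a small sketch of the VSM formula. Only the four step names and the two-week total (taken here as 10 working days) come from the example. The individual step durations, the 2-day wait in the second case and the name `flow_efficiency` are assumptions made purely for illustration.

```python
# Hypothetical step durations in working days. Only the step names and
# the 10-day (two-week) total come from the example in the text.
critical_path = [
    # (step, process time, waiting time before the step)
    ("Analyse", 2.0, 0.0),
    ("Code", 4.0, 0.0),
    ("Integrate & Test", 3.0, 0.0),
    ("Build", 1.0, 0.0),
]


def flow_efficiency(steps):
    """Total PT on the Critical Path divided by total LT on the Critical Path."""
    total_pt = sum(pt for _, pt, _ in steps)
    # Lead time is process time plus any waiting time on the path.
    total_lt = sum(pt + wait for _, pt, wait in steps)
    return total_pt / total_lt


no_wait = flow_efficiency(critical_path)

# Same steps, but with an assumed 2-day wait before Integrate & Test.
delayed = [
    (name, pt, 2.0 if name == "Integrate & Test" else wait)
    for name, pt, wait in critical_path
]
with_wait = flow_efficiency(delayed)

print(f"no waiting on the Critical Path: {no_wait:.0%}")
print(f"2-day wait on the Critical Path: {with_wait:.0%}")
```

With no waiting, process time and lead time are both 10 days and the ratio is 100%. Only a wait on the Critical Path itself lowers the value (10 / 12 in the second case); activities outside the path never enter the calculation.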
But here comes the difference between the formulas, because Agile development’s calculation of flow cannot calculate any value if we do not also show the other activities above. And this even though the other activities are not on the Critical Path and therefore are totally uninteresting for a real Flow Efficiency calculation.
So, here we have the difference between the original Flow Efficiency calculations done in Value Stream Mapping that is made on the Critical Path, and Agile development’s calculation of flow, which means they are really not the same.
Luckily enough, we have the details for the other stories (of course in hindsight for an Agile team, since it normally uses a Kanban board and does not plan the activities first), and here they are put in the time box.
100% Wow!
But, now things get incomprehensible. Because if it would look like this instead…
… we would have a lower Process Cycle Efficiency value, even though the lead time actually is shorter for the two lower stories.
And there is really no difference between the two examples with all activities shown in the time plan, because in both cases the team members cannot work faster; they have been efficient in the way they apply their HOW. Actually, as already stated, the lower one, with the lower value, is more efficient, since the activities for the other two stories have shorter lead time. And we like to shorten the lead time, right?
In the first example the two available team members decided to do one story each, but in the second example they decided to work together on the two middle activities, as in pair programming, which in this case was more efficient. And HOW the team works is only up to the team. We cannot force them to work sequentially.
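The effect in the two examples above can be sketched with made-up numbers. The day values, the story names and the assumption that pairing makes the two middle activities more than twice as fast are all hypothetical; they are not taken from the pictures. The per-story calculation (worked time divided by the time from first start to last finish) is one plausible reading of the Process Cycle Efficiency formula, not necessarily how [2] computes it.

```python
def process_cycle_efficiency(work_intervals):
    """Value Added Time / Total Lead Time for one story.

    work_intervals is a list of (start_day, end_day) periods in which
    somebody actively works on the story. Gaps between the periods are
    the time the story is paused.
    """
    value_added = sum(end - start for start, end in work_intervals)
    first_start = min(start for start, _ in work_intervals)
    last_end = max(end for _, end in work_intervals)
    lead_time = last_end - first_start
    return value_added / lead_time, lead_time


# Example 1 (hypothetical): one team member per story, no pauses.
# Steps per story: Analyse, Code, Integrate & Test, Build.
solo = {
    "Story B": [(0, 2), (2, 5), (5, 7), (7, 8)],
    "Story C": [(0, 2), (2, 5), (5, 7), (7, 8)],
}

# Example 2 (hypothetical): the two members pair on Code and on
# Integrate & Test, one story at a time, so the other story is paused.
paired = {
    "Story B": [(0, 2), (2, 3), (4, 4.5), (5, 6)],
    "Story C": [(0, 2), (3, 4), (4.5, 5), (5, 6)],
}

solo_result = {name: process_cycle_efficiency(iv) for name, iv in solo.items()}
paired_result = {name: process_cycle_efficiency(iv) for name, iv in paired.items()}

for label, result in (("one story each", solo_result), ("pairing", paired_result)):
    for name, (pce, lead_time) in result.items():
        print(f"{label:15} {name}: PCE {pce:.0%}, lead time {lead_time} days")
```

Under these assumptions, working one story each gives 100% with a lead time of 8 days per story, while pairing gives 75% with a lead time of 6 days. The value goes down only because each story waits while the pair works on the other one, even though both stories finish two days earlier.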
One big unintended consequence of Agile development’s calculation of flow is that a big but hidden constraint is put on the team. The reason is that the team is forced to work sequentially in order to get higher values, since no started activity can ever be paused. A paused activity always gets punished in the calculations, even if there is no available team member to take care of it. And that is to put a hidden constraint on HOW the team should perform its work. It means that if the team chooses to analyse the stories together and then continues with only one or two of them, the value goes down, because the rest of the stories have been started but are now paused. But is it a problem for real Flow Efficiency that activities are paused? Not always. A paused activity is only a real Flow Efficiency problem when it starts to affect the Critical Path, so until then it can be paused. And if the team members are never inactive and finish all the activities within the time box, the team members have been highly efficient. We cannot wish for more.
And the Lead time in the two formulas is really not the same either. In VSM’s Flow Efficiency calculation, the Critical Path sets the time box, and this is the one to secure for the Project, or to try to shorten in a Value Stream Mapping event. But in Agile development’s calculation of flow, the Lead time is not the sprint box; it is instead the length of each story, so it can for example be one week for the most prioritised story. But if Flow Efficiency was so important, why on earth was the sprint box not set to one week then?
The conclusions are:
- Agile development’s calculation of flow is obscure, and what it is really measuring is not clear. What is clear is that it indirectly forces the team to work sequentially, i.e. it is a hidden constraint that affects the team’s own HOW. And as you have already understood, in the example where the other stories take longer time, the team’s way of working has treacherously been affected to instead work sequentially.
- If an Agile team is successful as above, a low value from Agile development’s calculation of flow for a story, only shows that if we wanted to finish that story earlier, we should directly have put in more people on the queued activities. But, since the teams have fixed size, that is probably not an option.
- And as we already know, measurements on parts are easily sub-optimising.
- And as we already also know, measurements can always be gamed as well.
****Remember that on the whole only continual improvements can be done, due to the synchronisation needed between the parts when the respective parts’ improvements are executed at the same time, for example a takt time change. The parts themselves can (and must) be continuously improved, as long as they do not affect another part.
References
[1] Martin, Karen at Karen Martin Group, Inc. Link copied 2018-10-04.
https://www.youtube.com/watch?v=5YJYMLaV9Uw
[2] Sutherland, Jeff, et al. “Process Efficiency – Adapting Flow to the Agile Improvement Effort”. Link copied 2018-10-14.
https://www.researchgate.net/publication/327821851_Process_Efficiency_-_Adapting_Flow_to_the_Agile_Improvement_Effort
[3] Ackoff, Russell L. Systems-Based Improvement, Pt 1. Link copied 2018-10-27.
https://www.youtube.com/watch?v=_pcuzRq-rDU