Epic Bitter: Cards, metrics and how to deal with it.

Organizing a workflow is not a simple task. Your goal is a system that provides you with valid metrics, works on pull, has a continuous improvement process built in, focuses on quality and also achieves heijunka (even flow).
It is strange to start writing a post about it when we have not yet fully achieved it. We are introducing small changes continuously to ensure that the results we get are the ones we expect. So this is a post about where we want to go, how we plan to get there and what we have done so far.

Before we can analyze the work itself, we need to understand the statuses that a piece of work can have:

Backlog
Working
Blocked
Done

Backlog is the queue where tasks wait until someone can start working on them.
Working is the status in which the team is performing activities related to that card.
Blocked is when an activity is not finished, but the team is waiting for an external party in order to either resume the work or move it to done.
Done is the status in which all the work related to that activity is finished.
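
To make the four statuses concrete, here is a minimal Python sketch of how a card could move between them. The transition table is our reading of the descriptions above, and the names Status, ALLOWED_TRANSITIONS and can_move are made up for this illustration, not part of any tool.

```python
from enum import Enum


class Status(Enum):
    BACKLOG = "Backlog"
    WORKING = "Working"
    BLOCKED = "Blocked"
    DONE = "Done"


# Assumed transition table: which statuses a card may move to next.
# Blocked can go back to Working or straight to Done, as described above.
ALLOWED_TRANSITIONS = {
    Status.BACKLOG: {Status.WORKING},
    Status.WORKING: {Status.BLOCKED, Status.DONE},
    Status.BLOCKED: {Status.WORKING, Status.DONE},
    Status.DONE: set(),
}


def can_move(current: Status, target: Status) -> bool:
    """Return True if a card may move from `current` to `target`."""
    return target in ALLOWED_TRANSITIONS[current]
```

For example, can_move(Status.BACKLOG, Status.DONE) is False in this sketch, because a card has to pass through Working first.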

It is truly important to understand what kind of work you and your team do. Different types of work require different approaches; once you know the types, you can define more assertive processes for how to perform each one and especially for how they relate to each other. In our infrastructure/support work, we ended up with three classifications for work:

Fix
BAU - Business as usual.
Improvement

Fix covers any system that was working and is no longer functional.
This type of activity never goes into the backlog; it skips directly into working. This means that whenever a Fix becomes necessary, the resource required to deal with it needs to be freed, and whatever work that resource was performing goes into the blocked state with a special flag that identifies why the task is currently blocked.
Because a Fix involves a disrupted service, the service needs to be restored as quickly as possible. Once the service is resumed, the card can be moved to done, but an additional step is necessary: every Fix occurrence generates a new card in the Improvement Backlog, and this card needs to be tagged as a Hi-Priority card.

The reason for this behavior is to ensure that the root cause behind that Fix does not generate additional Fix requests in the future.
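
The Fix rules can be sketched in a few lines of Python. This is only an illustration under our own assumptions: the Card and Board classes, their field names and the "Preempted by Fix" flag text are hypothetical, not taken from a real Kanban tool.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Card:
    title: str
    kind: str                              # "Fix", "BAU" or "Improvement"
    status: str = "Backlog"
    owner: Optional[str] = None
    high_priority: bool = False
    blocked_reason: Optional[str] = None   # the "special flag" for blocked cards


@dataclass
class Board:
    cards: List[Card] = field(default_factory=list)

    def raise_fix(self, title: str, owner: str) -> Card:
        """Start a Fix immediately, blocking whatever `owner` was working on."""
        for card in self.cards:
            if card.owner == owner and card.status == "Working":
                card.status = "Blocked"
                card.blocked_reason = "Preempted by Fix"
        # A Fix never waits in the backlog: it is created directly in Working.
        fix = Card(title, "Fix", status="Working", owner=owner)
        self.cards.append(fix)
        return fix

    def close_fix(self, fix: Card) -> Card:
        """Move the Fix to Done and queue a hi-priority root-cause Improvement."""
        fix.status = "Done"
        follow_up = Card(f"Root cause: {fix.title}", "Improvement", high_priority=True)
        self.cards.append(follow_up)
        return follow_up
```

The follow-up card lands in the backlog with high_priority set, so it shows up early in the next improvement election.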

BAU, or Business as Usual, is the daily activity. It can include purchases, support activities, housekeeping, etc. This type of activity should always be driven by an existing process. Obviously, when you start adopting this methodology, processes for most activities will not exist. In that case, the card is solved in the best possible way and a request for a new process is placed on the Improvement Backlog. If there is a process in place, the card is solved by following it. If the person executing the process detects room for improvement in that specific process, a new card describing the improvement is created in the Improvement Backlog. This is extremely important to ensure that continuous improvement, or Kaizen, is in place.
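
As a rough sketch of that decision, assuming processes are kept in a simple dictionary keyed by activity name (the function handle_bau and all of its parameters are hypothetical names for this example):

```python
from typing import Dict, List, Optional


def handle_bau(
    activity: str,
    processes: Dict[str, str],
    improvement_backlog: List[str],
    suggestion: Optional[str] = None,
) -> str:
    """Resolve one BAU card and return a note on how it was handled."""
    if activity not in processes:
        # No process yet: solve it as well as possible and request a process.
        improvement_backlog.append(f"Define a process for: {activity}")
        return "solved ad hoc"
    if suggestion is not None:
        # The person executing the process spotted room for improvement.
        improvement_backlog.append(f"Improve process for {activity}: {suggestion}")
    return f"solved following: {processes[activity]}"
```

Either way the BAU card itself gets solved; the only difference is whether a new Improvement card is left behind.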

Another fundamental point about BAU is how much Work in Progress (WIP) you allow.
Without a WIP limit, the work in progress can exceed the amount your team can optimally perform. Overwork directly affects the quality of the solutions and the ability to finish activities. The amount of work that can be allowed as WIP varies from project to project, and it is not a topic for discussion in this post. We currently allow only one card in the working column per team member.
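
A pull check for our one-card-per-member rule could look like the sketch below. The function name can_pull and the (card, owner) pair layout are assumptions made for the example.

```python
from collections import Counter
from typing import Iterable, Tuple

WIP_LIMIT_PER_MEMBER = 1  # the limit we currently use; adjust per team


def can_pull(
    member: str,
    working: Iterable[Tuple[str, str]],
    limit: int = WIP_LIMIT_PER_MEMBER,
) -> bool:
    """Return True if `member` may pull a new card into Working.

    `working` holds (card_title, owner) pairs currently in the Working column.
    """
    in_progress = Counter(owner for _, owner in working)
    return in_progress[member] < limit
```

A team member who already owns a card in Working has to finish or block it before pulling the next one.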

Improvement is your key element for quality. The amount of improvement you have performed in your environment is directly related to the maturity level of your area. Improvements are, as the name says, changes whose goal is to turn a current behavior into a more optimal one. Unlike BAU and Fix, Improvements are based on a cycle time, which means that they always happen following a pre-defined cycle. Improvement cycles always follow a PDCA (Plan, Do, Check, Act) model. In the Plan phase, an inception meeting is held with all the team members to discuss the work. The goal here is a deep analysis of the problem that generates an action plan and acceptance criteria. Once the plan and the acceptance criteria are defined, we move to the next step (Do), which is the implementation according to the plan. After the implementation, it is necessary to wait for the change to converge. In this phase (Check) we collect feedback that allows us to evaluate whether the change has met the acceptance criteria defined in the first phase. Feedback can come in different forms and from different sources, such as people, monitoring systems, logs, etc.
If the acceptance criteria were not met, a new cycle is started and a new inception meeting is required to revisit the change (or sometimes the acceptance criteria). If the acceptance criteria are met, the new process is accepted and documented (Act). The card is moved to done, and a new election process determines which improvement from the backlog is selected for the next cycle. The election process must obey this order:

Ticket cannot have prerequisites that are not met.
Ticket is marked as hi-priority.
Impact that the improvement will have on the BAU.
Other improvements.
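
One way to read the election order above is as a filter followed by a sort. The sketch below is our interpretation, not a fixed rule: the Improvement class, the numeric bau_impact score and the function elect_next are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class Improvement:
    title: str
    high_priority: bool = False
    bau_impact: int = 0   # assumed score: higher means more relief for BAU
    prerequisites: Set[str] = field(default_factory=set)


def elect_next(backlog: List[Improvement], done: Set[str]) -> Optional[Improvement]:
    """Pick the improvement for the next cycle, or None if nothing is eligible."""
    # Rule 1: discard cards whose prerequisites are not all done.
    eligible = [card for card in backlog if card.prerequisites <= done]
    if not eligible:
        return None
    # Rules 2 to 4: hi-priority first, then larger BAU impact.
    # max() returns the first best card, so ties keep the backlog order.
    return max(eligible, key=lambda card: (card.high_priority, card.bau_impact))
```

A hi-priority card with an unmet prerequisite is skipped until that prerequisite is done, which matches the first rule taking precedence over the second.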

The number of improvement cycles that can run at once is directly related to your team's availability and the size of your BAU backlog. If your BAU backlog is under control and you have the resources available, there is no reason not to run more than one cycle. This balance will change from team to team, and it is not the objective of this post to discuss it.

Improvements can have a drastic effect on the number of Fix and BAU cards you will have in the future.
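
To tie the improvement cycle together, here is a small sketch of the PDCA loop described earlier. It assumes the inception meeting can be modelled as a function that returns an action and an acceptance check; the name run_pdca and the max_cycles cap are our own additions for the example, not something the process requires.

```python
from typing import Callable, Tuple

# plan() stands for the inception meeting: it returns (action, acceptance_check).
Plan = Callable[[], Tuple[Callable[[], None], Callable[[], bool]]]


def run_pdca(plan: Plan, max_cycles: int = 3) -> int:
    """Run PDCA cycles until the acceptance criteria are met.

    Returns the number of cycles used; raises if `max_cycles` is exhausted.
    """
    for cycle in range(1, max_cycles + 1):
        action, accepted = plan()   # Plan: action plan plus acceptance criteria
        action()                    # Do: implement according to the plan
        if accepted():              # Check: evaluate the collected feedback
            return cycle            # Act: accept and document the new process
        # Criteria not met: loop back to a new inception meeting.
    raise RuntimeError("acceptance criteria not met within max_cycles")
```

In real life the Check step waits for the change to converge before evaluating feedback; here it is reduced to a single function call.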

A big advantage of using a Kanban wall is the ability to get metrics. Metrics will show you many elements around your team's work and can be an excellent base for decisions.
Some useful metrics that are normally extracted from a Kanban system are:

Lead and Cycle time - These metrics represent the average time a task takes: lead time counts from the moment a card is requested until it is done, and cycle time counts from the moment work starts on it. They are normally compared with the project takt time in order to predict performance and allow corrections. They are also crucial for creating continuous flow or heijunka.
Cards solved per week - Similar to your cycle time, it gives you an idea of the number of tickets your team solves in an average week.
Backlog growth - This metric is directly associated with your demand. It is the number of new cards (work) that your clients demand every week. It is normally compared with the cards solved per week in order to determine how well your team keeps up with its demand.
Work in Progress - Measures the amount of work that is being executed at a specific moment. It is usually compared with cycle time in order to predict delivery time and to feed other CFD (cumulative flow diagram) information.
Blocked time - Can be used to identify improvement opportunities in processes that require interaction with third parties.
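
If your wall records when each card was created, started and finished, most of these metrics are a few lines of Python. The sketch below assumes a simple CardLog record with those three dates; the record and the function names are hypothetical, and the last function applies Little's Law (average cycle time = average WIP / average throughput) to estimate delivery time.

```python
from datetime import date
from statistics import mean
from typing import List, NamedTuple, Optional


class CardLog(NamedTuple):
    created: date              # card entered the backlog
    started: Optional[date]    # card moved to Working
    finished: Optional[date]   # card moved to Done


def average_lead_time(cards: List[CardLog]) -> float:
    """Mean days from creation to Done, over finished cards only."""
    return mean((c.finished - c.created).days for c in cards if c.finished)


def average_cycle_time(cards: List[CardLog]) -> float:
    """Mean days from start of work to Done, over finished cards only."""
    return mean(
        (c.finished - c.started).days for c in cards if c.started and c.finished
    )


def cards_created(cards: List[CardLog], first_day: date, last_day: date) -> int:
    """Demand: cards created between first_day and last_day, inclusive."""
    return sum(1 for c in cards if first_day <= c.created <= last_day)


def cards_solved(cards: List[CardLog], first_day: date, last_day: date) -> int:
    """Throughput: cards finished between first_day and last_day, inclusive."""
    return sum(
        1 for c in cards if c.finished and first_day <= c.finished <= last_day
    )


def expected_cycle_time_weeks(wip: int, solved_per_week: float) -> float:
    """Little's Law estimate: average WIP divided by average weekly throughput."""
    return wip / solved_per_week
```

Comparing cards_created with cards_solved for the same week tells you whether the backlog is growing or shrinking.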

Friday, January 23, 2015
