Thursday, January 29, 2015

What users want?

One big challenge when dealing with user support is time management. Years ago I read a book called Time Management for System Administrators by Thomas Limoncelli. It was a great book at the time because it opened my eyes to a concept of huge importance that I only came to fully understand when I started to study TPS and Lean.
In this book the author insisted on the idea that it is the system administrator's responsibility to talk to users and gather their requests. He called it The Check-in-with-Customers Walk-around, and his idea was inspired by a coworker who used to visit and talk with all users every day.

When I first started to use this idea I was the only tech support person in an office of around 60 people. After a few weeks, the first thing I noticed was the amount of time that I had freed. I was really surprised by that, so I started to wonder: how could I have more free time if I was actually spending more time talking to people?

After I gave it some thought I realized that getting the issues beforehand, when I went to talk with the users, allowed me to better organize my work and priorities. In the end, I had lots of free time to do things like research, improvements and playing video games, while the work was all done and users were happy.

People will only search for solutions and help when they actually need them. It is normal for people to be focused on other issues and only realize that they need your help when it becomes urgent. When it becomes urgent, you have little time to deal with it, and in most cases you will have to handle the user's frustration. By talking to them on a regular basis you reach their needs before they become a problem. This gives you more time to solve the problem. It also allows you to organize your work in a more optimal way, resulting in free time to invest in other activities.

After doing this for years I learned a few tricks that help a lot in gathering the right information from users. For instance, if you just walk by and ask "is everything ok?", the reply will most likely be a nod and people will say "it is all good". The reason is that they are not focused on your question. You need to break their focus on what they are doing and force them to focus on your question. When I visit tables nowadays, I first ask how everything is. Most times people will say it is fine. Then I start to get specific with my questions. How is the internet? We upgraded the link, did you notice a difference? How are the meeting rooms? How is your laptop? Does your project need anything? Usually after a few of these questions, something will surface. The questions induce them to really think about it, making it easier to remember what they need. I take notes on every request and they all go to my backlog. This is the main way we collect requirements.


After doing this consistently for some time, users will get used to it. Many times I saw them organizing their needs to report them during the walks that we perform in the office. It also discourages them from searching for a solution by themselves. In a classical ticket system, where users need to send emails or call a service desk, it is not unusual for users to try to solve the issues by themselves instead of going through the bureaucratic process of opening a ticket. This can have many negative impacts on our work:
  • Sometimes user solutions ignore important aspects such as cost, security, monitoring and availability.
  • You will never know for sure the real demand you have, since some problems are solved by users.
  • Many times those solutions do not follow standards, but the users will still ask you to help with their solution whenever they have an issue with it.
  • They are spending time on an activity that is not their focus. This could have business impact on the project they are working on.
Another element behind this practice is the fact that it brings the clients and the support team closer together. Email addresses and phone numbers are too distant, and many times the user would like to have a face associated with the issue. It is psychological, but most people would rather know the person who is helping them. If they can have the person's contact number, even better. Most times it will not come to a point where people will actually call you, but it works to increase confidence and the sense of security.
Here in our company we started with the walks once a week and now we are increasing to twice a week (Tuesdays and Fridays).
You can also assign different members of your team to collect the requirements from different teams in your company. This can be used in larger offices and whenever you have remote clients.


It is not important who will solve the issue, but it is important who is responsible for gathering it and keeping track of its status. This can simply be done during the daily standup meeting.


Friday, January 23, 2015

Cards, metrics and how to deal with them.

Organizing workflow is not a simple task. Your goal is to have a system that provides you with valid metrics, works on pull, has a continuous improvement process built in, focuses on quality and also achieves heijunka (even flow). Not a simple task.
It is strange to start writing a post about it when we have not yet totally achieved it. We are applying small changes in a continuous way to ensure that the results we get are the ones we expect. So this is a post about where we want to go, how we plan to get there and what we have done so far.

Before we can proceed with the analysis of the work itself, it is necessary to understand the statuses that work can have:
  • Backlog
  • Working
  • Blocked
  • Done
Backlog is the queue where tasks wait until someone can start to work on them.
Working is the status where the team is performing activities related to that card.
Blocked is when an activity is not finished, but the team is waiting for an external party in order to either resume the work or move it to done.
Done is the status when all the work related to that activity is finished.



It is truly important to understand what kind of work you and your team do. Different types of work require different approaches, and knowing them lets you have more assertive processes for how to perform each type and especially how they relate to each other. In our infrastructure/support work, we ended up with three classifications for work:
  • Fix
  • BAU - Business as usual.
  • Improvement
Fix is any system that was working and is no longer functional.
This type of activity never goes into the backlog, skipping directly into working. This means that whenever a Fix becomes necessary, the resource required to deal with it needs to be freed, and whatever work was being performed by that resource goes into the blocked state with a special flag to identify the reason why the task is currently blocked.
Because a Fix means a disrupted service, the service needs to be restored as quickly as possible. Once the service is resumed, the card can be moved to done, but additional steps become necessary. Every Fix occurrence generates a new card in the Improvement Backlog. This card needs to be tagged as a Hi-Priority card.


The reason for this behavior is to ensure that the root cause of the problem that generated that Fix will not generate additional Fix requests in the future.
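
As a rough illustration of the Fix rules above, here is a minimal Python sketch. The names `Status`, `Card`, `Board`, `raise_fix` and `resolve_fix` are hypothetical and only exist for this example; they are not part of any tool we use. It assumes a card has a single owner and that the "special flag" is just a text reason stored on the blocked card.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class Status(Enum):
    BACKLOG = "backlog"
    WORKING = "working"
    BLOCKED = "blocked"
    DONE = "done"


@dataclass
class Card:
    title: str
    kind: str  # "fix", "bau" or "improvement"
    status: Status = Status.BACKLOG
    owner: Optional[str] = None
    hi_priority: bool = False
    blocked_reason: Optional[str] = None  # the "special flag" (assumed to be free text)


@dataclass
class Board:
    cards: List[Card] = field(default_factory=list)

    def raise_fix(self, title: str, owner: str) -> Card:
        """A Fix skips the backlog and preempts whatever its owner was doing."""
        for card in self.cards:
            if card.owner == owner and card.status is Status.WORKING:
                card.status = Status.BLOCKED
                card.blocked_reason = f"preempted by fix: {title}"
        fix = Card(title, "fix", Status.WORKING, owner)
        self.cards.append(fix)
        return fix

    def resolve_fix(self, fix: Card) -> Card:
        """Closing a Fix always leaves a hi-priority card in the Improvement Backlog."""
        fix.status = Status.DONE
        follow_up = Card(f"Root cause of: {fix.title}", "improvement", hi_priority=True)
        self.cards.append(follow_up)
        return follow_up
```

The sketch does not decide when the preempted card goes back to working; that part is left to the team.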

BAU, or Business as Usual, is the daily activity. It can include purchases, support activities, housekeeping, etc. This type of activity should always be driven by an existing process. Obviously, when starting to adopt this methodology, processes for most activities will not exist. In this case, the card is solved in the best possible way and a new request for a process is placed on the Improvement Backlog. If there is a process in place, the card will be solved following the existing process. If the person executing the process detects that there is room for improvement in that specific process, a new card related to the improvement is created in the Improvement Backlog. This is extremely important to ensure that continuous improvement, or Kaizen, is in place.


Another fundamental point around BAU is how much Work in Progress (WIP) you can allow.
Not limiting WIP can exceed the amount of work your team can optimally perform. The overwork can have direct effects on the quality of the solutions and on the ability to finish activities. The amount of work that can be allowed as WIP varies from project to project, and it is not a subject for discussion in this post. We currently allow only one card in the working column per team member.
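
The one-card-per-member rule can be expressed as a small check before anyone pulls new work. This is only a sketch: `can_pull` and `WIP_LIMIT_PER_MEMBER` are made-up names, and the working column is assumed to be a plain list holding the owner of each card currently in progress.

```python
from collections import Counter

WIP_LIMIT_PER_MEMBER = 1  # the limit we currently use; other teams may pick another value


def can_pull(working_owners, member, limit=WIP_LIMIT_PER_MEMBER):
    """Return True if `member` may pull one more card into the working column.

    `working_owners` lists the owner of every card currently in working.
    """
    return Counter(working_owners)[member] < limit
```

For example, with `working_owners = ["ana"]`, a member named "bob" can pull a card while "ana" cannot until her current card leaves the column.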

Improvement is your key element for quality. The amount of improvement that you have performed in your environment will be directly related to your area's maturity level. Improvements are, as the name says, changes that aim to turn a current behavior into a more optimal one. Unlike BAU and Fix, Improvements are based on a cycle time, which means that they always happen following a pre-defined cycle. Improvement cycles always follow a PDCA (Plan Do Check Act) model. In the Plan phase, an inception meeting is held with all the team members to discuss the work. The goal here is to have a deep analysis of the problem to generate an action plan and acceptance criteria. Once we have a plan and acceptance criteria defined, we move to the next step, which is the implementation according to the plan. After the implementation it is necessary to wait for the change to converge. In this phase we collect feedback that will allow us to evaluate whether the change has met the acceptance criteria that were defined in the first phase. Feedback can come in different forms and from different sources, like people, monitoring systems, logs, etc.
If the acceptance criteria were not met, a new cycle is started and a new inception meeting is required to revisit the change (or sometimes revisit the acceptance criteria). If the acceptance criteria are met, then the new process is accepted and documented. The card is moved to done and a new election process will happen to determine which improvement from the backlog will be selected for the next cycle. The election process must obey an order:
  1. The ticket cannot have prerequisites that are not met.
  2. The ticket is marked as hi-priority.
  3. Impact that the improvement will have on the BAU.
  4. Other improvements.
The number of improvement cycles that can exist is directly related to your team's availability and how big your BAU backlog is. If your BAU backlog is under control and you have the resources available, there is no reason not to have more than one cycle. This balance will change from team to team and it is not the objective of this post to discuss it.
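
The election order above can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the `Improvement` class, the `elect_next` function and especially `bau_impact`, a made-up numeric score where a higher number means a bigger expected effect on BAU. In practice this judgment comes out of discussion, not from a number.

```python
from dataclasses import dataclass


@dataclass
class Improvement:
    title: str
    prerequisites_met: bool = True
    hi_priority: bool = False
    bau_impact: int = 0  # hypothetical score: higher means bigger effect on BAU


def elect_next(backlog):
    """Pick the next improvement card, or None if nothing is eligible."""
    # Rule 1: cards with unmet prerequisites are never elected.
    eligible = [card for card in backlog if card.prerequisites_met]
    if not eligible:
        return None
    # Rules 2 to 4: hi-priority first, then BAU impact, then everything else.
    return max(eligible, key=lambda card: (card.hi_priority, card.bau_impact))
```

Sorting by the tuple `(hi_priority, bau_impact)` means a hi-priority card always wins over a regular one, and impact only breaks ties inside each group.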



Improvements can have a drastic effect on the number of Fix and BAU cards you will have in the future.

A big advantage of using a Kanban wall is the ability to get metrics. Metrics will show you many elements around your team's work and can be an excellent base for decisions.
Some useful metrics that are normally extracted from a Kanban system are:
  • Lead and Cycle time - This metric represents the average time your team spends on every task. It is normally compared with the project takt time in order to predict and allow corrections to performance. It is also crucial for creating continuous flow or heijunka.
  • Cards solved per week - Similar to your cycle time, it gives you an idea of the number of tickets your team solves in an average week.
  • Backlog growth - This metric is directly associated with your demand. It is the number of new cards (work) that your clients demand every week. It is normally compared with the cards solved per week in order to determine how effective your team is relative to its demand.
  • Work in Progress - Measures the amount of work that is being executed at a specific moment. It is usually compared with cycle time in order to predict delivery time and other CFD (cumulative flow diagram) info.
  • Blocked time - Can be used to identify improvement opportunities in processes that require third-party interaction.
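
Two of the metrics above are simple enough to compute by hand from card dates. The sketch below is only an illustration: the function names are made up, cycle time is assumed to be the number of days between a card entering working and reaching done, and the weekly counts are assumed to be collected separately.

```python
from datetime import date
from statistics import mean


def average_cycle_time_days(cards):
    """Mean number of days between start and finish.

    `cards` is an iterable of (started, done) date pairs.
    """
    return mean((done - started).days for started, done in cards)


def weekly_backlog_delta(created_per_week, solved_per_week):
    """New cards minus solved cards, week by week.

    A positive value means demand outpaced the team in that week.
    """
    return [created - solved for created, solved in zip(created_per_week, solved_per_week)]


finished = [
    (date(2015, 1, 5), date(2015, 1, 7)),
    (date(2015, 1, 6), date(2015, 1, 10)),
]
cycle_time = average_cycle_time_days(finished)
delta = weekly_backlog_delta([10, 8], [7, 9])
```

A delta that stays positive for several weeks in a row is a sign that the backlog is growing faster than the team can handle it.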

Tuesday, January 13, 2015

Thank you Ms. Satir - A study on change.

When I decided to bring this blog back from the dead, one of the thoughts that was constantly on my mind was the question "where to start?". What is the most logical point to start with? I decided that a (not so) brief introduction to the knowledge base I would be using in my posts was a good starting place. So I wrote posts about Lean and TPS. Now, having done that, the question visited me again.
Considering that all the tools and techniques discussed in this blog will possibly represent a change, I decided that a good course of action would be to analyze change itself, its impacts and how to handle it in a controlled way.

One of the best studies on change I could find was in the work of Ms. Virginia Satir. Ms. Satir was a social worker and a researcher on family therapy. Amongst other works she is responsible for the creation of the Virginia Satir Change Process Model. In this change model, focused on human and family behavior changes, Virginia identified that change had 5 distinct stages:
  • Old status quo
  • Resistance
  • Chaos
  • Integration
  • New status quo
Although this model was developed with a focus on human behavior and interaction, it has been acknowledged as having great importance for understanding change itself.


The first stage, the Old Status Quo, describes a stable system, where occurrences are predictable, familiar and comfortable. From an organization's perspective, the Old Status Quo represents a stable system, where everyone understands their roles and responsibilities, although this is not necessarily an optimal system. Then a change gets introduced into the system. According to Ms. Satir's analysis, the first reaction to change is Resistance. In an organization this could mean that there is slow adherence to the change. Several reasons could be behind slow compliance with the change, from disagreement to lack of familiarity (it is easier to keep the Old Status Quo). After some time, people will get more used to the idea and attempt to adopt the change. This is where Chaos becomes apparent. The lack of experience with the new scenario will directly impact functionality. While people learn and become familiar with the new standards, it is normal to expect delays, questions, misunderstandings and errors that will show up as a drop in overall performance. Given proper time, questions will get answered, people will become familiar with the new standards and error rates will drop. Ms. Satir calls this period Integration. During this period, performance will start to rise again, until it stabilizes on a New Status Quo, where people understand their new roles in the process and new ways of working are bedded down. The New Status Quo can be positive, neutral or negative, depending on the impact of the change on the organization. A positive change will have a New Status Quo with higher performance than the Old Status Quo. A neutral change will leave performance practically unchanged, while a negative one will have worse performance.
The size of a change will impact the size of the curve proportionally. A large change tends to generate more Chaos (and a bigger performance drop) than a small change. One big risk when introducing changes in the environment is that your performance drops below your minimal acceptable performance (MAP). The MAP is an estimated line in your performance level that represents an impact the business cannot live with. In these cases, it is not unusual for the change to get rolled back to the Old Status Quo.


The larger the change, the more risk it carries of hitting your MAP. One approach used to avoid this issue is to break the change into smaller changes, with smaller impacts on performance. This way, it is still possible to perform the change while keeping the impacts at an acceptable level.


Another great advantage of smaller changes is related to the outcomes. Even if you get a negative change, the shorter cycle time of small changes and the continuous improvement of the process make it easier to review and start a new change.

Lean uses this concept in the form of kaizen and standardization.
Masaaki Imai, in his outstanding book Gemba Kaizen: A Commonsense, Low-Cost Approach to Management, describes this process in a comprehensive way.

"...SDCA (Standardize-Do-Check-Act) standardizes and stabilizes the current process, while PDCA (Plan-Do-Check-Act) improves them. SDCA refers to maintenance and PDCA refers to improvement" - Gemba Kaizen.

When inserting a change, it comes in the form of a standard (Standardize), stabilizing the process. Do refers to implementing the standard. Check refers to determining whether the implementation remains on track and has brought the planned improvement. Act refers to performing and adopting the new procedure to prevent recurrence of the original problem, or to setting the goals for a new improvement in case it does not meet expectations. Then, as part of continuous improvement, a PDCA cycle is called. P refers to planning the improvement, D to implementing it, C to evaluating it and A to adopting it or planning a new cycle.


When applying the cycles to the Virginia Satir Change Model, we use SDCA in the first iteration. If the results are below expectations, then a PDCA cycle can be used to improve the change process. This step can be repeated until we have a satisfactory result from the change. Later, if the team identifies that there is still room for improvement, we repeat every step of the process, starting with a new SDCA and iterating with PDCA until the results match expectations.


To illustrate the use of this process, I will give a real example that happened last year in our Porto Alegre office. One of the critical components of our infrastructure is the virtual conference rooms. This is due to the distributed nature of our company and the fact that the majority of our clients are located in different countries. The meeting room setup that we use for virtual conferences has a flat-screen TV, one Mac mini, one conference microphone and a Full HD webcam. During the day, different groups and projects share the meeting rooms, and it is not unusual for them to change configurations. Different versions and brands of communication software are used, based on what the client has available. In order to keep all this updated and functional, we discussed and created a process. In this process we would automate the deployment of images on a daily basis. The images would receive software updates once a month. We implemented the process and waited. During the next two weeks, we collected feedback from the users about this new process. Many users were not satisfied because their temporary files and configurations would be erased when the new image was deployed. At this point, we decided that the results were below expectations and started a new PDCA cycle to improve the process. After discussing the problem, we decided to change the deploy frequency from daily to monthly. We implemented it and once again observed. In the feedback sessions with the users we found that it was better, but it was still impacting them. So we started a new cycle. This time we decided to change the deploy from monthly to every 3 months, based on research into how often companies release new patches. We also changed the image generation to every 3 months, to eliminate unnecessary work. We applied the changes and observed. From the feedback that we received, the users were, overall, comfortable with the solution.
At this point, we decided that we had reached the desired result and documented the process. The process was then distributed to other offices in the region to be deployed. All the changes that we made were small changes, with low risk to the business. When we discussed the issue, we always attempted to unveil the root cause and work directly on it. Also, in the check phases, we gave the users two weeks to get familiar with the change and only then actively looked for feedback. In this case, it took us 3 cycles to get to a state where the results from the change were satisfactory. It is really difficult to get it right on the first try. Many of the changes that we applied in our offices required at least two cycles to get right. This is not waste. By ensuring that your change has satisfactory results you will always be adding value and eliminating waste from the system.

The conclusion is that the use of continuous improvement and standardization to control small continuous changes will lead your environment to a positive transformation, without the addition of unnecessary risks.

Monday, January 12, 2015

Toyota Production System (Part II of II)

This post continues Toyota Production System (Part I of II).

Principle 9. Grow Leaders Who Thoroughly Understand the Work, Live the Philosophy, and Teach It to Others.

If Toyota considers culture to be one of the most important elements of TPS, the importance of having leaders who truly understand it becomes obvious. Toyota's leaders are developed inside and not brought in from outside. Toyota wants someone who is familiar with the culture. They want someone who can understand how the work is done on the line. And they want someone who is committed to passing the culture on to the people he leads.

Another important aspect of Toyota leadership is the leader's presence. Toyota believes that leaders should always be present in the gemba. Gemba is translated as "the real place". Toyota leaders all practice genchi gembutsu, or the art of "go-and-see". This is about leaders and managers being present in the place where the product is being developed. It is said that new leaders at Toyota are taken to a place on the line, and a chalk circle is drawn on the floor. Then the new manager is asked to stay in that circle and observe. Hours later, someone will come and ask what he or she saw. This is a common way at Toyota to teach the importance of gemba.

Leaders are also expected to fully understand the processes and the work, which makes it very difficult to hire an external leader. Leaders at Toyota often interact with their workers, and it is not unusual inside Toyota for a leader to cover for a worker in his absence.

Principle 10. Develop Exceptional People and Teams Who Follow Your Company's Philosophy

In order to develop teams, one must first consider what the role of leadership is.
The main role of a leader inside TPS is to sustain the development of Lean thinking inside the company. This is done by coaching and by example, from the highest levels of leadership to the team leader on the line. The leaders, aside from their work responsibilities, have to coach and help to spread the Toyota culture through the company. It is a responsibility that comes with every leadership position at Toyota.

The hiring process at Toyota is also worth mentioning. It focuses on identifying in the candidates characteristics that are aligned with the company's values. It is a three-stage process that includes a job fair and several interviews. This kind of effort ensures that the newly hired person has the potential to develop within the company's culture and values.
After the job offer, a new hire receives instruction on the culture and TPS. Until the individuals and teams really understand the culture and TPS, they are not in a position to be empowered. The road to team development is usually long. Toyota understands that groups have to develop over time and are not able to jump right into functional, efficient work.

Another contrast TPS presents when compared with traditional models is where problems are solved. TPS understands that problem solving is a responsibility of the working groups, and leadership should act as facilitators for this process. The reason behind this approach is the familiarity that the working groups have with the process and its eventual problems, which places them in an optimal position to improve the process and solve any related issues. This also empowers the workers to contribute to kaizen, or continuous improvement.

Principle 11. Find Solid Partners and Grow Together to Mutual Benefit in the Long Term.

In many businesses, the quality of your partners and suppliers will be a determining factor in the quality of your product. Toyota recognizes this and always aims for long-term relationships with its partners and suppliers.
If it is normal for your company to go through a complex hiring process to ensure that your workers will be aligned with the company's culture, then why should it be any different when selecting partners and suppliers?
Toyota usually considers new suppliers with caution and normally tests them with small orders to determine quality and commitment. It is not unusual for Toyota managers to visit new suppliers' lines in order to evaluate their processes and work. Toyota will also teach new suppliers and partners about TPS and the Toyota Way. A partner or supplier is part of the Toyota family and as such has the same expectations and opportunities as a regular employee. Once a partnership is established, it is very rarely broken.

Principle 12. Go and See for Yourself to Thoroughly Understand the Situation (Genchi Gembutsu)

Fujio Cho was the first president of Toyota's Georgetown, Kentucky plant. It is said that he would frequently be found on the line, observing the teams working. Workers and managers would say that he would be really focused on the line, staring, in a trance-like state. After some time he would break focus, say good morning to the people around him and go back to his office. It was not unusual for a request from the president's office to come later in the day, asking to tighten a process or to correct the flow in the area. This is what Toyota calls Genchi Gembutsu, or go-and-see.
Leaders at Toyota understand that the most important place for them to be is the production line, or Gemba (the real place). Gemba is the place where value is added. It is the place where continuous improvement, or Kaizen, happens. It is the heart of TPS.
Another famous story tells of a group of Toyota managers visiting a possible new supplier in the US. The supplier's team had scheduled a full day of presentations about their product with the sales team and other managers. When the Toyota managers arrived at the site, they asked to go directly to the production line. The sales team insisted on having them attend the presentation first, but they showed little interest, insisting on going directly to the line. Once they got to the line, they stayed there, staring at the workers and the machines for some time. Then they turned to the sales managers and said that unfortunately it would not be possible to conduct business with them. This is how seriously Toyota takes Genchi Gembutsu.

Principle 13. Make Decisions Slowly by Consensus, Thoroughly Considering All Options; Implement Rapidly.

Decisions are, without any doubt, an important part of any business. The way businesses make their decisions varies from company to company. Toyota's decision process is slow and usually requires a significant amount of work, research and discussion. Toyota collects all possible data on a subject before making a decision. Although this is a costly process, it has proven quite efficient and fundamental for TPS. Decision making has five major elements:
  • Genchi Gembutsu to find what is really happening.
  • 5 whys to reach the root cause.
  • Map and analyze all possible courses of action.
  • Involve teams, stakeholders and partners in the discussion.
  • Use efficient communication to facilitate the four elements above.
This methodology is called Nemawashi in TPS. The literal translation would be "going around the root", and the original meaning was just that: digging around the roots to prepare a tree for transplant. Its main goal is to allow everyone to propose possible solutions, focus on the core issue and decide by consensus which approach will be taken.

Principle 14. Become a Learning Organization Through Relentless Reflection (Hansei) and Continuous Improvement (Kaizen)

Learning to learn is a big thing inside Toyota. They achieve it by merging standardization and innovation, two concepts often regarded as opposites, into a functional methodology. Standardizing the processes creates an environment where improvement through innovation is possible.
The key element for this is employee empowerment: allowing individual and team innovation to be spread through the organization in the form of a standard. It is standardization punctuated by innovation, which leads to new standards.
Problem solving is one place where standardization meets innovation to generate new standards. When studying a new problem in a process, the flow usually starts by gathering more information about the situation. Many times this is done using Genchi Gembutsu. The next stage is focused on finding the real problem. This is often done by repeatedly asking the question why. Why did A happen? Because of B. Why did B happen? Because of C. Repeat until you find the fundamental issue. Apply a countermeasure. Observe how well the countermeasure fixes the problem. Repeat until satisfactory. Standardize.


Two important aspects of the TPS culture hide inside this method: hansei and kaizen.
Hansei is self-reflection. It is the ability to look inside (at oneself) and identify what is wrong. In quality, this is of paramount importance, since it is impossible to solve a problem you don't acknowledge. It is said at Toyota that no problem is a problem. There is a story of a senior director from Toyota who was visiting a plant in the US. After a few days, the director asked the local plant manager how many times they had stopped the line (because of problems). The manager said that it was a good thing that they had not had any problems, and so the line had not had to be stopped. The Toyota director replied to the manager, "no problem is a problem".
Only through hansei can kaizen exist. This is a very strong statement that means "there is no improvement without first identifying what needs to be improved". So, in TPS, finding your weaknesses will enable you to improve. And the way to improve is through kaizen.
Kaizen, as described in this post, is the constant search for improvement. The literal translation would be continuous improvement, and it is at the core of TPS.

These last three posts:
  • Lean
  • Toyota Production System (Part I of II)
  • Toyota Production System (Part II of II)



Together they form the theoretical base for what we will explore in this blog in future, more practical posts based on my experiences. I hope I was able to touch on each concept in a way that is not boring or tedious (although I might have failed). What I can promise is that the next posts will be more practical and less conceptual. I hope you guys are still there...


Wednesday, December 31, 2014

Toyota Production System (Part I of II)


"The practical expression of Toyota's people and customer-oriented philosophy is known as the Toyota Production System (TPS). This is not a rigid company-imposed procedure but a set of principles that have been proven in day-to-day practice over many years. Many of these ideas have been adopted and imitated all over the world." - Toyota Website

This is the first paragraph on the Toyota website describing what the Toyota Production System, or TPS, is. As stated on the website, these principles were studied and adopted worldwide by many companies trying to understand and replicate the success of this system. Many other systems and methodologies were born from TPS, Lean Manufacturing being the most famous. It is important at this point to differentiate Lean and TPS. Although there is some debate on the subject, the most accepted view is that TPS is the precursor of the more generic Lean Manufacturing. For more information on Lean, there is a post discussing its principles in this blog here.

The history of the TPS goes back to Sakichi Toyoda, in 1896. Sakichi developed a loom with an automatic stop mechanism. This was the early origin of Jidoka, allowing an automatic machine that could be stopped by human intervention. This was a very important step towards quality, one that Toyota maintains to this day. The concepts of flow and JIT were first introduced by Kiichiro Toyoda when he designed a production method using a chain conveyor in the assembly line of a textile plant, in 1927. Later, in 1938, Kiichiro introduced the same system in the body production line at Toyota Motor Co. Those were mechanisms used by Toyota Motor Co., but it was under Eiji Toyoda and Taiichi Ohno that they became the philosophy that we know today as TPS.

The TPS is based on 14 principles divided into 4 sections.



Principle 1. Base your management decisions on a long term philosophy, even at the expense of short-term financial goals.
This is probably the most important aspect in TPS. It represents the company's culture and how this culture is shared and observed by every worker and partner. It is something that is present from the company's CEO office to every plant floor, in every location. It is there to guide decisions and set expectations of every single person involved in this enterprise. Generate value for the customer, society, and the economy. This is the starting point where every function in the company is evaluated in terms of its ability to achieve it.
The great importance of this principle made me choose to dedicate a blog post to it. You can find it here.

Principle 2. Create continuous process flow to bring problems to the surface.
Most business processes are full of waste. Creating a continuous flow of information or material helps to identify and reduce the waste in a process.
"Flow means that when your customer places an order, this triggers the process of obtaining the raw materials needed just for that customer's order. The raw materials then flow immediately to supplier plants, where workers immediately fill the order with components, which flow immediately to a plant, where workers assemble the order, and then the complete orders flow to the customer." - The Toyota Way
At the core of the continuous flow process are One-Piece Flow and Takt Time, which works as its heartbeat.
Takt is the rate of customer demand. If customer demand is 3600 pieces per month, and the factory works 7 hours a day, 22 days per month, your Takt time will be 154 seconds per piece (554400 worked seconds in a month, divided by the 3600 pieces required in the same month). This means that, in order to achieve continuous flow, every step in the one-piece flow should take 154 seconds. If a step runs at a faster pace, it will end up overproducing (waste), and if a step runs too slowly, it will become a bottleneck, accumulating inventory (waste). Takt time is used to measure the work in each production step, detect issues and help to solve them.
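To make the arithmetic above concrete, here is a minimal Python sketch of the Takt time calculation. The function name and parameters are my own, chosen for illustration; the numbers are the ones from the example above.

```python
def takt_time_seconds(hours_per_day, days_per_month, monthly_demand):
    """Return the takt time in seconds per piece.

    Takt time = available working time / customer demand.
    (Hypothetical helper, not taken from any Toyota material.)
    """
    available_seconds = hours_per_day * days_per_month * 3600
    return available_seconds / monthly_demand


# The example from the text: 7 hours a day, 22 days a month, 3600 pieces.
takt = takt_time_seconds(hours_per_day=7, days_per_month=22, monthly_demand=3600)
print(takt)  # 154.0 seconds per piece
```

If demand doubles while the working hours stay the same, the takt time halves, which is why takt has to be recalculated whenever demand changes.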
One-piece flow is also an important element in creating flow. The idea of one-piece flow is that no step in production can produce any part unless it is required by the next step. To make flow possible inside production, Toyota developed the concept of the U-shaped one-piece flow cell. The idea is to have, in one production cell, all the steps required to build a product, so that the work flows from station to station with minimum waste.


A production environment can have multiple cells.
In the end, creating flow helps to eliminate waste (muda) and unevenness (mura).

Principle 3. Use "pull" systems to avoid overproduction
The best definition of a pull system that I could find states that it is a method of controlling the flow of resources by replacing only what has been consumed, and producing no more than what has already been ordered. Toyota usually says that the goal of a pull system is to provide your customers with what they want, when they want it, and in the amount they want. No more, no less.
This system avoids the creation of big inventory buffers, which Toyota also considers waste (muda). Taiichi Ohno used to say that the more inventory a company has, the less likely it is to have what it needs.
One important point is to admit that zero inventory and one-piece flow are not easy systems to master. Some companies will take a long time improving their processes in order to approach those ideal models, and sometimes they will come to realize that, for their business model, some level of buffer is acceptable and necessary. Rother and Shook, in their book Learning to See (1999), stated "Flow where you can, pull where you must." What is important is to ensure that those buffers are small and are replenished by a pull system (you refill the buffer only when it reaches a low level).
In the Toyota line, this concept was implemented using bins with a color code on the bottom. A station would have two bins filled with a specific part. When one of the bins became empty, it would be sent for refill. The code on the bottom of the bin gave the instructions on which part should be filled and where to deliver it. This way, the station would always have the parts required to perform its work without the need for a large inventory. They called this card system kanban. Kanban, inside Toyota, are cards and signs that help to signal events and requests on the production line. These days, Toyota will often use a computer system for scheduling some operations, but then use manual cues like cards or white boards to visually control the process.
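The two-bin mechanism described above can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the class name, the part name and the bin size are hypothetical, and a real line obviously involves people and physical bins, not objects in memory.

```python
from dataclasses import dataclass, field


@dataclass
class TwoBinStation:
    """A station that holds two bins of one part (hypothetical model)."""
    part: str
    bin_size: int
    bins: list = field(default_factory=list)
    refill_requests: list = field(default_factory=list)

    def __post_init__(self):
        # Both bins start full unless explicit levels were given.
        if not self.bins:
            self.bins = [self.bin_size, self.bin_size]

    def consume(self, quantity=1):
        """Take parts one at a time, always from the first non-empty bin."""
        for _ in range(quantity):
            active = 0 if self.bins[0] > 0 else 1
            if self.bins[active] == 0:
                raise RuntimeError("both bins are empty")
            self.bins[active] -= 1
            if self.bins[active] == 0:
                # The empty bin is the kanban signal: send it for refill.
                self.refill_requests.append((self.part, active))

    def deliver(self):
        """A refilled bin comes back to the station."""
        _part, index = self.refill_requests.pop(0)
        self.bins[index] = self.bin_size


station = TwoBinStation(part="bolt", bin_size=3)
station.consume(3)              # first bin runs empty
print(station.refill_requests)  # [('bolt', 0)]
station.consume(1)              # work continues from the second bin
station.deliver()               # the first bin returns full
print(station.bins)             # [3, 2]
```

Notice that nothing is ever replenished until a bin is actually empty, so the inventory at the station can never exceed two bins.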

Principle 4. Level Out the Workload (Heijunka)
One of the goals of the TPS is to eliminate waste. Toyota managers and employees use three Japanese words to define waste:
  • Muda - Non-value-added. In the production line, there are 8 types of muda:
    • Transport
    • Inventory
    • Motion
    • Waiting
    • Overproducing
    • Overprocessing
    • Defects
    • Unused employee creativity
  • Muri - Overburdening people or equipment. Overburdening people results in safety and quality problems. Overburdening equipment causes breakdowns and defects.
  • Mura - Unevenness. Unevenness on production implies that there will be overburden periods and periods of underuse.
Having starts and stops, overutilization then underutilization, is a problem because it does not lend itself to quality, standardization of work, productivity, or continuous improvement.
Heijunka is the leveling of production by both volume and product mix. It does not build products according to the actual flow of customer orders, which can swing up and down wildly, but takes the total volume of orders in a period and levels them out so the same amount and mix are being made each day. 
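A rough Python sketch of this leveling idea follows. It assumes we already know the total orders per product for a period; the function name, the product names and the quantities are hypothetical. When a total does not divide evenly, the sketch simply gives the earliest days one extra unit, which is a simplification of my own.

```python
def level_schedule(orders, days):
    """Spread the total orders of each product evenly over the given days.

    orders: dict mapping product name -> total units ordered in the period.
    Returns a list with one dict per day (product -> units to build).
    """
    schedule = [dict() for _ in range(days)]
    for product, total in orders.items():
        base, extra = divmod(total, days)
        for day in range(days):
            # The first `extra` days absorb the remainder, one unit each.
            schedule[day][product] = base + (1 if day < extra else 0)
    return schedule


# Hypothetical weekly orders, leveled over 5 working days.
week = level_schedule({"A": 100, "B": 50, "C": 50}, days=5)
print(week[0])  # {'A': 20, 'B': 10, 'C': 10}
```

Every day builds the same volume and the same mix, regardless of which day the orders actually arrived.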

Principle 5. Build a Culture of Stopping to Fix Problems, to Get the Quality Right the First Time.
Jidoka traces back to Sakichi Toyoda and his loom. Among his improvements was a device that detected when a thread broke and, when it did, stopped the loom at once. The idea was to solve the problem right away, before moving production on. Quality should be built in. This means that there must be a method to detect problems when they occur and stop production to fix the issue. If not, the defect will continue downstream.
Another important aspect of jidoka (often translated as autonomation) is the authority every employee has to stop the line when they detect something unusual or wrong. Toyota says that in-station quality (where problems can't move downstream in the flow) is much more effective and less costly than inspecting and repairing quality problems at a later station down the flow, or even having to deal with a low-quality product at the end of the process.
When the line was stopped, a light signal would be lit to point to the part of the flow where the issue originated. This signalling is called andon, and represents a call for help. In the Toyota line, the first andon would not stop the whole cell. It would stop one station. The team leader would have one takt to respond before the andon turned red and stopped the cell. If the team leader could fix the issue, or determined that the line did not need to stop to fix the problem, he could push the button again to cancel the andon.
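The andon escalation described above can be written as a tiny decision function. This is a sketch under my own assumptions: the function name is hypothetical, and the "yellow" and "green" labels are mine (the text only names the red state).

```python
def andon_status(seconds_since_pull, takt_seconds, cancelled_by_leader):
    """Return the andon state for a station after the cord was pulled.

    "yellow": call for help, only the station is stopped (assumed label).
    "red":    one takt has passed with no response, the whole cell stops.
    "green":  the team leader cancelled the andon (assumed label).
    """
    if cancelled_by_leader:
        return "green"
    if seconds_since_pull < takt_seconds:
        return "yellow"
    return "red"


print(andon_status(30, 154, cancelled_by_leader=False))   # yellow
print(andon_status(200, 154, cancelled_by_leader=False))  # red
print(andon_status(30, 154, cancelled_by_leader=True))    # green
```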
Another tool used to detect issues at the stations is what the Japanese call Poka-Yoke. Poka-Yoke is a mechanism in the station that helps an equipment operator to avoid (yokeru) mistakes (poka).

Principle 6. Standardized Tasks Are the Foundation for Continuous Improvement and Employee Empowerment.
There is no improvement without standardization. This is Masaaki Imai's message in his book Gemba Kaizen: A Commonsense, Low-Cost Approach to Management.
A standard is a set of instructions, formulas and information that describes how to operate a process.
A standard is a living document and should be updated under two circumstances: when a better routine to conduct the process is discovered and must be added or adapted to the standard, or when an error leads to a revision of the standard that recommends a change to avoid the recurrence of the error.
More on standards, according to Imai:
  • Represent the best, easiest and safest way to do a job.
  • Offer the best way to preserve know-how and expertise.
  • Provide a way to measure performance.
  • Show the relationship between cause and effect.
  • Provide a basis for both maintenance and improvement.
  • Provide objectives and indicate training goals.
  • Provide a basis for training.
  • Create a basis for audits and diagnosis.
  • Provide a means for preventing recurrence of errors and minimizing variability.
Another important aspect of the standard lies in its ability to empower the employee who operates the process. Because the operator is the person most familiar with the process, he is the best placed person to propose improvements to it. Standards have to be specific enough to be useful guides, yet general enough to allow for some flexibility. This flexibility is the entry point for continuous improvement in the standard.

Principle 7. Use Visual Controls So No Problems are Hidden
A clean environment improves visibility. In Japan, the Toyota factories were all very organized and clean. This was an effect of another TPS tool: the 5S program. This program is a series of activities that aim to eliminate waste in the workplace. Each S was originally a Japanese word that ended up translated into English:
  • Seiri or Sort - Keep only what is really necessary to do the work. All other items are distractions and potential sources of accidents.
  • Seiton or Straighten - It is about how organized your work environment is. "A place for everything and everything in its place"
  • Seiso or Shine - Keep it clean. In a clean environment it is easier to detect problems. Also, the process of cleaning is a good inspection process.
  • Seiketsu or Standardize - Develop processes to ensure the first 3 Ss are being applied.
  • Shitsuke or Sustain - Ensure that the workplace is kept organized and clean at all times. Audit the environment regularly.
There is a story that a group of Japanese managers came to the US to evaluate a potential partnership with a vendor. When the Japanese arrived, the vendor had already organized a schedule for the day, starting with a long presentation about the company and the product. The Japanese managers insisted on skipping the presentation and asked to be taken right away to the factory (Gemba). Once there, it took them less than 15 minutes to come to a decision. When they told the vendor that it would not be possible to do business with them, the vendor asked why. The answer was simple: You do not understand 5S. So you cannot understand TPS. This shows how important 5S is to TPS.

Visual controls are broadly used in the workplace, from the assembly line to the offices. The idea is to make information available just by looking. One of the most important visual controls from Toyota is the Obeya. Obeya is a large room that receives information from different systems and metrics and displays it on large monitors inside the room. It is used to support decision making based on the information available.
"A well-developed visual control system increases productivity, reduces defects and mistakes, helps to meet deadlines, facilitates communication, improves safety, lower costs, and generally gives the workers more control over their environment" - The Toyota Way.

Principle 8. Use Only Reliable, Thoroughly Tested Technology That Serves Your People and Processes
For Toyota, the process of adopting a new technology requires a long period of evaluation and trials. One of the important aspects analyzed is the impact of the new technology. If no value is added, the idea is simply discarded. It is necessary to verify that real value will be added to the product, people or processes before it can be adopted. Another aspect of great importance is the cultural one. Toyota will not adopt a technology that conflicts with its philosophies and operating principles. Above all, Toyota looks at technology as a tool and, like any other tool, it is there to support the process and the people who operate it.
Before a technology can be adopted, the process must first be created and run manually, to verify its efficiency and functionality.
"First work out the manual process, and then automate it. Try to build into the system as much flexibility as you possibly can so you can continue to kaizen the process as your business changes. And always supplement the system information with 'genchi genbutsu' or 'go look, go see'." - The Toyota Way.

This is too long for one post... let's split it in two and make it easier for both the reader and for me :).
to be continued...

Sunday, December 28, 2014

Lean.

Before we start to discuss how to use Lean and TPS principles in IT support and IT infrastructure, it is necessary to understand them first. So, we will have two blog posts that aim to give an understanding of Lean and TPS before we move on. This first post will focus on Lean.

I remember the first time I saw "the lean house", I could not understand it. It was the drawing of a house, with a floor, a roof and two pillars. The roof represents the things you would like to achieve. One pillar represents the flow while the other represents the process. At the base is the continuous quest for quality. This is a very simplified description of it. The original drawing was a bit more complex, as we can see below.

The Lean House used to teach lean principles


This was first used by the lean founders to explain the coherence and harmony of the Lean System.
The foundations of the house are built on muda reduction and kaizen. The word muda translates as waste in English. The elimination of waste is considered one of the most important steps in the lean methodology. Kaizen can be translated as continuous improvement. Those are the two elements on which everything is built: waste elimination and continuous improvement.

The seven wastes identified in Lean manufacturing are:

  • Transport
  • Inventory
  • Motion
  • Waiting
  • Overproduction
  • Overprocessing
  • Defects
One of the systems created to reduce the waste was the 5S. 5S is a system to reduce waste and optimize productivity through maintaining an orderly workplace and using visual cues to achieve more consistent operational results. Implementation of this method "cleans up" and organizes the workplace basically in its existing configuration, and it is typically the first lean method which organizations implement.
The 5S pillars, Sort (Seiri), Set in Order (Seiton), Shine (Seiso), Standardize (Seiketsu), and Sustain (Shitsuke), provide a methodology for organizing, cleaning, developing, and sustaining a productive work environment. In the daily work of a company, routines that maintain organization and orderliness are essential to a smooth and efficient flow of activities. This lean method encourages workers to improve their working conditions and helps them to learn to reduce waste, unplanned downtime, and in-process inventory.

Lean production is founded on the idea of kaizen – or continual improvement. This philosophy implies that small, incremental changes routinely applied and sustained over a long period result in significant improvements. The kaizen strategy aims to involve workers from multiple functions and levels in the organization in working together to address a problem or improve a process. The team uses analytical techniques, such as value stream mapping and "the 5 whys", to identify opportunities quickly to eliminate waste in a targeted process or production area. The team works to implement chosen improvements rapidly (often within 72 hours of initiating the kaizen event), typically focusing on solutions that do not involve large capital outlays.
Periodic follow-up events aim to ensure that the improvements from the kaizen "blitz" are sustained over time.

The pillar on the left has heijunka as its base. Heijunka translates as leveling, or smooth sequencing, which means that the ideal workload is constant and that variation in production should be eliminated. The focus of heijunka is to reduce mura (unevenness) and, by doing that, also reduce muda (waste).
In the same pillar we also have "Pull System", where production aims to make only the parts required by client orders (on demand). 
Based on the orders, a Takt time is calculated and measured against the cycle time of every part of the manufacturing process. The Takt time is the ratio between available time and client demand. Once calculated, it is applied to all processes in production, with the goal of setting the rhythm of production. Whenever a deviation in the flow appears, kaizen is used to correct it. This also ensures that the flow will be constant, helping to achieve heijunka.

Comparison between Takt time and Cycle time during manufacturing process.
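As an illustration of that comparison, the Python sketch below checks the cycle time of each step against the Takt time. The step names, the cycle times and the 5% tolerance are hypothetical values picked for the example, not data from a real line.

```python
def compare_to_takt(cycle_times, takt_seconds, tolerance=0.05):
    """Classify each step by comparing its cycle time to the takt time.

    cycle_times: dict mapping step name -> measured cycle time in seconds.
    tolerance:   fraction of takt treated as "close enough" (assumed value).
    """
    report = {}
    for step, cycle in cycle_times.items():
        if cycle > takt_seconds * (1 + tolerance):
            report[step] = "slower than takt (bottleneck risk)"
        elif cycle < takt_seconds * (1 - tolerance):
            report[step] = "faster than takt (overproduction risk)"
        else:
            report[step] = "balanced"
    return report


# Hypothetical measurements against a takt of 154 seconds.
steps = {"cutting": 120, "welding": 154, "painting": 180}
for step, verdict in compare_to_takt(steps, takt_seconds=154).items():
    print(step, "->", verdict)
```

Any step that is not "balanced" is a candidate for kaizen, which is how the flow gets corrected over time.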


The last element of this pillar is JIT or Just-in-Time. It focuses on muda elimination by reducing the amount of goods and materials in stock. The idea is to have parts and partial products ready for the next process only when they are required, reducing the amount of inventory in each process. This concept is closely aligned with the pull system.



This first pillar is very focused on the flow. It helps to guide how the work moves from one station to the next, from one process to the next. The next pillar examines in more detail what should happen within the stations, focusing on the process of each production phase.

Standardized work is critical inside Lean. If you do not have a standardized process, then you cannot have Kaizen in that process. TPS principle #6 states "Standardized tasks and processes are the foundation for continuous improvement and employee empowerment." The use of stable, repeatable methods everywhere is the key to maintaining the predictability, regular timing and regular output of your processes. Allowing the people who work with the process to suggest improvements, and then incorporating those improvements into the standard, is what will ensure the continuous improvement of the process.

The oldest part of the production system is the concept of Jidoka, which was created in 1902 by Toyoda founder Sakichi Toyoda. This concept pertains to the notion of building in quality at the production process as well as enabling separation of man and machine for multi-process handling. A group of machines could be handled by one human operator, because the machines were capable of making intelligent decisions and shutting down automatically at the first sign of an abnormal condition. When the line was stopped, the immediate problem was solved, but a team would also start to investigate the root cause of the issue and, once it was found, a fix would be applied wherever necessary. This stopped the process from creating defective material and also enhanced quality.

Those are the principles of the Lean Process. In the next post we will analyze the 14 principles of the Toyota Production System. Understanding those concepts will allow us to better understand how to adopt these same methodologies in our IT environment.




Friday, December 19, 2014

Keep it simple.

Simplicity is underestimated.
In order to understand what this really means, we need to first understand another concept: Efficiency.
A lot of people often talk about efficiency, without really understanding the real meaning of it.
According to Merriam-Webster, efficiency is "the ability to do something or produce something without wasting materials, time, or energy".
So, if I had to simplify this concept in a few words, I would say that efficiency is to produce without waste.
Now, waste is the spending of any resource beyond what is strictly necessary to perform a given action.
In the IT industry, it is common for people to focus on the technology without looking at the technical cost of it. Technology does not come for free. A system requires specialists to develop it, deploy it and tune it, and then people to maintain and operate it. It is not uncommon for people to create a larger problem from the use of technology than the original issue that the technology was applied to solve.

A few weeks ago, I had a meeting to present a solution to a problem we had in our environment. The problem was around the meeting rooms. In our company, meeting rooms are shared resources, and are often used by different teams on a daily basis. In each meeting room we have a TV connected to a Mac-Mini, a conference microphone, a webcam and a telephone. With different teams using the resources frequently, it is common to have problems. The solution we found was to create a card with two sides, one green and one red. On the red side, we have options for people to select where the problem is located. Each card is kept on the door with the green face out. If anybody using the meeting room finds an issue, they just turn the card to the red side and mark the option(s) where they had the issue.
When I was explaining the concept, the first reaction I received was someone saying that they should implement it using a tablet. An application on the tablet would be responsible for calling the support team. OK, let's analyze the suggestion. First, you need to build a complex application (some will say it is not complex, but when compared to a two-color card, even a hello world would be considered complex). Next, you need to ensure that the tablet will be functional at all times. Third, you need to ensure that the connectivity is in place. Finally, someone needs to be in the support room to pick up the call, otherwise it won't work. Can you see? Two solutions, both solving the same issue. Look at the difference in complexity. It is a common mistake for people in IT to always try to solve problems using technology. And this is a larger issue than we actually think.

A solution like this is based on trust. People will use a solution if they trust that it will solve their problem. If a solution fails to solve the problem, people will stop trusting it and, even if you make it right later, they will not use it. A simple solution has a better chance of being used than a complex one.
Technologists love systems and new technologies. They will find all sorts of reasons to justify the usage of the latest piece of tech. If you use technology as a supportive tool for your business, and not as something that will give you a business edge, you need to keep it simple.

Toyota has a very interesting approach to the use of technology: use only reliable, thoroughly tested technology that serves your people and processes. It is normal to implement a process manually before involving technology. This enables people to understand the process, making it easier to implement something to automate it. Inside Toyota, new technologies are often seen as unreliable and difficult to standardize, and therefore as a danger to the "flow". People will not replace a technology unless the benefits surpass the costs. Even when the benefits surpass the costs, the new technology will face a series of tests to ensure it will perform as planned. Keeping supportive systems at low complexity is the closest to efficiency I can imagine an IT area achieving.
In some ways efficiency is simplicity.