
Jidoka and Multi-tasking

I’ve recently done a fair amount of research into applying lean manufacturing techniques to software development. It’s mentioned in a lot of places that the Toyota Production System is based on JIT (just-in-time) and jidoka. (Personally I think kaizen should fit in here as well, as a governing philosophy.)

Essentially, jidoka means:

  • automatically stopping the line when a defect is detected
  • fixing the defect
  • instituting counter-measures to prevent further defects (implies root-cause analysis)

By instituting these counter-measures in the system immediately, you’re building quality into the system.
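
As a minimal sketch (not a definitive implementation), the three steps above could be modelled like this in Python. The `Line` class, its method names and the example checks are all hypothetical, invented purely for illustration. The sketch assumes that a counter-measure takes the form of a new automated check, and it does not model the repair of the defective item itself.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Line:
    """Hypothetical 'production line': an ordered list of quality checks."""
    checks: List[Callable[[str], bool]]
    countermeasures: List[str] = field(default_factory=list)
    stopped: bool = False

    def process(self, item: str) -> bool:
        """Run every check; stop the whole line on the first defect."""
        if self.stopped:
            raise RuntimeError("line is stopped; fix the defect first")
        for check in self.checks:
            if not check(item):
                self.stopped = True  # step 1: stop the line automatically
                return False
        return True

    def fix_and_resume(self, new_check: Callable[[str], bool], note: str) -> None:
        """Call after the defect is fixed (step 2) to record a counter-measure (step 3)."""
        self.checks.append(new_check)      # the defect class is now caught earlier
        self.countermeasures.append(note)  # root-cause note for the team
        self.stopped = False


line = Line(checks=[lambda s: s.strip() != ""])
line.process("widget")   # passes, the line keeps running
line.process("")         # defect detected, the line stops
line.fix_and_resume(lambda s: len(s) < 20, "reject oversized items at intake")
```

The point of the sketch is that nothing else can pass through `process` until someone calls `fix_and_resume`, which is exactly the disruption the rest of this post worries about.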

In my opinion, jidoka resonates with the ‘Boy-Scout Principle’ (leave it better than when you found it) and the Pragmatic Programmers’ ‘Don’t Live with Broken Windows’.

From my interpretation, jidoka means that when you find a defect in your software development process, you stop it there and then, and fix it. Broadly this would include bugs, ‘bad’ or flawed code, broken builds etc. (Please challenge me on these in the comments, I’m not 100% sure if all of these fit in.)

If I extrapolate a bit, this implies that if I’m reviewing code written by one of my reports and I find some badly written or badly designed code, I should immediately pull all the developers in my team off what they’re doing, fix the bad code, and have a session on why the code is bad and how it should be written in future (the counter-measure).

This is where my difficulty begins. It is now relatively well documented that multitasking causes delays and inefficiencies in the process. I know from personal experience that the context switch involved in changing tasks, at any granularity, is expensive and disruptive.

Then, given that interrupting all the members in my team will cause a major context switch, how do I satisfy the demands of jidoka?

If a bug is reported by the QA team or by an end-user, does the developer (or pair) who originally worked on that feature/code stop what he’s doing right now and fix the bug?

Maybe jidoka is less applicable to software development than it is to manufacturing: how much context is involved in the case of a worker attending an assembly line? (I don’t know, I haven’t worked as one before.)

I am led to another (off-topic) question: in the case of a bug report, which causes less of a context switch:

  • the developer moving to work on the bug right away, while the context of his original work on the code is still fresh at the expense of the context of the current task
  • the developer moving to work on the bug only once his current task is complete, thereby retaining the context of the current task, but losing context of the buggy code

How does one achieve a good balance between satisfying jidoka and disrupting the team as little as possible?

When should the knowledge created by the bug fix be disseminated across the team?

Should teams have a scheduled weekly or fortnightly code review/knowledge dissemination session?


5 thoughts on “Jidoka and Multi-tasking”

  1. A friend of mine and I discussed this post in an IM session. His contention is that in traditional lean manufacturing, e.g. a car assembly line, a defect on the line would affect every car. In a software development process, this may not necessarily be the case.

    While formulating the post, one example I had in mind was something that might be repeated many times in an application, for example a GridView in an ASP.NET page.

    If I’m reviewing a report’s code in which he’s using a GridView control in a way I consider incorrect or sub-optimal, and I know that the application will require a lot of pages with GridView controls, do I stop the ‘production line’, fix the problem, and institute a counter-measure (by educating the report in question as well as the rest of the team)?

  2. I would never allow every bug to interrupt current development. You are aware of the cost of interruption, so you can try to measure the gain of an instant fix. If the latter is bigger than the former, stop whatever you’re doing and fix the bug. However, most of the time it is not.

    You may want to assign a priority to every bug, saying that critical issues are fixed instantly no matter what, major issues are fixed when someone finishes their task but no later than a day or a couple of days from submission, and minor issues are fixed whenever it doesn’t affect current development much. Then you just need to assign the right priority whenever a bug is submitted.

    By the way: I wouldn’t be so orthodox about fixing all issues. When it comes to software development we have a lot of bugs (small glitches, sub-optimal GUI, small performance issues etc) which we can live with. And often, when we analyze the cost and the gain we decide just to leave them as they are.

    1. The issues I’m talking about here are issues reported by the QA team, i.e. before the feature goes into production. The reason we decided that rework would interrupt current work is that the QA team’s productivity is dependent on development. So if a QA engineer reports a ‘bug’ to me for a feature I was working on last week, he can’t continue his work until I’ve fixed that bug. (This statement may not necessarily be true).

      Either way, there is a task-switch cost involved, either on the part of the developer or the QA person.

      Just to make clear, the environment I’m talking about does not have WIP limits, and is very siloed according to function (i.e. no cross-functional teams based on delivery).

      1. There’s one trick I use here which limits the number of context switches. I try to have QA people ready at the very moment some feature/task becomes ‘code complete’. Then the first bugs are usually submitted before developers start working on another feature, so there are no context switches as long as the QA folks are able to submit bugs at least as fast as they’re fixed by the development team (which is usually true).

        I don’t know whether you have enough control over the QA team to organize things in such a way, but even if you don’t, the approach should be possible to implement to some extent.

  3. That makes sense. I think currently there isn’t that kind of thinking. To me that’s almost an approximation of a pull system (somehow).

    It’s an idea I’ll keep in mind, thanks!
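
The priority scheme suggested in the second comment above could be sketched roughly as follows. This is only an illustration under stated assumptions: the names `Severity`, `Timing`, `Policy`, `triage` and `worth_interrupting` are hypothetical, and the two-day limit for major issues is an assumed reading of "a day/couple of days".

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Severity(Enum):
    CRITICAL = "critical"
    MAJOR = "major"
    MINOR = "minor"


class Timing(Enum):
    NOW = "interrupt current work"
    AFTER_CURRENT_TASK = "fix once the current task is finished"
    WHEN_CONVENIENT = "fix when it barely affects current development"


@dataclass(frozen=True)
class Policy:
    timing: Timing
    max_delay_days: Optional[int]  # None means no fixed deadline


MAJOR_DEADLINE_DAYS = 2  # assumption: "a day/couple of days" read as two days


def triage(severity: Severity) -> Policy:
    """Map a bug's severity to when it should be fixed."""
    if severity is Severity.CRITICAL:
        return Policy(Timing.NOW, 0)
    if severity is Severity.MAJOR:
        return Policy(Timing.AFTER_CURRENT_TASK, MAJOR_DEADLINE_DAYS)
    return Policy(Timing.WHEN_CONVENIENT, None)


def worth_interrupting(interruption_cost: float, instant_fix_gain: float) -> bool:
    """Interrupt only if the gain of an instant fix outweighs the interruption."""
    return instant_fix_gain > interruption_cost
```

For example, `triage(Severity.MAJOR)` yields a policy of finishing the current task first with a two-day limit, while `worth_interrupting` expresses the cost-versus-gain comparison in whatever unit the team chooses to estimate in (hours, say).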
