7 March 2012

Via Evolving Excellence, I’ve learned of, which is noteworthy for two reasons: 1) it is a cool business model, in my opinion, and 2) they have an absolutely fantastic introductory video on their website (warning: it autoplays).  As you watch it, notice the guy reading The Lean Startup in the background.

 | Posted by | Categories: Business |

Contained Failure

16 February 2011

I assume that almost everyone has by now heard the maxim “fail early, fail often” at least once.

I hope everyone has also heard the explanation that the point is not to fail early and often per se but rather to fail earlier rather than later, i.e., avoid large catastrophic failures by accepting small, frequent ones.

The practical advice is quite clear:  When you embark on a new project or task, set up the situation so that you get early warning signs of failure.  Pay attention to those signs of failure.  Be willing to accept that the time, money, and energy you’ve already put into the project is a sunk cost and don’t throw good money after bad.  Etc.  All of this focuses on the idea of preventing a larger, more catastrophic failure.  If you’re going to fail, at least fail small.

Many authors also go to lengths to emphasize the “learn” aspect of the “fail early, fail often” maxim – which is great.

What I often find lacking is a discussion of the practical implications for how one should coach teams or individuals.

There are inevitably situations where a person or team is faced with a problem and someone else knows that their planned solution is going to fail.  Perhaps this person is formally the team’s coach.  Perhaps they’re one member of a pair (as in pair programming).  Perhaps it’s a manager.  Perhaps it’s just someone tangentially related to the situation.  It doesn’t really matter.

The question is: what should that person do?

The natural inclination is to try to prevent the mistake – for obvious reasons.  Sometimes this takes the form of explaining why the plan won’t work.  Sometimes it’s telling a story about how that has been tried in the past and never works.  Sometimes it’s someone with authority just overruling the plan.  None of these approaches is terribly effective.  Sure, they may work sometimes.  Sure, the overruling approach can “work” 100% of the time, at least on the surface.  Anyone who’s been on the other side of that equation, though, knows pretty well the negative side effects of a team (or an individual) just being overruled when they’re convinced they’re right or having to argue endlessly with someone who thinks they’re wrong.

The fundamental problem with these situations is, in my opinion, the hidden premise that all mistakes should be avoided if at all possible.  In other words, never allow failure to occur.

The problem with this premise is twofold: 1) it’s impossible, 2) it prevents learning.

The trick is to craft the situation such that, if you fail, the failure is small, contained, and teaches you something valuable.  It’s not that the advice is “fail early, fail often so as to avoid catastrophic failure and, whenever possible, avoid failing at all.”  It’s “allow small failure both for the sake of avoiding larger failure and for the sake of learning.”

For example, if I’m working on a task with someone and my partner wants to do something that I am certain will not work, what should I do?  Rather than spending valuable time and effort arguing about it, the best thing to do is probably do it their way – so long as I can craft the situation to ensure that, if and when it fails, we haven’t spent a lot of time/energy/money doing it.  After all, there are only two possibilities: either I’m right and we will fail or I’m wrong and it will work.  Either way, we both win.  If I’m right, my partner has now learned something valuable in perhaps the most effective way possible (by experience) and with a small amount of expenditure (of time/energy/money) and we can peacefully move on to doing it my way.  On the other hand, if I’m wrong, I’ve learned something valuable AND the task is now done.

For this to end up being a win/win, though, you must contain the size of the experiment (what is potentially going to fail), for two reasons:

  1. It contains the cost associated with the benefit of the learning to be achieved so that the cost/benefit still works out in your favor.
  2. It limits externalities.  That is, it limits the number of variables to which the failure can later be attributed.  If the experiment (task) has too many variables (is too big), if and when it fails, it will be all too easy for people to argue about what really caused the failure.  The smaller the experiment, the more self-evident it becomes what went wrong.  The larger the failure, the less likely anyone is to learn anything since they will be able to rationalize the failure according to their own biases.

In sum, failure is not something to be avoided always and everywhere.  Because experience is often the most powerful teaching mechanism and because experience inevitably involves failures, failure is an excellent way to learn.  The critical distinction, though, is between contained, low-cost, high-yield failure on the one hand and open-ended, high-cost, no-yield failure on the other.  To get the first and avoid the second, craft your experiments well:

  • When you embark on a new project or task, set up the situation so that you get early warning signs of failure.
  • Pay attention to those signs of failure.
  • Be willing to accept that the time, money, and energy you’ve already put into the project is a sunk cost and don’t throw good money after bad.
  • And, the point of this post, be willing to allow others to conduct experiments you know are going to fail.  Don’t try to “save” them from failure.  Save them from that second kind of failure.
 | Posted by | Categories: Lean Principles, Project Management | Tagged: |

Imagine you have a single organization (choose your own scale: a single team or a group of teams arranged into a single, bigger organization).  Now imagine that you have multiple – perhaps many – products to manage concurrently.  By ‘manage’, I mean everything from the smallest bug fix or tweak to new major releases.  Now imagine that the customer/user bases of those disparate products are not coextensive.  That is, some users only care about their particular product to the exclusion of the rest – and they might be the only user base that cares about that particular product.

Lastly, imagine that your users/customers are generally dissatisfied with the speed with which enhancements or fixes are being rolled out to the product(s) they care about.  After all, with a single organization handling all of the products, and the efficiency gains to be had by grouping similar work items together, it is often going to be the case that a significant amount of time goes by without a particular product seeing any updates – because there is a major release for other products, for example.

What would you do?

One option is to use allocation-based project or portfolio management.  I’m deliberately avoiding the phrase “resource allocation” because I object to using ‘resource’ as a euphemism for people – thus, I turn the phrase around and come at it from the project side.

In any event, what I mean is a situation where you step away from having a single backlog (which you presumably have since we’re talking about a single organization) and break the work into multiple backlogs (or value streams, as the case may be) – allocating a portion of your people or teams to each.  The idea being that this new situation will ensure a slow but steady flow of improvements to each product, thus reassuring users that progress is being made, even if it is slow.

As with every tool or principle in any toolbox or framework, there are times when this will work and times when it will not (and times when it will make the situation worse).

Here are the things one must consider:

1. The stability of the allocation

If you can foresee that the allocation of people/teams to each product is going to be pretty stable, then this strategy might work for you (assuming the other conditions below obtain).  If, on the other hand, you’re going to have to adjust the allocation frequently based on changing market needs or customer demands, then this will probably make the situation worse.  By the time a person or team has been working with a particular product long enough to be familiar with it and really productive, they will be pulled off and have to come up the learning curve on another product before being as productive as possible.  Fast forward a few more cycles and you’ll have a lot of wasted horse trading and lower morale on your hands.

2. The distribution of knowledge and skills

That is, will each product have allocated to it a set of people with sufficient knowledge and skill to make the desired enhancements?  Oftentimes, this will not be the case.  For example, say there is a team of eight people responsible for eight products but six team members are only familiar with three of the products and the other five products are really only understood well by the other two team members.  Assuming having those two team members float between teams is not a viable solution (for whatever reason: too much task switching, etc.), one could not really make any progress on all five of those systems in an allocation-based system.  (I’m assuming that the team as a whole can make progress on these products due to the mentorship of those two team members.)

3. The ease with which you can deploy to Production

If your products are such that you can fairly easily, quickly, and reliably deploy new builds (for software; models for other types of products), allocation will possibly be a winning strategy (again, assuming the other conditions obtain).  As the cost/pain of deployment rises, the benefits of allocation decline because the transaction cost associated with deployment reduces the frequency with which users receive updates.

4. The type of work to be done for each product

If your products are mature and stable and ongoing work takes the form of minor enhancements, new features, the occasional bug, etc., allocating might work.  If this is the case, your users will indeed feel a slow but steady stream of improvements coming their way.

If, on the other hand, you’re working on major releases that cannot be deployed piecemeal because they represent pieces of a completely new paradigm which have to be deployed in a large batch, your customers will not experience this feeling of steady progress – because they’ll still of necessity be receiving them in “big bang” releases.

Given that we’re positing multiple products, it is likely that different products will fall into different camps.  Without knowing specifics, the only guiding principle I can offer is that allocation becomes a relatively better idea as the mature, stable products grow in importance relative to the others.

So, assuming those four issues are favorable, allocation might work and is certainly a valid experiment to run to see if you get the intended result.  If, however, any one of them seems problematic, allocation will probably not work and will create a slew of other problems.

I’m interested to hear your take on this, especially if you’ve ever tried allocation in this way.  Please share experiences in the comments.

I’ve been thinking a lot about retrospectives lately both because our team has been struggling with them being ineffective/wasteful and because retrospectives were the subject of conversation at the last DC/NOVA Scrum Users Group meetup.

Our team has been tweaking and experimenting with various modifications to our process but one thing that we’ve left untouched for a year now is our retrospectives.  Every sprint, we ask “What went well?” “What went badly?” and “What can we improve?”  Rarely, though, do we follow through on those items under “What can we improve?”  We’ve tried forcing ourselves to make these items concrete, posting them near our board, bringing them up in the standup meeting, etc. but to no avail.

We’re now experimenting with changing the meeting itself to better foster improvement.  First: we’re going into the meeting with an agenda rather than having it be a free form discussion.  Generally, when we treat it as free form, memories are dull and it is difficult to start a conversation.  We’re hoping that a prepared agenda (to which anyone can contribute) will help grease the skids, so to speak.

Secondly, the “What can we improve?” section will now be more explicitly a “What experiments should we run?” discussion – things like “should we be using Selenium instead of our current solution?” or “what if we only tasked out half of the stories at the beginning of the sprint and left the second half until the middle of the sprint?”

The “What went well?” and “What went badly?” topics, then, can focus not only on unexpected things that came up but also on the results of these process, tooling, and workflow experiments that we’re running.

Hopefully, this will prove to be a true PDCA loop and really drive improvement.  After all, that’s what retrospectives are supposed to do.

A number of recent problems have caused our team to tweak our weekly Backlog Review meeting.  Specifically, we’ve added two items to the agenda: 1) a review of the stories tentatively scheduled for the next sprint (and possibly the next one after that) and 2) providing our PO with an estimate of our story point capacity for the next sprint.  If you’re thinking “why weren’t you reviewing upcoming stories already and why don’t you just use your velocity?” read on.

The first problem we experienced dealt with a specific story – a new database report that was needed.  The report was originally conceived and designed a few months ago but the project was postponed for several (valid) reasons.  The story, though, had been estimated months ago and then not touched again.  When business priorities were such that the report was again a priority, we simply dropped it into a sprint.  That’s when things got ugly.

In the intervening months, the estimate had become stale.  We had learned several rather critical lessons about this particular type of database report (we had developed similar ones in the meantime) but never incorporated that learning into this story or its estimate.  What we originally thought was an 8 point story instantly became 6 different 5 point stories.

That’s all fine and good – estimates are estimates and it is expected that the team will gain new knowledge and refine estimates as time goes on.  That’s not what happened, though.  Instead, we only realized the problem during our sprint planning meeting.  Since our velocity was hovering in the low 30s, the revised set of stories ate an entire sprint.  Product Management was not expecting that at all.  Though they were prepared for estimates to shift and priorities to have to be moved around, they were not prepared for a sprint that they thought (and were basically told) would hold 4 or 5 different stories being eaten up by one story.
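The arithmetic behind that surprise is easy to sketch.  The story sizes below come straight from this post; the velocity of 32 is only an assumed stand-in for “the low 30s.”

```python
# Figures from the post: one 8-point story was re-estimated as six 5-point stories.
ORIGINAL_ESTIMATE = 8
REVISED_STORIES = [5] * 6

# Assumption: the post only says velocity was "in the low 30s"; 32 is a stand-in.
ASSUMED_VELOCITY = 32


def sprint_share(points: int, velocity: int) -> float:
    """Return the fraction of one sprint's capacity that `points` would consume."""
    return points / velocity


revised_total = sum(REVISED_STORIES)
print(f"original: {ORIGINAL_ESTIMATE} pts, "
      f"{sprint_share(ORIGINAL_ESTIMATE, ASSUMED_VELOCITY):.0%} of a sprint")
print(f"revised:  {revised_total} pts, "
      f"{sprint_share(revised_total, ASSUMED_VELOCITY):.0%} of a sprint")
```

Under that assumed velocity, the story goes from taking about a quarter of a sprint to taking nearly all of it, which is why the other stories no longer fit.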

The lesson we learned was that we simply couldn’t allow stories and their estimates to become stale – there was too much risk that the story itself no longer made sense or that the estimate was now wildly off.  Ideally, stories would never get stale because the backlog would be a pull system and stories would be planned and estimated very close to when development would begin.  Unfortunately, a huge piece of prioritization is weighing costs and Product Management can’t gauge the cost/benefit of each story relative to others without estimates.  Thus, it sometimes happens that stories are estimated and then shelved for a while.

To deal with this, both the team and Product Management now know to be on the lookout for any stale stories or estimates that might be making their way to the top of the backlog.  Additionally, we’ve now broken our backlog review into two parts: estimating new stories and reviewing the one or two sprints’ worth of stories at the top of the backlog in case anything needs to be tweaked.  This uncovers problems a lot earlier than the sprint planning meeting and gives Product Management a chance to move around other priorities and reset customer expectations.

Secondly, our team’s capacity over the last several sprints has been somewhat erratic due to overlapping vacations, conference attendance, and other circumstances.  As a result, Product Management has been subjected to a few rude surprises on the first day of a new sprint when we tell them that our capacity for this sprint is half of what it was last time.  The fix for this (at least until our schedules settle down and we can really rely on our velocity again) is to take 2-3 minutes in the backlog review and estimate (in story points) our capacity for the next sprint.  This again gives Product Management some early warning and time to shift things around as necessary.
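We make this estimate by quick discussion, not by formula, but for anyone who wants a starting number, here is one possible sketch: scale a baseline velocity by the share of person-days the team will actually have.  The function name, the baseline of 32, and the availability figures are all hypothetical.

```python
def estimate_capacity(baseline_velocity: float, days_available: list[int], sprint_days: int) -> int:
    """Scale a baseline velocity by the share of person-days actually available.

    `days_available` holds one entry per team member: the working days that
    person expects to be present during the sprint.
    """
    if sprint_days <= 0 or not days_available:
        raise ValueError("need a positive sprint length and at least one team member")
    full_strength = sprint_days * len(days_available)
    availability = sum(days_available) / full_strength
    return round(baseline_velocity * availability)


# Hypothetical sprint: 10 working days, four people, two of them away throughout.
print(estimate_capacity(32, [10, 10, 0, 0], sprint_days=10))
```

A number like this is only a conversation starter for the backlog review; it ignores who is out and which stories depend on them.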

I’d be interested to hear if anyone else has had similar problems and what their solutions were.

 | Posted by | Categories: Scrum |

On Monday of last week, our network and desktop support team (what we call our “IT Team” as distinct from our software development teams) began experimenting with Kanban as our project management framework.  Heretofore, we’d simply been handling our project management and priorities in a sort of ad-hoc fashion.  We knew we wanted to ratchet it down, but didn’t want to use Scrum since the IT team is more of a support organization that would not operate well using time boxes.  We decided to experiment with Kanban for multiple reasons, including its suitability for support organizations and its focus on lean principles.

Our first week of experimenting with Kanban went quite well.  The major benefit we saw was the visualization of our work and workflow.  On Monday, we held our first retrospective and identified the first major process issue we want to address: widening the ownership of the backlog to the entire team.  Up until now, I had generally been the one ultimately responsible for what we worked on and in what order.  Obviously, there was input from the rest of the team and other stakeholders but there was a sense that I was the gatekeeper for priorities.

Kanban has highlighted the inefficiencies in that arrangement and we’re now trying to actively discuss the backlog and new issues at least once a day in our daily meeting – if not more often throughout the day.  This is definitely going to be an ongoing improvement effort so I expect we’ll keep this as an action item for at least several weeks until we get to a point where we think the entire team has full ownership of the backlog.

 | Posted by | Categories: Kanban |


19 March 2010

Today was the opening of AgileCoachCamp 2010 (#ACCNC) here in Durham.  So far, we’ve had a few rounds of lightning talks which were limited to 3 minutes and no slides as well as a lot of networking and generally good conversation.

In my lightning talk, I mentioned that I like to refer to our previous methodology (what we were doing prior to moving to Scrum) as “Waterhocking.”  I think it accurately captures the nature of our previous process.  It was definitely ad-hoc insofar as we weren’t following any particular project management framework and just handling things as they came up on a case-by-case basis.  It was similar to heavyweight “waterfall” methods in that we had extensive requirements gathering and documentation phases (BDUF), lengthy periods where the team would keep heads down and just try to build exactly what was documented, and too little user and acceptance testing too late.  Lastly, our releases often felt like we were hocking the product up since we were often under a fixed deadline and killing ourselves to get a product out the door only to find that the customer wasn’t happy with what was delivered.

Apparently, this label struck a chord with my fellow participants – it has a few mentions on Twitter.  I’m definitely looking forward to tomorrow’s sessions; my only regret is that I can’t be in six or seven places at once.  There are so many really experienced, really insightful people here that it is impossible not to miss great talks.

 | Posted by | Categories: Software Development | Tagged: , |

Our company has been in the habit of doing periodic EPPs (Employee Performance Plans).  We’ve evolved from doing them once a quarter to once every four months to twice a year.  We’ve gradually lengthened the amount of time an EPP covers because of the overhead involved in putting them together and reviewing them.

When the Engineering department moved to Scrum, we obviously had to change the way we conceived of an EPP.  As usual, the problems we encountered weren’t caused by Scrum – just highlighted by it.

The main problem we encountered was “how can we say we’re planning on doing anything since we don’t set our own priorities?”  In the past, this didn’t seem like a problem because we just subtracted the amount of time needed to complete our EPP objectives from the amount of available time for “PM-sponsored” projects – or we built in the fact that we wouldn’t be working 100% of the time on those projects.  Either route is a problem because it reduces visibility into what the team’s priorities and capacity are.

Another problem was “how can we claim to be agile while putting together six month long personal plans?”

The latest problem we’ve encountered had to do with personal/career/team development, e.g., writing more unit tests, peer reviewing code, experimenting with pair programming, networking with peers outside the company, etc.

I feel like we’ve addressed all three problems fairly well.  Here’s how:

Regarding EPP projects, we realized that engineers making themselves personally responsible for entire projects was simply the wrong approach.  Granted, we wanted to get these projects (mostly technical debt reduction projects) done and granted they are important, but cutting out the rest of the team and the Product Owner is simply not the best way to accomplish them.

We realized that we should not be focusing on the whole project but simply that piece which is under our control.  Thus, we are now adopting EPP goals such as “Advocate for refactoring product X” – with objectives such as “educate Product Management about the costs and potential benefits” and “submit requested user stories and Definitions of Done to our Product Owner”.  In this way, we’re doing everything we can to see that these projects get done without sacrificing the prerogative of the PO to set priorities.  We’re also doing what only we can do: identify, explain, and plan to reduce technical debt or capitalize on new technologies.

Regarding the fact that we’re using six month EPPs, we are very explicit that EPPs – like all plans – should not be written in stone.  Thus, we’ve taken the approach of having quick, monthly reviews of our EPPs to see if there is anything we want to add, remove, or change given our evolving situation and knowledge.  These reviews sometimes only last five minutes; sometimes they last 30.  The point is that they don’t introduce much overhead and they allow us to course correct fairly frequently.

Regarding personal/career/team development goals, the problems we were running into regarded how to measure success.  If we had an EPP goal to “ensure unit tests are written,” what defines success?  What do we say at the end of six months if we didn’t write as many unit tests as we could have for the first month or two, then were pretty good for the rest of the period until the last week when we again may have missed some opportunities for tests?

We realized that we were not focusing on the real issue.  At the end of the period, we didn’t so much want a code coverage percentage as we wanted to be able to say that we had adopted or internalized certain practices.  That is, that we had developed certain habits.  Thus, at the end of the period, the question we ask ourselves is not “what is our code coverage like?” but rather “have we developed the habit of always writing unit tests?”  While this is more subjective, we feel it is still more valuable and it more accurately reflects what we actually want to do.


  • Plan to do those things where you add unique value – bearing in mind that no one person can tackle an entire project alone and, therefore, should not be solely responsible for that project.
  • Review the plan often, making changes as necessary.  The plan is not written in stone.
  • Don’t be seduced by “vanity metrics” like “how many unit tests have I written per story?”  Rather, focus on those habits or practices that you want to develop or internalize and then judge yourself against how well you have become the engineer you want to be.

Spring Conferences

26 February 2010

I’ll be attending Agile Coach Camp 2010 – a bar camp for agile practitioners being held in Durham, NC, March 19-21.  I and a colleague will be able to be there for the entire weekend and hopefully meet up with another co-worker who is based in NC whom we haven’t seen in a while.

I’ll also be attending the Lean Software and Systems Conference in Atlanta from April 21st through the 23rd.

I’m really looking forward to these conferences and meeting anyone else who might be attending these.  If you’re planning on attending, drop me a line and perhaps we can arrange to meet up.

 | Posted by | Categories: Uncategorized | Tagged: |

One of the biggest problems (if not the biggest problem) we have on our team is technical debt that has accumulated over the last 4-5 years. Fortunately, the team and our Product Owners understand the problem of technical debt in general and recognize it in our case.
We’ve started taking small steps to reduce our amount of debt. I’d love feedback and suggestions about what we’re doing and anything that anyone else has found helpful.

The first thing we’ve done is resolved to not take on any additional debt intentionally. Of course, almost anything we implement will eventually become technical debt if it is allowed to collect dust long enough or if circumstances change – but the key point is that it wasn’t debt when we implemented it. Our hope is that the rate at which technical assets become technical debt will be such that we will be able to keep up with regular refactoring.

The second thing is that our POs have made it clear that they are willing to give us the time to pay off technical debt – but the burden is on the team to identify, flag up, and explain the technical debt to the PO so that it can be properly prioritized in our backlog.

The third thing is the team now has a weekly 30 minute meeting to discuss technical debt. We don’t have a firm agenda, but the discussion usually centers around a few points:

  • Are there any pieces of technical debt that we would like to discuss (presumably because we haven’t discussed them in this forum before)?
  • How costly would it be to pay down this piece of debt? (We estimate this using the XS, S, M, L, XL scale.)
  • How costly is the interest on this debt? That is, how much pain is it causing? (We estimate this using a yellow, orange, red scale. I’ll explain why in a minute.)
  • How should we begin the process of paying down this debt? Is this something we can “just fix” with a little effort on the side? Is this something that we should write up a user story and request our PO add to the backlog? Should we keep it in our back pocket for a hack-a-thon project?
  • Who is on point for this piece of debt? That is, who is going to “just fix it” or write the user story or keep it on their own hack-a-thon to-do list?

The fourth thing is that we’re now maintaining a technical debt board on a wall near our sprint backlog. We wanted a visual representation and reminder of our team’s biggest problem. It will hopefully help us stay focused on it, not let us forget about any given piece of technical debt, and help us track and encourage progress (a very important facet, in my opinion).

This is why we estimated cost using size but impact using color – we can visually represent each piece of technical debt using a piece of paper, note card, post-it note, etc. of the appropriate size and color and have a board where someone can assess all of the critical information at a glance (example). If we had just used Fibonacci numbers for each scale and written them in the corners of cards or something, it would be much harder to get a sense for the whole situation.
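For a team that wants the same board in digital form, a minimal sketch might look like the following. The two scales are the ones described above; the item names, the owners, and the ordering rule (most painful first, then cheapest to fix) are assumptions for illustration, not something our team has settled on.

```python
from dataclasses import dataclass

SIZES = ["XS", "S", "M", "L", "XL"]    # cost to pay the debt down
COLORS = ["yellow", "orange", "red"]   # pain ("interest") the debt causes


@dataclass
class DebtItem:
    name: str
    size: str
    color: str
    owner: str  # who is on point for this piece of debt

    def __post_init__(self) -> None:
        # Reject values that are not on the two scales.
        if self.size not in SIZES:
            raise ValueError(f"unknown size: {self.size}")
        if self.color not in COLORS:
            raise ValueError(f"unknown color: {self.color}")


def board_order(items):
    """Assumed ordering: most painful first, then cheapest fix first."""
    return sorted(items, key=lambda i: (-COLORS.index(i.color), SIZES.index(i.size)))


# Hypothetical items, purely for illustration.
board = [
    DebtItem("legacy build scripts", "L", "red", "Alice"),
    DebtItem("unused config flags", "XS", "yellow", "Bob"),
    DebtItem("hand-rolled date parsing", "S", "red", "Carol"),
]
for item in board_order(board):
    print(f"{item.color:<7} {item.size:<3} {item.name} ({item.owner})")
```

Sorting is only a convenience here; the point of the physical board is that size and color can be read at a glance without any sorting at all.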

So far, we’ve identified two pieces of technical debt that we can “just fix” in our spare time and the fixes are in progress. We’ve also begun working on writing the stories necessary to eliminate another debt. Hopefully, we’ll be able to keep up this momentum, increasing our velocity and quality along the way.

Any tips from those who have gone before would be greatly appreciated!

 | Posted by | Categories: Scrum, Software Development | Tagged: |