
Some Thoughts on Software Estimation

"All programmers are optimists." Fred Brooks's adage is as true today as it was when he first penned the words. The truth is that people are naturally very poor estimators. We all have a tendency to believe everything will go more smoothly than it probably will. Things will work out just as we planned, assuming we bothered to plan at all.

I encounter the reality of this often in my work. The first deadline passes, and we assure ourselves that we can catch up to meet the next one. The next deadline passes, and we finally come to terms with the reality that we're not going to make it. Then we push the schedule back a week when, if we really reflected on the situation, we would realize we are probably a couple of months from actually completing the project.

The product gets later and later one day at a time. Lots of yelling and finger-pointing ensues. Management demands better estimates, but rejects any that don't fit with their view of when we "should" be done, largely because there is pressure coming down the chain of command all the way from the top, far removed from the realities of technical issues that are hampering release. Thus, the schedule never really reflects the state of the software. When it comes time to plan the next release, we say we'll do better. What we often really mean is that we'll do better at racing to meet the schedule we've prescribed for ourselves rather than do better at estimating the amount of time it's actually going to take to build the thing.

It's too easy to lose sight of the fact that schedule estimation should be descriptive rather than prescriptive. We can talk endlessly about how long it theoretically or ideally should take us to finish and what the marketing department thinks would be the ideal time to launch, but at the end of the day, we have the requirements and the team and the legacy code that we have, and these circumstances largely determine the amount of time required to build the software. If we recognize and acknowledge these facts, we can do a better job of scheduling.

Many of us, though, operate on a smaller scale, planning and developing one small piece of the software at a time, but even in this context we can contribute to solving the problem. The next time you prepare to give an estimate of the number of hours or days a given task is going to take you, consider not just the time to write the code for the functionality you're tasked with delivering.

How much time will it take to:

  • write tests to validate the code?
  • debug and fix errors you make in writing the code?
  • debug and fix regressions you might introduce in other areas of the code as a result of your changes?
  • overcome tricky issues you didn't anticipate?
  • run tests and verify they all pass?
  • resolve issues identified during code review?


These are easy items to overlook when you are coming up with an estimate. Not all of these items will consume your time for every task, but almost certainly they will all come up from time to time. If you don't take them into account when devising an estimate, you will exceed your estimate more often than not.
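As a back-of-the-envelope illustration, here is a minimal Python sketch of rolling the checklist above into a single task estimate. The item names and hour figures are hypothetical placeholders, not numbers from any real project.

```python
# Hypothetical line items for one task, in hours. The point is simply that
# the "extras" often rival or exceed the time spent writing the feature itself.
task_hours = {
    "write the functionality": 8.0,
    "write tests to validate the code": 4.0,
    "debug and fix errors in the new code": 3.0,
    "debug and fix regressions elsewhere": 2.0,
    "overcome unanticipated tricky issues": 3.0,
    "run tests and verify they all pass": 1.0,
    "resolve code review feedback": 2.0,
}

total = sum(task_hours.values())
print(f"Coding alone: {task_hours['write the functionality']:.1f} hours")
print(f"Whole task:   {total:.1f} hours")
```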

There are several techniques for estimating. A few that have been used in software are WAGs (wild-ass guesses), Boehm's Constructive Cost Model (COCOMO) and the program evaluation and review technique (PERT). I won't go into details, but it is not hard to find more information about them. I prefer PERT, primarily because it results in a range estimate.
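For those who haven't seen it, here is a minimal sketch of the classic three-point PERT calculation, using the usual beta-distribution approximation; the optimistic, most-likely, and pessimistic inputs below are hypothetical.

```python
def pert_estimate(optimistic, most_likely, pessimistic):
    """Return the PERT expected duration and standard deviation.

    The classic weighting gives the most-likely value four times the
    weight of the extremes; the standard deviation spans the extremes.
    """
    expected = (optimistic + 4 * most_likely + pessimistic) / 6
    std_dev = (pessimistic - optimistic) / 6
    return expected, std_dev

# Hypothetical inputs, in days.
mean, sigma = pert_estimate(optimistic=3, most_likely=5, pessimistic=12)
print(f"Expected: {mean:.1f} days, roughly {mean - sigma:.1f}-{mean + sigma:.1f} days")
```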

That brings me to the next aspect of estimation. Beyond accounting for every part of a task when producing an estimate, there are a couple of other characteristics of an estimate to consider. Should it be a single value or a range? I generally advocate for a range because it better reflects the uncertainty of the estimate. A single-valued estimate is highly likely to be wrong. In some sense, people will often translate it into a range anyway. However, the range others read into your single-valued estimate may not be the range you intended, so it often makes sense just to state the range explicitly.

Sometimes the consumers of your estimate want the greater sense of certainty they believe comes with a single number. In these cases, a good question to ask yourself is how biased the estimate should be. If someone gave you an estimate for delivering something you wanted, and that estimate turned out to be incorrect, would you rather have the item delivered in less time or more time than was estimated? In less time, of course; that is, most of us would prefer the estimate be longer than the actual time required to deliver. This implies people favor receiving pessimistic estimates even though they have a propensity to give optimistic ones.

Now, clearly there has to be a limit on how pessimistic an estimate can be. I don't think "the end of time" is likely to be received well as an estimated time of completion. I like to go back to the Pragmatic Programmers' advice of gently exceeding expectations when attempting to determine how pessimistically to estimate.

There are two competing factors. One is providing an estimate that you will be able to meet or beat, i.e. deliver at or before the estimated date, most of the time, and thus exceed expectations. The other is avoiding an estimate that is either perceived as, or proves to be, a gross overestimate, which will reduce your credibility. I often settle on aiming for the 95th percentile, meaning I am 95% confident I will deliver at or before my estimate. People often look at me a little cross-eyed when I give a 95%-confidence estimate, but they tend to appreciate later that the time to complete the task rarely exceeds the estimate given. Going beyond 95%, however, tends to require egregious overestimation and offers vastly diminishing returns.
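As a rough sketch of how one might pad an estimate to hit something like the 95th percentile, the snippet below adds about 1.645 standard deviations to a PERT-style mean, assuming a normal approximation of the distribution; both the approach and the numbers are illustrative, not a prescription.

```python
def confident_estimate(expected, std_dev, z=1.645):
    """Pad a mean estimate by z standard deviations.

    Under a normal approximation, z = 1.645 corresponds to roughly 95%
    one-sided confidence of delivering at or before the quoted date.
    """
    return expected + z * std_dev

# Hypothetical PERT outputs from the earlier sketch, in days.
quote = confident_estimate(expected=5.8, std_dev=1.5)
print(f"Quote about {quote:.1f} days to be ~95% confident of delivering on time")
```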

None of the above is intended to be too prescriptive. Estimation is a topic that the software industry has struggled with since its inception and will probably continue to struggle with for many years to come. The main point here is to attempt to consider all of the relevant factors in estimating the time required to complete a software task and deliver estimates that can be met or exceeded more often than not. I hope these ideas assist you in your estimation. Best of luck meeting your next deadline.
