Monday, January 28, 2008

On the effectiveness of TDD

There's a fascinating exchange between Phil Haack and Jacob Proffit on the implications of a National Research Council of Canada paper titled "The Effectiveness of Test-first Approach to Programming".

The experimenters divided 24 third-year CS students into two groups, one practicing Test-First development and one practicing Test-Last development. Each implemented the same functionality. The Test-First group always wrote unit tests before writing each feature. The Test-Last group wrote unit tests after writing all of the features. It wasn't a long-running experiment, so I would be tempted to describe the "Test-Last" group as the "Test-Soon" group, but that's quibbling. The researchers arrive at conclusions favorable to Test-First ... which you should gather from reading the report (and the commentary) on your own.

Phil discusses this study in his post "Research Supports the Effectiveness of TDD". While he doesn't say that the study actually proves that TDD is effective, he clearly thinks highly of it.

Jacob responds in a post on his own blog, "TDD Proven Effective. Or is it?", with a devastating (IMHO) critique of the study and obliquely criticizes Phil for succumbing to "Confirmation Bias".

The essence of Jacob's argument (if I may) is that (a) the study data do not confirm the thesis that Test-First is more "effective" than Test-Soon, (b) there are disturbing data suggesting that the Test-Soon control group produced better code, (c) there is a substantial cost to changing one's programming practice to Test-First, and (d) he is reluctant to make such a switch until there is decent evidence for it.

I think the study, Phil's post, and Jacob's critique make superb reading so you shouldn't rely on my commentary for anything other than inspiration to check it out for yourself.

I think Jacob has the best of it here (and it appears that Phil eventually accepts Jacob's critique while keeping faith with TDD). I especially appreciate the manner in which the two of them move the debate along. Their exchange is full of passion and intelligence but never strays from civility.

In the end they agree (without evidence) that, whether you prefer "Test-First" or "Test-Soon", the outcome is substantially better than "Test-Never".

Sadly, the NRC experimenters didn't include a "Test-Never" group, but I think there may be good evidence to support the notion that some testing (whether "first" or "soon") beats no testing. I intend to read one of the referenced papers on this subject: "A Longitudinal Study of the Use of a Test-Driven Development Practice in Industry".

Don't look to me for conclusions. I am generally persuaded that we should have more social research before we start telling everyone what they have to do. That said, I am persuaded of the merits of testing and have a good feeling about TDD ... when I take time to practice it.
