Baseball lessons

SAN FRANCISCO (09/19/2003) - Doing more with less is the theme of Michael Lewis' terrific new book, Moneyball. This David-versus-Goliath tale explains how the low-budget Oakland Athletics consistently win more games than much richer teams. Moneyball is not just a baseball book; it's a treatise on the science and economics of individual and team performance. The methods pioneered by Oakland General Manager Billy Beane, based on the theoretical foundations laid by maverick statistician Bill James, hold important lessons for enterprise IT.

Here's one: Moneyball points to things we might be measuring. At the moment, we seem to have no idea what to measure. The infamous KLoC (thousand lines of code) metric refuses to go away because, for better and worse, it's an obvious thing to count. But what else might we count? Even in statistics-drenched baseball, there were surprises. For example, the crucial Jamesian discovery that walks were much more important than anyone had guessed was delayed, for years, because nobody bothered to count walks.

The economic corollary, cleverly exploited by Billy Beane, was that players who walked more often were being systematically undervalued by the market and could be had more cheaply. Software development has an equivalent to the base on balls: a module that doesn't have to be written new because it already exists and can be reused.

We know that the best coders are far more productive than the norm. Arguably, that variance might lie in the patience, discipline, and research skills required to recycle rather than to reinvent. If so, finding ways to measure and value these qualities could yield a pivotal advantage.

There will always be programmers who just want to step up to the plate and hack away, but many have internalized the virtue of laziness and prefer to buy (or in the case of free software, download) rather than build whenever possible. So there's going to be recycled code in almost any project, and it's not hard to measure how much. That number might be helpful, but it won't tell us whether the reuse opportunity has been fully explored and properly exploited. That's a judgment call requiring expertise that we just can't measure. Or can we?
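To make the "how much" concrete, here is a minimal sketch of a reuse ratio: the share of a project's lines that came from existing modules rather than being written new. The Module record, the reuse_ratio function, and the sample project figures are all hypothetical, invented for illustration. The sketch counts reuse; it says nothing about whether the opportunity was fully exploited.

```python
from dataclasses import dataclass


@dataclass
class Module:
    name: str
    lines: int
    reused: bool  # True if taken from an existing library or earlier project


def reuse_ratio(modules):
    """Return the fraction of total lines that live in reused modules."""
    total = sum(m.lines for m in modules)
    if total == 0:
        return 0.0  # an empty project has nothing to reuse
    reused = sum(m.lines for m in modules if m.reused)
    return reused / total


# Hypothetical project inventory (made-up names and line counts).
project = [
    Module("http_client", 4000, True),
    Module("xml_parser", 6000, True),
    Module("billing_rules", 2500, False),
    Module("report_ui", 7500, False),
]

ratio = reuse_ratio(project)
print(f"{ratio:.0%} of lines were reused")
```

Counting by lines inherits every weakness of KLoC, so a count by module or by function might serve just as well; the choice of unit is an assumption of the sketch.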

A piece of software isn't just a bag of bits; it's the nexus of a community of developers and users. In the case of open source projects -- and increasingly of commercial ones, too -- these communities live online. Mailing lists, personal e-mail correspondence, RSS feeds, weblogs, and Wikis are all grist for the social network analyst's mill. By mining such data, this new breed of statistician is able to measure the strength of a person's ties to a community and can assess his or her reputation within that community.

Membership in multiple communities is another potentially valuable clue. In software development as in science, breakthroughs often occur when insights flow across disciplinary boundaries. The conductors of these flows are typically generalists who belong to several (or many) communities and who form bridges among them.
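As a rough illustration of how such bridge-builders might be spotted, the sketch below counts the distinct communities each person posts to and flags anyone active in two or more. The posts list, the names in it, and the two-community threshold are assumptions made up for the example; real social network analysis would weigh far more than raw membership.

```python
from collections import defaultdict

# Hypothetical data: one (person, community) pair per mailing-list post.
posts = [
    ("alice", "python-dev"),
    ("alice", "xml-dev"),
    ("alice", "python-dev"),
    ("bob", "python-dev"),
    ("carol", "xml-dev"),
    ("carol", "rss-dev"),
    ("dave", "rss-dev"),
]


def community_memberships(posts):
    """Map each person to the set of communities they have posted to."""
    memberships = defaultdict(set)
    for person, community in posts:
        memberships[person].add(community)
    return dict(memberships)


def bridges(posts, min_communities=2):
    """Return, in sorted order, people active in at least min_communities."""
    memberships = community_memberships(posts)
    return sorted(
        person
        for person, communities in memberships.items()
        if len(communities) >= min_communities
    )


print(bridges(posts))  # ['alice', 'carol']
```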

We know intuitively that these are good people to have on your team. Although we can't yet quantify the difference they make, the data, for the most part, is already available for inspection. Perhaps even now another Bill James is making discoveries that will revolutionize what we think we know about the price/performance ratio of software teams.
