Refactoring it in

First of all, I should mention the XP event of the year. If you’re interested in XP, or even just in this week’s topic, you should try to get along to this year’s Software Developers Conference (Wellington, March 18 to 20). We’ve got Martin Fowler, who wrote the seminal work on refactoring, and Jim Highsmith, one of the fathers of agile methods. Both are signatories to the Agile Manifesto. There are other great speakers on the bill, and I’ll be running an informal XP workshop.

Refactoring has been called the most important advance in computer science in the last 30 years. It is the act of improving the design of source code without changing its meaning. It is similar in some respects to the housework that programmers have always done — similar, but not the same.

Refactoring is a set of transformations that can be applied to code to improve its design. Unlike normal “tidying up”, or “housework”, these transformations have been formalised. So, now there is a “rename method” refactoring that simply renames a method and all references to it. There is a “move method” refactoring that moves a method from one class to another.
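To make those two refactorings concrete, here is a minimal Java sketch. The `Account` and `AccountType` classes, their method names and the 2% fee rule are all hypothetical, invented for this illustration. The "before" state is shown only as a comment, and the compiled code is the "after" state.

```java
// Hypothetical example: all class and method names are invented for illustration.
class RenameAndMoveDemo {
    public static void main(String[] args) {
        Account standard = new Account(new AccountType(false), 10_000L);
        Account premium = new Account(new AccountType(true), 10_000L);
        System.out.println("Standard fee (cents): " + standard.overdraftFeeCents());
        System.out.println("Premium fee (cents): " + premium.overdraftFeeCents());
    }
}

// Before the refactorings, Account contained something like:
//
//     long calc() {   // vague name, and the rule only uses AccountType data
//         return type.isPremium() ? 0 : overdrawnCents * 2 / 100;
//     }

class AccountType {
    private final boolean premium;

    AccountType(boolean premium) {
        this.premium = premium;
    }

    boolean isPremium() {
        return premium;
    }

    // "Move method": the fee rule depends only on the account type,
    // so it now lives here rather than in Account.
    long overdraftFeeCents(long overdrawnCents) {
        return premium ? 0L : overdrawnCents * 2 / 100;
    }
}

class Account {
    private final AccountType type;
    private final long overdrawnCents;

    Account(AccountType type, long overdrawnCents) {
        this.type = type;
        this.overdrawnCents = overdrawnCents;
    }

    // "Rename method": calc() became overdraftFeeCents(), and every caller
    // was updated to the new name. The body now delegates to AccountType.
    long overdraftFeeCents() {
        return type.overdraftFeeCents(overdrawnCents);
    }
}
```

Neither step changes what the program computes; only the names and the location of the logic change, which is what separates a refactoring from a feature change.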

Along with these transformations come the conditions under which you’d want to use them. Rename method is used when the name of a method is misleading, or simply wrong. When we examine code we often find things that aren’t good; we call each of these a smell. We say “this code smells” and then we refactor it. The process states that you always refactor the instant that you smell bad code.

Of course, if you change code then you run the risk of breaking something. This is where housework differs significantly from refactoring. Before you refactor anything you must have a set of unit tests that show that the code is working. You run the tests, watch them pass, perform a single, small refactoring, and then run the tests again. If they pass again, you can do another refactoring; if not, you can back out that last change (because it was small).
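The cycle described above might look something like the following Java sketch. Everything in it is an assumption made for illustration: the `Invoice` class, the 15% tax rate, and the hand-rolled `check` helper, which stands in for a real unit-testing framework such as JUnit. The tests in `runTests()` are run before the change and again after it.

```java
// Hypothetical example: plain-Java checks stand in for a unit-testing framework.
class InvoiceTestRunner {
    public static void main(String[] args) {
        runTests();
        System.out.println("All tests passed");
    }

    // The safety net: run this before and after every small refactoring.
    static void runTests() {
        Invoice invoice = new Invoice(3, 1_000L);
        check(invoice.totalCents() == 3_000L, "total is quantity times unit price");
        check(invoice.totalWithTaxCents() == 3_450L, "15% tax is added to the total");
    }

    static void check(boolean condition, String description) {
        if (!condition) {
            throw new AssertionError("Test failed: " + description);
        }
    }
}

class Invoice {
    private final int quantity;
    private final long unitPriceCents;

    Invoice(int quantity, long unitPriceCents) {
        this.quantity = quantity;
        this.unitPriceCents = unitPriceCents;
    }

    long totalCents() {
        return quantity * unitPriceCents;
    }

    // Before the refactoring this method repeated the multiplication:
    //     long total = quantity * unitPriceCents;
    // The single small change was to reuse totalCents() instead,
    // removing the duplication without altering the result.
    long totalWithTaxCents() {
        long total = totalCents();
        return total + total * 15 / 100;
    }
}
```

If `runTests()` had failed after the change, the only thing to undo would be that one line, which is the point of keeping each step small.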

This is amazingly powerful. The tests give you the confidence to change anything. If part of a design is bad, you can change it, without worrying about breaking things, even if it’s a difficult and large change. You do it in small steps, verifying the integrity of your code with each step. Magic.

Because tests can be difficult to write we proceed in small steps. We write the trivial test first, then write the inside lower boundary condition, then inside upper boundary, then outside lower and finally outside upper. We do it one at a time, making each one pass before writing the next, refactoring as we go.
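As a sketch of that ordering, suppose we are writing a hypothetical `Percentage.isValid` method that should accept whole numbers from 0 to 100 inclusive. The class, the range and the `check` helper are assumptions for this example, and in practice each test would be written and made to pass before the next one is added.

```java
// Hypothetical example: the class, range and helper are invented for illustration.
class BoundaryTestDemo {
    public static void main(String[] args) {
        runTests();
        System.out.println("All boundary tests passed");
    }

    // Tests listed in the order they would be written, one at a time.
    static void runTests() {
        check(Percentage.isValid(50), "trivial case: a value well inside the range");
        check(Percentage.isValid(0), "inside lower boundary");
        check(Percentage.isValid(100), "inside upper boundary");
        check(!Percentage.isValid(-1), "outside lower boundary");
        check(!Percentage.isValid(101), "outside upper boundary");
    }

    static void check(boolean condition, String description) {
        if (!condition) {
            throw new AssertionError("Test failed: " + description);
        }
    }
}

class Percentage {
    // Accepts whole percentages from 0 to 100 inclusive.
    static boolean isValid(int value) {
        return value >= 0 && value <= 100;
    }
}
```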

A software system is analogous in some ways to a physical system. As time progresses the design of a system degrades because we make changes to it. Each change adds to the disorder, increasing entropy. This is why systems often have to be retired after a number of years; they literally become unmaintainable. We call this software entropy.

However, as with a physical system we can reduce entropy in a software system by introducing new energy. In software we call this energy refactoring. After each change, we need to refactor — that is, we reduce the disorder of the system.

We start with a totally ordered system, containing no code. We add some code, increasing its disorder and then we refactor, adding as much order back into the system as we can without changing its behaviour.

Making a change in an ordered system is easy. There is no duplication, there will be no side effects and you’ll have the tests to ensure that you do it right. Compare this with the idea of making a change to your current system.

How changeable are our systems? One team in the US recently reported that they’d retrofitted EJB support into their large J2EE system, in a week. Think about your current project …

The biggest impact of refactoring, however, is that it effectively flattens the cost of change curve. That curve says a change to requirements made during development costs between 1000 and 10,000 times more than the same change introduced at the start of the project. With XP, making a change costs as much as adding a new feature; the cost is linear. This is mostly due to refactoring and unit tests.

Some tools now support refactoring. Smalltalk has had a refactoring browser for a while, and Java is just beginning to get similar tools. Plug-ins are available for VAJ and JBuilder, but the best and most productive development environment I’ve ever used comes from a Czech company called IntelliJ, and is called IDEA. IDEA offers around 30 refactorings, and is amazing. I’d strongly recommend it to any Java developer.

Dollery is a Wellington-based IT consultant.
