Showing posts with label refactoring. Show all posts

fishing and refactoring: A story with two endings

The Happy Ending

I'm salmon fishing on the River Spey, on the beautiful Tulchan beat, concentrating on spey casting. My phone is back at the hotel. My laptop is back at the hotel. I'm not thinking about coding. I'm fishing and I'm only fishing. Suddenly, from "nowhere" my subconscious pops an idea up into my consciousness. It says, hey, you know that code you were working on 2 weeks ago...? Where you had that nagging feeling there was a better, simpler solution? I know you know what I'm talking about! Well, I've been working on that for you. And I've come up with this. Here's the idea. Tada. What do you think?

For a moment, I think about the idea. It seems a good one. Of course, I can't try it out now because I'm waist deep in the middle of the River Spey. I promise myself I'll look into it when I get back to the hotel. I carry on trying to catch a salmon.

Later, back at the hotel I try out the new idea. I re-run all the unit tests and they pass. I make the change.

A few weeks later I'm fishing again. This time on the River Tay at Dunkeld. My subconscious pops up a new idea. I make a note of it. In the evening, back at the hotel I try the change. All the tests pass. The change is live.

A few weeks later I'm fishing again. On the River Verdal in Norway. My subconscious pops up with another new idea...


The Sad Ending

I'm salmon fishing on the River Spey, on the beautiful Tulchan beat, concentrating on spey casting. My phone is back at the hotel. My laptop is back at the hotel. I'm not thinking about coding. I'm fishing and I'm only fishing. Suddenly, from "nowhere" my subconscious pops an idea up into my consciousness. It says, hey, you know that code you were working on 2 weeks ago...? Where you had that nagging feeling there was a better, simpler solution? I know you know what I'm talking about! Well, I've been working on that for you. And I've come up with this. Here's the idea. Tada. What do you think?

For a moment, I think about the idea. It seems a good one. Of course, I can't try it out now because I'm waist deep in the middle of the River Spey. I promise myself I'll look into it when I get back to the hotel. I carry on trying to catch a salmon.

Later, back at the hotel I think about the new idea. I'm wary of making the change. There are no unit-tests I can quickly run. I'm remembering a few months back - that time when I tried an idea and broke stuff and was castigated. I'm fearful of making the change. I don't make it.

A few weeks later I'm fishing again. This time on the River Tay at Dunkeld. My subconscious gives me a new idea. I make a note of it. In the evening, back at the hotel I think about making the change. There are still no unit-tests. And I'm still remembering that time a few months back - when I tried an idea and broke stuff. Boy was I ridiculed. And if anything the culture is a bit worse since then. I can feel the fear. Once again I decide not to make the change.

A few weeks later I'm fishing again. On the River Verdal in Norway. This time, my subconscious doesn't pop up a new idea.

lessons from testing

I have run hundreds of test-driven coding dojos using cyber-dojo.
I see the same test anti-patterns time after time after time.
Do some of your tests exhibit the same anti-patterns?



are you missing a TDD step?

Here's a TDD state diagram.
  • start by writing a test for new functionality
  • see it fail
  • make it pass
  • refactor
  • round and round you go
It looks a bit like an animal. Let's give it some eyes!



But there's something not right!
There's no red-to-red self-transition.
My animal is missing an ear!
I'll add the missing ear.




What is this new ear?
It's for changes made at red.
I see the test fail.
I read the diagnostic.
Then I stay at red and improve the diagnostic.
When I'm happy with the diagnostic I get rid of it by making the test pass.
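The red-to-red ear can be sketched in Ruby. This is a hypothetical illustration (leap_year? and check are invented names, not from the post): the function under test is deliberately wrong so the test is at red, and instead of a bare "failed" message, I stay at red and build a diagnostic that names the input and both values.

```ruby
# Hypothetical function under test - deliberately incomplete, so we are at red.
def leap_year?(year)
  year % 4 == 0   # wrong for 1900, which was not a leap year
end

# First attempt at red might be:  raise "failed" unless leap_year?(1900) == false
# That diagnostic says almost nothing. Staying at red, improve it so a failure
# names the input and shows actual vs expected. Returns nil when the check passes.
def check(year, expected)
  actual = leap_year?(year)
  return nil if actual == expected
  "leap_year?(#{year}) returned #{actual}, expected #{expected}"
end
```

Only once the diagnostic reads well would I make the test pass, by handling the century rule.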



This was part of my lessons from testing presentation which reviews common test anti-patterns I see on cyber-dojo.

Note: I'm being careful not to call this red-to-red transition a refactoring since refactoring is for changes made at green.


XP and culture change

Last week, at a client's site, I noticed an old, coffee-stained copy of the Cutter IT Journal. It was titled "XP and culture change", dated September 2002. Here are some quotes from it.

From Kent Beck:

Because culture embodies perception and action together, changing culture is difficult and prone to backsliding.

Is it easier to change your perception or go back to designing the old way?

From Laurent Bossavit:

A process change will always involve a cultural change.

We were also a culture of Conviviality, which you could easily mistake (as I did at first) for a culture of Communication... In Conviviality what is valued is the act of sharing information in a group setting - rather than the nature, quantity, or quality of the information thus shared.

Culture is what remains when you have forgotten everything else.

From Mary Poppendieck and Ron Moriscato:

If there were one thing that Ron's team would do differently next time, it would be to do more refactoring.

XP is a process that doesn't feel like a process.

The theory of punctuated equilibrium holds that biological species are not likely to change over a long period of time because mutations are usually swamped by the genes of the existing population. If a mutation occurs in an isolated spot away from the main population, it has a greater chance of surviving.

From Ken Schwaber:

Agile process management represents a profound shift in the development of products and software. Agile is based on an intuitive feel of what is right, springs from a completely different theoretical basis than traditional development processes, and is in sum a wholly different approach to building products in complex situations.

From Matt Simons and Chaitanya Nadkarny:

A fixed-bid contract changes the very nature of the relationship between customer and vendor from collaborative to "contentious". "Embrace change" undergoes a fatal transformation into "outlaw change."

There is no way to pretend everything is fine when you have to deliver software to your customer every few weeks.

From Nancy Van Schooenderwoert and Ron Moriscato:

The advantages of pair programming hit you hard and fast. As you explain an area of code to your partner, you get a deeper understanding of how it fits into the current architecture. You're your own peer reviewer!

After pair programming for a while, we found ourselves in a situation where the entire team had worked somewhere in the module in the recent past. Code reviews became exciting idea-exchanging periods where refactoring tasks were discussed and planned.

With schedule pressure, there is a huge temptation to put off refactoring, and we did too much of that.

It's not enough for the code to work; it also has to serve as a solid base for the next wave of features that will be added.

All through the project, a frequent cause was that unit testing wasn't thorough enough.


refactoring in the shower

Some houses are populated only with teenagers. When one gets in the shower they notice the plug-hole is clogged with horrible gunk: hairs of indeterminate origin, bits of soap, spit, bogies, general fluff (like the stuff that grows in your belly button), congealed shower gel, etc. But they don't clean out the plug-hole. After all, they didn't make the mess. When they've finished they're cleaner. But only slightly. And they don't clean the plug-hole. Partly because they then notice there's no towel. They curse the other teenagers as they walk around the house naked, dripping water as they go, looking for anything vaguely absorbent they can commandeer.
After a while the plug-hole gets so gunked up that tackling it becomes a truly repulsive task. The gunk actually clogs the plug-hole. The shower fills with water - it starts turning into a bath! But it wasn't designed as a bath. Splashes leak over the lip of the door. Water collects under the shower base where it's not visible. Not at first anyway. After a while the ceiling below the shower starts to discolour in a tell-tale circular pattern. But it goes unnoticed. At first anyway. Soon the ring grows so large it's unmissable. But it's ignored. After all, it's a full-scale call-the-plumber-and-tiler job now.

Some houses contain a mixture of teenagers and adults. When a teenager showers sometimes the plug-hole is gunk free and there's a dry towel but sometimes the plug-hole is gunked up and there's no dry towel. They can't quite understand this. But they never clean the plug-hole and never put up a clean dry towel. When an adult showers sometimes there's a build-up of plug-hole-gunk and no dry towel. They know the previous shower-goer was a teenager. Sometimes there's no plug-hole-gunk and there is a dry towel. They know the previous shower-goer was an adult. When they've finished they clean out any plug-hole-gunk and put up a new dry towel.

Some houses house only responsible adults. Whenever they start to shower they don't notice any gunk around the plug-hole because there never is any. And there's always a fresh clean towel. After they've showered they clear any plug-hole gunk and put up a new dry towel.

sliming and refactoring and deliberate duplication



Suppose I'm doing the Print-Diamond kata in Ruby:
Given a letter print a diamond starting with 'A' 
with the supplied letter at the widest point. 
For example: print-diamond 'E' prints
    A
   B B
  C   C
 D     D
E       E
 D     D
  C   C
   B B
    A
I start with a test
def test_diamond_A
  assert_equal ['A'], diamond('A')
end
which I pass using
def diamond(widest)
  ['A']
end
I add another test
def test_diamond_B
  assert_equal [' A',
                'B B',
                ' A'], diamond('B')
end
which I pass using
def diamond(widest)
  if widest == 'A'
    return ['A']
  end
  if widest == 'B'
    return [' A',
            'B B',
            ' A']
  end
end
I add one more test
def test_diamond_C
  assert_equal ['  A',
                ' B B',
                'C   C',
                ' B B',
                '  A'], diamond('C')
end
which I pass using
def diamond(widest)
  if widest == 'A'
    return ['A']
  end
  if widest == 'B'
    return [' A',
            'B B',
            ' A']
  end
  if widest == 'C'
    return ['  A',
            ' B B',
            'C   C',
            ' B B',
            '  A']
  end
end
The tests have already proved valuable:
  • I've decided I don't want to actually test printing
  • I've chosen the result format - an array of strings
  • I've chosen not to embed newlines at the end of the strings
  • I've something to refactor against
However, there is no point in carrying on sliming. As the tests get more specific, the code should get more generic. I have three specific tests, but the code is equally specific. I need to generalize the code.

While coding the array of strings for the 'C' case I found myself copying the result for 'B' and modifying that. Specifically, I had to:
  • duplicate the 'B B' string
  • add a space at the start of the ' A' and 'B B' strings
  • add a new middle string 'C   C'
This gave me the idea to try a recursive implementation. My first step was to refactor the code to this:
def diamond(widest)
  d = inner_diamond(widest)
  mid = d.length / 2
  d[0..mid-1] + d[mid+1..-1]
end

def inner_diamond(widest)
  if widest == 'A'
    return ['A',
            'A']
  end
  if widest == 'B'
    return [' A',
            'B B',
            'B B',
            ' A']
  end
  if widest == 'C'
    return ['  A',
            ' B B',
            'C   C',
            'C   C',
            ' B B',
            '  A']
  end
end
This looks a promising step towards a recursive solution - to make the implementation of 'C' contain the implementation of 'B' and then add strings only for 'C'. So, remembering what I had to do when copying and modifying, I refactored to this:
def inner_diamond(widest)
  if widest == 'A'
    return ['A',
            'A']
  end
  if widest == 'B'
    return [' A',
            'B B',
            'B B',
            ' A']
  end
  if widest == 'C'
    b = inner_diamond('B')
    upper,lower = split(b.map{ |s| ' ' + s })
    c = widest + '   ' + widest
    return upper + [c,c] + lower
  end
end

def split(array)
  mid = array.length / 2
  [ array[0..mid-1], array[mid..-1] ]
end
From here I verified the recursive solution works for 'B' as well:
def inner_diamond(widest)
  if widest == 'A'
    return ['A',
            'A']
  end
  if widest == 'B'
    a = inner_diamond('A')
    upper,lower = split(a.map{ |s| ' ' + s })
    b = widest + ' ' + widest
    return upper + [b,b] + lower
  end
  if widest == 'C'
    b = inner_diamond('B')
    upper,lower = split(b.map{ |s| ' ' + s })
    c = widest + '   ' + widest
    return upper + [c,c] + lower
  end
end
Now I worked on generalizing the use of the hard-coded argument to inner_diamond() and the hard-coded number of spaces:
def inner_diamond(widest)
  if widest == 'A'
    return ['A','A']
  end
  if widest == 'B'
    a = inner_diamond(previous(widest))
    upper,lower = split(a.map{ |s| ' ' + s })
    n = (widest.ord - 'A'.ord) * 2 - 1
    b = widest + (' ' * n) + widest
    return upper + [b,b] + lower
  end
  if widest == 'C'
    b = inner_diamond(previous(widest))
    upper,lower = split(b.map{ |s| ' ' + s })
    n = (widest.ord - 'A'.ord) * 2 - 1
    c = widest + (' ' * n) + widest
    return upper + [c,c] + lower
  end
end

def previous(letter)
  (letter.ord - 1).chr
end
Now I collapsed the duplicated specific code to its more generic form:
def inner_diamond(widest)
  if widest == 'A'
    return ['A','A']
  else
    a = inner_diamond(previous(widest))
    upper,lower = split(a.map{ |s| ' ' + s })
    n = (widest.ord - 'A'.ord) * 2 - 1
    b = widest + (' ' * n) + widest
    return upper + [b,b] + lower
  end
end
Finally some renaming:
def inner_diamond(widest)
  if widest == 'A'
    return ['A','A']
  else
    inner = inner_diamond(previous(widest))
    upper,lower = split(inner.map{ |s| ' ' + s })
    n = (widest.ord - 'A'.ord) * 2 - 1
    middle = widest + (' ' * n) + widest
    return upper + [middle,middle] + lower
  end
end
To summarise:
  • When sliming I try to think ahead and choose tests which allow me to unslime the slime.
  • If I have slimed 3 times, my next step should be to unslime rather than adding a 4th gob of slime.
  • My first unsliming step is often deliberate duplication, done in a way that allows me to collapse the duplication.
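For reference, here is the finished code from the refactoring steps above, gathered into one runnable piece (it is exactly the final versions of diamond, inner_diamond, split, and previous from the post):

```ruby
# Remove one of the duplicated middle rows to get the real diamond.
def diamond(widest)
  d = inner_diamond(widest)
  mid = d.length / 2
  d[0..mid-1] + d[mid+1..-1]
end

# Build a diamond whose middle row appears twice, recursively:
# the inner diamond for the previous letter, indented by one space,
# wrapped around the two (duplicated) widest rows.
def inner_diamond(widest)
  if widest == 'A'
    return ['A','A']
  else
    inner = inner_diamond(previous(widest))
    upper,lower = split(inner.map{ |s| ' ' + s })
    n = (widest.ord - 'A'.ord) * 2 - 1
    middle = widest + (' ' * n) + widest
    return upper + [middle,middle] + lower
  end
end

# Split an even-length array into its two halves.
def split(array)
  mid = array.length / 2
  [ array[0..mid-1], array[mid..-1] ]
end

def previous(letter)
  (letter.ord - 1).chr
end
```

Running the three tests against this confirms diamond('A'), diamond('B'), and diamond('C') produce the expected arrays of strings.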


my kanban 1's board game

Here's a slide deck explaining the essence of my kanban 1's board game: Jon Jaggers Kanban 1s Board Game
  • You can play an early session with no clips so the players can see how inventory builds up (you can also push done story-cards to the next edge's corner, rather than waiting for them to be pulled).
  • The clips that hold the story-cards are a crucial part of the game. They make it a kanban game.
  • You can limit the number of clips per edge to create a natural work-in-progress (wip) limit.
  • You can add a new rule: players can also spend a 1 to split a story-card in two, eg a 4 into a 3 and a 1 (assuming they have a spare clip).
  • You can record the day a story-card comes off the backlog, and also the day it gets to done and thus measure the cycle time.
  • You can simulate scrum-style discrete sprints.
  • You can vary the number of dice at different edges.


Isolating legacy C code from external dependencies

Code naturally resists being isolated if it isn't designed to be isolatable. Isolating legacy code from external dependencies can be awkward. In C and C++ the transitive nature of #includes is the most obvious and direct reflection of the high coupling such code exhibits. However, there is a technique you can use to isolate a source file by cutting all its #includes. It relies on a little-known third way of writing a #include. From the C standard:

6.10.2 Source file inclusion
...
A preprocessing directive of the form:
  #include pp-tokens 
(that does not match one of the two previous forms) is permitted. The preprocessing tokens after include in the directive are processed just as in normal text. ... The directive resulting after all replacements shall match one of the two previous forms.


An example. Suppose you have a legacy C source file that you want to write some unit tests for. For example:
/*  legacy.c  */
#include "wibble.h"
#include <stdio.h>
...
int legacy(int a, int b)
{
    FILE * stream = fopen("some_file.txt", "w");
    char buffer[256];
    int result = sprintf(buffer, 
                         "%d:%d:%d", a, b, a * b);
    fwrite(buffer, 1, sizeof buffer, stream);
    fclose(stream);
    return result;
}
Your first step is to create a file called nothing.h as follows:
/* nothing! */
nothing.h is a file containing nothing and is an example of the Null Object Pattern. Then you refactor legacy.c to this:
/* legacy.c */
#if defined(UNIT_TEST)
#  define LOCAL(header) "nothing.h"
#  define SYSTEM(header) "nothing.h"
#else
#  define LOCAL(header) #header
#  define SYSTEM(header) <header>
#endif

#include LOCAL(wibble.h)  /* <--- */
#include SYSTEM(stdio.h)  /* <--- */
...
int legacy(int a, int b)
{
    FILE * stream = fopen("some_file.txt", "w");
    char buffer[256];
    int result = sprintf(buffer, 
                         "%d:%d:%d", a, b, a*b);
    fwrite(buffer, 1, sizeof buffer, stream);
    fclose(stream);
    return result;
}
Now structure your unit-tests for legacy.c as follows:
First you write null implementations of the external dependencies you want to fake (more Null Object Pattern):
/* legacy.test.c: Part 1 */

static FILE * fopen(const char * restrict filename, 
                    const char * restrict mode)
{
    return 0;
}

static size_t fwrite(const void * restrict ptr,   
                     size_t size, 
                     size_t nelem, 
                     FILE * restrict stream)
{
    return 0;
}

static int fclose(FILE * stream)
{
    return 0;
}
Then #include the source file. Note carefully that you're #including legacy.c here and not legacy.h and you're #defining UNIT_TEST so that legacy.c will have no #includes of its own:
/* legacy.test.c: Part 2 */
#define UNIT_TEST
#include "legacy.c" 
Then write your tests:
/* legacy.test.c: Part 3 */
#include <assert.h>

void first_unit_test_for_legacy(void)
{
    /* writes "2:9:18" which is 6 chars */
    assert(legacy(2,9) == 6);
}

int main(void)
{
    first_unit_test_for_legacy();
    return 0;
}
When you compile legacy.test.c you will find your first problem - it does not compile! You have cut away all the #includes, which cuts away not only the function declarations but also the type definitions, such as FILE - a type used in the code under test as well as in the real and the null fopen, fwrite, and fclose functions. What you need to do now is introduce a seam only for the functions:
/* stdio.seam.h */
#ifndef STDIO_SEAM_INCLUDED
#define STDIO_SEAM_INCLUDED

#include <stdio.h>

struct stdio_t
{
    FILE * (*fopen)(const char * restrict filename, 
                    const char * restrict mode);
    size_t (*fwrite)(const void * restrict ptr, 
                     size_t size,  
                     size_t nelem, 
                     FILE * restrict stream);
    int (*fclose)(FILE * stream);
};

extern const struct stdio_t stdio;

#endif    
Now you Lean On The Compiler and refactor legacy.c to use stdio.seam.h:
/* legacy.c */   
#if defined(UNIT_TEST)
#  define LOCAL(header) "nothing.h"
#  define SYSTEM(header) "nothing.h"
#else
#  define LOCAL(header) #header
#  define SYSTEM(header) <header>
#endif

#include LOCAL(wibble.h) 
#include LOCAL(stdio.seam.h)  /* <--- */
...
int legacy(int a, int b)
{
    FILE * stream = stdio.fopen("some_file.txt", "w");
    char buffer[256];
    int result = sprintf(buffer, 
                         "%d:%d:%d", a, b, a*b);
    stdio.fwrite(buffer, 1, sizeof buffer, stream);
    stdio.fclose(stream);
    return result;
}    
Now you can structure your null functions as follows:
/* legacy.test.c: Part 1 */
#include "stdio.seam.h"

static FILE * null_fopen(const char * restrict filename, 
                         const char * restrict mode)
{
    return 0;
}

static size_t null_fwrite(const void * restrict ptr, 
                          size_t size, 
                          size_t nelem, 
                          FILE * restrict stream)
{
    return 0;
}

static int null_fclose(FILE * stream)
{
    return 0;
}

const struct stdio_t stdio =
{
    .fopen  = null_fopen,
    .fwrite = null_fwrite,
    .fclose = null_fclose,
};    
And voilà, you have a unit test. Now that you have your knife in the seam you can push it in a bit further. For example, you can do a little spying:
/* legacy.test.c: Part 1 */
#include "stdio.seam.h"
#include <assert.h>
#include <string.h>

static FILE * null_fopen(const char * restrict filename, 
                         const char * restrict mode)
{
    return 0;
}
    
static size_t spy_fwrite(const void * restrict ptr, 
                         size_t size, 
                         size_t nelem, 
                         FILE * restrict stream)
{
    assert(strcmp("2:9:18", ptr) == 0);
    return 0;
}

static int null_fclose(FILE * stream)
{
    return 0;
}

const struct stdio_t stdio =
{
    .fopen  = null_fopen,
    .fwrite =  spy_fwrite,
    .fclose = null_fclose,
};
This approach is pretty brutal, but it might just allow you to create an initial seam which you can then gradually prise open. If nothing else it allows you to create characterisation tests to familiarize yourself with legacy code.

You'll also need to create a trivial implementation of stdio.seam.h that the real code uses:
/* stdio.seam.c */
#include "stdio.seam.h"
#include <stdio.h>

const struct stdio_t stdio =
{
    .fopen  = fopen,
    .fwrite = fwrite,
    .fclose = fclose,
};
The -include compiler option might also prove useful.

-include file
    Process file as if #include "file" appeared as the first line of the primary source file.


Using this you can create the following file:
/* include.seam.h */
#ifndef INCLUDE_SEAM
#define INCLUDE_SEAM

#if defined(UNIT_TEST)
#  define LOCAL(header) "nothing.h"
#  define SYSTEM(header) "nothing.h"
#else
#  define LOCAL(header) #header
#  define SYSTEM(header) <header>
#endif

#endif
and then compile with the -include include.seam.h option.

This allows your legacy.c file to look like this:
#include LOCAL(wibble.h) 
#include LOCAL(stdio.seam.h)
...
int legacy(int a, int b)
{
    FILE * stream = stdio.fopen("some_file.txt", "w");
    char buffer[256];
    int result = sprintf(buffer, "%d:%d:%d", a, b, a*b);
    stdio.fwrite(buffer, 1, sizeof buffer, stream);
    stdio.fclose(stream);
    return result;
}    


Fit for any type of sea voyage

My beautiful wife Natalie met me here in Norway on Saturday. We visited the Viking Ship Museum. One of the displays said:

Et skip som må øses 3 ganger på 2 døgn er sjødyktig til all slags ferd.

which translates as:

A ship that has to be bailed 3 times in 2 days is fit for any type of sea voyage.

I just love that.

#include - there is a third way

Isolating legacy code from external dependencies can be awkward. Code naturally resists being isolated if it isn't designed to be isolatable. In C and C++ the transitive nature of #includes is the most obvious and direct reflection of the high coupling such code exhibits. There is a technique that you can use to isolate a source file by cutting all its #includes. It relies on a little-known third way of writing a #include. From the C standard:

6.10.2 Source file inclusion
...
A preprocessing directive of the form:
  #include pp-tokens 
(that does not match one of the two previous forms) is permitted. The preprocessing tokens after include in the directive are processed just as in normal text. ... The directive resulting after all replacements shall match one of the two previous forms.


An example. Suppose you have a legacy source file that you want to write some unit tests for. For example:
/*  legacy.c  */
#include "wibble.h"
#include <stdio.h>

int legacy(void)
{
    ...
    info = external_dependency(stdout);
    ...
}


First create a file called nothing.h as follows:
/* nothing! */
nothing.h is a file containing nothing and is an example of the Null Object Pattern. Then refactor legacy.c to this:
/* legacy.c */
#if defined(UNIT_TEST)
#  define LOCAL(header) "nothing.h"
#  define SYSTEM(header) "nothing.h"
#else
#  define LOCAL(header) #header
#  define SYSTEM(header) <header>
#endif

#include LOCAL(wibble.h)  /* <--- */
#include SYSTEM(stdio.h)  /* <--- */

int legacy(void)
{
    ...
    info = external_dependency(stdout);
    ...
}


Now structure your unit-tests for legacy.c as follows:
First you write the fake implementations of the external dependencies. Note that the type of the fake stdout is not FILE * - with legacy.c's #includes cut, it can be any convenient type:
/* legacy.test.c: Part 1 */

int stdout;

int external_dependency(int stream)
{   
    ...
    return 42;
}
Then #include the source file. Note carefully that we're #including legacy.c here and not legacy.h:
/* legacy.test.c: Part 2 */
#include "legacy.c" 
Then write your tests:
/* legacy.test.c: Part 3 */

#include <assert.h>

void first_unit_test_for_legacy(void)
{
    ...
    assert(legacy() == expected);
    ...
}

int main(void)
{
    first_unit_test_for_legacy();
    return 0;
}


Then compile legacy.test.c with the -D UNIT_TEST option.

This is pretty brutal, but it might just allow you to create an initial seam which you can then gradually prise open. If nothing else it provides a way to create characterisation tests to familiarize yourself with legacy code.

The -include compiler option might also prove useful.

-include file
    Process file as if #include "file" appeared as the first line of the primary source file.


Using this you can create the following file:
/* include_seam.h */
#ifndef INCLUDE_SEAM
#define INCLUDE_SEAM

#if defined(UNIT_TEST)
#  define LOCAL(header) "nothing.h"
#  define SYSTEM(header) "nothing.h"
#else
#  define LOCAL(header) #header
#  define SYSTEM(header) <header>
#endif

#endif

and then compile with the -include include_seam.h option.

A singleton refactoring

Here's a refactoring I often teach. It's particularly applicable if you're wanting to refactor legacy code. I'll show it in C++ in honour of the excellent developers at Promethean.

I'll start with a typical Singleton:

class singleton
{
public:
    static singleton & instance();
    void f(int value);
    int g(const std::string & path);
private:
    singleton();
    ~singleton();
};

Together with a typical piece of client code that I'm wanting to unit-test:

int client::eg1()
{
    stuff();
    more_stuff();
    int r = singleton::instance().g("Hello world");
    return yet_more_stuff(r);
}

My problem is that singleton leads to one or more external dependencies I'd like my unit-tests to bypass. My first step is to create an interface for the singleton:

class singleton_interface
{
public:
    virtual void f(int value) = 0;
    virtual int g(const std::string & path) = 0;
    ...
};

And then make singleton implement the interface. (The saving grace of singleton is that its methods are at least instance methods.)

class singleton : public singleton_interface
{
public:
    static singleton & instance();
    void f(int value);
    int g(const std::string & path);
private:
    singleton();
    ~singleton();
};

Now I can overload eg1 as follows:

int client::eg1(singleton_interface & instance)
{
    stuff();
    more_stuff();
    int r = instance.g("Hello world");
    return yet_more_stuff(r);
}

int client::eg1()
{
    return eg1(singleton::instance());
}

And finally, I can create singleton_interface sub-classes and use them in my unit-tests:

struct my_mock_singleton : singleton_interface
{
    explicit my_mock_singleton(int g_result) 
        : g_result(g_result) 
    {
    }
    void f(int value) 
    {
    }
    int g(const std::string & s) 
    { 
        return g_result; 
    }
    int g_result;
};

void test_client_eg1_hitch_hiker_is_42()
{
    assert(42 == client().eg1(my_mock_singleton(6*9)));
}


Micro refactoring - from here to there

One of my Mastery book snippets from a few days ago reads:

Masters ... are zealots of practice, connoisseurs of the small, incremental step.

I was doing the roman numerals kata in Ruby and something related to small incremental steps occurred to me. Here are three tests:

def test_to_roman
  assert_equal "I", to_roman(1)
  assert_equal "II", to_roman(2)
  assert_equal "III", to_roman(3)
end

and here's the first code snippet:

def to_roman(n) 
  roman = ''
  if n == 3
    roman = 'III' 
  end
  if n == 2
    roman = 'II'
  end
  if n == 1
    roman = 'I'
  end
  roman
end

I can refactor these three if statements into a single while statement, in very small steps, as follows:

def to_roman(n) 
  roman = ''
  if n >= 3
    roman += 'I'
    n -= 1
  end
  if n >= 2
    roman += 'I'
    n -= 1
  end
  if n >= 1
    roman += 'I'
    n -= 1 
  end
  roman
end

and then:

def to_roman(n) 
  roman = ''
  if n >= 1
    roman += 'I'
    n -= 1
  end
  if n >= 1
    roman += 'I'
    n -= 1
  end
  if n >= 1
    roman += 'I'
    n -= 1 
  end
  roman
end

and then:

def to_roman(n) 
  roman = ''
  while n >= 1
    roman += 'I'
    n -= 1
  end
  roman
end

Ok. Now here's the original again:

def to_roman(n) 
  roman = ''
  if n == 3
    roman = 'III'
  end
  if n == 2
    roman = 'II'
  end
  if n == 1
    roman = 'I'
  end
  roman
end

And this time I refactor in a different direction, towards an array lookup:

def to_roman(n) 
  units = [ '', 'I', 'II', 'III' ]
  roman = ''
  if n == 3
    roman = units[3]
  end
  if n == 2
    roman = units[2]
  end
  if n == 1
    roman = units[1]
  end
  roman
end

And then:

def to_roman(n) 
  units = [ '', 'I', 'II', 'III' ]
  roman = ''
  if n == 3
    roman = units[n]
  end
  if n == 2
    roman = units[n]
  end
  if n == 1
    roman = units[n]
  end
  roman
end

And then:

def to_roman(n) 
  units = [ '', 'I', 'II', 'III' ]
  roman = ''
  if n == 3
    roman = units[n]
  end
  if n == 2
    roman = units[n]
  end
  if n == 1
    roman = units[n]
  end
  roman
  units[n]
end

And then:

def to_roman(n) 
  units = [ '', 'I', 'II', 'III' ]
  units[n]
end

In this case I can refactor from a sequence of if statements in two directions equally easily - towards a while or towards an array lookup. Ok. Now consider the situation if I was starting from a switch instead of a sequence of ifs:

def to_roman(n) 
  case n
  when 3 then 'III'
  when 2 then 'II'
  when 1 then 'I'
  end
end

From this I can easily refactor towards an array lookup, but refactoring towards a while is perhaps not quite so straightforward.
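That easy direction can be sketched: the case version takes the same micro-step as the if version did - replace each literal with a units[n] lookup while the tests stay green - after which the three identical branches collapse into the single expression units[n].

```ruby
# Intermediate micro-step from the case statement towards the array lookup:
# every branch now returns the same expression, so the case is ready to collapse.
def to_roman(n)
  units = [ '', 'I', 'II', 'III' ]
  case n
  when 3 then units[n]
  when 2 then units[n]
  when 1 then units[n]
  end
end
```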

I'm suggesting that a construct is useful not just in its own right but also in relation to all the other constructs it might get refactored to or from. How easily can I move from one construct to another? Do some constructs live only on one-way roads? Do some lead you down more dead-ends than others?

I'm reminded of the family on holiday in the West Country who got thoroughly lost. Spotting a farmer leaning on a gate they stop the car, wind down the window and ask

Excuse me. Can you tell me how to get to Nempnett Thrubwell?

The old farmer looks at them and says

Well.... you don't want to start from here.


Working effectively with legacy code

is an excellent book by Michael Feathers (isbn 0-13-117705-2). As usual I'm going to quote from a few pages:
A few years ago, I gave my friend Erik Meade a call after I'd finished work one night. I knew Erik had just started a consulting gig with a new team, so I asked him, "How are they doing?" He said, "They're writing legacy code, man."
To me, legacy code is simply code without tests. I've gotten some grief for this definition… I have no problem defining legacy code as code without tests. It is a good working definition, and it points to a solution.
In nearly every legacy system, what the system does is more important than what it is supposed to do… Frankly, it's very important to have that knowledge of what the system actually does someplace.
When teams aren't aware of their architecture, it tends to degrade. What gets in the way of this awareness? … The brutal truth is that architecture is too important to be left exclusively to a few people.
When you have tests around the areas in which you are going to make changes, they act as a software vise [vice]. You can keep most of the behaviour fixed and know that you are changing only what you intended to.
The most subtle bugs that we can inject are bugs related to inheritance. … The language feature that gives us the most possibility for error when we lean on the compiler is inheritance.
Orthogonality is a fancy word for independence… One of the startling things that you discover when you start removing duplication zealously is that designs emerge.
Sometimes it makes sense to add a variable to a class and use it to sense conditions in the method that we want to refactor.
…not all behaviours are equal in an application.
Code is pretty fragile material.
At this point in my career, I think I'm a much better programmer than I used to be, even though I know less about the details of each language I work in.
The primary purpose of a compiler is to translate source code into some other form, but in statically typed languages, you can do much more with a compiler.

Lean on the Compiler by Hiding (C++)

My Lean on the Java Compiler by Hiding post has proved quite popular so I've been thinking about whether it could also work for C++... It can. I'll use Singleton as an example again. Suppose you have a C++ class like this:



class legacy
{
public:
    legacy();
    void eg1();
    void eg2();
    ...
};
#include "legacy.hpp"
#include "singleton.hpp"

void legacy::eg1()
{
    singleton::instance()->call();
}

void legacy::eg2()
{
    singleton::instance()->call();
}

Again, you want to create a seam for singleton access. The first step is to introduce the hiding identifier in the header file:

class legacy
{
public:
    legacy();
    void eg1();
    void eg2();
    ...
private:
    class singleton;  // <-----
};


Now Lean on the Compiler. All the calls to singleton::instance() no longer compile because the global singleton class is now hidden by the local singleton class just introduced. Next, in the source file add a global variable called instance of type singleton* and refactor all occurrences of singleton::instance() to instance:

#include "legacy.hpp"
#include "singleton.hpp"

singleton * instance;

void legacy::eg1()
{
    instance->call();
}

void legacy::eg2()
{
    instance->call();
}

Then Lean on the Compiler again to make sure there are no typos. Next remove the global variable called instance from the source file and refactor the header file to this:

class singleton; // <-----

class legacy
{
public:
    legacy();
    void eg1();
    void eg2();
    ...
private:
    singleton * instance; // <-----
};

Finally, initialize the instance member in the constructor definition.

legacy::legacy()
    : instance(singleton::instance())
{
}

I haven't tried this on real legacy code either. Caveat emptor!

Lean on the Compiler by Hiding (Java)

Here's a small trick which might not be out of place in Michael Feathers' excellent book. I'll use Singleton as an example. Suppose you have a class like this (Java):



public class Legacy
{
    public void eg1()
    {
        stuff(Singleton.getInstance().call());
    }
    public void eg2()
    {
        more_stuff(Singleton.getInstance().call());
    }
}

And you want to create a seam for the singleton access. The first step is to introduce two public fields:

public class Legacy
{
    public int Singleton;
    public Singleton instance;

    public void eg1()
    {
        stuff(Singleton.getInstance().call());
    }
    public void eg2()
    {
        more_stuff(Singleton.getInstance().call());
    }
}

Now Lean on the Compiler. All the calls to Singleton.getInstance() no longer compile because the Singleton class is now hidden by the int Singleton just introduced. Next refactor all occurrences of Singleton.getInstance() to instance (the other identifier just introduced; you can pick any name for this one but its type must be Singleton).

public class Legacy
{
    public int Singleton;
    public Singleton instance;

    public void eg1()
    {
        stuff(instance.call());
    }
    public void eg2()
    {
        more_stuff(instance.call());
    }
}

Then Lean on the Compiler again to make sure there are no typos. Finally refactor to this:

public class Legacy
{
    public Singleton instance = Singleton.getInstance();

    public void eg1()
    {
        stuff(instance.call());
    }
    public void eg2()
    {
        more_stuff(instance.call());
    }
}

The name lookup rules for Java (a variable name obscures a type of the same name) allow this to work but I don't think this idea will work in C++ (for example). I haven't got time to try it now as I'm heading off to the NDC conference in Oslo. I only thought of it yesterday. I haven't tried this on real legacy code. Caveat emptor!
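Once the refactoring is done, the public instance field is the seam. Here's a minimal self-contained sketch of exercising it in a test. Everything in it is hypothetical, and I've had to loosen the Singleton's constructor to protected so that a fake can subclass it:

```java
// Hypothetical, self-contained sketch of using the new seam.
// The constructor is protected (not private) so FakeSingleton can extend it.
class Singleton {
    private static final Singleton INSTANCE = new Singleton();
    protected Singleton() {}
    public static Singleton getInstance() { return INSTANCE; }
    public String call() { return "real"; }
}

class FakeSingleton extends Singleton {
    @Override public String call() { return "fake"; }
}

class Legacy {
    public Singleton instance = Singleton.getInstance();

    public String eg1() {
        return instance.call();
    }
}

public class SeamDemo {
    public static void main(String[] args) {
        Legacy legacy = new Legacy();
        legacy.instance = new FakeSingleton(); // pick at the seam
        System.out.println(legacy.eg1());      // prints "fake"
    }
}
```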

Update. I have a C++ version as well.

Adapt Parameter and Preserve Signature

Working Effectively With Legacy Code by Michael Feathers (isbn 0-13-117705-2) is a great book. The very first dependency breaking technique (p326) is called Adapt Parameter. It uses this Java example:



public class ARMDispatcher
{
    public void populate(ParameterSource request) {
        String[] values
            = request.getParameters(pageStateName);
        if (values != null && values.length > 0)
        {
            marketBindings.put( 
                pageStateName + getDateStamp(),
                values[0]);
        }
        ...        
    }
    ...
}


Michael writes:

In this class, the populate method accepts an HttpServletRequest as a parameter. ... It would be great to use Extract Interface (362) to make a narrower interface that supplies only the methods we need, but we can't extract an interface from another interface. ...


and a bit later...

Adapt Parameter is one case in which we don't Preserve Signatures (312). Use extra care.


Well, here's a variation on Adapt Parameter where we do Preserve Signatures. I'll stick with the Java example, but the idea is broadly applicable...

First we create our ParameterSource interface and fill it with the signatures of the methods in HttpServletRequest that our populate method calls:

public interface ParameterSource
{
    String[] getParameters(String name);
}

Then we implement the adapter for HttpServletRequest:

public class HttpServletRequestParameterSource 
    implements ParameterSource
{
    public HttpServletRequestParameterSource(HttpServletRequest request) {
        this.request = request;
    }

    public String[] getParameters(String name) {
        return request.getParameters(name);
    }

    private HttpServletRequest request;
}

Next we add an overload of populate taking a ParameterSource and implement it with an exact copy-paste:

public class ARMDispatcher
{
    public void populate(ParameterSource request) {
        String[] values
            = request.getParameters(pageStateName);
        if (values != null && values.length > 0)
        {
            marketBindings.put(
                pageStateName + getDateStamp(),
                values[0]);
        }
        ...        
    }

    public void populate(HttpServletRequest request) {
        String[] values
            = request.getParameters(pageStateName);
        if (values != null && values.length > 0)
        {
            marketBindings.put(
                pageStateName + getDateStamp(),
                values[0]);
        }
        ...        
    }
    ...
}

Now we Lean on the Compiler (315) to make sure we haven't missed any methods in HttpServletRequest that our populate method calls. Add any we missed to the ParameterSource interface. Implement them in HttpServletRequestParameterSource as one-liners. When it compiles we can refactor to this:

public class ARMDispatcher
{
    public void populate(ParameterSource request) {
        String[] values
            = request.getParameters(pageStateName);
        if (values != null && values.length > 0)
        {
            marketBindings.put(
                pageStateName + getDateStamp(),
                values[0]);
        }
        ...        
    }

    public void populate(HttpServletRequest request) {
        populate(new HttpServletRequestParameterSource(request));
    }
   ...
}

And now we have a seam we can pick at...

public class FakeParameterSource 
    implements ParameterSource
{
    public String[] values;

    public String[] getParameters(String name) {
        return values;
    }
}
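A test can now drive populate through the fake. The sketch below is self-contained but hypothetical: ARMDispatcher's elided internals (pageStateName, getDateStamp, marketBindings) are stubbed with made-up values just so it compiles and runs:

```java
import java.util.HashMap;
import java.util.Map;

interface ParameterSource {
    String[] getParameters(String name);
}

class FakeParameterSource implements ParameterSource {
    public String[] values;

    public String[] getParameters(String name) {
        return values;
    }
}

// Stand-in for ARMDispatcher with the elided pieces stubbed (hypothetical).
class Dispatcher {
    final Map<String, String> marketBindings = new HashMap<>();
    private final String pageStateName = "pageState"; // made-up value

    public void populate(ParameterSource request) {
        String[] values = request.getParameters(pageStateName);
        if (values != null && values.length > 0) {
            marketBindings.put(pageStateName + getDateStamp(), values[0]);
        }
    }

    private String getDateStamp() { return "@stamp"; } // made-up stub
}

public class AdaptParameterDemo {
    public static void main(String[] args) {
        FakeParameterSource fake = new FakeParameterSource();
        fake.values = new String[] { "main" };
        Dispatcher dispatcher = new Dispatcher();
        dispatcher.populate(fake);
        System.out.println(dispatcher.marketBindings); // one binding: pageState@stamp -> main
    }
}
```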