It was a pleasure and an honour to speak at the first Atlantec conference held in Galway, Ireland on May 15th.
I talked about cyber-dojo and showed some statistics from a random sample of its 30,000+ cyber-dojos,
together with a few examples of code/tests typically submitted, a few dashboard patterns, and wrapped up linking testing to Le Chatelier's Law and some of my favourite Systems Thinking quotes from Bradford Keeney.
Hi. I'm Jon Jagger, director of software at Kosli.
I built cyber-dojo, the place teams practice programming.
print "squashed-circle" diamond
There's been a bit of a buzz about the Print-Diamond practice recently.
I recall doing this a couple of years ago with Johannes Brodwall.
In cyber-dojo naturally.
We took a wrong turn and were making a thorough mess of it.
I vividly recall Johannes saying:
"This is too difficult. We're doing it wrong."
I love that. If it's difficult you're probably doing it wrong. We gave up and took a break. We got a really nice Indian take away. Then we went back to Print-Diamond. Very quickly we came up with a new idea. We imagined the diamond lying in the center of an x,y axis. The Print-Diamond of 'C' would therefore look like this:
    -2 -1  0 +1 +2
-2   -  -  A  -  -
-1   -  B  -  B  -
 0   C  -  -  -  C
+1   -  B  -  B  -
+2   -  -  A  -  -
Viewed like this you can think of the Diamond as a sort of squashed circle with the A,B,C characters all lying on the circumference. From here it was a short step to this (Ruby):
(-2..+2).map{|row|
  (-2..+2).map{|col|
    row.abs + col.abs == 2 ? 'X' : '-'
  }.join
}
which, when puts'd gives:
--X--
-X-X-
X---X
-X-X-
--X--
And we knew we were on our way.
Let's hear it for Curry Driven Development!
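The same squashed-circle idea carries over to C. What follows is only a sketch, not the code Johannes and I wrote: the function name diamond, the caller-supplied buffer, and the step from 'X' to the letter 'A' + abs(col) are my assumptions, and it relies on the letters 'A' to 'Z' being contiguous (true in ASCII).

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical sketch: writes the diamond for `widest` ('A'..'Z') into out,
 * one row per line, using the rule abs(row) + abs(col) == radius.
 * Returns the number of characters written (excluding the terminator),
 * or 0 if widest is not an upper-case letter or the buffer is too small. */
size_t diamond(char widest, char * out, size_t size)
{
    if (widest < 'A' || widest > 'Z')
        return 0;
    const int radius = widest - 'A';
    const size_t side = (size_t)(2 * radius + 1);
    /* each row is `side` characters plus '\n', then one '\0' at the end */
    const size_t needed = side * (side + 1) + 1;
    if (size < needed)
        return 0;
    size_t at = 0;
    for (int row = -radius; row <= radius; row++)
    {
        for (int col = -radius; col <= radius; col++)
        {
            /* on the "circumference" the letter depends only on the column */
            const int on_edge = abs(row) + abs(col) == radius;
            out[at++] = on_edge ? (char)('A' + abs(col)) : '-';
        }
        out[at++] = '\n';
    }
    out[at] = '\0';
    return at;
}
```

Calling diamond('C', buffer, sizeof buffer) with a large enough buffer fills it with the five rows of the grid shown earlier.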
yet another interesting TDD episode
In the previous episode
I described how a custom assert function declared a local array which I did not initialize.
static void assert_fizz_buzz(const char * expected, int n)
{
char actual[16];
...
}
The effect of not initializing the array was that state from one test leaked into another test and I got an unexpectedly passing test. I fixed the problem by initializing the array.
static void assert_fizz_buzz(const char * expected, int n)
{
char actual[16] = "";
...
}
The most recent episode (in cyber-dojo naturally) revolved around this same issue. This time I was redoing the roman numerals exercise (1 → "I", 2 → "II", etc) in C. My custom assert function started like this.
...
#define PRINT(s) print_string(#s, s)
static void print_string(const char * name, const char * s)
{
printf("%10s: \"%s\"\n", name, s);
}
static void assert_to_roman(const char * expected, int n)
{
char actual[32];
to_roman(actual, sizeof actual, n);
if (strcmp(expected, actual) != 0)
{
printf("to_roman(%d) FAILED\n", n);
PRINT(expected);
PRINT(actual);
assert(false);
}
}
Once again I failed to initialize the array. I'm a slow learner. This time the test failed unexpectedly! And the diagnostic was even more unexpected:
to_roman(1) FAILED
expected: "I"
actual: "I"
This is a less than ideal diagnostic! It seemed that
strcmp and printf had differing opinions on what a string is!
I fixed it by adding initialization.
...
static void assert_to_roman(const char * expected, int n)
{
char actual[32] = "";
to_roman(actual, sizeof actual, n);
if (strcmp(expected, actual) != 0)
{
printf("to_roman(%d) FAILED\n", n);
PRINT(expected);
PRINT(actual);
assert(false);
}
}
After this the tests passed. This addressed the immediate problem but it did not address the root cause. So I removed the initialization and (with a hat tip to Mr Jonathan Wakely) I reworked
print_string to display the length of the string as well as an indication
of the (un)printability of each character:
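static void print_string(const char * name, const char * s)
{
    /* needs <ctype.h> for isprint and <string.h> for strlen */
    const size_t length = strlen(s);
    printf("%10s: \"%s\" %d ", name, s, (int)length);
    for (size_t i = 0; i != length; i++)
    {
        /* cast: isprint requires a value representable as unsigned char */
        putchar(isprint((unsigned char)s[i]) ? 'P' : 'U');
    }
    putchar('\n');
}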
With this change the diagnostic became:
to_roman(1) FAILED
expected: "I" 1 P
actual: "I" 4 UUUP
Much better. Then I thought about initializing the array a bit more. I realized that initializing the array to the empty string was a poor choice since it masked a fault in the implementation of
to_roman which did not start like this:
void to_roman(char buffer[], size_t size, int n)
{
buffer[0] = '\0';
...
}
So I added that and reworked
assert_to_roman as follows:
static void assert_to_roman(const char * expected, int n)
{
char actual[32];
memset(actual, '!', sizeof actual);
to_roman(actual, sizeof actual, n);
...
}
And I was back to green :-)
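Here is a small sketch of why the '!' fill is a better choice than the empty string. It is not the code from the session: the two converter functions are hypothetical stand-ins for to_roman, and unlike my real assert I terminate the poisoned buffer so that the sketch itself has no undefined behaviour.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a to_roman that forgets buffer[0] = '\0'.
 * It appends, so it silently depends on the buffer's previous content. */
static void appends_without_reset(char buffer[], size_t size)
{
    strncat(buffer, "I", size - strlen(buffer) - 1);
}

/* Hypothetical stand-in for the corrected to_roman. */
static void resets_then_appends(char buffer[], size_t size)
{
    buffer[0] = '\0';
    strncat(buffer, "I", size - strlen(buffer) - 1);
}

/* Fills the buffer with '!' before the call, so a converter that relies
 * on an empty buffer produces something visibly different from "I". */
static bool passes_with_poisoned_buffer(void (*convert)(char[], size_t))
{
    char actual[32];
    memset(actual, '!', sizeof actual);
    actual[sizeof actual - 1] = '\0'; /* assumption: keep it a valid string */
    convert(actual, sizeof actual);
    return strcmp("I", actual) == 0;
}
```

With the buffer initialized to "" both converters would pass. With the '!' fill only resets_then_appends does.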
commas in C - can you pass the puzzle ?
In a recent Deep-C training course in Bangalore
I was discussing sequence points with some C programmers.
I explained that the comma operator is one of the very few operators that creates a sequence point.
I like the comma. It's the name of a beautiful butterfly. K&R's famous Hello World program had a comma between hello and world. I should really put a comma into my blog picture!
During several cyber-dojos it became clear to me that many of the programmers (despite having programmed in C for several years) did not understand that in C, not all commas are the same. I've created a small piece of C code to try to help C programmers understand the humble comma...
Have a look at the following 5 lines of code. Do you know what each line does?
int x = (1,2,3);
int x = 1,2,3;
x = 1,2,3;
x = (1,2,3);
x = f(1,2,3);
.
.
.
.
.
.
Scroll down for my answers once you've decided...
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
int x = (1,2,3);
This declares an
int called x and initializes it to the result of the expression
(1,2,3). The commas inside this expression are operators.
1 is evaluated and its value (1) discarded,
then a comma provides a sequence point, then 2 is evaluated and its value (2) discarded,
then a comma provides a sequence point, then 3 is evaluated and its value is the value of the
expression (1,2,3). So x is initialized to 3.
You'll probably get warnings saying there are no side-effects in the expressions 1 and 2.
int x = 1,2,3;
This is different. If this compiled it would declare an
int called x
and initialize it to 1 and then declare two more ints called 2
and 3. It has the same structure as int x = 1,y,z; which declares three
ints called x, y, and z.
The commas are not operators, they are punctuators/separators.
You can't declare variables called 2 or 3. It does not compile.
x = 1,2,3;
In this fragment
x is assumed to have already been declared. It is not a declaration. The
commas are operators again.
Assignment has higher precedence than the comma operator so this binds as (x = 1),2,3;.
So 1 is assigned to x, and the result of this assignment expression (1) is discarded,
then there is a sequence point, then 2 is evaluated and its value (2) is discarded,
then there is a sequence point, then 3 is evaluated and its value (3) is discarded.
You'll probably get warnings saying there are no side-effects in the expressions 2 and 3.
x = (1,2,3);
Again,
x is assumed to have already been declared. It is not a declaration.
The commas are operators again. This is the same as the first fragment except it is not
a declaration. x is assigned the value of the expression (1,2,3)
which is 3. Again you'll probably get warnings saying there are no side-effects in the expressions 1 and 2.
x = f(1,2,3);
Again,
x is assumed to have already been declared. As has a function called
f which accepts three int arguments. These commas are
not operators. They are punctuators/separators. They separate the three expressions
forming the three arguments to f. But they do not introduce any sequence points.
How did you do?
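If you want to check the answers with a compiler, here is a sketch of the four fragments that do compile. The helper note and the function names are made up for this illustration. I've wrapped each constant in a call to note so that every operand has a side-effect and the "no side-effects" warnings stay quiet.

```c
/* Hypothetical helper: counts its calls and returns its argument. */
static int calls = 0;
static int note(int value) { calls++; return value; }

static int f(int a, int b, int c) { return a + b + c; }

/* int x = (1,2,3);  commas are operators, x is initialized to 3 */
int comma_operator_in_initializer(void)
{
    int x = (note(1), note(2), note(3));
    return x;
}

/* x = 1,2,3;  binds as (x = 1),2,3; so x is 1 */
int assignment_binds_tighter_than_comma(void)
{
    int x;
    x = note(1), note(2), note(3);
    return x;
}

/* x = (1,2,3);  commas are operators, x is assigned 3 */
int comma_operator_in_parentheses(void)
{
    int x;
    x = (note(1), note(2), note(3));
    return x;
}

/* x = f(1,2,3);  commas separate arguments; their evaluation order
 * is unspecified, but f still receives 1, 2 and 3 */
int commas_as_argument_separators(void)
{
    int x;
    x = f(note(1), note(2), note(3));
    return x;
}
```

The second fragment from the puzzle, int x = 1,2,3; is left out because it does not compile.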
another interesting TDD episode
The classic TDD cycle
says that you should start with a test for new functionality and see it fail.
There is real value in not skipping this step; not jumping straight to writing code to try to make it pass.
- One reason is improving the diagnostic. Without care and attention diagnostics are unlikely to diagnose much.
- A second reason is to be sure the test is actually running! Suppose for example, you're using JUnit and you forget its @Test annotation? Or the public specifier?
- A third reason is because sometimes, as we saw last time, you get an unexpected green! Here's another nice example of exactly this which happened to me during a cyber-dojo demo today.
I started by writing my first test, like this:
static void assert_fizz_buzz(const char * expected, int n)
{
char actual[16];
fizz_buzz(actual, sizeof actual, n);
if (strcmp(expected, actual) != 0)
{
printf("fizz_buzz(%d)\n", n);
printf("expected: \"%s\"\n", expected);
printf(" actual: \"%s\"\n", actual);
assert(false);
}
}
static void numbers_divisible_by_three_are_Fizz(void)
{
assert_fizz_buzz("Fizz", 3);
}
I made this fail by writing the initial code as follows (the
(void)n is to momentarily avoid the
"n is unused" warning which my makefile promotes to an error
using the -Werror option):
void fizz_buzz(char * result, size_t size, int n)
{
(void)n;
strncpy(result, "Hello", size);
}
which gave me the diagnostic:
...: assert_fizz_buzz: Assertion `0' failed.
fizz_buzz(3)
expected: "Fizz"
 actual: "Hello"
I made this pass with the following slime
void fizz_buzz(char * result, size_t size, int n)
{
if (n == 3)
strncpy(result, "Fizz", size);
}
Next, I returned to the test and added a test for 6:
static void numbers_divisible_by_three_are_Fizz(void)
{
assert_fizz_buzz("Fizz", 3);
assert_fizz_buzz("Fizz", 6);
}
I ran the test, fully expecting it to fail, but it passed!
Can you see the problem?
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
The problem is in
assert_fizz_buzz which starts like this:
static void assert_fizz_buzz(const char * expected, int n)
{
char actual[16];
...
}
Here's what's happening:
- assert_fizz_buzz("Fizz", 3) is called
  - char actual[16] is defined
  - fizz_buzz(actual, sizeof actual, 3) is called
    - if (n == 3) is true
    - "Fizz" is strncpy'd into actual
  - fizz_buzz(actual, sizeof actual, 3) returns
  - strcmp says that expected equals actual
  - ...
- assert_fizz_buzz("Fizz", 6) is called
  - char actual[16] is defined
  - actual exactly overlays its previous location so its first 5 bytes are still 'F','i','z','z','\0'
  - fizz_buzz(actual, sizeof actual, 6) is called
    - if (n == 3) is false
  - fizz_buzz(actual, sizeof actual, 6) returns
  - strcmp says that expected equals actual
My mistake was in the test;
actual has automatic
storage duration so does not get initialized.
Its initial value is indeterminate.
The first call to assert_fizz_buzz is accidentally interfering
with the second call.
Tests should be isolated from each other.
I tweaked the test as follows:
static void assert_fizz_buzz(const char * expected, int n)
{
char actual[16] = { '\0' };
...
}
I ran the test again and this time it failed :-)
...: assert_fizz_buzz: Assertion `0' failed.
fizz_buzz(6)
expected: "Fizz"
 actual: ""
I made the test pass:
void fizz_buzz(char * result, size_t size, int n)
{
if (n % 3 == 0)
strncpy(result, "Fizz", size);
}
Let's hear it for starting with a test for new functionality and seeing it fail.
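Initializing actual in the test isolates the tests from each other, but fizz_buzz itself still writes nothing when n is not a multiple of three. A possible next step, sketched here and not taken from the demo, is to make every path write a result. The name fizz_buzz_always_writes, the use of snprintf, and printing other numbers in decimal (the usual kata rule) are my assumptions.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical sketch: every path writes a terminated string into result,
 * so the outcome never depends on what the buffer held before the call. */
void fizz_buzz_always_writes(char * result, size_t size, int n)
{
    if (n % 3 == 0)
        snprintf(result, size, "%s", "Fizz");
    else
        snprintf(result, size, "%d", n); /* assumed rule for other numbers */
}
```

Unlike strncpy, snprintf always terminates the result when size is greater than zero.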
an interesting TDD episode
I'm doing the roman-numerals kata in C.
I write a test as follows:
#include "to_roman.h"
#include <assert.h>
#include <string.h>
int main(void)
{
char actual[32] = { '\0' };
to_roman(actual, 111);
assert(strcmp("CXI", actual) == 0);
}
I write a do-nothing implementation of to_roman.
I run the tests and I get (I kid you not) this:
...
test: to_roman.tests.c:26: main: Assertion `__extension__ ({ size_t __s1_len, __s2_len; (__builtin_constant_p ("CXI") && __builtin_constant_p (actual) && (__s1_len = __builtin_strlen ("CXI"), __s2_len = __builtin_strlen (actual), (!((size_t)(const void *)(("CXI") + 1) - (size_t)(const void *)("CXI") == 1) || __s1_len >= 4) && (!((size_t)(const void *)((actual) + 1) - (size_t)(const void *)(actual) == 1) || __s2_len >= 4)) ? __builtin_strcmp ("CXI", actual) : (__builtin_constant_p ("CXI") && ((size_t)(const void *)(("CXI") + 1) - (size_t)(const void *)("CXI") == 1) && (__s1_len = __builtin_strlen ("CXI"), __s1_len < 4) ? (__builtin_constant_p (actual) && ((size_t)(const void *)((actual) + 1) - (size_t)(const void *)(actual) == 1) ? __builtin_strcmp ("CXI", actual) : (__extension__ ({ const unsigned char *__s2 = (const unsigned char *) (const char *) (actual); register int __result = (((const unsigned char *) (const char *) ("CXI"))[0] - __s2[0]); if (__s1_len > 0 && __result == 0) { __result = (((const unsigned char *) (const char *) ("CXI"))[1] - __s2[1]); if (__s1_len > 1 && __result == 0) { __result = (((const unsigned char *) (const char *) ("CXI"))[2] - __s2[2]); if (__s1_len > 2 && __result == 0) __result = (((const unsigned char *) (const char *) ("CXI"))[3] - __s2[3]); } } __result; }))) : (__builtin_constant_p (actual) && ((size_t)(const void *)((actual) + 1) - (size_t)(const void *)(actual) == 1) && (__s2_len = __builtin_strlen (actual), __s2_len < 4) ? (__builtin_constant_p ("CXI") && ((size_t)(const void *)(("CXI") + 1) - (size_t)(const void *)("CXI") == 1) ? 
__builtin_strcmp ("CXI", actual) : (__extension__ ({ const unsigned char *__s1 = (const unsigned char *) (const char *) ("CXI"); register int __result = __s1[0] - ((const unsigned char *) (const char *) (actual))[0]; if (__s2_len > 0 && __result == 0) { __result = (__s1[1] - ((const unsigned char *) (const char *) (actual))[1]); if (__s2_len > 1 && __result == 0) { __result = (__s1[2] - ((const unsigned char *) (const char *) (actual))[2]); if (__s2_len > 2 && __result == 0) __result = (__s1[3] - ((const unsigned char *) (const char *) (actual))[3]); } } __result; }))) : __builtin_strcmp ("CXI", actual)))); }) == 0' failed.
...
So I work towards improving the diagnostic with a custom assert, as follows:
#include "to_roman.h"
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>
static void assert_roman(const char * expected, int n)
{
char actual[32] = { '\0' };
to_roman(actual, n);
if (strcmp(expected, actual) != 0)
{
printf("to_roman(%d)\n", n);
printf("expected: \"%s\"\n", expected);
printf(" actual: \"%s\"\n", actual);
assert(false);
}
}
int main(void)
{
assert_roman("CXI", 111);
}
I run this and my diagnostic is as follows:
test: to_roman.tests.c:16: assert_roman: Assertion `0' failed.
to_roman(111)
expected: "CXI"
 actual: ""
...
Much better :-)
Now I start to implement
to_roman
#include "to_roman.h"
#include <string.h>
void to_roman(char * roman, int n)
{
roman[0] = '\0';
strcat(roman, "CXI");
}
And I'm at green.
I refactor to this:
#include "to_roman.h"
#include <string.h>
void to_roman(char * roman, int n)
{
roman[0] = '\0';
strcat(roman, "C");
strcat(roman, "X");
strcat(roman, "I");
}
I refactor to this:
#include "to_roman.h"
#include <string.h>
void to_roman(char * roman, int n)
{
const char * hundreds[] = { "C" };
const char * tens[] = { "X" };
const char * units[] = { "I" };
roman[0] = '\0';
strcat(roman, hundreds[0]);
strcat(roman, tens[0]);
strcat(roman, units[0]);
}
Remembering that in my test, n is one-hundred-and-eleven, I refactor to this:
#include "to_roman.h"
#include <string.h>
void to_roman(char * roman, int n)
{
const char * hundreds[] = { "", "C" };
const char * tens[] = { "", "X" };
const char * units[] = { "", "I" };
roman[0] = '\0';
strcat(roman, hundreds[1]);
strcat(roman, tens[1]);
strcat(roman, units[1]);
}
I refactor to this:
#include "to_roman.h"
#include <string.h>
void to_roman(char * roman, int n)
{
const char * hundreds[] = { "", "C" };
const char * tens[] = { "", "X" };
const char * units[] = { "", "I" };
roman[0] = '\0';
strcat(roman, hundreds[n / 100]);
n %= 100;
strcat(roman, tens[n / 10]);
n %= 10;
strcat(roman, units[n]);
}
And I'm still at green. Now I add a new test:
int main(void)
{
assert_roman("CXI", 111);
assert_roman("CCXXII", 222);
}
I run it and am amazed to see it pass.
It takes me a little while to figure out what is going on.
I'll take it line by line.
When
n == 222 this line:
strcat(roman, hundreds[n / 100]);
is this
strcat(roman, hundreds[2]);
and
hundreds[2] is an out-of-bounds index.
However, hundreds[2] just happens to evaluate to the same as
tens[0] which is the empty string. So at this point
roman is still the empty string.
The next lines are these:
n %= 100;
strcat(roman, tens[n / 10]);
which is this:
strcat(roman, tens[2]);
And
tens[2] is also an out-of-bounds index.
And tens[2] just happens to evaluate to the same as units[0]
which is also the empty string. So at this point
roman is still the empty string.
The next lines are these:
n %= 10;
strcat(roman, units[n]);
which is this:
strcat(roman, units[2]);
And yet again
units[2] is an out-of-bounds index.
This time units[2] just happens to evaluate to
"CCXXII" from the test!
So after this roman is "CCXXII" and the test passes!
Amazing!
I edit the code to this:
void to_roman(char * roman, int n)
{
const char * hundreds[] = { "", "C", "CC" };
const char * tens[] = { "", "X", "XX" };
const char * units[] = { "", "I", "II" };
roman[0] = '\0';
strcat(roman, hundreds[n / 100]);
n %= 100;
strcat(roman, tens[n / 10]);
n %= 10;
strcat(roman, units[n]);
}
And I'm still at green.
So now I'm wondering if there are any lessons I can learn from this episode. It was not a good idea to run the tests (to try and get an initial red) when doing so would knowingly cause the (unfinished) program to exhibit undefined behaviour. In cyber-dojo terms an amber traffic-light is not the same as a red traffic-light. After adding the second test I should have edited
to_roman as follows:
void to_roman(char * roman, int n)
{
const char * hundreds[] = { "", "C", "" };
const char * tens[] = { "", "X", "" };
const char * units[] = { "", "I", "" };
roman[0] = '\0';
strcat(roman, hundreds[n / 100]);
n %= 100;
strcat(roman, tens[n / 10]);
n %= 10;
strcat(roman, units[n]);
}
Then I would have got a proper red:
test: to_roman.tests.c:17: assert_roman: Assertion `0' failed.
to_roman(222)
expected: "CCXXII"
 actual: ""
...
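Another way to get a well-defined failure is to check each index against the size of its table before using it. This is a sketch of that idea and not what I did in the session: the names to_roman_checked, append_digit and COUNT_OF, and the bool return value, are all assumptions.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define COUNT_OF(array) (sizeof (array) / sizeof (array)[0])

/* Hypothetical helper: appends table[digit] only if digit is a valid index. */
static bool append_digit(char * roman,
                         const char * const table[], size_t count, int digit)
{
    if (digit < 0 || (size_t)digit >= count)
        return false;
    strcat(roman, table[digit]);
    return true;
}

/* Hypothetical variant of to_roman with the same deliberately unfinished
 * tables. An out-of-range digit now yields false and a well-defined
 * (possibly partial) string instead of undefined behaviour. */
bool to_roman_checked(char * roman, int n)
{
    static const char * const hundreds[] = { "", "C" };
    static const char * const tens[]     = { "", "X" };
    static const char * const units[]    = { "", "I" };
    roman[0] = '\0';
    return append_digit(roman, hundreds, COUNT_OF(hundreds), n / 100)
        && append_digit(roman, tens,     COUNT_OF(tens),     n % 100 / 10)
        && append_digit(roman, units,    COUNT_OF(units),    n % 10);
}
```

With these tables to_roman_checked(actual, 222) returns false and leaves actual empty, so a test expecting "CCXXII" fails for a reason I can trust.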
lessons from testing
I have run hundreds of test-driven coding dojos using cyber-dojo.
I see the same test anti-patterns time after time after time.
Do some of your tests exhibit the same anti-patterns?
are you missing a TDD step?
Here's a TDD state diagram.
- start by writing a test for new functionality
- see it fail
- make it pass
- refactor
- round and round you go
But there's something not right!
There's no red-to-red self-transition.
My animal is missing an ear!
I'll add the missing ear.
What is this new ear?
It's for changes made at red.
I see the test fail.
I read the diagnostic.
Then I stay at red and improve the diagnostic.
When I'm happy with the diagnostic I get rid of it by making the test pass.
This was part of my lessons from testing presentation which reviews common test anti-patterns I see on cyber-dojo.
Note: I'm being careful not to call this red-to-red transition a refactoring since refactoring is for changes made at green.
some cyber-dojo measurements
cyber-dojo has hosted about 13,000 practice sessions so far.
I've written a short ruby script to extract some measurements from a sample of 500 sessions.
I was looking at transitions between red, amber, and green traffic-lights:
- red means one or more tests failed
- amber means the tests did not run (eg syntax error)
- green means the tests ran and all passed
The second column is colour → colour transition.
The third column is sample size.
| 3.94 | amber → green | 447 |
| 4.65 | amber → red | 379 |
| 4.67 | amber → amber | 1462 |
| 5.39 | red → green | 607 |
| 6.01 | red → red | 604 |
| 7.52 | green → red | 420 |
| 13.65 | green → amber | 436 |
| 17.67 | red → amber | 432 |
| 22.18 | green → green | 598 |
Here's how I interpret the results:
- If you're at red or green and you make a small change (5.39,6.01,7.52) you're likely to stay at red or green.
- If you're at red or green and you make a large change (13.65,17.67) you're likely to transition to amber.
- There is a big spike in the number of amber → amber transitions (1462). I speculate that long sequences of these transitions are occurring after a large 13.65 green → amber or 17.67 red → amber transition.
- I think the green → green value of 22.18 is larger than it should be because it's including plain file renames.
testing
From Perfect Software and Other Illusions About Testing
You have to know what you're expecting before you give meaning to a test report, otherwise everything looks or sounds right. That's why I'm a strong advocate of the test-first philosophy, whereby developers write their tests to include expected results before they write a line of code. It's what we did fifty years ago, but the practice was gradually lost when industry trends separated testing from development.
From Test-driven Development for Embedded C
TDD helps you go faster... Slowing down is exactly what is needed to go fast!
One test result is worth 1,000 expert opinions. [Wernher von Braun]
From Quality Software Management: Vol 4. Anticipating Change
Testing to improve, not to prove.
Testing is not a stage, but part of a control process embedded in every stage.
From The Lady Tasting Tea
What I discovered working at Pfizer was that very little scientific research can be done alone. It usually requires a combination of minds. This is because it is so easy to make mistakes.
No test can be powerful against all possible alternatives.
From Mind and Nature
Other things being equal (which is not often the case), the old, which has been somewhat tested, is more likely to be viable than the new, which has not been tested at all.
From Management of the Absurd
We need to fail often. If we don't, it means we're not testing our limits.
From The Right Stuff
In the military they always said "flight test" and not "test flying".
What people were seeing on television were, in fact, ordinary test events. Blown engines were par for the course in testing aircraft prototypes and were inevitable in testing an entirely new propulsion system, such as jet or rocket engines.
From We Seven
A test-pilot is fiercely proud of his profession. [Walter Schirra]
In combat, for example, you are thinking about what goes on outside of your airplane… But in test flying you have an entirely different problem. You are concerned about what is going on inside the airplane, and what the aircraft itself is doing. [Deke Slayton]
Each part that goes into the capsule has had a prototype tested to destruction to make sure it can stand the rough ride and the temperature changes. The test procedures are extremely painstaking. First, one part is tested; then two parts are linked together and both of them are tested as a unit. The small units are joined into bigger units for further testing, and this process continues until finally the entire machine is ready for a master test. [Malcolm Scott Carpenter]
From The Pragmatic Programmer
Design to Test.
From The Alchemist
'I had to test your courage,' the stranger said. 'Courage is the quality most essential to understanding the Language of the World.'
From The Importance of Living
We must give up the idea that a man's knowledge can be tested or measured in any form whatsoever.
From The Mind of War
He was always testing the limits - of airplanes, people, science, the military, and, most especially, bureaucracies.
From An Introduction to General Systems Thinking
"Proof" in its original sense was a test applied to substances to determine if they are of satisfactory quality... Over the centuries, the meaning of the word "prove" began to shift, eliminating the negative possibilities...
From Safer C
You can't test quality into software.
From The Mythical Man Month
No parts of the schedule are so thoroughly affected by sequential constraints as component debugging and system test.
the pleasure of finding things out
is an excellent book by Richard Feynman (isbn 978-0-141-03143-9). As usual I'm going to quote from a few pages:
Looking at the bird he says, "Do you know what that bird is? It's a brown throated thrush; but in Portuguese it's a … in Italian a …, " he says "in Chinese it's a …, in Japanese a …," etcetera. "Now," he says, "you know in all the languages you want to know what the name of the bird is and when you've finished with all that," he says, "you'll know absolutely nothing whatever about the bird. You only know about humans in different places and what they call the bird. Now," he says, "let's look at the bird."
I said, "Say, Pop, I noticed something: When I pull the wagon the ball rolls to the back of the wagon, and when I'm pulling it along and I suddenly stop, the ball rolls to the front of the wagon," and I says, "why is that?" And he said, "That nobody knows," he said. "The general principle is that things that are moving try to keep on moving and things that are standing still tend to stand still unless you push on them hard." And he says, "This tendency is called inertia but nobody knows why it's true." Now that's a deep understanding - he doesn't give me a name, he knew the difference between knowing the name of something and knowing something, which I learnt very early.
To do high, real good physics work you do need absolutely solid lengths of time.
You cannot expect old designs to work in new circumstances.
If you are in a hurry, you must dissipate heat.
We had lots of fun.
The people underneath didn't know at all what they were doing. And the Army wanted to keep it that way; there was no information going back and forth... I felt that you couldn't make the plant safe unless you knew how it worked… I said that the first thing there has to be is that the technical guys know what we're doing. Oppenheimer went and talked to the security people and got special permission. So I had a nice lecture in which I told them what we were doing, and they were all excited. We're fighting a war. We see what it is. They knew what the numbers meant. If the pressure came out higher, that meant there was more energy released and so on and so on. They knew what they were doing. Complete transformation! They began to invent ways of doing it better. They supervised the scheme. They worked all night. They didn't need supervising at night. They didn't need anything. They understood everything. They invented several of the programs that we used and so forth. So my boys really came through and all that had to be done was to tell them what it was, that's all. It's just, don't tell them they're punching holes. As a result, although it took them nine months to do three problems before, we did nine problems in three months.
Most of the trouble was the big shots coming all the time and saying you're going to break something, going to break something.
We used to go for walks often to get rest.
Advertising, for example, is an example of a scientifically immoral description of the products.
The magnetic properties on a very small scale are not the same as on a large scale.
But what we ought to be able to do seems gigantic compared with our confused accomplishments. Why is this? Why can't we conquer ourselves?
Erosion and blow-by are not what the design expected. They are warnings that something is wrong. The equipment is not operating as expected, and therefore there is a danger that it can operate with even wider deviations in this unexpected and not thoroughly understood way… The O-rings of the Solid Booster Rockets were not designed to erode. Erosion was a clue that something was wrong. Erosion was not something from which safety can be inferred.
We have also found that certification criteria used in Flight Readiness Reviews often develop a gradually decreasing strictness.
The computer software checking system and attitude is of highest quality. There appears to be no process of gradually fooling oneself while degrading standards so characteristic of the Solid Rocket Booster or Space Shuttle Main Engine safety systems. To be sure, there have been recent suggestions by management to curtail such elaborate and expensive tests as being unnecessary at this late date in Shuttle history. This must be resisted for it does not appreciate the mutual subtle influences, and sources of error generated by even small changes of one part of a program on another. There are perpetual requests for changes as new payloads and new demands and modifications are suggested by the users. Changes are expensive because they require extensive testing. The proper way to save money is to curtail the number of requested changes, not the quality of testing for each.
Official management, on the other hand, claims to believe the probability of failure is a thousand times less. One reason for this may be an attempt to assure the government of NASA perfection and success in order to ensure the supply of funds. The other may be that they sincerely believe it to be true, indicating an almost incredible lack of communication between themselves and their working engineers.
It is presumptuous if one says, "We're going to find the ultimate particle, or the unified field laws," or "the" anything.
we seven
is an excellent book by the seven Mercury astronauts (isbn 978--4391-8103-4). As usual I'm going to quote from a few pages:
By working with the designers and engineers on a brand-new, complicated airplane you learn to ferret out the bugs and problems before they can be built into the system to worry other pilots who will use later production aircraft. [John Glenn]
Looking back on it now, it sounds a bit silly. But it takes little moments like that to build up a person's tolerance of fear and his ability to face the unknown. [Malcolm Scott Carpenter]
When I got back to the States, I served a hitch teaching some younger pilots how to fly. This kind of duty is probably even more dangerous than combat. At least you know what a MIG is going to do. [Virgil Grissom]
A test-pilot is fiercely proud of his profession. [Walter Schirra]
I could not take care of the polyp right away because part of the procedure before they could operate on it was to keep me absolutely quiet for four days and not let me speak… Later on the medics did put me on a week's silent treatment. I had to break it only once when a NASA official called me up from Langley to ask me how my polyp was coming along. I told him he had just interrupted the cure. [Walter Schirra]
In combat, for example, you are thinking about what goes on outside of your airplane… But in test flying you have an entirely different problem. You are concerned about what is going on inside the airplane, and what the aircraft itself is doing. [Deke Slayton]
If you are an amateur in this business, and you just think you are in trouble, you can really get yourself into trouble very fast by doing the wrong thing first. You might be a whole lot better off if you did nothing at all. [Deke Slayton]
In flying, navigation is generally defined as "continuously detecting and correcting infinitesimal errors in the flight path." [Deke Slayton]
The schedule was flexible. We knew that variable factors such as weather, over which we would have no control, could cause delays. [John Glenn]
This panel groups all of the warning lights in one convenient place so we can see at a glance if any problems have cropped up. [John Glenn]
Each part that goes into the capsule has had a prototype tested to destruction to make sure it can stand the rough ride and the temperature changes. The test procedures are extremely painstaking. First, one part is tested; then two parts are linked together and both of them are tested as a unit. The small units are joined into bigger units for further testing, and this process continues until finally the entire machine is ready for a master test. [Malcolm Scott Carpenter]
We adopted three basic principles. First, we would use any training device or method that had even a remote chance of being useful. Second, we would make the training as difficult as possible so that we would be overtrained, if anything, rather than undertrained. And third, except for some wise scheduling of time, we decided to conduct our training on an informal basis. Everyone assumed from the start that we were mature, well-motivated individuals. Everyone knew we were all eager to make good. [Deke Slayton]
The manual went out of date as fast as the capsule grew… In the meantime… we had to work with some early drawings of the spacecraft that had been included in the original specifications. This was a bit like learning how to cook from looking into some chef's garbage pail. [Deke Slayton]
We did not blame any of our problems on such things as gremlins. For one thing, these creatures belonged to another era. [John Glenn]
We also had daily scheduling meetings to keep everyone informed of our progress and up to date on any problems which cropped up. Here is where we reviewed the work being done on the various systems. [Virgil Grissom]
Even though the electronic machines were clever, we did not let them run the show. [Alan Shepard]
Larry and Jen do Roman Numerals in C++
You may remember Larry and Jen from the popular (900,000+ hits) Deep C (and C++) slide-deck. Well, they're back - this time practising their C++ by doing the Roman Numerals problem.
Here it is in pdf too.
the right stuff
is an excellent book by Tom Wolfe (isbn 978-0-099-47937-6).
A marvelous tale of courage.
As usual I'm going to quote from a few pages:
In the military they always said "flight test" and not "test flying".
Once the theorem and the corollary was understood, the Navy's statistics about one in every four Navy aviators dying meant nothing. The figures were averages, and averages applied to those with the average stuff.
What people were seeing on television were, in fact, ordinary test events. Blown engines were par for the course in testing aircraft prototypes and were inevitable in testing an entirely new propulsion system, such as jet or rocket engines.
Conrad stares at the piece of [blank] paper and then looks up at the man and says in a wary tone, as if he fears a trick: "But it's upside down."
This obsession with active control, it was argued, would only tend to cause problems on Mercury flights. What was required was a man whose main talent was for doing nothing under stress.
The boys' response, however, had not been resignation or anything close to it. No, the engineers now looked on, eyebrows arched, as the guinea pigs set about altering the experiment.
The esprit throughout NASA was tremendous... Bureaucratic lines no longer meant anything. Anyone in Project Mercury could immediately get to see anybody else about any problem that came up.
They had barely moved the first stick of furniture in when the tour buses started arriving, plus the freelance tourists in cars. ... Sometimes people would get out and grab a handful of grass from your lawn. They'd get back on the bus with their miserable little green sprouts sticking out of their fingers. They believed in magic.
Herein the world was divided into those who had it and those who did not.
the importance of living
is an excellent book by Lin Yutang (isbn 978-0688163525).
As usual I'm going to quote from a few pages:
In the West, the insane are so many that they are put in an asylum, in China the insane are so unusual that we worship them.
I consider the education of our senses and our emotions rather more important than the education of our ideas.
Only he who handles his ideas lightly is master of his ideas, and only he who is master of his ideas is not enslaved by them.
A great man is he who has not lost the heart of a child.
Passion holds up the bottom of the world, while genius paints its roof.
The courage to be one's own natural self is quite a rare thing.
An Old Man was living with his Son at an abandoned fort on the top of a hill, and one day he lost a horse. The neighbours came to express their sympathy for his misfortune, and the Old Man asked, "How do you know this is bad luck?" A few days afterwards, his horse returned with a number of wild horses, and his neighbours came again to congratulate him on this stroke of fortune, and the Old Man replied, "How do you know this is good luck?" With so many horses around, his son began to take to riding, and one day he broke his leg. Again the neighbours came round to express their sympathy, and the Old Man replied, "How do you know this is bad luck?" The next year, there was a war, and because the Old Man's son was crippled, he did not have to go to the front.
The trouble with Americans is that when a thing is nearly right, they want to make it still better, while for a Chinese, nearly right is good enough.
When the chains of a bicycle are kept too tight, they are not conducive to the easiest running, and so with the human mind.
Tea is invented for quiet company as wine is invented for a noisy party.
Luxury and expensiveness are the things most to be avoided in architecture.
Taste then is closely associated with courage.
We must give up the idea that a man's knowledge can be tested or measured in any form whatsoever.
Only fresh fish may be cooked in its own juice; stale fish must be flavoured with anchovy sauce and pepper and mustard - the more the better.
The thing called beauty in literature and beauty in things depends so much on change and movement and is based on life. What lives always has change and movement, and what has change and movement naturally has beauty.
Zen and the art of motorcycle maintenance
is an excellent book by Robert Pirsig (isbn 978-0-099-32261-0). As usual I'm going to quote from a few pages:
By far the greatest part of his [the mechanic's] work is careful observation and precise thinking.
Care and Quality are internal and external aspects of the same thing. A person who sees Quality and feels it as he works is a person who cares. A person who cares about what he sees and does is a person who's bound to have some characteristics of Quality.
As Poincaré would have said, there are an infinite number of facts about the motorcycle, and the right ones don't just dance up and introduce themselves. The right facts, the ones we really need, are not only passive, they are damned elusive and we're not going to just sit back and "observe" them. We're going to have to be in there looking for them or we're going to be here a long time. Forever. As Poincaré pointed out, there must be a subliminal choice of what facts we observe. The difference between a good mechanic and a bad one, like the difference between a good mathematician and a bad one, is precisely this ability to select the good facts from the bad ones on the basis of quality. He has to care!
That's really why he got so upset that day when he couldn't get his engine started. It was an intrusion into his reality.
The range of human knowledge today is so great that we're all specialists and the distance between specializations has become so great that anyone who seeks to wander freely among them almost has to forego closeness with the people around him.
This isn't really a small town. People are moving too fast and too independently of one another.
I've a set of instructions at home which open up great realms for the improvement of technical writing. They begin, 'Assembly of Japanese bicycle require great peace of mind.'
Peace of mind isn't at all superficial really, I expound. It's the whole thing. That which produces it is good maintenance; that which disturbs it is poor maintenance. What we call workability of the machine is just an objectification of this peace of mind. The ultimate test's always your own serenity. If you don't have this when you start and maintain it while you're working you're likely to build your personal problems right into the machine itself.
There is an infinity of hypotheses. The more you look the more you see.
It's the sides of the mountain which sustain life, not the top.
It is not the facts but the relation of things that results in the universal harmony that is the sole objective reality.
Always take the old part with you to prevent getting a wrong part.
Impatience is close to boredom but always results from one cause: an underestimation of the amount of time the job will take.
Mu means "no thing". Like "Quality" it points outside the process of dualistic discrimination. Mu simply says, "No class; not one; not zero, not yes, not no." It states that the context of the question is such that a yes or no answer is an error and should not be given. "Unask the question" is what it says. Mu becomes appropriate when the context of the question becomes too small for the truth of the answer.
Apart from bad tools, bad surroundings are a major gumption trap.
Religion isn't invented by man. Men are invented by religion.
When handling precision parts that are stuck or difficult to manipulate, a person with mechanic's feel will avoid damaging the surfaces and work with his tools on the nonprecision surfaces of the same part whenever possible. If he must work on the surfaces themselves, he'll always use softer surfaces to work with them. ... Handle precision parts gently.
Want to know how to paint a perfect painting? It's easy. Make yourself perfect and then just paint naturally. That's the way all experts do it.
The real cycle you're working on is a cycle called yourself.
assert_not_diff
The output of Ruby's Test::Unit::TestCase's assert_equal(expected, actual) when expected and actual are not the same is often not as useful as it could be. For example:
expected = [1,99,2,3,{:a=>34,:b=>43}]
actual = [1,2,3,4,{:a=>324,:c=>555,:b=>43},5]
assert_equal expected, actual
produces:
<[1, 99, 2, 3, {:a=>34, :b=>43}]> expected but was
<[1, 2, 3, 4, {:a=>324, :c=>555, :b=>43}, 5]>.
For large objects which differ only slightly, the output that tells you where they differ quickly gets lost in the mass of output telling you where they don't. Perhaps what's happened here is a sort of Primitive Obsession.
It's as though assert_equal assumes it will only ever be called with primitives.
So I put together a little bit of code to help out. The simplest solution I can think of uses json and diff, so I thought I'd call it assert_not_diff
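require 'json'
require 'tempfile'

# Prints a side-by-side diff of lhs and rhs (as pretty-printed JSON)
# when they are not equal.
def assert_not_diff(lhs, rhs)
  if lhs != rhs
    # Hold on to both Tempfiles so they are not cleaned up before diff runs.
    lhs_file, rhs_file = file_for(lhs), file_for(rhs)
    puts `diff -y #{lhs_file.path} #{rhs_file.path}`
  end
end

# Writes obj to a temporary file as pretty-printed JSON and returns the Tempfile.
def file_for(obj)
  exp = Tempfile.new("bk", "/tmp")
  exp.write(JSON.pretty_generate(obj))
  exp.close
  exp
end

expected = [1,99,2,3,{:a=>34,:b=>43}]
actual = [1,2,3,4,{:a=>324,:c=>555,:b=>43},5]
assert_not_diff expected, actual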
This produces:
[                               [
  1,                              1,
  99,                         <
  2,                              2,
  3,                              3,
                              >   4,
  {                               {
    "a": 34,                  |     "a": 324,
                              >     "c": 555,
    "b": 43                         "b": 43
  }                           |   },
                              >   5
]                               ]
Which I think is a lot more useful. Hope this proves useful to someone!
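The assert_not_diff above shells out to the diff command, so it needs a Unix-like system. As a rough sketch of the same idea in pure Ruby, the helper below compares the pretty-printed JSON line by line using a longest-common-subsequence table. The name json_line_diff and the marker characters are my own assumptions for illustration, not part of Test::Unit or the json library.

```ruby
require 'json'

# Hypothetical helper (not from the original post): returns an array of
# [marker, line] pairs computed from the pretty-printed JSON of lhs and rhs.
# marker is ' ' (line in both), '<' (only in lhs) or '>' (only in rhs).
def json_line_diff(lhs, rhs)
  a = JSON.pretty_generate(lhs).lines.map(&:chomp)
  b = JSON.pretty_generate(rhs).lines.map(&:chomp)

  # lcs[i][j] = length of the longest common subsequence of a[i..-1] and b[j..-1]
  lcs = Array.new(a.size + 1) { Array.new(b.size + 1, 0) }
  (a.size - 1).downto(0) do |i|
    (b.size - 1).downto(0) do |j|
      lcs[i][j] = if a[i] == b[j]
                    lcs[i + 1][j + 1] + 1
                  else
                    [lcs[i + 1][j], lcs[i][j + 1]].max
                  end
    end
  end

  # Walk the table from the top-left, emitting one marked line per step.
  result = []
  i = j = 0
  while i < a.size && j < b.size
    if a[i] == b[j]
      result << [' ', a[i]]
      i += 1
      j += 1
    elsif lcs[i + 1][j] >= lcs[i][j + 1]
      result << ['<', a[i]]
      i += 1
    else
      result << ['>', b[j]]
      j += 1
    end
  end
  # Whatever is left over on either side has no counterpart on the other.
  a[i..-1].each { |line| result << ['<', line] }
  b[j..-1].each { |line| result << ['>', line] }
  result
end

expected = [1,99,2,3,{:a=>34,:b=>43}]
actual = [1,2,3,4,{:a=>324,:c=>555,:b=>43},5]
json_line_diff(expected, actual).each { |marker, line| puts "#{marker} #{line}" }
```

Unlike diff -y this prints a single column and never pairs changed lines with a | marker, so a changed value shows up as a '<' line followed by a '>' line.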
Perfect Software and other illusions about testing
is the title of an excellent book by Jerry Weinberg. As usual I'm going to quote from a few pages:
Testing a system is a process of gathering information with the intent that the information could be used for some purpose.
Without a process that includes regular technical reviews, no project will rise above mediocrity, no matter how good its machine-testing process.
At least half your testing costs can be cut before anybody ever runs a test, if only your systems are designed with testability in mind.
"We absolutely need this software in place twenty-four weeks from tomorrow. We need two weeks to staff up and get approvals. Then we'll need four weeks for requirements, four weeks for architecture. four weeks for design, and eight weeks for coding. That adds up to twenty-two weeks, so we'll have two weeks left for testing."
If you have ten kilograms of pure uranium-235 and you add another ten kilograms, you'll have twenty kilograms. But if you do this a few more times, you won't have fifty kilograms, you'll have a nuclear explosion. One plus one doesn't always equal two.
The more bugs you find, the more you're going to find, not the other way around.
The human mind craves meaning. If you feed people a random bit of data, they'll struggle to divine meaning from it - and they'll move from the intake phase to the meaning phase so fast they won't be aware of doing so.
You have to know what you're expecting before you give meaning to a test report, otherwise everything looks or sounds right. That's why I'm a strong advocate of the test-first philosophy, whereby developers write their tests to include expected results before they write a line of code. It's what we did fifty years ago, but the practice was gradually lost when industry trends separated testing from development.
That separation occurred initially because it's psychologically difficult for people to test their own programs. There's still significant risk if you rely on test-first without pair-programming or some other process that casts more than one pair of eyes, and more than one brain, on a program.
If test-first is a good idea, then significance-first is even better. Why? … if you actually perform even an enormous number of tests, you would likely lose the valuable information among all the worthless crud. The number of tests performed should be as small as possible, but no smaller.
"We didn't have any problems until we started testing. We were right on schedule. Testing screwed up everything."
"What the American public wants in the theatre is a tragedy with a happy ending." [William Dean Howells]
salmon fishing
is an excellent book by Hugh Falkus (isbn 0-85493-114-9). Unusually I'm going to quote from one page - the preface:
No creature on earth treats the dogmatist more sternly than Salmo salar.
Contradictory in everything he does, Salmo cannot be tied down by dogma.
In addition to providing days of rare enchantment, it taught me that much of what I had read about Salmo's behaviour was pretty dubious. Often, the fish I was watching seemed a different creature from the one encountered in print. And over the years it began to dawn on me that far from being fact, a lot of what I had hitherto taken for granted was indeed fallacy; that most authors had written about what they thought salmon did, not what they had seen them do.
Since then I have tested certain of their statements, some of which will be found amongst these pages: assertions by acknowledged experts and regarded as Gospel - but unsound. Most of the accepted tenets of salmon fishing started as conjecture; but gradually, parroted by writer after writer, part of this conjecture became "fact".
In salmon fishing there is no magic substitute for water-sense, skilful presentation, and persistence.
We are wise to move with the deliberation of the heron, that most stealthy of waders.
I think small hooks are preferable to big hooks - they usually get a better hold.
I remain convinced that black is the most attractive colour for a salmon fly.
I suggest that the nearest we can come to a definitive statement is to say that salmon tend to react to large flies sunk deep at temperatures below 45°, and to small flies near the surface at temperatures above 50°.
the mind of war
is an excellent book about John Boyd by Grant T. Hammond (isbn 978-1588341785). As usual I'm going to quote from a few pages:
…in order to determine the consistency of any new system we must construct or uncover another system beyond it… One cannot determine the character or nature of a system within itself. Moreover, attempts to do so lead to confusion and disorder.
he was always testing the limits - of airplanes, people, science, the military, and, most especially, bureaucracies.
In dozens of interviews, conducted for this book, the most consistent theme and nearly universal comment was that John Boyd was the essence of an honourable man and incorruptible.
Oral, not written, communication and conviction, not accuracy, still rule in military culture.
Boyd liked putting things together (synthesis) better than analysis (taking things apart)...
He also came to appreciate the routine practice and repetition that was required to become really good at something and to overcome the boredom by focusing on minute improvements.
He observed very carefully.
Boyd could go from 500 knots to stall speed, practically stopping the plane in midair, which would force any aircraft on his tail to overshoot him and thus gain the advantage for Boyd. In another trick, he would stand the F-100 on its tail and slide down the pillar of its own exhaust. Fire would come out of the intake in the nose of the aircraft and the tailpipe simultaneously. A seemingly impossible feat, it was challenged by others. Boyd went to Edwards Air Force Base in California, where NASA had two fully instrumented F-100 aircraft, and demonstrated it and other techniques to a series of nonbelievers. The test pilot at Edwards who challenged him at the time was a fellow by the name of Neil Armstrong.
Boyd had never designed an airplane before, but as he told Colonel Ricci and Gen. Casey Dempsey, "I could fuck up and do better than this."
The rule of thumb in the Air Force is that a plane will gain a pound of weight a day for the life of the aircraft.
High entropy implies a low potential for doing work, a low capacity for taking action or a high degree of confusion or disorder… The tendency is for entropy to increase in a system that is closed or cannot communicate with the external systems or environments.
A natural teacher, he understood that if he told you something, he robbed you of the opportunity to ever truly know it for yourself.
We are never deceived. We only deceive ourselves. [Goethe]
Boyd's dictum: "Ask for my loyalty and I'll give you my honesty. Ask for my honesty and you'll have my loyalty."