Digital Aggregates | Article: In Praise of do-while (false)

J. L. Sloan

2005-08-01

Updated 2006-02-18

By now I would have thought that everyone knew the joys of the language construct do-while(false). It is a staple among C, C++, and Java programmers. You can find articles written about it on the web from as far back as 1994, which might as well be Neolithic cave drawings.

Yet I’ve continued to have problems getting code that uses do-while(false) through code inspections with inspectors who should know better. The result is that I ended up submitting what is, in my no-reason-to-be-humble opinion, lower quality code into the code bases of products. I did however conform to the stringent ISO 9001 quality processes, so I guess that means it must be okay.

Except that it isn’t okay.

There is nothing magic about do-while(false). It does exactly what you think it does, which is to say, very little. In fact, it does so little, your typical optimizing compiler won’t generate any code for it. Yet, it is really handy in a few circumstances.

(As usual, all of my code snippets are written in C++ unless C is absolutely necessary, and all my referenced examples are from the Digital Aggregates Desperado open source library of reusable components from http://www.diag.com/navigation/downloads/Desperado.html.)

Common Exit Flow of Control

All C++ functions have a common entry point. It is frequently desirable for functions to have a common exit point as well. There are all sorts of reasons for this. The most pragmatic is that a common entry and exit point makes it easy to add debugging statements that log the arguments passed into the function and the results it generates. If the flow of control needs to leave early, it can do so without bypassing the logging statement at the common exit, just by doing a break.

    #include <cstdio>

    // bogus, giveup, and error stand in for whatever conditions
    // the really complicated code computes.
    int function1(int argument1) {
        int rc = 0;

        printf("%s[%d]: function1(%d)\n",
            __FILE__, __LINE__, argument1);

        do {

            // Some really complicated code.
            if (bogus) {
                rc = -1;
                break;
            }

            // Some more really complicated code.
            if (giveup) {
                rc = -2;
                break;
            }

            // Yet more really complicated code.
            if (error) {
                rc = -3;
                break;
            }

        } while (false);

        // Common exit: every path above arrives here.
        printf("%s[%d]: function1=%d\n",
            __FILE__, __LINE__, rc);

        return rc;
    }

The inner logic uses the break statement to drop out the bottom of the do-while (false). No need for labels. No need for maintaining and checking flags. And if the inner logic completes and the flow of control finds itself at the while (false), it simply drops through. No harm, no foul, no iteration.

Adherents of other languages both more modern and more ancient will recognize this control structure as something you might have known as a do-end. It would be great if C++ (and C) had something similar, perhaps a way to use break to exit out the bottom of any compound statement (that is, a block of statements enclosed in curly braces). Alas, a break can only occur in the context of a switch or loop construct. So in order to use break, we must provide the compiler with a loop construct, albeit one that never actually loops.

This pattern is applicable to any block of logic, not just functions. I frequently use it when I am writing a long sequence of data transformations or function calls, all of which must succeed for the result to be useful. If any step in the sequence fails, it does a break to the end of the block. Refactoring fans will be pleased that the pattern can be used to refactor spaghetti code into something more readable. The Design By Contract crowd will like the fact that code written with a common exit can establish preconditions above the do and postconditions below the while (false). Formal Verification folks will like the idea of establishing invariants (assertions that remain true during execution) before and after the do-while (false). And although I find the idea of proving any non-trivial piece of code correct pretty laughable, I do find the concept of invariants when thinking about program correctness to be very powerful.
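As a minimal sketch of such a sequence, consider three steps that must all succeed. The step functions parse, scale, and offset, their failure rules, and the return codes are all invented for illustration; only the shape of the do-while (false) matters. The assertion below the while (false) plays the role of a postcondition.

```cpp
#include <cassert>

// Hypothetical steps invented for this sketch; each returns true on success.
bool parse(int input, int & value) {
    if (input < 0) { return false; }
    value = input;
    return true;
}

bool scale(int & value) {
    if (value > 1000) { return false; }
    value *= 2;
    return true;
}

bool offset(int & value) {
    if (value == 0) { return false; }
    value += 1;
    return true;
}

// Returns 0 on success, or a negative code identifying the step that failed.
int transform(int input, int & output) {
    const int before = output;  // Remembered only to state the postcondition.
    int value = 0;
    int rc = 0;

    do {
        if (!parse(input, value)) { rc = -1; break; }
        if (!scale(value)) { rc = -2; break; }
        if (!offset(value)) { rc = -3; break; }
        output = value;         // Reached only if every step succeeded.
    } while (false);

    // Postcondition, checked on every path: a failure leaves output untouched.
    assert((rc == 0) || (output == before));

    return rc;
}
```

Because every path leaves through the bottom of the block, the postcondition is written once and cannot be skipped by an early exit.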

The use of the break statement is obviously not universally applicable. If you are using it from inside another looping control structure, including other do-while (false) constructs, or from inside a switch statement, then it is not going to drop to the bottom of the outer do-while (false).
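The pitfall is easy to reproduce. In this sketch (the Result type and the function names are hypothetical), the break inside the switch leaves only the switch, so the code after it still runs. One possible workaround, shown in guarded, is to test the result code again after the inner construct and break from there.

```cpp
struct Result {
    int rc;
    bool ranLaterCode;
};

// The break inside the switch exits the switch, not the do-while (false).
Result pitfall(int code) {
    Result result = { 0, false };

    do {
        switch (code) {
        case 0:
            result.rc = -1;
            break;                  // Leaves only the switch.
        default:
            break;
        }
        result.ranLaterCode = true; // Still executes after the failure.
    } while (false);

    return result;
}

// Re-testing the result code after the switch restores the early exit.
Result guarded(int code) {
    Result result = { 0, false };

    do {
        switch (code) {
        case 0:
            result.rc = -1;
            break;                  // Leaves only the switch.
        default:
            break;
        }
        if (result.rc < 0) {
            break;                  // This one exits the do-while (false).
        }
        result.ranLaterCode = true;
    } while (false);

    return result;
}
```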

In those cases, if you are using an ancient language like C, which I have a lot of affection for, the same way I might have had for Latin, had I studied Latin in high school instead of goofing off in the computer lab, you can accomplish the same thing using a goto. In fact, this application is one of the few in which I find the use of goto acceptable.

But if you are using a modern language that doesn’t have a goto, or if like me you find the use of goto a slippery slope, or even perhaps it is too reminiscent of those thousands of lines of FORTRAN IV that you wrote decades ago, the memory of which you are desperately trying to suppress, then this is a useful technique.
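For comparison, here is a rough sketch of the goto form of a common exit. It is written so that it also compiles as C++, and the Outcome type, the function name, and the label are hypothetical. Unlike a break, the goto reaches the common exit even from inside a switch.

```cpp
struct Outcome {
    int rc;
    bool continued;
};

Outcome withGoto(int code) {
    Outcome outcome = { 0, false };

    switch (code) {
    case 0:
        outcome.rc = -1;
        goto done;              // Skips the switch and everything after it.
    default:
        break;
    }

    outcome.continued = true;   // Not reached on the failure path.

done:
    // Common exit: logging or postconditions would go here.
    return outcome;
}
```

All variables are declared before the first goto, because C++ does not allow a jump to bypass an initialized declaration that is still in scope at the label.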

Desperado uses this pattern in several places, but for a simple example, see the method CellRateThrottle::admissible(ticks_t ticks).

The use of do-while (false) to implement a common exit flow of control is merely good practice. There is another context in which it is absolutely necessary.

Compound Statements and Preprocessor Macros

My name is John, and I use the C preprocessor when writing in C++. As much as the C++ purists like inline functions (and truth be told so do I), there are situations in which they just don’t cut it. Desperado makes use of the C preprocessor in its generics.h header file, which provides preprocessor macros to do fun things like compute the largest signed two’s complement binary number of any basic data type. I’ve tried to write an inline function to do that, and I would be pleased to see the results of anyone who did so successfully without using the preprocessor. (I’m sure it could be done with a templated function, but then it could not be used in C.) The C preprocessor is a powerful form of code reuse known as code generation, and like all powers, with it comes responsibility. It must be used only for good and never for evil.

So given that I’m going to use the C preprocessor whether the C++ crowd likes it or not, consider the following code snippet.

    #define TRANSFORM(_A_, _B_) \
        function1(_A_); \
        function2(_B_)

Now consider its use in this context.

    if (transformable)
        TRANSFORM(x, y);

It’s not going to do the right thing, is it? The preprocessor will expand it thusly.

    if (transformable)
        function1(x);
    function2(y);

This is clearly not what the user of the macro intended. You might be able to make up a lot of excuses for writing macros like the one above, but regardless, you have done something to surprise anyone that uses it. You have designed an abstraction that does not conform to the behavior any competent programmer would expect. You can argue that your coding standard requires curly braces around even single statements in if-else blocks. This is not going to be helpful to your fellow developer who has to port ten thousand lines of third-party code, code which follows its own coding standard, and wants to use your macro to make their job easier.
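The surprise can be observed directly. In this sketch, function1 and function2 are stand-ins that merely count their calls, and the counters and the demo function are hypothetical names; the macro is the unguarded one from above.

```cpp
int calls1 = 0;
int calls2 = 0;

// Stand-ins for the real functions: they only count how often they run.
void function1(int) { ++calls1; }
void function2(int) { ++calls2; }

#define TRANSFORM(_A_, _B_) \
    function1(_A_); \
    function2(_B_)

void demo(bool transformable, int x, int y) {
    if (transformable)
        TRANSFORM(x, y);    // Only function1(x) is governed by the if.
}
```

Calling demo(false, 1, 2) leaves calls1 at zero but increments calls2, because the second call in the expansion falls outside the conditional.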

The logical thing is to place the function calls in a compound block instead.

    #define TRANSFORM(_A_, _B_) \
        { \
            function1(_A_); \
            function2(_B_); \
        }

Then our code snippet will expand into something like this.

    if (transformable)
    {
        function1(x);
        function2(y);
    }

Looks better at first glance, doesn’t it? Now both functions are part of the conditional.

So try this.

    if (transformable)
        TRANSFORM(x, y);
    else
        TRANSFORM(q, r);

Now our snippet expands to something like this.

    if (transformable)
    {
        function1(x);
        function2(y);
    };
    else
    {
        function1(q);
        function2(r);
    }

This will not compile. The semicolon trailing the first invocation of the TRANSFORM macro is actually a null statement, separate from the compound block preceding it. It becomes a statement in-between the if clause and the else clause. Using a semicolon following the macro invocation in the expected way leaves the else clause dangling.

The fact that this code does not compile is the good news. The programmer using your macro will merely think that you are incompetent, and will never use your macro, nor probably any code that you write, ever again.

A much worse case would be if the resulting code compiled, but did the wrong thing. I have tried very hard to find a code snippet which compiles but does the wrong thing. I have been unsuccessful. I’m not saying that such a code snippet does not exist, merely that I am not smart enough to find it. If such a snippet exists, then the programmer using your macro will think that you are incompetent while they sit with a baseball bat in the bushes next to your house waiting for you to come home. If we were truly judged by a jury of our peers, it would be completely justifiable homicide.

A common approach to fixing this is to use the macro without a semicolon at the end.

    if (transformable)
        TRANSFORM(x, y)
    else
        TRANSFORM(q, r)

This is an unsatisfying solution. You are requiring the user to write code in an unexpected and surprising way. Worse, the requirement to omit the semicolon is merely an artifact of having to use a compound statement. If a thousand years from now the definition of your macro changes so that it is not a compound statement, then you must churn every single application of it to add the semicolon. Or you have to put the semicolon in the macro definition itself, which may cause all sorts of wackiness to ensue. Wouldn’t it be better to just make the macro work like any other C++ statement?

Like Lassie, do-while (false) comes to our rescue. What is it, girl? The barn is on fire? Timmy fell into the well? We write our macro thusly.

    #define TRANSFORM(_A_, _B_) \
        do { \
            function1(_A_); \
            function2(_B_); \
        } while (false)

The preprocessor now expands our macro into a single C++ statement that must be properly terminated by a semicolon. Hence

    if (transformable)
        TRANSFORM(x, y);
    else
        TRANSFORM(q, r);

becomes

    if (transformable)
        do {
            function1(x);
            function2(y);
        } while (false);
    else
        do {
            function1(q);
            function2(r);
        } while (false);

The semicolon, added by the user of the macro, is now a required part of the syntax, not a dangling null statement.

The do-while (false) versions of these snippets not only compile, but do the expected thing when executed. The do-while (false) control structure serves as a compound statement that is both syntactically and semantically well behaved.
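A small sketch makes this checkable. Here function1 and function2 are stand-ins that record their arguments in a hypothetical trace vector (function2 negates its argument so the two calls can be told apart), and demo is an invented wrapper around the if-else shown earlier.

```cpp
#include <vector>

std::vector<int> trace;

// Stand-ins for the real functions: they record what they were given.
void function1(int value) { trace.push_back(value); }
void function2(int value) { trace.push_back(-value); }

#define TRANSFORM(_A_, _B_) \
    do { \
        function1(_A_); \
        function2(_B_); \
    } while (false)

void demo(bool transformable, int x, int y, int q, int r) {
    if (transformable)
        TRANSFORM(x, y);    // The semicolon completes the do-while statement.
    else
        TRANSFORM(q, r);
}
```

Each branch runs both calls with its own arguments, and the macro invocation reads like any other statement.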

An example of this use of do-while (false) can be found in the reinitializeobject() macro in the Desperado reinitializeobject.h header file. This macro, which is so scary it merits an article all of its own, re-initializes an existing object by using do-while(false) to combine an explicit destructor call with a call to a placement new operator. (Before you send me email, yes, this is a bad idea, which is why Desperado does not use this macro itself.)

One context in which do-while (false) does not work is when you are using the preprocessor to generate code that declares variables.

    #define ALLOCATE(_A_, _B_) \
        do { \
            int _A_; \
            int _B_; \
        } while (false)

The variables will be allocated on the stack then immediately deallocated when the do-while (false) construct terminates. This is fine if the scope of the variables is limited to the code inside the do-while (false). It is not so useful if they are being declared for use outside of that scope. The simple compound statement has the same flaw.
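The benign case, where the variable is needed only inside the block, might look like this sketch. SWAP_INT, its local variable temporary, and the order function are hypothetical names chosen for illustration.

```cpp
// The temporary lives only inside the do-while (false), which is all it needs.
// Caveat: an argument that is itself named "temporary" would collide with it.
#define SWAP_INT(_A_, _B_) \
    do { \
        int temporary = (_A_); \
        (_A_) = (_B_); \
        (_B_) = temporary; \
    } while (false)

// Puts the smaller value in low and the larger in high.
void order(int & low, int & high) {
    if (low > high)
        SWAP_INT(low, high);
}
```

By contrast, writing ALLOCATE(x, y); followed by x = 1; would fail to compile, since x went out of scope at the closing brace.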

The Little Control Structure That Could

I hope I have given you a new appreciation of do-while (false), the control structure that does so much, while generating so little in return.