Minimum fuss
One way to waste a lot of time trying to find bugs is to write code like this:
#define min(a,b) a < b ? a : b
// ...
int i(5);
int j = min(++i, 8);
double d(3.0/min(1.0,2.0));
std::cout << j << std::endl;
std::cout << d << std::endl;
(This code is in min.cpp in git://git.istic.org/cplusplus.git.)
As you can see, the programmer has written a macro whose intent is to evaluate
to the lesser of its two arguments. But this is not the actual behaviour. Let's
look first at j.
int j = min(++i, 8);
If we're thinking of min as if it were a function, we'd expect this to evaluate
++i, which is 6, compare it with 8, and then return 6. But in fact j gets
the value 7. How? What does this code become after being run through the macro
preprocessor? You can find out by running the above snippet through gcc -E:
the -E option tells it to stop after the preprocessor step and output what it
has so far. (It doesn't matter whether the snippet would compile or not.)
int j = ++i < 8 ? ++i : 8;
It should be fairly obvious that ++i is going to be evaluated twice: first to
determine the result of ++i < 8, and then (assuming that is true) to compute
the result. In this case, it's fairly obvious that our macro is doing the wrong
thing because the expression we are giving it has noticeable side-effects, but
we could just as easily have done
int j = min(some_function_that_takes_ages_to_run(), 100);
and then the result would be correct, but whenever the function returned less than 100 it would be called twice, so the code could take twice as long to run.
But what of our other use? We might expect
double d(3.0/min(1.0,2.0));
to evaluate to 3.0/1.0, which equals 3.0, but in fact d becomes 2.0. When run through the preprocessor, this line becomes
double d(3.0/1.0 < 2.0 ? 1.0 : 2.0);
which, given that the precedence of / is higher than <, is evaluated in the
order the spacing suggests: that is, the division is done before the comparison.
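Both surprises can be reproduced in a few lines. This is only a sketch: the macro is renamed NAIVE_MIN and wrapped in two helper functions, and those names are mine rather than anything in min.cpp.

```cpp
// NAIVE_MIN is a hypothetical name for the unbracketed macro above, renamed
// so that it cannot clash with any other min.
#define NAIVE_MIN(a,b) a < b ? a : b

// Returns the value that j receives in the first example.
int naive_min_with_increment()
{
    int i(5);
    // Expands to: ++i < 8 ? ++i : 8, so i is incremented twice.
    int j = NAIVE_MIN(++i, 8);
    return j;
}

// Returns the value that d receives in the second example.
double naive_min_in_division()
{
    // Expands to: 3.0/1.0 < 2.0 ? 1.0 : 2.0, which the compiler reads as
    // (3.0/1.0) < 2.0 ? 1.0 : 2.0.
    double d(3.0/NAIVE_MIN(1.0,2.0));
    return d;
}
```

Calling the first helper gives 7 rather than 6, and the second gives 2.0 rather than 3.0.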
Another more obvious error results if I try to use the macro like this:
std::cout << min(5,6) << std::endl;
This is for much the same reason: running it through the preprocessor yields the code
std::cout << 5 < 6 ? 5 : 6 << std::endl;
which means that it first does std::cout << 5, and then tries to compare
the result of that (which is std::cout, because operator<< should always
return its left-hand argument) with 6. There is no such comparison operator
defined, so the compilation fails.
How can we avoid these problems? The first possibility is to define the macro in a more sensible way:
#define min(a,b) ((a) < (b) ? (a) : (b))
The extra set of brackets round the outside stops the ternary operator (? :)
being broken up by other operators, like in my d example, and the brackets
around a and b mean that if they are expressions they will stay together.
It doesn't fix the problem of evaluating arguments more than once, which is
inherent to macros and has no easy fix. There's another problem
that's inherent to macros as they are done in C: they don't have any namespacing
mechanism. A macro introduces a new global name, so if I had the following line
somewhere after my macro definition (maybe even in an include file I don't
control), it would fail to compile.
int k = std::numeric_limits<unsigned int>::min();
The macro preprocessor doesn't understand namespaces, so it tries to expand the
min macro even though I actually want to call a static method, and it finds
the call doesn't have enough arguments, so the program fails even before getting
to the compiler proper. The workaround for this, should you ever be stuck with
such a macro interfering with your function or variable names, is to put
brackets around the name of the function:
int k = (std::numeric_limits<unsigned int>::min)();
This separates the min from the following (, which ensures that it is not
treated as a macro.
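Here is a rough sketch of what the bracketed macro does and does not fix, together with the bracket workaround. The three helper functions are hypothetical names of my own, and the #undef at the end is only there to stop the macro leaking into whatever follows the sketch.

```cpp
#include <limits>

// The bracketed macro from the text. It is defined after the #include so that
// it cannot interfere with the standard header itself.
#define min(a,b) ((a) < (b) ? (a) : (b))

// Precedence is now right: the whole ternary is divided into 3.0.
double bracketed_min_in_division()
{
    return 3.0/min(1.0,2.0);
}

// The argument is still evaluated twice, so i ends up at 7.
int bracketed_min_with_increment()
{
    int i(5);
    return min(++i, 8);
}

// Brackets round the function name keep the preprocessor away, because the
// name min is no longer followed directly by an opening bracket.
unsigned int smallest_unsigned()
{
    return (std::numeric_limits<unsigned int>::min)();
}

#undef min
```

The division now yields 3.0, the increment example still yields 7, and the last function compiles and returns 0 even though the macro is in scope.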
All of these workarounds are just hackery that shouldn't be necessary any more.
C++ compilers, unlike many early C compilers, honour the inline keyword on
functions, which means you get the advantages of using a function—respecting
namespaces, evaluating its arguments exactly once, not having to worry about the
precedence of operators inside the function body—as well as the efficiency
advantage that macro-lovers cling to, of not introducing the cost of a function
call. The inline keyword is just a hint to the compiler, and it's not obliged
to insert the code inline, but sensible compilers almost always will. You need
to make min templated, so that like its macro counterpart, it can operate on
values of any type (as long as operator< is defined appropriately).
template<typename T>
inline T min(T a, T b)
{
return a < b ? a : b;
}
(The earlier example, using this definition, is given in min-inline.cpp.)
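As a quick check of those claims, here is a sketch that reruns the earlier examples against the template. The namespace inline_demo, the call counter and the helper names are all hypothetical; I have not reproduced min-inline.cpp itself.

```cpp
namespace inline_demo {

// The inline template from the text, in a namespace of its own.
template<typename T>
inline T min(T a, T b)
{
    return a < b ? a : b;
}

} // namespace inline_demo

// Hypothetical stand-in for an expensive function: it counts its own calls.
int call_count = 0;
int counted_six()
{
    ++call_count;
    return 6;
}

// ++i is evaluated once, before the call, so this returns 6.
int inline_min_with_increment()
{
    int i(5);
    return inline_demo::min(++i, 8);
}

// No precedence surprises: this is 3.0/1.0.
double inline_min_in_division()
{
    return 3.0/inline_demo::min(1.0,2.0);
}

// Returns how many times counted_six() ran for a single call to min.
int calls_made_by_inline_min()
{
    call_count = 0;
    int result = inline_demo::min(counted_six(), 100);
    (void)result; // only the call count matters here
    return call_count;
}
```

The three helpers return 6, 3.0 and 1 respectively, where the macro versions gave 7, 2.0 and two calls.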
Another good feature of having this is that, even though it is inline, an
actual function is still generated by the compiler. Unlike a macro, you can pass
a pointer to this function to higher-order functions. For example, you could
pass it to std::transform to ensure that no element of a list of numbers is
greater than zero, by taking the minimum of each with zero.
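One way this might look is sketched below, using the two-range form of std::transform with a second range full of zeros. The namespace transform_demo and the function cap_at_zero are hypothetical names. Note that taking the minimum with zero caps each element at zero, so positive values become 0 and negative values are left alone.

```cpp
#include <algorithm>
#include <vector>

namespace transform_demo {

// The inline template from the text, in a namespace of its own.
template<typename T>
inline T min(T a, T b)
{
    return a < b ? a : b;
}

} // namespace transform_demo

// Returns a copy of values in which every element is replaced by the
// minimum of itself and zero.
std::vector<int> cap_at_zero(const std::vector<int>& values)
{
    std::vector<int> zeros(values.size(), 0);
    std::vector<int> result(values.size());

    // Taking the address picks out the int instantiation of the template,
    // which is something a macro could never offer.
    int (*min_int)(int, int) = &transform_demo::min<int>;

    std::transform(values.begin(), values.end(), zeros.begin(),
                   result.begin(), min_int);
    return result;
}
```

Given the values 3, -2, 0, 7, -5 this returns 0, -2, 0, 0, -5.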
There's even better news to come: you don't even have to write this function. It
already exists, and is called std::min; std::max exists as well. There is
one slight gotcha with the latter. If asked to guess how std::max is defined,
you might write something like:
template<typename T>
inline T max(T a, T b)
{
return a > b ? a : b;
}
which seems pretty sane, but in fact the crucial line has a < b ? b : a.
There are two reasons for this. First, it means that a user-defined type only
needs to define operator< and not operator> to work with both
std::min and std::max. Second, it pins down what happens when neither argument
compares less than the other, which can occur if the comparison operators are
slightly oddly defined, or if not all values of the type in question are
comparable (that is, if operator< doesn't represent a total order): std::max,
like std::min, then returns its first argument.
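To illustrate the first reason, here is a small sketch of a user-defined type that provides only operator< and still works with both functions. The Height type and the two helper functions are hypothetical, made up purely for this example.

```cpp
#include <algorithm>

// Hypothetical type that defines operator< and nothing else.
struct Height
{
    int centimetres;
};

bool operator<(const Height& a, const Height& b)
{
    return a.centimetres < b.centimetres;
}

// std::min needs only operator<.
int smaller_height(int x, int y)
{
    Height a = { x };
    Height b = { y };
    return std::min(a, b).centimetres;
}

// So does std::max; no operator> is required.
int larger_height(int x, int y)
{
    Height a = { x };
    Height b = { y };
    return std::max(a, b).centimetres;
}
```

With heights of 150 and 180, the first helper returns 150 and the second 180.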
These two properties are very useful, but they can cause some confusion when you
are replacing a naïve max macro or inline with std::max. If you are taking
the max of a NaN and a real number (whatever the precision), with the NaN as the
first argument, then the version above will return the real number, but std::max
will return the NaN. This is because all ordering comparisons with NaNs return
false: real numbers are neither greater than nor less than NaNs. That's a minor
point, but it caught me out in similar circumstances.
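The difference is easy to see in a sketch. The namespace nan_demo, the function naive_max and the two helpers are hypothetical names, and the sketch assumes ordinary IEEE floating-point behaviour (so no options such as -ffast-math). When every comparison is false, the naïve version falls through to its second argument and std::max to its first, so the argument order matters.

```cpp
#include <algorithm>
#include <cmath>
#include <limits>

namespace nan_demo {

// The naive max from the text, which compares with operator>.
template<typename T>
inline T naive_max(T a, T b)
{
    return a > b ? a : b;
}

} // namespace nan_demo

// Takes the max of a NaN and 1.0, with the NaN first or second, and reports
// whether the naive version returned the NaN.
bool naive_max_is_nan(bool nan_first)
{
    const double nan = std::numeric_limits<double>::quiet_NaN();
    const double m = nan_first ? nan_demo::naive_max(nan, 1.0)
                               : nan_demo::naive_max(1.0, nan);
    return std::isnan(m);
}

// The same experiment with std::max.
bool std_max_is_nan(bool nan_first)
{
    const double nan = std::numeric_limits<double>::quiet_NaN();
    const double m = nan_first ? std::max(nan, 1.0)
                               : std::max(1.0, nan);
    return std::isnan(m);
}
```

With the NaN first, naive_max returns 1.0 and std::max returns the NaN; with the NaN second, it is the other way round.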
One final point: if you're developing on Windows, in general it's useful to call
std::min and std::max by their full names rather than bring them into your
namespace with a using std::min declaration. windows.h defines min and
max macros unless you tell it not to (by defining NOMINMAX before including it),
so it's very easy to end up with them available in your file. If you use
std::min, then if this occurs you get an obvious compile error and you can fix
your includes or use the bracket workaround discussed above, whereas if you just
call it min then it will silently call the macro instead, which may not do what
you want.
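The silent case can be imitated without Windows. In this sketch the #define is a stand-in for the macro that windows.h would supply, written by hand on the assumption that it has the usual bracketed form, and the two function names are hypothetical.

```cpp
#include <algorithm>

// Stand-in for the min macro from windows.h (an assumption for illustration;
// the real header is not included here).
#define min(a,b) (((a) < (b)) ? (a) : (b))

// With a using declaration, the unqualified call is quietly taken over by
// the macro, so ++i runs twice and this returns 7.
int unqualified_call()
{
    using std::min;
    int i = 5;
    return min(++i, 8);
}

// The bracket workaround reaches the real std::min, so this returns 6.
int bracketed_call()
{
    int i = 5;
    return (std::min)(++i, 8);
}

#undef min
```

Writing std::min(++i, 8) without the brackets would not compile while the macro is defined, which is exactly the loud failure you want.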