My C code works with -O3 but not with -O0
- v1.2 - added comments from HN
- v1.3 - added “What does fits mean ?”
- v1.4 - added the conclusion
When something very basic goes wrong, I have this hierarchy of potential culprits:
- the compiler
- buggy hardware
- OS vendor
- last and least me, because I don’t make mistakes :)
So what doesn’t work ? I am doing a simple cast from a floating point value
to an integer value in C. This happens in a routine, that checks, if a double
is integral and if it would fit into a long.
What does “fits” mean ?
If the value is stored into a long instead of a double, it should have the same numerical value. I.e. it shouldn’t matter if the number is stored as a double or a long, even when casting back and forth.
int fits_long( double value)
{
long l_val;
int fits;
if( value < (double) LONG_MIN || value > (double) LONG_MAX)
return( 0);
l_val = (long) value;
fits = ((double) l_val == value);
return( fits);
}
First it checks, if the double
value is within the limits of long
.
If yes, then the the double
value is converted to a long
, and then checked
if both values as double
still match.
This works for LONG_MIN
, INT_MIN
, INT_MAX
but it doesn’t
work for LONG_MAX
.
Blame the compiler
I wrote a few tests to see what’s going on. There is a repository up
on github that has the
source. Let’s look at one file long.c
:
#include <stdio.h>
#include <limits.h>
#include <stdlib.h>
#include <assert.h>
#include <float.h>
int fits_long( double value)
{
long l_val;
int fits;
if( value < (double) LONG_MIN || value > (double) LONG_MAX)
return( 0);
l_val = (long) value;
fits = (double) l_val == value;
if( ! fits)
fprintf( stderr, "Mismatch %f with %ld\n", value, l_val);
return( fits);
}
int main( int argc, char *argv[])
{
double value;
fprintf( stderr, "long (%d bytes): %ld - %ld\n", (int) sizeof( long), LONG_MIN, LONG_MAX);
fprintf( stderr, "double (%d bytes): %g - %g\n", (int) sizeof( double), DBL_MIN, DBL_MAX);
assert( (double) LONG_MAX <= DBL_MAX);
// keep my code optimiza
if( argc == 1)
value = (double) LONG_MAX;
else
value = (double) (LONG_MAX - atoi( argv[ 1]));
if( fits_long( value))
printf( "FITS\n");
else
printf( "NOPE\n");
return( 0);
}
Let’s build it with cc -o long-O0.exe long.c
and run it. cc
is gcc-9. This is
the output on my Intel Xeon machine:
long (8 bytes): -9223372036854775808 - 9223372036854775807
double (8 bytes): 2.22507e-308 - 1.79769e+308
Mismatch 9223372036854775808.000000 with -9223372036854775808
NOPE
But with optimized code cc -O3 -o long-O3.exe long.c
I get:
long (8 bytes): -9223372036854775808 - 9223372036854775807
double (8 bytes): 2.22507e-308 - 1.79769e+308
FITS
Suddenly it works! This has got to be a compiler problem. So lets try another compiler clang-9. But it gets the same result!
Or is it the CPU ?
Suspect the CPU
Intel CPUs have a history of bugs. Did I hit one of those ? Let’s fire up
the Raspberry Pi and see if results differ on ARM.
On ARM the long-long.c
test exhibits the same behavior (long.c
is fine,
because long
is only 4 bytes large on the PI 4.)
./long-long-O0.exe
long-long (8 bytes): -9223372036854775808 - 9223372036854775807
double (8 bytes): 2.22507e-308 - 1.79769e+308
Mismatch 9223372036854775808.000000 with -9223372036854775808
NOPE
./long-long-O3.exe
long-long (8 bytes): -9223372036854775808 - 9223372036854775807
double (8 bytes): 2.22507e-308 - 1.79769e+308
FITS
So it must be a compiler bug! Or ?
Do another experiment
You can give long.exe
a number that will be
subtracted from LONG_MAX
. A manual binary search brought me to the
conclusion,that the problem manifests only with numbers that are larger than
LONG_MAX - 512
.
And 512 are seven bits. Which is peculiar, but does it matter ?
Gather some more data
The problem happens when long
or long-long
match the size of double
a floating point number. Both are 8 bytes large.
How can a 64 bit integer fit into a 64 bit double ?
What does C say ?
If you look at the “New C Standard” we have:
When a finite value of real floating type is converted to an integer type other than
_Bool
, the fractional part is discarded (i.e., the value is truncated toward zero).
Fine long
is not _Bool
.
If the value being converted is outside the range of values that can be represented, the behavior is undefined.
Well we checked against LONG_MIN
and LONG_MAX
and both are well within
DBL_MIN
and DBL_MAX
. So the value seems cool.
Wikipedia
A floating point in 64 bit C ought to be an IEEE 754 which looks like this:
The exponent takes up 11 bit, and the fraction is 52 bit.
Wikipedia says:
The 53-bit significand precision gives from 15 to 17 significant decimal digits precision (2−53 ≈ 1.11 × 10−16). If a decimal string with at most 15 significant digits is converted to IEEE 754 double-precision representation, and then converted back to a decimal string with the same number of digits, the final result should match the original string.
15 is also what DBL_DIG
gives
us. But our LONG_MAX
is 9,223,372,036,854,775,807 so it has 19 digits. If we
subtract 512 it becomes 9,223,372,036,854,775,295 and it fits. That doesn’t
match up.
Godbolt
Looking at the -O3
assembler output on GodBolt of
a simplified version of fits_llong
:
#include <limits.h>
int fits_llong( double value)
{
long long l_val;
l_val = (long long) value;
return( l_val == value);
}
we see this assembler output:
cvttsd2si rax, xmm0
cvtsi2sd xmm1, rax
cmpeqsd xmm1, xmm0
movq rax, xmm1
and eax, 1
ret
The generated code uses floating point (FP) xmm
registers which are 128 bits wide. The
conversion and the comparison are
made in xmm
and the flag is then moved out of xmm
into integer eax
and
returned.
When we look at the -O0
code
push rbp
mov rbp, rsp
movsd qword ptr [rbp - 8], xmm0
cvttsd2si rax, qword ptr [rbp - 8]
mov qword ptr [rbp - 16], rax
cvtsi2sd xmm0, qword ptr [rbp - 16]
ucomisd xmm0, qword ptr [rbp - 8]
sete cl
setnp dl
and cl, dl
and cl, 1
movzx eax, cl
pop rbp
ret
we see that the comparison is done with ucomisd
which is 64 bits wide.
So it’s clear, why -O0
and -O3
differ. In -O3
the computation is done
in higher precision registers and the outcome is different, than if it were
done in -O0
. It’s a little surprising to me, that ARM shows the same
effect. But I know very little about that CPU.
Making it fail predictably
If the double
is not passed in a xmm
register but rather via a true double
the function fails in -O3
and -O0
. To achieve sending a true double I
pass it via a pointer, which is obviously less efficient:
int fits_long( double *value)
{
long l_val;
int fits;
if( *value < (double) LONG_MIN || *value > (double) LONG_MAX)
return( 0);
l_val = (long) *value;
fits = (double) l_val == *value;
if( ! fits)
fprintf( stderr, "Mismatch %g with %ld\n", *value, l_val);
return( fits);
}
But will this always work ? Sometimes I feel like the compiler isn’t my friend
anymore, but rather an enemy. How can I be sure, that fits_long
won’t get
inlined and consequently “improved” to use a double
instead of a double *
?
Since it is not a static, the compiler should keep it as is, so the symbol is
available for linking. But I suspect nothing prevents the compiler from
inlining it anyway in a local calling function. I could move the function to
a different .o
file, so inlining can’t happen. But then I live in fear of
link-time optimizers becoming more aggressive…
I will adorn it with __attribute__(( noinline))
and hope that all compilers
will understand and respect that.
Form an opinion
If you pass a value as double
into a function, it may be using higher
precision registers. The following calculations can be done, and in -O3
in my case
will be done, with those registers exclusively. The results of the casting
operation differ in the 128 bit and 64 bit floating point (FP) domain.
Only a true double
size variable (via a pointer) gives a reliable 64 bit
FP domain result:
__attribute__(( noinline))
int fits_long( double *value)
{
long l_val;
int fits;
l_val = (long) *value;
fits = (double) l_val == *value;
return( fits);
}
So who is to blame ?
In terms of the C-Standard, I am hitting the clause
If the value being converted is outside the range of values that can be represented, the behavior is undefined.
The value LONG_MAX
if 64 bit long can not be represented in a double
also
64 bit long. The mistake is basically on the side of the caller of the function
casting a LONG_MAX
to a double and passing that along.
There is no warning from the compiler though. So you have to know which numbers are kosher and which are not (which I don’t).
Comments from Hacker News
So I linked this article on HackerNews, were it gained some traction. Some of the comments on HackerNews were very illuminating:
Use -fsanitize=undefined
hannob pointed out
that this could have been caught more easily, if one had used the
-fsanitize=undefined
option. I didn’t think of that, but I will remember it
now.
Check FP registers with fetestexcept
dfranke mentioned use of the
floating point status registers. I didn’t know these were accessible in C,
since I don’t do FP computations a lot.
Reading up on this topic a bit more, it looks like a cast from double
to
long
, might raise an exception if the processor is configured that way.
So this is something I’d like to avoid.
SSE registers are 128 bits wide, but precision is 64 bit
heftig notices, that what I thought were 128 bits operations are really 64 bit after all.
The best comment was deleted
I am sure I read this, but I can’t find it anymore. So I don’t know who wrote it. Why was it best ? Because it just showed how I am tripping over my own feet again and again :)
// keep my code optimiza
if( argc == 1)
value = (double) LONG_MAX;
else
value = (double) (LONG_MAX - atoi( argv[ 1]));
The use of some command-line argument to keep the optimizer from optimizing code away is valid, because the input is unpredictable. Yet because I am lazy, I gave a default value, if there are no arguments…
The optimizer snatches up the opportunity to forego the fits_long
call
when argc == 1 (1 is a constant) and does all the calculation at compile time
rather than execution time, with different results.
Fixing this to
// keep my code optimiza
value = (double) (LONG_MAX - atoi( argv[ 1]));
(and supplying 0 on the command line), produces NOPE
output with -O0
and -O3
.
Conclusion
Let’s build it back up from the ground.
What is problematic about this code ?
int fits_long( double value)
{
long l_val;
double d_val;
int fits;
l_val = (long) value; // 1
d_val = (double) l_val; // 2
fits = d_val == value; // 3
return( fits);
}
The conversion in 1 (long) value
may produce an undefined value for l_val
,
if the value is out of bounds for long
.
The same goes for the conversion in 2. Lets assume the undefined value returned from
1 is by chance LONG_MAX
, which we know to not be a double
. Again
we would be running into a case were the value is undefined in 2.
The outcome of the comparison in 3 of an undefined d_val
with value
is
therefore unpredictable. This is no good.
It would be nice to check beforehand, if a value is undefined
and then not attempt the comparison. It’s known,
that 64 bit doubles can represent all integers in the range of -1*2^53+1
to 1*2^53+1
.
Checking for that we get code that doesn’t produce undefined values:
int fits_long( double value)
{
long l_val;
double d_val;
int fits;
if( value < (double) -((1L<<52)+1) && value > (double) ((1L<<52)+1))
return( 0);
l_val = (long) value;
d_val = (double) l_val;
fits = d_val == value;
return( fits);
}
But these are by far not all the integers that a double can “cleanly” represent.
For instance you can store without loss the number 1000000000000000000
into
a double
and a long
. 1*2^53
is only 4503599627370496
. Three digits less.
To check for all values, one needs to use the math library function lrint in conjunction with fetestexcept of the fenv library. With the fenv library, you get access to the floating point status registers.
int fits_long( double d)
{
long l_val;
double d_val;
// may be needed ?
// #pragma STDC FENV_ACCESS ON
feclearexcept( FE_INVALID); // 1
l_val = lrint( d); // 2
d_val = (double) l_val; // 3
if( fetestexcept( FE_INVALID)) // 4
return( 0);
return( d_val == d);
}
What this does is in 1, clear the FP exception flag FE_INVALID
with
feclearexcept
. Then it will do the conversion in 2. If the returned value
is undefined, the FE_INVALID
flag will be set. The returned value is now
casted back into a double
in 3. The devils advocate, tells me that this
could go wrong too and, since the FP registers are sticky, there is no harm and possible gain in
doing the cast before the check. Then it is tested in 4 with
fetestexcept
, if undefined values have been produced.
If yes, we skip the comparison.
Now we know that d_val
and d
are valid. So we can compare without fear of
comparing undefined values.
Caveat
Use of fenv seems to be fairly tricky though. Perusing clang and python discussions, one gets the impression that the optimizer could get in the way. There is a
#pragma STDC FENV_ACCESS ON
for this, but… it’s a pragma. Not everyone supports it.
Open end
I am still looking for an easier way without having to link -lm
to check,
if a double
can be represented by an integer of the same size or not.
I suspect there isn’t any.
Post a comment
All comments are held for moderation; basic HTML formatting accepted.