Introduction
With each new FME release, we work to better handle the rounding errors inherent in conversions between decimal strings and binary floating-point numbers. These errors appear as a loss of precision: you expect 0.2000000000000000000000 but end up with 0.2000000000000000111022 after a float-to-string conversion or vice versa. These conversions occur while using various transformers. Some transformers have a specific data type assigned to them, so at some point during the translation, the data will go through a “Float to String” or a “String to Float” conversion in the background. In some situations, this conversion can happen without the user being aware of it.
Within FME
Within FME, numbers are not normally logged or displayed in hex; they are converted to strings, which is a float-to-string conversion. Going the other way, when the user types numbers as transformer parameters, they enter strings rather than hex values. Inside FME, these strings are then converted to IEEE 754 floating-point numbers for mathematical processing, which is a string-to-float conversion. Think of a string as a number written as a decimal (base 10) value, and a float as a number stored as a binary (base 2) value.
FME does, however, work very hard to ensure that round-trip values are the same: the result of a float-to-string-to-float or string-to-float-to-string conversion is identical to the original value. This means that a transformer can take a value, convert it for internal use, and then output it in the original format.
The Mathematics Behind This
String to Float
Strings can be stored as an array of characters, but a string that represents a number cannot be used directly for calculations. It must first be converted into a 64-bit binary floating-point number or a decimal floating-point number. If you require precision, decimal floating-point systems are the way to go, but if you require fast performance, you should use 64-bit binary floating-point systems in FME.
Commonly, Strings are converted into the Double data type, which is 64 bits wide: 1 bit for the sign (positive or negative), 11 bits for the exponent, and 52 bits for the mantissa (significand). All math is done on binary values, and with only 52 bits stored for the mantissa, not all numbers can be converted precisely.
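To make the 1/11/52 bit layout concrete, here is a minimal Python sketch that splits a Double into its three fields. It only uses the standard `struct` module; the helper name `double_fields` is made up for this illustration and is not part of FME.

```python
import struct


def double_fields(value):
    """Return (sign, exponent, mantissa) bit strings of a 64-bit double."""
    # Pack the value as a big-endian IEEE 754 double, then reinterpret
    # the same 8 bytes as an unsigned 64-bit integer.
    (raw,) = struct.unpack(">Q", struct.pack(">d", value))
    bits = f"{raw:064b}"
    # 1 sign bit, 11 exponent bits, 52 mantissa bits.
    return bits[0], bits[1:12], bits[12:]


sign, exponent, mantissa = double_fields(0.25)
print(sign, exponent, mantissa)
print(len(sign), len(exponent), len(mantissa))  # 1 11 52
```

For 0.25 the mantissa field is all zeros, because the value is an exact power of 2 and only the exponent is needed to represent it.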
Consider two cases: the values "0.25" and "0.2"
"0.25" can be stored as a Double precisely:
The denominator of the fraction (1/4) is a power of 2
binary: 0 01111111101 0000000000000000000000000000000000000000000000000000
hex: 0x3FD0000000000000
decimal: 0.25
"0.2" looks simple but cannot be stored as a Double precisely. Here are the Double values that it lies between:
The denominator of the fraction (1/5) is 5, a prime number that is not a power of 2
binary: 0 01111111100 1001100110011001100110011001100110011001100110011001
hex: 0x3FC9999999999999
decimal (rounded off): 0.199999999999999983346654630623...
binary: 0 01111111100 1001100110011001100110011001100110011001100110011010
hex: 0x3FC999999999999A
decimal (rounded off): 0.200000000000000011102230246252...
In this case, the mathematically closest value is the second one, 0x3FC999999999999A.
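The two cases above can be checked with a short Python sketch. It rebuilds each Double from the hex patterns listed, prints the exact decimal value, and uses exact fractions to see which neighbour is closer to 1/5. The helper name `from_hex_bits` is an assumption made for this example; Python is used only as a convenient IEEE 754 calculator, not as a description of FME internals.

```python
import struct
from decimal import Decimal
from fractions import Fraction


def from_hex_bits(hex_bits):
    """Build a double from its 16-digit big-endian hex bit pattern."""
    return struct.unpack(">d", bytes.fromhex(hex_bits))[0]


# "0.25" is stored exactly.
quarter = from_hex_bits("3FD0000000000000")
print(Decimal(quarter))  # 0.25

# "0.2" falls between two neighbouring doubles.
lower = from_hex_bits("3FC9999999999999")
upper = from_hex_bits("3FC999999999999A")
print(Decimal(lower))  # exact decimal expansion of the lower neighbour
print(Decimal(upper))  # exact decimal expansion of the upper neighbour

# Measure the distance to the true value 1/5 without any rounding.
target = Fraction(1, 5)
lower_error = target - Fraction(lower)
upper_error = Fraction(upper) - target
print(upper_error < lower_error)  # True: the upper neighbour is closer

# Parsing the string picks that closest neighbour.
print(float("0.2") == upper)  # True
```

`Decimal(float)` and `Fraction(float)` convert the stored binary value exactly, so the printed digits are the true stored value rather than a rounded display.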
The data changes even for a simple number like “0.2” when converting from String to Double. The digits at the end are not random; they are the true value that the number takes on when stored as a Double. The decimal shown is rounded off for display, but it is the rounding to the nearest binary value that changes the decimal number when it is converted back.
Float to String
Representing the exact value of a float as a String (base 10) might require many more than 17 digits. Once we decide to limit the number of digits, we are CHANGING the value. Limiting to 17 significant digits is the convention, as it has the essential property of giving each unique IEEE 754 Double value its own unique string representation. If you use fewer than 17 digits in the string, you cannot guarantee exact round-tripping back to Double. You also cannot guarantee that comparisons (like equality) on the Strings will mirror the same comparisons on the Double values.
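As a small illustration of the 17-digit rule, the following Python sketch formats the same Double with 17 and with 15 significant digits. The sample value `0.1 + 0.2` is chosen for this example only; it is a Double that differs from the Double nearest to 0.3.

```python
value = 0.1 + 0.2  # a Double slightly above 0.3

s17 = format(value, ".17g")  # 17 significant digits
s15 = format(value, ".15g")  # 15 significant digits

print(s17)  # 0.30000000000000004
print(s15)  # 0.3

# 17 digits round-trip back to the same Double; 15 digits do not here.
print(float(s17) == value)  # True
print(float(s15) == value)  # False

# String comparison no longer mirrors Double comparison at 15 digits:
# the Doubles differ, but their 15-digit strings are identical.
print(value == 0.3)                 # False
print(s15 == format(0.3, ".15g"))   # True
```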
As of FME 2017.1, FME uses the fewest decimal digits needed to ensure that if the value is converted back to a Double afterward, the result is the original Double. This means that there are cases where FME will use fewer than 17 digits of decimal precision. However, for the reasons stated above, FME will never use more than 17 digits.
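The idea of "fewest digits that still round-trip" can be sketched in Python as a simple search over precisions. This is only an illustration of the concept under the assumption that trying 1 to 17 significant digits in order is good enough; it is not FME's actual implementation, and the function name `shortest_round_trip` is hypothetical.

```python
def shortest_round_trip(value):
    """Return a short decimal string that converts back to the same double.

    Tries 1 to 17 significant digits and stops at the first that round-trips.
    """
    for digits in range(1, 18):
        text = format(value, f".{digits}g")
        if float(text) == value:
            return text
    # 17 significant digits always round-trip for a finite double.
    return format(value, ".17g")


print(shortest_round_trip(0.2))        # 0.2
print(shortest_round_trip(0.25))       # 0.25
print(shortest_round_trip(0.1 + 0.2))  # 0.30000000000000004
```

The stored value of 0.2 is not exactly 0.2, yet the single-digit string "0.2" is enough to get the same Double back, so no extra digits are written. The value `0.1 + 0.2` needs all 17 digits.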
Conclusion
Even though the handling of precision loss improves with each FME release, some loss might still occur. It is best to be aware of it and to know why it occurs. Running the results of your math calculations through an AttributeRounder transformer or the @round function is a good way to avoid this.
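The effect of rounding after a calculation can be sketched outside FME as well. The Python snippet below uses the built-in `round` as a stand-in for the rounding step; the choice of 10 decimal places is an assumption for this example, and in practice the right number depends on your data.

```python
a = 0.1 + 0.2
b = 0.3

# The raw Doubles differ in the last binary digit.
print(a == b)  # False

# Rounding both results to a fixed number of decimal places
# removes the difference before the values are compared.
DECIMALS = 10  # assumed precision for this example
print(round(a, DECIMALS) == round(b, DECIMALS))  # True
```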