Strtod Is Wild
The strtod function in C converts decimal strings to floating-point numbers. Implementing it correctly raises challenges in accuracy, precision, and memory management. David M. Gay's contributions to its implementation are significant.
The article discusses the complexities of the strtod function in the C standard library, which converts decimal strings to floating-point binary numbers. While it appears straightforward, achieving 100% accuracy in this conversion is challenging due to the potential for arbitrarily long input strings and the need to handle various floating-point representations. The strtod function must account for numerous tricky cases, including different ways to express the same number, precision issues, and rounding rules dictated by IEEE standards. The implementation of strtod often relies on arbitrary-precision arithmetic, which can lead to significant memory allocation challenges. The article highlights the contributions of David M. Gay, a key figure in developing modern strtod implementations, and notes that there are alternative implementations with varying accuracy and memory usage. The author reflects on the relevance of these concepts in their own work on arbitrary-precision arithmetic for a game that requires precise tracking of positions during infinite zooming. The discussion emphasizes the importance of understanding the underlying complexities of seemingly simple functions in programming.
- The strtod function converts decimal strings to floating-point binary numbers but is complex to implement accurately.
- It must handle arbitrarily long input and various floating-point representations, leading to potential precision issues.
- David M. Gay's work is foundational to modern strtod implementations, focusing on accuracy and memory management.
- Different implementations of strtod exist, with varying levels of accuracy and memory usage.
- The author relates these concepts to their own work on arbitrary-precision arithmetic in game development.
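As a rough illustration of "different ways to express the same number", the sketch below wraps the standard strtod call and checks whether two spellings land on the same double. The helper names parse_double and same_double are hypothetical and not from the article; the code assumes a C99 library whose strtod accepts hexadecimal floats.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper: parse the whole string as a double.
   Returns 1 on success; 0 if nothing was parsed, if trailing
   characters remain, or if strtod reported a range error. */
int parse_double(const char *s, double *out)
{
    char *end;
    double v;

    assert(s != NULL && out != NULL);
    errno = 0;
    v = strtod(s, &end);
    if (end == s)
        return 0;               /* no conversion was performed */
    if (*end != '\0')
        return 0;               /* trailing characters after the number */
    if (errno == ERANGE)
        return 0;               /* overflow or underflow reported */
    *out = v;
    return 1;
}

/* Hypothetical helper: 1 if both strings parse to the same double. */
int same_double(const char *a, const char *b)
{
    double x, y;
    return parse_double(a, &x) && parse_double(b, &y) && x == y;
}
```

With a correctly rounding strtod, same_double("0.5", "5e-1") and same_double("0.5", "0x1p-1") should both hold, while a string with trailing characters such as "1.5abc" is rejected by the end-pointer check.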
Related
The Byte Order Fiasco
Handling endianness in C/C++ programming poses challenges, emphasizing correct integer deserialization to prevent undefined behavior. Adherence to the C standard is crucial to avoid unexpected compiler optimizations. Code examples demonstrate proper deserialization techniques using masking and shifting for system compatibility. Mastery of these concepts is vital for robust C code, despite available APIs for byte swapping.
Floating Point Math
Floating point math in computing can cause inaccuracies in decimal calculations due to binary representation limitations. Different programming languages manage this with varying precision, affecting results like 0.1 + 0.2.
Neo Geo Dev: Fixed Point Numbers
The article explains fixed point numbers for the Neo Geo, enabling decimal-like calculations using integers. It discusses their advantages, drawbacks, and practical coding examples for game development, emphasizing precision management.
Strlcpy and how CPUs can defy common sense
The article compares the performance of `strlcpy` in OpenBSD and glibc, revealing glibc's faster execution despite double traversal, emphasizing instruction-level parallelism and advocating for sized strings for efficiency.
Creating invariant floating-point accumulators
The blog addresses challenges in creating invariant floating-point accumulators for the astcenc codec, emphasizing the need for consistency across SIMD instruction sets and the importance of adhering to IEEE754 rules.
- Google's double-conversion [1], which is best known for introducing the Grisu family of float-to-decimal algorithms but also has a much less documented decimal-to-float algorithm via successive approximations, AFAIK.
- The Eisel-Lemire algorithm [2], which is a Grisu3-like algorithm that returns either the correct result or a much rarer fallback signal, and is currently in the standard libraries of Go and Rust.
- I believe Microsoft's own C Runtime (msvcrt, later ucrt) also has completely separate code whose algorithm is roughly similar to one of the above.
These implementations also clearly demonstrate that such a conversion only needs bigint support of bounded size (~3 KB) and can be done in much less code than dtoa.
[1] https://github.com/google/double-conversion
[2] https://lemire.me/blog/2020/03/10/fast-float-parsing-in-prac...
In other words, what percentage of outstanding publicly accessible data sets require an implementation of strtod that can allocate memory on the heap?
Even this article, which talks about millions of digits, could be parsed just fine with a strtod that's limited to 64 characters.
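One way to picture the 64-character limit suggested above is a thin wrapper that refuses longer inputs before calling strtod. This is only a sketch under assumptions: the name strtod_bounded and the constant MAX_NUM_CHARS are hypothetical, and the wrapper only caps the input length; whether the underlying strtod allocates internally still depends on the C library. It rejects over-long input rather than truncating it, since cutting a string short could drop an exponent or change the rounded value.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MAX_NUM_CHARS 64   /* hypothetical cap, taken from the comment above */

/* Hypothetical wrapper: parse exactly the first `len` bytes of `s`.
   Returns 1 and stores the value if those bytes form one complete
   number of at most MAX_NUM_CHARS characters; returns 0 otherwise. */
int strtod_bounded(const char *s, size_t len, double *out)
{
    char buf[MAX_NUM_CHARS + 1];   /* fixed-size stack buffer, no heap use here */
    char *end;
    double v;

    assert(s != NULL && out != NULL);
    if (len == 0 || len > MAX_NUM_CHARS)
        return 0;                  /* empty or over the cap: refuse */
    memcpy(buf, s, len);
    buf[len] = '\0';
    v = strtod(buf, &end);
    if (end != buf + len)
        return 0;                  /* not all bytes were consumed */
    *out = v;
    return 1;
}
```

A caller that tokenizes a data file could pass each field's pointer and length directly, and treat a return of 0 as a signal to fall back to a slower, unbounded parser or to report the field as malformed.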