September 10th, 2024

Strtod Is Wild

The strtod function in C converts decimal strings to floating-point numbers, facing challenges in accuracy, precision, and memory management. David M. Gay's contributions are significant in its implementation.

Read original articleLink Icon
Strtod Is Wild

The article discusses the complexities of the strtod function in the C standard library, which converts decimal strings to floating-point binary numbers. While it appears straightforward, achieving 100% accuracy in this conversion is challenging due to the potential for arbitrarily long input strings and the need to handle various floating-point representations. The strtod function must account for numerous tricky cases, including different ways to express the same number, precision issues, and rounding rules dictated by IEEE standards. The implementation of strtod often relies on arbitrary-precision arithmetic, which can lead to significant memory allocation challenges. The article highlights the contributions of David M. Gay, a key figure in developing modern strtod implementations, and notes that there are alternative implementations with varying accuracy and memory usage. The author reflects on the relevance of these concepts in their own work on arbitrary-precision arithmetic for a game that requires precise tracking of positions during infinite zooming. The discussion emphasizes the importance of understanding the underlying complexities of seemingly simple functions in programming.

- The strtod function converts decimal strings to floating-point binary numbers but is complex to implement accurately.

- It must handle arbitrarily long input and various floating-point representations, leading to potential precision issues.

- David M. Gay's work is foundational to modern strtod implementations, focusing on accuracy and memory management.

- Different implementations of strtod exist, with varying levels of accuracy and memory usage.

- The author relates these concepts to their own work on arbitrary-precision arithmetic in game development.

Link Icon 4 comments
By @lifthrasiir - 7 months
To be clear, David Gay's dtoa is not a modern implementation of decimal-to-float conversion. There had been several much simpler and performant alternatives, including:

- Google's double-conversion [1], which is best known for introducing the Grisu family of new float-to-decimal algorithms but also has a much less documented float-to-decimal algorithm via successive approximations AFAIK.

- The Eisel-Lemire algorithm [2], which is a Grisu3-like algorithm and returns either correct digits or a much rare fallback signal and currently in the standard libraries of Go and Rust.

- I believe Microsoft's own C Runtime (msvcrt, later ucrt) also has a completely separate code which algorithm is roughly similar to one of above.

These implementations also clearly demonstrate that such conversion only needs a bigint support of the bounded size (~3 KB) and can be done in much smaller code than dtoa.

[1] https://github.com/google/double-conversion

[1] https://lemire.me/blog/2020/03/10/fast-float-parsing-in-prac...

By @Neywiny - 7 months
I think recently I had to make my own because strtod was too large for my embedded system. I just wanted simple xx.yy type strings, so it was as easy as parsing 2 integers, convert, divide, add, done. I often wish these functions had more variants for stuff like that.
By @fph - 7 months
I hate the anti-responsive design of this page: I zoom in (ctrl-plus in Firefox), and the text gets smaller.
By @jancsika - 7 months
Would be cool to see a histogram of all of the web's textual representation of floats based on character length.

In other words-- what percentage of outstanding publicly-accessible data sets require an implementation of strod which can allocate memory on the heap?

Even this article, which talks about millions of digits, could be parsed just fine with a strod that's limited to 64 characters.