September 1st, 2024

Creating invariant floating-point accumulators

The blog addresses the challenges of creating invariant floating-point accumulators for the astcenc codec, emphasizing the need for consistent results across SIMD instruction sets and strict adherence to IEEE 754 rules.


The blog discusses the challenges of creating invariant floating-point accumulators for the astcenc codec, which aims to produce bit-identical output across different instruction sets like NEON, SSE4.1, and AVX2. The need for invariance arises because floating-point arithmetic is not associative: changing the order of operations changes which low-order bits are rounded away, and therefore the final result. The author highlights several problems encountered during the implementation, including differences in accumulation order between scalar and vectorized code, variable-width accumulators, and loop tail handling. To address these issues, the author standardized on fixed-width vector accumulators and adjusted loop tail processing so that every SIMD width performs the same sequence of rounding steps. The blog emphasizes the importance of adhering to IEEE 754 rules and avoiding compiler optimizations that reorder floating-point math. Additionally, it warns against fast approximations and fused operations, which round differently and can therefore compromise invariance. The author concludes that while achieving invariance may seem complex, careful attention to detail leads to stable, reproducible outputs.
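The core trick is easier to see in code than in prose. Here is a minimal scalar sketch of the idea as the summary describes it (the function name is invented, and this is not astcenc's actual implementation): a fixed 4-wide accumulator whose lane-wise additions happen in the same order regardless of the hardware SIMD width, with the loop tail processed lane-by-lane so every build performs the identical sequence of rounding steps.

```cpp
#include <cstddef>

// Hypothetical illustration: accumulate through a fixed 4-wide "virtual"
// accumulator regardless of the real SIMD width. Every build performs
// the same additions in the same order, so results are bit-identical.
float sum_invariant(const float* data, size_t count)
{
    float acc[4] = { 0.0f, 0.0f, 0.0f, 0.0f };

    // Main loop: whole 4-wide chunks, one add per lane.
    size_t i = 0;
    for (; i + 4 <= count; i += 4)
    {
        for (size_t lane = 0; lane < 4; lane++)
        {
            acc[lane] += data[i + lane];
        }
    }

    // Loop tail: inactive lanes skip the add, which is what a masked
    // SIMD tail does too, so scalar and vector builds stay in lockstep.
    for (size_t lane = 0; i + lane < count; lane++)
    {
        acc[lane] += data[i + lane];
    }

    // Horizontal reduction in a fixed tree order.
    return (acc[0] + acc[1]) + (acc[2] + acc[3]);
}
```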

- The astcenc codec aims for consistent output across various SIMD instruction sets.

- Floating-point arithmetic can introduce variability due to precision and order of operations.

- Standardizing on fixed-width accumulators helps maintain invariance in vectorized code.

- Compiler settings and optimizations can significantly impact floating-point determinism.

- Fast approximations and fused operations should be avoided to ensure consistent results.
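The warning about fused operations in the last point is worth a concrete demonstration. The following standalone example (mine, not from the post) shows why: an FMA rounds once, while the unfused multiply-then-add rounds twice, so builds that differ only in whether the compiler emits FMA will disagree in the last bit.

```cpp
#include <cmath>
#include <cstdio>

int main()
{
    // a * b is exactly 1 - 2^-54; round-to-nearest-even rounds it to 1.0.
    double eps = std::ldexp(1.0, -27); // 2^-27
    double a = 1.0 + eps;
    double b = 1.0 - eps;

    // FMA computes a * b - 1 exactly and rounds once; the unfused form
    // rounds the product first and loses the low-order bits.
    double fused   = std::fma(a, b, -1.0); // -2^-54
    double unfused = a * b - 1.0;          // 0.0 (assuming the compiler
                                           // does not contract this into
                                           // an FMA, e.g. -ffp-contract=off)

    std::printf("fused   = %.17g\n", fused);   // ~ -5.55e-17
    std::printf("unfused = %.17g\n", unfused); // 0
    return 0;
}
```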

6 comments
By @boulos - 8 months
This seems to keep coming up, and I see confusion in the comments. There is a standard: IEEE 754-2008. There are additional things people add, like approximate reciprocals and approximate square roots. But if you don't use those, and you don't make a re-association error, you get consistent results.

The question here with re-association for summation is what you want to match. OP chose to match the scalar for-loop equivalent. You can just as easily make an 8-wide or 16-wide "virtual vector" and use that instead.

I suspect that an 8-wide virtual vector is the right default currently: every Intel system since Haswell supports it, as does all recent AMD, and if you're vectorizing anyway you can afford some overhead on Arm by emulating it with a double-width virtual vector. You don't often gain enough from AVX-512 to make the default 16-wide, but if you wanted to focus on Skylake+ (really Cascade Lake+) or Genoa+ systems, it would be a fine choice.
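A rough sketch of that "virtual vector" suggestion as I read it (the names and structure are mine, not the commenter's): fix the accumulation order at 8 lanes, and let narrower hardware update it in halves.

```cpp
#include <cstddef>

// Hypothetical 8-wide "virtual vector" accumulator. A 4-wide NEON/SSE
// build would update the two halves with two SIMD adds and an AVX2 build
// with one, but the per-lane additions (and rounding) are identical, so
// every build agrees bit-for-bit.
float sum_virtual8(const float* data, size_t count)
{
    float acc[8] = {};

    for (size_t i = 0; i + 8 <= count; i += 8)
    {
        for (size_t lane = 0; lane < 8; lane++)
        {
            acc[lane] += data[i + lane];
        }
    }
    // (Tail handling elided; it must also be lane-identical across builds.)

    // Reduce in a fixed tree order so every build agrees.
    return ((acc[0] + acc[1]) + (acc[2] + acc[3]))
         + ((acc[4] + acc[5]) + (acc[6] + acc[7]));
}
```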

By @kardos - 8 months
Exact floating-point accumulation is more or less solved by xsum [1] -- would it work in this context?

[1] https://gitlab.com/radfordneal/xsum
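For readers unfamiliar with the idea, xsum computes exactly rounded sums; compensated (Kahan) summation is a lighter-weight relative of the same idea. The sketch below is generic Kahan summation, not xsum's API:

```cpp
#include <cstddef>

// Kahan (compensated) summation: carry a running error term so most of
// the rounding error of each addition is fed back into the next one.
// Not exactly rounded like xsum, but far less order-sensitive than a
// plain accumulator. Requires that the compiler not "optimize" the
// compensation away (i.e. no -ffast-math).
double kahan_sum(const double* data, size_t count)
{
    double sum = 0.0;
    double comp = 0.0; // compensation for lost low-order bits

    for (size_t i = 0; i < count; i++)
    {
        double y = data[i] - comp;
        double t = sum + y;
        comp = (t - sum) - y; // recovers the part of y that was rounded off
        sum = t;
    }
    return sum;
}
```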

By @waynecochran - 8 months
Invariance with floating-point arithmetic seems like a fool's errand. If the numbers one is working with are roughly on the same order of magnitude, then I would consider integer/fixed-point instead. You get the same results in this case (as long as you are careful).
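As a concrete version of that suggestion (my own sketch, with an assumed Q16.16 scale): once values are quantized to integers, the accumulation itself is exactly associative, so any summation order gives the same answer.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical fixed-point accumulation: quantize floats to Q16.16,
// sum in 64 bits, convert back once. Integer addition is associative,
// so scalar, SIMD, and parallel orderings all agree exactly.
float fixed_point_sum(const float* data, size_t count)
{
    const float scale = 65536.0f; // Q16.16; assumes inputs fit the range
    int64_t acc = 0;

    for (size_t i = 0; i < count; i++)
    {
        acc += static_cast<int64_t>(data[i] * scale); // one quantization per input
    }
    return static_cast<float>(acc) / scale;
}
```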
By @someguydave - 8 months
Seems crazy to try to paper over hardware implementation differences in software. Some org should be standardizing floating-point intrinsics.
By @baq - 8 months
By @modulovalue - 8 months
I'm still wondering if there could exist an alternative world where efficient addition over the decimal numbers that we developers use on a day-to-day basis is associative. Is that even possible, or is there perhaps some fundamental limit that forces us to trade associativity for performance?

It seems to me that non-associative floating-point operations force us into a local maximum. The operation itself might be efficient on modern machines, but could it be preventing us from applying other important high-level optimizations to our programs due to its lack of associativity? A richer algebraic structure should always be amenable to a richer set of potential optimizations.
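The non-associativity the comment laments is easy to exhibit with three values:

```cpp
#include <cstdio>

int main()
{
    // Floating-point addition is not associative: the grouping decides
    // which low-order bits get rounded away.
    float a = 1e20f, b = -1e20f, c = 1.0f;

    std::printf("(a + b) + c = %g\n", (a + b) + c); // 1
    std::printf("a + (b + c) = %g\n", a + (b + c)); // 0 (c vanishes into b)
    return 0;
}
```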

---

I've asked a question that is very much related to that topic on the programming language subreddit:

"Could numerical operations be optimized by using algebraic properties that are not present in floating point operations but in numbers that have infinite precision?"

https://www.reddit.com/r/ProgrammingLanguages/comments/145kp...

The responses there might be interesting to some people here.