`noexcept` affects libstdc++'s `unordered_set`
The article examines the impact of the `noexcept` specifier on `std::unordered_set` performance in libstdc++, highlighting optimization opportunities and advocating for improvements to handle hash function efficiency better.
Read original articleThe article discusses how the `noexcept` specifier impacts the performance of `std::unordered_set` in GNU libstdc++. It explains that the layout of nodes in hash-based associative containers can change based on whether the hash function is marked as `noexcept`. The author illustrates how the `erase` function can be optimized by precomputing and storing the hash value in each node, which can improve performance during element removal and rehashing. The article highlights that libstdc++ does not cache hash values by default, assuming that recomputing them is inexpensive, which can lead to performance issues when using non-trivial hash functions. The author provides benchmarks showing the performance differences based on the type of hash function used, emphasizing that marking a non-trivial hash function as `noexcept` can lead to increased calls to the hash function and slower performance. The conclusion suggests that while these details may not be critical for most programming tasks, they highlight the importance of understanding how `noexcept` can affect performance in specific scenarios. The author advocates for improvements in libstdc++ to better handle these cases.
- The `noexcept` specifier can significantly affect the performance of `std::unordered_set`.
- Storing hash values in nodes can optimize performance during element removal and rehashing.
- libstdc++ assumes recomputing hash values is cheap, which can lead to inefficiencies with non-trivial hash functions.
- Marking non-trivial hash functions as `noexcept` can result in performance degradation.
- The author calls for improvements in libstdc++ to address these performance issues.
Related
Group Actions and Hashing Unordered Multisets
Group actions are used to analyze hash functions for unordered sets and multisets, ensuring order-agnostic hashing. By leveraging group theory, particularly abelian groups, hash functions' structure is explored, emphasizing efficient and order-independent hashing techniques.
Benchmarking Perfect Hashing in C++
Benchmarking perfect hashing functions in C++ using clang++-19 and g++-13 reveals mph as the fastest with limitations. Various hash function implementations are compared for lookup time, build time, and size, aiding system optimization.
Ask HN: Fast data structures for disjoint intervals?
The author seeks recommendations for innovative data structures to improve read speeds for managing disjoint time intervals, noting that existing solutions often do not outperform simple ordered maps.
Unsafe Read Beyond of Death
The article details the "Unsafe Read Beyond of Death" optimization for the GxHash algorithm, enhancing performance through SIMD instructions and achieving over tenfold speed increases for small payloads while ensuring safety.
C++'s `Noexcept` Can (Sometimes) Help (Or Hurt) Performance
The article examines the mixed performance effects of the `noexcept` keyword in C++, revealing that its impact varies by compiler, OS, and code context, urging cautious application by developers.
In fact most c++ developers believe that throwing an exception in a noexcept function is undefined behavior. It is not. The behavior is defined to call std::terminate. Which would lead one to ask how does it know to call that. Because noexcept functions have a hidden try catch to see if it should call it. The result is that noexcept can hurt performance, which is surprising behavior. C++ is just complicated.
> The outcome is that libstdc++’s unordered_set has performance characteristics that subtly depend on the true name and noexceptness of the hash function.
> - A user-defined struct H : std::hash<std::string> {} will see smaller allocations and more calls to the hash function, than if you had just used std::hash<std::string> directly. (Because std::hash<std::string> is on the blacklist and H is not.)
> - A user-defined hasher with size_t operator() const noexcept will see smaller allocations and more calls to the hash function (especially during rehashing). One with size_t operator() const will see larger allocations and fewer calls to the hash function.
I think this would be an ABI break.
I mean, of course you should follow this article's advice on noexcept, but there's a whole world of associative container performance optimizations out there.
std unordered map is slow by design for several reasons.
Things of interest are:
- open addressing method: linear, quadratic, robinhood, hybrid
- the default hash function
- the logic for the number of buckets (power of two, prime) and associated hash projection (fibonacci, modulo).
Boost seems to be doing a decent job at this now.
Related
Group Actions and Hashing Unordered Multisets
Group actions are used to analyze hash functions for unordered sets and multisets, ensuring order-agnostic hashing. By leveraging group theory, particularly abelian groups, hash functions' structure is explored, emphasizing efficient and order-independent hashing techniques.
Benchmarking Perfect Hashing in C++
Benchmarking perfect hashing functions in C++ using clang++-19 and g++-13 reveals mph as the fastest with limitations. Various hash function implementations are compared for lookup time, build time, and size, aiding system optimization.
Ask HN: Fast data structures for disjoint intervals?
The author seeks recommendations for innovative data structures to improve read speeds for managing disjoint time intervals, noting that existing solutions often do not outperform simple ordered maps.
Unsafe Read Beyond of Death
The article details the "Unsafe Read Beyond of Death" optimization for the GxHash algorithm, enhancing performance through SIMD instructions and achieving over tenfold speed increases for small payloads while ensuring safety.
C++'s `Noexcept` Can (Sometimes) Help (Or Hurt) Performance
The article examines the mixed performance effects of the `noexcept` keyword in C++, revealing that its impact varies by compiler, OS, and code context, urging cautious application by developers.