October 2nd, 2024

An extensive benchmark of C and C++ hash tables

The article introduces a benchmarking suite for C and C++ hash-table libraries and evaluates their performance across several configurations and operations. It notes two limitations: memory usage is not measured, and performance at lower load factors is not tested.

This article presents a benchmarking suite for C and C++ hash-table libraries, addressing a gap in existing benchmarks, which focus primarily on C++ implementations. The author, Jackson Allan, aims to show how various hash tables perform under specific conditions, particularly for C programmers who may overlook newer libraries. The setup includes three configurations: 32-bit integer keys with cheap operations, 64-bit integer keys with expensive operations, and 16-character C-string keys with costly hash and comparison functions. Every table is tested with the same hash functions and a maximum load factor of 0.875. The benchmarks time operations including insertion, deletion, and lookup. The author acknowledges two limitations: memory usage is not measured, and performance at lower load factors is not tested. The C++ tables benchmarked include absl::flat_hash_map, ankerl::unordered_dense, and boost::unordered_flat_map, among others, each with its own design features and performance characteristics.

- The benchmarking suite focuses on both C and C++ hash tables, addressing a lack of C table coverage.

- Three configurations are used to evaluate performance under different conditions.

- The benchmarks measure various operations, including insertion, deletion, and lookup times.

- Limitations include the exclusion of memory usage metrics and performance at lower load factors.

- Several C++ hash tables are benchmarked, each with distinct design features and performance characteristics.

Benchmarking Perfect Hashing in C++

Benchmarking perfect hashing functions in C++ using clang++-19 and g++-13 reveals mph as the fastest with limitations. Various hash function implementations are compared for lookup time, build time, and size, aiding system optimization.

How to implement a hash table in C (2021)

This article explains implementing a hash table in C, covering linear/binary search, hash table design, simple hash function, collision handling, resizing, and API design. Code snippets and GitHub repository link provided.

Ask HN: Fast data structures for disjoint intervals?

The author seeks recommendations for innovative data structures to improve read speeds for managing disjoint time intervals, noting that existing solutions often do not outperform simple ordered maps.

`noexcept` affects libstdc++'s `unordered_set`

The article examines the impact of the `noexcept` specifier on `std::unordered_set` performance in libstdc++, highlighting optimization opportunities and advocating for improvements to handle hash function efficiency better.

Questioning the Criteria for Evaluating Non-Cryptographic Hash Functions

The article evaluates non-cryptographic hash functions, emphasizing efficiency and uniform output distribution. It discusses the avalanche criterion, performance testing, and the importance of selecting hash functions based on application needs.

1 comment

By @unclad5968 - 3 months

Why does the STL use such a poor implementation? It says its performance is due to chasing pointers and allocating/freeing each node individually. Are STL implementers not able to change the details due to backwards compatibility?
