Binary Search Tree with SIMD
Clément Jean presents a cache-friendly algorithm for binary search trees in Go, optimizing memory access with SIMD operations. The approach enhances search efficiency, demonstrated through benchmark tests, and suggests broader applications in data structures.
Read original articleClément Jean explores a cache-friendly algorithm for data structures like trees, focusing on binary search trees (BST) implemented in Go. The traditional array representation of a BST lacks cache efficiency, prompting the adoption of a new layout resembling parent-children triangles. This layout enables Single Instruction, Multiple Data (SIMD) operations on parent-child pairs for efficient searching. The article delves into the implementation details, including ARM64 assembly code for SIMD instructions. By loading elements into vectors and applying SIMD operations, the algorithm efficiently searches for elements in the BST. The process involves comparing elements, mapping indices, and updating search parameters iteratively. Benchmark tests demonstrate the performance benefits of the SIMD-based binary search compared to a traditional binary search. The article concludes by highlighting the potential applications of this cache-friendly BST algorithm within a frozen AVL Tree data structure. Readers are encouraged to explore further details on the implemented data structure and provide feedback for future improvements.
Related
B-Trees Require Fewer Comparisons Than Balanced Binary Search Trees
B-trees require fewer comparisons than balanced binary search trees due to better access locality. Larger k values in B-trees lead to tighter bounds, outperforming AVL trees for k>=8, offering practical advantages.
Do not taunt happy fun branch predictor
The author shares insights on optimizing AArch64 assembly code by reducing jumps in loops. Replacing ret with br x30 improved performance, leading to an 8.8x speed increase. Considerations on branch prediction and SIMD instructions are discussed.
How to implement a hash table in C (2021)
This article explains implementing a hash table in C, covering linear/binary search, hash table design, simple hash function, collision handling, resizing, and API design. Code snippets and GitHub repository link provided.
Modeling B-Trees in TLA+
Lorin Hochstein explores B-trees using TLA+, modeling operations like key-value retrieval and insertion. He emphasizes historical outputs and node structures, offering insights into B-tree functionality in databases.
Compact Fenwick trees for dynamic ranking and selection (2019)
The paper presents Compact Fenwick trees, enhancing array storage and operations efficiency. It introduces optimized variants for faster searches and reduced space usage, aiming to create a dynamic bit vector with logarithmic time complexity. Additionally, it proposes solutions to leverage CPU cache structures for enhanced performance.
I don't follow at all the example of searching for the number 62, deciding that the triplet 41-23-61 contains all numbers greater than 62, and then deciding to "access the 4th child" 73-67-79 — what is the 4th child in a binary tree? How do we determine the index to move to in our flattened array?
It would probably be better to read the original paper "FAST: Fast Architecture Sensitive Tree Search on Modern CPUs and GPUs" (https://doi.org/10.1145/1807167.1807206) if this is of interest.
I'd expect that the natural SIMD-extension to a binary search tree would be a 4-BTree (assuming 4-elements compared per SIMD unit).
For AVX512, I'd expect a 16-BTree to be used.
----------
EDIT:
> The paper mentionned at the beginning, propose to have a binary tree layed out like the following:
> And if you take time to understand how this maps back to the binary tree, you will notice that we are storing parent-children triangles. With this, we can now apply SIMD operations on both the parent and the children in order to either dtermine if the data we are looking for is in the triangle, or if we should continue our search.
Hmmmmmmm. So maybe they are making a B-Tree here except not calling it one?
Related
B-Trees Require Fewer Comparisons Than Balanced Binary Search Trees
B-trees require fewer comparisons than balanced binary search trees due to better access locality. Larger k values in B-trees lead to tighter bounds, outperforming AVL trees for k>=8, offering practical advantages.
Do not taunt happy fun branch predictor
The author shares insights on optimizing AArch64 assembly code by reducing jumps in loops. Replacing ret with br x30 improved performance, leading to an 8.8x speed increase. Considerations on branch prediction and SIMD instructions are discussed.
How to implement a hash table in C (2021)
This article explains implementing a hash table in C, covering linear/binary search, hash table design, simple hash function, collision handling, resizing, and API design. Code snippets and GitHub repository link provided.
Modeling B-Trees in TLA+
Lorin Hochstein explores B-trees using TLA+, modeling operations like key-value retrieval and insertion. He emphasizes historical outputs and node structures, offering insights into B-tree functionality in databases.
Compact Fenwick trees for dynamic ranking and selection (2019)
The paper presents Compact Fenwick trees, enhancing array storage and operations efficiency. It introduces optimized variants for faster searches and reduced space usage, aiming to create a dynamic bit vector with logarithmic time complexity. Additionally, it proposes solutions to leverage CPU cache structures for enhanced performance.