July 28th, 2024

Unsafe Read Beyond of Death

The article details the "Unsafe Read Beyond of Death" optimization for the GxHash algorithm, enhancing performance through SIMD instructions and achieving over tenfold speed increases for small payloads while ensuring safety.

Read original articleLink Icon
Unsafe Read Beyond of Death

The article discusses an optimization technique called "Unsafe Read Beyond of Death" (URBD) developed for the GxHash non-cryptographic hashing algorithm. The author, while working on GxHash during parental leave, aimed to enhance performance by utilizing SIMD (Single Instruction, Multiple Data) instructions, which allow parallel processing of multiple data elements. Traditional methods for handling uneven input lengths involve slower scalar operations, but GxHash avoids this by loading uneven parts into zero-padded SIMD registers.

To further improve speed, the author experimented with reading beyond the end of the input buffer, a risky operation that can lead to crashes. This method resulted in over a tenfold speed increase for small payloads, although it risks reading invalid memory. To mitigate this, a safety check ensures that the end of the buffer lies within the same memory page, which is typically 4KB in size.

Additionally, the article describes how to mask out invalid bytes and incorporate the length of the input into the hashing process to maintain accuracy. The final implementation of URBD allows for a safe and efficient way to handle uneven input lengths while maximizing performance. Benchmark results indicate that GxHash is now the fastest non-cryptographic hashing algorithm, particularly for small payloads, outperforming other algorithms significantly. The source code and benchmarks are available in the GxHash repository.

Link Icon 0 comments