June 25th, 2024

NUMA Emulation Yields "Significant Performance Uplift" to Raspberry Pi 5

Engineers at Igalia developed NUMA emulation for ARM64 that improves Raspberry Pi 5 performance. With the Linux kernel patches applied, Geekbench showed an 18% multi-core and a 6% single-core improvement. The code is concise and may be merged into the mainline kernel for broader benefit.

Engineers at Igalia have developed NUMA emulation for ARM64, aiming to boost performance on the Raspberry Pi 5. The emulation, implemented as a set of Linux kernel patches, showed an 18% improvement in multi-core performance and a 6% improvement in single-core performance in Geekbench tests. The patches divide physical RAM into chunks that are presented as separate NUMA nodes; with an allocation policy such as interleaving, consecutive allocations are spread across those chunks, so the BCM2712 memory controller can make better use of memory chip parallelism. The code for NUMA emulation on ARM64 is concise, adding around 100 lines. The gains observed on the Raspberry Pi 5 and other ARM64 systems have sparked interest in merging the feature into the mainline kernel. The work is seen as a promising way to improve performance on these systems through a simple change in memory management.
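The chunk-and-interleave idea can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not the code from the Igalia patches: the function names, the 4 KiB page size, and the 8 GiB / 4-node split are all hypothetical values chosen for illustration.

```python
PAGE_SIZE = 4096  # bytes; assumed page size, for illustration only


def split_into_nodes(ram_bytes, num_nodes):
    """Divide the physical range [0, ram_bytes) into equal emulated nodes."""
    chunk = ram_bytes // num_nodes
    return [(i * chunk, (i + 1) * chunk) for i in range(num_nodes)]


def interleaved_pages(nodes, count):
    """Hand out `count` pages, taking one page from each node in turn."""
    capacity = [(end - start) // PAGE_SIZE for start, end in nodes]
    if count > sum(capacity):
        raise MemoryError("not enough emulated memory for this request")
    used = [0] * len(nodes)
    pages = []
    node_id = 0
    while len(pages) < count:
        if used[node_id] < capacity[node_id]:
            base = nodes[node_id][0]
            pages.append((node_id, base + used[node_id] * PAGE_SIZE))
            used[node_id] += 1
        node_id = (node_id + 1) % len(nodes)  # round-robin to the next node
    return pages


if __name__ == "__main__":
    # Hypothetical configuration: 8 GiB of RAM split into 4 emulated nodes.
    nodes = split_into_nodes(8 * 2**30, 4)
    for node_id, addr in interleaved_pages(nodes, 8):
        print(f"node {node_id}: page at 0x{addr:09x}")
```

In this model, consecutive allocations land in different regions of physical memory rather than filling one region first, which is the property the interleaving policy relies on to keep more of the memory hardware busy at once.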

Related

More ARM Linux Laptops Are on the Way

More ARM Linux laptops are emerging, including Tuxedo Computers' "Drako" with Qualcomm's Snapdragon X Elite chipset to rival Apple's M2. This signals progress in ARM-based Linux devices, supported by Qualcomm's collaboration with Linaro for smoother integration. Challenges persist in ensuring compatibility and driver support, akin to Windows ARM laptops and Apple silicon MacBooks.

Arm64EC – Build and port apps for native performance on Arm

Arm64EC is a new ABI for Windows 11 on Arm devices, offering native performance benefits and compatibility with x64 code. Developers can enhance app performance by transitioning incrementally and rebuilding dependencies. Specific tools help identify Arm64EC binaries and guide the transition process for Win32 apps.

Zlib-ng 2.2 Speeds Up Compression By ~12% On x86_64 CPUs

The Zlib-ng 2.2 release candidate offers roughly 12% faster compression on x86_64 CPUs, along with revamped memory allocation, a modern API, and CPU intrinsics support. The improvements focus on memory allocation, fewer system calls, and more efficient processing of small buffers. Michael Larabel praises the improved compression speed and memory handling.

Testing AMD's Bergamo: Zen 4c

AMD's Bergamo server CPU, based on Zen 4c cores, prioritizes core count over clock speed for power efficiency and density. It targets cloud providers and parallel applications, emphasizing memory performance trade-offs.

Testing AMD's Giant MI300X

AMD introduces the Instinct MI300X to challenge NVIDIA in the GPU compute market. The MI300X features a chiplet design, Infinity Cache, and the CDNA 3 architecture; it performs competitively against NVIDIA's H100 and excels in local memory bandwidth tests.
