August 10th, 2024

AMD's Strix Point: Zen 5 Hits Mobile

AMD has launched its Zen 5 architecture with the Ryzen AI 9 HX 370, featuring a dual-cluster design, enhanced multithreaded performance, and competitive memory bandwidth against Intel's processors.

AMD has introduced its latest CPU architecture, Zen 5, in the Ryzen AI 9 HX 370, part of the Strix Point APU family. Unlike in previous generations, the mobile parts are launching alongside their desktop counterparts. Strix Point uses a dual-cluster design with 12 Zen 5 cores, split between a high-performance cluster and a density-optimized cluster, aimed at improving multithreaded performance. Compared with Zen 4, the architecture adds a more capable branch predictor, larger caches, and a wider execution pipeline. Its branch target buffer (BTB) arrangement lets Zen 5 track a large number of branch targets efficiently, and the fetch and decode stages are organized into clusters to improve instruction delivery. The backend has also grown, with a larger reorder buffer and unified schedulers. The Ryzen AI 9 HX 370 uses a 128-bit LPDDR5-7500 memory setup, which gives it bandwidth competitive with Intel's offerings. Overall, Zen 5 is a significant step for AMD's CPU designs and strengthens the company's position in the mobile processor market.

- AMD's Zen 5 architecture is now available in mobile form with the Ryzen AI 9 HX 370.

- The Strix Point APUs feature a dual-cluster design for improved multithreaded performance.

- Significant enhancements include a more capable branch predictor and larger caches.

- The architecture allows for efficient instruction delivery and execution with improved backend resources.

- The Ryzen AI 9 HX 370 offers competitive memory bandwidth compared to Intel's processors.

AI: What people are saying
The comments on AMD's Zen 5 architecture reveal various perspectives and concerns regarding its performance and implications in the market.
  • Users express interest in the battery life of the Ryzen AI 9 HX 370, noting its competitiveness with ARM architectures.
  • There is a discussion about the efficiency of ARM versus x86 CPUs, with some suggesting that AMD and Intel may lag behind Apple and Qualcomm in design efficiency.
  • Commenters inquire about the availability of mini PCs with Zen 5 and seek clarification on technical terms like "strix point."
  • Concerns are raised about memory bandwidth limitations and their potential impact on applications like LLM inference.
  • Some users advocate for better fan control options in laptops to enhance user experience.
12 comments
By @jml7c5 - 3 months
I sit firm in my belief that the best thing Microsoft could do for their laptop ecosystem is to add support for a "max fan speed" slider somewhere prominent in the Windows UI.

People want the option to make their laptop silent or nearly silent. And when users do need the power, they generally prefer a slightly slower laptop at a reasonable volume rather than the roar of a jet engine.

Laptop manufacturers want their devices to score high on benchmarks. The best way to do that is to add a fan that can become very loud.

The incentives are not aligned.

All laptops should be designed to operate passively 100% of the time, if the owner so chooses. I doubt manufacturers will go that route unless Microsoft nudges them towards it. It would have downstream effects on how review sites benchmark laptops (i.e., at various power draws/noise levels producing a curve rather than a single number), which would have downstream effects on what CPU designers optimize for. It'd be great for consumers.

By @sm_1024 - 3 months
IMO, the most interesting thing about this line is the battery life: within an hour of the M3 MacBook Pro and within 2 hours of Asus's Qualcomm laptop, which makes it comparable to ARM-based machines [1].

That is a little surprising, because ARM is commonly believed to be much more power efficient than x86.

[1] https://youtu.be/Z8WKR0VHfJw?si=A7zbFY2lsDa8iVQN&t=277

By @aurareturn - 3 months
One of these has to be true (or both true):

1. ARM is inherently more efficient than x86 CPUs in most tasks

2. Nuvia and Apple are better CPU designers than AMD and Intel

Here are results from Notebookcheck:

Cinebench R24 ST perf/watt

* M3: 12.7 points/watt

* X Elite: 9.3 points/watt

* AMD HX 370: 3.74 points/watt

* AMD 8845HS: 3.1 points/watt

* Intel 155H: 3.1 points/watt

In ST, Apple is 3.4x more efficient than Zen 5. X Elite is about 2.5x more efficient than Zen 5.

Cinebench R24 MT perf/watt

* M3: 28.3 points/watt

* X Elite: 22.6 points/watt

* AMD HX 370: 19.7 points/watt

* AMD 8845HS: 14.8 points/watt

* Intel 155H: 14.5 points/watt

In MT, Apple is 1.9x more efficient than Zen 4 and 1.4x more efficient than the HX 370. I expect M3 Pro/Max to increase the gap because, generally, more cores means more efficiency in Cinebench MT. X Elite is also more efficient, but the gap is smaller. However, we should note that in a laptop, ST matters more for efficiency because usage is bursty. It's easier to gain MT efficiency as long as you have many cores and run them at lower wattage. In this case, AMD's 12-core, 24-thread Zen 5 setup works well in Cinebench. Cinebench loves more threads.

One thing that is intriguing is that X Elite does not have little cores, which hurts its MT efficiency. It's likely a remnant of Nuvia designing a server CPU, which does not need big.LITTLE, but Qualcomm used it in a laptop SoC first.

Sources: https://www.youtube.com/watch?v=ZN2tC8DfJnc

https://www.notebookcheck.net/AMD-Zen-5-Strix-Point-CPU-anal...
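The efficiency multiples in the comment above follow directly from the quoted points-per-watt figures. The short Python sketch below recomputes them; the dictionary and function names are illustrative choices, and the numbers are simply those quoted in the comment, not independent measurements.

```python
# Cinebench R24 points-per-watt figures exactly as quoted in the comment
# above (attributed there to Notebookcheck). Not independently measured.
ST = {"M3": 12.7, "X Elite": 9.3, "AMD HX 370": 3.74,
      "AMD 8845HS": 3.1, "Intel 155H": 3.1}
MT = {"M3": 28.3, "X Elite": 22.6, "AMD HX 370": 19.7,
      "AMD 8845HS": 14.8, "Intel 155H": 14.5}


def ratios(table, baseline):
    """Return each chip's perf/watt as a multiple of the baseline chip's."""
    base = table[baseline]
    return {chip: value / base for chip, value in table.items()}


st_vs_zen5 = ratios(ST, "AMD HX 370")   # single-thread, relative to Zen 5
mt_vs_zen5 = ratios(MT, "AMD HX 370")   # multi-thread, relative to Zen 5
mt_vs_zen4 = ratios(MT, "AMD 8845HS")   # multi-thread, relative to Zen 4

for label, table in [("ST vs HX 370", st_vs_zen5),
                     ("MT vs HX 370", mt_vs_zen5),
                     ("MT vs 8845HS", mt_vs_zen4)]:
    print(label)
    for chip, r in table.items():
        print(f"  {chip}: {r:.2f}x")
```

With these inputs, the M3 comes out near 3.4x the HX 370 in ST and near 1.4x in MT, while the X Elite's ST ratio is 9.3 / 3.74, or roughly 2.5x.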

By @mshockwave - 3 months
Switching from individual schedulers to a unified one for integer execution makes sense to me, but I still don’t quite understand why the FP execution units do the opposite. Could somebody explain why?
By @EVa5I7bHFq9mnYK - 3 months
Do all new processors include NPUs now? What if I don't need AI, should I still pay for the unneeded transistors? If only things could be made modular/reconfigurable.
By @apatheticonion - 3 months
Are there any mini PCs with Zen 5?
By @CyberDildonics - 3 months
What does "strix point" mean?
By @imtringued - 3 months
>Read bandwidth from a single cluster caps out at just under 62 GB/s. The memory controller has a bit more bandwidth on tap, but you’ll need to load cores from both clusters to get it.

Except that for LPDDR5-7500 it isn't just "a bit more"; it is actually double, at 120 GB/s. This might pose a challenge for LLM inference, which absolutely needs the full 120 GB/s.
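The 120 GB/s figure is the theoretical peak of a 128-bit bus at 7500 MT/s. The sketch below works through that arithmetic and compares it with the roughly 62 GB/s single-cluster read limit quoted above. The function names are illustrative, and the 4 GB model size in the last step is a purely hypothetical value used to show why bandwidth bounds token throughput; it does not come from the article or the comments.

```python
def peak_bandwidth_gbs(bus_width_bits: int, transfer_rate_mts: int) -> float:
    """Theoretical peak DRAM bandwidth in GB/s (1 GB = 1e9 bytes)."""
    bytes_per_transfer = bus_width_bits / 8
    return bytes_per_transfer * transfer_rate_mts * 1e6 / 1e9


def max_tokens_per_second(bandwidth_gbs: float, weights_gb: float) -> float:
    """Rough upper bound for memory-bound decoding, assuming every
    generated token requires reading all model weights once."""
    return bandwidth_gbs / weights_gb


peak = peak_bandwidth_gbs(128, 7500)   # 128-bit LPDDR5-7500
single_cluster = 62.0                  # GB/s, "just under 62" per the quote

print(f"peak: {peak:.0f} GB/s")
print(f"peak / single-cluster limit: {peak / single_cluster:.2f}x")

# HYPOTHETICAL model size, chosen only for illustration.
weights_gb = 4.0
print(f"bound at peak: {max_tokens_per_second(peak, weights_gb):.1f} tok/s")
print(f"bound on one cluster: "
      f"{max_tokens_per_second(single_cluster, weights_gb):.1f} tok/s")
```

Under these assumptions a single cluster can reach only about half of the theoretical peak, so a memory-bound workload would need threads on both clusters to approach it. Real throughput would be lower than either bound.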

By @nullc - 3 months
Speaking of Zen 5, are there any rumors on when the 128-core Turin-X will ship?