July 22nd, 2024

Arm's Neoverse V2, in AWS's Graviton 4

Amazon Web Services (AWS) has launched Graviton 4 with 96 Neoverse V2 cores, from Arm's latest high-performance line. It offers competitive core-to-core latency, memory access latency, and bandwidth, and its cores align closely with AMD's Zen 4 architecture.

Amazon Web Services (AWS) has introduced Graviton 4, which features 96 Neoverse V2 cores, the latest high-performance core line from Arm Ltd. The system architecture includes Arm’s CMN-700 mesh interconnect with 36 MB of shared L3 cache. Graviton 4 supports a dual-socket configuration with 192 cores and 1536 GB of DDR5. It shows competitive core-to-core and memory access latencies, although remote DRAM access incurs higher penalties. In terms of bandwidth, Graviton 4 outperforms older setups such as Milan-X.

Neoverse V2, running at up to 2.8 GHz, has an 8-component TAGE predictor for branch prediction and a triple-level BTB scheme for branch target caching. The core also features a 64 KB L1 instruction cache and a 1536-entry micro-op cache for improved frontend performance. Neoverse V2's out-of-order execution capabilities and integer cluster performance align closely with AMD’s Zen 4 architecture. Overall, Graviton 4 demonstrates competent performance in a dual-socket configuration, with a focus on balancing bandwidth, latency, and core efficiency.
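The summary mentions an 8-component TAGE predictor without saying how such a predictor works. As a rough illustration only, the Python sketch below models the general TAGE idea: one untagged base table plus several tagged tables indexed with geometrically growing slices of global branch history, where the longest-history table with a matching tag supplies the prediction. The class name `ToyTage`, the table sizes, history lengths, hash functions, and counter widths are all assumptions chosen for readability; none of them come from Arm or from the article, and the update policy is simplified compared with published TAGE designs.

```python
class ToyTage:
    """Toy TAGE-style predictor: 1 base table + 7 tagged tables (sizes are assumptions)."""

    def __init__(self, history_lengths=(2, 4, 8, 16, 32, 64, 128),
                 table_bits=10, tag_bits=8):
        self.history_lengths = history_lengths
        self.table_bits = table_bits
        self.tag_bits = tag_bits
        self.size = 1 << table_bits
        self.base = [0] * self.size  # signed counters, >= 0 means "taken"
        # Tagged entries are [tag, counter, useful] or None when empty.
        self.tables = [[None] * self.size for _ in history_lengths]
        self.history = 0  # global history as an int, newest outcome in bit 0

    def _fold(self, length, bits):
        # XOR-fold the newest `length` history bits down to `bits` bits.
        h = self.history & ((1 << length) - 1)
        folded = 0
        while h:
            folded ^= h & ((1 << bits) - 1)
            h >>= bits
        return folded

    def _slot(self, pc, i):
        length = self.history_lengths[i]
        index = (pc ^ self._fold(length, self.table_bits)) & (self.size - 1)
        tag = ((pc >> self.table_bits) ^ self._fold(length, self.tag_bits)) \
            & ((1 << self.tag_bits) - 1)
        return index, tag

    def _lookup(self, pc):
        # The longest-history component with a matching tag is the provider.
        for i in reversed(range(len(self.tables))):
            index, tag = self._slot(pc, i)
            entry = self.tables[i][index]
            if entry is not None and entry[0] == tag:
                return i, index, entry[1] >= 0
        index = pc & (self.size - 1)
        return -1, index, self.base[index] >= 0  # fall back to the base table

    def predict(self, pc):
        return self._lookup(pc)[2]

    def update(self, pc, taken):
        """Train on the real outcome; returns the prediction that was made."""
        provider, index, predicted = self._lookup(pc)
        step = 1 if taken else -1
        if provider < 0:
            self.base[index] = max(-2, min(1, self.base[index] + step))
        else:
            entry = self.tables[provider][index]
            entry[1] = max(-4, min(3, entry[1] + step))
            # Simplification: published TAGE variants adjust `useful` only
            # when the provider and the alternate prediction disagree.
            delta = 1 if predicted == taken else -1
            entry[2] = max(0, min(3, entry[2] + delta))
        if predicted != taken:
            self._allocate(pc, provider, taken)
        mask = (1 << self.history_lengths[-1]) - 1
        self.history = ((self.history << 1) | int(taken)) & mask
        return predicted

    def _allocate(self, pc, provider, taken):
        # On a misprediction, claim one entry in a longer-history component.
        longer = range(provider + 1, len(self.tables))
        for i in longer:
            index, tag = self._slot(pc, i)
            entry = self.tables[i][index]
            if entry is None or entry[2] == 0:
                self.tables[i][index] = [tag, 0 if taken else -1, 0]
                return
        # Nothing was free: age the candidates so a later miss can replace one.
        for i in longer:
            index, _ = self._slot(pc, i)
            self.tables[i][index][2] -= 1


def accuracy(predictor, pc, outcomes, last=100):
    """Fraction of correct predictions over the final `last` branches."""
    hits = [predictor.update(pc, t) == t for t in outcomes]
    return sum(hits[-last:]) / last


if __name__ == "__main__":
    loop_branch = [True, True, True, True, False] * 200  # taken 4x, then exit
    print(accuracy(ToyTage(), 0x40, loop_branch))
```

A short loop-exit pattern like the one in the usage lines is a case where a plain two-bit counter keeps mispredicting the exit, while a history-indexed table can learn it after a few iterations. This says nothing about how Neoverse V2 sizes or hashes its own tables.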

Related

Testing AMD's Bergamo: Zen 4c

AMD's Bergamo server CPU, based on Zen 4c cores, prioritizes core count over clock speed for power efficiency and density. It targets cloud providers and parallel applications, emphasizing memory performance trade-offs.

Benchmarking ARM Processors: Graviton 4, Graviton 3 and Apple M2

The blog post compares ARM processors, highlighting Graviton 4's enhanced performance over Graviton 3. Graviton 4 matches Apple M2 in URL parsing but lags in Unicode validation and JSON parsing. Despite falling behind Apple M2 in some tasks, Graviton 4 shows significant improvements over Graviton 3, especially in base64 encoding/decoding.

Qualcomm's Oryon Core: A Long Time in the Making

Qualcomm's Oryon Core, a product of the Nuvia acquisition, enhances Snapdragon X Elite with 12 cores in quad-core clusters for high performance and efficiency. It competes with AMD and Intel, boasting unique features for improved performance in the mobile processor market.

AWS Graviton4 Benchmarks Prove to Deliver the Best ARM Cloud Server Performance

AWS released Graviton4 processors for R8g instances, boasting 30% better performance than Graviton3. Featuring 96 Neoverse-V2 cores, DDR5-5600 memory, and enhancements, Graviton4 excels in web apps, databases, and Java software. Benchmark tests show competitive performance against AMD and Intel. Ampere Computing's ARM64 processors were not included due to limited availability.

An interview with AMD's Mike Clark, 'Zen Daddy' says 3nm Zen 5 is coming fast

AMD's Mike Clark discusses Zen 5 architecture, covering 4nm and 3nm nodes. 4nm chips launch soon, with 3nm to follow. Zen 'c' cores may integrate into desktop processors. Zen 5 enhances Ryzen CPUs with full AVX-512 acceleration, emphasizing design balance for optimal performance.

5 comments
By @adrian_b - 4 months
It is interesting that a Neoverse V2 core has the same area as a Zen 4c core (the low-frequency compact variant of Zen 4).

It is true that at this equal area the Neoverse V2 core includes an additional 1 MB of L2 cache memory, but the larger L2 cache is not enough to make it reach the performance of Zen 4c for applications that are not limited by memory bandwidth (in bandwidth-limited applications, Graviton 4 may win).

While Neoverse V2 has a lower, but nonetheless acceptable, performance in comparison with the old Zen 4, it is likely that it also has a lower power consumption, therefore lower operating costs for Amazon, but the value is not disclosed by Amazon. In any case, because the Graviton 4 instances are offered at a lower price, they may be preferable for many applications.

However, the new Zen 5 will be in a different performance league. Arm has also announced the successor of Neoverse V2, i.e. Neoverse V3, which is presumably derived from Cortex X4. That will be a faster core, but the differences between Neoverse V2 and Neoverse V3 are much smaller than those between Zen 4 and Zen 5, so the advantage of Zen 5 over Neoverse V3 will be greater, in everything except possibly the power consumption.

By @janice1999 - 4 months
AMD was originally planning to have an ARM CPU that would be compatible with its AM4 socket. I believe their first customer was Amazon but Amazon dropped them for an in-house design that became Graviton due to performance reasons. Given that AMD was not in a good financial state at the time, it made sense to concentrate on Zen. It would probably be a very different world if AMD had managed to debut a 'Zen' inspired ARM CPU alongside its first batch of Zen processors.

By @tedunangst - 4 months
Amazing times that there are two CPU architectures from five vendors that mostly work.

By @fulafel - 4 months
The memory latency graph is pretty dramatic, also for the other systems vs the older Westmere Xeon. Intel was doing something right in 2010.