September 13th, 2024

Better-performing "25519" elliptic-curve cryptography

Amazon Web Services has improved the performance of x25519 and Ed25519 cryptographic algorithms, achieving significant gains and enhancing security through automated reasoning and architecture-specific optimizations in AWS LibCrypto.

Read original articleLink Icon
CuriositySkepticismExcitement
Better-performing "25519" elliptic-curve cryptography

Amazon Web Services (AWS) has enhanced the performance of its x25519 and Ed25519 elliptic-curve cryptography algorithms through automated reasoning and microarchitecture-specific optimizations. These algorithms, crucial for secure online communications, are implemented in AWS's open-source cryptographic library, AWS LibCrypto (AWS-LC). The recent updates include assembly-level implementations that improve both performance and correctness assurance. The optimizations leverage the unique characteristics of different CPU microarchitectures, specifically targeting x86_64 and Arm64 architectures. The new implementations have shown significant performance gains, with Ed25519 signing operations increasing by an average of 108% and x25519 operations by 113% across various microarchitectures. Additionally, the team employed constant-time execution strategies to mitigate side-channel attack risks. The correctness of these implementations is verified using automated reasoning techniques, ensuring that the algorithms function as intended. This work not only enhances the efficiency of cryptographic operations but also contributes to the security of AWS services, aligning with the growing demand for robust cryptographic solutions in the digital landscape.

- AWS has improved x25519 and Ed25519 cryptographic algorithms for better performance and correctness.

- The enhancements utilize automated reasoning and optimizations tailored to specific CPU architectures.

- Performance gains include a 108% increase in Ed25519 signing operations and a 113% increase in x25519 operations.

- Constant-time execution strategies are implemented to protect against side-channel attacks.

- The correctness of the implementations is verified using automated reasoning techniques.

AI: What people are saying
The comments on the article about AWS's improvements to x25519 and Ed25519 cryptographic algorithms reveal several key themes and points of discussion.
  • Performance Optimizations: Several commenters highlight existing optimized implementations of these algorithms, noting significant performance gains compared to OpenSSL.
  • Security Concerns: There are discussions about the potential impact of optimizations on security features, particularly regarding timing side channels and formal verification of the implementation.
  • Post-Quantum Cryptography: Some comments question the relevance of x25519 in post-quantum contexts, suggesting that its inclusion in modern protocols may be overstated.
  • Adoption and Usage: Commenters speculate on the potential for ed25519 to become the default key algorithm over RSA, given its performance advantages.
  • Technical Curiosity: There is a strong interest in the technical details of the implementation, including the use of formal methods and the implications of architecture-specific optimizations.
Link Icon 10 comments
By @SEJeff - 8 months
The firedancer team at one of the better HFT firms wrote an AVX512 optimized implementation of ed25519 and X25519 that’s significantly faster than OpenSSL.

https://github.com/firedancer-io/firedancer/pull/716

Ditto for sha256: https://github.com/firedancer-io/firedancer/pull/778

And sha512: https://github.com/firedancer-io/firedancer/pull/760

If you’re an optimization nerd, this codebase is wild.

By @nanolith - 8 months
The formal methods nerd in me is happy to see HOL Light being used to formally verify this implementation. I'm curious to see how closely their abstract machine models follow specific machine implementations. OOO, speculation, and deep pipelining have non-trivial impacts on potential side channels, and these vary quite a bit by stepping and architecture.
By @notfed - 8 months
> The x25519 algorithm also plays a role in post-quantum safe cryptographic solutions, having been included as the classical algorithm in the TLS 1.3 and SSH hybrid scheme specifications for post-quantum key agreement.

Really though? This mostly-untrue statement is the line that warrants adding hashtag #post-quantum-cryptography to the blogpost?

By @saghm - 8 months
My (probably naive) understanding is that 25519 already provided better performance than other algorithms used for similar purposes (e.g. RSA) when tuned for a roughly similar level of security; anecdotally, generating 2048-bit or larger RSA keys for me tends to be a lot slower than ed25519. At times I've run into places that require me to use RSA keys though (ironically, I seem to remember first experiencing this with AWS years back, although I honestly can't recall if this is still the case or not).

If this further improvement becomes widely used, it would be interesting to see if it's enough to tip the scales towards ed25519 being more of the de facto "default" ssh key algorithm. My experience is that a decent number of people still use RSA keys most of the time, but I don't feel like I have nearly enough of a sample size to conclude anything significant from that.

By @aseipp - 8 months
I was aware of s2n-bignum which is a very cool project, but apparently there is a larger sister project, aws-lc, that aims for broader set of APIs including OpenSSL compatibility, while retaining the general approach and vibe (lots of formal verification + performance work): https://github.com/aws/aws-lc

That's pretty sweet. I'm currently using BoringSSL in a project as a supplement to OpenSSL (mostly because it is much easier to build for Windows users than requiring them to fiddle with msys2/vcpkg etc; the alternative is to rely on the Windows CNG API, but it lacks features like ed25519 support.) I wonder how much effort it would take to use aws-lc instead... Not that I'm that interested, BSSL is pretty good, but free performance and heavy automated verification is always nice :)

Related: one of the authors of this post, John Harrison, wrote a really good book about automated theorm proving about 15 years ago while working on floating point verification at Intel -- there's still no other book quite like this one, I think https://www.cl.cam.ac.uk/~jrh13/

By @fefe23 - 8 months
Holy shit these claims are wild! It's not just a percent more performance here and there, the graphs look more like 50% more throughput on the same hardware (depending on the cpu architecture).

My immediate fear was that they optimized away the security features like absence of timing side channels, but they say they still have those.

They also claim to have formal proof of correctness, which is even more amazing, because they are not doing it on a symbolic level but on a machine instruction level. Apparently they tought their reasoning system the semantics of all the CPU instructions used in the assembler implementation.

I'll still wait what djb has to say about this, but it looks freaking amazing to me.

By @jonmon6691 - 8 months
I'm assuming when they say that this improves user experience, that it implies the use case is primarily TLS. In which case store-now-decrypt-later attacks are already considered an urgent threat with regard to post quantum crypto. With FIPS 203 being released and Chrome is already using an implementation based on the draft standard, this seems like this algo (at least for TLS) should be on its way out.
By @webXL - 8 months
Why don't they just focus on making a Gravitron variant with those algorithms in the circuitry?
By @londons_explore - 8 months
Does 25519 suffer from key/data-dependant execution time?

Is this implementation resistant to that?

If it isn't, it's kinda a footgun which shouldn't be published for general use.