November 5th, 2024

State of Python 3.13 Performance: Free-Threading

Python 3.13 introduces free-threading without the GIL, enhancing performance for parallel applications. However, current slowdowns due to interpreter limitations are expected to improve in future releases.

Python 3.13 introduces an experimental free-threaded mode that lets CPython run without the Global Interpreter Lock (GIL). The change aims to improve utilization of multi-core processors, which the GIL limited in previous versions. The release also features a new just-in-time (JIT) compiler and includes the mimalloc memory allocator.

The article examines free-threading through the PageRank algorithm, which is computationally intensive and benefits from parallelization. Traditional multiprocessing has drawbacks, such as high memory overhead and inter-process communication costs, that can hinder performance. The article compares single-threaded, multithreaded, and multiprocessing implementations of PageRank. The multithreaded approach on Python 3.13 without the GIL shows promise, but the free-threaded build currently runs slower because it disables the specializing adaptive interpreter.

Future releases, particularly Python 3.14, are expected to address these issues and make free-threading a more viable option for parallel applications. Overall, the free-threaded mode is a promising advancement, but it remains experimental and is not yet suitable for production use.
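To make the multithreaded variant concrete, here is a minimal sketch of a thread-parallel PageRank. It is not the article's implementation: the function name `pagerank_threaded`, its parameters, and the chunking strategy are all assumptions chosen for illustration, and the sketch assumes every node has at least one outgoing link and that every link target is itself a key in the graph. On a build with the GIL, the worker threads mostly take turns; only on a free-threaded build can they run on separate cores.

```python
from concurrent.futures import ThreadPoolExecutor


def pagerank_threaded(graph, damping=0.85, iterations=50, workers=4):
    """Hypothetical helper: graph maps each node to the nodes it links to.

    Assumes every node has at least one outgoing link and that every
    link target also appears as a key in `graph`.
    """
    nodes = list(graph)
    n = len(nodes)

    # Reverse adjacency: for each node, the nodes that link to it.
    incoming = {node: [] for node in nodes}
    for src, targets in graph.items():
        for dst in targets:
            incoming[dst].append(src)

    out_degree = {node: len(graph[node]) for node in nodes}
    ranks = {node: 1.0 / n for node in nodes}

    def update_chunk(chunk, current):
        # Only reads `current` and returns fresh scores for its own chunk,
        # so the worker threads never write to shared state.
        result = {}
        for node in chunk:
            inbound = sum(current[src] / out_degree[src] for src in incoming[node])
            result[node] = (1.0 - damping) / n + damping * inbound
        return result

    # Split the nodes into one interleaved chunk per worker.
    chunks = [nodes[i::workers] for i in range(workers)]

    with ThreadPoolExecutor(max_workers=workers) as pool:
        for _ in range(iterations):
            new_ranks = {}
            # Each iteration waits for all chunks before moving on.
            for part in pool.map(update_chunk, chunks, [ranks] * len(chunks)):
                new_ranks.update(part)
            ranks = new_ranks
    return ranks
```

Calling `pagerank_threaded({"a": ["b", "c"], "b": ["c"], "c": ["a"]})` returns a dictionary of scores that sum to roughly 1. Because each thread writes only to its own result dictionary, the sketch needs no locks. A multiprocessing version would instead have to copy or share the graph between processes, which is the memory and communication overhead the summary mentions.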

- Python 3.13 introduces free-threading, allowing execution without the GIL.

- The new JIT compiler and mimalloc allocator are also part of this release.

- Free-threading can significantly enhance performance for parallel applications.

- Current performance measurements show slowdowns with free-threading because the free-threaded build disables the specializing adaptive interpreter.

- Future Python releases are expected to improve the viability of free-threading for production use.
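
Since free-threading is a separate, experimental build, it can help to confirm which interpreter a benchmark is actually running on. The snippet below is a small sketch; the helper name `describe_gil` is hypothetical. It relies on the `Py_GIL_DISABLED` build variable and on `sys._is_gil_enabled()`, which was added in Python 3.13, and falls back to assuming the GIL is on for older interpreters.

```python
import sys
import sysconfig


def describe_gil():
    """Hypothetical helper: report the build type and runtime GIL status."""
    # Py_GIL_DISABLED is 1 on free-threaded builds and unset (None) or 0 otherwise.
    free_threaded_build = bool(sysconfig.get_config_var("Py_GIL_DISABLED"))

    # sys._is_gil_enabled() exists from Python 3.13 on; older versions always have a GIL.
    check = getattr(sys, "_is_gil_enabled", None)
    gil_enabled = check() if check is not None else True

    return {"free_threaded_build": free_threaded_build, "gil_enabled": gil_enabled}


print(describe_gil())
```

A free-threaded build can still report the GIL as enabled at runtime, for example when an imported extension module does not declare support for running without it, so checking both values is safer than checking only the build flag.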

11 comments
By @eigenspace - 6 months
I don't really have a dog in this race as I don't use Python much, but this sort of thing always seemed to be of questionable utility to me.

Python is never really going to be 'fast' no matter what is done to it because its semantics make most important optimizations impossible, so high performance "python" is actually going to always rely on restricted subsets of the language that don't actually match the language's "real" semantics.

On the other hand, a lot of these changes to try and speed up the base language are going to be highly disruptive. E.g. disabling the GIL will break tonnes of code, lots of compilation projects involve changes to the ABI, etc.

I guess getting loops in Python to run 5-10x faster will still save some people time, but it's also never going to be a replacement for the zoo of specialized python-like compilers because it'll never get to actual high performance territory, and it's not clear that it's worth all the ecosystem churn it might cause.

By @Decabytes - 6 months
I'm glad the Python community is focusing more on CPython's performance. Getting speed ups on existing code for free feels great. As much as I hate how slow Python is, I do think its popularity indicates it made the correct tradeoffs in regards to developer ease vs being fast enough.

Learning it has only continued to be a huge benefit to my career, as it's used everywhere, which underscores how important the popularity of a language can be for developers when evaluating languages for career choices.

By @ijl - 6 months
Performance for python3.14t alpha 1 is more like 3.11 in what I've tested. Not good enough if Python doesn't meet your needs, but this comes after 3.12 and 3.13 have both performed worse for me.

3.13t doesn't seem to have been meant for any serious use. Bugs in gc and so on are reported, and not all fixes will be backported apparently. And 3.14t still has unavoidable crashes. Just too early.

By @runjake - 6 months
If it were ever open sourced, I could see Mojo filling the performance niche for Python programmers. I'm hopeful because Lattner certainly has the track record, if he doesn't move on beforehand.

https://en.wikipedia.org/wiki/Mojo_(programming_language)

By @the5avage - 6 months
Can someone share insight into what was technically done to enable this? What replaced the global lock? Does the GC stop all threads during collection, or is there another locking mechanism?

By @biglost - 6 months
I'm not smart nor do I have any university title, but my opinion is that this is very good. Efforts should also go into removing features, though, and not just in Python. I get it, it would break everything.

By @0xDEADFED5 - 6 months
Nice benchmarks. Hopefully some benevolent soul with more spare time than I can pitch in on threadsafe CFFI.

By @santiagobasulto - 6 months
With these new additions it might make sense to have a synchronized block as in Java?

By @aitchnyu - 6 months
Are there web frameworks taking advantage of subinterpreters and free threading yet?