July 8th, 2024

Python Has Too Many Package Managers

Python's package management ecosystem faces fragmentation issues. PEP 621 introduced pyproject.toml for project configurations, leading to new package managers like Poetry. Conda offers robust dependency management, especially for data science workflows.

Python's package management ecosystem has long been criticized for its fragmentation and lack of a unified tool akin to Cargo for Rust or npm for JavaScript. Various tools like pip, venv, pyenv, and pipenv have attempted to address this issue, but each comes with its own set of limitations and complexities. The recent acceptance of PEP 621 aimed to consolidate Python project configurations into a pyproject.toml file, leading to the emergence of new package managers like Poetry, PDM, Flit, and Hatch. Among these, Poetry stands out for its comprehensive approach to dependency resolution and virtual environment management, although it still faces challenges with slow resolution times and potential issues with dependency bounds. Additionally, the Conda ecosystem, spearheaded by tools like conda and mamba, offers a robust solution for managing Python and non-Python dependencies, particularly catering to data science workflows. While Conda's approach may not be ideal for all use cases, it remains a popular choice for data scientists due to its comprehensive features and integration with key Python tools like Ray and Metaflow.

Related

What's up Python? Django get background tasks, a new REPL, bye bye gunicorn

Several Python updates include Django's background task integration, a new lightweight Python REPL, Uvicorn's multiprocessing support, and PyPI blocking outlook.com emails to combat bot registrations, enhancing Python development and security.

Maker of RStudio launches new R and Python IDE

Posit introduces Positron, a new beta IDE merging R and Python development. Built on Visual Studio Code, it offers a user-friendly interface, data exploration tools, and seamless script running for polyglot projects.

Python grapples with Apple App Store rejections

Python 3.12 faced rejections in Apple's App Store due to the "itms-services" string. Python developers discussed solutions, leading to a consensus for Python 3.13 with an "--with-app-store-compliance" option to address the issue.

Python Modern Practices

Recommended Python practices include using mise or pyenv to manage multiple versions, running the latest Python version, and using pipx to run applications. Project tips include the src layout, pyproject.toml, virtual environments, Black, flake8, pytest, wheel, type hinting, f-strings, datetime, enum, named tuples, data classes, breakpoint(), logging, and TOML configuration for efficiency and maintainability.

Reproducibility in Disguise

Reproducibility in software development is supported by tools like Bazel, addressing lifecycle challenges. Vendor dependencies for reproducibility face complexity, leading to proposed solutions like vendoring all dependencies for control.

36 comments
By @slt2021 - 6 months
I see a lot of package managers I never heard of; I am a happy venv & pip user.

  One of the key faults of pip is what happens when you decide to remove a dependency. Removing a dependency does not actually remove the sub-dependencies that were brought in by the original dependency, leaving a lot of potential cruft.

This is not really an issue if your virtual environments are disposable. Just nuke and recreate venv from scratch using only what you need.

This is a similar approach to "zero-based budgeting". It forces you to carefully pick your dependencies and think about what you carry.

I never mention transitive dependencies in my requirements.txt file, just direct dependencies and rely on pip to install all transitive libs.

You don't even have to freeze the version; just list the name and pull the latest version whenever you run pip install --upgrade.

If you don't do that, you can quickly go down JavaScript's path of bloated node_modules.

Can people explain why venv and pip are a bad solution that doesn't work for them, so that they have to resort to other package managers?

Even venv is not really required if you dockerize your python apps, which you will have to do anyways at deploy time
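
The disposable-environment workflow described above can be sketched with the standard library's `venv` module. This is a minimal sketch, not the commenter's actual setup: the helper names `recreate_venv` and `install_direct_dependencies` are hypothetical, and it assumes a `requirements.txt` that lists only direct dependencies.

```python
import subprocess
import sys
import venv
from pathlib import Path


def recreate_venv(env_dir: Path, with_pip: bool = True) -> Path:
    """Wipe env_dir if it exists, create a fresh venv, return its interpreter path."""
    # clear=True deletes the contents of an existing environment directory first,
    # so packages left behind by removed dependencies cannot survive.
    venv.EnvBuilder(clear=True, with_pip=with_pip).create(env_dir)
    if sys.platform == "win32":
        return env_dir / "Scripts" / "python.exe"
    return env_dir / "bin" / "python"


def install_direct_dependencies(python: Path, requirements: Path) -> None:
    """Install the listed packages; pip resolves the transitive ones itself."""
    subprocess.run(
        [str(python), "-m", "pip", "install", "--upgrade", "-r", str(requirements)],
        check=True,
    )
```

Calling `recreate_venv(Path(".venv"))` and then `install_direct_dependencies(...)` with the returned interpreter path and `Path("requirements.txt")` gives the "nuke and recreate" cycle; the install step needs network access.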

By @sealedservant - 6 months
I concur and I also think that there are too many build backends.

pdm is my current favorite package manager. It is fully PEP-compliant and the lockfile generation is nice. I wouldn't call hatch a package manager because I don't think it can make lockfiles.

uv is on my radar but it doesn't look ready for primetime yet. I saw they are building out a package management API with commands such as `uv add` and `uv remove`. Cross-platform lockfiles, editable installs, in-project .venv, and a baked-in build backend might be enough for me to make the switch. It's my pipe dream to get the full build/install/publish workflow down to a single static binary with no dependencies.

Anna-Lee Popkes has an awesome comparison of the available package managers [0] complete with handy Venn diagrams.

The pyOpenSci team has another nice comparison of the available build backends [1].

[0] https://alpopkes.com/posts/python/packaging_tools/

[1] https://www.pyopensci.org/python-package-guide/package-struc...

By @wantsanagent - 6 months
I use pip. I plan to continue using pip. If I need an isolated environment, I use conda, but then I install everything with pip. If I need to guarantee versions I pip freeze.

There's a lot of cruft and desire for a one-size-fits-all solution but the base tools are probably good enough. My setup is not the one-size-fits-all solution but it works for me, and my team, and lots of other teams.

Beware anyone who tells you that thirty years of tooling doesn't have a solution to the problem you're facing and you need this new shiny thing.*

*Playing with shiny things is fun and should not be discouraged, but must not be mandated either.

By @korijn - 6 months
I have worked with poetry professionally for about 5 years now and I am not looking back. It is exceptionally good. Dependency resolution speed is not an issue beyond the first run since all that hard to acquire metadata is actually cached in a local index.

And even that first run is not particularly slow - _unless_ you depend on packages that are not available as wheels, which last I checked is not nearly as common nowadays as it was 10 years ago. However it can still happen: for example, if you are working with python 3.8 and you are using the latest version of some fancy library, they may have already stopped building wheels for that version of python. That means the package manager has to fall back to the sdist, and actually run the build scripts to acquire the metadata.

On top of all this, private package feeds (like the one provided by azure devops) sometimes don't provide a metadata API at all, meaning the package manager has to download every single package just to get the metadata.

The important bit of my little wall of text here though is that this is all true for all the other package managers as well. You can't necessarily attribute slow dependency resolution to a solver being written in C++ or pure python, given all of these other compounding factors which are often overlooked.

By @nrclark - 6 months
I ship a lot of Python in a CI/CD or devops context, and also deploy it to embedded targets. I've never needed anything more than pip, a venv, and pip-tools (to provide pip-compile). Venvs are treated as disposable. The basic workflow I use is:

One-time (or as-needed for manual upgrades):

   1. Make a venv with setuptools, wheel, and pip-tools (to get pip-compile) installed.
   2. Use venv's pip-compile to generate a fully-pinned piptools-requirements.txt for the venv.
   3. Check piptools-requirements.txt into my repo. This is used to get a stable, versioned `pip-compile` for use on my payload requirements.

During normal development:

   1. Add high-level dependencies to a `requirements.in` file. Usually unversioned, unless there's a good reason to specify something more exact.
   2. On changes to `requirements.in`, make a venv from `piptools-requirements.txt` and its `pip-compile` to solve `requirements.in` into a fully-pinned `requirements.txt`.
   3. Check requirements.in and requirements.txt into the repo.
   4. Install packages from requirements.txt when making the venv that I need for production.

This approach is very easy to automate, CI/CD friendly, and completely repeatable. It doesn't require any nonstandard tools to deploy (and only needs pip-compile when recompiling requirements.txt). It also makes a clear distinction between "what packages do the developers actually want?" and "what is the fully-versioned set of all dependencies".

It's worked great for me over the years, and I'd highly recommend it as a reliable way to use the standard Python package tooling.
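
The compile step in the workflow above could be automated roughly like this. It is a sketch under assumptions, not the commenter's script: `pip_compile_command` and `compile_requirements` are hypothetical helper names, and `tool_python` is assumed to be the interpreter of a venv that already has pip-tools installed.

```python
import subprocess
from pathlib import Path


def pip_compile_command(tool_python: Path, source: Path, output: Path) -> list[str]:
    """Build the command that pins `source` into `output` with pip-tools."""
    # `python -m piptools compile` is the module form of the `pip-compile` script.
    return [
        str(tool_python), "-m", "piptools", "compile",
        "--output-file", str(output),
        str(source),
    ]


def compile_requirements(tool_python: Path, source: Path, output: Path) -> None:
    """Run pip-compile; raises CalledProcessError if resolution fails."""
    subprocess.run(pip_compile_command(tool_python, source, output), check=True)
```

Keeping command construction separate from execution makes the step easy to log or dry-run in CI before anything touches the network.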

By @VagabundoP - 6 months
Use Rye. It wasn't abandoned; its ownership was transferred.

Rye uses other pretty standard stuff under the hood, tools that follow PEPs; it's just a sane front end. uv is fast as well. It downloads the pinned version of standalone Python, it keeps everything in its own venv, and there's very little messing with or tweaking of the environment.

It is messy, although it's getting better. I doubt everything will ever standardise on one tool, however.

By @artexy - 6 months
The people who say "just use pip and venv" don't understand the issue.

Distutils has been ripped out of Python core, setuptools is somewhat deprecated but not really. Just don't call setup.py directly. Or use flit. Or perhaps pyproject.toml? If the latter, flit, poetry and the 100 other frontends all have a different syntax.

Would you like to copy external data into the staging area while using flit? You are out of luck, use poetry. Poetry versions <= 1.2.3.4.5.111 on the other hand do not support a backend.

Should you use pip? Well, the latest hotness are the search-unfriendly named modules "build" and "install", which create a completely isolated environment. Which is the officially supported tool that will stay supported and not be ripped out like distutils?

Questions over questions. And all that for creating a simple zip archive of a directory, which for some reason demands gigantic frameworks.

By @poikroequ - 6 months
Poetry isn't perfect but I'm happy to have it. It installs most packages without issue and effortlessly handles multiple versions of Python on the same system.

In my experience, Poetry works much better than, say... ... npm.

By @beaugunderson - 6 months
I share the author's excitement about `uv`. I was a big fan of `pip-compile` because it was the simplest possible way to have clear top-level dependencies that also froze sub-dependencies (you note your top-level dependencies in `requirements.in`, then use `pip-compile` to freeze those plus the sub-dependencies in `requirements.txt`, and it adds comments noting what top-level dependency brought it in).

`uv` is basically that but faster.

By @LarsDu88 - 6 months
Python actually now has over a dozen package managers, none of which are as good as Cargo in Rust. Here's my attempt at a nearly comprehensive rundown.
By @actionfromafar - 6 months
After all, the motto is "there's more than one way to do it"!
By @jen20 - 6 months
I've found Nix to be basically the be-all-and-end-all of making Python work reasonably. Mitchell Hashimoto put out a post about how to actually do that a while ago [1].

[1]: https://mitchellh.com/writing/nix-with-dockerfiles

By @SethMLarson - 6 months
Happily using pip, venv, and pip-tools for every project and still finding them more than suitable. They might not have the marketing budget or pizazz of others, but if you're looking for effective and boring tools that get the job done so you can solve more interesting problems they work just fine.
By @simonw - 6 months
On Rye: "This project was ultimately abandoned by its author in 2023 and given to Astral.sh in favor of supporting uv instead"

I don't think that's quite the right way to frame this. Handing Rye over to a company that could maintain it full time isn't the same thing as "abandoning" it - and the new maintainers are active on that project: https://github.com/astral-sh/rye/commits/main/

By @forgottofloss - 6 months
I made https://pip.wtf, which is a "god damn it, I'm doing this myself" alternative for single-file scripts that just need some basic deps. You paste some code into your script and then it installs dependencies to a local directory.
By @woodruffw - 6 months
> Naturally this led to a proliferation of new Python package managers which leverage the new standard. Enter poetry, PDM, Flit, and Hatch.

An important qualification: Poetry uses pyproject.toml, but it doesn't use the standard (i.e. PEP 518 and PEP 621) metadata layout. In practice this means that it doesn't follow the standard; it just happens to (confusingly) use a file with the same name.

To the best of my knowledge, the others fully comply with the listed PEPs. In practice this means that the difference between them is abstracted away by tools like `build`, thanks to PEP 517.

By @SirMaster - 6 months
This is what I never liked about python. How much work it could be to get some random python program from GitHub up and running.

Compare that to .NET, where I can compile a framework-independent, single-file executable.

By @odie5533 - 6 months
I disagree with the title, but it's an okay rundown of the various package managers. Wish the author had tried out hatch though since it seems good. Also rye is not abandoned. You can see the repo has updates within the last 24 hours for a new release. I think they want uv to eat rye, but that hasn't happened yet.

My current favorites are uv + mise. Handles lockfiles, multiple versions of python, and it's very fast since uv is very fast. Have not tried pdm or hatch though.

By @chpatrick - 6 months
I really like Nix for Python, as long as the packages I need are already packaged. Otherwise I'll use pip and venv to try stuff out. Can't stand conda.
By @foundart - 6 months
Package management has been a problem almost every time I have dabbled in Python. This is a great overview of the situation and will save me time the next time.
By @p1esk - 6 months
Looks like Conda is still the best package/env manager for ML engineers.
By @aborsy - 6 months
I use Python but I'm not a professional Python programmer. I use pip and venv, with requirements.txt. It copies the Python binary and some of the shared standard libraries from the system, installs the required packages in site-packages, and it always works fine.

Am I missing anything major by not using conda and Poetry?

By @mherrmann - 6 months
In 10 years of using Python mostly full-time, I have not yet had a project where pip and venv were not enough.
By @yegle - 6 months
Why vendoring is not a common practice baffles me, especially since the leftpad incident happened over 8 years ago.

For Python, you can use `pip wheel` (https://pip.pypa.io/en/stable/cli/pip_wheel/) to download .whl files of your dependencies into a folder, add that folder to your version control, and update `sys.path` to include your .whl files.

For updating packages, you run `pip wheel` again and check in the new .whl files after carefully reviewing the changes.
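
The `sys.path` step might look like the sketch below. The `add_vendored_wheels` helper and the vendor directory are hypothetical, and there is an important caveat: importing straight from a .whl relies on Python's zip import support, so it can only work for pure-Python wheels, and the wheel format is not designed to guarantee it.

```python
import sys
from pathlib import Path


def add_vendored_wheels(vendor_dir: Path) -> list[str]:
    """Put every .whl file in vendor_dir on sys.path and return the added entries."""
    added = []
    for wheel in sorted(vendor_dir.glob("*.whl")):
        entry = str(wheel.resolve())
        if entry not in sys.path:
            # Insert at the front so vendored copies win over site-packages.
            sys.path.insert(0, entry)
            added.append(entry)
    return added
```

Calling `add_vendored_wheels(Path("vendor"))` early in the program's entry point, before any third-party import, is one way to wire it in. Wheels containing compiled extensions would still need to be installed normally.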

By @tandav - 6 months
There is also virtualenvwrapper. It's quite handy for creating, listing, and removing virtual environments. I prefer to store all venvs in ~/.cache/virtualenvs instead of .venv in the project directory, which keeps things cleaner: no need to exclude it from backups or the git repository.

https://virtualenvwrapper.readthedocs.io/en/latest/

By @VGHN7XDuOXPAzol - 6 months
When we did a comparison of package managers that lock dependencies, we wrote up some interesting notes at https://github.com/matrix-org/synapse/issues/11537#issuecomm...

Notable omission in pip-tools which many are suggesting here as being simpler: it can't write requirements files for multiple environments/platforms without running it once for each of those environments and having one file for all of them.

We settled on Poetry at the time but it has been quite unstable overall. Not so bad recently but there were a lot of issues/regressions with it over time.

For this reason I am happy to see new takes on package management, hopefully some of these will have clearer wins over the others, where you have to spend ages trying to figure out which one will do what you need.

By @dogaar - 6 months
One thing this blog does not question is whether package managers are needed at all. Look at how Deno does it: packages are imported from URLs and downloaded to a local cache at run-time. This allows CDN-based package distribution, but you still need a package search engine.
By @g8oz - 6 months
I just want to point out that Composer for PHP is a great experience.
By @AtlasBarfed - 6 months
"Finally, because dependency resolution is a directed acylic graph (DAG)"

... Is it? :-)

Mwahahaha
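
One reading of the joke is that real dependency graphs are not guaranteed to be acyclic. A minimal sketch with the standard library's `graphlib` shows what happens in each case; the package names and the `install_order` helper are hypothetical.

```python
from graphlib import CycleError, TopologicalSorter


def install_order(dependencies: dict[str, set[str]]) -> list[str]:
    """Return an order in which every package comes after its dependencies."""
    # Keys are packages, values are the packages they depend on.
    # Raises CycleError if the graph contains a cycle.
    return list(TopologicalSorter(dependencies).static_order())


# Hypothetical graphs: one is a proper DAG, the other has two packages
# that depend on each other.
ACYCLIC = {"app": {"web", "db"}, "web": {"http"}, "db": set(), "http": set()}
CYCLIC = {"a": {"b"}, "b": {"a"}}
```

`install_order(ACYCLIC)` yields a valid ordering, while `install_order(CYCLIC)` raises `CycleError`, so a resolver that assumes a DAG needs some strategy for the second case.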

By @joeyagreco - 6 months
1. Use pip as a package manager
2. If you need a fresh env to test in, use venv

This is the flow I follow for Python development

By @big-green-man - 6 months
I think this problem comes from ideas we had before that were well intentioned. The idea that led to all of this is, what if we could save developers a ton of work and users a ton of disk space by packaging libraries for them to use? And then we have a ton of problems as a result of this decision. C is now your system's API, mismatched dependencies, a gazillion package managers.

And then what solution do we have? Virtual environments, virtual machines, docker and appimage. We package all dependencies and even entire operating systems so as to avoid all these problems. It's legacy support all the way down.

From scratch, I'd say, devs should just pull all dependencies into their code and package them with their product. Users should never even have to touch something like pip, or a virtual environment. A package manager that allows a developer to publish tools others can use to build code, but that packages the dependencies with their package instead of pulling it for users, would be ideal. Where possible, avoid dependency on anything external entirely. What's that XKCD about yet another standard? I know it will never happen, but I sure do wish it worked like that.

By @VeejayRampay - 6 months
Can't wait to throw all that mediocre tooling away when Astral has finally nailed it, like we all know they will.
By @jacobgorm - 6 months
Cargo is strictly worse than any of the solutions for python. Doing even the simplest things seems to involve pulling down and compiling hundreds of dependencies.
By @exe34 - 6 months
What we need is a new package manager for Python, incorporating the lessons learnt from the others so far [0].

[0] https://xkcd.com/927/

By @gjvc - 6 months
usual bullshite about pip and venv by people who think the "activate" script does something useful
By @igorguerrero - 6 months
Someone from Rust talking about these tools being slow... That's rich.