An analysis of module names inside top PyPI packages
The blog post emphasizes Python package naming conventions, mapping module names to package names, and analyzing PyPI data. Insights include normalized names, common prefixes/suffixes, and advice for developers to follow conventions and avoid namespace packages.
The blog post discusses the importance of naming conventions in Python packages and the challenges associated with mapping module names to package names. The author outlines a plan to gather data on package names from PyPI, analyze file structures, and identify naming conventions. The analysis reveals insights such as the prevalence of normalized module names matching package names, common prefixes and suffixes used in package names, and the impact of namespace packages on naming conventions. The post concludes with recommendations for package developers to adhere to naming conventions, upload wheels, and avoid namespace packages when possible. The author plans to continue monitoring naming conventions in Python packages and refining the analysis.
Related
Start all of your commands with a comma (2009)
The article discusses creating a ~/bin/ directory in Unix to store custom commands, avoiding name collisions with system commands by prefixing custom commands with a comma. This technique ensures unique, easily accessible commands.
Python Modern Practices
Python development best practices include using tools like mise or pyenv to manage multiple versions, staying on the latest Python version, and running applications with pipx. Project tips include the src layout, pyproject.toml, virtual environments, Black, flake8, pytest, wheels, type hinting, f-strings, datetime, enum, named tuples, data classes, breakpoint(), logging, and TOML configuration, all aimed at efficiency and maintainability.
Reproducibility in Disguise
Reproducibility in software development is supported by tools like Bazel, addressing lifecycle challenges. Vendor dependencies for reproducibility face complexity, leading to proposed solutions like vendoring all dependencies for control.
Python Has Too Many Package Managers
Python's package management ecosystem is fragmented. PEP 621 standardized project metadata in pyproject.toml, and newer package managers such as Poetry build on that file. Conda offers robust dependency management, especially for data science workflows.
Simple notes for Emacs with an efficient file-naming scheme
The Denote package for Emacs by Protesilaos Stavrou simplifies note-taking with structured file names, emphasizing predictability, flexibility, and integration with other packages. It promotes clear naming conventions and customizable workflows.
https://pypi.org/project/xml-from-seq/ → xml_from_seq
https://pypi.org/project/cast-from-env/ → cast_from_env
Simple normalization, right? But `pip` installs one with underscores and one with dashes:
>>> from importlib.metadata import metadata
>>> metadata('xml_from_seq')['Name']
'xml_from_seq'
>>> metadata('cast_from_env')['Name']
'cast-from-env'
so that's what ends up in `pip freeze`. I _think_ it's because there's a bdist on PyPI for one and not the other, so `pip` is using different "backends" that normalize the names into `METADATA` differently... ugh.
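For comparing names like the two above, here is a minimal sketch of the normalization rule from PEP 503: runs of `-`, `_`, and `.` collapse to a single `-`, and the result is lowercased. The helper names `normalize` and `same_project` are made up for illustration. This only makes two spellings comparable; it does not explain or change what a build backend writes into `METADATA`.

```python
import re


def normalize(name: str) -> str:
    """Normalize a project name following the rule given in PEP 503."""
    return re.sub(r"[-_.]+", "-", name).lower()


def same_project(a: str, b: str) -> bool:
    """Hypothetical helper: True if two spellings refer to the same project."""
    return normalize(a) == normalize(b)


print(normalize("xml_from_seq"))   # xml-from-seq
print(normalize("cast-from-env"))  # cast-from-env
print(same_project("xml_from_seq", "xml-from-seq"))  # True
```

Normalizing both sides before comparing `pip freeze` output against a requirements list sidesteps the underscore/dash difference, whatever its cause.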
It's a shame that there isn't (currently) a reliable way to perform this backwards link: the closest current things are `{dist}.dist-info/METADATA` (unreliable, entirely user controlled) and `direct_url.json` for URL-installed packages, which isn't present for packages resolved from indices.
Edit: PEP 710[1] would accomplish the above, but it's still in draft.
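To see what is actually recorded today for an installed distribution, a rough sketch using only `importlib.metadata` might look like the following. The function name `recorded_origin` and the shape of its return value are assumptions for illustration. As noted above, the `Name` field is whatever the package author or backend wrote, and `direct_url.json` is only expected for URL-installed packages.

```python
import json
from importlib.metadata import PackageNotFoundError, distribution


def recorded_origin(dist_name: str):
    """Hypothetical helper: report what an installed dist says about itself.

    Returns None if the distribution is not installed. Otherwise returns a
    dict with the self-declared name and the parsed direct_url.json, or
    None for that key when the file is absent (e.g. index installs).
    """
    try:
        dist = distribution(dist_name)
    except PackageNotFoundError:
        return None
    raw = dist.read_text("direct_url.json")  # None when the file is missing
    return {
        "name": dist.metadata["Name"],  # user-controlled, not verified
        "direct_url": json.loads(raw) if raw is not None else None,
    }


print(recorded_origin("cast-from-env"))
```

For a package resolved from an index this typically reports `direct_url` as `None`, which is exactly the gap described above.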
yaml -> pip install pyyaml
cv2 -> pip install opencv-contrib-python
PIL -> pip install pillow (wtf, this should be a misdemeanor punishable by being forced to use Windows for a year)
And can we please ban "py" and "python" from appearing inside the name of python packages? Or else I'm going to start writing some python packages with ".js" in their name.
Now there's a somewhat useful "make a pull request to an open source project" exercise.
user/package-name group/package-name
etc...
On the other hand, you know, it's already source code, it can do whatever it wants...