November 2nd, 2024

Don't return named tuples in new APIs

The article argues against using named tuples in new APIs unless updating existing tuple-returning APIs, citing complexity and recommending alternatives like dataclasses for better clarity and usability.

The article discusses the use of named tuples in new APIs, arguing against their introduction unless updating an existing API that already returns a tuple. Named tuples can complicate both the API's implementation and its usage, as they require supporting both index-based and attribute-based access, leading to increased testing and potential for user error. The author suggests that named tuples signal complexity, which may not be appropriate if the data structure can be represented simply. Alternatives such as dataclasses, dictionaries, TypedDicts, or SimpleNamespace are recommended for better readability and ergonomics. The emphasis is on prioritizing clarity over brevity in code design, advocating for the use of named tuples only when they enhance an existing tuple return type.

- Named tuples should only be used when updating existing APIs that already return tuples.

- They complicate both implementation and user experience by requiring dual access methods.

- Alternatives like dataclasses, TypedDicts, and SimpleNamespace offer better readability and usability.

- Prioritize clarity and ergonomics in code design over brevity.

- Named tuples may indicate that a data structure is too complex for a simple tuple.
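
As a rough illustration of the recommendation summarized above, the following minimal sketch returns a frozen dataclass instead of a named tuple. The names `MousePosition` and `get_mouse_position`, and the fixed coordinates, are hypothetical and do not come from the article.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MousePosition:
    """Hypothetical return type: attribute access is the only interface."""
    x: int
    y: int


def get_mouse_position() -> MousePosition:
    # Hypothetical API; a real implementation would query the windowing system.
    return MousePosition(x=120, y=45)


pos = get_mouse_position()
print(pos.x, pos.y)  # 120 45
```

Because the dataclass is not a sequence, `pos[0]` raises `TypeError`, so there is only one access style to document and test.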

Against Names

The article explores the challenges of naming in computer science, highlighting anonymous identifiers in version control and utility CSS as ways to simplify workflows while balancing named and unnamed elements.

Know your Python container types

The article discusses Python's container types: lists, tuples, named tuples, sets, dictionaries, and dataclasses, highlighting their uses, differences, and recommendations for appropriate applications in programming.

Fighting back against proper noun feature names (2021)

Scott Kubie argues against using proper nouns for product features, stating it increases cognitive load and complicates communication. He advocates for simplicity, suggesting unnamed features enhance user experience and clarity.

Don't let dicts spoil your code

Roman Imankulov critiques Python dictionaries for causing technical debt and complicating maintenance. He advocates for domain models, dataclasses, and Pydantic to enhance clarity and structure in evolving codebases.

TypedDicts are better than you think

TypedDicts in Python 3.8 enhance type annotations for dictionaries, allowing optional fields, improving function signatures, and offering better type safety. Upcoming PEPs will introduce extra and read-only item features.

AI: What people are saying

The comments reflect a diverse range of opinions on the use of named tuples versus dataclasses in Python APIs.

- Named tuples have quirks, such as unexpected equality behavior, which some commenters find problematic.

- Dataclasses are mutable by default, raising concerns about true immutability compared to named tuples.

- Some commenters argue that tuples are fundamental to Python and should not be dismissed in favor of newer constructs.

- There is a sentiment that Python has become overly complex with multiple ways to structure data, which can be confusing.

- Several commenters advocate for dataclasses as a suitable replacement for both named tuples and TypedDicts in new APIs.

15 comments

By @dathery - 6 months

Another problem brought about by their design being backwards-compatible with tuples is that you get wonky equality rules where two namedtuples of different types and with differently-named attributes can compare as equal:

    >>> from collections import namedtuple
    >>> Foo = namedtuple("Foo", ["bar"])
    >>> Baz = namedtuple("Baz", ["qux"])
    >>> Foo(bar="hello") == Baz(qux="hello")
    True

This also happens with the "new-style" namedtuples (typing.NamedTuple).

I like the convenience of namedtuples but I agree with the author: there are enough footguns to prefer other approaches.
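
The equality behavior described in this comment can be reproduced with `typing.NamedTuple` and contrasted with dataclasses. This is a minimal sketch; the `FooData` and `BazData` classes are hypothetical names added for comparison.

```python
from dataclasses import dataclass
from typing import NamedTuple


class Foo(NamedTuple):
    bar: str


class Baz(NamedTuple):
    qux: str


@dataclass(frozen=True)
class FooData:
    bar: str


@dataclass(frozen=True)
class BazData:
    qux: str


# Named tuples compare as plain tuples, so type and field names are ignored.
named_tuples_equal = Foo(bar="hello") == Baz(qux="hello")

# The generated dataclass __eq__ only compares instances of the same class.
dataclasses_equal = FooData(bar="hello") == BazData(qux="hello")

print(named_tuples_equal)  # True
print(dataclasses_equal)   # False
```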

By @xg15 - 6 months

Counterpoint: Named tuples are immutable, while dataclasses are mutable by default.

You can use frozen=True to "simulate" immutability, but that just overwrites the setter with a dummy implementation, something you (or your very clever coworker) can circumvent by using object.__setattr__().

So you neither get the performance benefits nor the invariants of actual immutability.
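
A minimal sketch of the circumvention mentioned in this comment, assuming CPython's standard `dataclasses` and `collections` modules. The `Point` and `PointTuple` names are illustrative.

```python
from collections import namedtuple
from dataclasses import FrozenInstanceError, dataclass


@dataclass(frozen=True)
class Point:
    x: int
    y: int


PointTuple = namedtuple("PointTuple", ["x", "y"])

p = Point(1, 2)
try:
    p.x = 10  # ordinary assignment is rejected by the generated __setattr__
    blocked = False
except FrozenInstanceError:
    blocked = True

# Calling object.__setattr__ directly bypasses the dataclass check.
object.__setattr__(p, "x", 10)

t = PointTuple(1, 2)
try:
    # The same trick fails on a named tuple: its fields are read-only
    # descriptors backed by the underlying tuple.
    object.__setattr__(t, "x", 10)
    tuple_mutated = True
except AttributeError:
    tuple_mutated = False

print(blocked, p.x, tuple_mutated, t.x)  # True 10 False 1
```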

By @webprofusion - 6 months

Oh you mean Python library APIs. I totally thought this was going to be a generic article about APIs delivered over http, the first thing I'd think of when someone says API.

By @heavyset_go - 6 months

Author could have used NamedTuple instead of dataclass or TypedDict:

    from typing import NamedTuple

    class Point(NamedTuple):
        x: int
        y: int
        z: int

I don't see "don't use namedtuples in APIs" as a useful rule of thumb, to be honest. Ordered and iterable return-types make sense for a lot of APIs. Use them where it makes sense.
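
To illustrate what "ordered and iterable" buys the caller, here is a small sketch that reuses the `Point` class from this comment. The function `current_position` is a hypothetical API invented for the example.

```python
from typing import NamedTuple


class Point(NamedTuple):
    x: int
    y: int
    z: int


def current_position() -> Point:
    # Hypothetical API that returns a named tuple.
    return Point(x=1, y=2, z=3)


x, y, z = current_position()      # positional unpacking
total = sum(current_position())   # iteration over the fields in order
first = current_position()[0]     # index access
by_name = current_position().z    # attribute access

print(x, y, z, total, first, by_name)  # 1 2 3 6 1 3
```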

By @the__alchemist - 6 months

I think the best option for this, which is one listed in the article, is the dataclass. It's like a struct in C or Rust. It's ideal for structured data, which is, I believe, what a named tuple is intended for.

By @math_dandy - 6 months

One advantage of (Named)Tuples over dataclasses or SimpleNamespaces is that they can be used as indices into numpy arrays, which is very useful when your API is returning a point, screen coordinates, or similar.

By @mont_tag - 6 months

This article seems vacuous to me. It misses the point that tuples are fundamental to the language, with C-speed native support for packing, unpacking, hashing, pickling, slicing and equality tests. Tuples appear everywhere, from the output of doctest to time tuples, the result of divmod, the output of a csv reader and the output of a sqlite3 query.

Tuples are a core concept and fundamental data aggregation tool for Python. However, this post uses a trivial `Point()` class strawman to try to shoot down the idea of using tuples at all. IMO that is fighting the language and every existing API that either accepts tuple inputs or returns tuple outputs. That is a vast ecosystem.

According to the glossary, a named tuple is "any type or class that inherits from tuple and whose indexable elements are also accessible using named attributes." Presumably, no one disputes that having names improves readability. So really this weak post argues against tuples themselves.

By @Spivak - 6 months

I think for the same reason you should avoid TypedDicts for new APIs as well. Dataclasses are the natural replacement for both.

By @Joker_vD - 6 months

> This leads to writing tests for both ways of accessing your data, not just one of them. And you shouldn't skimp on this

Or you can just keep returning namedtuple instead of something else, because then you absolutely can skimp on testing whether what you return does, in fact, satisfy the namedtuple's interface.

By @ReflectedImage - 6 months

namedtuple is preferable as it's the more Pythonic solution. Simpler is better.

By @personjerry - 6 months

I feel like getting mouse coordinates is a perfect time to return a named tuple, though?

By @doctorpangloss - 6 months

Data classes can gracefully replace tuples everywhere. Set frozen, then use a mixin or just author a getitem and iter magic, and you’re done.
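
One way the approach in this comment might look, as a sketch only: `TupleLikeMixin` is a hypothetical helper, not a standard-library class, and it builds the tuple view from `dataclasses.fields`.

```python
from dataclasses import dataclass, fields


class TupleLikeMixin:
    """Hypothetical mixin adding iteration and index access to a dataclass."""

    def _as_tuple(self):
        # Field values in declaration order, without copying nested objects.
        return tuple(getattr(self, f.name) for f in fields(self))

    def __iter__(self):
        return iter(self._as_tuple())

    def __getitem__(self, index):
        return self._as_tuple()[index]


@dataclass(frozen=True)
class Point(TupleLikeMixin):
    x: int
    y: int
    z: int


p = Point(1, 2, 3)
x, y, z = p                 # unpacking works through __iter__
print(x, y, z, p[0], p[-1])  # 1 2 3 1 3
print(p == (1, 2, 3))        # False
```

The last line shows one remaining difference: the instance is still not a `tuple`, so it does not compare equal to one, and `isinstance(p, tuple)` is `False`.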

By @awinter-py - 6 months

these can be more memory-efficient than classes or dictionaries.

there was a point a while back where python added __slots__ to classes to help with this; and in practice these days the largest systems are using numpy if they're in python at all

not sure what modern versions do. but in the olden days, if you were creating lots of small objects, tuples were a low-overhead way to do it
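
The memory point is partly about the per-instance `__dict__`. The sketch below only checks which kinds of object carry one and makes no claim about actual byte counts, which vary by Python version. All class names are illustrative.

```python
from collections import namedtuple
from dataclasses import dataclass

PointTuple = namedtuple("PointTuple", ["x", "y"])


class PointSlots:
    # __slots__ reserves fixed storage and suppresses the instance __dict__.
    __slots__ = ("x", "y")

    def __init__(self, x, y):
        self.x = x
        self.y = y


@dataclass
class PointData:
    x: int
    y: int


has_dict = {
    "namedtuple": hasattr(PointTuple(1, 2), "__dict__"),
    "slots class": hasattr(PointSlots(1, 2), "__dict__"),
    "dataclass": hasattr(PointData(1, 2), "__dict__"),
}
print(has_dict)
# {'namedtuple': False, 'slots class': False, 'dataclass': True}
```

On Python 3.10 and later, `@dataclass(slots=True)` generates `__slots__` for a dataclass as well.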

By @solarkraft - 6 months

> But there are three more ways to do the same data structure

Thanks, I hate it. There’s a lot I like about Python, but this is a major pain point.

NamedTuple, TypedDict, Dataclass, Record ... Remember the Zen of Python? „There should be one-- and preferably only one --obvious way to do it“ - it feels like Python has gone way overboard with ways to structure data.

In JavaScript everything is an object, you can structurally type them with TypeScript and I don’t feel like I’m missing much.

By @pipeline_peak - 6 months

You’d think such an easy-to-use high-level language would have:

Point(x,y,z)
