August 17th, 2024

The weird of function-local types in Rust

The article examines function-local types in Rust, highlighting accessibility issues, macro complications, and potential workarounds. It emphasizes careful design to prevent compilation errors and documentation test failures.

The article discusses the complexities of function-local types in Rust, particularly in the context of macros that generate code. It begins with an example of a struct `User` defined within a function, which cannot be accessed outside its scope due to Rust's name resolution rules. The author explains that while local items can be defined within function blocks, they are not accessible from outside those blocks, leading to compilation errors when attempting to reference them. The article further explores how macros, which generate code, can inadvertently create child modules that do not have access to local types, complicating the use of such types in generated code. The author presents a workaround involving traits to access local types but notes that this approach is not practical for production code. Additionally, the article highlights the implications for documentation tests, which may fail if they rely on macros that generate child modules. The author concludes that while generating child modules for privacy in macros can be useful, it is essential to avoid referencing items from the surrounding scope to prevent issues, especially in documentation tests.
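
As a rough illustration of the scoping rules described above, here is a minimal sketch. The names `describe_user` and `generated` are hypothetical, and the nested module only stands in for what a macro might emit. The commented-out lines are the ones that would fail to compile.

```rust
// Hypothetical names throughout; `User` mirrors the struct from the summary.
fn describe_user() -> String {
    // A function-local item: it can only be named inside this block.
    struct User {
        name: String,
    }

    // A module nested in a function body is a child of the enclosing *module*,
    // not of the function's block, so `super::User` does not resolve in here.
    mod generated {
        // pub fn make() -> super::User { todo!() } // does not compile

        pub fn greeting() -> &'static str {
            "hello"
        }
    }

    let user = User {
        name: String::from("Bon"),
    };
    format!("{}, {}", generated::greeting(), user.name)
}

// fn outside() -> User { todo!() } // does not compile: `User` is not in scope here
```

Calling `describe_user()` returns `"hello, Bon"`: the function body can use both the local struct and the nested module, but the nested module cannot see the struct.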

- Function-local types in Rust are not accessible outside their defining scope.

- Macros that generate child modules may lead to compilation errors when referencing local types.

- Workarounds exist but are often impractical for production code.

- Documentation tests can fail if they rely on macros that create child modules.

- Careful design is needed when using macros to avoid scope-related issues.
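
The trait-based workaround mentioned in the summary could look roughly like the sketch below. The names `TypeChannel`, `GetUserType`, `LeakedUser`, and `leaked_user_name` are illustrative assumptions and may not match the article's code. The `#[allow(non_local_definitions)]` attribute is included because recent compilers lint against this kind of impl.

```rust
// Hypothetical names; a sketch of leaking a function-local type through a trait.
trait TypeChannel {
    type Type;
}

struct GetUserType;

// Recent compilers warn about impls like the one below; silence the lint for the demo.
#[allow(non_local_definitions)]
fn example() {
    struct User {
        name: String,
    }

    // Impl blocks are not scoped to the function body, so this impl applies
    // crate-wide even though `User` cannot be named outside `example`.
    impl TypeChannel for GetUserType {
        type Type = User;
    }
}

// The local type, reached through the trait's associated type.
type LeakedUser = <GetUserType as TypeChannel>::Type;

fn leaked_user_name() -> String {
    let user = LeakedUser {
        name: String::from("Bon"),
    };
    user.name
}
```

Note that `example` never has to be called: the impl takes effect at compile time, which is part of why the summary treats this as a curiosity and not something to rely on in production code.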

3 comments
By @noelwelsh - 8 months
This seems like an oversight in the design of Rust. I would think that each function call should create a distinct function-local type, so the trick they use to extract the type from the function shouldn't work. I think what's needed is path-dependent types [1] as found in Scala.

[1]: http://lampwww.epfl.ch/~amin/dot/fpdt.pdf

By @dathinab - 8 months
> So there is just no way to refer to the User struct outside of the function scope, right?...

No matter what tricks you come up with, treat it as that (if it ends up exposed through an associated type, treat it as an anonymous type that was accidentally exposed).

Also, please _never_ place a module in a function. It's technically possible, but for various subtle reasons you really, really should not do it.

I mean, in general, limit the items (types, impl blocks) you place in functions to very few cases. If you have a type complex enough that you need a builder for it, and it is defined in a function, you are definitely doing something wrong, I think.

> Does this mean generating child modules for privacy in macros is generally a bad idea? It depends...

IMHO if we look at derive-like macro usage, yes, it's always a bad idea.

Derive-like things should mainly generate impl blocks; types only if it really, really is necessary; and modules only if there really is no other way.

Furthermore, they should, if possible, not introduce any of this into the surrounding scope. E.g. it's a not uncommon pattern to place all generated code in `const _: () = { /* here */ };`, which is basically a trick/hack to create a new scope, similar to a function scope, into which you can place items (functions, imports, types, impl blocks) without polluting the parent scope (and yes, that doesn't work for modules; they are always scoped by other modules).

So does that mean the builder derive does it all wrong?

I don't think so. Sometimes you need to make bad decisions because there are no good solutions.
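
For reference, the `const _: () = { ... };` pattern described in the comment above might look like the following sketch. `Point`, `Describe`, `fmt_pair`, and `describe_origin` are hypothetical stand-ins for a user type, a derived trait, a macro-generated helper, and calling code; none of them come from the article or the comment.

```rust
// Hypothetical stand-in for a type that a derive-like macro is applied to.
struct Point {
    x: i32,
    y: i32,
}

trait Describe {
    fn describe(&self) -> String;
}

// What the macro might emit: everything goes inside an anonymous const.
const _: () = {
    // This helper lives only inside the const's block and does not
    // add a name to the surrounding module.
    fn fmt_pair(a: i32, b: i32) -> String {
        format!("({a}, {b})")
    }

    // The impl still applies wherever `Point` and `Describe` are visible.
    impl Describe for Point {
        fn describe(&self) -> String {
            fmt_pair(self.x, self.y)
        }
    }
};

fn describe_origin() -> String {
    // `fmt_pair` cannot be named here, but the impl is usable.
    Point { x: 0, y: 0 }.describe()
}
```

Here `describe_origin()` returns `"(0, 0)"`, while any attempt to call `fmt_pair` directly from outside the const block would fail to compile.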

By @jerf - 8 months
I have found, across several languages I've used, that types embedded into functions are generally a bad idea, and I think the general principle is that types generally end up needing to be exposed to any code that will also test that code. So, for instance, it's fine to confine types to some particular module, as long as those types are internal-only, but confining them within functions generally becomes a bad idea.

I know the complaints many of you are gearing up to type, but my statement is a bit more complicated than you may have realized on first read; the key is the word "becoming", that I'm looking at the lifetime of the code and not a snapshot. The problem with embedding types into those smaller scopes is that while it may work at first... of course it does, it compiles, right?... they become an impediment to a number of operations over time. First, as I mentioned, testing is very likely at some point over the evolution of the module to want to either provide input or examine output, intermediate or otherwise, that exists in those types. Second, as the code grows, you want to be able to refactor things freely, and types embedded in functions form a barrier to refactoring because to refactor you'll have to do something to expose that type now to multiple functions. You do not want barriers to refactoring. Barriers to refactoring are a bigger expense over the long term than any small local gain from putting a type here instead of there, especially when anyone should have "Jump to Definition" readily available in this post-LSP era.

Considered over time, over the evolution of the code base, I've just never had any super-local types like this "survive". Every time I think I've found an exception, I've either had my test code or the desire to refactor force me to lift it to the module level. So I just start there now.

To the extent there is an exception, testing-only code may be. Testing-only code has very different constraints than production code anyhow. Even then, though, I still find that refactoring problem arises, and test code needs to be refactorable too.

On the plus side, while I label them "a bad idea", they are not a "bad idea" that destroys your code base or anything. On the grand scale of "bad ideas" in code, this is down in the "inconvenience" part of the scale. It is almost self-evidently not some sort of disaster and I am not claiming it is. You can always lift it out and move on. But it is one of the many little hygiene habits that add up that helps keep code fluid and refactoring always available to me at a minimum activation-energy cost, because that is really important.

(This applies specifically to types that you explicitly define. You can in Haskell, for instance, bash a new type together anywhere simply by creating a tuple (x, y). But this doesn't trigger what I'm talking about because any other bit of the code can bash the exact same type together simply by creating another tuple of the same type, and they'll unify just fine without having to share a type definition in common. No impediment of any kind is created by a new tuple type in that language.)