July 30th, 2024

Functional languages should be so much better at mutation than they are

The article explores integrating mutation into functional programming, discussing methods like restricted mutable structures, local mutation, and linearity, while highlighting challenges and the need for a balanced approach.

Functional programming languages often emphasize avoiding mutation, but this perspective may overlook the necessity of mutation in certain data structures. For instance, structures like union-find require mutation for efficiency, and persistent data structures can introduce significant overhead. The article discusses various approaches to integrating mutation into functional programming without compromising its principles.
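
The union-find point is concrete enough to sketch. A minimal OCaml version with path compression (illustrative code, not taken from the article) shows why the structure wants in-place updates:

  (* Disjoint-set with union by rank and path compression. find rewrites
     parent pointers as it walks, which is the in-place mutation that
     gives the near-constant amortized bounds. *)
  type uf = { parent : int array; rank : int array }

  let create n = { parent = Array.init n (fun i -> i); rank = Array.make n 0 }

  let rec find uf x =
    let p = uf.parent.(x) in
    if p = x then x
    else begin
      let root = find uf p in
      uf.parent.(x) <- root;  (* path compression: an in-place update *)
      root
    end

  let union uf a b =
    let ra = find uf a and rb = find uf b in
    if ra <> rb then
      if uf.rank.(ra) < uf.rank.(rb) then uf.parent.(ra) <- rb
      else if uf.rank.(ra) > uf.rank.(rb) then uf.parent.(rb) <- ra
      else begin
        uf.parent.(rb) <- ra;
        uf.rank.(ra) <- uf.rank.(ra) + 1
      end

A persistent variant has to return a new parent map from every find, which is the kind of overhead the summary alludes to.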

One option is to allow unrestricted mutable data structures, which can lead to accidental side effects, undermining the benefits of functional programming. Another approach is to permit mutation only in specific contexts, such as IO, which can complicate function types and call sites. A third method involves local mutation, where mutable state is confined to a specific region, as seen in Haskell's ST Monad. However, this can increase complexity and reduce code readability.
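
A rough sketch of the local-mutation pattern in OCaml (illustrative; OCaml does not enforce the region in its type system the way Haskell's ST does, so keeping the state local is a convention here):

  (* The ref is created, used, and dropped inside the function, so callers
     only ever observe a pure list -> list function. *)
  let prefix_sums (xs : int list) : int list =
    let total = ref 0 in
    List.rev
      (List.fold_left
         (fun acc x ->
            total := !total + x;
            !total :: acc)
         [] xs)

  (* prefix_sums [1; 2; 3; 4] = [1; 3; 6; 10] *)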

Linearity is proposed as a way to manage mutation while maintaining purity, but it presents challenges in practical implementation. The article also mentions the "functional but in-place" (FBIP) approach, which tracks linearity at runtime, allowing in-place updates without exposing mutable state. However, this relies on reference counting, which may not be feasible for all languages.

Ultimately, the author expresses uncertainty about the future of mutation in functional programming, suggesting that a new approach may be needed to balance efficiency and purity. The integration of regions with functional programming could enhance usability, ensuring that mutable data structures are as accessible as persistent ones.

Related

I Probably Hate Writing Code in Your Favorite Language

The author critiques popular programming languages like Python and Java, favoring Elixir and Haskell for immutability and functional programming benefits. They emphasize personal language preferences for hobby projects, not sparking conflict.

Synchronous Core, Asynchronous Shell

A software architecture concept, "Synchronous Core, Asynchronous Shell," combines functional and imperative programming for clarity and testing. Rust faces challenges integrating synchronous and asynchronous parts, prompting suggestions for a similar approach.

Synchronous Core, Asynchronous Shell

Gary Bernhardt proposed a Synchronous Core, Asynchronous Shell model in software architecture, blending functional and imperative programming. Rust faces challenges integrating sync and async functions, leading to a trend of adopting this model for clarity and control.

My programming beliefs as of July 2024

Evan Hahn emphasizes tailored programming approaches, distinguishing "simple" from "easy," promoting testability through modularity, and advocating for ethical coding practices prioritizing societal impact and nuanced thinking in software development.

Functional programming languages should be better at mutation than they are

The article explores integrating mutation into functional programming, discussing methods like restricted mutation, local mutation, and linearity, while emphasizing the need for user-friendly solutions that maintain functional principles.

22 comments
By @Leftium - 9 months
> ...functional programming is mostly about avoiding mutation at all costs

Slightly different perspective from Grokking Simplicity[1]: functional programming is not about avoiding mutation because "mutation is bad." In fact, mutation is usually the desired result, but it needs extra care because its effect depends on when and how many times it happens.

So good FP isn't about avoiding impure functions; instead it's about giving extra care to them. After all, the purpose of all software is to cause some type of mutation/effect (flip pixels on a screen, save bits to storage, send email, etc). Impure functions like these depend on the time they are called, so they are the most difficult to get right.

So Grokking Simplicity would probably say this:

1. Avoid premature optimization. The overhead from FP is usually not significant, given the speed of today's computers. Also, performance gains unlocked by FP may offset any losses.

2. If optimization via mutation is required, push it as far outside and as late as possible, keeping the "core" functionally pure and immutable.

This is similar to Functional Core, Imperative Shell[2]; and perhaps similar to options 1 or 2 from the article.
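
A tiny OCaml sketch of that shape (my own illustration, not from the book): the pure core computes the result and only the thin shell performs the effect.

  (* Functional core: pure, easy to test. *)
  let format_report totals =
    List.map (fun (name, n) -> Printf.sprintf "%s: %d" name n) totals

  (* Imperative shell: the only place that touches the outside world. *)
  let print_report totals =
    List.iter print_endline (format_report totals)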

[1]: https://www.manning.com/books/grokking-simplicity

[2]: https://hw.leftium.com/#/item/18043058

By @jmull - 9 months
> A lot of people think that functional programming is mostly about avoiding mutation at all costs.

People should try to stop thinking of mutation as something to be avoided, and start thinking of it as something to be managed.

Mutating state is good. That's usually the whole point.

What's bad is when you create "accidental state" that goes unmanaged or that requires an infeasible effort to maintain. What you want is a source of truth for any given bit of mutable state, plus a mechanism to update anything that depends on that state when it is changed.

The second part is where functional programming shines -- you have a simple way to just recompute all the derived things. And since it's presumably the same way the derived things were computed in the first place, you don't have to worry about its logic getting out-of-sync.
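
A minimal sketch of that idea (mine, in OCaml): one record is the source of truth, and everything derived is just a function of it, so recomputation cannot drift out of sync.

  (* Single source of truth; derived values are recomputed on demand. *)
  type state = { items : (string * int) list }

  let total (s : state) = List.fold_left (fun acc (_, n) -> acc + n) 0 s.items
  let add_item s name n = { items = (name, n) :: s.items }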

By @pornel - 9 months
> Rust's shared XOR mutable references […] makes linearity nearly useless or inevitably creates a parallel, incomplete universe of functions that also work on linear values.

Yup. Rust can't abstract over mutability. For owned values, it defaults to exclusive ownership, and needs explicit Rc<T> and clone() to share them. For references, in practice it requires making separate `foo()` and `foo_mut()` functions for each type of the loan.

Even though this sounds horribly clunky, it works okay in practice. The ability to temporarily borrow exclusively owned objects as either shared or exclusive-mutable adds enough flexibility.

Rust is known for being difficult, but I think a lot of that is due to the lack of a GC. Rust can't make any reference live longer, so programmers have to manually get the scopes of loans right, or manually use Rc<RefCell> as a mini DIY GC. Perhaps shared XOR mutable with a GC could be the best of both?

By @RandomThoughts3 - 9 months
The article utterly falls apart in its first paragraph, where it acknowledges that the whole ML family, including OCaml, has perfect support for mutation, rightfully assumes that most OCaml programmers would choose not to use it most of the time, but then assumes incorrectly that this is because the language somehow makes it awkward. It's not. It's just that mutation is very rarely optimal. Even the example given fails:

> For example, let's say you're iterating over some structure and collecting your results in a sequence. The most efficient data structure to use here would be a mutable dynamic array and in an imperative language that's what pretty much everyone would use.

Well, no, this is straight confusion between what's expressed by the program and what's compiled. The idiomatic code in OCaml will end up generating machine code that is as performant as using a mutable array.

The fact that most programming languages don't give their compiler enough semantic information to do a good job doesn't mean it necessarily has to be so. Functional programmers just trust that their compiler will properly optimize their code.

It gets fairly obvious when you realise that most OCaml developers switch to using arrays when they want to benefit from unboxed floats.

The whole article is secretly about Haskell and fails to come to the obvious conclusion: Haskell's choice of segregating mutation into special types and using monads was an interesting and fruitful research topic, but it ultimately proved to be a terrible choice when it comes to language design (my opinion, obviously, not some absolute truth, but I think the many fairly convoluted tricks Haskellers pull to somehow reintroduce mutation support it). The solution is simple: stop using Haskell.

By @tome - 9 months
I'm not convinced about the dismissal of option 2. I agree ST is clunky but not for the reasons given. It's clunky because it's impossible to mix with other effects. What if I want ST and exceptions, for example, and I want the presence of both to be tracked in the type signature? ST can't do that. But my effect system, Bluefin, can. In fact it can mix not only state references and exceptions, but arbitrary other effects such as streams and IO.

* https://hackage.haskell.org/package/bluefin-0.0.2.0/docs/Blu...

* https://hackage.haskell.org/package/bluefin-0.0.6.0/docs/Blu...

By @nmadden - 9 months
I don’t know how Swift and Koka handle things, but I’ve written a lot of Tcl that uses the same CoW reference-counting trick. (Tcl is an under-appreciated FP language: everything is a string, and strings are immutable, so it has had efficient purely declarative data structures for decades).

The downside in Tcl is that if you refactor some code suddenly you can add a new reference and drop into accidentally quadratic territory because now everything is being copied. This leads to some cute/ugly hacks that are only explainable in terms of interpreter implementation details, purely to reduce a refcount in the right spot.

By @dave4420 - 9 months
A variant of option 4 is to keep track of references you know cannot possibly be shared, and update those by mutation. Compared to reference counting, it misses some opportunities for mutation, but avoids the false sharing.

I think Roc is doing this.

By @anfelor - 9 months
Disclosure: I work on Koka's FBIP optimization (Option 4).

> The most efficient data structure to use here would be a mutable dynamic array and in an imperative language that's what pretty much everyone would use. But if you asked an OCaml programmer, they would almost certainly use a linked list instead.

I agree with this sentiment. However, OCaml does have mutable arrays that are both efficient and convenient to use. Why would a programmer prefer a list over them? In my opinion, the main benefit of lists in this context is that they allow pattern matching and inductive reasoning. To make functional programming languages more suited for array programming, we would thus need something like View Patterns for arrays.
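
A small sketch of that trade-off (my example, not the commenter's): the list version leans on structural pattern matching and recursion, while the array version is index-driven.

  (* Lists: match on the structure and recurse on the tail. *)
  let rec pairwise_sums = function
    | a :: (b :: _ as rest) -> (a + b) :: pairwise_sums rest
    | _ -> []

  (* Arrays: efficient and mutable, but there is no "rest of the array" to
     match on, only indices to keep straight. *)
  let pairwise_sums_arr a =
    Array.init (max 0 (Array.length a - 1)) (fun i -> a.(i) + a.(i + 1))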

A related issue is that mutation can actually be slower than fresh allocation in OCaml. The reason for this is that the garbage collector is optimized for immutable data structures and has both a very fast minor heap that makes allocations cheap and expensive tracking for references that point from older objects to younger ones. See: https://dev.realworldocaml.org/garbage-collector.html#scroll...

> Unfortunately, this makes it impossible to use any standard functions like map on linear values and either makes linearity nearly useless or inevitably creates a parallel, incomplete universe of functions that also work on linear values.

You can implement polymorphism over linearity: this is done in Frank Pfenning's SNAX language and planned for the uniqueness types in a branch of OCaml.

> This might sound a little dangerous since accidentally holding on to a reference could turn a linear time algorithm quadratic

No, the in-place reuse optimization does not affect the asymptotic time complexity. But it can indeed change performance drastically if a value becomes shared, since copies are needed then.

> A tracing garbage collector just doesn't give you this sort of information.

It is possible to add One-bit Reference Counts to a garbage collector, see https://gitlab.haskell.org/ghc/ghc/-/issues/23943

> for now even these struggle to keep up with tracing garbage collectors even when factoring in automatic reuse analysis.

I investigated the linked benchmarks for a while. The gap between Koka and Haskell is smaller than described in that initial comment, but a tuned GHC is indeed a bit faster than Koka on that benchmark.

By @kccqzy - 9 months
The author didn't write a good objection to option 2. Both the ST monad (real mutations) and the variety of State monads (simulated mutations) work fine in practice. What's even better is the STM monad, the software transactional memory monad that is not only about mutations but also solves synchronization between threads in a way that's intuitive and easy to use. But let's stick to the ST monad. Has the author looked at how hash maps and hash sets are implemented in Haskell? It's arrays and mutation!

> And if you only need mutation locally inside a function, using ST makes your code fundamentally more imperative in a way that really forces you to change your programming style. This isn't great either and doesn't exactly help with readability, so the mental overhead is rarely worth it.

What?? You are explicitly opting into writing mutation code. Of course that's going to change your programming style. It is expected to be different. Readability is increased because it clearly delineates a different coding style. And it's not even that different from other monadic code, even when you compare it to non-mutating monadic code.

Terrible article.

By @throwaway81523 - 9 months
Ben Lippmeier's Disciplined Disciple Compiler (DDC), for his language later called Discus, was interesting. It was/is an experimental language that managed mutation through an effect typing system. In the intro to his thesis he talks about it some.

Discus language: http://discus-lang.org/

Thesis: https://benl.ouroborus.net/papers/2010-impure/lippmeier-impu...

The thesis is the more interesting of those two links IMHO. The intro is chapter 1 that starts at page 17 of the pdf. It has one of the better critiques of Haskell that I've seen, and explains why uncontrolled mutation is not the answer. Reference types ala ML aren't the answer either, in his view.

By @jonathanyc - 9 months
I’m not even a big OCaml fan (you can use Algolia on my comment history…), but this article is just factually wrong.

> For example, let's say you're iterating over some structure and collecting your results in a sequence. The most efficient data structure to use here would be a mutable dynamic array and in an imperative language that's what pretty much everyone would use.

> But if you asked an OCaml programmer, they would almost certainly use a linked list instead.

What? One of OCaml's most notable features as a functional programming language is how it was designed to support mutation (see "the value restriction", e.g. https://stackoverflow.com/questions/22507448/the-value-restr...). In my own OCaml programs I used mutation whenever appropriate (my only complaint would be that I wish there were a little more syntactic sugar around e.g. hash table access).
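
For what it's worth, ordinary OCaml mutation looks like this (a sketch of mine, which also shows the slightly wordy hash-table access mentioned above):

  (* Memoized Fibonacci with a mutable Hashtbl: no monads, no ceremony,
     just a table that is updated in place across calls. *)
  let memo_fib =
    let cache : (int, int) Hashtbl.t = Hashtbl.create 64 in
    let rec fib n =
      if n <= 1 then n
      else
        match Hashtbl.find_opt cache n with
        | Some v -> v
        | None ->
            let v = fib (n - 1) + fib (n - 2) in
            Hashtbl.add cache n v;
            v
    in
    fib

  (* memo_fib 40 = 102334155, computed without exponential blowup. *)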

I wanted to like this post but it seems like low-effort clickbait.

By @rocqua - 9 months
How come the CoW method requires runtime reference counting? A lot of the same benefit (but not all) should be available based on static analysis right?

Especially if the approach isn't really Copy on Write, but Copy only when someone might want to use the old value. Default to trying to mutate in place, if you can prove that is safe.

For most locals, that should be rather doable, and it would be a pretty big gain. For function parameters it probably gets hairy though.

By @Mathnerd314 - 9 months
There is a way to do FBIP without reference counting: use immutable store semantics (always copy). At that point you are doing a lot of copying, but Haskell etc. are actually pretty good at managing this sort of profligate duplication. And of course it is possible to use RC-at-compile-time and other analyses to optimize away the copying; the difference is that runtime RC is not required like it is in Koka. There is even a static analysis algorithm from 1985, https://dl.acm.org/doi/abs/10.1145/318593.318660 (never implemented in Haskell because they went with the ST monad). There is a theorem in the paper that for "natural" translations of imperative programs into FP, their algorithm will optimize the FP back to the imperative program or better.

By @PaulHoule - 9 months
I learned programming in the 1980s based on examples from the 1970s, and in the 1990s I would see (and write JNI wrappers for) FORTRAN codes built around algorithms, such as sorts and FFTs, that could work on data in place with minimal duplication of data structures, even when it wasn't obvious that they could do so.

By @ChadNauseam - 9 months
I enjoyed this article. As someone who has written too much haskell and ocaml, and now writes mostly Rust, I am biased but I think this problem is mostly solved by rust. (The author mentions rust in option 3, but I think underappreciates it.)

The author mentions linear types. This is a bit of a pet peeve of mine because, while very useful, linear types are not the concept that many people think they are and they are not implemented in rust (and neither are affine types). What rust implements is referred to as a "uniqueness type".

The difference has to do with how they deal with what linear-types-people call "exponentials". A linear type, as the article mentions, is the type of a value that must be "consumed" exactly once (where consuming means passing it into a function, returning it, or sometimes destructuring it). Of course, in this mad world of ours we sometimes need to consume a value more than once, and indeed a language with only linear types would not be Turing-complete. This escape hatch is called an "exponential", I guess because exponential is kind of like the opposite of linear. A value with an exponential type can be used as often as you want. That is essentially how most types in most programming languages work.

If a function expects a value with a linear type, can you pass a value with an exponential type to it? The answer is that you can. Try this in linear haskell if you don't believe me. A function taking a value with a linear type just says "I consume this value exactly once, but you can pass me whatever you want". The restriction is that values with linear types can only be passed to functions that expect linear types. A value with a linear type must be consumed exactly once, so you certainly can't pass it to a function that expects a value with an exponential type, because it might use that value twice. In other words, a linear type is a restriction on the callee.

Those familiar with rust will notice that this is not how rust works. If a function takes T, and you have &T, you just cannot call that function. (Ignore Clone for now.) However, in the world of linear types, this would be allowed. This makes linear types not useful for the article's purposes, although they are still very useful for other things.

What rust wants is a constraint not provided by linear types. Where linear types are a restriction on the callee, rust wants to be able to restrict the caller. It wants to be able to restrict the caller in such a way that it can know that there are no other references to a variable that's been passed into a function. People call this a "uniqueness type" because you can say "I want this type to be 'unique'" (where 'unique' means not aliased). Honestly this name doesn't make a lot of sense, but it makes a little more sense when you think of it in terms of references. If a reference is unique, then no other reference points to the same object (which is the requirement rust imposes on mutable references). So while a linear type allows you to pass a non-linear variable to a function that expects a linear one, rust doesn't allow you to pass a non-unique variable to a function that expects a unique one.

And adding this requirement to mutations resolves 90% of the issues that make mutability annoying. Mutability becomes challenging to understand when:

1. You have multiple references pointing to the same data in memory.

2. You change the data using one of these references.

3. As a result, the data appears to have changed when accessed through any of the other references.

This simply cannot happen when mutability requires values to not be aliased.

By @classified - 9 months
A comment from the article page:

I blame Haskell and the sudden fixation on absolute purity that manifested just as the VC-startup set decided it was the next secret sauce, to the point of literally redefining what "functional programming" meant in the first place.

I think that fixation has forced a lot of focus on solving for "how do we make Haskell do real work" instead of "how do we make programming in general more predictable and functional", and so the latter fight got lost to "well we bolted lambdas into Java somehow, so it's functional now".

Bullseye.

By @woctordho - 9 months
The lazy clone in HVM may be a viable approach to mutation

By @freeduck - 9 months

By @narski - 9 months
I recently ran into this issue when trying to memoize a simple numerical sequence in Hoon (yes, that Hoon. I know, I know...).

Let's use the Fibonacci sequence as an example. Let's write it the classic, elegant way: f(n) = f(n-1) + f(n-2). Gorgeous. It's the sum of the previous two. With the caveat that f(n=0|1) = n. In Python:

  # fib for basic b's
  def fib(n):
    ## Base case
    if n == 0 or n == 1:
      return n
    
    return fib(n-1) + fib(n-2)
Right off the bat, performance is exponential, roughly O(2^n): every call to f(n-1) will also need to compute f(n-2) anyway! It's a mess. But since Python passes arrays and dictionaries as pointers (cough, sorry! I meant to say references), it's super easy to memoize:

  # optimize-pilled memoize-chad version
  def fib(n, saved={}):
    if n in saved:
      return saved[n]
    
    if n == 0 or n == 1:
      saved[n] = n
    else:
      saved[n] = fib(n-1) + fib(n-2)
    
    return saved[n]
Okay, now our version is nearly as fast as the iterative approach.

This is the normal pattern in most languages: memoizing otherwise "pure" functions is easy because you can reference a shared object, right? Even with multithreading, we're fine, since we have shared memory.

Okay, but in Hoon, there are no pointers! Well, there kinda are. The operating system lets you update the "subject" of your Urbit (the context in which your programs run), and you can do this via the filesystem (Clay) or daemons (Gall agents, which kind of have their own state).

But to do this within a simple function, not relying on fancy OS features? It's totally possible, but a huge pain in the Aslan.

First, here's our bog-standard fib in Hoon:

  |=  n=@ud
  ?:  (lte n 1)
    n
  %+  add
    $(n (dec n))
  $(n (sub n 2))
Now, I memoize on the way down, by calculating just f(n-1) and memoizing those values, to acquire f(n-2):

  :-  %say
  |=  [* [n=@ud ~] [cache=(map @ud @ud) ~]]
  :-  %noun
  ^-  [sum=@ud cache=(map @ud @ud)]
  =/  has-n  (~(get by cache) n)
  ?~  has-n
    ?:  (lte n 1)
      [n (~(put by cache) n n)]
    =/  minus-1  $(n (dec n))
    =/  minus-2 
      =/  search  (~(get by cache.minus-1) (sub n 2))
      ?~  search  0
      (need search)
    :-  (add sum.minus-1 minus-2)
    (~(put by cache.minus-1) n (add sum.minus-1 minus-2))
  [(need has-n) cache]
and that works in the Dojo:

  > =fib-8 +fib 8
  > sum.fib-8
  21
but it sure is easier in Python! And I'm not picking on Hoon here, it's just pure functional programming that makes you think this way - which as a hacker is fun, but in practice is kinda inconvenient.

I even wonder how much faster I actually made things. Let's see:

  > =old now
  > =res +fib 18
  > sum.res
  2.584
  > (sub now old)
  1.688.849.860.263.936
  :: now with the non-memoized code...
  > =before now
  > +fib 18
  2.584
  > (sub now before)
  1.125.899.906.842.624
Ha! My super improved memoized code is actually slower! That's because computing the copies of the map costs more than just recurring a bunch. This math should change if I try to compute a bigger fib number...

Wait. Nevermind. My memoized version is faster. I tested it with the Unix time command. It's just that Urbit Dojo has a weird way of handling time that doesn't match my intuition. Oh well, I guess I can learn how that works. But my point is, thinking is hard, and in Python or JS or C I only have to think in terms of values and pointers. And yes, that comes with subtle bugs where you think you have a value but you really have a pointer! But most of the time it's pretty easy.

Btw sorry for rambling on with this trivial nonsense - I'm a devops guy so this is probably super boring and basic for all you master hn swe's. But it's just a tiny example of the constant frustrations I've had trying to do things that would be super simple if I could just grab a reference and modify something in memory, which for better or worse, is how every imperative language implicitly does things.

By @scrubs - 9 months
This post is great! And so are the comments. Another reason hn rocks.

By @cryptica - 9 months
IMO, Functional Programming was a zero interest rate phenomenon. Some mathematicians suffering from professional deformation believed that programming should adhere to the purity of mathematical conventions... Meanwhile, there was no proof whatsoever to support the hypothesis that constraining programming to such conventions would be beneficial in a practical sense.

FP proponents spotted a small number of problems which arose in certain specific OOP implementations and stubbornly decided that OOP itself was to blame.

Some FP proponents have pointed out that passing around instance references to different parts of the code can lead to 'spooky action at a distance.' For example, if references to the same child instance are held by two different parent modules and they both invoke state-mutating methods on the child instance, it becomes difficult to know which of the two parent modules was responsible for the mutation of state within the child instance...

The mistake of FP proponents is that they failed to recognize the real culprit of the flaws that they've identified. In the case of 'spooky action at a distance', the culprit is pass-by-reference.

If you keep that in mind and consider what OOP pioneers such as Alan Kay have said about OOP "It's about messaging," it becomes an irrefutable fact that this flaw has nothing to do with OOP but is merely a flaw in specific implementations of OOP... Flawed implementations which neglected the messaging aspect of OOP.

To summarize it simply; with OOP, you're not supposed to pass around instances to each other, especially not references. What you're supposed to pass around are messages. The state should be fully encapsulated. Messages are not state, instances are state. Instances shouldn't be moved around across multiple modules, messages should.

If you approach OOP with these principles in mind, and make the effort to architect your systems in such a way, you will see that it solves all the problems that FP claims to solve and it doesn't introduce any of its problems... which are numerous and would require an entire article to enumerate.