July 19th, 2024

Ruby Methods Are Colorless

JP Camara discusses colorless methods in Ruby, exploring synchronous and asynchronous functions without explicit color markers. Ruby's Threads and Fibers support seamless, concurrent programming, contrasting with color-coded functions in other languages.

JP Camara examines colorless methods in Ruby as part of a series on concurrency and asynchronous programming. The discussion builds on the distinction between synchronous (blue) and asynchronous (red) functions, with comparisons to languages like JavaScript and Go. Ruby methods carry no such marker: the same method can run synchronously or concurrently depending on how it is called, with no special keyword in its definition. The article attributes this to Ruby's support for Threads and Fibers, which enable concurrent programming without the complications of color-coded functions. Ruby 3 extended Fibers further through the fiber scheduler. The article contrasts this approach with the function coloring found in languages like JavaScript, and presents Ruby's concurrency model as the more flexible and simpler of the two.
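
As a rough illustration of what "colorless" means in practice, the sketch below calls one ordinary method directly, from threads, and from inside a fiber. The names `slow_double` and `call_three_ways`, and the `sleep` standing in for blocking I/O, are assumptions made for this example and do not come from the article.

```ruby
# One ordinary method; nothing in its definition marks it as sync or async.
def slow_double(n)
  sleep 0.01 # stand-in for a blocking I/O call (assumption for this sketch)
  n * 2
end

# Call the same method three ways and collect the results.
def call_three_ways
  direct   = slow_double(1)                                  # plain call
  threads  = [2, 3].map { |n| Thread.new { slow_double(n) } } # one thread each
  threaded = threads.map(&:value)                            # wait for results
  fibered  = Fiber.new { slow_double(4) }.resume             # inside a fiber
  [direct, threaded, fibered]
end
```

Without a fiber scheduler installed, the fiber call here simply blocks; the point is only that the method itself needs no marker in any of the three cases.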

28 comments
By @vlucas - 7 months
Long-time JS/TS/Node programmer here.

Knowing ahead of time which functions are async is a feature.

It's a big neon sign that says "hey, this function call is expensive". This is a good thing for programmers to easily see and know at the call site.

If you make multiple calls with async/await in a row, the performance issues are plainly obvious at the call site. With "colorless" functions, this information is hidden in a deeper layer. You have to know what the function does on the inside to even know what its performance impacts are.

Also, a nitpick - you can call async functions from sync ones, you just can't access the return value. Sometimes, you don't need to.

By @thechao - 7 months
I've implemented coroutines in C and C++; my preferred multitasking environment is message-passing between processes. I'm not quite sure what the async/await stuff is buying us (I'm thinking C++, here). Like, I get multi-shot stackless coroutines, i.e., function objects, but I don't get why you'd want to orchestrate some sort of temporal Turing pit of async functions bleeding across your code base.

I dunno. Maybe I'm old now?

Anyways; good for Ruby! Async/await just seems very faddish to me: it didn't solve any of the hard multithreading/multiprocessing problems, and introduced a bunch of other issues. My guess is that it was interesting type theory that bled over into Real Life.

By @svieira - 7 months
Anytime this comes up I plug the excellent "Unyielding" (https://glyph.twistedmatrix.com/2014/02/unyielding.html) and "Notes on structured concurrency" (https://vorpus.org/blog/notes-on-structured-concurrency-or-g...) as the counterpoint to "What color is your function". Being able to see structurally what effects your invocation of a function will result in is very helpful in reasoning about the effects of concurrency in a complicated domain.

By @dang - 7 months
Related:

What color is your function? (2015) - https://news.ycombinator.com/item?id=28657358 - Sept 2021 (58 comments)

What Color Is Your Function? (2015) - https://news.ycombinator.com/item?id=23218782 - May 2020 (85 comments)

What Color is Your Function? (2015) - https://news.ycombinator.com/item?id=16732948 - April 2018 (45 comments)

What Color Is Your Function? - https://news.ycombinator.com/item?id=8984648 - Feb 2015 (146 comments)

By @downsplat - 7 months
Translated into simple language, Ruby chose to expose the multithreading paradigm (multiple threads over shared data), like Java and others.

Multithreading is strictly more powerful than single threaded event loops. For some kinds of software there is just no alternative - a modern browser engine for example needs to be multithreaded.

The trade off is that you need to make sure your code is thread safe, which is not trivial as the collection of articles explains. That's your function color right there, green functions are verified thread safe, gray functions are not or not sure.

Personally, in nearly 30 years of programming I've never needed to write multithreaded code. I still haven't found a business need that could not be met with suitable choices between multiprocessing (i.e., fork) and event loops.

I'll definitely take async/await programming over having to worry about concurrent thread safety any day of the week.

By @throw10920 - 7 months
As they should be.

I object to doing what a computer can do for me (in programming), and manually creating separate versions of functions that are identical up to async absolutely falls into that category.

By @pjungwir - 7 months
Great article! I'm looking forward to reading the rest of the series.

I noticed a couple details that seem wrong:

- You are passing `context` to `log_then_get` and `get`, but you never use it. Perhaps that is left over from a previous version of the post?

- In the fiber example you do this inside each fiber:

    responses << log_then_get(URI(url), Fiber.current)

and this outside each fiber:

    responses << get_http_fiber(...)

Something is not right there. It raised a few questions for me:

- Doesn't this leave `responses` with 8 elements instead of 4?

- What does `Fiber.schedule` return anyway? At best it can only be something like a promise, right? It can't be the result of the block. I don't see the answer in the docs: https://ruby-doc.org/3.3.4/Fiber.html#method-c-schedule

- When each fiber internally appends to `responses`, it is asynchronous, so are there concurrency problems? Array is not thread-safe I believe. So with fibers is this safe? If so, how/why? (I assume the answer is "because we are using a single-threaded scheduler", but that would be interesting to put in the post.)
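
The double-append question can be sketched with threads instead of fibers, since `Fiber.schedule` needs a scheduler to be installed first. In this minimal sketch `fetch`, `collect_twice`, and `collect_once` are hypothetical names, `fetch` replaces the article's `log_then_get` with no network access, and the `Mutex` is a conservative choice rather than something the article uses.

```ruby
# Hypothetical stand-in for the article's log_then_get; no network involved.
def fetch(n)
  sleep 0.01
  "response #{n}"
end

# Appending inside the block AND appending the block's handle outside it
# stores two entries per task: the handle and, later, the result.
def collect_twice
  lock = Mutex.new
  responses = []
  threads = 4.times.map do |n|
    thread = Thread.new do
      result = fetch(n)
      lock.synchronize { responses << result } # append from inside the task
    end
    lock.synchronize { responses << thread }   # append the handle outside it
    thread
  end
  threads.each(&:join)
  responses
end

# Keeping only the handles and asking each one for its value gives one
# entry per task, in the order the tasks were started.
def collect_once
  4.times.map { |n| Thread.new { fetch(n) } }.map(&:value)
end
```

Here `collect_twice` ends up with eight entries (four thread objects and four strings), while `collect_once` returns exactly four results without any shared mutation.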

By @clayg - 7 months
I feel like I have some intuitive understanding of how Go achieves colorless concurrency using "go routines" that can park sync/blocking io on a thread "as needed" built into the runtime from the very beginning.

I don't understand how Ruby added this after the fact, globally to ALL potential cpu/io blocking libraries/functions without somehow expressing `value = await coro`

Python is currently going through a "coloring" as the stdlib and 3rd-party libraries adapt to the explicit async/await syntax and it's honestly kind of a PITA. Curious if there's any more info on how Ruby achieved this.
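
One part of the answer can be sketched without any scheduler at all: a Ruby fiber keeps its own call stack, so a method deep inside it can suspend the whole stack and its callers never need a marker. The method names below are hypothetical, and the explicit `Fiber.yield` stands in for what a fiber scheduler's hooks would do on a blocking operation; it is not the actual scheduler mechanism.

```ruby
# Deep in the call stack: suspend the whole fiber until someone resumes it.
def read_chunk
  Fiber.yield(:waiting_for_io) # returns whatever the next resume passes in
end

# An ordinary caller: no await, no special marker.
def handle_request
  "got #{read_chunk}"
end

def run_demo
  fiber  = Fiber.new { handle_request }
  status = fiber.resume          # runs until read_chunk suspends
  result = fiber.resume("bytes") # pretend the I/O finished; continue
  [status, result]
end
```

`handle_request` is suspended and resumed in the middle of its string interpolation, yet its source looks like plain synchronous code.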

By @janci - 7 months
> 3. You can only call a red function from within a red function

The base of most arguments against async. And it's false. You can call red from blue. And you should, sometimes.

By @Rapzid - 7 months
99.9% of these "colored" function articles have an incomplete or even flawed understanding of async/await semantics.

Fibers are not fungible with async/await. This is why structured concurrency is a thing.

By @wiseowise - 7 months
Kotlin solved this pointless debate a long time ago, the moment they released coroutines.

Best of both worlds: you no longer have two functions with ReturnType and Promise<ReturnType>. You just mark a potentially blocking function with suspend and you’re done.

By @fny - 7 months
I'm confused, and please correct me if I'm wrong.

Aren't all these calls blocking? Doesn't `File.read` still block? Sure it's multithreaded, but it still blocks. Threading vs an event loop are two different concurrency models.
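
A small timing sketch may help separate the two ideas. Each call below still blocks the thread it runs on, but several blocked threads can wait at the same time. The helper names are hypothetical and `sleep` stands in for a blocking read such as `File.read` or an HTTP request.

```ruby
# Stand-in for a blocking operation (assumption for this sketch).
def blocking_call
  sleep 0.2
end

# Measure how long a block takes using a monotonic clock.
def elapsed
  start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  yield
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - start
end

# Four blocking calls one after another on the same thread.
def sequential_seconds
  elapsed { 4.times { blocking_call } }
end

# Four blocking calls, each on its own thread, waited on together.
def threaded_seconds
  elapsed { 4.times.map { Thread.new { blocking_call } }.each(&:join) }
end
```

The sequential version takes roughly the sum of the waits, while the threaded version takes roughly the longest single wait. This says nothing about an event loop; it only shows that "blocking" applies per thread.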

By @revskill - 7 months
I loved Ruby as a total beginner but hate it as an experienced programmer.

Colorless brings no meaning when I look at the signature of a method, which is a warning!

Async at the boundary, sync at the core is my favorite paradigm.

By @curtisblaine - 7 months
So maybe it's me, but isn't that line mapping on `&:value` in Ruby the exact equivalent of doing `Promise.all` on a bunch of async functions in JavaScript, with the downside that you don't explicitly say that the array you are calling `value` on is a bunch of asynchronous things that need to be (a)waited for to realize? In other words, since you have color anyway, isn't it better to highlight that upfront rather than hiding it until you need to actually use the asynchronous return values?
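
For readers unfamiliar with the idiom in question, here is a minimal sketch of mapping `&:value` over threads. `gather` is a hypothetical helper; the comparison to `Promise.all` is the commenter's and is only loosely reflected here.

```ruby
# Start one thread per delay, then wait for each thread's return value.
def gather(delays)
  threads = delays.map do |delay|
    Thread.new do
      sleep delay # stand-in for work that finishes at different times
      delay
    end
  end
  # Thread#value joins the thread and returns the block's result, so the
  # output follows the order of the threads, not the order of completion.
  threads.map(&:value)
end
```

Calling `gather([0.03, 0.01, 0.02])` returns the delays in the order given, even though the second thread finishes first.
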
By @mst - 7 months
> Even more onerous, if it isn’t built into your language core like JavaScript/node.js, adding it later means modifying your entire runtime, libraries and codebases to understand it.

Interestingly, while this has proven true of async/await for many languages it has not at all been true for perl.

The pluggable keywords feature lets us register 'async' and 'await' with the parser as (block scoped) imported keywords and with a little suspend/resume trickery you get https://p3rl.org/Future::AsyncAwait which I've been using happily pretty much since it was released (generally operating on https://p3rl.org/IO::Async::Future and https://p3rl.org/Mojo::Promise objects, often both in the same process).

I even wrote https://p3rl.org/PerlX::AsyncAwait as a pure perl proof of concept later on, which injects computed gotos as resume points ala the switch/case trick you can use for resumable functions in C (nobody should really be using that one, mind, I wrote it to prove that I could and as potential fodder for https://p3rl.org/App::FatPacker usage later).

I do very much appreciate there are a lot of reasons one might dislike perl (I've been writing it long enough my list is probably longer than most naysayers') but its sheer malleability as a language remains unusually good.

By @dmux - 7 months

    def log_then_get(url, context)
      puts "Requesting #{url}..."
      get(url, context)
    end
 
    def get(uri, context)
      response = Net::HTTP.get(uri)
      puts caller(0).join("\n")
      response
    end
 
    def get_http_thread(url)
      Thread.new do
        log_then_get(URI(url), Thread.current)
      end
    end

Good example of the downsides of dynamic typing:

1) get_http_thread takes a url (String) and converts it to a URI object

2) log_then_get defines its parameter as `url`, but really it's expecting a URI object

3) get defines its parameter as `uri`, but we're passing it an argument called `url` from within log_then_get.

Lots of traps readily awaiting an unsuspecting programmer or newcomer to a project that contains code like this.
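
One possible way to reduce the trap described above, sketched under assumptions: name the parameter consistently and coerce it at the boundary with `Kernel#URI`, which accepts either a String or a URI object. The `fetch:` keyword is hypothetical and stands in for `Net::HTTP.get` so the sketch runs without network access.

```ruby
require "uri"

# Accept a String or a URI and normalize it once, at the boundary.
# `fetch` is a hypothetical callable standing in for Net::HTTP.get.
def log_then_get(uri, fetch:)
  uri = URI(uri) # returns URI objects unchanged, parses Strings
  puts "Requesting #{uri}..."
  fetch.call(uri)
end
```

With this shape, `log_then_get("https://example.com/a", fetch: some_callable)` and `log_then_get(URI("https://example.com/a"), fetch: some_callable)` hand the same kind of object to the callable.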

By @stephen - 7 months
Maybe it's Stockholm syndrome after ~4-5 years of TypeScript, but I like knowing "this method call is going to do I/O somewhere" (that it's red).

To the point where I consider "colorless functions" to be a leaky abstraction; i.e. I do a lot of ORM stuff, and "I'll just call author.getBooks().get(0) b/c that is a cheap, in-memory, synchronous collection access ... oh wait it's actually a colorless SQL call that blocks (sometimes)" imo led to ~majority of ORM backlash/N+1s/etc.
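
The ORM scenario can be mocked up in a few lines of Ruby. Everything here is hypothetical: the `Author` class, the in-memory Array that stands in for a database table, and the counter that stands in for blocking SQL round trips.

```ruby
# Hypothetical model; the Array stands in for a database table.
class Author
  attr_reader :query_count

  def initialize(stored_titles)
    @stored_titles = stored_titles
    @query_count = 0
  end

  # Looks like a cheap in-memory accessor at the call site...
  def books
    @query_count += 1 # ...but each call stands in for a blocking SQL query
    @stored_titles.dup
  end
end

# Innocent-looking loop that triggers one "query" per iteration.
def first_titles(author, times)
  times.times.map { author.books.first }
end
```

Nothing at the call site of `author.books.first` hints that three iterations cost three queries, which is the kind of hidden cost the comment describes.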

Maybe my preference for "expressing IO in the type system" means in another ~4-5 years, I'll be a Haskell convert, or using Effect.ts to "fix Promise not being a true monad" but so far I feel like the JS Promise/async/await really is just fine.

By @eduction - 7 months
>Because threads share the same memory space they have to be carefully coordinated to safely manage state. Ruby threads cannot run CPU-bound Ruby code in parallel, but they can parallelize for blocking operations

Ugh. I know Ruby (which I used to code in a lot more) has made some real progress toward enabling practical use of parallelism but this still sounds pretty awful.

Is there any effort to make sharing data across threads something that doesn't have to be so "carefully coordinated" (ala Clojure's atom/swap!, ref/dosync)?

Is the inability to parallelize CPU-bound code to do with some sort of GIL?
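
For context, the "careful coordination" in the quoted passage usually looks something like the sketch below, which guards a shared counter with a `Mutex`. The helper name is hypothetical. In CRuby the lock that prevents parallel CPU-bound Ruby code is commonly called the Global VM Lock (GVL); it does not remove the need for this kind of coordination around multi-step updates.

```ruby
# Increment a shared counter from several threads, guarding each update.
def count_with_mutex(thread_count, increments)
  counter = 0
  lock = Mutex.new
  threads = thread_count.times.map do
    Thread.new do
      # `counter += 1` is a read followed by a write, so it is wrapped in
      # the lock to keep the two steps together.
      increments.times { lock.synchronize { counter += 1 } }
    end
  end
  threads.each(&:join)
  counter
end
```

`count_with_mutex(4, 1000)` returns 4000 because every increment happens while holding the lock.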

By @thwarted - 7 months
> Async code bubbles all the way to the top. If you want to use await, then you have to mark your function as async. Then if someone else calling your function wants to use await, they also have to mark themselves as async, on and on until the root of the call chain. If at any point you don’t then you have to use the async result (in JavaScript’s case a Promise<T>).

I find many descriptions of async code to be confusing, and this kind of description is exactly why.

This description is backwards. You don't choose to use await and then decorate functions with async. Or maybe you do and that's why so many async codebases are a mess.

You don't want to block while a long running operation completes, so you decorate the function that performs that operation with async and return a Promise.

But Promises have exactly the same value as promises in the real world: none until they are fulfilled. You can't do further operations on a promise, you can only wait for it to be done, you have to wait for the promise to be fulfilled to get the result that you actually want to operate on.

The folly of relying on a promise is embodied in the character Wimpy from Popeye: "I'll gladly pay you Tuesday for a hamburger today".

Once you have a promise, you have to await on it, turning the async operation into a synchronous operation.

This example seems crazy to me:

    async function readFile(): Promise<string> {
      return await read();
    }

This wraps what should be an async operation that returns a promise (read) in an expression that blocks (await read()), inside a function that returns a promise so that you don't need to block on it! This is a useless wrapper. This kind of construct is probably a significant contributor to the mess: just peppering code with async and await and wrapper functions.

await is the point where an async operation is blocked on to get back into a synchronous flow. Creating promises means you ultimately need to block in a synchronous function to give the single threaded runtime a chance to make progress on all the promises. Done properly, this happens by the event loop. But setting that up requires the actual operation of all your code to be async and thus callback hell and the verbose syntactic salt to even express that in code.

That all being said, this piece is spot on. Threads (in general, but in ruby as the topic of this piece) and go's goroutines encapsulate all this by abstracting over the state management of different threads of execution via stacks. Remove the stacks and async programming requires you to manage that state yourself. Async programming removes a very useful abstraction.

Independent threads of execution, whether they are operating system managed threads, operating system managed processes (a special case of OS managed threads), green threads, or goroutines, are a scheduler abstraction. Async programming forces you to manage that scheduling. Which may be required if you don't also have an abstraction available for preemption, but async leaks the single threaded implementation into your code, and the syntactic salt necessary to express it.

By @hosh - 7 months
So is Erlang/Elixir colorless or do those function calls have color?

By @lowbloodsugar - 7 months
Only having blue functions is not the same as being colorless.

By @moralestapia - 7 months
The argument for "color"ed functions in JavaScript is flawed and comes from somebody with a (very) shallow understanding of the language.

JavaScript is as "colorless" as any other programming language. You can write "async"-style code without using async/await at all while remaining functionally equivalent.

Async/await is just syntactic sugar that saves you from writing "thenable" functions and callbacks again and again and again ...

Instead of:

    function(...).then(function() {
      // continue building your pyramid of doom [1]
    })

... you can just do:

    await function()

That's literally it.

1: https://en.wikipedia.org/wiki/Pyramid_of_doom_(programming)

By @lofaszvanitt - 7 months
Ruby is like Japanese candlestick charts, both of them are absolute bullshit, idiotic things.

By @bilalq - 7 months
Other languages may handle it differently, but having to manage threads is not a small compromise for going colorless. You're now forced to deal with thread creation, thread pooling, complexities of nested threads, crashes within child or descendant threads, risks of shared state, more difficult unit testing, etc.
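
As one concrete case from that list, a crash inside a child thread does not interrupt the parent by default; it surfaces only when the parent joins the thread or asks for its value. The sketch below uses a hypothetical `run_child` helper and turns off `report_on_exception` purely to keep the demo's output quiet.

```ruby
# Start a child thread that fails, then observe the failure in the parent.
def run_child
  child = Thread.new do
    Thread.current.report_on_exception = false # keep stderr quiet for the demo
    raise ArgumentError, "child failed"
  end
  child.value # joins the thread and re-raises its exception here
rescue ArgumentError => e
  "parent saw: #{e.message}"
end
```

If the parent never calls `value` or `join`, the exception in this sketch would go unnoticed, which is part of the management burden the comment describes.
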
By @BiteCode_dev - 7 months
I don't like colored functions for obvious reasons, but fully colorless for async means you don't know when things are async or not.

There are a lot of things I dislike in JS, but I think the I/O async model is just right from an ergonomics point of view. The event loop is implicit, any async function returns a promise, you can deal with promises from inside sync code without much trouble.

It's just the right balance.