July 4th, 2024

Sans-IO: The secret to effective Rust for network services

Firezone utilizes connlib, a Rust library for managing network connections and WireGuard tunnels sans-IO. This approach enhances testing, customization, and functionality assurance, promoting efficient and flexible network services development.

Read original articleLink Icon
Sans-IO: The secret to effective Rust for network services

At Firezone, a connectivity library named connlib is used to manage network connections and WireGuard tunnels securely. The library is designed with a sans-IO approach in Rust, emphasizing state machines over direct socket interactions. This design choice allows for better testing, customization, and assurance of functionality. The post discusses the benefits of sans-IO design in Rust for network services, highlighting its compatibility with async models and addressing challenges like the "function colouring" debate. By abstracting away IO operations and introducing abstractions like Transmit and event loops, developers can create efficient and flexible network services. The article also delves into implementing a STUN binding protocol using sans-IO principles, showcasing how to handle input, transmit data, and abstract time for protocol operations. Overall, the sans-IO approach in Rust enables cleaner, more modular, and testable network service implementations.

Related

Interface Upgrades in Go (2014)

Interface Upgrades in Go (2014)

The article delves into Go's interface upgrades, showcasing their role in encapsulation and decoupling. It emphasizes optimizing performance through wider interface casting, with examples from io and net/http libraries. It warns about complexities and advises cautious usage.

Remembering the LAN (2020)

Remembering the LAN (2020)

The article discusses the shift from simple LAN setups in the 1990s to complex modern internet programming. It highlights DIY solutions for small businesses and envisions a future merging traditional LAN environments with modern technologies.

SquirrelFS: Using the Rust compiler to check file-system crash consistency

SquirrelFS: Using the Rust compiler to check file-system crash consistency

The paper introduces SquirrelFS, a crash-safe file system using Rust's typestate pattern for compile-time operation order enforcement. Synchronous Soft Updates ensure crash safety by maintaining metadata update order. SquirrelFS offers correctness guarantees without separate proofs, quickly verifying crash consistency during compilation. Comparative evaluations show SquirrelFS performs similarly or better than NOVA and WineFS.

Download Accelerator – Async Rust Edition

Download Accelerator – Async Rust Edition

This post explores creating a download accelerator with async Rust, emphasizing its advantages over traditional methods. It demonstrates improved file uploads to Amazon S3 and provides code for parallel downloads.

The Inconceivable Types of Rust: How to Make Self-Borrows Safe

The Inconceivable Types of Rust: How to Make Self-Borrows Safe

The article addresses Rust's limitations on self-borrows, proposing solutions like named lifetimes and inconceivable types to improve support for async functions. Enhancing Rust's type system is crucial for advanced features.

Link Icon 18 comments
By @ComputerGuru - 5 months
This is billed as something revolutionary and forward progress but that’s exactly how we used to do async in $lang - including Rust - before language support for async/await landed.

The biggest productivity boost to my rust embedded firmware development was when I could stop manually implementing state machines and marshalling all local variables into custom state after custom state between each I/O operation snd let rust do that for me by using async/await syntax!

That’s, after all, what async desugars to in rust: an automatic state machine that saves values across I/O (await) points for you.

By @zamalek - 5 months
I had been mulling over this problem space in my head, and this is a seriously great approach to the direction I have been thinking (though still needs work, footnote 3 in the article).

What got me thinking about this was the whole fn coloring discussion, and a happy accident on my part. I had been writing a VT100 library and was doing my head in trying to unit test it. The problem was that I was essentially `parser::new(stdin())`. During the 3rd or 4th rewrite I changed the parser to `parser::push(data)` without really thinking about what I was doing. I then realized that Rust was punishing me for using an enterprise OOPism anti-pattern I have since been calling "encapsulation infatuation." I now see it everywhere (not just in I/O) and the havoc it wreaks.

The irony is that this solution is taught pre-tertiary education (and again early tertiary). The simplest description of a computer is a machine that takes input, processes/transforms data, and produces output. This is relevant to the fn coloring discussion because only input and output need to be concerned with it, and the meat-and-potatoes is usually data transformation.

Again, this is patently obvious - but if you consider the size of the fn coloring "controversy;" we've clearly all been missing/forgetting it because many of us have become hard-wired to start solving problems by encapsulation first (the functional folks probably feel mighty smug at this point).

Rust has seriously been a journey of more unlearning than learning for me. Great pattern, I am going to adopt it.

Edit: code in question: https://codeberg.org/jcdickinson/termkit/src/branch/main/src...

By @ziziman - 5 months
How does this design compare to using channels to send data to a dedicated handlers. When using channels i've found multiple issues: (1) Web-shaped code that is often hard to follow along (2) Requires to manually implement message types that can then be converted to network-sendable messages (3) Requires to explicitly give a transmitter to interested/allowed entities (4) You get a result if your channel message failed to transmit but NOT if your message failed to transmit over network

But besides that it's pretty convenient. Let's say you have a ws_handler channel, you just send your data through that and there is a dedicated handler somewhere that may or may not send that message if it's able to.

By @hardwaresofton - 5 months
See also: monads and in particular the Free(r) monad, and effects systems[0].

The idea of separating logic from execution is a whole thing, well trodden by the Haskell ecosystem.

[EDIT] Also, they didn't mention how they encapsulated the `tokio::select!` call that shows up when they need to do time-related things -- are they just carrying around a `tokio::Runtime` that they use to make the loop code async without requiring the outside code to be async?

[EDIT2] Maybe they weren't trying to show an encapsulated library doing that, but rather to show that the outside application can use the binding in an async context...

I would have been more interested in seeing how they could implement an encapsulated function in the sans-IO style that had to do something like wait on an action or a timer -- or maybe the answer they're expecting there is just busy-waiting, or carrying your own async runtime instance (that can essentially do the busy waiting for you, with something like block_in_place.

[0]: https://okmij.org/ftp/Computation/free-monad.html

By @r3trohack3r - 5 months
Oh hey thomaseizinger!

I got half way through this article feeling like this pattern was extremely familiar after spending time down inside rust-libp2p. Seems like that wasn't a coincidence!

Firezone looks amazing, connect all the things!

By @amluto - 5 months
> Also, sequential workflows require more code to be written. In Rust, async functions compile down to state machines, with each .await point representing a transition to a different state. This makes it easy for developers to write sequential code together with non-blocking IO. Without async, we need to write our own state machines for expressing the various steps.

Has anyone tried to combine async and sans-io? At least morally, I ought to be able to write an async function that awaits sans-io-aware helpers, and the whole thing should be able to be compiled down to a state machine inside a struct with a nice sans-io interface that is easily callable by non-async code.

I’ve never tried this, but the main issues I would forsee would be getting decent ergonomics and dealing with Pin.

By @ethegwo - 5 months
Good job! Exposing state could make any async function 'pure'. All the user needs to do is push the state machine to the next state. I have tried to bind OpenSSL to async Rust before, its async API follows a similar design.
By @mgaunard - 5 months
This is just normal asynchronous I/O with callbacks instead of coroutines.
By @mpweiher - 5 months
Reading the article and some of the comments, it sounds like they reinvented the hexagonal or ports/adapters architectural style?
By @Uptrenda - 5 months
I don't know what the take away is supposed to be here. Everything spoken about here is already basic network programming. It seems to focus on higher level plumbing and geeks out on state management even though this is just a matter of preference and has nothing to do with networking.

The most interesting thing I learned from the article is that cloudflare runs a public stun server. But even that isn't helpful because the 'good' and 'useful' version of the STUN protocol is the first version of the protocol which supports 'change requests' -- a feature that allows for NAT enumeration. Later versions of the STUN protocol removed that feature thanks to the 'helpful suggestions' of Cisco engineers who contributed to the spec.

By @screcth - 5 months
It would be better if the compiler could take the async code and transform it automatically to its sans io equivalent. Doing it manually seems error prone and makes it much harder to understand what the code is doing.
By @tmd83 - 5 months
Does the actual traffic goes through the gateway or the gateway is only used for setting up the connection?
By @ibotty - 5 months
That's just an initial encoding as described e.g. here: https://peddie.github.io/encodings/encodings-text.html

Am I missing something?

By @Animats - 5 months
"... be it from your Android phone, MacOS computer or Linux server. "

Why would you want this in a client? It's not like a client needs to manage tens of thousands of connections. Unless it's doing a DDOS job.

By @Arnavion - 5 months
See also this discussion from a few months ago about sans-io in Rust: https://news.ycombinator.com/item?id=39957617
By @cryptonector - 5 months
> This pattern isn't something that we invented! The Python world even has a dedicated website about it.

I mean, this is basically what the IO monad and monadic programming in Haskell end up pushing Haskell programmers to do.

By @joshka - 5 months
This article / idea really refactors two things out of some IO code

- the event loop

- the state machine of data states that occur

But async rust is already a state machine, so the stun binding could be expressed as a 3 line async function that is fairly close to sans-io (if you don't consider relying on abstractions like Stream and Sink to be IO).

    async fn stun(
        server: SocketAddr,
        mut socket: impl Sink<(BindingRequest, SocketAddr), Error = color_eyre::Report>
            + Stream<Item = Result<(BindingResponse, SocketAddr), color_eyre::Report>>
            + Unpin
            + Send
            + 'static,
    ) -> Result<SocketAddr> {
        socket.send((BindingRequest, server)).await?;
        let (message, _server) = socket.next().await.ok_or_eyre("No response")??;
        Ok(message.address)
    }
If you look at how the underlying async primitives are implemented, they look pretty similar to what you;ve implemented. sink.send is just a future for Option<SomeMessage>, a future is just something that can be polled at some later point, which is exactly equivalent to your event loop constructing the StunBinding and then calling poll_transmit to get the next message. And the same goes with the stream.next call, it's the same as setting up a state machine that only proceeds when there is a next item that is being fed to it. The Tokio runtime is your event loop, but just generalized.

Restated simply: stun function above returns a future that that combines the same methods you have with a contract about how that interacts with a standard async event loop.

The above is testable without hitting the network. Just construct the test Stream / Sink yourself. It also easily composes to add timeouts etc. To make it work with the network instead pass in a UdpFramed (and implement codecs to convert the messages to / from bytes).

Adding timeout can be either composed from the outside caller if it's a timeout imposed by the application, or inside the function if it's a timeout you want to configure on the call. This can be tested using tokio test-utils and pausing / advancing the time in your tests.

---

The problem with the approach suggested in the article is that it splits the flow (event loop) and logic (statemachine) from places where the flow is the logic (send a stun binding request, get an answer).

Yes, there's arguments to be made about not wanting to use async await, but when you effectively create your own custom copy of async await, just without the syntactic sugar, and without the various benefits (threading, composability, ...), it's worth considering whether you could use async instead.