Local First, Forever
Local-first software emphasizes storing data on the user's device and only occasionally syncing over the internet. Benefits include user control over data; challenges include syncing between devices and depending on the vendor staying in business. Using commodity cloud file-sync services like Dropbox, combined with CRDTs for conflict handling, is recommended for simplicity and reliability.
The article discusses the concept of local-first software, which prioritizes keeping data on the user's device while occasionally syncing with the internet for various purposes. The author highlights the benefits of local-first software for end-users but points out challenges related to syncing data between devices and potential issues if the company providing the software goes out of business. The solution proposed involves using widely available cloud-based file-syncing services like Dropbox to ensure continuous syncing and data accessibility. The article explores different versions of implementing syncing, including using Conflict-free Replicated Data Types (CRDTs) to handle conflicts efficiently. It emphasizes the simplicity and reliability of using basic file-sync services for local-first applications, even though they may lack advanced features found in custom solutions. The conclusion suggests that while basic file-sync services may not offer real-time syncing, they provide a practical and reliable solution for casual sync needs, ensuring data availability across devices.
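The approach the article lands on can be sketched with off-the-shelf pieces: each device writes its own replica file into a folder managed by Dropbox (or any file-sync service) and merges everyone else's replicas on load, letting a CRDT library handle conflicts. Below is a minimal sketch of that idea using Automerge; the folder path, file naming, and document shape are illustrative assumptions, not the article's actual code.

```ts
// Sketch: per-device CRDT replica files inside a Dropbox-style synced folder.
// Assumes @automerge/automerge; paths and the document shape are illustrative.
import * as fs from "node:fs";
import * as path from "node:path";
import * as Automerge from "@automerge/automerge";

type Notes = { notes: { [id: string]: string } };

const syncDir = "/Users/me/Dropbox/my-app"; // any synced folder
const myFile = path.join(syncDir, `replica-${process.env.DEVICE_ID ?? "a"}.automerge`);

// Load my own replica (or start fresh), then merge every other device's file.
function loadMerged(): Automerge.Doc<Notes> {
  let doc = fs.existsSync(myFile)
    ? Automerge.load<Notes>(fs.readFileSync(myFile))
    : Automerge.from<Notes>({ notes: {} });
  for (const f of fs.readdirSync(syncDir)) {
    const full = path.join(syncDir, f);
    if (f.endsWith(".automerge") && full !== myFile) {
      doc = Automerge.merge(doc, Automerge.load<Notes>(fs.readFileSync(full)));
    }
  }
  return doc;
}

// Apply a local edit and write only my own file; other devices never write to it,
// so the sync service has nothing to conflict on at the file level.
function addNote(doc: Automerge.Doc<Notes>, id: string, text: string) {
  const next = Automerge.change(doc, d => { d.notes[id] = text; });
  fs.writeFileSync(myFile, Automerge.save(next));
  return next;
}
```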
For example, we're building a local-first multiplayer "IDE for tasks and notes" [1] where simply syncing flat files won't work well for certain features we want to offer like real-time collaboration, permission controls and so on.
In our case we'll simply allow users to "eject" at any time by saving their "workspace.zip" (which contains all state serialized into flat files), downloading a "server.exe/.bin", and switching to self-hosting the backend if they want (or vice versa).
Obsidian's model seems nice: base app is free, and then payment for the networked portions like sync+publish. However, there's little data available on how well this works and how big of a TAM you need to make it sustainable. Or if it's even possible without an enterprise revenue channel.
For those interested in building robust local-first + collaborative apps, I've been using Yjs for a few years now and have overall really enjoyed it. Multi-master collaboration also poses some stimulating technical and design challenges if you're looking for new frontiers beyond the traditional client-server model.
Where is your ground truth? How collaborative is a given resource? How are merge conflicts (or any overlapping interactions) handled? Depending on your answers, CRDTs might be the wrong tool.
Please don't forget about straightforward replicated state machines. They can be very easy to reason about and to scale, although they require bespoke implementations. A centralized server can validate and enforce business logic, solve merge conflicts, etc. Figma uses a centralized server because their ground truth may not be local.[1]
If you try a decentralized state machine approach the implementation is undoubtedly going to be more complex and difficult to maintain. However, depending on your data and interaction patterns, they still might be the better choice over CRDTs.
It could be argued that even for this example, two local-first clients editing the same file should not be automatically merged with a CRDT. One could make the case that the slower client should rename their file (fork it), merge any conflicts, or overwrite the file altogether. A centralized server could enforce these rules and further propagate state changes after resolution.
[1] https://www.figma.com/blog/how-figmas-multiplayer-technology...
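For contrast with CRDT merging, here is a minimal sketch of the centralized replicated-state-machine approach described above: a toy server validates each operation, assigns it a global sequence number, and clients apply the log strictly in order, so every replica converges and business rules live in one place. All types and rules here are illustrative.

```ts
// Sketch: the server is the single ordering/validation point; clients replay its log.
type Op = { kind: "rename" | "edit"; fileId: string; payload: string };
type Entry = { seq: number; op: Op };

class CentralServer {
  private log: Entry[] = [];

  // Validate business rules, then append with a global sequence number.
  submit(op: Op): Entry | { rejected: string } {
    if (op.kind === "rename" && op.payload.trim() === "") {
      return { rejected: "file name must not be empty" }; // server enforces rules
    }
    const entry = { seq: this.log.length + 1, op };
    this.log.push(entry);
    return entry;
  }

  // Clients poll (or get pushed) everything after the last entry they applied.
  entriesAfter(seq: number): Entry[] {
    return this.log.filter(e => e.seq > seq);
  }
}

class Client {
  private appliedSeq = 0;
  files = new Map<string, string>();

  sync(server: CentralServer) {
    for (const { seq, op } of server.entriesAfter(this.appliedSeq)) {
      // Deterministic application in sequence order => all clients converge.
      this.files.set(op.fileId, op.payload);
      this.appliedSeq = seq;
    }
  }
}
```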
IIRC the vision was that all applications could implement this and you could provide each application with your remotestorage URL, which you could self-host.
I looked into this some time ago, as I was fed up with WebDAV being the only viable open protocol for file shares/synchronization (especially after hosting my own NextCloud instance, which OOMed because the XML blobs it wanted to create as a response for a large folder used too much memory), and found it through this gist [0], which was a statement about Flock [1] shutting down.
It looks like a cool and not that complex protocol, but all the implementations seem to be unmaintained.
And the official JavaScript client [2] seems, ironically, to be used mostly to access Google Drive or Dropbox.
Remotestorage also has an internet draft https://datatracker.ietf.org/doc/draft-dejong-remotestorage/ which is relatively easy to understand and not very long.
[0] https://gist.github.com/rhodey/873ae9d527d8d2a38213
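For a feel of the protocol, the internet draft describes plain HTTP on paths under a per-user storage root, with OAuth bearer tokens and ETag-based conditional requests. The sketch below shows roughly that shape with fetch; the storage root and token are placeholders (normally discovered via WebFinger and OAuth), and the exact details should be checked against the draft.

```ts
// Sketch: the general shape of remoteStorage document access per the internet draft.
// storageRoot and token are placeholders; a real client obtains them via WebFinger/OAuth.
const storageRoot = "https://storage.example.com/alice";
const token = "OAUTH_BEARER_TOKEN";

// Read a document; the response carries an ETag identifying the current version.
async function getDoc(docPath: string): Promise<{ body: string; etag: string | null }> {
  const res = await fetch(`${storageRoot}/${docPath}`, {
    headers: { Authorization: `Bearer ${token}` },
  });
  if (!res.ok) throw new Error(`GET failed: ${res.status}`);
  return { body: await res.text(), etag: res.headers.get("ETag") };
}

// Write a document only if it hasn't changed since we last saw it (conditional
// request), so two devices don't silently overwrite each other.
async function putDoc(docPath: string, body: string, lastSeenEtag: string | null) {
  const res = await fetch(`${storageRoot}/${docPath}`, {
    method: "PUT",
    headers: {
      Authorization: `Bearer ${token}`,
      "Content-Type": "application/json",
      ...(lastSeenEtag ? { "If-Match": lastSeenEtag } : { "If-None-Match": "*" }),
    },
    body,
  });
  if (res.status === 412) throw new Error("Conflict: document changed remotely");
  if (!res.ok) throw new Error(`PUT failed: ${res.status}`);
}
```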
1. Easier to develop - Sync layer handles all the tricky stuff, no need for translation layer between server and client.
2. Better user experience - Immediate UI feedback, no network dependency etc.
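Point 2 usually comes from writing to a local store first and letting a background job push changes whenever the network happens to be around. A minimal, framework-agnostic sketch (all names are illustrative):

```ts
// Sketch: apply changes locally and immediately; sync is a best-effort background job.
type Mutation = { id: string; table: string; data: unknown };

const local = new Map<string, unknown>();   // stand-in for IndexedDB/SQLite
const outbox: Mutation[] = [];              // pending mutations to push later

// The UI calls this and re-renders right away; no await on the network.
function applyLocally(m: Mutation) {
  local.set(`${m.table}/${m.id}`, m.data);
  outbox.push(m);
}

// Runs periodically in the background; failures just leave mutations queued.
async function flushOutbox(endpoint: string) {
  while (outbox.length > 0) {
    const m = outbox[0];
    try {
      const res = await fetch(endpoint, { method: "POST", body: JSON.stringify(m) });
      if (!res.ok) return;   // try again later
      outbox.shift();        // acknowledged by the server, drop from the queue
    } catch {
      return;                // offline: keep the queue, the UI already shows the change
    }
  }
}
```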
I suspect there will be a major tide shift within the next year or two when a local first framework with the developer experience similar to Nuxt or Next comes about. The Rails of local first.
I can't recommend enough the localfirst.fm podcast which has been a great introduction to the people and projects in the space: https://www.localfirst.fm/
> It’s perfect — many people already have it. There are multiple implementations, so if Microsoft or Apple go out of business, people can always switch to alternatives. File syncing is a commodity.
This doesn’t work for collaborative software. It’s also highly questionable for realtime software like chat. That’s a solution looking for a problem.
There is exciting movement in the space but imo people focus too much on CRDTs, seemingly in the hopes of optimal solutions to narrow problems.
What we need is easy-to-use identity management, collaborative features without vendor lock-in and, most importantly, a model that supports small-to-medium-sized businesses that want to build apps while making a living.
If something happens to the provider, or they decide they don't like you or your files, your data is gone. Worse than gone, because you still have the empty proxies -- the husks of your files.
I personally know of more than one instance where seemingly innocuous data triggered some automated system at Dropbox and the user was locked out of their files without recourse.
If you're using cloud storage, make *absolutely certain* you have it set to download all files. If your cloud storage exceeds the drive space of a laptop (small businesses, etc), get a cheap dedicated PC and a big drive, then set up at least one dedicated cloud mirror.
Local-first cloud storage is great, but the potential for catastrophic data-loss is not even remotely close to zero.
But I have used this app every week since, and one of the lessons is that operations-based files grow pretty quickly. If you want to keep sync times short and bandwidth usage to a minimum, you have to consider how you keep read and write times to a minimum. I use localStorage for the client-side copy, and reaching the 5 MB quota isn't that hard either. These things can be solved, but you have to consider them during the design phase.
So yes, it's cool stuff, but the story isn't over with using automerge and op-based files.
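One way to keep those files and the client-side copy small, assuming Automerge is the CRDT in use: persist a compacted snapshot (Automerge.save) instead of an ever-growing list of individual change payloads, and keep an eye on the encoded size before writing to localStorage. A sketch; the key and the soft limit are illustrative.

```ts
// Sketch: persist a compacted Automerge snapshot client-side and watch its size.
import * as Automerge from "@automerge/automerge";

const KEY = "my-app-doc";
const SOFT_LIMIT_BYTES = 4 * 1024 * 1024; // stay under the ~5 MB localStorage quota

function persist<T>(doc: Automerge.Doc<T>) {
  // save() encodes the whole document (history included) in a compact binary form,
  // which is usually much smaller than accumulating per-change payloads.
  const bytes = Automerge.save(doc);
  if (bytes.byteLength > SOFT_LIMIT_BYTES) {
    console.warn(`Document is ${bytes.byteLength} bytes; consider splitting or archiving.`);
  }
  // localStorage only stores strings, so base64-encode the binary snapshot.
  let binary = "";
  for (const b of bytes) binary += String.fromCharCode(b);
  localStorage.setItem(KEY, btoa(binary));
}

function restore<T>(): Automerge.Doc<T> | null {
  const stored = localStorage.getItem(KEY);
  if (!stored) return null;
  const bytes = Uint8Array.from(atob(stored), c => c.charCodeAt(0));
  return Automerge.load<T>(bytes);
}
```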
At least for myself, I barely use any local-first software. The software that I do use that is local in any important sense of the word is basically local-only software. I realize this every time I lose connection on my phone: it becomes little more than a pretty bad camera compared to my Sony.
I live in a country where I have good 3G speeds pretty much everywhere, so internet connectivity is never an issue, not even on moving things like trains or boats. The very few times I have been flying or whatever, I simply don't do any work because it's usually uncomfortable anyway.
This is the main reason I don't really care about local-first and have been diving into Phoenix LiveView the last couple of weeks. The productivity boost I get and the cool realtime apps it empowers me to build are more important to me than the dream of making local-first web apps. A realtime demo of things updating with multiplayer functionality is a far easier sell than "look, the app works even when I turn on flight mode". And honestly, like 99% of the time, it is also more useful.
I have done local-first web apps before and it is always such a pain, because syncing is a near-impossible problem to solve. What happens if you and someone else made changes to the same things two hours or more ago? Who even remembers which value is correct? How do you display the diffs?
No matter what you do, you probably need to implement some kind of diffing functionality, and you need to implement revisions because the other guy will complain that his changes were overwritten, and so on. There are just so many issues that are very hard to solve and require so much work that it isn't worth it unless you are a large team with a lot of resources. You end up with a complicated mess of code that is like git, but worse in every way.
It's easier to simply say the app doesn't work offline because we rarely are offline and no one will pay for the effort required. Unfortunately.
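If you do go down that road, the revision bookkeeping doesn't have to be git-like. A crude last-write-wins field that keeps the losing value around already answers the "my change was overwritten" complaint; a sketch with illustrative types:

```ts
// Sketch: last-write-wins per field, but keep overwritten values so users can recover them.
type Revision<T> = { value: T; updatedAt: number; author: string };

class Field<T> {
  current: Revision<T>;
  history: Revision<T>[] = [];   // older/losing revisions, shown in a "diff" or history view

  constructor(initial: Revision<T>) { this.current = initial; }

  // Accept a remote or local write; the older write loses but is not thrown away.
  apply(incoming: Revision<T>) {
    if (incoming.updatedAt >= this.current.updatedAt) {
      this.history.push(this.current);
      this.current = incoming;
    } else {
      this.history.push(incoming);
    }
  }
}

// Usage: both edits survive; the UI can show the history to whoever "lost".
const title = new Field({ value: "Q3 plan", updatedAt: 1, author: "alice" });
title.apply({ value: "Q3 planning", updatedAt: 5, author: "bob" });
title.apply({ value: "Q3 roadmap", updatedAt: 3, author: "alice" });    // older: goes to history
console.log(title.current.value);             // "Q3 planning"
console.log(title.history.map(r => r.value)); // ["Q3 plan", "Q3 roadmap"]
```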
Not saying that's not iPhone's fault, but I doubt any of this works on that platform
I don't think this is true. Granted, there are some big challenges to transferring data between devices without a central server, but there are several projects like https://dxos.org/ which use p2p, and there's also https://ditto.live/ which uses Bluetooth/Wi-Fi Direct for cases where all users will be in the same room or on the same local network (imagine wanting to play chess with a friend sitting in a different row on a plane without wifi - I was in this situation recently and was pretty surprised that I couldn't find anything on the app store that could do this!)
Of course, most of the time it's better to have a server, because p2p still has a lot of difficulties and having a central 'source of truth' is often worth the costs that come with a server-based architecture. So imo things like https://electric-sql.com/ or https://www.triplit.dev/ or the upcoming https://zerosync.dev/ will be far better choices for anyone wanting to build a local-first app used by many users.
Sync services haven't evolved much. I guess a service that would provide lower-level APIs and different data structures (CRDTs, etc.) would be a hacker's dream. Also, E2EE would be nice.
And if they closed the shop, I would have all the files on my devices.
> Dropbox and other file-sync services, while very basic, offer enough to implement it in a simple but working way.
That's how I use KeePassXC. I put the .kdbx file in Seafile, and have it on all my devices. Works like a charm.
[0] "FDB - a reactive database environment for your files" https://www.youtube.com/watch?v=EvAFEC6n7NI
It's vexing how many tools just assume always-on connectivity. I don't want a tasks-and-notes tool that I need to run in a browser. I want that data local, and I may want to sync it, but it should work fine (other than sync) without the Internet.
This is also true for virtually every other data tool I use.
However, I like the abstractions of CRDTs, and libs like Automerge can solve most of the problems. If you must handle all types of files, just be prepared to ask the user to resolve conflicts by hand.
Let's say Beancount (my preferred personal finance software) disappears one day. So far I could switch to Ledger or hledger; the switch demands a bit of rg/sed work, but it's doable. If, say, Firefly disappears, I still have my data, but migrating it to something else is a nightmare. Of course such events are slow: if the upstream suddenly disappears, the local software still works, but after some time it will break due to environmental changes around it.
With classic FLOSS tools that's a limited problem: tools are simple, without many dependencies, and they are normally developed by a broad community. Modern tools tend to be the opposite, with a gazillion deps, often in https://xkcd.com/2347/ mode.
My digital life is almost entirely in Emacs, and the chances of Emacs disappearing are objectively low; even though it has a very big codebase, there aren't many easy-to-break deps. BUT if I decide to go the modern path, say putting most of my files in Paperless instead of org-attaching them, and using them via Dokuwiki or something else instead of via org-mode note links, there's a much bigger chance something breaks, and even though I still own everything, my workflow ceases to exist quickly. Recovery would be VERY hard. Yes, Paperless in the end stores files on a file system and I can browse them manually, and Zim has essentially the same markup as Dokuwiki so I can import the wiki, but all links will be broken, and there is no direct quick text tweak I can apply to reconstruct http links to the filesystem. With org-attach I can, even though it uses a cache-like tree that isn't really human-readable.
Anyway, to have personal guarantees of ownership of our digital life, local-first and sync are not the only main points. The corollary is that we need the old desktop model ("an OS like a single program, indefinitely extensible") to be safe; it's more fragile at the small-potatoes level, but it's much more resilient in the long run.
A repository as a file is self-contained, tracks changes by itself, and is, therefore, free from vendor lock-in. Here is my draft RFC https://docs.google.com/document/d/1sma0kYRlmr4TavZGa4EFiNZA...
It also means it would be fairly trivial to allow users/orgs to host their own “backend” as well.
Is the goal to sell mainframes? Then tell customers than thin clients powered by a mainframe allow for easy collaboration, centralized backups and administration, and lower total cost of ownership.
Do you want recurring SaaS revenue? Then tell customers that they don't want the hassle of maintaining a complicated server architecture, that security updates mean servers need constant maintenance, and that integrating with many 3rd party SaaS apps makes cloud hosting the logical choice.
We're currently working on a Local First (and E2EE) app that syncs with CRDTs. The server has been reduced to a single Go executable that more or less broadcasts the mutation messages to the different clients when they come online. The tech is very cool and it's what we think makes the most sense for the user. But what we've also realized is that by architecting our software like this we have torpedoed our business model. Nobody is going to pay $25 per seat per month when it's obvious that the app runs locally and not that much is happening on the server side.
Local First, Forever is good for the user. Open data formats are good for the user. Being able to self-host is good for the user. But I suspect it will be very difficult to make software like this profitably. Adobe's stock went 20x after they adopted a per seat subscription model. This Local First trend, if it is here to stay (and I hope it will be) might destroy a lot of SaaS business models.
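The relay the commenter describes is a Go binary; the same shape, sketched in TypeScript with the ws package for consistency with the other examples here, is just a fan-out of opaque (encrypted) mutation messages to the other connected clients. Buffering for clients that are currently offline is left out.

```ts
// Sketch: a tiny relay that fans client messages out to all other connected clients.
// The E2EE payloads are opaque to the server; persistence for offline clients is omitted.
import WebSocket, { WebSocketServer } from "ws";

const wss = new WebSocketServer({ port: 8080 });

wss.on("connection", (socket) => {
  socket.on("message", (data) => {
    // Forward the (encrypted) mutation to every other open connection.
    for (const client of wss.clients) {
      if (client !== socket && client.readyState === WebSocket.OPEN) {
        client.send(data);
      }
    }
  });
});

console.log("relay listening on ws://localhost:8080");
```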
Here's what I learned by doing that:
1. Firstly - and this is kinda obvious but often left unarticulated - this pattern more or less requires a desktop app. Most developers no longer have any experience of making these. In particular, distribution is harder than on the web. That experience is what eventually inspired me to make Conveyor, my current product, which makes deploying desktop apps waaaaay easier (see my bio for a link) and in particular lets you have web style updates (we call them "aggressive updates"), where the app updates synchronously on launch if possible.
2. Why do you need aggressive updates? Because otherwise you have to support the entire version matrix of every version you ever released interacting with every other version. That's very hard to test and keep working. If you can keep your users roughly up to date, it gets a lot simpler and tech debt grows less fast. There are no update engines except the one in Conveyor that offers synchronous updates, and Lighthouse predated Conveyor, so I had to roll my own update engine. Really a PITA.
3. Users didn't understand/like the file sharing pattern. Users don't like anything non-standard that they aren't used to, but they especially didn't like this particular pattern. Top feature request: please make a server. All that server was doing was acting as a little DropBox like thing specialized for this app, but users much preferred it even in the cryptocurrency/blockchain world where everyone pretends to want decentralized apps.
4. It splits your userbase (one reason they don't like it). If some users use DropBox and others use Google Drive and others use OneDrive, well, now everyone needs to have three different drive accounts and apps installed.
5. Users expect to be able to make state changes that are reflected immediately on other people's screens e.g. when working together on the phone. Drive apps aren't optimized for this and often buffer writes for many minutes.
You don't really need this pattern anyway. If you want to make an app that works well then programmer time is your biggest cost, so you need a business model to fund that and at that point you may as well throw in a server too. Lighthouse was funded by a grant so didn't have that issue.
Re: business models. You can't actually just sell people software once and let them use it forever anymore, that's a completely dead business model. It worked OK in a world where people bought software on a CD and upgraded their OS at most every five years. In that world you could sell one version with time-limited support for one year, because the support costs would tend to come right at the start when users were getting set up and learning the app. Plus, the expectation was that if you encountered a bug you just had to suck it up and work around it for a couple of years until you bought the app again.
In a world where everything is constantly changing and regressing in weird ways, and where people will get upset if something breaks and you tell them to purchase the product again, you cannot charge once for a program and let people keep it forever. They won't just shrug and say, oh OK, I upgraded my OS and now my $500 app is broken, guess I need to spend another $300 to upgrade to the latest version. They will demand you maintain a stream of free backported bugfixes forever, which you cannot afford to do. So you have to give them always the latest versions of things, which means a subscription.
Sorry, I know people don't want to hear that, but it's the nature of a world where people know software can be updated instantly and at will. Expectations changed, business models changed to meet them.
Talking about such things is like catnip on here though.
Yeah, the best world for multiple users is a database right?
So it would seem that if apps have stores to buy things one of those things should be a store?
If you could buy a database place then you can dispense tickets. Share the tickets with friends as photos. Then your local-first app can take the tickets to the places your friends shared with you. Some of them go offline, fine.
I don't like the experience of having to setup a dropbox externally from apps. Why should a database place not be a one-off purchase like an item in a game?
CRDTs and local-first are ideas that have been perpetually in the hype cycle for the last decade or so, starting around the time Riak CRDTs became a thing and continuing all the way to today.
Niki's post is a perfect illustration: CRDTs offer this "magical" experience that seems perfectly good until you try building a product with them. Then it becomes a nightmare of tradeoffs:
- state-based or operation-based? do you trim state? how?
- is it truly a CRDT (no conflicts possible), a structure with explicit conflict detection, or last-writer-wins/arbitrary choice in a trench coat that will always lose data? Case in point: Automerge uses pseudo-random conflict resolution, so Niki's solution will drop data if it's added without a direct causal link between the edits. To learn this, you have to go to the "under the hood" section in the Automerge docs and read about the merge rules very attentively. It might be acceptable for a particular use case, but very few people would even read that far! (See the sketch after this list.)
- what is the worst case complexity? Case in point: yjs offers an interface that looks very much like JSON, but "arrays" are actually linked lists underneath, which makes it easy to accidentally become quadratic.
- how do you surface conflicts and/or lost data to the user in the interface? What are the user expectations?
- how do you add/remove replication nodes? What if they're offline at the time of removal? What if they come online after getting removed?
- what's user experience like for nodes with spotty connection and relatively constrained resources, like mobile phones? Do they have to sync everything after coming online before being able to commit changes?
- what's the authoritative node for side effects like email or push notifications?
- how do you handle data migrations as software gets updated? What about two nodes having wildly different software versions?
- how should search work on constrained devices, unless every device has the full copy of the entire state?
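To make the second and fourth bullets concrete, this is roughly what happens in Automerge when two replicas set the same key concurrently: merge picks one value (deterministically, but arbitrarily from the user's point of view), and the other survives only in conflict metadata that the application has to surface itself. A minimal sketch:

```ts
// Sketch: concurrent writes to the same key; one value "wins" the merge,
// the other is only visible if the app explicitly asks for conflicts.
import * as Automerge from "@automerge/automerge";

type TitleDoc = { title: string };

const base = Automerge.from<TitleDoc>({ title: "draft" });
let alice = Automerge.clone(base);
let bob = Automerge.clone(base);

alice = Automerge.change(alice, d => { d.title = "Alice's title"; });
bob = Automerge.change(bob, d => { d.title = "Bob's title"; });

const merged = Automerge.merge(alice, bob);
console.log(merged.title);                            // one of the two, picked by merge rules
console.log(Automerge.getConflicts(merged, "title")); // both values, keyed by internal ids
```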
Those tradeoffs infect the entire system, top to bottom, from basic data structures (a CRDT "array" can be many different things with different behaviour) to storage to auth to networking to UI. Because of that, they can't be abstracted away — or more precisely, they can be pretend-abstracted for marketing purposes, until the reality of the problem becomes apparent in production.
From Muse [1] to Linear [2], everyone eventually hits the problems above and has to either abandon features (no need to have an authoritative email log if there are no email notifications), subset data and gradually move from local first to very elaborate caching, or introduce federation of some sort that gravitates towards centralisation anyway (very few people want to run their own persistent nodes).
I think this complexity, essential for local-first in practice, is important to contextualise both Niki's post and the original talk (which mostly brushed over it).
[1]: https://museapp.com/podcast/78-local-first-one-year-later/
It started with mainframes and dumb terminals, where all the processing happened centrally. Then came thick clients, where some processing happened on the local terminal and some on the server.
Technology advanced far enough to install programs locally and they ran only locally. The 80's and 90's were quite the decades.
Someone had a bright idea to build server-based applications that could be accessed via a dumb web browser.
That went on for a while, and people realized that things would be better if we did some processing in JavaScript.
Now the push continues to do more and more computation locally.
Can't wait for the next cycle.
- He writes about UX but there's no dark mode!
- The cursors are really distracting!
- The yellow background color makes it unreadable!
- Some comment about the actual content of the post
If the cursors bother you, add
tonsky.me##div.pointers
to your uBlock Origin's custom filters list.