July 12th, 2024

Notion about their usage of WASM SQLite

Notion enhanced browser performance by integrating WebAssembly SQLite, OPFS, and Web Workers technologies. Overcoming challenges, they improved page navigation by 20%, optimizing SQLite usage for efficient cross-tab queries and compatibility.

Read original articleLink Icon
PerformanceSQLiteChallenges
Notion about their usage of WASM SQLite

The article discusses how Notion improved its performance in web browsers by implementing WebAssembly (WASM) SQLite. By using the OPFS and Web Workers technologies, Notion was able to enhance page navigation times by 20 percent in modern browsers. The architecture involved a SharedWorker-powered approach to manage SQLite queries across tabs efficiently. Initially, simpler approaches faced challenges like cross-origin isolation requirements and corruption issues due to concurrency problems. Notion overcame these obstacles by developing a SharedWorker architecture that allowed for efficient SQLite usage without the need for cross-origin isolation. Despite initial regressions like slower initial page loads and certain devices not benefiting from caching, Notion optimized the implementation to ensure a seamless user experience. The final choice of using OPFS SyncAccessHandle Pool VFS variant enabled Notion to deliver improved performance without compromising on compatibility across different browsers. The article highlights the importance of addressing concurrency issues and optimizing loading processes to successfully integrate SQLite for enhanced browser performance.

AI: What people are saying
The comments on the article about Notion's browser performance improvements with WebAssembly SQLite, OPFS, and Web Workers highlight several key points:
  • Many users appreciate the power and efficiency of using SQLite locally, especially with modern SSDs, and share their own successful implementations.
  • There are concerns about the lack of offline mode and issues with concurrency and multiple tabs when using OPFS.
  • Some users express frustration with Notion's current performance and usability, particularly on mobile devices and with loading times.
  • There is a discussion on the potential benefits and challenges of integrating SQLite natively into browsers versus using alternatives like IndexedDB.
  • Several comments reflect excitement about the future of WebAssembly and SQLite in the browser, despite some technical challenges and limitations.
Link Icon 24 comments
By @bearjaws - 3 months
Overall I think using SQLite locally to offload database work is incredibly powerful. Given most laptops have an SSD now a days, you can scan an entire 30-100MiB SQLite db in miliseconds.

Meanwhile in AWS you would pay $27k a month to have the same IOPS as a Lenovo Thinkpad X1.

I just got done with a side project using WASM SQLite as well, it's incredibly powerful, even supports full text search.

My project "cluttr" is a local first site that focuses on cleaning up your screen shots folder and making it searchable via OCR. All in browser.

It also supports using Ollama to run a vision model against your images to provide more context.

Really helpful if you take a lot of screenshots for clients or PMs and need to find them later. The best part is 99% of the hosting cost is offloaded to the client, so I am not really worried about a large server bill.

https://cluttr.pages.dev/

(working on deployment to https://cluttr.ai later today)

By @AlexErrant - 3 months
> OPFS doesn’t come with graceful handling of concurrency out of the box. Developers should be aware of this and design around it.

There's a multiple readers and writers proposal [0]. It's been "position: positive" by Firefox [1], implemented in Chrome [2], and ignored by Webkit [3] (of course).

    0: https://github.com/whatwg/fs/blob/main/proposals/MultipleReadersWriters.md
    1: https://github.com/mozilla/standards-positions/issues/861
    2: https://chromestatus.com/feature/5172892632875008
    3: https://github.com/WebKit/standards-positions/issues/238
Love the shoutout to Roy Hashimoto. He's been writing VFSs for SQLite-on-the-browser and perf testing them. He's recently wrote "IDBMirrorVFS", which "is a new example VFS that keeps all SQLite files in memory while persisting to IndexedDB". It has remarkable performance, of course. https://github.com/rhashimoto/wa-sqlite/discussions/189

Also, looks like Roy takes advantage of the aforementioned proposal when he wrote `OPFSPermutedVFS`, which takes 2nd pace behind `IDBMirrorVFS` w/r/t perf. https://github.com/rhashimoto/wa-sqlite/blob/master/src/exam...

By @remram - 3 months
Notion takes 15s to load to an empty page. Then another 5 to dismiss the popup about new AI features and the like.

I'm glad they are making their app faster, in the meantime I (browser user) have cancelled my team's subscription and will be using something else.

By @jakubmazanec - 3 months
Will the tables in the Notion mobile app still be so painfully slow? Every time I scroll few rows down (in a table with few than 100 rows with dozen columns), the app hangs for several seconds while trying to fetch/render (who knows what it's actually doing) few additional rows. When I edit a row and return back to the table view, scroll position is lost and I have to suffer through the fetching once again.
By @slightwinder - 3 months
They have an actual sql-datebase on the client as cache, but still no offline-mode? Damn it. This must be their most demanded feature for years.
By @mortenjorck - 3 months
All this wonderful optimization, and yet there's still no UI to warn that (perhaps due to a websocket connection being lost?) a comment won't be saved. I lost two Notion comments this way yesterday!
By @mrtksn - 3 months
Seeing things like that excite me a lot, I can't wait for a future where all the code is written in whatever language you like and runs through WASM in the browser. The HTML and JS can go back doing that thing they are built for: Displaying the UI and handling the interactions.
By @nafey - 3 months
I believe it's a missed opportunity not to bundle sqlite in major browsers. I understand that making SQLite part of the browser spec introduces all sorts of complications [1] down the line. But nevertheless, it would have proved to be quite a nifty tool.

[1] https://hacks.mozilla.org/2010/06/beyond-html5-database-apis...

By @cageface - 3 months
OPFS is only available in Web Workers.

I don't think this is correct:

https://developer.mozilla.org/en-US/docs/Web/API/File_System...

By @nikeee - 3 months
Sadly, SharedWorkers still are not supported on Chrome+Android. Hope that changes soon because it would be a game-changer for many applications to have this API broadly available.
By @jFriedensreich - 3 months
from my own experience what notion did is pretty much required to use wasm sqlite with persistence in production. on the one hand being used at notions scale really points to wasm sqlite being ready for primetime, on the other hand there are still a LOT of ugly parts that need to be worked around:

- tooling, support and debugging for opfs

- only really works with a single worker using sync api, but this is not available from service workers

- the official js releases are really not idiomatic js and require in depth understanding to pick out the right parts and then build a usable layer on top, afaik there is no existing npm package that just provides what is required

By @aragonite - 3 months
This is both silly and somewhat off-topic, but on the subject of doing database work on the client side ... has anyone experimented with using non-rendered DOM as 'table' and CSS selectors as 'query language'?

I'm not talking about 'using the DOM to store data' in the traditional sense. The idea is rather (roughly) to store each database 'table' as a child of a hidden DocumentFragment, with 'rows' as its children, etc. Then we'd query this data using CSS selectors (or XPath).

For example, instead of

  SELECT * FROM Employees WHERE gender = 'male' AND age > 30
you'd have something like:

  customersTbl.querySelectorAll('row:has(cell-gender[value="male"]):has(cell-age[value > "30"])')
And instead of

  SELECT * FROM Employees WHERE gender = 'male' OR age > 30
you'd have something like:

  employeesTbl.querySelectorAll('row:has(cell-gender[value="male"], cell-age[value > "30"])')
You can even have (truly) structured cells! Instead of

  SELECT * FROM Employees WHERE name->>'first_name' = 'John'
you can have:

  employeesTbl.querySelectorAll('row:has(cell-name:has(cell-first_name[value="John"]))')
And for querying array-like structures, instead of:

  SELECT * FROM employees WHERE skills::jsonb ? 'JavaScript';
you can have:

  employeesTbl.querySelectorAll('row:has(cell-skills > cell-skill[value="JavaScript"])')
I can certainly imagine the performance to be horrible, but having never tried out the idea (and having too meager a mental model of DOM performance to reason from) I can also imagine it being surprisingly decent for smaller datasets. Who knows? :-)

Edit: silly me indeed to think CSS attribute selectors already can do numeric comparisons! See here: https://github.com/w3c/csswg-drafts/issues/354

By @jillesvangurp - 3 months
We've been using wasm sqlite with kotlin-js and the kotlin sqldelight framework. This is admittedly a bit of an exotic stack to be running in a browser. But it actually works surprisingly well. Unfortunately the failure modes with opfs are kind of ugly and need some attention. A key issue is controlling what happens when you have multiple tabs open interacting with the same DB.

One interesting thing is that opfs maximum disk usage is kind of browser specific but tends to be a percentage of the available diskspace measured in GB rather than some lowish number like 5MB as is typical for e.g. browser local storage. This makes it suitable for locally caching and syncing large amounts of remote database content.

We've had a few challenges with opfs and particularly limited support for this on Safari it kind of works but with some caveats. Chrome/Firefox are fine. Our web app is packaged up as a PWA for use on mobile.

Another issue is that opfs creates issues with loading resources from external websites. E.g. we have HTML previews that may include images on external domains that the browser will render fine without opfs enabled but will refuse to render when using opfs, unless you set crossOrigin=anonymous on the img tag.

Relative to indexed-db, which is supported by most browsers, you gain a more sane API to access data and more flexible querying and support for things like joins. I've had some exposure to indexed-db and IMHO it's a case study in bad API design gone horribly wrong. Querying is very limited and the APIs are poorly documented and have weird failure modes. Sqlite is an absolute pleasure to use in comparison and probably faster and way more capable.

By @RcouF1uZ4gsC - 3 months
Honestly, we should just have native SQLite supported by the browser.

SQLite has become the defacto local single file database standard.

It has a very permissive license.

It has support contracts out to the 2050s (service lifetime of Airbus A350 airframe).

SQLite will probably have a longer lifetime than whatever browser standard we have now.

By @solarkraft - 3 months
This is cool because I haven’t seen WASM SQlite in production. I wasn’t sure whether it’d end up being more performant than something built in like IndexedDB. Seeing that it’s practically feasible is definitely cool.
By @ggregoire - 3 months
> Using SQLite improved page navigation times by 20 percent in all modern browsers.

Couldn't you cache the data in LocalStorage and get similar speed improvements?

I mean, how to run SQLite in the browser is a great topic and surely has a lot of very good use cases. But attributing the performance improvements to it sounds a bit misleading, when you had no client caching strategy before and you do now. Obviously loading data from a local cache is faster than loading data from a remote server.

Comparing WASM SQLite vs. other client caching solutions (LS, IndexDB, etc) could be interesting. In terms of perf, complexity, compatibility, storage capacity, query language, etc.

By @NelsonMinar - 3 months
Why is IndexedDB so bad that you have to load a second database in WASM instead? I've used the API successfully but never measured performance.
By @born-jre - 3 months
after conquering all smartphones, browser, edge db and what not now every web page will have own instance of SQLite running in wasm, lol
By @taf2 - 3 months
I don’t see why they need to have a worker and a shared worker .. couldn’t they run the sqlite in the sharedworker directly?
By @stpn - 3 months
Pretty neat! If notion folks are in this thread, what do you do about lack of SharedWorker in Android Chrome?
By @bedatadriven - 3 months
This honestly sounds like a nightmare: multimegabyte wasm downloads, data corruption in production and byzantine hacks to coordinate writes between tabs. I am very grateful that we chose IndexedDB instead for our application!
By @gnabgib - 3 months
Article title: How we sped up Notion in the browser with WASM SQLite

Discussion (19 points, yesterday) https://news.ycombinator.com/item?id=40931957