Notion about their usage of WASM SQLite
Notion enhanced browser performance by integrating WebAssembly SQLite, OPFS, and Web Workers technologies. Overcoming challenges, they improved page navigation by 20%, optimizing SQLite usage for efficient cross-tab queries and compatibility.
Read original articleThe article discusses how Notion improved its performance in web browsers by implementing WebAssembly (WASM) SQLite. By using the OPFS and Web Workers technologies, Notion was able to enhance page navigation times by 20 percent in modern browsers. The architecture involved a SharedWorker-powered approach to manage SQLite queries across tabs efficiently. Initially, simpler approaches faced challenges like cross-origin isolation requirements and corruption issues due to concurrency problems. Notion overcame these obstacles by developing a SharedWorker architecture that allowed for efficient SQLite usage without the need for cross-origin isolation. Despite initial regressions like slower initial page loads and certain devices not benefiting from caching, Notion optimized the implementation to ensure a seamless user experience. The final choice of using OPFS SyncAccessHandle Pool VFS variant enabled Notion to deliver improved performance without compromising on compatibility across different browsers. The article highlights the importance of addressing concurrency issues and optimizing loading processes to successfully integrate SQLite for enhanced browser performance.
Related
What Happens When You Put a Database in the Browser?
WebAssembly (Wasm) enhances browser capabilities, enabling high-performance apps like DuckDB for ad-hoc queries and Python environments. DuckDB Wasm boosts performance in interfaces like lakeFS, Evidence, and Count. MotherDuck enables local querying, emphasizing efficient data processing.
Why Google Sheets ported its calculation worker from JavaScript to WasmGC
Google Sheets transitioned its calculation worker to WasmGC from JavaScript for improved performance. Collaboration between Sheets and Chrome teams led to optimizations, overcoming challenges for near-native speed on the web.
Optimizing Large-Scale OpenStreetMap Data with SQLite
The article discusses optimizing large-scale OpenStreetMap data with SQLite. Converting OSMPBF to SQLite enhanced search functionalities. Indexing, full-text search, and compression improved query performance, despite some false positives.
Show HN: Simulating 20M Particles in JavaScript
This article discusses optimizing JavaScript performance for simulating 1,000,000 particles in a browser. It covers data access optimization, multi-threading with SharedArrayBuffers and web workers, and memory management strategies.
We sped up Notion in the browser with WASM SQLite
Notion improved web performance with WebAssembly SQLite, enhancing navigation by 20% in modern browsers. SharedWorker architecture managed SQLite queries efficiently, overcoming initial challenges for a seamless user experience.
- Many users appreciate the power and efficiency of using SQLite locally, especially with modern SSDs, and share their own successful implementations.
- There are concerns about the lack of offline mode and issues with concurrency and multiple tabs when using OPFS.
- Some users express frustration with Notion's current performance and usability, particularly on mobile devices and with loading times.
- There is a discussion on the potential benefits and challenges of integrating SQLite natively into browsers versus using alternatives like IndexedDB.
- Several comments reflect excitement about the future of WebAssembly and SQLite in the browser, despite some technical challenges and limitations.
Meanwhile in AWS you would pay $27k a month to have the same IOPS as a Lenovo Thinkpad X1.
I just got done with a side project using WASM SQLite as well, it's incredibly powerful, even supports full text search.
My project "cluttr" is a local first site that focuses on cleaning up your screen shots folder and making it searchable via OCR. All in browser.
It also supports using Ollama to run a vision model against your images to provide more context.
Really helpful if you take a lot of screenshots for clients or PMs and need to find them later. The best part is 99% of the hosting cost is offloaded to the client, so I am not really worried about a large server bill.
(working on deployment to https://cluttr.ai later today)
There's a multiple readers and writers proposal [0]. It's been "position: positive" by Firefox [1], implemented in Chrome [2], and ignored by Webkit [3] (of course).
0: https://github.com/whatwg/fs/blob/main/proposals/MultipleReadersWriters.md
1: https://github.com/mozilla/standards-positions/issues/861
2: https://chromestatus.com/feature/5172892632875008
3: https://github.com/WebKit/standards-positions/issues/238
Love the shoutout to Roy Hashimoto. He's been writing VFSs for SQLite-on-the-browser and perf testing them. He's recently wrote "IDBMirrorVFS", which "is a new example VFS that keeps all SQLite files in memory while persisting to IndexedDB". It has remarkable performance, of course. https://github.com/rhashimoto/wa-sqlite/discussions/189Also, looks like Roy takes advantage of the aforementioned proposal when he wrote `OPFSPermutedVFS`, which takes 2nd pace behind `IDBMirrorVFS` w/r/t perf. https://github.com/rhashimoto/wa-sqlite/blob/master/src/exam...
I'm glad they are making their app faster, in the meantime I (browser user) have cancelled my team's subscription and will be using something else.
[1] https://hacks.mozilla.org/2010/06/beyond-html5-database-apis...
I don't think this is correct:
https://developer.mozilla.org/en-US/docs/Web/API/File_System...
- tooling, support and debugging for opfs
- only really works with a single worker using sync api, but this is not available from service workers
- the official js releases are really not idiomatic js and require in depth understanding to pick out the right parts and then build a usable layer on top, afaik there is no existing npm package that just provides what is required
I'm not talking about 'using the DOM to store data' in the traditional sense. The idea is rather (roughly) to store each database 'table' as a child of a hidden DocumentFragment, with 'rows' as its children, etc. Then we'd query this data using CSS selectors (or XPath).
For example, instead of
SELECT * FROM Employees WHERE gender = 'male' AND age > 30
you'd have something like: customersTbl.querySelectorAll('row:has(cell-gender[value="male"]):has(cell-age[value > "30"])')
And instead of SELECT * FROM Employees WHERE gender = 'male' OR age > 30
you'd have something like: employeesTbl.querySelectorAll('row:has(cell-gender[value="male"], cell-age[value > "30"])')
You can even have (truly) structured cells! Instead of SELECT * FROM Employees WHERE name->>'first_name' = 'John'
you can have: employeesTbl.querySelectorAll('row:has(cell-name:has(cell-first_name[value="John"]))')
And for querying array-like structures, instead of: SELECT * FROM employees WHERE skills::jsonb ? 'JavaScript';
you can have: employeesTbl.querySelectorAll('row:has(cell-skills > cell-skill[value="JavaScript"])')
I can certainly imagine the performance to be horrible, but having never tried out the idea (and having too meager a mental model of DOM performance to reason from) I can also imagine it being surprisingly decent for smaller datasets. Who knows? :-)Edit: silly me indeed to think CSS attribute selectors already can do numeric comparisons! See here: https://github.com/w3c/csswg-drafts/issues/354
One interesting thing is that opfs maximum disk usage is kind of browser specific but tends to be a percentage of the available diskspace measured in GB rather than some lowish number like 5MB as is typical for e.g. browser local storage. This makes it suitable for locally caching and syncing large amounts of remote database content.
We've had a few challenges with opfs and particularly limited support for this on Safari it kind of works but with some caveats. Chrome/Firefox are fine. Our web app is packaged up as a PWA for use on mobile.
Another issue is that opfs creates issues with loading resources from external websites. E.g. we have HTML previews that may include images on external domains that the browser will render fine without opfs enabled but will refuse to render when using opfs, unless you set crossOrigin=anonymous on the img tag.
Relative to indexed-db, which is supported by most browsers, you gain a more sane API to access data and more flexible querying and support for things like joins. I've had some exposure to indexed-db and IMHO it's a case study in bad API design gone horribly wrong. Querying is very limited and the APIs are poorly documented and have weird failure modes. Sqlite is an absolute pleasure to use in comparison and probably faster and way more capable.
SQLite has become the defacto local single file database standard.
It has a very permissive license.
It has support contracts out to the 2050s (service lifetime of Airbus A350 airframe).
SQLite will probably have a longer lifetime than whatever browser standard we have now.
Couldn't you cache the data in LocalStorage and get similar speed improvements?
I mean, how to run SQLite in the browser is a great topic and surely has a lot of very good use cases. But attributing the performance improvements to it sounds a bit misleading, when you had no client caching strategy before and you do now. Obviously loading data from a local cache is faster than loading data from a remote server.
Comparing WASM SQLite vs. other client caching solutions (LS, IndexDB, etc) could be interesting. In terms of perf, complexity, compatibility, storage capacity, query language, etc.
Discussion (19 points, yesterday) https://news.ycombinator.com/item?id=40931957
Related
What Happens When You Put a Database in the Browser?
WebAssembly (Wasm) enhances browser capabilities, enabling high-performance apps like DuckDB for ad-hoc queries and Python environments. DuckDB Wasm boosts performance in interfaces like lakeFS, Evidence, and Count. MotherDuck enables local querying, emphasizing efficient data processing.
Why Google Sheets ported its calculation worker from JavaScript to WasmGC
Google Sheets transitioned its calculation worker to WasmGC from JavaScript for improved performance. Collaboration between Sheets and Chrome teams led to optimizations, overcoming challenges for near-native speed on the web.
Optimizing Large-Scale OpenStreetMap Data with SQLite
The article discusses optimizing large-scale OpenStreetMap data with SQLite. Converting OSMPBF to SQLite enhanced search functionalities. Indexing, full-text search, and compression improved query performance, despite some false positives.
Show HN: Simulating 20M Particles in JavaScript
This article discusses optimizing JavaScript performance for simulating 1,000,000 particles in a browser. It covers data access optimization, multi-threading with SharedArrayBuffers and web workers, and memory management strategies.
We sped up Notion in the browser with WASM SQLite
Notion improved web performance with WebAssembly SQLite, enhancing navigation by 20% in modern browsers. SharedWorker architecture managed SQLite queries efficiently, overcoming initial challenges for a seamless user experience.