October 1st, 2024

SlateDB – An embedded database built on object storage

SlateDB is an embedded storage engine using object storage for high durability and scalability. It features a zero-disk architecture, tunable performance, and supports multiple readers with a single writer.

SlateDB is an embedded storage engine that, unlike traditional LSM-tree storage engines, keeps its data in object storage. This gives it effectively unlimited storage capacity, high durability, and simpler replication. Because it relies on the underlying object store, SlateDB inherits that store's durability, quoted as 99.999999999%. Its zero-disk architecture removes local disk failure and corruption as concerns. Performance is tunable: users can optimize for low latency, low cost, or high durability depending on their needs. It supports a single writer and multiple readers, with mechanisms to detect and manage inactive writers. SlateDB is built in Rust and is described as easy to integrate with other programming languages.

- SlateDB utilizes object storage for enhanced durability and scalability.

- It features a zero-disk architecture, eliminating risks associated with disk failures.

- Users can configure performance settings to prioritize latency, cost, or durability.

- The engine supports multiple readers and a single writer, managing inactive writers effectively.

- SlateDB is built in Rust, allowing for easy integration with different programming languages.
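
The summary does not say how inactive writers are detected. One common approach for single-writer systems is epoch-based fencing, sketched below as an illustration only. It is not SlateDB's API: `FencingStore`, `EpochWriter`, and `FencedError` are hypothetical names, and the in-memory store stands in for an object store that offers compare-and-swap.

```python
class FencedError(Exception):
    """Raised when a writer discovers that a newer writer has taken over."""


class FencingStore:
    """Hypothetical stand-in for an object store with compare-and-swap."""

    def __init__(self):
        self.epoch = 0   # epoch of the most recently registered writer
        self.log = []    # accepted records, tagged with the writer's epoch

    def compare_and_swap_epoch(self, expected, new):
        # Succeeds only if no other writer bumped the epoch in between.
        if self.epoch != expected:
            return False
        self.epoch = new
        return True


class EpochWriter:
    def __init__(self, store):
        self.store = store
        # Claim the next epoch; retry if another writer raced us.
        while True:
            current = store.epoch
            if store.compare_and_swap_epoch(current, current + 1):
                self.epoch = current + 1
                break

    def write(self, record):
        # A real system would make this check and the append one atomic
        # conditional write; here they are separate steps for brevity.
        if self.store.epoch != self.epoch:
            raise FencedError(f"writer epoch {self.epoch} is stale")
        self.store.log.append((self.epoch, record))


store = FencingStore()
old_writer = EpochWriter(store)   # claims epoch 1
new_writer = EpochWriter(store)   # claims epoch 2, fencing the old writer
new_writer.write("from new")
try:
    old_writer.write("from old")
    old_fenced = False
except FencedError:
    old_fenced = True
print(old_fenced, store.log)  # True [(2, 'from new')]
```

The point of the design is that the old writer needs no heartbeat or timeout to be shut out: its next write fails as soon as a newer epoch exists.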

AI: What people are saying
The comments on SlateDB reveal several concerns and observations about its functionality and purpose.
  • Many users question the "embedded" nature of SlateDB, suggesting it relies heavily on external object storage services.
  • There are concerns about durability and data loss due to the in-memory write-ahead log, which could lead to lost writes if the writer fails.
  • Several commenters feel that SlateDB is a thin abstraction over existing object storage solutions, lacking unique features that justify its existence.
  • Users express confusion about the targeted use cases for SlateDB, questioning its advantages in various applications.
  • There is a demand for additional language bindings, particularly for C++ or C, as current usage requires Rust knowledge.
15 comments
By @nmca - 3 months
> Object storage is an amazing technology. It provides highly-durable, highly-scalable, highly-available storage at a great cost.

I don’t know if this was intentionally funny, but there is a little ambiguity in the expression “great cost”; typically, great cost means very expensive.

Very cool and useful shim otherwise :)

By @drodgers - 3 months
It looks like writes are buffered in an in-memory write ahead log before being written to object storage, which means that if the writer box dies, then you lose acknowledged writes.

I've built something similar for low-cost storage of infrequently accessed data, but it uses our DBMS (MySQL) for the WAL (+ cache of hot reads), so you get proper durability guarantees.

Another cool trick is to use Bε-trees (a relatively recent innovation from Microsoft Research) for the object storage compaction to minimise the number of write operations needed when flushing the WAL.
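
The durability concern in this comment can be shown with a toy model. This is a sketch under stated assumptions, not SlateDB code: `ToyObjectStore`, `BufferedWriter`, `recover`, and the `await_flush` flag are all hypothetical names. Writes go into an in-memory buffer and become durable only once the buffer is flushed as a single object.

```python
class ToyObjectStore:
    """Hypothetical stand-in for durable object storage (survives crashes)."""

    def __init__(self):
        self.objects = {}

    def put(self, name, data):
        self.objects[name] = dict(data)


class BufferedWriter:
    def __init__(self, store):
        self.store = store
        self.buffer = {}   # in-memory only: lost if the process dies
        self.next_id = 0

    def put(self, key, value, await_flush=False):
        self.buffer[key] = value
        if await_flush:
            self.flush()   # pay object-store latency before acknowledging
        return "ack"       # without await_flush, the ack precedes durability

    def flush(self):
        # One object write covers every buffered key, which keeps request
        # counts (and cost) low at the price of a durability window.
        if self.buffer:
            self.store.put(f"wal/{self.next_id:06d}", self.buffer)
            self.next_id += 1
            self.buffer = {}


def recover(store):
    """Rebuild state from flushed objects, replayed in name order."""
    state = {}
    for name in sorted(store.objects):
        state.update(store.objects[name])
    return state


store = ToyObjectStore()
writer = BufferedWriter(store)
writer.put("a", 1, await_flush=True)   # durable before the ack
writer.put("b", 2)                     # acknowledged, still only in memory
del writer                             # simulate the writer box dying
recovered = recover(store)
print(recovered)  # {'a': 1}
```

In this model the acknowledged write to `"b"` is lost. Waiting for the flush before acknowledging closes that window, and each such write then costs a round trip to object storage.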

By @rehevkor5 - 3 months
I don't see how it's embedded if it relies on nonlocal services... on the contrary it says specifically, "no local state". It appears to be more analogous to a "lakehouse architecture" implementation (similar to, for example, Apache Iceberg), where your app includes a library that knows how to interact with the data in cloud object storage.
By @anon291 - 3 months
This seems to be a key value store built atop object storage. Which is to say, it seems completely redundant. Not sure if there's some feature I'm missing, but all of the six features mentioned on the front page are things you'd have if you used the key value store directly (actually, you get more because then you get multiple writers).

I was excited at first and thought this was SQL atop S3 et al. I've jury-rigged a solution to this using SQLite with a customized VFS backend, and would suggest that as an alternative to this particular project. You get the benefit of ACID transactions across multiple tables and a distributed backend.

By @jitl - 3 months
From the docs https://slatedb.io/docs/introduction/

> NOTE

> Snapshot isolation and transactions are planned but not yet implemented.

By @remon - 3 months
I've read the introduction and descriptions twice now and I still don't understand what this adds to the proceedings. It appears to be an extremely thin abstraction over object storage solutions rather than an actual DB, which the name and their text imply.
By @yawnxyz - 3 months
Is this an easier way to do the "store parquet on s3 > stream to duckdb" pattern that's popping up more and more?
By @shenli3514 - 3 months
Went through the docs: https://slatedb.io/docs/introduction/#use-cases

I can't understand why they are targeting the following use cases with this architecture:

* Stream processing
* Serverless functions
* Durable execution
* Workflow orchestration
* Durable caches
* Data lakes
By @hantusk - 3 months
Since writes to object storage are going to be slow anyway, why not double down on read-optimized B-trees rather than write-optimized LSMs?
By @epolanski - 3 months
Not a db guy, just asking: what does "embedded" database mean?

I'm confused here, because Google says it's a db bundled with the application, but that's not really what I get from the landing page.

What problem does it solve?

By @loxias - 3 months
Can I please, please, please, have C++ or at least C bindings? :) Or the desired way to call Rust from another runtime? I don't know any Rust.
By @demarq - 3 months
Embed cloud

Sounds like they just cancel each other out. Not sure what advantage embedding will yield here

By @goodpoint - 3 months
Despite the name this is not a database.
By @tgdn - 3 months
"It doesn't currently ship with any language bindings"

Rust is needed to use SlateDB at the moment