July 22nd, 2024

Is an All-in-One Database the Future?

Specialized databases are emerging to tackle complex data challenges, leading to intricate infrastructures. A universal, all-in-one database remains unfulfilled due to optimization issues and unique challenges of different database types.

As data challenges become more complex, specialized databases have emerged to address specific problems, leading to intricate data infrastructures often reliant on third-party data pipelines and event streaming products. Companies typically utilize multiple databases to meet diverse needs, resulting in a complicated network of interconnected systems. The pursuit of a generalist, all-in-one database that is scalable, efficient across various contexts, and commercially viable remains unfulfilled. Different database types, including relational OLTP, non-relational document stores, and memory-based caches, are optimized for particular use cases, each facing unique challenges that hinder effective consolidation. While Object-Relational Mappers (ORMs) aim to simplify database interactions, they struggle with non-relational types, highlighting the difficulty of creating a unified solution. Postgres is noted for its extensibility through plugins, making it a close contender for a comprehensive solution, yet it still does not fully meet the criteria of a one-stop-shop. The technical and practical challenges associated with developing an all-in-one database render it impractical, maintaining the modern data stack as a complex necessity. Theoretically, a universal database would face issues with optimization, data model overhead, and latency, as each type is designed for specific applications, making a singular solution inefficient. Although Postgres can be enhanced with plugins like pg_vector for vector search, it is not designed to address every problem effectively, as evidenced by the complex data stacks in organizations utilizing it.

Relational Algebra Primer

Relational algebra underpins databases like PostgreSQL and MySQL. Bmg, a Ruby tool, bridges relational algebra and SQL databases, aiding data manipulation. Understanding relational algebra enriches SQL database skills and Bmg utilization.

Just Use Postgres for Everything

The article promotes using Postgres extensively in tech stacks to simplify development, improve scalability, and reduce operational complexity. By replacing various technologies with Postgres, developers can enhance productivity, focus on customer value, and potentially cut costs.

Just Use Postgres for Everything

The blog post advocates for using PostgreSQL extensively in tech stacks to simplify development, improve productivity, and reduce complexity. It highlights benefits like scalability, efficiency, and cost-effectiveness, promoting a consolidated approach.

Postgres vs. Pinecone

Postgres and Pinecone differ in performance and cost. Pinecone criticizes Postgres for index issues, while Postgres showcases superior performance with tweaks, specialized indexes, and cost-effectiveness, offering transparency and customization.

Difference between running Postgres for yourself and for others

The post compares self-managed PostgreSQL with managing it for others, focusing on provisioning, backup/restore, HA, and security. It addresses complexities in provisioning, backup strategies, HA setup, and security measures for external users.

5 comments

By @iamlucaswolf - 10 months

I think this question mixes up two orthogonal dimensions: transactionality (i.e., OLTP vs. OLAP) and data model (relational, document-based, graph, vector, ...).

Regarding transactionality: There is an entire area of research on "hybrid transactional and analytical processing" (HTAP) systems that unifies OLAP and OLTP systems. Hyper [1] pioneered this path at TU Munich, it's successor Umbra [2] recently incorporated as CedarDB [3]. There are lots of others. Most of these systems, AFAIK, are relational.

Regarding data model: What we've seen in the past few decades is that non-relational DBMS (excluding key-value stores) only make sense in rare edge cases that require huge scale. There has, e.g. been research [4] that shows that graph databases are still, well, lacking, compared to relational systems. The common pattern seems to be: unless you need to service very specific workloads at huge scales, SQL is probably enough [5]. Then again, it really comes down to intrinsics. If you were to, for example, implement distributed locking using Postgres, you would likely run into problems with MVCC and Xids very quickly.

So, as you already mentioned, there is no silver bullet. But even today, unless you are Meta or Google, SQL is probably enough for a long time and lots of use cases.

(Full disclosure: I'm working on Hyper full-time).

[1]: https://hyper-db.de/ [2]: https://umbra-db.com/ [3]: https://cedardb.com/ [4]: https://homepages.cwi.nl/~boncz/edbt2022.pdf [5]: https://www.youtube.com/watch?v=VxKt245X_ws

By @Etai - 10 months

Here's my full thoughts: https://www.secoda.co/blog/is-an-all-in-one-database-the-fut...

By @lifeinthevoid - 10 months

It all depends on the use-case, a generalist database can never achieve the performance of specialized databases once the workload gets large enough. It's like asking for a car that's great off-road but also beats an F1 car on the circuit.

By @tacostakohashi - 10 months

Hmm, ever used Oracle?

I'm not here to defend it, but in practical terms, it does everything anyone is likely to need out of the box. Relational, XML, JSON, replication, document...

By @yen223 - 10 months

I would be curious to know if anyone tried using one of the newer data-warehouse databases (Clickhouse, Snowflake, etc) as an OLTP database, to run their crud app.

Is an All-in-One Database the Future?

Related

Relational Algebra Primer

Just Use Postgres for Everything

Just Use Postgres for Everything

Postgres vs. Pinecone

Difference between running Postgres for yourself and for others

Related

Relational Algebra Primer

Just Use Postgres for Everything

Just Use Postgres for Everything

Postgres vs. Pinecone

Difference between running Postgres for yourself and for others