March 24th, 2025

Open-source Rust database tops JSONBench using DataFusion

GreptimeDB excelled in the JSONBench benchmark, outperforming ClickHouse and VictoriaLogs, achieving top query speed for 1 billion JSON documents, and offering cost-effective, efficient solutions for large-scale observability data.

GreptimeDB outperformed competitors such as ClickHouse and VictoriaLogs in JSONBench, a benchmark of analytical queries over JSON documents. The benchmark runs queries on datasets ranging from 1 million to 1 billion JSON documents. GreptimeDB ranked first in query speed in the cold run on 1 billion documents and also showed strong storage efficiency. Its cloud-native architecture uses object storage as the primary data store, which reduces costs significantly while maintaining high performance. A built-in Pipeline (ETL) engine provides native JSON support, which makes the database easier to use for observability data. In-database streaming supports efficient real-time analytics and makes it suitable for complex queries. Overall, the benchmark results highlight GreptimeDB's potential as a cost-effective solution for enterprises dealing with large volumes of observability data.

- GreptimeDB outperformed ClickHouse and VictoriaLogs in the JSONBench benchmark.

- It ranked first in query speed for 1 billion JSON documents during cold runs.

- The database utilizes object storage to reduce costs while maintaining performance.

- GreptimeDB features a built-in ETL engine for efficient JSON data handling.

- Its in-database streaming capabilities enhance performance for real-time analytics.
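To make "analytical queries over JSON documents" concrete, here is a minimal Python sketch of the kind of group-and-count aggregation such a benchmark exercises. It is not the JSONBench workload or GreptimeDB's API: the sample documents, the field names (`kind`, `user`, `collection`), and the helper `count_by_field` are all hypothetical and chosen only for illustration.

```python
import json
from collections import Counter

# Hypothetical sample documents, one JSON object per line.
# The field names are illustrative and are not the benchmark's schema.
SAMPLE_LINES = [
    '{"kind": "commit", "user": "a", "collection": "post"}',
    '{"kind": "commit", "user": "b", "collection": "like"}',
    '{"kind": "identity", "user": "a"}',
    '{"kind": "commit", "user": "a", "collection": "post"}',
]


def count_by_field(lines, field):
    """Count documents per value of `field`, skipping documents that lack it."""
    counts = Counter()
    for line in lines:
        doc = json.loads(line)
        if field in doc:
            counts[doc[field]] += 1
    # Sort by descending count, then by key, so the order is stable.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))


if __name__ == "__main__":
    print(count_by_field(SAMPLE_LINES, "collection"))
```

A database engine answers the same question with a single grouped query over columnar storage rather than by parsing each document in a loop, which is where differences in query speed and storage efficiency show up at the billion-document scale.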

Show HN: Storing and Analyzing 160 billion Quotes in ClickHouse

ClickHouse is effective for managing large financial datasets, offering fast query execution, efficient compression, and features like data deduplication and date partitioning, while alternatives like KDB and Shakti are also considered.

I spent 5 hours learning how ClickHouse built their internal data warehouse

ClickHouse developed an internal data warehouse processing 470 TB from 19 sources, utilizing ClickHouse Cloud, Airflow, and AWS S3, supporting batch and real-time analytics, enhancing user experience and sales integration.

Use Cases for ChDB, a Powerful In-Memory OLAP SQL Engine

chDB is an in-memory OLAP SQL engine that outperforms DuckDB, designed for lightweight analytics, enabling local data pipelines and serverless SQL analytics, with potential future enhancements for real-time processing.

Postgres Just Cracked the Top Fastest Databases for Analytics

PostgreSQL has been optimized for analytics with pg_mooncake, achieving a Top 10 ClickBench ranking. It uses a columnstore format and DuckDB for improved performance, rivaling specialized databases.

DiceDB

DiceDB is an open-source, reactive in-memory database optimized for modern hardware, outperforming Redis in throughput and latency, and encouraging community contributions under the BSD 3-Clause License.

4 comments

By @killme2008 - about 1 month

In the past, I also had doubts about whether a Rust database built on open-source components would be performance-limited, but this evaluation has dispelled our concerns. Apache DataFusion + Arrow + Parquet + OpenDAL, as a new data stack, have proven their potential.

By @k_bx - about 1 month

Big question I have is: should I invest in DeltaLake/Iceberg based solution, or something like this?
