October 2nd, 2024

Alert Evaluations: Incremental Merges in ClickHouse

Highlight improved alert evaluation performance by implementing incremental merges with ClickHouse, reducing processing time from 1.24 seconds to 0.11 seconds and memory usage from 7.6 GB to 82 MB.


Highlight, an open-source app monitoring platform, uses ClickHouse, a columnar database, to manage large datasets and power real-time analytics. The team recently hit performance problems in their alert system when evaluating alerts over large time windows: recalculating each alert from scratch every minute over a one-hour window rescanned the same data repeatedly, incurring excessive computational overhead.

To address this, Highlight implemented incremental merges using ClickHouse's aggregate functions, computing and storing smaller partial results that can be merged later. This works directly for simple aggregates like Count and Sum, whose partial results combine trivially. For calculations that cannot be rebuilt from plain partial results, such as the median, they used ClickHouse's -State and -Merge combinators, which store an aggregate's compact intermediate state and merge such states efficiently.

By computing intermediate states for only the new data each minute and merging them with the stored states for the rest of the window, Highlight sped up alert evaluations roughly tenfold, cutting processing time from 1.24 seconds to 0.11 seconds and memory usage from 7.6 GB to 82 MB.
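The post's exact schema isn't reproduced in this summary, so the sketch below is a minimal, hypothetical reconstruction of the technique in ClickHouse SQL: the table name alert_states, the source table raw_metrics, and all column names are assumptions, not Highlight's actual schema.

```sql
-- Hypothetical pre-aggregation table: one row of intermediate state per
-- metric per minute. AggregateFunction columns hold -State representations.
CREATE TABLE alert_states
(
    metric_name  LowCardinality(String),
    bucket_start DateTime,
    count_state  AggregateFunction(count, Float64),
    median_state AggregateFunction(quantile(0.5), Float64)
)
ENGINE = AggregatingMergeTree
ORDER BY (metric_name, bucket_start);

-- Once a minute, aggregate only the newly arrived raw rows into states.
INSERT INTO alert_states
SELECT
    metric_name,
    toStartOfMinute(timestamp) AS bucket_start,
    countState(value)          AS count_state,
    quantileState(0.5)(value)  AS median_state
FROM raw_metrics
WHERE timestamp >= now() - INTERVAL 1 MINUTE
GROUP BY metric_name, bucket_start;
```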

- Highlight uses ClickHouse for managing large datasets and real-time analytics.

- Incremental merges improved alert evaluation performance significantly.

- The approach reduced processing time from 1.24 seconds to 0.11 seconds.

- Memory usage decreased from 7.6 GB to 82 MB.

- ClickHouse's -State and -Merge combinators are key to the efficient incremental processing sketched below.
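Evaluating a one-hour alert window then becomes a cheap merge over the stored per-minute states rather than a rescan of the raw data; continuing the illustrative schema above:

```sql
-- Merge ~60 per-minute states into final values for the one-hour window.
SELECT
    metric_name,
    countMerge(count_state)          AS event_count,
    quantileMerge(0.5)(median_state) AS median_value
FROM alert_states
WHERE bucket_start >= now() - INTERVAL 1 HOUR
GROUP BY metric_name;
```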

2 comments
By @hodgesrm - 7 months
It sounds as if you used your own algorithm for pre-aggregation into the schema you showed. Did you consider using a materialized view to populate it?
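As context for this question: a materialized view that populates such a pre-aggregation table might look like the following. This is a speculative sketch reusing the illustrative alert_states and raw_metrics names from above, not a confirmed part of Highlight's pipeline.

```sql
-- ClickHouse would then run the -State aggregation automatically on every
-- insert into raw_metrics, keeping alert_states up to date.
CREATE MATERIALIZED VIEW alert_states_mv TO alert_states AS
SELECT
    metric_name,
    toStartOfMinute(timestamp) AS bucket_start,
    countState(value)          AS count_state,
    quantileState(0.5)(value)  AS median_state
FROM raw_metrics
GROUP BY metric_name, bucket_start;
```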
By @iampims - 7 months
At a certain scale, exact computations (p50 for instance) become impractical. I’ve had great luck switching to approximate calculations with guaranteed error bounds.

An approachable paper on the topic is "Effective Computation of Biased Quantiles over Data Streams" http://dimacs.rutgers.edu/%7Egraham/pubs/papers/bquant-icde....
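ClickHouse also ships approximate quantile functions in this spirit, e.g. quantileTDigest, which uses a different algorithm than the CKMS paper cited but makes the same exactness-for-bounded-memory trade. A minimal illustration, again against the assumed raw_metrics schema from above:

```sql
-- Approximate median via t-digest: memory stays small and roughly constant
-- regardless of how many rows fall in the window.
SELECT quantileTDigest(0.5)(value) AS approx_median
FROM raw_metrics
WHERE timestamp >= now() - INTERVAL 1 HOUR;
```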