October 2nd, 2024

Alert Evaluations: Incremental Merges in ClickHouse

Highlight improved alert evaluation performance by implementing incremental merges with ClickHouse, reducing processing time from 1.24 seconds to 0.11 seconds and memory usage from 7.6 GB to 82 MB.


Highlight, an open-source app monitoring platform, uses ClickHouse, a columnar database, to manage large datasets and power real-time analytics. The team recently hit performance problems in their alert system when evaluating alerts over large time windows: recalculating each alert from scratch every minute over a one-hour window rescanned the same data repeatedly, incurring excessive computational overhead.

To address this, Highlight implemented incremental merges using ClickHouse's aggregate functions, computing and storing smaller partial results that can be merged later. This works directly for simple aggregates like Count and Sum, whose partial results combine trivially. For calculations that cannot be rebuilt from plain partial results, such as the median, they used ClickHouse's -State and -Merge combinators, which store an aggregate's compact intermediate state and merge such states efficiently.

By computing intermediate states for only the new data each minute and merging them with the stored states for the rest of the window, Highlight sped up alert evaluations roughly tenfold, cutting processing time from 1.24 seconds to 0.11 seconds and memory usage from 7.6 GB to 82 MB.
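The post's exact schema isn't reproduced in this summary, so the sketch below is a minimal, hypothetical reconstruction of the technique in ClickHouse SQL: the table name alert_states, the source table raw_metrics, and all column names are assumptions, not Highlight's actual schema.

```sql
-- Hypothetical pre-aggregation table: one row of intermediate state per
-- metric per minute. AggregateFunction columns hold -State representations.
CREATE TABLE alert_states
(
    metric_name  LowCardinality(String),
    bucket_start DateTime,
    count_state  AggregateFunction(count, Float64),
    median_state AggregateFunction(quantile(0.5), Float64)
)
ENGINE = AggregatingMergeTree
ORDER BY (metric_name, bucket_start);

-- Once a minute, aggregate only the newly arrived raw rows into states.
INSERT INTO alert_states
SELECT
    metric_name,
    toStartOfMinute(timestamp) AS bucket_start,
    countState(value)          AS count_state,
    quantileState(0.5)(value)  AS median_state
FROM raw_metrics
WHERE timestamp >= now() - INTERVAL 1 MINUTE
GROUP BY metric_name, bucket_start;
```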

- Highlight uses ClickHouse for managing large datasets and real-time analytics.

- Incremental merges improved alert evaluation performance significantly.

- The approach reduced processing time from 1.24 seconds to 0.11 seconds.

- Memory usage decreased from 7.6 GB to 82 MB.

- ClickHouse's -State and -Merge combinators are key to the efficient incremental processing sketched below.
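Evaluating a one-hour alert window then becomes a cheap merge over the stored per-minute states rather than a rescan of the raw data; continuing the illustrative schema above:

```sql
-- Merge ~60 per-minute states into final values for the one-hour window.
SELECT
    metric_name,
    countMerge(count_state)          AS event_count,
    quantileMerge(0.5)(median_state) AS median_value
FROM alert_states
WHERE bucket_start >= now() - INTERVAL 1 HOUR
GROUP BY metric_name;
```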

2 comments
By @hodgesrm - 7 months
It sounds as if you used your own algorithm for pre-aggregation into the schema you showed. Did you consider using a materialized view to populate it?
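As context for this question: a materialized view that populates such a pre-aggregation table might look like the following. This is a speculative sketch reusing the illustrative alert_states and raw_metrics names from above, not a confirmed part of Highlight's pipeline.

```sql
-- ClickHouse would then run the -State aggregation automatically on every
-- insert into raw_metrics, keeping alert_states up to date.
CREATE MATERIALIZED VIEW alert_states_mv TO alert_states AS
SELECT
    metric_name,
    toStartOfMinute(timestamp) AS bucket_start,
    countState(value)          AS count_state,
    quantileState(0.5)(value)  AS median_state
FROM raw_metrics
GROUP BY metric_name, bucket_start;
```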
By @iampims - 7 months
At a certain scale, exact computations (p50 for instance) become impractical. I’ve had great luck switching to approximate calculations with guaranteed error bounds.

An approachable paper on the topic is "Effective Computation of Biased Quantiles over Data Streams" http://dimacs.rutgers.edu/%7Egraham/pubs/papers/bquant-icde....
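ClickHouse also ships approximate quantile functions in this spirit, e.g. quantileTDigest, which uses a different algorithm than the CKMS paper cited but makes the same exactness-for-bounded-memory trade. A minimal illustration, again against the assumed raw_metrics schema from above:

```sql
-- Approximate median via t-digest: memory stays small and roughly constant
-- regardless of how many rows fall in the window.
SELECT quantileTDigest(0.5)(value) AS approx_median
FROM raw_metrics
WHERE timestamp >= now() - INTERVAL 1 HOUR;
```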