April 16th, 2025

KIP-1150: Diskless Kafka Topics

KIP-1150 proposes Diskless Topics in Apache Kafka to optimize storage and reduce costs by using object storage, enabling multi-region active-active topics and automatic failover, enhancing Kafka's market competitiveness.

KIP-1150 proposes Diskless Topics for Apache Kafka to optimize storage and reduce operational costs, particularly in cloud environments. The motivation is that Kafka workloads keep growing while Kafka's existing replication is expensive to run. Diskless Topics would let operators use object storage instead of block storage, eliminating inter-zone data transfer costs and enabling multi-region active-active topics with automatic failover. The feature is meant to keep Kafka competitive with alternative protocols that already build on object storage. The KIP itself requires no immediate changes to the codebase or documentation; it seeks consensus that the feature is needed. Future KIPs will detail the implementation of related functionality, such as producer rack-awareness and garbage collection for diskless objects, and the KIP also outlines potential follow-up features. The proposal argues that keeping this feature under the Apache 2.0 license will benefit the community and keep Kafka relevant in the market. Overall, KIP-1150 aims to position Apache Kafka as a versatile streaming engine that balances cost and performance across diverse workloads.

- KIP-1150 introduces Diskless Topics to optimize storage in Apache Kafka.

- The proposal aims to reduce operational costs by utilizing object storage instead of block storage.

- Diskless Topics will enable features like multi-region active-active topics and automatic failover.

- Future KIPs will detail the implementation of related functionalities.

- The proposal seeks to maintain Apache Kafka's competitiveness in the market against alternative protocols.
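
The cost argument in the summary can be made concrete with a rough back-of-envelope sketch. The function names, the one-replica-per-zone layout, and the per-GB price below are illustrative assumptions, not figures from the KIP. The sketch only counts follower replication traffic that crosses zones and ignores object storage request and storage fees, which a diskless design would pay instead.

```python
# Back-of-envelope model (all names and prices are hypothetical placeholders).

SECONDS_PER_MONTH = 30 * 24 * 3600  # 30-day month, assumed for simplicity


def monthly_cross_zone_replication_gb(produce_mb_per_s: float,
                                      replication_factor: int) -> float:
    """GB per month copied from leaders to followers in other zones.

    Assumes every replica of a partition lives in a different zone, so each
    produced byte crosses a zone boundary (replication_factor - 1) times.
    """
    followers_in_other_zones = replication_factor - 1
    total_mb = produce_mb_per_s * SECONDS_PER_MONTH * followers_in_other_zones
    return total_mb / 1000  # decimal GB


def monthly_transfer_cost(gb: float, price_per_gb: float) -> float:
    """Transfer cost for a given volume at an assumed per-GB price."""
    return gb * price_per_gb


if __name__ == "__main__":
    gb = monthly_cross_zone_replication_gb(produce_mb_per_s=10.0,
                                           replication_factor=3)
    # 0.02 per GB is a placeholder; substitute your provider's actual rate.
    cost = monthly_transfer_cost(gb, price_per_gb=0.02)
    print(f"cross-zone replication: {gb:.0f} GB/month, cost {cost:.2f}")
```

Under these assumptions the replication term grows linearly with produce throughput, and it is this term that the proposal expects to disappear when replication is delegated to object storage.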

4 comments
By @NortySpock - 9 days
I presume this is part of the knock-on effects of Confluent (managed Kafka) buying WarpStream (Kafka emulated on S3 object storage).

Also a shout-out to Bento, the fork of Benthos created after Redpanda acquired it.

By @chtefi - 9 days
Full disaggregation of compute and storage is the right direction. Let storage handle replication, it's getting good, global, low latency, cheaper (like with S3 Express). Kafka becomes a smart data ingester and router: it moves bytes, enforces ordering, does minimal buffering. That's it. Do one thing well.

You get a system simpler to operate, to scale, and more flexible; data could be consumed outside of Kafka itself (in a batch way typically), without duplicating the data, that's a big win.