Blogs on OLake
Conflict-Free CDC into Apache Iceberg: Architecting Temporal Memory for Autonomous Agents
Learn how to build 'Temporal Memory' for Agentic AI, bypassing the Iceberg Read Amplification Wall with OLake CDC and ClickHouse for sub-second network analytics.
Apache Iceberg Table Maintenance Made Easier with OLake Fusion
OLake Fusion is an Apache Iceberg table maintenance solution for CDC tables, helping manage small files and delete files with tiered scheduling, metrics, and lower Spark costs.
50% Cheaper (2x Faster) Iceberg Compaction: OLake Fusion (Open Source) Beats Spark
We benchmark Spark rewrite_data_files against OLake Fusion compaction on Apache Iceberg by running a full TPC-H lineitem load from Postgres to GCP, applying 200k-record CDC batches every 2 minutes, and tracking TPC-H Query 6 performance, runtime, resource usage, and infrastructure cost.
The Architect’s Guide to CDC with Apache Iceberg
Learn how to design reliable CDC pipelines into Apache Iceberg, covering ingestion patterns, delete handling, and architecture best practices.
Iceberg Compaction: How Much Faster Are TPC-H Queries?
We ran TPC-H queries on Iceberg tables with many small files, then compacted them and ran the same queries again. Here's how much faster compaction made them.
Apache Iceberg Observability: Monitoring & Metrics for Data Lake Tables
How Apache Iceberg turns table metadata into a first-class observability layer, enabling proactive monitoring, anomaly detection, and automated maintenance for modern data lakes.










