3 posts tagged with "Compaction"
Blogs on the topic Iceberg table compaction
View All TagsApache Iceberg Table Maintenance Made Easier with OLake Fusion
OLake Fusion is an Apache Iceberg table maintenance solution for CDC tables, helping manage small files and delete files with tiered scheduling, metrics, and lower Spark costs.
50% Cheaper (2x Faster) Iceberg Compaction: OLake Fusion (Open Source) Beats Spark
We benchmark Spark rewrite_data_files against OLake Fusion compaction on Apache Iceberg by running a full TPCH lineitem load from Postgres to GCP, applying 200k-record CDC batches every 2 minutes, and tracking TPC-H Query 6 performance, runtime, resource usage, and infrastructure cost.
Iceberg Compaction: How Much Faster Are TPC-H Queries?
We ran TPC-H queries on Iceberg tables with many small files, then compacted them and ran the same queries again. Here's how much faster compaction made them.




