Use Cases for OLake Fusion
1. Improving Query Performance on Iceberg Tables
As Iceberg tables grow through continuous ingestion, updates, and deletes, they accumulate small files and delete files, which silently degrade query performance over time.
OLake Fusion addresses this by compacting these files and rewriting your Iceberg tables, keeping them compact and query-ready.
This approach provides:
- Faster query execution: Fewer files per scan means less I/O overhead for query engines.
- Reduced planning time: Cleaner metadata speeds up query planning in engines like Trino, Spark, and DuckDB.
- Delete file resolution: Positional and equality delete files are applied and removed, eliminating redundant overhead at read time.
With OLake Fusion, your Iceberg tables stay performant as they scale, without requiring manual maintenance scripts or custom tooling.
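To make the "fewer files per scan" point concrete, here is a minimal Python sketch of how many small files can be grouped into a handful of target-sized files. It is an illustration only, not OLake Fusion's actual algorithm: the function name `plan_groups`, the 128 MiB target, and the greedy grouping are all assumptions.

```python
from typing import List

MIB = 1024 * 1024
TARGET_BYTES = 128 * MIB  # assumed target file size, not an OLake Fusion default


def plan_groups(file_sizes: List[int], target_bytes: int = TARGET_BYTES) -> List[List[int]]:
    """Greedily group file sizes so each group stays at or under target_bytes.

    Each group stands for one rewritten (compacted) output file.
    A single file larger than the target gets a group of its own.
    """
    groups: List[List[int]] = []
    current: List[int] = []
    current_size = 0
    for size in sorted(file_sizes, reverse=True):
        # Close the open group if this file would push it over the target.
        if current and current_size + size > target_bytes:
            groups.append(current)
            current, current_size = [], 0
        current.append(size)
        current_size += size
    if current:
        groups.append(current)
    return groups


# 64 small files of 8 MiB each, as a high-frequency ingest might produce.
small_files = [8 * MIB] * 64
groups = plan_groups(small_files)
print(len(small_files), "files ->", len(groups), "files after compaction")
# prints: 64 files -> 4 files after compaction
```

In this example a query engine would open 4 files instead of 64 to read the same data, which is where the I/O and planning savings come from.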
2. Observability of Iceberg Tables
Understanding the health and state of your Iceberg tables is critical for diagnosing performance issues and planning maintenance. Without visibility into table internals, teams are often left guessing why queries are slow or why storage costs are climbing.
OLake Fusion provides deep observability into your Iceberg tables, exposing key metrics that show the exact state of each table.
Key benefits:
- File-level insights: Track the number of data files, the number of delete files, the total file count, and the average file size per table.
- Health Score: Quickly identify which tables need compaction without digging into individual file-level metrics.
This gives data engineers and platform teams the clarity needed to make informed decisions about when and where to run maintenance, instead of running it blindly on a schedule.
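The metrics listed above can be sketched in a few lines of Python. The file-level counts follow the list directly, but the scoring formula is purely hypothetical: this page does not describe how OLake Fusion computes its Health Score, so `health_score`, the 128 MiB target, and the `FileInfo` type are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List

MIB = 1024 * 1024


@dataclass
class FileInfo:
    size_bytes: int
    is_delete_file: bool = False


def table_metrics(files: List[FileInfo]) -> Dict[str, float]:
    """Compute the file-level metrics named above for one table."""
    total = len(files)
    delete_files = sum(1 for f in files if f.is_delete_file)
    avg_size = sum(f.size_bytes for f in files) / total if total else 0.0
    return {
        "data_files": total - delete_files,
        "delete_files": delete_files,
        "total_files": total,
        "avg_file_size_bytes": avg_size,
    }


def health_score(metrics: Dict[str, float], target_bytes: int = 128 * MIB) -> float:
    """Hypothetical 0-100 score: penalize small files and a high share of delete files.

    This is NOT OLake Fusion's formula, only an illustration of the idea.
    """
    if metrics["total_files"] == 0:
        return 100.0
    size_ratio = min(metrics["avg_file_size_bytes"] / target_bytes, 1.0)
    delete_ratio = metrics["delete_files"] / metrics["total_files"]
    return round(100 * size_ratio * (1 - delete_ratio), 1)


# A fragmented table: 90 small data files plus 10 delete files.
fragmented = [FileInfo(4 * MIB) for _ in range(90)] + [
    FileInfo(4 * MIB, is_delete_file=True) for _ in range(10)
]
# A compact table: 10 data files at the target size, no delete files.
compact = [FileInfo(128 * MIB) for _ in range(10)]

for name, files in [("fragmented", fragmented), ("compact", compact)]:
    print(name, health_score(table_metrics(files)))
```

Under these assumptions the fragmented table scores far lower than the compact one, which is the kind of signal a single score is meant to give before anyone looks at individual files.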
3. Table-Level Optimization
Different Iceberg tables have different maintenance needs. A high-frequency CDC table accumulates small files rapidly, while a large historical table might only need periodic snapshot cleanup. A one-size-fits-all maintenance strategy wastes compute and can introduce unnecessary churn.
OLake Fusion enables fine-grained, table-level optimization, allowing teams to configure and apply maintenance operations independently for individual tables.
Key benefits:
- Per-table configuration: Define compaction strategies and configurations independently for each table.
- Selective operation control: Choose which type of compaction to apply; not every table needs a complete rewrite every time.
This granular control ensures that compute is spent where it matters most, keeping critical tables optimized without over-maintaining stable ones.
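As a rough sketch of what per-table configuration can look like, the Python below keeps one policy per table and falls back to a default for everything else. None of these names come from OLake Fusion: `TablePolicy`, the `"light"` and `"full"` strategy labels, the table names, and the file size values are hypothetical placeholders.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class TablePolicy:
    strategy: str                  # hypothetical labels: "light" or "full" rewrite
    target_file_size_mb: int = 128  # assumed value, for illustration only
    enabled: bool = True


# Applied to any table without an explicit entry.
DEFAULT_POLICY = TablePolicy(strategy="light")

# Hypothetical tables: a busy CDC table gets a full rewrite,
# while a stable historical table is skipped entirely.
POLICIES: Dict[str, TablePolicy] = {
    "sales.orders_cdc": TablePolicy(strategy="full", target_file_size_mb=256),
    "archive.events_2019": TablePolicy(strategy="light", enabled=False),
}


def policy_for(table: str) -> TablePolicy:
    """Return the table's own policy, or the default if none is defined."""
    return POLICIES.get(table, DEFAULT_POLICY)


for table in ["sales.orders_cdc", "archive.events_2019", "marketing.clicks"]:
    policy = policy_for(table)
    action = policy.strategy if policy.enabled else "skip"
    print(f"{table}: {action}")
```

Keeping the policy next to the table name means compute goes to the tables that need it, and stable tables are left alone by default.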