What is Iceberg Maintenance?
In Apache Iceberg, table maintenance is the process of keeping tables efficient and performant as data is continuously added, updated, or deleted. As an Iceberg table evolves, it can accumulate small data files, delete files, and additional metadata. Over time, this can make the table less efficient for querying and increase storage and compute overhead.
Maintenance operations help address this by keeping the table well-organized, compact, and optimized for query engines, so that performance stays consistent as data grows.
When is Table Maintenance Required?
In Apache Iceberg, table maintenance should be performed periodically to ensure consistent query performance and efficient storage as data evolves.
You should consider running maintenance in the following scenarios:
- Frequent Data Ingestion or Updates
- Accumulation of Small Files
- Presence of Delete Files
- High Partition Cardinality
- Degrading Query Performance
- Growing Table Size Over Time
Regular maintenance ensures that Iceberg tables remain optimized, scalable, and performant for analytical workloads.
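As a rough illustration of two of the scenarios above (small files and delete files), the check below flags a table from simple file statistics. It is a minimal sketch, not part of OLake or Iceberg: the `TableStats` type, the `maintenance_reasons` function, and all three thresholds are hypothetical values chosen for the example, and the statistics are assumed to have been collected elsewhere, for instance from the table's metadata.

```python
from dataclasses import dataclass
from typing import List, Sequence

# Hypothetical thresholds chosen for illustration only; tune them per workload.
SMALL_FILE_BYTES = 32 * 1024 * 1024   # files below 32 MiB count as "small"
SMALL_FILE_RATIO = 0.5                # flag when half or more of the files are small
DELETE_FILE_RATIO = 0.1               # flag at one delete file per ten data files


@dataclass
class TableStats:
    """Hypothetical container for statistics gathered from table metadata."""
    data_file_sizes: Sequence[int]    # size of each data file, in bytes
    delete_file_count: int            # number of delete files in the table


def maintenance_reasons(stats: TableStats) -> List[str]:
    """Return the scenarios from the list above that the statistics suggest."""
    reasons: List[str] = []
    total = len(stats.data_file_sizes)
    if total == 0:
        return reasons  # an empty table has nothing to compact

    small = sum(1 for size in stats.data_file_sizes if size < SMALL_FILE_BYTES)
    if small / total >= SMALL_FILE_RATIO:
        reasons.append("Accumulation of Small Files")
    if stats.delete_file_count / total >= DELETE_FILE_RATIO:
        reasons.append("Presence of Delete Files")
    return reasons
```

A table with eight 1 MiB files, two 256 MiB files, and three delete files would be flagged for both reasons under these assumed thresholds, while a table of ten 256 MiB files with no delete files would not be flagged at all.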
Iceberg maintenance (Optimization) is available starting from v0.4.0. Upgrade OLake UI to access the Maintenance module.
- Existing users: If you are already using OLake for ingestion, follow the upgrade guide to access the Maintenance module.
- New users: Follow the quickstart guide to get started.