Mastering Iceberg Maintenance: From Compaction to Cost Optimization
Details
- Date: January 15, 2025
- Time: 11:00 AM EST, 08:30 PM IST
- Duration: 60 mins
Summary
Apache Iceberg has quickly become the backbone of modern data lakes, but maintaining tables efficiently is just as critical as building them. This session dives into the art of Iceberg table maintenance, from compaction strategies to metadata cleanup, with a focus on balancing query performance and compute cost. Attendees will walk away with actionable strategies and best practices to keep their Iceberg tables lean, fast, and future-proof.
- ✅ Introduction & The Maintenance Challenge - Why Iceberg table maintenance is critical for production data lakes
- ✅ Compaction Strategies Deep Dive - Bin-packing vs. Sorting vs. Z-ordering and when to use each approach
- ✅ Metadata & Snapshot Management - Snapshot expiration policies, orphan file cleanup, and manifest rewrites
- ✅ File Layout Optimization - Solving the small file problem and right-sizing files for optimal performance
- ✅ Cost-Performance Optimization Framework - Measuring ROI of maintenance operations and scheduling strategies
- ✅ Q&A and Best Practices - Interactive session with actionable insights for data engineering teams
Hosted By

Amit Gilad
CTO @ Lakeops
Apache Iceberg enthusiast and former data engineer at Cloudinary. Amit's work centers on data lakes and Apache Iceberg: organizing and managing vast datasets so that teams can extract valuable insights from them.
Ready to join our next webinar?
Secure your spot by registering below.