Summary
In this in-depth technical webinar, Viktor Kessler, co-founder of Vakamo and former tech leader at MongoDB and Dremio, dissects the engineering principles that make Apache Iceberg a cornerstone of the modern lakehouse. Expect a hands-on look at Iceberg’s architecture, REST Catalog, metadata optimizations, security model, and a live Lakekeeper demo, plus a glimpse into forthcoming OLake integrations.
Chapters & Topics
Welcome and Speaker Introduction
Host frames the session goals and introduces Viktor’s background in distributed data systems, setting the stage for a code-heavy, architecture-first discussion.
Iceberg Lakehouse Architecture Foundations
Breakdown of table formats, manifest lists, snapshots, and how Iceberg reconciles data-lake flexibility with data-warehouse performance guarantees.
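As a rough illustration of the hierarchy named above (snapshot, manifest list, manifests, data files), the following toy model shows why each commit produces a new immutable snapshot and how an older snapshot can still be read. It is a minimal sketch, not Iceberg's actual on-disk format: the class names, fields, and file paths are hypothetical, and real manifests also carry partition data and column statistics.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class DataFile:
    path: str
    record_count: int


@dataclass(frozen=True)
class Manifest:
    # A manifest tracks a group of data files (toy version: paths and counts only).
    data_files: tuple


@dataclass(frozen=True)
class Snapshot:
    snapshot_id: int
    # Plays the role of the manifest list: every manifest visible in this snapshot.
    manifests: tuple

    def data_files(self):
        return [f for m in self.manifests for f in m.data_files]


@dataclass
class Table:
    snapshots: list = field(default_factory=list)

    def append(self, files):
        """Commit new files: reuse the previous manifests and add one new manifest."""
        previous = self.snapshots[-1].manifests if self.snapshots else ()
        snapshot = Snapshot(len(self.snapshots) + 1, previous + (Manifest(tuple(files)),))
        self.snapshots.append(snapshot)
        return snapshot

    def scan(self, snapshot_id=None):
        """List file paths for the latest snapshot, or for an older one (time travel)."""
        if snapshot_id is None:
            snapshot = self.snapshots[-1]
        else:
            snapshot = next(s for s in self.snapshots if s.snapshot_id == snapshot_id)
        return [f.path for f in snapshot.data_files()]


table = Table()
table.append([DataFile("a.parquet", 10)])
table.append([DataFile("b.parquet", 5)])
latest_paths = table.scan()
first_snapshot_paths = table.scan(snapshot_id=1)
```

Because the second commit reuses the first commit's manifest instead of rewriting it, both snapshots stay readable and share the unchanged metadata.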
REST Catalog Deep Dive
Examination of distributed metadata management, multi-engine interoperability, and how the REST Catalog decouples compute engines from the catalog backend at scale.
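To make the multi-engine point concrete, the sketch below builds the request paths an engine would call on a REST catalog. The route shapes follow the Iceberg REST catalog OpenAPI specification as commonly documented (`/v1/config`, `/v1/{prefix}/namespaces/...`), but treat them as an assumption to verify against the spec version in use. The prefix, namespace, and table names are hypothetical, and nothing here is specific to Lakekeeper.

```python
from urllib.parse import quote

# Multi-level namespaces are joined with the unit separator (0x1F) in URL paths.
NAMESPACE_SEPARATOR = "\x1f"


def encode_namespace(levels):
    """Percent-encode a multi-level namespace such as ("analytics", "events")."""
    return quote(NAMESPACE_SEPARATOR.join(levels), safe="")


def config_path():
    """Path an engine calls first to fetch catalog defaults and overrides."""
    return "/v1/config"


def list_tables_path(prefix, namespace):
    return f"/v1/{quote(prefix, safe='')}/namespaces/{encode_namespace(namespace)}/tables"


def load_table_path(prefix, namespace, table):
    return f"{list_tables_path(prefix, namespace)}/{quote(table, safe='')}"


# Hypothetical identifiers, for illustration only.
table_path = load_table_path("warehouse1", ("analytics", "events"), "clicks")
```

Any engine that speaks these routes can load the same table metadata, which is what keeps the catalog implementation swappable behind the HTTP interface.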
Metadata Optimization Techniques
Strategies for pruning, partition evolution, and manifest compaction that reduce query-planning and scan latency in production clusters.
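The pruning idea can be sketched with per-file lower and upper bounds on a single column: a file is skipped whenever its value range cannot overlap the query's range. This is a simplified illustration of min/max pruning in general, not Iceberg's planner. The `FileStats` type, file names, and bounds are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileStats:
    path: str
    lower: int  # smallest value of the filtered column in this file
    upper: int  # largest value of the filtered column in this file


def prune(files, query_lo, query_hi):
    """Keep only files whose [lower, upper] range may overlap [query_lo, query_hi]."""
    return [f for f in files if f.upper >= query_lo and f.lower <= query_hi]


files = [
    FileStats("f1.parquet", 0, 99),
    FileStats("f2.parquet", 100, 199),
    FileStats("f3.parquet", 200, 299),
]

# A filter equivalent to: WHERE col BETWEEN 120 AND 210
survivors = prune(files, 120, 210)
```

Here `f1.parquet` is never opened, because its bounds show that no row in it can match the filter.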
Enterprise-Grade Governance & Security
Fine-grained access control patterns, audit logging, and compliance features that satisfy strict CISO and regulatory requirements.
Live Demo: Lakekeeper in Action
Real-world performance benchmarks and implementation patterns using Lakekeeper, highlighting best practices for ingestion, maintenance, and recovery.
Future Directions: OLake Write-Compute Layer
Preview of planned OLake integration as a high-throughput write layer, covering optimization strategies and roadmap milestones.
Audience Q&A and Closing Thoughts
Open floor for engineering questions on deployment, tuning, and migration; Viktor shares actionable takeaways and next steps.
Action Items
- Distribute session recording, slide deck, and demo code snippets to all registrants.
- Share benchmark scripts and configuration files used in the Lakekeeper demo.
- Provide a follow-up resource list (docs, blogs, OSS repos) for a deeper dive into Iceberg and OLake.
Key Questions
- What are the trade-offs between REST Catalog and Hive/Glue catalogs in large enterprises?
- How do manifest compaction and snapshot expiration impact long-term storage costs?
- What design considerations should teams weigh when introducing OLake as a write layer atop existing Iceberg tables?