24 posts tagged with "OLake"
Blogs on the topic OLake
Apache Iceberg Table Maintenance Made Easier with OLake Fusion
OLake Fusion is an Apache Iceberg table maintenance solution for CDC tables, helping manage small files and delete files with tiered scheduling, metrics, and lower Spark costs.
50% Cheaper (2x Faster) Iceberg Compaction: OLake Fusion (Open Source) Beats Spark
We benchmark Spark rewrite_data_files against OLake Fusion compaction on Apache Iceberg by running a full TPC-H lineitem load from Postgres to GCP, applying 200k-record CDC batches every 2 minutes, and tracking TPC-H Query 6 performance, runtime, resource usage, and infrastructure cost.
IBM Db2 LUW to Lakehouse: Sync to Apache Iceberg Using OLake
A practical guide to syncing IBM Db2 for LUW databases to Apache Iceberg using OLake, covering setup, configuration, sync modes, troubleshooting, and Db2-specific considerations like RUNSTATS and REORG.
How to Compact Apache Iceberg Tables: Small Files + Automation with Apache Amoro™
A practical guide to fixing small-file bloat in Apache Iceberg, showing when and how to run compaction, the performance gains you can expect, and how Apache Amoro™ automates it to turn Iceberg tables into self-optimizing lakehouses.
Sync MSSQL to Your Lakehouse with OLake
A practical guide to syncing Microsoft SQL Server (MSSQL) into Apache Iceberg using OLake, covering sync modes, CDC setup, schema changes, data type mapping, and troubleshooting.
Ingesting Files from S3 with OLake: Turn Buckets into Reliable Streams (AWS + MinIO + LocalStack)
A comprehensive guide to ingesting data from Amazon S3 and S3-compatible storage using OLake, covering stream discovery, format support, incremental sync, and best practices for AWS, MinIO, and LocalStack.









