Blogs on OLake
Building a Complete Open Data Lakehouse from Scratch with OLake, PrestoDB and MinIO
Learn how to build a complete open data lakehouse from scratch using MySQL, OLake, PrestoDB and MinIO. Get it running on your local machine in just a few steps with real-time CDC and analytics.
Apache Iceberg vs Delta Lake: Comparison for Batch Analytics & ML Pipelines
Compare Apache Iceberg and Delta Lake for batch analytics and ML pipelines. Learn about performance, ecosystem integration, and when to choose each format.
OLake Ingestion Filters Explained: SQL‑style WHERE‑Clause Support for Postgres, MySQL & MongoDB
Learn how OLake's new filter feature enables selective data ingestion across Postgres, MySQL, and MongoDB, improving efficiency and reducing processing overhead.
Building a Modern Lakehouse with Iceberg, OLake, Lakekeeper & Trino
Iceberg is the storage "brain," OLake is the real-time "pipeline," and Trino is the fast "question-answering" engine. Together they turn raw object-storage files into a governed, low-latency analytics platform.
Running OLake on EC2 with Your Existing Airflow, Sync Your Data Effortlessly
At OLake, we are building tools to make data integration seamless. Today, we are excited to show you how to leverage your existing Apache Airflow...
What makes OLake fast?
OLake is engineered for high-throughput ELT workloads. It combines adaptive chunking strategies and parallelized execution for historical loads with change data capture (CDC) techniques to optimize data ingestion performance...