OLake (v0.1.9 – v0.1.11)

August 15 – August 27, 2025

🎯 What's New

Sources

  1. Multi-Cursor support for MongoDB -
    Added multi-cursor capability for MongoDB, enabling OLake to parallelize reads by splitting the workload across multiple cursors, improving read performance.

  2. Incremental Sync: MySQL and Postgres -
    Added incremental synchronization for MySQL and Postgres. This enables change-only replication for both sources: OLake transfers only the rows that are new or updated since the last run, reducing latency and data volume for recurring pipelines.
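
To make the change-only idea concrete, here is a minimal sketch of cursor-based incremental reads. It is not OLake's implementation: it uses Python's built-in sqlite3 module as a stand-in for MySQL or Postgres, and the `orders` table, the `updated_at` cursor column, and the `incremental_sync` helper are hypothetical names chosen for illustration.

```python
import sqlite3


def incremental_sync(conn, table, cursor_column, last_value):
    """Return rows whose cursor column is greater than last_value,
    plus the new cursor position to persist for the next run."""
    # Identifiers cannot be bound as SQL parameters; this sketch assumes
    # table and cursor_column come from trusted configuration.
    query = (
        f"SELECT * FROM {table} "
        f"WHERE {cursor_column} > ? ORDER BY {cursor_column}"
    )
    cur = conn.execute(query, (last_value,))
    columns = [d[0] for d in cur.description]
    rows = [dict(zip(columns, r)) for r in cur.fetchall()]
    # Strict ">" can miss rows written later with the same cursor value;
    # a real connector needs a policy for that case.
    new_value = rows[-1][cursor_column] if rows else last_value
    return rows, new_value


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT, updated_at INTEGER)"
    )
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?, ?)",
        [(1, "new", 100), (2, "new", 110)],
    )
    # First run: no saved state, so every row is transferred.
    first, state = incremental_sync(conn, "orders", "updated_at", 0)
    # Changes between runs: one update and one insert.
    conn.execute("UPDATE orders SET status = 'paid', updated_at = 120 WHERE id = 1")
    conn.execute("INSERT INTO orders VALUES (3, 'new', 130)")
    # Second run: only the changed and new rows are transferred.
    second, state = incremental_sync(conn, "orders", "updated_at", state)
    return first, second, state
```

In this sketch the second run returns only the updated row and the newly inserted row, which is the reduction in data volume the feature description refers to.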

Platform Features

  1. Consistent Batch Size -
    Introduced a consistent batch size across all drivers, standardizing the number of records processed in each chunk or transaction. Each batch therefore uses predictable resources (memory, CPU, S3 requests), giving more uniform throughput regardless of source.

  2. Default Normalisation (Relational DBs) -
    Normalisation is now enabled by default for relational databases. It applies a canonical structure to raw table data, e.g., converting TIMESTAMP WITH TIME ZONE to a uniform ISO format, splitting nested JSON columns into separate columns, and ensuring consistent column naming. This makes the data easier for downstream systems to consume.

  3. Discover Timeout Override -
    Allows you to extend or shorten the default timeout for schema discovery operations. If your database catalog is very large or slow to respond, you can override the discovery timeout to prevent premature failures or to speed up quick checks.
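
As an illustration of the kind of normalisation described in item 2, the sketch below converts timezone-aware timestamps to ISO 8601 in UTC, flattens nested JSON objects into separate columns, and applies snake_case column names. The exact rules OLake applies are not stated here, so the naming scheme, the choice of UTC, and the helper names (`normalize_name`, `normalize_row`) are assumptions made for this example.

```python
import re
from datetime import datetime, timedelta, timezone


def normalize_name(name):
    """Convert a column name to lower snake_case (assumed convention)."""
    # Insert "_" at camelCase boundaries, e.g. "createdAt" -> "created_At".
    name = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", name.strip())
    # Collapse any run of non-alphanumeric characters into one "_".
    name = re.sub(r"[^0-9A-Za-z]+", "_", name)
    return name.strip("_").lower()


def normalize_value(value):
    """Render timezone-aware datetimes as ISO 8601 strings in UTC."""
    if isinstance(value, datetime) and value.tzinfo is not None:
        return value.astimezone(timezone.utc).isoformat()
    return value


def normalize_row(row, parent=""):
    """Flatten nested dicts into prefixed columns and normalise values."""
    flat = {}
    for key, value in row.items():
        column = normalize_name(f"{parent}_{key}" if parent else key)
        if isinstance(value, dict):
            # Nested JSON object: recurse, using the column as a prefix.
            flat.update(normalize_row(value, column))
        else:
            flat[column] = normalize_value(value)
    return flat


def example():
    ist = timezone(timedelta(hours=5, minutes=30))
    raw = {
        "OrderID": 7,
        "createdAt": datetime(2025, 8, 15, 12, 0, tzinfo=ist),
        "payload": {"customer": {"name": "Ada"}, "total": 9.5},
    }
    return normalize_row(raw)
```

Calling `example()` yields a flat record with the columns `order_id`, `created_at`, `payload_customer_name`, and `payload_total`, which is the shape a downstream table or query engine would see under these assumed rules.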

🔧 Bug Fixes & Stability

  1. Oracle Incremental Fallback Cursor Deletion -
    Fixed the logic for deleting fallback cursors used for incremental reads from Oracle. When an incremental cursor fails or reaches its window, OLake now correctly closes and cleans up those temporary cursors, preventing session leaks.

  2. Spark Compatibility for Timestamps -
    Fixed the timestamps OLake generates so that they are compatible across Spark versions.

  3. Default Columns in Sync -
    OLake now always includes the essential default columns (keys and metadata) in the catalog, consistently for both full refresh and incremental sync modes.


