
OLake (v0.1.6 – v0.1.8)

July 17 – July 30, 2025

🎯 What's New

Sources

  1. Incremental Sync: MongoDB and Oracle -
    Added incremental sync support for MongoDB and Oracle sources. OLake now transfers only the documents or rows that are new or updated since the last run, reducing latency and data volume for recurring pipelines.

  2. Oracle Connector Filter & Chunking -
    Added filter support and an optimized chunking strategy for the Oracle connector. Filters are applied at the query level so only relevant rows are fetched, and evenly sized data chunks maximize parallel throughput.

  3. Oracle Multi-Cursor Support for Incremental Sync -
    OLake incremental sync can now be configured with a primary and secondary cursor column of the same datatype, where the secondary is used only if the primary cursor value is NULL, reducing missed changes in sparse or null‑heavy tables.

  4. MySQL Binlog Permissions Check -
    Automatically validates that the MySQL user has the required binlog privileges before CDC starts, preventing mid‑run failures due to missing permissions.

  5. Postgres CDC Improvement -
    When Postgres CDC detects an LSN position problem that requires a full reload, it now repositions the replication slot instead of attempting to read from outdated cached WAL data.

  6. Universal Filter Option -
    Offers a consistent filter parameter across all source drivers, letting you apply the same include/exclude rules in MongoDB, Oracle, MySQL, Postgres, and more without driver-specific syntax.
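The primary/secondary cursor behavior described in item 3 can be illustrated with a minimal sketch. This is not OLake's implementation; the function names, the row layout, and the choice to skip rows where both cursor columns are NULL are assumptions made only for illustration.

```python
from typing import Any, Iterable, Optional

Row = dict[str, Any]


def effective_cursor(row: Row, primary: str, secondary: str) -> Optional[Any]:
    """Return the primary cursor value, or the secondary one if the primary is NULL."""
    value = row.get(primary)
    return value if value is not None else row.get(secondary)


def select_incremental(
    rows: Iterable[Row], primary: str, secondary: str, last_seen: Optional[Any]
) -> tuple[list[Row], Optional[Any]]:
    """Pick rows whose effective cursor is past last_seen and return the new high-water mark."""
    selected: list[Row] = []
    high_water = last_seen
    for row in rows:
        cursor = effective_cursor(row, primary, secondary)
        if cursor is None:
            # Assumption: rows with both cursor columns NULL are skipped in this sketch.
            continue
        if last_seen is None or cursor > last_seen:
            selected.append(row)
            if high_water is None or cursor > high_water:
                high_water = cursor
    return selected, high_water


# Hypothetical table with a sparse updated_at column and a created_at fallback.
rows = [
    {"id": 1, "updated_at": 10, "created_at": 5},
    {"id": 2, "updated_at": None, "created_at": 12},
    {"id": 3, "updated_at": 8, "created_at": 8},
    {"id": 4, "updated_at": None, "created_at": None},
]
changed, new_mark = select_incremental(rows, "updated_at", "created_at", last_seen=9)
print([r["id"] for r in changed], new_mark)
```

Row 2 is picked up only because the secondary column stands in for the NULL primary value, which is the case the release note describes for sparse or null-heavy tables.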

Destinations

  1. Clear Destination Flag -
    Provides a flag to clear destination datasets before a full refresh, ensuring the target only contains the latest snapshot. This is useful when resetting tables or removing stale records ahead of a new load.

  2. Custom s3_endpoint in Parquet Writer Config -
    Added an optional s3_endpoint setting to the Parquet writer config, allowing users to write Parquet files to a custom S3-compatible endpoint.
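As a rough sketch of how an optional endpoint setting is typically handled, the snippet below builds a writer config and includes s3_endpoint only when one is supplied. Only the s3_endpoint name comes from these release notes; the helper function, the other key names, and the example values are hypothetical placeholders, so check the OLake documentation for the actual config schema.

```python
import json
from typing import Optional


def build_parquet_writer_config(
    bucket: str, region: str, s3_endpoint: Optional[str] = None
) -> dict:
    """Return a writer config dict; s3_endpoint is added only when provided."""
    config = {
        "s3_bucket": bucket,  # hypothetical key name, for illustration only
        "s3_region": region,  # hypothetical key name, for illustration only
    }
    if s3_endpoint:
        # The optional setting named in the release notes.
        config["s3_endpoint"] = s3_endpoint
    return config


# Example: pointing at a locally hosted S3-compatible store (placeholder URL).
custom_config = build_parquet_writer_config(
    "analytics", "us-east-1", s3_endpoint="http://localhost:9000"
)
default_config = build_parquet_writer_config("analytics", "us-east-1")
print(json.dumps(custom_config, indent=2))
```

Leaving the key out entirely when no endpoint is given keeps the default behavior unchanged for users who do not need a custom endpoint.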

🔧 Bug Fixes & Stability

  1. Credential Parsing Fix -
    Corrects parsing of complex connection strings for Postgres and MongoDB so special characters and URI parameters are handled reliably. This reduces connection errors during job setup and discovery.

  2. Discovery Cursor Fix -
    Fixes merging of cursor fields in the new discover flow so schema and cursor metadata are recorded consistently. This avoids missing or duplicated cursor information when building stream definitions.

  3. Postgres CDC Reliability -
    Improved Postgres CDC behaviour by advancing the LSN during a full load when the cache is invalid.



💡 Join the OLake Community!

Got questions, ideas, or just want to connect with other data engineers?
👉 Join our Slack Community to get real-time support, share feedback, and shape the future of OLake together. 🚀

Your success with OLake is our priority. Don’t hesitate to contact us if you need any help or further clarification!