
OLake Go (v0.7.0)

April 21, 2026 – April 30, 2026

🎯 What's New

Sources

  1. MySQL chunking optimisation -
    Replaced repeated database lookups during chunk discovery with mathematical range splitting (arithmetic progression for numeric primary keys and Unicode-encoded range splitting for string keys), significantly reducing chunk generation time for large tables while ensuring correct collation-aware ordering.

  2. SSH tunnel support for DB2 and MSSQL -
    Added SSH tunnel configuration for the DB2 and MSSQL drivers. DB2 uses a local TCP proxy on localhost:0 forwarded through the SSH client (since go_ibm_db has no Go-level dial hook), while MSSQL routes connections via go-mssqldb's Connector.Dialer and HostDialer interfaces with remote-side DNS resolution.

  3. Schema filtering for PostgreSQL discovery -
    Added an optional schemas config field to restrict the discover operation to user-specified PostgreSQL schemas. When omitted, existing behaviour is preserved and all non-system schemas are discovered.
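
As an illustration of the arithmetic-progression idea behind the MySQL chunking change, here is a minimal sketch for numeric primary keys only. It is not OLake's actual implementation: the `Chunk` type, the `SplitNumericRange` function, and the fixed `step` parameter are hypothetical names chosen for this example, and the string-key and collation handling described above is not covered. Once the minimum and maximum key are known, every chunk boundary is computed without further database lookups.

```go
package main

import "fmt"

// Chunk is a hypothetical inclusive primary-key range [Start, End].
type Chunk struct {
	Start, End int64
}

// SplitNumericRange divides [minPK, maxPK] into chunks whose start values
// form an arithmetic progression with common difference step. The last
// chunk is clamped to maxPK. It returns nil for invalid input.
func SplitNumericRange(minPK, maxPK, step int64) []Chunk {
	if step <= 0 || maxPK < minPK {
		return nil
	}
	var chunks []Chunk
	start := minPK
	for {
		end := start + step - 1
		// Clamp the final chunk; end < start catches int64 overflow.
		if end > maxPK || end < start {
			end = maxPK
		}
		chunks = append(chunks, Chunk{Start: start, End: end})
		if end == maxPK {
			break
		}
		start = end + 1
	}
	return chunks
}

func main() {
	// Keys 1..10 with a step of 4 give the ranges 1-4, 5-8 and 9-10.
	for _, c := range SplitNumericRange(1, 10, 4) {
		fmt.Printf("WHERE pk BETWEEN %d AND %d\n", c.Start, c.End)
	}
}
```

Evenly spaced boundaries assume keys are spread fairly uniformly; with large gaps in the key space, some chunks would hold far fewer rows than others.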

🔧 Bug Fixes & Stability

  1. Upgrade pgx/v5 to v5.9.2 for security fixes -
    Upgraded github.com/jackc/pgx/v5 from v5.7.3 to v5.9.2 to remediate two security vulnerabilities: a critical memory-safety flaw (CVE-2026-33816) that could allow memory corruption and a low-severity SQL injection advisory (GHSA-j88v-2chj-qfwx). No existing functionality is affected by this upgrade.

  2. Oracle chunk boundary query optimisation -
    Replaced N+1 sequential database round trips in splitViaTableIteration with a single NTILE-based query to fetch all chunk boundaries at once, with a fallback to the original loop when table stats are unavailable.

  3. Iceberg positional delete file fix for CDC upserts -
    Compaction failed when multiple changes for the same _olake_id arrived in a single batch, because a single positional delete file ended up referencing multiple data files. Fixed by creating one positional delete file per referenced data file.

  4. PostgreSQL primary key discovery fix via pg_catalog -
    information_schema.key_column_usage incorrectly included foreign key columns as primary keys, causing wrong _olake_id hashes, missed equality deletes, and duplicate rows in Iceberg on CDC upserts. Replaced with a pg_catalog-based query that returns only true primary keys and works correctly for read-only roles on managed databases like RDS, Supabase, and Render.
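
To make the positional delete fix more concrete, here is a minimal sketch of the grouping step it implies: delete records are partitioned by the data file they point to, so each group can be written as its own positional delete file. This is an assumption-laden illustration, not OLake's writer code; `PositionalDelete` and `GroupByDataFile` are hypothetical names, and the file paths are made up.

```go
package main

import (
	"fmt"
	"sort"
)

// PositionalDelete is a hypothetical record meaning "the row at position
// Pos in DataFile is deleted".
type PositionalDelete struct {
	DataFile string
	Pos      int64
}

// GroupByDataFile partitions deletes so that each group references exactly
// one data file. Positions within a group are sorted in ascending order.
func GroupByDataFile(deletes []PositionalDelete) map[string][]int64 {
	groups := make(map[string][]int64)
	for _, d := range deletes {
		groups[d.DataFile] = append(groups[d.DataFile], d.Pos)
	}
	for _, positions := range groups {
		sort.Slice(positions, func(i, j int) bool { return positions[i] < positions[j] })
	}
	return groups
}

func main() {
	// Two changes to the same _olake_id in one batch can touch rows in
	// different data files.
	batch := []PositionalDelete{
		{DataFile: "data-a.parquet", Pos: 7},
		{DataFile: "data-b.parquet", Pos: 2},
		{DataFile: "data-a.parquet", Pos: 3},
	}
	groups := GroupByDataFile(batch)

	// Iterate in sorted key order so the output is deterministic.
	files := make([]string, 0, len(groups))
	for f := range groups {
		files = append(files, f)
	}
	sort.Strings(files)
	for _, f := range files {
		fmt.Printf("delete file for %s: positions %v\n", f, groups[f])
	}
}
```

Each map entry then corresponds to one delete file, so no delete file spans more than one data file.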



💡 Join the OLake Community!

Got questions, ideas, or just want to connect with other data engineers?
👉 Join our Slack Community to get real-time support, share feedback, and shape the future of OLake together. 🚀

Your success with OLake is our priority. Don't hesitate to contact us if you need any help or further clarification!