28-04-2025 Release notes
Breaking Changes
catalog.json renamed to streams.json
#246: As part of improved terminology and consistency, OLake now expects stream configuration to be defined in a file named streams.json instead of catalog.json.
⚠️ This is a backward-incompatible change. Existing pipelines or DAGs must update the file name to continue functioning correctly.
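One way to handle the file rename in bulk is sketched below. It is a minimal sketch, not an official migration tool: the function name `rename_catalog_files`, the `dry_run` flag, and the assumption that your configuration files live under a single root directory are all hypothetical. It renames only the files. References to catalog.json inside DAG definitions or command arguments still need to be updated separately.

```python
from pathlib import Path
from typing import List


def rename_catalog_files(root: Path, dry_run: bool = False) -> List[Path]:
    """Rename every catalog.json file under `root` to streams.json.

    Returns the new paths. A directory that already contains a
    streams.json is skipped so nothing is overwritten.
    """
    renamed: List[Path] = []
    # sorted() materializes the matches before any rename happens.
    for old_path in sorted(root.rglob("catalog.json")):
        if not old_path.is_file():
            continue
        new_path = old_path.with_name("streams.json")
        if new_path.exists():
            # Leave both files in place for manual review.
            continue
        if not dry_run:
            old_path.rename(new_path)
        renamed.append(new_path)
    return renamed
```

Calling `rename_catalog_files(Path("configs"), dry_run=True)` first lists what would change without touching any file (here `configs` is a placeholder directory name).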
Core OLake Changes
🔧 Enhancements & Features
- Hive Catalog Support
  #226: OLake now supports syncing to Apache Iceberg tables using Hive catalog integration.
- Lakekeeper Integration for REST Catalogs
  REST-based catalogs can now be orchestrated with Lakekeeper for better sync lifecycle control.
- Iceberg Partitioning Logic
  #227: Native partitioning support added to Iceberg writer for optimized data layout.
- Fallback ID Generation (olakeID)
  #225: OLake now auto-generates a consistent olakeID if no primary key is present in a record.
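To illustrate the idea behind the fallback ID, here is a minimal sketch of one way a consistent ID can be derived when a record has no primary key. These notes do not describe how OLake computes olakeID, so the hashing scheme (SHA-256 over canonical JSON) and the helper names `fallback_id` and `ensure_id` are assumptions made for illustration only.

```python
import hashlib
import json
from typing import Optional


def fallback_id(record: dict) -> str:
    """Derive a deterministic ID from the record's content (assumed scheme)."""
    # Sorting keys makes the result independent of field order.
    canonical = json.dumps(
        record, sort_keys=True, separators=(",", ":"), default=str
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def ensure_id(record: dict, primary_key: Optional[str] = None) -> dict:
    """Return the record, adding an olakeID only when no primary key exists."""
    if primary_key is not None and record.get(primary_key) is not None:
        return dict(record)
    return {**record, "olakeID": fallback_id(record)}
```

Because the ID is a function of the record's content, the same record always gets the same ID across runs. The trade-off is that any change to a field also changes the ID.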
🛠 Improvements & Fixes
- Iceberg Writer Updated to v1.7.2
  #252: Switched to Iceberg 1.7.2 to resolve an issue with S3 connections not closing promptly when using the Hive catalog.
- Handle Special Double Values (NaN, Infinity, -Infinity)
  #251: Fixed ingestion failures caused by unsupported float values in source data. Fixes: #109
- Improved Logging for Iceberg Sync Failures
  #239: Enhanced logging output with clearer, color-coded error messages for debugging sync failures.
- Pre-commit Hook Integration
  #178: Added automated formatting and linting checks via pre-commit hooks to enforce code standards.
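The special double values item above is the kind of issue that can be reproduced outside OLake. The sketch below shows one common way to guard a strict-JSON step against NaN, Infinity, and -Infinity by replacing them with null. These notes do not say how the fix in #251 treats such values, so the replacement policy and the helper name `sanitize_doubles` are assumptions.

```python
import math
from typing import Any


def sanitize_doubles(value: Any) -> Any:
    """Recursively replace NaN, Infinity, and -Infinity with None."""
    if isinstance(value, float) and not math.isfinite(value):
        return None
    if isinstance(value, dict):
        return {key: sanitize_doubles(item) for key, item in value.items()}
    if isinstance(value, list):
        return [sanitize_doubles(item) for item in value]
    # Finite numbers, strings, booleans, and None pass through unchanged.
    return value
```

After sanitizing, `json.dumps(row, allow_nan=False)` no longer raises `ValueError` for these values. A pipeline that needs to preserve them could map them to strings instead of null.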
🔌 Connector Releases
🐘 Postgres Connector — v0.0.4
- Resolved data type issues. Type mapping is now more consistent and accurate, and destination column data types match those in the source database.
🍃 MongoDB Connector — v0.0.10
- Added primitive data type support: #247
- Improved performance when syncing high-volume collections.