Overview | OLake

Last updated:6/23/2025|... min read

What is the Iceberg Writer?

The Iceberg Writer syncs data from databases (MySQL, MongoDB, PostgreSQL) into Apache Iceberg. Apache Iceberg is a table format that offers a number of benefits over traditional table formats like Parquet and ORC. Iceberg tables are designed to be efficient for both reads and writes, and they support schema evolution, ACID transactions, and time travel.

Supported Catalogs

Catalog type	Docs / example	Comments
REST – Lakekeeper	LINK	[Officially Supported] Rust-native catalog with optimistic locking; Helm chart available for K8s.
REST - Unity	LINK	[Supported and tested] Unity Catalog (Databricks) with Personal Access Token authentication.
REST – Gravitino	LINK	[Supported, Yet to be tested] Uses standard Iceberg REST; Gravitino adds multi-cloud routing.
REST – Nessie	LINK	[Supported, Yet to be tested] Time-travel branches/tags; supply `nessie.endpoint` in `destination.json`.
REST – Polaris	LINK	[Supported, Yet to be tested]
Hive Metastore	Hive catalog config	Classic HMS; good fit for on-prem Hadoop or EMR.
JDBC Catalog (Postgres/MySQL)	JDBC catalog sample	Stores Iceberg metadata in an RDBMS(Postgres); easiest to spin up locally with Postgres.
AWS Glue Catalog	Glue catalog IAM & config	Best choice on AWS; lets Athena, EMR, and Redshift query the same tables instantly.
Azure Purview	Not Planned, submit a request
BigLake Metastore	Not Planned, submit a request

For more catalog options, please refer to the OLake roadmap.

Quick Start Guide

info

OLake UI is live (beta)! You can now use the UI to configure your Iceberg Destination, manage and configure various catalogs. Check it out at OLake UI regarding how to setup using Docker Compose and running it locally.

olake-destination-iceberg

Use OLake UI for Apache Iceberg
Use OLake CLI for Apache Iceberg

Create a Iceberg Destination in OLake UI

Follow the steps below to get started with Iceberg Destination using the OLake UI (assuming the OLake UI is running locally on localhost:8000):

Navigate to Destinations Tab.
Click on + Create Destination.
Select Apache Iceberg as the Destination type from Connector drop down, select the Catalog type (AWS Glue, REST, JDBC, Hive).
Fill in the required connection details in the form. For details regarding the connection details, refer to the Iceberg Destination Configuration docs section on the right side of UI.
Click on Create ->
OLake will test the destination connection and display the results. If the connection is successful, you will see a success message. If there are any issues, OLake will provide error messages to help you troubleshoot.

This will create a Iceberg destination in OLake, now you can use this destination in your Jobs Pipeline to sync data from any Database to Apache Iceberg.

Edit Iceberg Destination in OLake UI

To edit an existing Iceberg destination in OLake UI, follow these steps:

Navigate to the Destinations Tab.
Locate the Iceberg destination you want to edit from Active destination or Inactive destination tabs or using the search destination bar.
Click on the Edit button next to the Destinations from the Actions tab (3 dots).
Update the connection details as needed in the form and Click on Save Changes.

caution

Editing a destination can break pipeline.

You will see a notification saying "Due to editing, the jobs are going to get affected".

Editing this destination will affect the following jobs that are associated with this destination and as a result will fail immediately. Do you still want to edit the destination?

olake-destination-edit-2

OLake will test the updated destination connection once you hit confirm on the destination Editing Caution Modal. If the connection is successful, you will see a success message. If there are any issues, we will provide error messages to help you troubleshoot.

Jobs Associated with Iceberg Destination

In the Destination Edit page, you can see the list of jobs that are associated with this destination. You can also see the status of each job, whether it is running, failed, or completed and can pause the job from the same screen as well.

olake-destination-associated-job-1

Delete Iceberg Destination in OLake UI

To delete an existing Iceberg Destination in OLake UI, follow these steps:

Navigate to the Destination Tab.
Locate the destination you want to delete from Active Destinations or Inactive Destinations tabs or using the search destination bar.
Click on the Delete button next to the destinations from the Actions tab (3 dots).

olake-destination-delete-2

A confirmation dialog will appear asking you to confirm the deletion.
Click on Delete to confirm the deletion.

olake-destination-delete-1

This will remove the Iceberg Destination from OLake.

note

You can also delete a Destination from the Destination Edit page by clicking on the Delete button at the bottom of the page.

Its a simple 3 step process:

Create a source config file and lets name it source.json,
Create another config file named destination.json and
Run the discover and sync commands to fetch the schema and start syncing the data respectively.

info

source.json - holds the source database information like host, port, username, password, database name, etc.
destination.json - holds the iceberg destination configurations like iceberg table name, iceberg database name, catalog information, etc.

Now, depending upon from where (source) to where (destination) you would like to sync the data, you can choose the below configurations.

PostgreSQL to Iceberg | Postgres Source Config
MongoDB to Iceberg | MongoDB Source Config
MySQL to Iceberg | MySQL Source Config

Now that you have the source configuration set, lets move on to the destination configuration.

OLake supports 4 types of catalogs for Iceberg writer, JDBC, AWS Glue, REST and Hive Meta Store. You can choose any of these destination configurations based on your requirement.

Run Sync Commands:
To replicate data from the source database to the Iceberg table, you need to run the sync commands. The sync command will read the data from the source database and write it to the Iceberg table.
- Discover Command: <DISCOVER_COMMAND>
- Sync Command: <SYNC_COMMAND>
- Sync with State Command: <SYNC_WITH_STATE_COMMAND>

Refer to respective Database docs to use the command for discover schema and sync the data.

A sample disover & sync command would look like this:

docker run --pull=always  \
  -v /Users/USERNAME/Desktop/projects/OLAKE_DIRECTORY:/mnt/config \
  olakego/source-mongodb:latest \
  discover \
  --config /mnt/config/source.json

info

The olakego/source-mongodb is the OLake image for MongoDB source. You can replace it with the respective source image for PostgreSQL (source-postgres) or MySQL (source-mysql) or can build one locally.

OLake System Columns (Iceberg storage layer)

Column name	Iceberg data type	Description
`data`	`string` (materialised as a single JSON string in Parquet)	Snapshot of the source row at the time of the event, if used with normalization turned off. • Contains every source-table column as key → value pairs. • If the pipeline is running in “normalised” mode, individual keys may also appear as first-class columns in the Iceberg table.
`_olake_id`	`string`	Deterministic, content-addressable identifier for the record. Computed as a hash of the source table’s primary-key value(s).
`_olake_timestamp`	`timestamp` (stored as `INT64` epoch µs in Parquet)	Ingestion timestamp generated by OLake at write time (always UTC). Useful for auditing, incremental reads, and late-arrival handling.
`_op_type`	`string` (one-char code)	Change-event type emitted by the connector: `r` = historical back-fill/read `c` = insert/create `u` = update `d` = delete.
`_cdc_timestamp`	`timestamp` (stored as `INT64` epoch µs in Parquet)	Exact commit timestamp captured from the source database’s change-data-capture stream (e.g., WAL, binlog). Represents when the mutation happened on the upstream system, independent of when OLake processed it. Defaults to a epoch start time if CDC operation not performed.

note

If any columns from the source database does not have value in it, it won't get created in the iceberg tables.

Need Assistance?

If you have any questions or uncertainties about setting up OLake, contributing to the project, or troubleshooting any issues, we’re here to help. You can:

Your success with OLake is our priority. Don’t hesitate to contact us if you need any help or further clarification!

What is the Iceberg Writer?

Supported Catalogs

Quick Start Guide

Create a Iceberg Destination in OLake UI

Edit Iceberg Destination in OLake UI

Jobs Associated with Iceberg Destination

Delete Iceberg Destination in OLake UI

OLake System Columns (Iceberg storage layer)

Need Assistance?

Join our growing community

GitHub

Slack

Twitter

LinkedIn

YouTube

What is the Iceberg Writer?​

Supported Catalogs​

Quick Start Guide​

Create a Iceberg Destination in OLake UI​

Edit Iceberg Destination in OLake UI​

Jobs Associated with Iceberg Destination​

Delete Iceberg Destination in OLake UI​

OLake System Columns (Iceberg storage layer)​

Need Assistance?

Join our growing community

GitHub

Slack

Twitter

LinkedIn

YouTube

What is the Iceberg Writer?

Supported Catalogs

Quick Start Guide

Create a Iceberg Destination in OLake UI

Edit Iceberg Destination in OLake UI

Jobs Associated with Iceberg Destination

Delete Iceberg Destination in OLake UI

OLake System Columns (Iceberg storage layer)