destination.json
Before proceeding with any of the configurations below, please ensure you have completed the getting started instructions for the database you want to connect from. For more background details, refer to the README section.
- Iceberg
- S3
- Local
Apache Iceberg
The Iceberg Writer syncs data from databases (MySQL, MongoDB, PostgreSQL) into Apache Iceberg.
Four catalog types are supported. The sample configurations below follow this order:
- AWS Glue
- JDBC Catalog
- REST Catalog
- Hive Catalog
{
  "type": "ICEBERG",
  "writer": {
    "catalog_type": "glue",
    "iceberg_s3_path": "s3://<BUCKET_NAME>/<S3_PREFIX_VALUE>",
    "aws_region": "ap-south-1",
    "aws_access_key": "XXX",
    "aws_secret_key": "XXX",
    "iceberg_db": "ICEBERG_DATABASE_NAME",
    "grpc_port": 50051,
    "server_host": "localhost"
  }
}
{
  "type": "ICEBERG",
  "writer": {
    "catalog_type": "jdbc",
    "jdbc_url": "jdbc:postgresql://host.docker.internal:5432/iceberg",
    "jdbc_username": "iceberg",
    "jdbc_password": "password",
    "iceberg_s3_path": "s3a://warehouse",
    "s3_endpoint": "http://host.docker.internal:9000",
    "s3_use_ssl": false,
    "s3_path_style": true,
    "aws_access_key": "admin",
    "aws_region": "ap-south-1",
    "aws_secret_key": "password",
    "iceberg_db": "ICEBERG_DATABASE_NAME"
  }
}
{
  "type": "ICEBERG",
  "writer": {
    "catalog_type": "rest",
    "rest_catalog_url": "http://localhost:8181/catalog",
    "iceberg_s3_path": "warehouse",
    "iceberg_db": "ICEBERG_DATABASE_NAME"
  }
}
{
  "type": "ICEBERG",
  "writer": {
    "catalog_type": "hive",
    "iceberg_s3_path": "s3a://warehouse/",
    "aws_region": "us-east-1",
    "aws_access_key": "admin",
    "aws_secret_key": "password",
    "s3_endpoint": "http://localhost:9000",
    "hive_uri": "thrift://localhost:9083",
    "s3_use_ssl": false,
    "s3_path_style": true,
    "hive_clients": 5,
    "hive_sasl_enabled": false,
    "iceberg_db": "ICEBERG_DATABASE_NAME"
  }
}
For sample configurations and other details regarding Apache Iceberg, refer to the Iceberg writer docs.
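Because the four samples above differ mainly in their catalog-specific keys, a quick pre-flight check can catch a missing key before a sync starts. The following is a minimal sketch, not part of OLake: the helper name `missing_iceberg_keys` is hypothetical, and the set of keys treated as required for each `catalog_type` is an assumption drawn from the samples rather than a documented rule.

```python
import json

# Keys this sketch ASSUMES are needed per catalog type. They are taken from the
# sample configurations, not from an official OLake schema.
ASSUMED_KEYS = {
    "glue": {"aws_region", "iceberg_s3_path", "iceberg_db"},
    "jdbc": {"jdbc_url", "iceberg_s3_path", "iceberg_db"},
    "rest": {"rest_catalog_url", "iceberg_s3_path", "iceberg_db"},
    "hive": {"hive_uri", "iceberg_s3_path", "iceberg_db"},
}


def missing_iceberg_keys(config: dict) -> list[str]:
    """Return the assumed keys that are absent from an Iceberg destination config."""
    if config.get("type") != "ICEBERG":
        raise ValueError('expected "type": "ICEBERG"')
    writer = config.get("writer", {})
    catalog = writer.get("catalog_type")
    if catalog not in ASSUMED_KEYS:
        raise ValueError(f"unknown catalog_type: {catalog!r}")
    return sorted(ASSUMED_KEYS[catalog] - set(writer))


# The REST catalog sample from this page, parsed from JSON text.
rest_config = json.loads("""
{
  "type": "ICEBERG",
  "writer": {
    "catalog_type": "rest",
    "rest_catalog_url": "http://localhost:8181/catalog",
    "iceberg_s3_path": "warehouse",
    "iceberg_db": "ICEBERG_DATABASE_NAME"
  }
}
""")

print(missing_iceberg_keys(rest_config))  # prints []
```

An empty list means every assumed key is present; it does not guarantee that the catalog or the storage path is reachable.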
S3 Writer
OLake’s Parquet S3 writer allows you to write your data directly into an Amazon S3 bucket in Parquet format. This mode is ideal for users who want to leverage S3’s scalable storage for their data outputs.
{
  "type": "PARQUET",
  "writer": {
    "s3_bucket": "olake-s3-test",
    "s3_region": "ap-south-1",
    "s3_access_key": "xxxxxxxxxxxxxxxxxxxx",
    "s3_secret_key": "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx",
    "s3_path": "/data"
  }
}
Key | Description | Data Type | Possible Values |
---|---|---|---|
type | Specifies the output file format for writing data. Currently, only the "PARQUET" format is supported. | string | "PARQUET" |
s3_bucket | The name of the Amazon S3 bucket where your output files will be stored. Ensure that the bucket exists and that you have proper access. | string | A valid S3 bucket name (e.g. "olake-s3-test") |
s3_region | The AWS region where the specified S3 bucket is hosted. | string | AWS region codes such as "ap-south-1", "us-west-2", etc. |
s3_access_key | The AWS access key used for authenticating S3 requests. This is typically a 20-character alphanumeric string. | string | A valid AWS access key |
s3_secret_key | The AWS secret key used for S3 authentication. This key is typically 40 characters long and should be kept secure. | string | A valid AWS secret key |
s3_path | The specific path (or prefix) within the S3 bucket where data files will be written. This is typically a folder path that starts with a / (e.g. "/data"). | string | A valid path string |
For sample configurations and other details regarding S3, refer to the S3 writer docs.
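The key descriptions in the table above can be turned into a small sanity check to run before a sync. This is only a sketch under stated assumptions: `check_s3_writer` is a hypothetical helper, not an OLake function, and treating all five writer keys as mandatory non-empty strings is an assumption based on the sample configuration.

```python
def check_s3_writer(config: dict) -> list[str]:
    """Return a list of human-readable problems found in an S3 writer config."""
    problems = []
    if config.get("type") != "PARQUET":
        problems.append('type must be "PARQUET"')

    writer = config.get("writer", {})
    # Assumption: every key shown in the sample is expected to be a non-empty string.
    for key in ("s3_bucket", "s3_region", "s3_access_key", "s3_secret_key", "s3_path"):
        value = writer.get(key)
        if not isinstance(value, str) or not value:
            problems.append(f"{key} must be a non-empty string")

    # The table describes s3_path as a folder path that typically starts with "/".
    path = writer.get("s3_path")
    if isinstance(path, str) and path and not path.startswith("/"):
        problems.append('s3_path usually starts with "/"')
    return problems


sample = {
    "type": "PARQUET",
    "writer": {
        "s3_bucket": "olake-s3-test",
        "s3_region": "ap-south-1",
        "s3_access_key": "xxxxxxxxxxxxxxxxxxxx",
        "s3_secret_key": "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx",
        "s3_path": "/data",
    },
}

print(check_s3_writer(sample))  # prints []
```

The check only inspects the shape of the configuration; it does not contact AWS or verify that the bucket exists.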
Local Parquet Writer
The local writer mode writes Parquet files directly to a local directory inside your Docker container. The local directory is mapped to your host file system via a Docker volume. To run OLake via Docker, follow the getting started guide.
- Using Dockerized OLake
- Build OLake
Sample Configuration (Using Dockerized OLake)
{
  "type": "PARQUET",
  "writer": {
    "local_path": "/mnt/config"
  }
}
Configuration Key Details
Key | Data Type | Example Value | Description & Possible Values |
---|---|---|---|
type | string | "PARQUET" | Specifies the output file format. Currently, only the Parquet format is supported. |
writer.local_path | string | "/mnt/config" | The local directory inside the Docker container where Parquet files will be stored. This path is mapped to your host file system via a Docker volume. |
Note: This configuration enables the Parquet local writer. For more details, check out the README section.
Sample Configuration (Build OLake)
{
  "type": "PARQUET",
  "writer": {
    "local_path": "./mnt/config"
  }
}
Configuration Key Details
Key | Data Type | Example Value | Description & Possible Values |
---|---|---|---|
type | string | "PARQUET" | Specifies the output file format. Currently, only the Parquet format is supported. |
writer.local_path | string | "./mnt/config" | The local directory where Parquet files will be stored, given here as a relative path. |
Note: This configuration enables the Parquet local writer. For more details, check out the README section.
For sample configurations and other details regarding the local writer, refer to the S3 writer docs.
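If you generate configuration files from a script, the local writer configuration is small enough to build programmatically. The sketch below is an illustration only: `write_local_destination` is a hypothetical helper, and the temporary directory stands in for whichever config directory you actually use.

```python
import json
import tempfile
from pathlib import Path


def write_local_destination(config_dir: Path, local_path: str) -> Path:
    """Write a destination.json for the local Parquet writer and return its path."""
    config = {"type": "PARQUET", "writer": {"local_path": local_path}}
    target = config_dir / "destination.json"
    target.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return target


# A temporary directory is used here so the example leaves no files behind.
with tempfile.TemporaryDirectory() as tmp:
    written = write_local_destination(Path(tmp), "./mnt/config")
    loaded = json.loads(written.read_text(encoding="utf-8"))
    print(loaded["writer"]["local_path"])  # prints ./mnt/config
```

Pass "/mnt/config" instead of "./mnt/config" to reproduce the Dockerized sample.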