
AWS S3 + Glue Configuration

writer.json
{
  "type": "ICEBERG",
  "writer": {
    "catalog_type": "glue",
    "normalization": false,
    "iceberg_s3_path": "s3://<BUCKET_NAME>/<S3_PREFIX_VALUE>",
    "aws_region": "ap-south-1",
    "aws_access_key": "XXX",
    "aws_secret_key": "XXX",
    "iceberg_db": "ICEBERG_DATABASE_NAME",
    "grpc_port": 50051,
    "server_host": "localhost"
  }
}

AWS S3 + Glue Configuration Parameters

| Parameter | Sample Value | Description |
| --- | --- | --- |
| normalization | false | Flag to enable or disable data normalization. |
| iceberg_s3_path | s3://<BUCKET_NAME>/<S3_PREFIX_VALUE> | S3 path where the Iceberg data is stored in AWS. |
| aws_region | ap-south-1 | AWS region where the S3 bucket and Glue catalog are located. |
| aws_access_key | XXX | AWS access key with sufficient permissions for S3 and Glue. |
| aws_secret_key | XXX | AWS secret key corresponding to the access key. |
| iceberg_db | olake_iceberg | Name of the database to be created in AWS Glue. |
| grpc_port | 50051 | Port on which the gRPC server listens. |
| server_host | localhost | Host address of the gRPC server. |
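
The parameters above can also be assembled programmatically instead of editing writer.json by hand. The following is a minimal Python sketch, not part of OLake: the helper name `build_writer_config`, its arguments, and the choice of which keys to treat as required are all assumptions made for illustration. It only builds the same JSON structure shown in writer.json and rejects empty values.

```python
import json

# Keys this sketch treats as mandatory (an assumption, not an OLake rule).
REQUIRED_KEYS = (
    "iceberg_s3_path",
    "aws_region",
    "aws_access_key",
    "aws_secret_key",
    "iceberg_db",
)


def build_writer_config(bucket, prefix, region, access_key, secret_key,
                        database, normalization=False,
                        grpc_port=50051, server_host="localhost"):
    """Return a dict shaped like writer.json for the Glue catalog.

    Hypothetical helper: the parameter names mirror the table above,
    but the function itself is not part of OLake.
    """
    writer = {
        "catalog_type": "glue",
        "normalization": normalization,
        # Strip stray slashes so the path has the form s3://bucket/prefix.
        "iceberg_s3_path": f"s3://{bucket}/{prefix.strip('/')}",
        "aws_region": region,
        "aws_access_key": access_key,
        "aws_secret_key": secret_key,
        "iceberg_db": database,
        "grpc_port": grpc_port,
        "server_host": server_host,
    }
    missing = [key for key in REQUIRED_KEYS if not writer[key]]
    if not bucket or missing:
        raise ValueError(f"empty value for: {missing or ['bucket']}")
    return {"type": "ICEBERG", "writer": writer}


if __name__ == "__main__":
    config = build_writer_config(
        bucket="my-bucket",          # placeholder bucket name
        prefix="warehouse",          # placeholder S3 prefix
        region="ap-south-1",
        access_key="XXX",
        secret_key="XXX",
        database="olake_iceberg",
    )
    print(json.dumps(config, indent=2))
```

In practice the access key and secret key would come from a secrets store or environment variables rather than literals; the placeholders here simply match the sample values in the table.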

Required IAM Permissions

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "Statement1",
      "Effect": "Allow",
      "Action": [
        "glue:CreateTable",
        "glue:CreateDatabase",
        "glue:GetTable",
        "glue:GetDatabase",
        "glue:GetDatabases",
        "glue:SearchTables",
        "glue:UpdateDatabase",
        "glue:UpdateTable"
      ],
      "Resource": [
        "arn:aws:glue:ap-south-1:956834174860:catalog",
        "arn:aws:glue:ap-south-1:956834174860:database/{AWS_GLUE_DATABASE_NAME}",
        "arn:aws:glue:ap-south-1:956834174860:table/{AWS_GLUE_DATABASE_NAME}/*"
      ]
    }
  ]
}
Info: The IAM policy above is a sample that allows OLake to create and update databases and tables in the Glue Catalog. Modify the Resource section to match your own Glue Catalog ARNs (region, account ID, and database name). If the database and tables already exist in the Glue Catalog, OLake can read the existing metadata through the GetTable and GetDatabase permissions, so you can remove the CreateDatabase and CreateTable permissions. The SearchTables permission is optional; it is used to search for tables in the Glue Catalog.

For the S3 permissions needed to write data to S3, refer to the AWS S3 Permissions documentation.
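
The adjustments described in the note above (changing the Resource ARNs, dropping the create permissions for an existing database, and treating SearchTables as optional) can be sketched as a small Python generator. This is an illustrative sketch only: `build_glue_policy` and its `allow_create` / `allow_search` flags are hypothetical names, and the account ID in the usage example is a placeholder. The action names and ARN patterns are copied from the sample policy.

```python
import json

# Actions taken from the sample policy, grouped by purpose.
CREATE_ACTIONS = ["glue:CreateTable", "glue:CreateDatabase"]
READ_UPDATE_ACTIONS = [
    "glue:GetTable",
    "glue:GetDatabase",
    "glue:GetDatabases",
    "glue:UpdateDatabase",
    "glue:UpdateTable",
]
SEARCH_ACTIONS = ["glue:SearchTables"]  # described as optional in the note


def build_glue_policy(region, account_id, database,
                      allow_create=True, allow_search=True):
    """Return the sample Glue policy as a dict (hypothetical helper).

    allow_create=False drops CreateDatabase/CreateTable, for the case
    where the database and tables already exist.
    allow_search=False drops the optional SearchTables action.
    """
    actions = list(READ_UPDATE_ACTIONS)
    if allow_create:
        actions = CREATE_ACTIONS + actions
    if allow_search:
        actions = actions + SEARCH_ACTIONS

    arn_prefix = f"arn:aws:glue:{region}:{account_id}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "Statement1",
                "Effect": "Allow",
                "Action": actions,
                "Resource": [
                    f"{arn_prefix}:catalog",
                    f"{arn_prefix}:database/{database}",
                    f"{arn_prefix}:table/{database}/*",
                ],
            }
        ],
    }


if __name__ == "__main__":
    # Placeholder account ID and database name; substitute your own.
    policy = build_glue_policy(
        region="ap-south-1",
        account_id="123456789012",
        database="olake_iceberg",
        allow_create=False,
    )
    print(json.dumps(policy, indent=2))
```

Generating the policy from the same region and database name used in writer.json keeps the two files consistent, which is the main reason to script it rather than edit the ARNs by hand.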


Need Assistance?

If you have any questions or uncertainties about setting up OLake, contributing to the project, or troubleshooting any issues, we’re here to help. You can:

  • Email Support: Reach out to our team at hello@olake.io for prompt assistance.
  • Join our Slack Community: We discuss the roadmap and bugs there, and help people debug the issues they are facing.
  • Schedule a Call: If you prefer a one-on-one conversation, schedule a call with our CTO and team.

Your success with OLake is our priority. Don’t hesitate to contact us if you need any help or further clarification!