Hive Catalog

writer.json
{
  "type": "ICEBERG",
  "writer": {
    "catalog_type": "hive",
    "normalization": false,
    "iceberg_s3_path": "s3a://warehouse/",
    "aws_region": "us-east-1",
    "aws_access_key": "admin",
    "aws_secret_key": "password",
    "s3_endpoint": "http://localhost:9000",
    "hive_uri": "thrift://localhost:9083",
    "s3_use_ssl": false,
    "s3_path_style": true,
    "hive_clients": 5,
    "hive_sasl_enabled": false,
    "iceberg_db": "ICEBERG_DATABASE_NAME"
  }
}
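
Before handing a file like this to the writer, it can help to check that every key is present and has the expected JSON type. The following is a minimal sketch, not an official OLake schema: the type mapping is inferred from the sample above, every sample key is treated as required (which may be stricter than OLake itself), and the names `EXPECTED_WRITER_KEYS` and `check_writer_config` are hypothetical.

```python
import json

# Expected Python types per key, inferred from the sample writer.json above.
# Assumption: this mapping is illustrative, not an official OLake schema.
EXPECTED_WRITER_KEYS = {
    "catalog_type": str,
    "normalization": bool,
    "iceberg_s3_path": str,
    "aws_region": str,
    "aws_access_key": str,
    "aws_secret_key": str,
    "s3_endpoint": str,
    "hive_uri": str,
    "s3_use_ssl": bool,
    "s3_path_style": bool,
    "hive_clients": int,
    "hive_sasl_enabled": bool,
    "iceberg_db": str,
}

SAMPLE_WRITER_JSON = """
{
  "type": "ICEBERG",
  "writer": {
    "catalog_type": "hive",
    "normalization": false,
    "iceberg_s3_path": "s3a://warehouse/",
    "aws_region": "us-east-1",
    "aws_access_key": "admin",
    "aws_secret_key": "password",
    "s3_endpoint": "http://localhost:9000",
    "hive_uri": "thrift://localhost:9083",
    "s3_use_ssl": false,
    "s3_path_style": true,
    "hive_clients": 5,
    "hive_sasl_enabled": false,
    "iceberg_db": "ICEBERG_DATABASE_NAME"
  }
}
"""


def check_writer_config(config: dict) -> list[str]:
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    if config.get("type") != "ICEBERG":
        problems.append('top-level "type" should be "ICEBERG"')
    writer = config.get("writer")
    if not isinstance(writer, dict):
        problems.append('"writer" must be a JSON object')
        return problems
    for key, expected in EXPECTED_WRITER_KEYS.items():
        if key not in writer:
            problems.append(f'missing key "{key}"')
        # Compare exact types so that true/false is not accepted where a number is expected.
        elif type(writer[key]) is not expected:
            problems.append(f'"{key}" should be of type {expected.__name__}')
    return problems


if __name__ == "__main__":
    for problem in check_writer_config(json.loads(SAMPLE_WRITER_JSON)):
        print(problem)
```

To check a file on disk instead of the embedded sample, load it with `json.load` and pass the resulting dictionary to `check_writer_config`.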

Hive Configuration Parameters

| Parameter | Sample Value | Description |
| --- | --- | --- |
| catalog_type | hive | Indicates the catalog type used by the writer. "hive" means that the writer uses the Hive Metastore for catalog operations. |
| normalization | false | Specifies whether data normalization is applied. "false" means that normalization is disabled. |
| iceberg_s3_path | s3a://warehouse/ | Determines the S3 path or storage location for Iceberg data. The value "s3a://warehouse/" represents the designated S3 bucket or directory. |
| aws_region | us-east-1 | Specifies the AWS region associated with the S3 bucket where the data is stored. |
| aws_access_key | admin | Provides the AWS access key used for authentication when connecting to S3. |
| aws_secret_key | password | Provides the AWS secret key used for authentication when connecting to S3. |
| s3_endpoint | http://localhost:9000 | Specifies the endpoint URL for the S3 service. This may be used when connecting to an S3-compatible storage service like MinIO running on localhost. |
| hive_uri | thrift://localhost:9083 | Defines the URI of the Hive Metastore service that the writer will connect to for catalog interactions. |
| s3_use_ssl | false | Indicates whether SSL is enabled for S3 connections. "false" means that SSL is disabled for these communications. |
| s3_path_style | true | Determines if path-style access is used for S3. "true" means that the writer will use path-style addressing instead of the default virtual-hosted style. |
| hive_clients | 5 | Specifies the number of Hive clients allocated for managing interactions with the Hive Metastore. |
| hive_sasl_enabled | false | Indicates whether SASL authentication is enabled for the Hive connection. "false" means that SASL is disabled. |
| iceberg_db | olake_iceberg | Specifies the name of the Iceberg database to be used by the writer configuration. |
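
To make the s3_path_style setting concrete, the sketch below shows how the two S3 addressing styles place the bucket name in an object URL. It only illustrates the general S3 convention and says nothing about how OLake builds requests internally; the function name `object_url` and the object key in the example are hypothetical.

```python
from urllib.parse import urlsplit


def object_url(s3_endpoint: str, bucket: str, key: str, path_style: bool) -> str:
    """Build an object URL in path-style or virtual-hosted-style form."""
    parts = urlsplit(s3_endpoint)
    if path_style:
        # Path-style: the bucket is the first segment of the URL path.
        return f"{parts.scheme}://{parts.netloc}/{bucket}/{key}"
    # Virtual-hosted style: the bucket becomes a subdomain of the endpoint host.
    return f"{parts.scheme}://{bucket}.{parts.netloc}/{key}"


if __name__ == "__main__":
    # Endpoint and bucket come from the sample configuration; the key is made up.
    for style in (True, False):
        print(object_url("http://localhost:9000", "warehouse", "db/table/data.parquet", style))
```

With a local endpoint such as the MinIO example, the virtual-hosted form needs the hostname `warehouse.localhost` to resolve, which is a common reason to enable path-style access for S3-compatible services.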

You can query the data via:

SELECT * FROM CATALOG_NAME.ICEBERG_DATABASE_NAME.TABLE_NAME;
  • CATALOG_NAME can be: jdbc_catalog, hive_catalog, rest_catalog, etc.
  • ICEBERG_DATABASE_NAME is the name of the Iceberg database you set as the iceberg_db value in the writer.json file.
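
As an illustration of how the three parts of the table name fit together, here is a minimal sketch that reads iceberg_db from a parsed writer.json dictionary and assembles the query shown above. The helper name `build_select`, the identifier check, and the table name `orders` are assumptions made for this example; the catalog name depends on how your query engine is configured.

```python
import re

# Conservative identifier pattern (an assumption of this sketch, not an engine rule).
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def build_select(catalog_name: str, writer_config: dict, table_name: str) -> str:
    """Assemble SELECT * FROM catalog.database.table using iceberg_db from the config."""
    database = writer_config["writer"]["iceberg_db"]
    for part in (catalog_name, database, table_name):
        # Reject anything that is not a plain identifier to avoid building broken SQL.
        if not _IDENTIFIER.match(part):
            raise ValueError(f"not a simple identifier: {part!r}")
    return f"SELECT * FROM {catalog_name}.{database}.{table_name};"


if __name__ == "__main__":
    config = {"writer": {"iceberg_db": "olake_iceberg"}}
    print(build_select("hive_catalog", config, "orders"))
```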

For the S3 permissions needed to write data to S3, refer to the AWS S3 Permissions documentation.


Need Assistance?

If you have any questions or uncertainties about setting up OLake, contributing to the project, or troubleshooting any issues, we’re here to help. You can:

  • Email Support: Reach out to our team at hello@olake.io for prompt assistance.
  • Join our Slack Community: We discuss the roadmap and bugs there, and help people debug the issues they are facing.
  • Schedule a Call: If you prefer a one-on-one conversation, schedule a call with our CTO and team.

Your success with OLake is our priority. Don’t hesitate to contact us if you need any help or further clarification!