Hive Catalog Write Guide
OLake integrates with Hive Catalog to provide full support for Apache Iceberg tables.
With this setup:
- Data is stored in object storage (S3, GCS, MinIO, or any S3-compatible system).
- Metadata is managed by Hive Metastore.
- OLake writes seamlessly into Iceberg tables, combining the Hive Metastore for metadata with object storage for data.
Prerequisites
Before configuring OLake with Hive Catalog, ensure the following:
1. Hive Metastore
A Hive Metastore service serves as the Iceberg metadata catalog. This can be:
- Managed service: GCP Dataproc Metastore, AWS EMR, or Azure HDInsight
- Self-hosted: Apache Hive Metastore running on your infrastructure
- Local development: Docker-based Hive Metastore
Required Metastore Configuration:
- Thrift protocol enabled (default port 9083)
- A database backend (PostgreSQL, MySQL, or another JDBC-compatible database) for metadata storage, with a user that can create, insert, update, delete, and select in that database.
2. Object Storage
A bucket for storing Iceberg data files (Parquet data plus Iceberg metadata). A short provisioning sketch for both prerequisites follows.
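If you are provisioning these prerequisites yourself, the sketch below shows one way to do it. It assumes a self-managed PostgreSQL backend and either AWS S3 or MinIO; every host, name, and credential is a placeholder. Managed metastores (e.g., Dataproc Metastore) create their backing database for you.

# Sketch: provision the metastore backend database and the Iceberg bucket (placeholder values)

# 1. PostgreSQL backend for the Hive Metastore.
#    Making the metastore user the owner of its database covers the
#    create/insert/update/delete/select permissions listed above.
psql -h <db-host> -U postgres -c "CREATE USER hive_metastore WITH PASSWORD '<password>';"
psql -h <db-host> -U postgres -c "CREATE DATABASE metastore OWNER hive_metastore;"

# 2. Object-storage bucket for Iceberg data and metadata files.
aws s3 mb s3://<bucket-name> --region us-east-1
# ...or, for MinIO:
mc config host add minio http://<minio-endpoint>:9000 <access-key> <secret-key>
mc mb minio/<bucket-name>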
Configuration
The destination can be configured through the OLake UI or the OLake CLI.
Before setting up the destination, make sure you have successfully set up the source.
OLake UI
Parameter | Sample Value | Description |
---|---|---|
Iceberg S3 Path (Warehouse) | s3://warehouse/ or gs://hive-dataproc-generated-bucket/hive-warehouse | S3 path or storage location for Iceberg data files. The value s3://warehouse/ points to the designated S3 bucket or prefix. On GCP, use the bucket generated by the Dataproc Hive Metastore. Use the s3a:// scheme when writing to MinIO. |
AWS Region | us-east-1 | Specifies the AWS region associated with the S3 bucket where the data is stored. |
AWS Access Key | admin | AWS access key with sufficient permissions for S3. Optional if using IAM role attached to running instance/pod. |
AWS Secret Key | password | AWS secret key with sufficient permissions for S3. Optional if using IAM role attached to running instance/pod. |
S3 Endpoint | http://S3_ENDPOINT | Specifies the endpoint URL for the S3 service. This may be used when connecting to an S3-compatible storage service like MinIO running on localhost. |
Hive URI | thrift://<hostname>:9083 or thrift://METASTORE_IP:9083 | URI of the Hive Metastore service that the writer connects to for catalog operations. On GCP, METASTORE_IP is provided by the Dataproc Metastore; for the local Docker Compose setup use thrift://localhost:9083, or thrift://host.docker.internal:9083 if Hive runs in a separate Docker container. |
Use SSL for S3 | false | Indicates whether SSL is enabled for S3 connections. "false" means that SSL is disabled for these communications. |
Use Path Style for S3 | true | Determines if path-style access is used for S3. "true" means that the writer will use path-style addressing instead of the default virtual-hosted style. |
Hive Clients | 5 | Specifies the number of Hive clients allocated for managing interactions with the Hive Metastore. |
Enable SASL for Hive | false | Indicates whether SASL authentication is enabled for the Hive connection. "false" means that SASL is disabled. |
Iceberg Database | iceberg_db | Specifies the name of the Iceberg database to be used by the destination configuration. |
After you have successfully set up the destination, configure your streams.
OLake CLI
{
"type": "ICEBERG",
"writer": {
"catalog_type": "hive",
"iceberg_s3_path": "s3://warehouse/",
"aws_region": "us-east-1",
"aws_access_key": "XXX",
"aws_secret_key": "XXX",
"s3_endpoint": "http://S3_ENDPOINT",
"hive_uri": "thrift://<hostname>:9083",
"s3_use_ssl": false,
"s3_path_style": true,
"hive_clients": 5,
"hive_sasl_enabled": false,
"iceberg_db": "iceberg_db"
}
}
Parameter | Sample Value | Description |
---|---|---|
iceberg_s3_path | s3://warehouse/ or gs://hive-dataproc-generated-bucket/hive-warehouse | S3 path or storage location for Iceberg data files. The value s3://warehouse/ points to the designated S3 bucket or prefix. On GCP, use the bucket generated by the Dataproc Hive Metastore. Use the s3a:// scheme when writing to MinIO. |
aws_region | us-east-1 | Specifies the AWS region associated with the S3 bucket where the data is stored. |
aws_access_key | XXX | AWS access key with sufficient permissions for S3. Optional if using IAM role attached to running instance/pod. |
aws_secret_key | XXX | AWS secret key with sufficient permissions for S3. Optional if using IAM role attached to running instance/pod. |
s3_endpoint | http://S3_ENDPOINT | Specifies the endpoint URL for the S3 service. This may be used when connecting to an S3-compatible storage service like MinIO running on localhost. |
hive_uri | thrift://<hostname>:9083 or thrift://METASTORE_IP:9083 | URI of the Hive Metastore service that the writer connects to for catalog operations. On GCP, METASTORE_IP is provided by the Dataproc Metastore; for the local Docker Compose setup use thrift://localhost:9083, or thrift://host.docker.internal:9083 if Hive runs in a separate Docker container. |
s3_use_ssl | false | Indicates whether SSL is enabled for S3 connections. "false" means that SSL is disabled for these communications. |
s3_path_style | true | Determines if path-style access is used for S3. "true" means that the writer will use path-style addressing instead of the default virtual-hosted style. |
hive_clients | 5 | Specifies the number of Hive clients allocated for managing interactions with the Hive Metastore. |
hive_sasl_enabled | false | Indicates whether SASL authentication is enabled for the Hive connection. "false" means that SASL is disabled. |
iceberg_db | iceberg_db | Specifies the name of the Iceberg database to be used by the destination configuration. |
After you have successfully set up the destination, run the Discover command. A quick connectivity sanity check is sketched below.
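Before running Discover, it can help to confirm that the two endpoints in the destination config are reachable from wherever OLake runs. A minimal sketch, using the sample values above (hostnames and ports are placeholders):

# Hive Metastore Thrift port
nc -zv <hostname> 9083
# S3 / MinIO endpoint (any HTTP status code printed means the endpoint answered)
curl -s -o /dev/null -w "%{http_code}\n" http://S3_ENDPOINT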
The sections below walk through two example setups: a GCP Dataproc Metastore and a local Docker Compose environment.
GCP Dataproc Metastore
OLake supports using Google Cloud Dataproc Metastore (Hive) as the Iceberg catalog and Google Cloud Storage (GCS) as the data lake destination. This allows you to leverage GCP-native services for scalable, managed metadata and storage.
Step-by-Step Setup
- Create a GCP Project (if you don't have one).
- Provision a Dataproc Metastore (Hive):
- Go to the GCP Console → Dataproc → Metastore services.
- Click "Create Metastore Service".
- Fill in service name, location, version, release channel, port (default: 9083), and service tier.
- Set the endpoint protocol to Thrift.
- Attach the service to the VPC/subnet from which OLake will connect.
- Enable the Data Catalog sync option if desired.
- Choose database type and other options as needed.
- Click Submit. Creation may take 20–30 minutes. (An equivalent gcloud command is sketched after this list.)
- Expose the Metastore endpoint to the network where OLake will run (ensure network connectivity and firewall rules allow access to the Thrift port).
- Create or choose a GCS bucket for Iceberg data.
- Deploy OLake in the same network (or with access to the Metastore endpoint).
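If you prefer the CLI to the console, the same metastore can be provisioned with gcloud. This is only a sketch: the service name, region, network, and tier are placeholders, and flag availability may vary with your gcloud version.

# Create a Thrift-endpoint Dataproc Metastore (creation may take 20-30 minutes)
gcloud metastore services create olake-hive-metastore \
  --location=us-central1 \
  --network=<vpc-network> \
  --tier=DEVELOPER \
  --port=9083 \
  --endpoint-protocol=THRIFT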
GCP-Hive OLake Destination Config
OLake UI
Parameter | Sample Value | Description |
---|---|---|
Iceberg S3 Path (Warehouse) | gs://<hive-dataproc-generated-bucket>/hive-warehouse | Use the Dataproc Metastore generated Hive warehouse bucket. |
AWS Region | us-central1 | Your GCP region. |
Hive URI | thrift://<METASTORE_IP>:9083 | Dataproc Metastore Thrift endpoint. |
Hive Clients | 10 | Number of concurrent Hive clients. |
Enable SASL for Hive | false | Leave disabled unless your metastore requires SASL. |
Iceberg Database | olake_iceberg | Target Iceberg database name. |
OLake CLI
{
"type": "ICEBERG",
"writer": {
"catalog_type": "hive",
"hive_uri": "thrift://<METASTORE_IP>:9083",
"hive_clients": 10,
"hive_sasl_enabled": false,
"iceberg_db": "olake_iceberg",
"iceberg_s3_path": "gs://<hive-dataproc-generated-bucket>/hive-warehouse",
"aws_region": "us-central1"
}
}
- Replace <METASTORE_IP> with your Dataproc Metastore's internal IP or hostname (see the lookup sketch below).
- Replace <hive-dataproc-generated-bucket> with the bucket generated by the Dataproc Metastore's hive.metastore.warehouse.dir setting.
- Set aws_region to your GCP region (e.g., us-central1).
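One way to look these values up is to describe the metastore service; this is a sketch, and the exact field names in the output may differ between gcloud versions.

# Show the Thrift endpoint and the GCS artifacts/warehouse location of the metastore
gcloud metastore services describe olake-hive-metastore \
  --location=us-central1 \
  --format="yaml(endpointUri,artifactGcsUri,hiveMetastoreConfig)"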
Notes
- The hive_uri must use the Thrift protocol and point to your Dataproc Metastore endpoint.
- The iceberg_s3_path can use the gs:// prefix for GCS buckets.
- Ensure OLake has network access to the Metastore and permission to write to the GCS bucket.
- Data is written in Iceberg format and is queryable by compatible engines (e.g., Spark, Trino) configured with the same Hive Metastore and GCS bucket, as illustrated below.
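As an illustration, a Spark SQL session configured against the same metastore and bucket can read the tables OLake writes. The catalog name, Iceberg runtime version, and table name below are placeholders, and reading gs:// paths additionally requires the GCS connector on the Spark classpath.

# Query an OLake-written Iceberg table from Spark SQL (illustrative values)
spark-sql \
  --packages org.apache.iceberg:iceberg-spark-runtime-3.5_2.12:1.5.2 \
  --conf spark.sql.catalog.olake_hive=org.apache.iceberg.spark.SparkCatalog \
  --conf spark.sql.catalog.olake_hive.type=hive \
  --conf spark.sql.catalog.olake_hive.uri=thrift://<METASTORE_IP>:9083 \
  --conf spark.sql.catalog.olake_hive.warehouse=gs://<hive-dataproc-generated-bucket>/hive-warehouse \
  -e "SELECT * FROM olake_hive.olake_iceberg.<table_name> LIMIT 10"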
For local development and testing, you can set up a complete Hive Metastore environment using Docker Compose. This setup includes Hive Metastore, PostgreSQL, and MinIO (S3-compatible storage).
Docker Compose Setup
Create the following files in your project directory:
docker-compose.yml
services:
minio:
image: minio/minio:RELEASE.2025-04-03T14-56-28Z
container_name: minio-hive
environment:
- MINIO_ROOT_USER=admin
- MINIO_ROOT_PASSWORD=password
- MINIO_DOMAIN=minio
networks:
iceberg_net:
aliases:
- warehouse.minio
ports:
- 9001:9001
- 9000:9000
volumes:
- ./data/minio-data:/data
command: [ "server", "/data", "--console-address", ":9001" ]
mc:
depends_on:
- minio
image: minio/mc:RELEASE.2025-04-03T17-07-56Z
container_name: mc-hive
networks:
iceberg_net:
environment:
- AWS_ACCESS_KEY_ID=admin
- AWS_SECRET_ACCESS_KEY=password
- AWS_REGION=us-east-1
entrypoint: |
/bin/sh -c "
until (/usr/bin/mc config host add minio http://minio:9000 admin password) do echo '...waiting...' && sleep 1; done;
if ! /usr/bin/mc ls minio/warehouse > /dev/null 2>&1; then
/usr/bin/mc mb minio/warehouse;
/usr/bin/mc policy set public minio/warehouse;
fi;
tail -f /dev/null
"
postgres:
image: postgres:15
container_name: iceberg-postgres-hive
networks:
iceberg_net:
environment:
- POSTGRES_USER=iceberg
- POSTGRES_PASSWORD=password
- POSTGRES_DB=iceberg
healthcheck:
test: [ "CMD", "pg_isready", "-U", "iceberg", "-d", "iceberg" ]
interval: 2s
timeout: 10s
retries: 3
start_period: 10s
ports:
- 5432:5432
volumes:
- ./data/postgres-data:/var/lib/postgresql/data
hive-metastore:
image: apache/hive:4.0.0
container_name: hive-metastore-only
networks:
iceberg_net:
ports:
- "9083:9083"
environment:
- SERVICE_NAME=metastore
depends_on:
- postgres
user: root
volumes:
- ./data/ivy-cache:/root/.ivy2
- ./hive-site.conf:/opt/hive/conf/hive-site.xml
entrypoint: |
/bin/sh -c "
set -e
echo 'Downloading required JARs for Hive Metastore...'
HIVE_LIB_DIR=/opt/hive/lib
mkdir -p $${HIVE_LIB_DIR}
if ! command -v curl >/dev/null 2>&1 && ! command -v wget >/dev/null 2>&1; then \
echo 'curl/wget not found. Attempting to install...'; \
if command -v apt-get >/dev/null 2>&1; then \
apt-get update && apt-get install -y curl ca-certificates && rm -rf /var/lib/apt/lists/*; \
elif command -v microdnf >/dev/null 2>&1; then \
microdnf install -y curl ca-certificates || true; \
elif command -v dnf >/dev/null 2>&1; then \
dnf install -y curl ca-certificates || true; \
elif command -v yum >/dev/null 2>&1; then \
yum install -y curl ca-certificates || true; \
elif command -v apk >/dev/null 2>&1; then \
apk add --no-cache curl ca-certificates; \
else \
echo 'No supported package manager found to install curl/wget.'; exit 1; \
fi; \
fi
download() { url=$1; dest=$2; name=$$(basename \"$$dest\"); \
for i in 1 2 3 4 5; do \
echo \"Downloading $${name} (attempt $${i})...\"; \
if command -v curl >/dev/null 2>&1; then \
curl -fSL \"$$url\" -o \"$$dest\" && break; \
elif command -v wget >/dev/null 2>&1; then \
wget -O \"$$dest\" \"$$url\" && break; \
else \
echo 'Neither curl nor wget is available in the container.'; exit 1; \
fi; \
echo 'Download failed, retrying in 3s...'; sleep 3; \
if [ $$i -eq 5 ]; then echo 'Failed to download after retries'; exit 1; fi; \
done; }
download https://repo1.maven.org/maven2/org/apache/hadoop/hadoop-aws/3.3.4/hadoop-aws-3.3.4.jar $${HIVE_LIB_DIR}/hadoop-aws-3.3.4.jar
download https://repo1.maven.org/maven2/com/amazonaws/aws-java-sdk-bundle/1.12.262/aws-java-sdk-bundle-1.12.262.jar $${HIVE_LIB_DIR}/aws-java-sdk-bundle-1.12.262.jar
download https://repo1.maven.org/maven2/org/postgresql/postgresql/42.5.4/postgresql-42.5.4.jar $${HIVE_LIB_DIR}/postgresql-42.5.4.jar
if [ ! -f $${HIVE_LIB_DIR}/hadoop-aws-3.3.4.jar ] || [ ! -f $${HIVE_LIB_DIR}/aws-java-sdk-bundle-1.12.262.jar ] || [ ! -f $${HIVE_LIB_DIR}/postgresql-42.5.4.jar ]; then
echo 'JAR files not found in Hive lib directory. Exiting.'; exit 1; fi
chown -R hive:hive $${HIVE_LIB_DIR}
echo 'JAR files downloaded to Hive lib directory. Starting hive-metastore...'
sh -c \"/entrypoint.sh\"
"
networks:
iceberg_net:
volumes:
postgres-data:
minio-data:
hive-site.conf
<configuration>
<property>
<name>hive.server2.enable.doAs</name>
<value>false</value>
</property>
<property>
<name>hive.tez.exec.inplace.progress</name>
<value>false</value>
</property>
<property>
<name>hive.exec.scratchdir</name>
<value>/opt/hive/scratch_dir</value>
</property>
<property>
<name>hive.user.install.directory</name>
<value>/opt/hive/install_dir</value>
</property>
<property>
<name>tez.runtime.optimize.local.fetch</name>
<value>true</value>
</property>
<property>
<name>hive.exec.submit.local.task.via.child</name>
<value>false</value>
</property>
<property>
<name>mapreduce.framework.name</name>
<value>local</value>
</property>
<property>
<name>tez.local.mode</name>
<value>true</value>
</property>
<property>
<name>hive.execution.engine</name>
<value>tez</value>
</property>
<property>
<name>metastore.metastore.event.db.notification.api.auth</name>
<value>false</value>
</property>
<property>
<name>hive.metastore.warehouse.dir</name>
<value>s3a://warehouse/</value>
</property>
<property>
<name>fs.s3a.endpoint</name>
<value>http://minio:9000</value>
</property>
<property>
<name>fs.s3a.access.key</name>
<value>admin</value>
</property>
<property>
<name>fs.s3a.secret.key</name>
<value>password</value>
</property>
<property>
<name>fs.s3a.path.style.access</name>
<value>true</value>
</property>
<property>
<name>fs.s3a.impl</name>
<value>org.apache.hadoop.fs.s3a.S3AFileSystem</value>
</property>
<property>
<name>fs.s3a.connection.ssl.enabled</name>
<value>false</value>
</property>
<property>
<name>hive.metastore.authorization.storage.checks</name>
<value>false</value>
<description>Disables storage-based authorization checks to allow Hive better compatibility with MinIO.
</description>
</property>
<property>
<name>hive.metastore.pre.event.listeners</name>
<value>org.apache.hadoop.hive.ql.security.authorization.AuthorizationPreEventListener</value>
</property>
<property>
<name>hive.security.metastore.authorization.manager</name>
<value>org.apache.hadoop.hive.ql.security.authorization.StorageBasedAuthorizationProvider</value>
</property>
</configuration>
Dockerfile
No Dockerfile or Spark image is required for this setup.
Starting the Environment
1. Start the services:
docker-compose up -d
2. Wait for the Hive Metastore to initialize (the first run downloads dependencies). Tail the Hive Metastore logs until these messages appear:
docker logs -f hive-metastore-only
Starting metastore schema initialization to 4.0.0
Initialization script hive-schema-4.0.0.derby.sql
3. Verify that the services are running (a few extra checks are sketched after this list):
docker-compose ps
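Once the stack is up, a couple of quick checks (using the container names and ports from the compose file above) help confirm that MinIO and the metastore are ready:

# The warehouse bucket should exist in MinIO (empty output is fine on a fresh setup)
docker exec mc-hive mc ls minio/warehouse
# The Hive Metastore Thrift port should accept connections from the host
nc -zv localhost 9083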
Destination Configuration (Local Hive + MinIO)
Assuming OLake is running locally in another Docker Compose stack, configure the destination as follows.
OLake UI
Parameter | Sample Value | Description |
---|---|---|
Iceberg S3 Path (Warehouse) | s3a://warehouse/ | MinIO bucket/path for Iceberg data. |
AWS Region | us-east-1 | Any region string (required by SDK). |
AWS Access Key | admin | MinIO access key. |
AWS Secret Key | password | MinIO secret key. |
S3 Endpoint | http://host.docker.internal:9000 | MinIO endpoint exposed on host. |
Use SSL for S3 | false | Disable SSL for HTTP endpoint. |
Use Path Style for S3 | true | Required for MinIO. |
Hive URI | thrift://host.docker.internal:9083 | Hive Metastore exposed on host. |
Hive Clients | 5 | Number of concurrent Hive clients. |
Enable SASL for Hive | false | Keep disabled unless required. |
Iceberg Database | iceberg_db | Target Iceberg database name. |
OLake CLI
{
"type": "ICEBERG",
"writer": {
"catalog_type": "hive",
"iceberg_s3_path": "s3a://warehouse/",
"aws_region": "us-east-1",
"aws_access_key": "admin",
"aws_secret_key": "password",
"s3_endpoint": "http://host.docker.internal:9000",
"hive_uri": "thrift://host.docker.internal:9083",
"s3_use_ssl": false,
"s3_path_style": true,
"hive_clients": 5,
"hive_sasl_enabled": false,
"iceberg_db": "iceberg_db"
}
}
Troubleshooting
The OLake Hive Catalog connector stops immediately upon encountering errors to ensure data accuracy. Below are common issues and their fixes:
- Hive Metastore JAR Dependencies Missing
  - Cause: Required JAR files are not available on the Hive Metastore classpath for S3 and PostgreSQL connectivity.
  - Fix: Verify that the following essential JARs are present in /opt/hive/lib/:
    # Check for required JARs
    ls -la /opt/hive/lib/ | grep -E "(hadoop-aws|postgresql|aws-java-sdk)"
  - Required JARs for S3 integration: hadoop-aws-3.3.4.jar (S3A filesystem support) and aws-java-sdk-bundle-1.12.262.jar (AWS SDK for S3 operations).
  - Required JAR for the PostgreSQL backend: postgresql-42.5.4.jar (JDBC driver for metadata storage).
  - If any are missing, they can be re-downloaded manually as sketched after this item.
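If any JARs are missing (for example, because the download step in the compose entrypoint failed), they can be fetched again by hand. The sketch below reuses the Maven Central URLs and versions from the compose file and assumes curl is available inside the container:

# Re-download the required JARs into the running metastore container, then restart it
docker exec hive-metastore-only curl -fSL \
  https://repo1.maven.org/maven2/org/apache/hadoop/hadoop-aws/3.3.4/hadoop-aws-3.3.4.jar \
  -o /opt/hive/lib/hadoop-aws-3.3.4.jar
docker exec hive-metastore-only curl -fSL \
  https://repo1.maven.org/maven2/com/amazonaws/aws-java-sdk-bundle/1.12.262/aws-java-sdk-bundle-1.12.262.jar \
  -o /opt/hive/lib/aws-java-sdk-bundle-1.12.262.jar
docker exec hive-metastore-only curl -fSL \
  https://repo1.maven.org/maven2/org/postgresql/postgresql/42.5.4/postgresql-42.5.4.jar \
  -o /opt/hive/lib/postgresql-42.5.4.jar
docker restart hive-metastore-only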
- Connection Refused to Hive Metastore
  - Cause: The Hive Metastore service is not accessible, or there are network connectivity issues.
  - Fix:
    - Verify the Hive Metastore is running and reachable (see also the sketch after this item):
      telnet <hive-metastore-host> 9083
    - Check the hive_uri format: thrift://<hostname>:9083
    - Use host.docker.internal instead of localhost when OLake runs in Docker.
    - Ensure firewall rules allow access to port 9083.
    - For GCP Dataproc Metastore, verify VPC connectivity and service status.
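If telnet is not available, a bash-only TCP probe works anywhere bash does; run it from the same place OLake runs (host or container), swapping in the hostname from your hive_uri:

# Bash-only reachability check for the metastore Thrift port (no extra tools required)
timeout 3 bash -c 'echo > /dev/tcp/host.docker.internal/9083' \
  && echo "Hive Metastore reachable" \
  || echo "Hive Metastore NOT reachable"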
- Database Backend Connection Failed
  - Cause: The PostgreSQL/MySQL backend database is not accessible or is misconfigured.
  - Fix:
    - Verify the database backend is running:
      psql -h <db-host> -p 5432 -U <username> -d <database>
    - Ensure the database user has the required permissions:
      GRANT CREATE, INSERT, UPDATE, DELETE, SELECT ON DATABASE <database_name> TO <username>;
    - Check the Hive Metastore configuration for the correct JDBC URL.
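Note that PostgreSQL does not accept table-level privileges in a database-level GRANT, so on PostgreSQL the simplest option is to make the metastore user the owner of its database. For self-hosted metastores, the Hive schematool (assuming HIVE_HOME points at your Hive installation) can then confirm that the schema exists and matches the expected version:

# PostgreSQL: give the metastore user full control of its database
psql -h <db-host> -p 5432 -U postgres -c "ALTER DATABASE <database_name> OWNER TO <username>;"
# Verify the metastore schema against the configured backend
$HIVE_HOME/bin/schematool -dbType postgres -info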
- S3/Object Storage Access Denied
  - Cause: Invalid AWS credentials or insufficient S3 bucket permissions.
  - Fix:
    - Verify AWS credentials and permissions:
      aws s3 ls s3://<bucket-name>
    - For MinIO, ensure the correct endpoint and credentials:
      mc config host add minio http://<endpoint> <access-key> <secret-key>
      mc ls minio/<bucket-name>
    - Check the s3_endpoint, aws_access_key, and aws_secret_key configuration.
    - Ensure the bucket exists and is in the correct region.
- Path Style Access Error with MinIO
  - Cause: S3 addressing misconfiguration with MinIO or another non-AWS S3 service.
  - Fix:
    - Set s3_path_style: true for MinIO and other non-AWS S3 services.
    - Use the correct endpoint format: http://minio:9000 (no bucket name in the URL).
    - Ensure s3_use_ssl: false for HTTP endpoints.
    - A quick reachability and path-style check is sketched below.
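A quick way to confirm both reachability and path-style routing is to probe the endpoint over HTTP; MinIO also exposes a liveness endpoint. The host and bucket below match the local compose setup and are placeholders otherwise:

# MinIO liveness check (expect HTTP 200)
curl -s -o /dev/null -w "%{http_code}\n" http://host.docker.internal:9000/minio/health/live
# Path-style probe: any S3 XML response (even AccessDenied) shows the bucket path is being routed
curl -s "http://host.docker.internal:9000/warehouse/?list-type=2" | head -c 300; echo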