Parameter | Sample Value | Description |
---|---|---|
Iceberg S3 Path (Warehouse) | s3://warehouse/ or gs://hive-dataproc-generated-bucket/hive-warehouse | Storage location for Iceberg data. The sample value s3://warehouse/ refers to an S3 bucket or directory. On GCP, use the Dataproc Hive Metastore bucket. With MinIO, use the s3a:// scheme. |
AWS Region | us-east-1 | Specifies the AWS region associated with the S3 bucket where the data is stored. |
AWS Access Key | admin | AWS access key with sufficient permissions for S3. Optional when an IAM role is attached to the running instance or pod. |
AWS Secret Key | password | AWS secret key with sufficient permissions for S3. Optional when an IAM role is attached to the running instance or pod. |
S3 Endpoint | http://S3_ENDPOINT | Endpoint URL of the S3 service. Set this when connecting to an S3-compatible storage service, such as MinIO running on localhost. |
Hive URI | thrift://<hostname>:9083 or thrift://METASTORE_IP:9083 | URI of the Hive Metastore service that the writer connects to for catalog interactions. On GCP, METASTORE_IP is provided by the Dataproc Hive Metastore. For a local Docker Compose setup, use thrift://localhost:9083. If Hive runs in a separate Docker container, use thrift://host.docker.internal:9083. |
Use SSL for S3 | false | Indicates whether SSL is enabled for S3 connections. "false" means that SSL is disabled for these communications. |
Use Path Style for S3 | true | Determines if path-style access is used for S3. "true" means that the writer will use path-style addressing instead of the default virtual-hosted style. |
Hive Clients | 5 | Specifies the number of Hive clients allocated for managing interactions with the Hive Metastore. |
Enable SASL for Hive | false | Indicates whether SASL authentication is enabled for the Hive connection. "false" means that SASL is disabled. |
Iceberg Database | iceberg_db | Specifies the name of the Iceberg database to be used by the destination configuration. |
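
The parameters above can be collected and sanity-checked before they are entered into the destination configuration. The following is a minimal sketch in Python, not the product's actual configuration schema: the class `HiveCatalogSettings`, its field names, and the `validate` helper are all hypothetical and only mirror the table rows. The MinIO endpoint `http://localhost:9000` in the usage lines is likewise an assumed local address.

```python
from dataclasses import dataclass
from typing import List, Optional
from urllib.parse import urlparse


@dataclass
class HiveCatalogSettings:
    # Hypothetical field names that mirror the table rows;
    # they are NOT the real configuration keys of any tool.
    warehouse_path: str
    hive_uri: str
    iceberg_database: str
    aws_region: str = "us-east-1"
    aws_access_key: Optional[str] = None   # may be omitted with an IAM role
    aws_secret_key: Optional[str] = None   # may be omitted with an IAM role
    s3_endpoint: Optional[str] = None      # only for S3-compatible services
    s3_use_ssl: bool = False
    s3_path_style: bool = True
    hive_clients: int = 5
    hive_sasl_enabled: bool = False


def validate(settings: HiveCatalogSettings) -> List[str]:
    """Return human-readable problems; an empty list means none were found."""
    problems = []

    # The table lists s3://, s3a:// (MinIO), and gs:// (GCP) warehouse paths.
    if urlparse(settings.warehouse_path).scheme not in ("s3", "s3a", "gs"):
        problems.append("warehouse_path should start with s3://, s3a://, or gs://")

    # The Hive Metastore is addressed as thrift://<hostname>:<port>.
    hive = urlparse(settings.hive_uri)
    try:
        port = hive.port
    except ValueError:  # raised by urllib when the port is not a valid number
        port = None
    if hive.scheme != "thrift" or port is None:
        problems.append("hive_uri should look like thrift://<hostname>:9083")

    # Keys come as a pair; both may be left out when an IAM role is used.
    if (settings.aws_access_key is None) != (settings.aws_secret_key is None):
        problems.append("set both AWS keys, or neither when relying on an IAM role")

    if settings.hive_clients < 1:
        problems.append("hive_clients must be at least 1")

    return problems


# Usage: a local MinIO plus Docker Compose setup, using the sample values above.
local = HiveCatalogSettings(
    warehouse_path="s3a://warehouse/",
    hive_uri="thrift://localhost:9083",
    iceberg_database="iceberg_db",
    aws_access_key="admin",
    aws_secret_key="password",
    s3_endpoint="http://localhost:9000",  # assumed local MinIO address
)
print(validate(local))
```

The checks only cover what the table states (accepted path schemes, the `thrift://` URI shape, and the optional key pair); they do not confirm that the bucket or the metastore is reachable.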