S3 Configuration
- OLake UI
- OLake CLI
OLake supports writing data to S3 or S3-compatible storage systems in the Parquet format, so you can rely on the scalability and durability of S3 for your data storage.
Key | Description | Data Type | Probable Values |
---|---|---|---|
S3 Bucket | The name of the Amazon S3 bucket where your output files will be stored. Ensure that the bucket exists and that you have proper access. | string | A valid S3 bucket name (e.g. "olake-s3-test" ) |
S3 Region | The AWS region where the specified S3 bucket is hosted. | string | AWS region codes such as "ap-south-1" , "us-west-2" , etc. |
AWS Access Key | The AWS access key used for authenticating S3 requests. This is typically a 20-character alphanumeric string. | string | A valid AWS access key |
AWS Secret Key | The AWS secret key used for S3 authentication. This key is generally longer (often 40+ characters) and should be kept secure. | string | A valid AWS secret key |
S3 Path | The specific path (or prefix) within the S3 bucket where data files will be written. This is typically a folder path that starts with a / (e.g. "/data" ). | string | A valid path string |
To enable S3 or S3-compatible writes, you must create a destination.json file with the configuration parameters listed below. The sample configuration provided here outlines the necessary keys and their expected values.
destination.json
{
  "type": "PARQUET",
  "writer": {
    "s3_bucket": "olake-s3-test",
    "s3_region": "ap-south-1",
    "s3_access_key": "xxxxxxxxxxxxxxxxxxxx",
    "s3_secret_key": "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx",
    "s3_path": "/data"
  }
}
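If you would rather not paste credentials into the file by hand, the same structure can be generated from environment variables. The following is a minimal sketch, not part of OLake itself: the `OLAKE_S3_*` variable names and both helper functions are assumptions made for illustration, and only the JSON keys come from the sample above.

```python
import json
import os


def build_destination_config(env=os.environ):
    """Assemble the destination config from environment variables.

    The OLAKE_S3_* names are hypothetical, chosen only for this sketch.
    """
    return {
        "type": "PARQUET",
        "writer": {
            "s3_bucket": env["OLAKE_S3_BUCKET"],
            "s3_region": env["OLAKE_S3_REGION"],
            "s3_access_key": env["OLAKE_S3_ACCESS_KEY"],
            "s3_secret_key": env["OLAKE_S3_SECRET_KEY"],
            # Fall back to the path used in the sample configuration.
            "s3_path": env.get("OLAKE_S3_PATH", "/data"),
        },
    }


def write_destination_config(path="destination.json", env=os.environ):
    """Write the config to `path` as indented JSON and return it."""
    config = build_destination_config(env)
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(config, fh, indent=2)
        fh.write("\n")
    return config
```

Calling `write_destination_config()` with the four required variables set produces a file with the same shape as the sample; a missing variable raises `KeyError` instead of silently writing an incomplete file.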
Configuration Key Details
Key | Description | Data Type | Probable Values |
---|---|---|---|
type | Specifies the output file format for writing data. Currently, only the "PARQUET" format is supported. | string | "PARQUET" |
s3_bucket | The name of the Amazon S3 bucket where your output files will be stored. Ensure that the bucket exists and that you have proper access. | string | A valid S3 bucket name (e.g. "olake-s3-test" ) |
s3_region | The AWS region where the specified S3 bucket is hosted. | string | AWS region codes such as "ap-south-1" , "us-west-2" , etc. |
s3_access_key | The AWS access key used for authenticating S3 requests. This is typically a 20-character alphanumeric string. | string | A valid AWS access key |
s3_secret_key | The AWS secret key used for S3 authentication. This key is generally longer (often 40+ characters) and should be kept secure. | string | A valid AWS secret key |
s3_path | The specific path (or prefix) within the S3 bucket where data files will be written. This is typically a folder path that starts with a / (e.g. "/data" ). | string | A valid path string |
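A quick sanity check of destination.json before a sync can catch typos in key names. The sketch below assumes that every writer key in the table is required and must be a non-empty string; the table implies this but does not state it outright, so treat the rules as illustrative. The function names are hypothetical and not part of OLake.

```python
import json

# Writer keys taken from the table above; assumed to be required.
REQUIRED_WRITER_KEYS = (
    "s3_bucket",
    "s3_region",
    "s3_access_key",
    "s3_secret_key",
    "s3_path",
)


def validate_destination(config):
    """Return a list of problems; an empty list means the checks passed."""
    problems = []
    if config.get("type") != "PARQUET":
        problems.append('type must be "PARQUET"')
    writer = config.get("writer")
    if not isinstance(writer, dict):
        problems.append("writer must be an object")
        return problems
    for key in REQUIRED_WRITER_KEYS:
        value = writer.get(key)
        if not isinstance(value, str) or not value:
            problems.append(f"{key} must be a non-empty string")
    return problems


def load_and_validate(path="destination.json"):
    """Read the JSON file at `path` and run the checks on it."""
    with open(path, encoding="utf-8") as fh:
        return validate_destination(json.load(fh))
```

The check deliberately does not enforce a leading `/` on `s3_path`, since the table describes that only as typical.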
info
- The generated .parquet files use SNAPPY compression. Note that SNAPPY is no longer supported by S3 Select when performing queries.
- OLake creates a test folder named olake_writer_test containing a single text file (.txt) with the content "S3 write test". This is used to verify that you have the necessary permissions to write to S3.
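To confirm write access yourself before running a sync, a similar check can be reproduced by hand. This is a rough sketch rather than OLake's actual implementation: the file name `write_test.txt` and the placement of the test folder under `s3_path` are assumptions, and `s3_client` is expected to be any object exposing a boto3-style `put_object(Bucket=..., Key=..., Body=...)` method.

```python
def check_write_permission(s3_client, bucket, s3_path, filename="write_test.txt"):
    """Upload a small text object and return the key that was written.

    `filename` is a hypothetical name; the folder name and the body text
    follow the note above. Any permission error raised by the client
    propagates to the caller.
    """
    # S3 object keys do not start with "/", so strip it from the prefix
    # and skip the prefix entirely when it is empty.
    parts = (s3_path.strip("/"), "olake_writer_test", filename)
    key = "/".join(part for part in parts if part)
    s3_client.put_object(Bucket=bucket, Key=key, Body=b"S3 write test")
    return key
```

With boto3 installed, a client created via `boto3.client("s3", region_name="ap-south-1")` can be passed as `s3_client`; a failure such as an access-denied error then points to missing write permissions on the bucket.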