Iceberg Partitioning Support
Partitioning groups rows that share common values at write time, so queries filtering on those values read only the relevant files. The result is fewer data files scanned, less I/O, and faster queries.
OLake writes seamlessly into partitioned Apache Iceberg™ tables and supports all of Iceberg's built-in partition transforms.
How Does Iceberg Handle Partitions?
Apache Iceberg stores partition data in both its metadata and manifest files to enable efficient querying and file pruning. Rather than embedding partition information in file paths, Iceberg records the partition values for each file in manifests and maintains a snapshot-based metadata tree that tracks those manifests.
When you run a query like WHERE date > …, Iceberg consults this metadata and skips any data files whose recorded partition values cannot satisfy the filter.
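You can inspect this metadata yourself. As a minimal sketch, assuming the Spark–Iceberg integration and placeholder catalog/table names, the partitions metadata table shows the per-partition file counts that drive pruning:
-- Illustrative Spark SQL; my_catalog.analytics.my_table is a placeholder.
-- Each row summarizes the data files recorded for one partition value,
-- the same information Iceberg uses to skip files at query time.
SELECT partition, record_count, file_count
FROM my_catalog.analytics.my_table.partitions;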
Key Features
- Hidden partitioning
- Evolvable specs
- Rich partition transforms
- Manifest metadata
- Snapshot isolation
- Efficient pruning
Learn more about Iceberg Partitioning.
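As a minimal sketch of hidden partitioning (Spark SQL; catalog and table names are placeholders), the table is partitioned by a transform of event_ts, yet queries filter on event_ts directly:
CREATE TABLE my_catalog.analytics.events (
  id BIGINT,
  event_ts TIMESTAMP
) USING iceberg
PARTITIONED BY (days(event_ts));

-- No partition column appears in the query; Iceberg maps the predicate
-- onto partition values and prunes down to a single day's files.
SELECT * FROM my_catalog.analytics.events
WHERE event_ts >= TIMESTAMP '2025-04-24 00:00:00'
  AND event_ts <  TIMESTAMP '2025-04-25 00:00:00';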
How to Add a Partition in OLake?
When adding a partition, you need to configure two components:
| Component | Meaning |
| --- | --- |
| field_name | Column in the table (created_at, user_id, etc.) |
| transform | Iceberg transform that converts the value, e.g., identity |
Example 1: Time-based partitioning for event data
/{event_date, day}/{event_hour, hour}
- Partitions data by day and hour for efficient time-range queries
- Useful for analytics workloads with time-based filtering
Example 2: Geographic and temporal partitioning
/{region, identity}/{created_at, month}
- Partitions by geographic region and creation month
- Ideal for multi-region applications with time-based analysis
Check all the available transforms below.
Configuration
OLake UI
- Before adding partitioning, make sure you have configured your destination.
- Then select your table.
- Keep Normalization enabled.
- Select Partitioning in the right tab.
- Add your partition field along with the transform.
- Finally, proceed to Schedule a Job.
OLake CLI
Configure Partitioning in streams.json
- After running the Discover command, update the partition_regex field in your generated streams.json file to add your partition field and transform.
{
"selected_streams": {
"my_namespace": [
{
"stream_name": "my_stream",
"partition_regex": "/{created_at, year}",
"normalization": true
}
]
}
}
- Finally, run the Sync command to write data with the new partitioning.
For hierarchical partitioning, you can chain multiple /{field_name, transform} entries in the partition regex.
Example: /{created_at, year}/{user_id, bucket[32]}
You may use now() as a pseudo-column; it evaluates to the _olake_timestamp column, the writer's current timestamp (useful when your records lack a suitable time field).
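For example, a stream entry that partitions by the writer's timestamp at day granularity might look like this (a sketch, assuming now() occupies the field-name position like any other column):
{
  "stream_name": "my_stream",
  "partition_regex": "/{now(), day}",
  "normalization": true
}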
Supported Transformations
All transforms create partition folders with predictable naming patterns for efficient querying:
| Transform | Typical use-case | How It Works | Spec Snippet | Example Input → Partition Value |
| --- | --- | --- | --- | --- |
| identity | Columns with only a few distinct values, e.g. country, status | Writes the raw value unchanged | /{status, identity} | status = 'active' → folder status=active/ |
| year | Time-series data where queries slice whole years | Extracts the calendar year as an integer | /{created_at, year} | created_at = 2025-04-24 15:42 → year=2025/ |
| month | Monthly reporting, rollups, retention | Returns the month number (1-12) of the timestamp | /{created_at, month} | created_at = 2025-04-24 15:42 → month=4/ |
| day | Daily dashboards, 30-day TTL jobs | Returns the calendar day (1-31) | /{event_date, day} | event_date = 2025-04-24 → day=24/ |
| hour | High-volume logs where analysts drill into single hours | Returns the hour of day (0-23) | /{event_time, hour} | event_time = 2025-04-24 15:42 → hour=15/ |
| bucket[N] | Very high-cardinality keys (email, UUID) to distribute writes | Hashes the value and assigns it to one of N numbered buckets (0…N-1) | /{user_id, bucket[64]} | user_id = 'f47ac10b...' → user_id_bucket=17/ (one of 64) |
| truncate[N] | Long strings where only the prefix matters (domains, URLs) | Keeps the first N UTF-8 characters | /{domain, truncate[4]} | domain = 'example.com' → domain=exam/ |
| void | Dropping a partition field without removing it from the spec | Always produces null values | /{old_field, void} | old_field = 'any_value' → old_field=null/ |
The void transform in Apache Iceberg always produces null regardless of input, effectively dropping a partition field without removing it from the partition spec. Starting with Iceberg format v2, it is what Iceberg uses when a partition field is dropped.
Non-linear transforms like bucket and void do not preserve ordering: there is no predictable relationship between input values and their transformed outputs, so range filters on the source column cannot be pruned by these partitions.
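Recent Iceberg releases expose the transforms as SQL functions in Spark, which makes this easy to see; a sketch, assuming such a build and a placeholder catalog name:
-- Illustrative: bucket() is registered as a catalog system function in
-- recent Iceberg/Spark versions. Adjacent inputs land in unrelated
-- buckets, which is why range filters cannot prune bucket partitions.
SELECT my_catalog.system.bucket(64, 'user-0001') AS b1,
       my_catalog.system.bucket(64, 'user-0002') AS b2;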
Iceberg does not allow redundant partition fields, even with different transforms on the same column. In particular, avoid applying multiple time transforms to the same column in a single partition spec.
Redundant transform combinations:
- year(ts) is redundant with month(ts), day(ts), and hour(ts)
- month(ts) is redundant with day(ts) and hour(ts)
- day(ts) is redundant with hour(ts)
For example, this is incorrect: /{timestamp_col, day}/{timestamp_col, hour}. Instead, keep only the finest time granularity you need, e.g., /{timestamp_col, hour} on its own.
Putting it all together
Let's look at a real-world example: an e-commerce analytics table with three partition fields chosen for query performance.
Example: E-commerce Events Table
Partition Configuration:
/{event_date, day}/{region, identity}/{user_id, bucket[256]}
This creates a hierarchical partitioning strategy:
- Daily partitions for time-based queries
- Regional partitions for geographic analysis
- User bucketing to distribute high-cardinality data
Directory Structure
s3://warehouse/ecommerce_events/
├── metadata/
│   ├── v0.metadata.json
│   ├── v1.metadata.json
│   ├── snap-0001.avro
│   └── manifest-01.avro
└── data/
    ├── event_date_day=2025-01-15/
    │   ├── region=us-east/
    │   │   └── user_id_bucket=0/
    │   │       └── data-001.parquet
    │   └── region=eu-west/
    │       └── user_id_bucket=128/
    │           └── data-002.parquet
    └── event_date_day=2025-01-16/
        └── region=ap-south/
            └── user_id_bucket=64/
                └── data-003.parquet
SQL Query Examples
1. Daily Report (scans only 1 day's data):
SELECT event_type, COUNT(*)
FROM ecommerce_events
WHERE event_date = '2025-01-15'
GROUP BY event_type;
Iceberg scans only: event_date_day=2025-01-15/ directory
2. Regional Analysis (scans specific region across time):
SELECT DATE(event_date) AS day, SUM(revenue)
FROM ecommerce_events
WHERE region = 'us-east'
AND event_date BETWEEN '2025-01-15' AND '2025-01-17'
GROUP BY DATE(event_date);
Iceberg scans only: region=us-east/ subdirectories within the date range
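3. User Lookup (an illustrative sketch: an equality filter on the bucketed user_id column prunes to a single one of the 256 buckets; the UUID is hypothetical):
SELECT *
FROM ecommerce_events
WHERE event_date = '2025-01-15'
  AND user_id = 'f47ac10b-58cc-4372-a567-0e02b2c3d479';
Iceberg scans only: a single user_id_bucket=N/ directory inside event_date_day=2025-01-15/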
Performance Benefits
- Time queries: Skip irrelevant days entirely
- Regional queries: Skip other geographic regions
- User queries: Skip 255 out of 256 user buckets
- Combined filters: Maximum pruning when filtering on multiple partition fields
This partitioning strategy keeps queries fast even on very large datasets because each query reads only the relevant data files.
Real-World Examples
1. Analytics Data Warehouse
Partition Regex:
/{event_year, year}/{event_month, month}/{event_day, day}
Configuration:
{
"stream_name": "analytics_events",
"partition_regex": "/{event_year, year}/{event_month, month}/{event_day, day}"
}
Folder Structure:
event_year_year=2025/event_month_month=4/event_day_day=24/part-00001.parquet
SQL Query:
SELECT event_type, COUNT(*)
FROM analytics.events
WHERE event_year = 2025
AND event_month = 4
GROUP BY event_type;
2. Financial Transactions
Partition Regex:
/{transaction_date, day}/{account_type, identity}
Configuration:
{
"stream_name": "transactions",
"partition_regex": "/{transaction_date, day}/{account_type, identity}"
}
Folder Structure:
transaction_date_day=2025-04-24/account_type=premium/part-00001.parquet
SQL Query:
SELECT SUM(amount)
FROM transactions
WHERE transaction_date = '2025-04-24'
AND account_type = 'premium';
3. IoT Sensor Data
Partition Regex:
/{timestamp, hour}/{device_id, bucket[256]}/{sensor_type, identity}
Configuration:
{
"stream_name": "sensor_readings",
"partition_regex": "/{timestamp, hour}/{device_id, bucket[256]}/{sensor_type, identity}"
}
Folder Structure:
timestamp_hour=2025-04-24-14/device_id_bucket=128/sensor_type=temperature/part-00001.parquet
SQL Query:
SELECT AVG(value)
FROM sensor_readings
WHERE timestamp >= '2025-04-24 14:00'
AND sensor_type = 'temperature';
4. E-commerce Orders
Partition Regex:
/{order_date, month}/{region, identity}/{customer_tier, identity}
Configuration:
{
"stream_name": "orders",
"partition_regex": "/{order_date, month}/{region, identity}/{customer_tier, identity}"
}
Folder Structure:
order_date_month=2025-04/region=us-west/customer_tier=gold/part-00001.parquet
SQL Query:
SELECT COUNT(*), SUM(total_amount)
FROM orders
WHERE order_date >= '2025-04-01'
AND region = 'us-west'
AND customer_tier = 'gold';
Checklist before you decide
| Question | If Yes, Consider |
| --- | --- |
| Do your workloads frequently filter by date or time? | Use the day, month, or year transform for time-based partitioning |
| Do you have columns with many distinct values (user IDs, session IDs)? | Use bucket[N] to evenly distribute high-cardinality data into N buckets |
| Do you have columns with few distinct values (country, status)? | Direct partitioning on the column using the identity transform is sufficient |
| Do your queries often target numeric ranges (price, timestamps)? | Use truncate[W] to group values into fixed-size ranges |
| Do you frequently filter by region, department, or category? | Partition by identity on that column for geographic/categorical grouping |
| Might your schema or query patterns change over time? | Rely on hidden partitioning and spec evolution for flexibility |
| Are you concerned about metadata overhead from too many partitions? | Balance granularity: avoid partition folders smaller than a few MB to keep the file count manageable |
| Do you have data skew or potential hot partitions? | Combine high- and low-cardinality transforms (e.g., bucket within day) |
| Do you prioritize flat directory structures? | Limit nesting to two or three partition levels to keep the storage layout simple |
- Cardinality: aim for 100–10,000 files per partition folder.
- Skew: use bucket[N] when one value dominates (e.g., a single large tenant).
- Time-range queries: put the time transform first if nearly every query filters on a date.
- Evolution: start simple (e.g., {event_date, day}) and add more fields later; Iceberg keeps old snapshots readable (see the sketch below).
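As a sketch of that evolution path (Iceberg's documented Spark SQL DDL; catalog and table names are placeholders), a partition field can be added to a live table without rewriting existing data:
-- Existing snapshots keep their original spec and remain readable;
-- only new writes use the added bucket field.
ALTER TABLE my_catalog.analytics.events
  ADD PARTITION FIELD bucket(32, user_id);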