
DuckDB v1.3+

A lightweight, read-only analytics engine for Iceberg with SQL time travel, external file caching, and REST catalog support

Key Features

65 · Partial Support

Catalog Support

Hadoop (file-system) and Iceberg REST catalogs are supported via the rest option with bearer/OAuth tokens; there is no native Hive or Glue catalog support yet
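As a rough sketch, connecting to a REST catalog could look like the following; the secret parameters, warehouse name, and endpoint are placeholders, and the exact ATTACH options vary by catalog provider:

```sql
INSTALL iceberg;
LOAD iceberg;

-- Bearer/OAuth token for the REST catalog (the TOKEN value is a placeholder).
CREATE SECRET rest_auth (
    TYPE iceberg,
    TOKEN 'my-bearer-token'
);

-- Attach the catalog; 'my_warehouse' and the endpoint are assumptions.
ATTACH 'my_warehouse' AS lake (
    TYPE iceberg,
    ENDPOINT 'https://catalog.example.com'
);

SHOW ALL TABLES;
```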

95 · Full Support

Read-only Analytics Excellence

Full SELECT support with predicate pushdown, manifest pruning, and an external file cache that avoids re-downloading S3/GCS objects
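For example, a Hadoop-style (file-system) table can be scanned directly by path; the bucket and column names below are hypothetical:

```sql
LOAD iceberg;
LOAD httpfs;

-- Filters and projections are pushed into the scan, so manifest pruning
-- can skip Parquet files whose min/max stats cannot match the predicate.
SELECT event_type, count(*) AS events
FROM iceberg_scan('s3://my-bucket/warehouse/events')
WHERE event_date >= DATE '2025-05-01'
GROUP BY event_type
ORDER BY events DESC;
```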

100 · SQL Syntax

Advanced Time Travel

Convenient SQL syntax: SELECT * FROM tbl AT (VERSION => 314159) or AT (TIMESTAMP => '2025-05-01 10:15:00'); the older function-style syntax still works
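Sketched against a hypothetical attached table, the two forms look like this; the legacy parameter name reflects recent extension builds and should be treated as an assumption:

```sql
-- Pin a query to a specific snapshot id:
SELECT * FROM lake.analytics.events AT (VERSION => 314159);

-- Or to the snapshot current as of a timestamp:
SELECT * FROM lake.analytics.events AT (TIMESTAMP => '2025-05-01 10:15:00');

-- Older function-style equivalent for path-based tables:
SELECT *
FROM iceberg_scan(
    's3://my-bucket/warehouse/events',
    snapshot_from_timestamp => TIMESTAMP '2025-05-01 10:15:00'
);
```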

90 · Performance Boost

External File Caching

External file cache via SET s3_cache_size='4GB' roughly halves cold-scan latency for Iceberg on S3/GCS by reusing already-downloaded objects
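A minimal session sketch, assuming the setting name quoted above (some builds expose the cache through an enable_external_file_cache toggle instead):

```sql
-- Reserve 4 GB of local cache for remote objects.
SET s3_cache_size = '4GB';

-- The first scan downloads Parquet and metadata objects from S3;
-- the repeat scan is served from the local cache.
SELECT count(*) FROM iceberg_scan('s3://my-bucket/warehouse/events');
SELECT count(*) FROM iceberg_scan('s3://my-bucket/warehouse/events');
```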

60 · Parquet Only

Format Compatibility

Parquet remains the only supported data-file format; Avro/ORC data files are ignored, limiting compatibility with mixed-format tables

85 · Helper Functions

Metadata Operations

iceberg_snapshots() lists snapshots (current first) with a summary JSON column; iceberg_metadata() exposes per-file path, format, and row-count stats the planner can use
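For illustration (the table path is a placeholder, and the column names reflect recent extension versions):

```sql
-- Snapshot history, current snapshot first:
SELECT snapshot_id, timestamp_ms, manifest_list
FROM iceberg_snapshots('s3://my-bucket/warehouse/events');

-- Per-file stats the planner can exploit:
SELECT file_path, file_format, record_count
FROM iceberg_metadata('s3://my-bucket/warehouse/events');
```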

65 · Basic Support

Security Integration

Uses DuckDB's standard S3/Azure credentials via the httpfs extension; REST-catalog tokens are per-session; no built-in RBAC or row masking
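Credential delegation is plain DuckDB secrets; the key values below are obviously placeholders:

```sql
LOAD httpfs;

-- Explicit S3 credentials:
CREATE SECRET s3_creds (
    TYPE s3,
    KEY_ID 'my-key-id',
    SECRET 'my-secret-key',
    REGION 'us-east-1'
);

-- Or defer to the standard AWS credential chain (env vars, config files, IAM):
CREATE SECRET s3_env (
    TYPE s3,
    PROVIDER credential_chain
);
```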

40 · Known Issues

Current Limitations

Read-only engine with no write support; tables containing delete files are unsupported; no Format V3 capabilities; single-node execution limits scale


DuckDB Iceberg Feature Matrix

Comprehensive breakdown of Iceberg capabilities in DuckDB v1.3+

| Dimension | Support Level | Implementation Details | Min Version |
|---|---|---|---|
| Catalog Types | Partial (Hadoop + REST) | Hadoop (file-system), REST catalog with OAuth tokens; no native Hive/Glue support | 1.3+ |
| Read Operations | Full (Analytics Optimized) | Complete SELECT support, predicate pushdown, manifest pruning, external file cache | 1.3+ |
| Write Operations | None (Read-Only) | No INSERT/UPDATE/DELETE/CREATE TABLE AS ICEBERG support | N/A |
| Time Travel | Full (SQL Syntax) | New AT (VERSION/TIMESTAMP) syntax plus legacy function-style options | 1.3+ |
| Delete File Support | None (CoW Only) | Reading tables with deletes not yet supported; copy-on-write tables only | N/A |
| Format V3 Support | None (V1/V2 Only) | DuckDB 1.3 reads v1 & v2 tables only; V3 evaluation post-GA | N/A |
| Data File Formats | Limited (Parquet Only) | Parquet files only; Avro/ORC data files are ignored | 1.3+ |
| Streaming Support | None (Batch Only) | Batch-only analytics; no streaming ingestion or CDC capabilities | N/A |
| Metadata Operations | Full (Helper Functions) | iceberg_snapshots(), iceberg_metadata() with summary JSON and planner stats | 1.3+ |
| Cloud Storage Optimization | Full (File Cache) | External file cache via s3_cache_size reduces cold-scan latency by ~50% | 1.3+ |
| Security Integration | Basic (Credential Delegation) | S3/Azure credentials via httpfs, REST tokens; no built-in RBAC/row-masking | 1.3+ |
| Scale Limitations | Single-Node (Local Resources) | Single-node execution; large lake queries constrained by local resources | 1.3+ |


Use Cases

Interactive Data Exploration

Fast, ad-hoc analytics on Iceberg tables for data scientists and analysts

  • Laptop-based data science with cloud data lakes
  • Quick data quality assessment and profiling
  • Prototyping data transformations and analysis
  • Educational and learning environments

Development & Testing

Lightweight engine for developing and testing data pipelines

  • Local development against production Iceberg tables
  • Testing query logic before deploying to production
  • Debugging data pipeline outputs and transformations
  • Schema validation and compatibility testing

Analytical Reporting

Read-only reporting and dashboard data preparation

  • Business intelligence report generation
  • Data extraction for external systems and tools
  • Historical trend analysis with time travel
  • Cross-functional data sharing and exploration

Data Lake Auditing

Compliance and audit scenarios leveraging time travel capabilities

  • Point-in-time data auditing and compliance
  • Data lineage investigation and debugging
  • Historical data comparison and validation
  • Regulatory reporting with specific timestamps

Need Assistance?

If you have any questions or uncertainties about setting up OLake, contributing to the project, or troubleshooting any issues, we’re here to help. You can:

  • Email Support: Reach out to our team at hello@olake.io for prompt assistance.
  • Join our Slack Community: discuss future roadmaps, report bugs, get help debugging issues you are facing, and more.
  • Schedule a Call: If you prefer a one-on-one conversation, schedule a call with our CTO and team.

Your success with OLake is our priority. Don’t hesitate to contact us if you need any help or further clarification!