
DuckDB v1.3+

A lightweight, read-only analytics engine for Iceberg with SQL time travel, external file caching, and REST catalog support

Key Features

Catalog Support (Partial Support)

Hadoop (file-system) and Iceberg REST catalogs are supported via the rest option with bearer/OAuth tokens; there is no native Hive or Glue catalog support yet.
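A minimal sketch of attaching a REST catalog, assuming token-based auth. The warehouse name, endpoint URL, token, and table name are placeholders, and the exact ATTACH option names can vary between extension versions, so check the DuckDB Iceberg extension docs for your build.

```sql
-- Load the Iceberg extension and attach a REST catalog.
-- ENDPOINT and TOKEN values below are placeholders.
INSTALL iceberg;
LOAD iceberg;

ATTACH 'my_warehouse' AS lake (
    TYPE iceberg,
    ENDPOINT 'https://catalog.example.com/iceberg',
    TOKEN 'bearer-token-here'
);

-- Tables in the catalog become queryable under the attached name.
SELECT * FROM lake.analytics.events LIMIT 10;
```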
Read-Only Analytics (Full Support)

Full SELECT support with predicate evaluation, manifest pruning, and an external file cache that avoids re-downloading S3/GCS objects.
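A quick sketch of a direct scan, assuming a hypothetical orders table on S3. The bucket path and column names are placeholders; the point is that the WHERE predicate is evaluated against manifest statistics so non-matching files can be pruned before they are downloaded.

```sql
-- Scan an Iceberg table straight from object storage; the filter on
-- order_date lets the planner skip files whose manifest statistics
-- prove no rows can match.
SELECT customer_id, sum(amount) AS total
FROM iceberg_scan('s3://my-bucket/warehouse/orders')
WHERE order_date >= DATE '2025-01-01'
GROUP BY customer_id;
```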
Advanced Time Travel (Full SQL Syntax)

Convenient SQL syntax: SELECT * FROM tbl AT (VERSION => 314159) or AT (TIMESTAMP => '2025-05-01 10:15:00'); the older function-style syntax still works.
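Both forms side by side, using the snapshot ID and timestamp from the example above; tbl is a placeholder table name, and the function-style parameter names are taken from the extension's iceberg_scan options and may differ by version.

```sql
-- Pin the query to a specific snapshot ID...
SELECT * FROM tbl AT (VERSION => 314159);

-- ...or to the table state as of a wall-clock timestamp.
SELECT * FROM tbl AT (TIMESTAMP => '2025-05-01 10:15:00');

-- Legacy function-style equivalent of the VERSION form.
SELECT * FROM iceberg_scan('s3://my-bucket/warehouse/tbl',
                           snapshot_from_id => 314159);
```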
External File Caching (Performance Boost)

An external file cache, enabled via SET s3_cache_size='4GB', roughly halves cold-scan latency for Iceberg on S3/GCS through intelligent object reuse.
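A sketch of the setting as described on this page; the s3_cache_size name is taken from the text above and should be verified against your DuckDB build's settings list. The bucket path is a placeholder.

```sql
-- Reserve up to 4 GB of local cache for remote objects.
SET s3_cache_size = '4GB';

-- The first scan is cold and populates the cache; re-running the same
-- query reuses the cached objects instead of re-fetching from S3.
SELECT count(*) FROM iceberg_scan('s3://my-bucket/warehouse/orders');
```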
Format Compatibility (Parquet Only)

Parquet remains the only supported data-file format; Avro and ORC data files are ignored, limiting compatibility with mixed-format tables.
Metadata Operations (Helper Functions)

iceberg_snapshots() returns the current snapshot first along with summary JSON; iceberg_metadata() exposes file-size and row-count statistics for planner optimization.
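A short sketch of both helpers against a placeholder table path; the column names shown (snapshot_id, timestamp_ms) follow the extension's documented output and may vary by version.

```sql
-- List snapshot history; per the text above, the current snapshot
-- comes back first.
SELECT snapshot_id, timestamp_ms
FROM iceberg_snapshots('s3://my-bucket/warehouse/orders');

-- Inspect the data files behind the current snapshot, including the
-- per-file statistics the planner uses.
SELECT *
FROM iceberg_metadata('s3://my-bucket/warehouse/orders');
```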
Security Integration (Basic Support)

Uses DuckDB's standard S3/Azure credentials via the httpfs extension; REST-catalog tokens are scoped per session; there is no built-in RBAC or row masking.
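Since credentials are delegated to httpfs, a standard DuckDB secret covers Iceberg metadata and data files alike. The key ID, secret, and region below are placeholders.

```sql
-- Credentials for any s3:// path, including Iceberg tables.
INSTALL httpfs;
LOAD httpfs;

CREATE SECRET my_s3 (
    TYPE s3,
    KEY_ID 'AKIA-placeholder',
    SECRET 'secret-placeholder',
    REGION 'us-east-1'
);
```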
Current Limitations (Known Issues)

Read-only engine with no write support; tables with delete files are not supported; Format V3 capabilities are absent; execution is constrained to a single node.

DuckDB Iceberg Feature Matrix

Comprehensive breakdown of Iceberg capabilities in DuckDB v1.3+

| Dimension | Support Level | Implementation Details | Min Version |
|---|---|---|---|
| Catalog Types | Partial (Hadoop + REST) | Hadoop (file-system), REST catalog with OAuth tokens; no native Hive/Glue support | 1.3+ |
| Read Operations | Full (Analytics Optimized) | Complete SELECT support, predicate pushdown, manifest pruning, external file cache | 1.3+ |
| Write Operations | None (Read-Only) | No INSERT/UPDATE/DELETE/CREATE TABLE AS ICEBERG support | N/A |
| Time Travel | Full (SQL Syntax) | New AT (VERSION/TIMESTAMP) syntax plus legacy function-style options | 1.3+ |
| Delete File Support | None (CoW Only) | Reading tables with deletes not yet supported; Copy-on-Write tables only | N/A |
| Format V3 Support | None (V1/V2 Only) | DuckDB 1.3 reads v1 and v2 tables only; V3 evaluation post-GA | N/A |
| Data File Formats | Limited (Parquet Only) | Parquet files only; Avro/ORC data files are ignored | 1.3+ |
| Streaming Support | None (Batch Only) | Batch-only analytics; no streaming ingestion or CDC capabilities | N/A |
| Metadata Operations | Full (Helper Functions) | iceberg_snapshots(), iceberg_metadata() with summary JSON and planner stats | 1.3+ |
| Cloud Storage Optimization | Full (File Cache) | External file cache via s3_cache_size reduces cold-scan latency by ~50% | 1.3+ |
| Security Integration | Basic (Credential Delegation) | S3/Azure credentials via httpfs, REST tokens; no built-in RBAC/row-masking | 1.3+ |
| Scale Limitations | Single-Node (Local Resources) | Single-node execution; large lake queries constrained by local resources | 1.3+ |


Use Cases

Interactive Data Exploration

Fast, ad-hoc analytics on Iceberg tables for data scientists and analysts

  • Laptop-based data science with cloud data lakes
  • Quick data quality assessment and profiling
  • Prototyping data transformations and analysis
  • Educational and learning environments

Development & Testing

Lightweight engine for developing and testing data pipelines

  • Local development against production Iceberg tables
  • Testing query logic before deploying to production
  • Debugging data pipeline outputs and transformations
  • Schema validation and compatibility testing

Analytical Reporting

Read-only reporting and dashboard data preparation

  • Business intelligence report generation
  • Data extraction for external systems and tools
  • Historical trend analysis with time travel
  • Cross-functional data sharing and exploration

Data Lake Auditing

Compliance and audit scenarios leveraging time travel capabilities

  • Point-in-time data auditing and compliance
  • Data lineage investigation and debugging
  • Historical data comparison and validation
  • Regulatory reporting with specific timestamps


πŸ’‘ Join the OLake Community!

Got questions, ideas, or just want to connect with other data engineers?
πŸ‘‰ Join our Slack Community to get real-time support, share feedback, and shape the future of OLake together. πŸš€

Your success with OLake is our priority. Don’t hesitate to contact us if you need any help or further clarification!