Last updated: 6/30/2025

Apache Doris v2.1+

MPP analytical database with comprehensive Iceberg read/write capabilities, vectorized execution, materialized view acceleration, and multi-catalog support for lake ingestion and analytics

Key Features

6+ Catalog Types

Multi-Catalog Excellence

CREATE CATALOG supports the hms (Hive Metastore), glue, rest, hadoop, dlf, and s3tables catalog types, each configured with a metastore URI or REST endpoint plus credentials.
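
As a minimal sketch of the catalog registration described above, the statement below creates a Hive Metastore backed Iceberg catalog. The catalog name iceberg_hms and the metastore host are hypothetical placeholders; the property keys follow the Doris Iceberg catalog documentation, but the exact set required (storage credentials in particular) depends on the deployment.

```sql
-- Hypothetical catalog name and metastore address; replace with your own.
CREATE CATALOG iceberg_hms PROPERTIES (
    'type' = 'iceberg',
    'iceberg.catalog.type' = 'hms',
    'hive.metastore.uris' = 'thrift://metastore-host:9083'
);

-- Make the new catalog current and list what it exposes.
SWITCH iceberg_hms;
SHOW DATABASES;
```

Other catalog types change the value of 'iceberg.catalog.type' and take their own endpoint and credential properties.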

Full Write Capabilities

MPP Lake Ingestion Engine

Full SELECT plus write-back through INSERT, INSERT OVERWRITE, and CTAS. Doris writes Parquet/ORC files and commits Iceberg snapshots, so it serves as both a lake-ingestion engine and an analytics layer.
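
The three write paths named above might look like the following sketch. The catalog, database, table, and column names (iceberg_hms, sales_db, orders, internal.staging.orders_raw, and so on) are hypothetical and only illustrate the statement shapes.

```sql
-- Append: adds new data files and commits a new Iceberg snapshot.
INSERT INTO iceberg_hms.sales_db.orders
SELECT order_id, customer_id, amount, order_date
FROM internal.staging.orders_raw;

-- Overwrite: replaces the existing contents with the query result.
INSERT OVERWRITE TABLE iceberg_hms.sales_db.orders
SELECT order_id, customer_id, amount, order_date
FROM internal.staging.orders_clean;

-- CTAS: creates a new Iceberg table from a query result.
CREATE TABLE iceberg_hms.sales_db.orders_2025 AS
SELECT order_id, customer_id, amount, order_date
FROM iceberg_hms.sales_db.orders
WHERE order_date >= '2025-01-01';
```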

UPDATE/DELETE ✓ MERGE ✗

Evolving DML Support

INSERT INTO (append), INSERT OVERWRITE, and UPDATE and DELETE via Iceberg v2 delete files (v2.1+). MERGE is not yet available as a single statement, but it can be emulated with multi-statement patterns.
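
The source does not say which emulation patterns it has in mind. One common approach, sketched here under that caveat, is a delete-then-insert upsert. All table and column names are hypothetical, and whether a subquery predicate is accepted in DELETE on an Iceberg table should be verified against the Doris version in use.

```sql
-- Step 1: remove target rows that have a newer version in staging.
-- (Assumption: DELETE with an IN-subquery is accepted for this table.)
DELETE FROM iceberg_hms.sales_db.orders
WHERE order_id IN (
    SELECT order_id FROM iceberg_hms.sales_db.orders_staging
);

-- Step 2: append the new versions of those rows plus any brand-new rows.
INSERT INTO iceberg_hms.sales_db.orders
SELECT order_id, customer_id, amount, order_date
FROM iceberg_hms.sales_db.orders_staging;
```

Unlike a true MERGE, the two statements commit as separate snapshots, so readers can briefly observe the table between steps.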

MoR + CoW

Complete Storage Strategy

Reads: position and equality delete files are applied automatically (MoR). Writes: UPDATE/DELETE generate position/equality delete files, while INSERT OVERWRITE rewrites data files (CoW).

External Tools Required

No Native Streaming

No native streaming/CDC writer; use external tools (Flink-Iceberg) to land data, then query with sub-second latency. Routine Load only targets internal tables

v1/v2 + Parquet/ORC

Stable Format Support

Reads & writes Parquet (v1/v2) + ORC (v1/v2). Supports Iceberg spec v1 & v2; equality-delete support for ORC arrives in v2.1.3+. No v3 support yet

SQL + System Tables

Comprehensive Time Travel

Query historical data with FOR TIME AS OF / FOR VERSION AS OF or the iceberg_meta() function. The system tables ($snapshots, $manifests, $history) are exposed.
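
A short sketch of a time-travel workflow: list the snapshots first, then pin a query to one of them. The table name and snapshot ID are hypothetical, and the timestamp clause is written as FOR TIME AS OF, the spelling used in the Doris Iceberg catalog documentation.

```sql
-- List the table's snapshots to find a snapshot ID or commit time.
SELECT *
FROM iceberg_meta(
    'table' = 'iceberg_hms.sales_db.orders',
    'query_type' = 'snapshots'
);

-- Read the table as of a specific snapshot ID (hypothetical value).
SELECT COUNT(*)
FROM iceberg_hms.sales_db.orders FOR VERSION AS OF 868895038966572;

-- Read the table as it was at a point in time.
SELECT COUNT(*)
FROM iceberg_hms.sales_db.orders FOR TIME AS OF '2025-06-01 00:00:00';
```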

Doris RBAC + Catalog

Layered Security Model

Doris RBAC plus underlying catalog/storage IAM. Ranger/Lake Formation policies apply at metastore/storage; Doris adds row-policies & column masking on query

Vectorized + MVs

Advanced Performance Features

Vectorized reader, manifest and data-file caching, partition-predicate push-down, and materialized-view acceleration on Iceberg sources via CREATE MATERIALIZED VIEW … REFRESH.
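
To illustrate the materialized-view acceleration mentioned above, here is a sketch of an asynchronous materialized view built over an Iceberg table. The database, view, table, and column names are hypothetical; the bucket count, replication setting, and refresh schedule are illustrative values rather than recommendations.

```sql
-- Run in Doris's internal catalog: the view's data is stored in Doris,
-- while its defining query reads from the Iceberg catalog.
CREATE MATERIALIZED VIEW analytics.mv_daily_revenue
BUILD IMMEDIATE
REFRESH COMPLETE ON SCHEDULE EVERY 1 HOUR
DISTRIBUTED BY RANDOM BUCKETS 2
PROPERTIES ('replication_num' = '1')
AS
SELECT order_date, SUM(amount) AS revenue
FROM iceberg_hms.sales_db.orders
GROUP BY order_date;

-- Trigger a full refresh on demand, outside the schedule.
REFRESH MATERIALIZED VIEW analytics.mv_daily_revenue COMPLETE;
```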

Several Constraints

Known Limitations

No MERGE statement; no continuous streaming writes; Avro data files unsupported; concurrent multi-engine writes may need manual conflict retries

Apache Doris Iceberg Feature Matrix

Comprehensive breakdown of Iceberg capabilities in Apache Doris v2.1+

Dimension	Support Level	Implementation Details	Since Version
Catalog Types	Full (6+ Types)	HMS, Glue, REST, Hadoop, DLF, S3Tables with unified CREATE CATALOG syntax	2.1+
SQL Analytics	Full (MPP + Lake Ingestion)	Full SELECT + write-back (INSERT, INSERT OVERWRITE, CTAS); Parquet/ORC writing	2.1+
DML Operations	Partial (UPDATE/DELETE ✓ MERGE ✗)	INSERT, INSERT OVERWRITE, UPDATE/DELETE via delete files; MERGE emulated with patterns	2.1+
Storage Strategy	Full (MoR + CoW)	Reads position/equality deletes automatically; generates delete files; INSERT OVERWRITE (CoW)	2.1.3+
Streaming Support	None (External Tools)	No native streaming/CDC; use Flink-Iceberg + query with sub-second latency	N/A
Format Support	Partial (v1/v2 + Parquet/ORC)	Parquet + ORC (v1/v2); Iceberg spec v1/v2; no Avro or v3 support	2.1.3+
Time Travel	Full (SQL + System Tables)	FOR TIME AS OF / FOR VERSION AS OF; iceberg_meta() function; system tables ($snapshots, etc.)	2.1+
Schema Evolution	Full (Metadata-only)	ADD/DROP/RENAME columns, type evolution; automatic schema discovery	2.1+
Security & Governance	Partial (Layered Model)	Doris RBAC + catalog IAM; row/column policies; Ranger/LF limited enforcement	2.1+
Performance Features	Full (Vectorized + MVs)	Vectorized reader, caching, predicate pushdown, materialized views with auto-refresh	2.1+
Known Limitations	Several (Clear Constraints)	No MERGE; no streaming; no Avro; concurrent write conflicts need manual handling	2.1+
Iceberg Library	Current (v1.6.1)	Bundled Iceberg client 1.6.1; follows upstream roadmap for v3 support	2.1+

Use Cases

MPP Lake Analytics

High-performance analytics on Iceberg data lakes

Complex analytical queries with vectorized execution
Multi-catalog data lake analytics and federation
High-performance OLAP workloads on lake data
Real-time analytics on externally ingested streaming data

Lake Ingestion and ETL

Data ingestion and transformation into Iceberg tables

ETL processes writing directly to Iceberg tables
Data warehouse modernization with lake storage
Batch data processing with INSERT OVERWRITE patterns
Data transformation pipelines with CTAS operations

Unified Analytics Platform

Single engine for both ingestion and analytics

Organizations wanting unified lake architecture
Teams requiring both transformation and querying capabilities
Materialized view acceleration for frequently accessed aggregates
Cross-catalog analytics with comprehensive catalog support

Hybrid Streaming-Batch Architecture

Batch analytics layer with external streaming ingestion

Lambda architectures with Flink streaming + Doris analytics
Real-time ingestion via external tools, sub-second query latency
Batch processing layer in streaming architectures
Historical analysis on continuously updated datasets

Resources & Documentation

Official Documentation

Complete API reference and guides

Getting Started Guide

Quick start tutorials and examples

Iceberg Catalog Documentation
Iceberg Data Building Guide
Iceberg Catalog Configuration
ICEBERG_META Function
Doris and Iceberg Best Practices
Metadata Cache Documentation
Built-in Authorization
Next-Generation Data Lakehouse
