
Apache Doris v2.1+

MPP analytical database with comprehensive Iceberg read/write capabilities, vectorized execution, materialized view acceleration, and multi-catalog support for lake ingestion and analytics

Key Features

100
6+ Catalog Types

Multi-Catalog Excellence

CREATE CATALOG supports Hive Metastore, Glue, REST, Hadoop, DLF, and S3 Tables catalogs, each configured with its metastore URI or REST endpoint plus credentials.
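As a rough illustration of this property-based configuration, the Python sketch below renders a CREATE CATALOG statement for a REST catalog. The helper name `create_catalog_sql`, the catalog name `iceberg_rest`, and the endpoint URL are placeholders invented for this example. The property keys follow the pattern used in the Doris Iceberg catalog documentation, but they should be checked against the Doris version in use.

```python
def create_catalog_sql(name: str, properties: dict) -> str:
    """Render a Doris CREATE CATALOG statement from a property map."""
    if not name.isidentifier():
        raise ValueError(f"unsafe catalog name: {name!r}")
    rendered = []
    for key, value in properties.items():
        # Escape backslashes and double quotes so values stay inside the literal.
        escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
        rendered.append(f'    "{key}" = "{escaped}"')
    return f"CREATE CATALOG {name} PROPERTIES (\n" + ",\n".join(rendered) + "\n);"


# Hypothetical REST catalog; the URI is a placeholder, not a real endpoint.
rest_catalog_sql = create_catalog_sql(
    "iceberg_rest",
    {
        "type": "iceberg",
        "iceberg.catalog.type": "rest",
        "uri": "http://rest-catalog.example:8181",
    },
)
print(rest_catalog_sql)
```

Other catalog types would swap the `iceberg.catalog.type` value and supply their own connection and credential properties.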

90
Full Write Capabilities

MPP Lake Ingestion Engine

Supports full SELECT plus write-back through INSERT, INSERT OVERWRITE, and CTAS. Doris writes Parquet/ORC data files and commits Iceberg snapshots, so it can serve as both a lake-ingestion engine and an analytics layer.
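A minimal sketch of the three write paths named above, written as Python that only assembles SQL text. All catalog, database, and table names (`iceberg_rest.sales.orders`, `internal.staging.orders_raw`, and so on) are hypothetical, and the exact statement forms accepted for Iceberg targets should be verified against your Doris version.

```python
def insert_into(target: str, select_sql: str) -> str:
    """Append the rows produced by select_sql to the target table."""
    return f"INSERT INTO {target} {select_sql};"


def insert_overwrite(target: str, select_sql: str) -> str:
    """Replace the target table's contents with the rows from select_sql."""
    return f"INSERT OVERWRITE TABLE {target} {select_sql};"


def ctas(target: str, select_sql: str) -> str:
    """Create a new table whose schema and rows come from select_sql."""
    return f"CREATE TABLE {target} AS {select_sql};"


# Hypothetical source query against a Doris internal table.
source_query = "SELECT order_id, amount, order_date FROM internal.staging.orders_raw"

write_statements = [
    insert_into("iceberg_rest.sales.orders", source_query),
    insert_overwrite("iceberg_rest.sales.orders", source_query),
    ctas("iceberg_rest.sales.orders_copy", source_query),
]
for statement in write_statements:
    print(statement)
```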

80
UPDATE/DELETE ✓ MERGE ✗

Evolving DML Support

Supports INSERT INTO (append), INSERT OVERWRITE, and UPDATE and DELETE via Iceberg v2 delete files (v2.1+). MERGE is not yet available as a single statement and has to be emulated with other DML patterns.
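The text does not say which emulation patterns are meant. One generic possibility is delete-then-insert: remove target rows whose key appears in the staged changes, then append the staged rows. The Python sketch below only builds the two statements. The function name, table names, and key column are hypothetical, and whether Doris accepts this exact DELETE form on an Iceberg table is an assumption to verify.

```python
def emulate_merge(target: str, source: str, key: str) -> list:
    """Build a delete-then-insert pair that approximates an upsert.

    Assumes the source table has the same columns as the target and that
    the key column uniquely identifies a row.
    """
    delete_sql = (
        f"DELETE FROM {target} "
        f"WHERE {key} IN (SELECT {key} FROM {source});"
    )
    insert_sql = f"INSERT INTO {target} SELECT * FROM {source};"
    # Order matters: delete the old versions first, then append the new ones.
    return [delete_sql, insert_sql]


# Hypothetical target and staging tables.
merge_steps = emulate_merge(
    target="iceberg_rest.sales.orders",
    source="iceberg_rest.sales.orders_changes",
    key="order_id",
)
for step in merge_steps:
    print(step)
```

Because these are two separate statements, they are not atomic the way a single MERGE would be: a reader could observe the table between the delete and the insert.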

95
MoR + CoW

Complete Storage Strategy

Reads: applies position and equality delete files (MoR) automatically. Writes: generates position/equality delete files for UPDATE/DELETE; INSERT OVERWRITE rewrites files (CoW)
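To make the merge-on-read idea concrete, here is a simplified Python model of how a reader filters rows using position and equality deletes. It is not Doris's reader: the data layout, file names, and the `apply_deletes` function are invented for illustration, and the model ignores the sequence-number rules that the Iceberg spec uses to decide which delete files apply to which data files.

```python
def apply_deletes(rows, position_deletes, equality_deletes):
    """Return the records that survive the given delete files.

    rows: list of (file_path, position, record) tuples.
    position_deletes: set of (file_path, position) pairs to drop.
    equality_deletes: list of dicts; a record is dropped when it matches
        every column value in any one of these dicts.
    """
    live = []
    for file_path, position, record in rows:
        if (file_path, position) in position_deletes:
            continue  # removed by a position delete
        if any(
            all(record.get(col) == val for col, val in delete.items())
            for delete in equality_deletes
        ):
            continue  # removed by an equality delete
        live.append(record)
    return live


rows = [
    ("data-0001.parquet", 0, {"id": 1, "status": "new"}),
    ("data-0001.parquet", 1, {"id": 2, "status": "new"}),
    ("data-0002.parquet", 0, {"id": 3, "status": "old"}),
]
position_deletes = {("data-0001.parquet", 1)}  # drops the row with id 2
equality_deletes = [{"id": 3}]                 # drops any row with id 3

live_rows = apply_deletes(rows, position_deletes, equality_deletes)
print(live_rows)
```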

10
External Tools Required

No Native Streaming

There is no native streaming/CDC writer. Use an external tool such as Flink-Iceberg to land data, then query it from Doris with sub-second latency. Routine Load targets internal tables only.

70
v1/v2 + Parquet/ORC

Stable Format Support

Reads and writes Parquet (v1/v2) and ORC (v1/v2). Supports Iceberg spec v1 and v2; equality-delete support for ORC is available from v2.1.3. No v3 support yet.

100
SQL + System Tables

Comprehensive Time Travel

Query historical data with FOR TIMESTAMP AS OF or FOR VERSION AS OF, and inspect table metadata with the iceberg_meta() function. The $snapshots, $manifests, and $history system tables are also exposed.
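A typical time-travel workflow is to list a table's snapshots and then read the table as of one of them. The Python sketch below assembles both queries as text. The table name and snapshot ID are placeholders, the helper names are invented for this example, and the `iceberg_meta` argument names follow the form shown in the Doris documentation but should be confirmed for your version.

```python
def snapshots_query(table: str) -> str:
    """Build a query that lists a table's snapshots via iceberg_meta()."""
    return (
        f'SELECT * FROM iceberg_meta("table" = "{table}", '
        f'"query_type" = "snapshots");'
    )


def version_as_of_query(table: str, snapshot_id: int) -> str:
    """Build a query that reads the table as of a specific snapshot ID."""
    # int() guards against a non-numeric value being spliced into the SQL.
    return f"SELECT * FROM {table} FOR VERSION AS OF {int(snapshot_id)};"


# Hypothetical table and snapshot ID.
table_name = "iceberg_rest.sales.orders"
list_snapshots_sql = snapshots_query(table_name)
read_snapshot_sql = version_as_of_query(table_name, 1234567890123)

print(list_snapshots_sql)
print(read_snapshot_sql)
```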

85
Doris RBAC + Catalog

Layered Security Model

Doris RBAC is layered on top of the underlying catalog and storage IAM. Ranger and Lake Formation policies apply at the metastore/storage level; Doris adds row policies and column masking at query time.

95
Vectorized + MVs

Advanced Performance Features

Vectorized reader, manifest and data-file caching, partition-predicate push-down, and materialized-view acceleration on Iceberg sources via CREATE MATERIALIZED VIEW … REFRESH.

75
Several Constraints

Known Limitations

No MERGE statement; no continuous streaming writes; Avro data files unsupported; concurrent multi-engine writes may need manual conflict retries
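For the concurrent-write limitation, a client-side retry wrapper is one way to handle conflicts. The Python sketch below retries a write with exponential backoff and jitter. `CommitConflict` and `run_with_retries` are hypothetical names: the source does not say how Doris reports a conflicting commit, so a real client would need to map the actual error onto this exception.

```python
import random
import time


class CommitConflict(Exception):
    """Hypothetical error raised when another engine committed first."""


def run_with_retries(write, max_attempts=5, base_delay=0.5, sleep=time.sleep):
    """Call write() until it succeeds or max_attempts is exhausted."""
    for attempt in range(1, max_attempts + 1):
        try:
            return write()
        except CommitConflict:
            if attempt == max_attempts:
                raise  # give up and surface the conflict to the caller
            # Exponential backoff with jitter to avoid retrying in lockstep.
            delay = base_delay * 2 ** (attempt - 1)
            sleep(delay + random.uniform(0, delay / 2))


# Simulated write that conflicts twice before succeeding.
attempts = []


def flaky_write():
    attempts.append(1)
    if len(attempts) < 3:
        raise CommitConflict("table changed since the write started")
    return "committed"


# The sleep is stubbed out so the demo finishes instantly.
result = run_with_retries(flaky_write, sleep=lambda _seconds: None)
print(result, len(attempts))
```

Retrying is only safe when the write is idempotent or is recomputed against the latest table state on each attempt.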


Apache Doris Iceberg Feature Matrix

Comprehensive breakdown of Iceberg capabilities in Apache Doris v2.1+

| Dimension | Support Level | Implementation Details | Since Version |
|---|---|---|---|
| Catalog Types | Full (6+ Types) | HMS, Glue, REST, Hadoop, DLF, S3Tables with unified CREATE CATALOG syntax | 2.1+ |
| SQL Analytics | Full (MPP + Lake Ingestion) | Full SELECT + write-back (INSERT, INSERT OVERWRITE, CTAS); Parquet/ORC writing | 2.1+ |
| DML Operations | Partial (UPDATE/DELETE ✓ MERGE ✗) | INSERT, INSERT OVERWRITE, UPDATE/DELETE via delete files; MERGE emulated with patterns | 2.1+ |
| Storage Strategy | Full (MoR + CoW) | Reads position/equality deletes automatically; generates delete files; INSERT OVERWRITE (CoW) | 2.1.3+ |
| Streaming Support | None (External Tools) | No native streaming/CDC; use Flink-Iceberg + query with sub-second latency | N/A |
| Format Support | Partial (v1/v2 + Parquet/ORC) | Parquet + ORC (v1/v2); Iceberg spec v1/v2; no Avro or v3 support | 2.1.3+ |
| Time Travel | Full (SQL + System Tables) | FOR TIMESTAMP/VERSION AS OF; iceberg_meta() function; system tables ($snapshots, etc.) | 2.1+ |
| Schema Evolution | Full (Metadata-only) | ADD/DROP/RENAME columns, type evolution; automatic schema discovery | 2.1+ |
| Security & Governance | Partial (Layered Model) | Doris RBAC + catalog IAM; row/column policies; Ranger/LF limited enforcement | 2.1+ |
| Performance Features | Full (Vectorized + MVs) | Vectorized reader, caching, predicate pushdown, materialized views with auto-refresh | 2.1+ |
| Known Limitations | Several (Clear Constraints) | No MERGE; no streaming; no Avro; concurrent write conflicts need manual handling | 2.1+ |
| Iceberg Library | Current (v1.6.1) | Bundled Iceberg client 1.6.1; follows upstream roadmap for v3 support | 2.1+ |


Use Cases

MPP Lake Analytics

High-performance analytics on Iceberg data lakes

  • Complex analytical queries with vectorized execution
  • Multi-catalog data lake analytics and federation
  • High-performance OLAP workloads on lake data
  • Real-time analytics on externally ingested streaming data

Lake Ingestion and ETL

Data ingestion and transformation into Iceberg tables

  • ETL processes writing directly to Iceberg tables
  • Data warehouse modernization with lake storage
  • Batch data processing with INSERT OVERWRITE patterns
  • Data transformation pipelines with CTAS operations

Unified Analytics Platform

Single engine for both ingestion and analytics

  • Organizations wanting unified lake architecture
  • Teams requiring both transformation and querying capabilities
  • Materialized view acceleration for frequently accessed aggregates
  • Cross-catalog analytics with comprehensive catalog support

Hybrid Streaming-Batch Architecture

Batch analytics layer with external streaming ingestion

  • Lambda architectures with Flink streaming + Doris analytics
  • Real-time ingestion via external tools, sub-second query latency
  • Batch processing layer in streaming architectures
  • Historical analysis on continuously updated datasets

