Amazon Athena (Engine v3)
Serverless AWS-native query engine with complete DML operations, Lake Formation governance, time travel, and deep AWS ecosystem integration for Iceberg tables
Key Features
AWS Glue Catalog Integration
Athena supports only the AWS Glue Data Catalog for Iceberg tables. Hive, REST, Nessie, and JDBC catalogs are not recognized, which ties Iceberg usage to the AWS ecosystem
Serverless Query Engine
Supports SELECT, CREATE TABLE with TBLPROPERTIES ('table_type'='ICEBERG'), CTAS, and INSERT INTO. Queries run on a serverless, auto-scaling Trino-based engine with snapshot isolation
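As a rough illustration of the DDL involved, the sketch below builds a CREATE TABLE statement for an Iceberg table. Athena's documented DDL marks a table as Iceberg through the `table_type` table property. The helper name `create_iceberg_table_sql`, the table `analytics.events`, its columns, and the S3 location are all hypothetical; the function only builds a string and does not submit anything to Athena.

```python
from typing import Dict, List, Optional


def create_iceberg_table_sql(
    table: str,
    columns: Dict[str, str],
    location: str,
    partition_by: Optional[List[str]] = None,
) -> str:
    """Build an Athena CREATE TABLE statement for an Iceberg table (string only)."""
    cols = ",\n  ".join(f"{name} {dtype}" for name, dtype in columns.items())
    parts = [f"CREATE TABLE {table} (\n  {cols}\n)"]
    if partition_by:
        # Iceberg partition transforms such as day(col) go here.
        parts.append(f"PARTITIONED BY ({', '.join(partition_by)})")
    parts.append(f"LOCATION '{location}'")
    # The table_type property is what tells Athena to create an Iceberg table.
    parts.append("TBLPROPERTIES ('table_type' = 'ICEBERG')")
    return "\n".join(parts)


# Hypothetical table, columns, and bucket, for illustration only.
ddl = create_iceberg_table_sql(
    "analytics.events",
    {"event_id": "string", "event_ts": "timestamp", "payload": "string"},
    "s3://example-bucket/warehouse/events/",
    partition_by=["day(event_ts)"],
)
print(ddl)
```

The generated text can be pasted into the Athena query editor or passed to whatever client submits queries in your environment.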
Complete DML Operations
Engine v3 supports INSERT INTO, UPDATE, DELETE, and MERGE INTO. UPDATE/DELETE/MERGE write position-delete files (Iceberg v2) for row-level changes
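A minimal sketch of an upsert expressed as MERGE INTO is shown below. The helper `merge_upsert_sql` and the table and column names are assumptions made for illustration; the function only assembles the statement text, with the target aliased as `t` and the source as `s`.

```python
from typing import Sequence


def merge_upsert_sql(target: str, source: str, key: str, columns: Sequence[str]) -> str:
    """Build a MERGE INTO statement that updates matching rows and inserts new ones."""
    # The join key is not updated; every other column is overwritten from the source.
    set_clause = ", ".join(f"{c} = s.{c}" for c in columns if c != key)
    col_list = ", ".join(columns)
    val_list = ", ".join(f"s.{c}" for c in columns)
    return (
        f"MERGE INTO {target} t\n"
        f"USING {source} s\n"
        f"  ON t.{key} = s.{key}\n"
        f"WHEN MATCHED THEN\n"
        f"  UPDATE SET {set_clause}\n"
        f"WHEN NOT MATCHED THEN\n"
        f"  INSERT ({col_list}) VALUES ({val_list})"
    )


# Hypothetical tables: db.customers is the Iceberg target, db.customer_updates the staged changes.
merge_sql = merge_upsert_sql("db.customers", "db.customer_updates", "id", ["id", "name", "email"])
print(merge_sql)
```

Because Athena writes row-level changes as position-delete files, a statement like this does not rewrite the existing data files of the target table.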
Merge-on-Read Only
Athena operates Iceberg tables in merge-on-read mode only; DML produces delete files, not full rewrites. Copy-on-write is not configurable
No Streaming Support
No built-in streaming ingestion or CDC APIs. External tools (Glue ETL, Flink) must land data in Iceberg tables; Athena then queries the latest committed snapshot
Format V2 Only
Athena creates and writes only Iceberg spec v2 tables. It can read v1 tables, but DML is blocked until the table is upgraded. Uses Iceberg 1.2.x libraries, so no spec v3 features are available
Advanced Time Travel
FOR TIMESTAMP AS OF and FOR VERSION AS OF clauses let you query historical snapshots with millisecond precision for audit and analysis
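The two time travel clauses can be sketched as follows. The helper names and the table `db.orders` are hypothetical, and the snapshot ID is a made-up value; the timestamp helper truncates to milliseconds and always emits a UTC literal.

```python
from datetime import datetime, timedelta, timezone


def as_of_timestamp_sql(table: str, ts: datetime) -> str:
    """Build a query that reads the table as of a point in time (millisecond precision)."""
    if ts.tzinfo is None:
        raise ValueError("ts must be timezone-aware")
    utc = ts.astimezone(timezone.utc)
    # Truncate microseconds to milliseconds for the timestamp literal.
    literal = utc.strftime("%Y-%m-%d %H:%M:%S.") + f"{utc.microsecond // 1000:03d}"
    return f"SELECT * FROM {table} FOR TIMESTAMP AS OF TIMESTAMP '{literal} UTC'"


def as_of_version_sql(table: str, snapshot_id: int) -> str:
    """Build a query that reads a specific snapshot by its ID."""
    return f"SELECT * FROM {table} FOR VERSION AS OF {snapshot_id}"


# Hypothetical table; the second timestamp is given in UTC+1 and converted to UTC.
q1 = as_of_timestamp_sql("db.orders", datetime(2024, 3, 1, 12, 30, 15, 123456, tzinfo=timezone.utc))
q2 = as_of_timestamp_sql(
    "db.orders", datetime(2024, 3, 1, 13, 30, 15, tzinfo=timezone(timedelta(hours=1)))
)
q3 = as_of_version_sql("db.orders", 1234567890123456789)  # made-up snapshot ID
print(q1, q2, q3, sep="\n")
```

Snapshot IDs for the version form are typically looked up from the table's history metadata before building the query.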
Lake Formation Governance
Access enforced through IAM plus AWS Lake Formation policies (column-, row-, and cell-level). Lake Formation filters govern metadata table visibility
Built-in Optimization
OPTIMIZE ... REWRITE DATA performs bin-pack compaction; VACUUM handles snapshot expiration and orphan cleanup with configurable table properties
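A sketch of the maintenance statements follows. The helper names and `db.orders` are hypothetical, and the retention values are arbitrary examples rather than recommendations. The two table properties shown, `vacuum_max_snapshot_age_seconds` and `vacuum_min_snapshots_to_keep`, are taken from Athena's Iceberg table property list; check the current documentation for defaults and limits before relying on them.

```python
from typing import List, Optional


def optimize_sql(table: str, where: Optional[str] = None) -> str:
    """Build a bin-pack compaction statement, optionally limited by a predicate."""
    sql = f"OPTIMIZE {table} REWRITE DATA USING BIN_PACK"
    if where:
        sql += f" WHERE {where}"
    return sql


def vacuum_statements(table: str, max_snapshot_age_seconds: int, min_snapshots_to_keep: int) -> List[str]:
    """Build the statements that set snapshot retention and then run VACUUM."""
    return [
        f"ALTER TABLE {table} SET TBLPROPERTIES ("
        f"'vacuum_max_snapshot_age_seconds' = '{max_snapshot_age_seconds}', "
        f"'vacuum_min_snapshots_to_keep' = '{min_snapshots_to_keep}')",
        f"VACUUM {table}",
    ]


# Hypothetical table and predicate; retention of 3 days and 5 snapshots is only an example.
compact = optimize_sql("db.orders", where="order_date >= DATE '2024-01-01'")
cleanup = vacuum_statements("db.orders", max_snapshot_age_seconds=259200, min_snapshots_to_keep=5)
print(compact)
print("\n".join(cleanup))
```

Restricting OPTIMIZE with a predicate keeps compaction focused on the data that recent DML has fragmented.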
AWS Ecosystem Integration
Integrates with QuickSight for BI, Glue ETL for ingestion, and CloudTrail for audit logging, with transparent metadata caching and connectivity to other AWS services
Amazon Athena Iceberg Feature Matrix
Comprehensive breakdown of Iceberg capabilities in Amazon Athena (Engine v3)
Dimension | Support Level | Implementation Details | Engine Version |
---|---|---|---|
Catalog Types | Partial (Glue Only) | Only AWS Glue Data Catalog; no Hive, REST, Nessie, or JDBC catalog support | v3 |
SQL Analytics | Full (Serverless) | SELECT, CREATE TABLE with TBLPROPERTIES ('table_type'='ICEBERG'), CTAS, INSERT INTO with auto-scaling | v3 |
DML Operations | Full (Complete CRUD) | INSERT, UPDATE, DELETE, MERGE INTO with position-delete files (v2) | v3 |
Storage Strategy | Partial (MoR Only) | Merge-on-read mode only; DML produces delete files; copy-on-write not configurable | v3 |
Streaming Support | None (External Tools) | No streaming/CDC APIs; external tools (Glue ETL, Flink) required for ingestion | N/A |
Format Support | Limited (v2 Writes Only) | Creates/writes v2 only; reads v1 (DML blocked); Parquet/ORC/Avro support | v3 |
Time Travel | Full (Millisecond Precision) | FOR TIMESTAMP AS OF and FOR VERSION AS OF with millisecond precision | v3 |
Schema Evolution | Full (Metadata-only) | ALTER TABLE ADD/DROP/RENAME/REPLACE COLUMNS; metadata-only operations | v3 |
Security & Governance | Full (Lake Formation) | IAM + Lake Formation fine-grained access (column/row/cell); CloudTrail audit | v3 |
Optimization Features | Full (OPTIMIZE + VACUUM) | OPTIMIZE REWRITE DATA (bin-pack); VACUUM (snapshot expiry + orphan cleanup) | v3 |
AWS Integration | Full (Native Ecosystem) | QuickSight, Glue ETL, CloudTrail, transparent caching, connectivity to other AWS services | v3 |
Architecture Model | Serverless (Zero Infrastructure) | Serverless auto-scaling Trino-based engine; pay-per-query; zero infrastructure management | v3 |
Use Cases
AWS-Native Data Lake Analytics
Serverless analytics on AWS Glue-managed Iceberg tables
- Business intelligence with QuickSight integration
- Ad-hoc data exploration and analysis
- Cost-effective analytics with pay-per-query model
- Zero-infrastructure analytical workloads
Enterprise Governance and Compliance
Fine-grained security and comprehensive audit trails
- Multi-tenant data lake with Lake Formation policies
- Compliance-heavy industries requiring detailed audit
- Column, row, and cell-level access control
- Regulatory reporting with time travel capabilities
Data Maintenance and Quality
Complete DML operations for data correction workflows
- GDPR compliance data deletion and correction
- Data quality improvement and cleansing
- CDC processing with MERGE operations
- Historical data correction with audit trails
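For the deletion use case, one point follows from merge-on-read: a DELETE only writes delete files, so the removed rows stay in the old data files and in older snapshots until compaction and snapshot expiry run. The sketch below lists the statements in that order. The helper names, `db.customers`, and the key value are hypothetical, and whether this sequence satisfies a given compliance requirement is an assumption to verify for your own retention settings.

```python
from typing import List


def sql_string_literal(value: str) -> str:
    """Quote a value as a SQL string literal, doubling embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def erasure_statements(table: str, key_column: str, key_value: str) -> List[str]:
    """Build the ordered statements for a logical delete followed by physical cleanup."""
    predicate = f"{key_column} = {sql_string_literal(key_value)}"
    return [
        # 1. Logical delete: writes position-delete files.
        f"DELETE FROM {table} WHERE {predicate}",
        # 2. Compaction: rewrites data files with the deletes applied.
        f"OPTIMIZE {table} REWRITE DATA USING BIN_PACK",
        # 3. Snapshot expiry and orphan cleanup: removes files no longer referenced.
        f"VACUUM {table}",
    ]


# Hypothetical table and customer key.
steps = erasure_statements("db.customers", "customer_id", "c-1001")
print("\n".join(steps))
```

VACUUM only expires snapshots that are older than the table's configured retention, so older snapshots containing the row remain queryable through time travel until that window has passed.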
Serverless Query Layer
Auto-scaling query engine for variable workloads
- Unpredictable and variable query patterns
- Development and testing environments
- Cost-conscious deployments with sporadic usage
- Lambda architecture serving layer
Resources & Documentation
Official Documentation
Complete API reference and guides
Getting Started Guide
Quick start tutorials and examples
- Query Apache Iceberg Tables
- Create Iceberg Tables
- DML Operations Guide
- Apache Iceberg on AWS
- Lake Formation Fine-grained Access
- Athena Engine v3 Reference
- Serverless CDC with Iceberg
- Athena Iceberg Tutorial