Google BigQuery

Serverless Google Cloud data warehouse with managed Iceberg tables, automatic optimization, Storage Write API streaming, and deep GCP ecosystem integration

Key Features

Dual Table Model (Score: 80 | Managed + External)

BigQuery-managed Iceberg (internal catalog, full DML) and BigLake external Iceberg (Dataplex/HMS/Glue via GCS, query + limited writes)
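
As a quick sketch of the two models, the statements below create a BigQuery-managed Iceberg table and a BigLake external Iceberg table registered from existing metadata in GCS. The project, dataset, connection, bucket, and metadata paths are placeholders, and the exact option set may shift while the feature is in Preview.

```sql
-- Managed Iceberg table: BigQuery owns the catalog and supports full DML.
-- Project, connection, dataset, and bucket names are placeholders.
CREATE TABLE mydataset.orders (
  order_id    INT64,
  customer_id INT64,
  order_ts    TIMESTAMP,
  amount      NUMERIC
)
WITH CONNECTION `my-project.us.gcs-connection`
OPTIONS (
  file_format  = 'PARQUET',
  table_format = 'ICEBERG',
  storage_uri  = 'gs://my-bucket/iceberg/orders'
);

-- External BigLake Iceberg table: registered from an existing Iceberg
-- metadata file in GCS; queryable from BigQuery with only limited writes.
CREATE EXTERNAL TABLE mydataset.orders_external
WITH CONNECTION `my-project.us.gcs-connection`
OPTIONS (
  format = 'ICEBERG',
  uris   = ['gs://my-bucket/iceberg/orders_external/metadata/v1.metadata.json']
);
```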

Automatic Optimization (Score: 100 | Zero Maintenance)

Fully automatic file-size tuning, clustering, metadata compaction & orphan-file GC. No user-issued OPTIMIZE or VACUUM commands required

Asymmetric DML Support (Score: 75 | Managed Full, External Limited)

Managed tables: INSERT, UPDATE, DELETE, MERGE with GoogleSQL semantics. External tables: limited INSERT support via Dataflow/Spark
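
On managed Iceberg tables this is ordinary GoogleSQL; for instance, an upsert with MERGE looks the same as on native tables. The sketch below reuses the placeholder orders table and assumes a hypothetical orders_staging table with matching columns.

```sql
-- Upsert staged changes into the managed Iceberg table (placeholder names).
MERGE mydataset.orders AS t
USING mydataset.orders_staging AS s
  ON t.order_id = s.order_id
WHEN MATCHED THEN
  UPDATE SET amount = s.amount, order_ts = s.order_ts
WHEN NOT MATCHED THEN
  INSERT (order_id, customer_id, order_ts, amount)
  VALUES (s.order_id, s.customer_id, s.order_ts, s.amount);
```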

Intelligent Storage Strategy (Score: 95 | Auto MoR + CoW)

DML operations generate position/equality delete files (MoR); automatic compaction, clustering, and garbage collection (CoW) run in the background

Storage Write API Streaming (Score: 70 | High-Throughput Preview)

High-throughput streaming via the Storage Write API (Preview) through Dataflow, Beam, or Spark. No built-in CDC apply; use Datastream + Dataflow patterns instead

Parquet-Only Format (Score: 40 | Limited File Formats)

Parquet only (Preview); ORC/Avro not yet supported. Iceberg format v2 is required for managed tables; v3 evaluation is planned for 2025

Differential Time Travel (Score: 60 | Managed vs External)

Managed tables: FOR SYSTEM_TIME AS OF syntax translating to snapshots. External BigLake tables: no SQL time travel currently
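
A minimal sketch of the managed-table syntax, again using the placeholder orders table:

```sql
-- Read the managed Iceberg table as of one hour ago.
-- Not currently available on external BigLake Iceberg tables.
SELECT *
FROM mydataset.orders
  FOR SYSTEM_TIME AS OF TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 HOUR);
```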

BigQuery-Native Security (Score: 95 | IAM + Column Masking)

IAM permissions work like those on native BigQuery tables. Column-level security and masking on managed Iceberg; external tables governed via BigLake/Dataplex policy tags
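
Since managed Iceberg tables carry standard BigQuery IAM, access can be granted with GoogleSQL DCL just as for native tables; the role, table path, and principal below are placeholders, and column-level masking itself is configured through policy tags rather than in this statement.

```sql
-- Grant read-only access on the managed Iceberg table to one user.
-- Role, table path, and principal are placeholders.
GRANT `roles/bigquery.dataViewer`
ON TABLE `my-project.mydataset.orders`
TO "user:analyst@example.com";
```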

Preview Status Limitations (Score: 50 | Pre-GA Features)

The feature is Pre-GA (behavior may change). No table rename/clone, limited concurrency, and external writes rely on Dataflow/Spark

GCP Ecosystem Integration (Score: 100 | Native Services)

BigQuery ML on Iceberg data, Dataform transformations, BigQuery Omni cross-cloud queries, and end-to-end lineage through Dataplex

Google BigQuery Iceberg Feature Matrix

Comprehensive breakdown of Iceberg capabilities in Google BigQuery

| Dimension | Support Level | Implementation Details | Status |
| --- | --- | --- | --- |
| Catalog Types | Partial (Dual Model) | BigQuery-managed (internal) + BigLake external (Dataplex/HMS/Glue); no REST/Nessie | Preview |
| SQL Analytics | Partial (Table-dependent) | Managed: full CREATE/CTAS/INSERT/DML; External: SELECT + limited INSERT via Dataflow | Preview |
| DML Operations | Partial (Managed Full) | Managed: INSERT/UPDATE/DELETE/MERGE; External: limited INSERT via external tools | Preview |
| Storage Strategy | Full (Auto Optimization) | MoR operations + automatic CoW optimization; background compaction/clustering/GC | Preview |
| Streaming Support | Partial (Storage Write API) | High-throughput via Storage Write API; Dataflow/Beam/Spark; no built-in CDC | Preview |
| Format Support | Limited (Parquet Only) | Parquet only; no ORC/Avro; v2 required; v3 evaluation planned for 2025 | Preview |
| Time Travel | Partial (Managed Only) | Managed: FOR SYSTEM_TIME AS OF; External: no SQL time travel currently | Preview |
| Schema Evolution | Full (Metadata-only) | ADD/DROP/RENAME columns; type widening; instant reflection in INFORMATION_SCHEMA (see the sketch after this table) | Preview |
| Security & Governance | Full (IAM + Column Masking) | BigQuery IAM + column-level security/masking; BigLake/Dataplex policy tags | Preview |
| GCP Integration | Full (Native Ecosystem) | BigQuery ML, Dataform, Omni, Dataplex lineage; multi-engine access to managed tables | Preview |
| Automatic Optimization | Full (Zero Maintenance) | Automatic file compaction, clustering, metadata optimization, garbage collection | Preview |
| Preview Limitations | Several (Pre-GA) | No table rename/clone; concurrency limits; external DML limited; behavior may change | Preview |

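As referenced in the Schema Evolution row above, these changes are plain GoogleSQL DDL and are metadata-only; the statements below continue the placeholder orders example and assume a managed Iceberg table.

```sql
-- Metadata-only schema changes on the managed Iceberg table (placeholder names).
ALTER TABLE mydataset.orders ADD COLUMN promo_code STRING;
ALTER TABLE mydataset.orders RENAME COLUMN promo_code TO promotion_code;
ALTER TABLE mydataset.orders DROP COLUMN promotion_code;
```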

Use Cases

Serverless Data Warehouse

Fully managed Iceberg tables with automatic optimization

  • Real-world example: An e-commerce company manages 5TB of customer transaction data in BigQuery managed Iceberg tables. The automatic optimization service continuously compacts small files and optimizes table layout in the background, eliminating the need for manual OPTIMIZE commands. The data engineering team saves 15+ hours per week of maintenance work while queries remain fast
  • Modern data warehouse with zero maintenance overhead for production workloads
  • Analytics workloads requiring automatic optimization without manual intervention
  • High-frequency update scenarios with background optimization and clustering

GCP-Native Analytics Platform

Deep integration with Google Cloud ecosystem services

  • Real-world example: A fintech startup uses BigQuery ML to train fraud detection models directly on Iceberg tables containing payment transaction data. They use Dataform to transform raw data into analytics-ready tables, track end-to-end lineage through Dataplex, and use BigQuery Omni to query data across Google Cloud and AWS without moving it
  • BigQuery ML on Iceberg data for machine learning model training and inference (see the sketch after this list)
  • Dataform transformations on Iceberg tables for ELT pipelines
  • Cross-cloud analytics with BigQuery Omni for multi-cloud data access
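
A minimal sketch of the BigQuery ML bullet above, assuming a hypothetical mydataset.transactions Iceberg table that carries an is_fraud label column:

```sql
-- Train a simple fraud classifier directly on the Iceberg table.
-- Dataset, table, model, and column names are illustrative placeholders.
CREATE OR REPLACE MODEL mydataset.fraud_model
OPTIONS (
  model_type       = 'LOGISTIC_REG',
  input_label_cols = ['is_fraud']
) AS
SELECT
  amount,
  merchant_category,
  is_fraud
FROM mydataset.transactions;
```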

Streaming Analytics with Storage Write API

High-throughput streaming ingestion for real-time analytics

  • Real-world example: A gaming company ingests player event data from 10 million active users using Dataflow with the Storage Write API into BigQuery managed Iceberg tables. Events become queryable within 2-3 seconds, powering real-time leaderboards and player analytics dashboards. They process 50,000 events per second with near real-time visibility
  • Real-time streaming data analysis with sub-second to second latency
  • High-volume ingestion via Dataflow/Apache Beam/Spark for operational analytics
  • CDC processing with Datastream integration for database replication

Multi-Engine Data Lake

Iceberg tables accessible from multiple GCP services

  • Real-world example: A media company stores video metadata in BigQuery managed Iceberg tables. Their data analysts use BigQuery SQL for reporting, while data scientists use Dataproc Spark for ML feature engineering, and streaming engineers use Flink for real-time processing. All three teams access the same Iceberg tables without data duplication or ETL
  • Data shared between BigQuery and Dataproc Spark/Flink for unified analytics
  • Multi-engine analytical workloads with consistent data access
  • Hybrid batch and streaming architectures with open format interoperability


💡 Join the OLake Community!

Got questions, ideas, or just want to connect with other data engineers?
👉 Join our Slack Community to get real-time support, share feedback, and shape the future of OLake together. 🚀

Your success with OLake is our priority. Don't hesitate to contact us if you need any help or further clarification!