What is Apache Iceberg?
Apache Iceberg is an open table format for huge analytic datasets that brings reliability and performance to data lakes. Created by Netflix, Iceberg provides ACID transactions, schema evolution, partition evolution, and time travel queries on data lakes, bridging the gap between the flexibility of data lakes and the reliability of data warehouses.
- ACID Transactions: full ACID compliance for data lake operations
- Schema Evolution: safe schema changes without rewriting existing data
- Time Travel: query historical table state at any point in time
Getting Started with Apache Iceberg
Apache Iceberg integrates with popular data processing engines like Spark, Flink, Trino, and Hive.
Spark Integration
// Add Iceberg to Spark. The SQL extensions and catalog settings must be
// supplied when the session is built; they cannot be changed on a
// running session with spark.conf.set
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .config("spark.sql.extensions",
    "org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions")
  .config("spark.sql.catalog.spark_catalog",
    "org.apache.iceberg.spark.SparkSessionCatalog")
  .config("spark.sql.catalog.spark_catalog.type", "hive")
  .config("spark.sql.catalog.local",
    "org.apache.iceberg.spark.SparkCatalog")
  .config("spark.sql.catalog.local.type", "hadoop")
  .config("spark.sql.catalog.local.warehouse", "s3://my-bucket/warehouse")
  .getOrCreate()
// Create Iceberg table
spark.sql("""
CREATE TABLE local.db.events (
id bigint,
timestamp timestamp,
event_type string,
user_id string,
properties map<string, string>
) USING iceberg
PARTITIONED BY (days(timestamp))
""")
Basic Operations
-- Insert data
INSERT INTO local.db.events
VALUES (1, current_timestamp(), 'login', 'user123',
map('device', 'mobile', 'location', 'NYC'));
-- Query with time travel
SELECT * FROM local.db.events
TIMESTAMP AS OF '2023-12-01 10:00:00';
-- Schema evolution
ALTER TABLE local.db.events
ADD COLUMN session_id string;
-- Partition evolution
ALTER TABLE local.db.events
DROP PARTITION FIELD days(timestamp);
ALTER TABLE local.db.events
ADD PARTITION FIELD hours(timestamp);
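The extensions also enable row-level operations such as MERGE INTO, and every committed change becomes a snapshot you can inspect. A short sketch, where updates is a hypothetical staging table with a schema matching the events table:
-- Upsert from a staging table (updates is a hypothetical source)
MERGE INTO local.db.events t
USING updates u
ON t.id = u.id
WHEN MATCHED THEN UPDATE SET t.event_type = u.event_type
WHEN NOT MATCHED THEN INSERT *;
-- Find snapshot IDs and commit timestamps to use with time travel
SELECT snapshot_id, committed_at, operation
FROM local.db.events.snapshots;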
Real-World Examples
Netflix - Original Creator
Netflix created Iceberg to solve data correctness issues in their petabyte-scale data lake.
- 300+ PB of data across thousands of tables
- 100+ million table commits per day
- 99.9% reduction in data corruption incidents
- 50% faster queries due to advanced pruning
Apple - Event Analytics
Apple uses Iceberg for large-scale event analytics and user behavior tracking.
- 100+ TB daily event ingestion
- Sub-second schema evolution deployments
- Time travel queries for A/B test analysis
- 40% storage savings with optimized file layouts
Tabular - Cloud Data Platform
Tabular provides managed Iceberg service for enterprises migrating from traditional warehouses.
- 10x faster data ingestion vs traditional formats
- Zero-copy branching for dev/test environments
- Automatic optimization and compaction services
- Multi-engine compatibility (Spark, Trino, Flink)
Advanced Features
Time Travel & Branching
-- Time travel to specific timestamp
SELECT * FROM events
FOR SYSTEM_TIME AS OF '2023-12-01 10:00:00';
-- Time travel to specific snapshot
SELECT * FROM events
FOR SYSTEM_VERSION AS OF 12345678;
-- Create a branch for experimentation
ALTER TABLE events
CREATE BRANCH feature_branch;
-- Read from the branch
SELECT * FROM events VERSION AS OF 'feature_branch';
-- Fast-forward main to the branch head (publish the changes)
CALL local.system.fast_forward('db.events', 'main', 'feature_branch');
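To write to a branch instead of main, a common pattern (write-audit-publish) routes the session's writes through the WAP branch setting. A minimal sketch, assuming Iceberg 1.2+ with the Spark extensions enabled and the events schema from earlier:
-- Route this session's writes to the branch instead of main
SET spark.wap.branch = feature_branch;
-- Subsequent writes commit to feature_branch, leaving main untouched
INSERT INTO events
VALUES (2, current_timestamp(), 'signup', 'user456',
        map('device', 'web', 'location', 'SF'));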
Partition Evolution
-- Start with date partitioning
CREATE TABLE events (...)
PARTITIONED BY (days(event_time));
-- Evolve to hourly partitioning
ALTER TABLE events
DROP PARTITION FIELD days(event_time);
ALTER TABLE events
ADD PARTITION FIELD hours(event_time);
-- Add bucket partitioning for user_id
ALTER TABLE events
ADD PARTITION FIELD bucket(16, user_id);
-- Query optimizer uses new partitioning automatically
SELECT * FROM events
WHERE event_time >= '2023-12-01'
AND user_id = 'user123';
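Partition evolution is metadata-only, so files written under the old spec keep their layout until they are rewritten. A sketch of compacting them into the current spec, assuming the local catalog from the earlier examples:
-- Rewrite existing data files under the table's current partition spec
CALL local.system.rewrite_data_files(table => 'db.events');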
Best Practices
✅ Do
- Use partitioning strategies that match your query patterns
- Enable file compaction and run optimization regularly (see the maintenance sketch after these lists)
- Use Z-ordering for multi-dimensional data
- Set appropriate snapshot retention policies
- Monitor table metadata growth
❌ Don't
- Over-partition tables (creates too many small files)
- Ignore file size optimization (keep files between 128 MB and 1 GB)
- Use Iceberg for small, frequently changing datasets
- Skip regular maintenance operations
- Mix Iceberg and non-Iceberg formats in the same pipeline
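A minimal maintenance sketch covering the compaction, Z-ordering, and retention items above, assuming the local catalog from the earlier examples; the thresholds and column choices are illustrative:
-- Compact small files and Z-order on commonly filtered columns
CALL local.system.rewrite_data_files(
  table => 'db.events',
  strategy => 'sort',
  sort_order => 'zorder(user_id, event_time)'
);
-- Expire old snapshots to bound metadata growth and reclaim storage
CALL local.system.expire_snapshots(
  table => 'db.events',
  older_than => TIMESTAMP '2023-12-01 00:00:00',
  retain_last => 10
);
-- Delete files no longer referenced by any snapshot
CALL local.system.remove_orphan_files(table => 'db.events');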
Architecture Deep Dive
Three-Layer Architecture
An Iceberg table is organized in three layers: a catalog that holds a pointer to the current metadata file, a metadata layer of metadata files, manifest lists, and manifests, and a data layer of Parquet, ORC, or Avro files. Every commit writes new metadata and atomically swaps the catalog pointer, which is how Iceberg achieves atomic commits.
Metadata Management
{
"format-version": 2,
"table-uuid": "fb072c92-a02b-11e9-ae26-...",
"location": "s3://bucket/warehouse/table",
"last-sequence-number": 1,
"last-updated-ms": 1515100955770,
"last-column-id": 1,
"schemas": [...],
"partition-specs": [...],
"properties": {
"write.format.default": "parquet",
"write.parquet.compression-codec": "zstd"
},
"snapshots": [...]
}
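These JSON files are rarely read by hand; Spark exposes the same information through Iceberg's metadata tables. A small sketch against the table from the earlier examples:
-- Per-file statistics the planner uses for pruning
SELECT file_path, record_count, file_size_in_bytes
FROM local.db.events.files;
-- History of metadata file locations across commits
SELECT timestamp, file
FROM local.db.events.metadata_log_entries;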
Snapshot Isolation
- Readers: always see a consistent snapshot of the table
- Writers: create new snapshots and commit them atomically
- Isolation: serializable isolation for concurrent writes
- Rollback: instant rollback to any retained snapshot
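Rollback is a metadata-pointer move, not a data rewrite, which is why it is instant. A minimal sketch, assuming the snapshot ID was looked up in the snapshots metadata table shown earlier:
-- Point the table back at an earlier snapshot (no files rewritten)
CALL local.system.rollback_to_snapshot('db.events', 12345678);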