Schema Structure
A schema is a list of named columns with types. The top-level schema is a struct type, and each field has:
- Name - the column name
- Field ID - a unique integer ID (never reused)
- Type - primitive or nested type
- Required/Optional - whether values can be null
- Doc - optional documentation string
- Default values - initial and write defaults (v3+)
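As a rough illustration of the attributes listed above, a field and a schema can be modeled with plain Python dataclasses. The class names `Field` and `Schema` and their attribute names are hypothetical, chosen for this sketch only; they are not the API of any Iceberg library.

```python
from dataclasses import dataclass
from typing import Any, Optional, Tuple


@dataclass(frozen=True)
class Field:
    """Hypothetical model of one schema field (not a real Iceberg class)."""
    field_id: int                  # unique integer ID, never reused
    name: str                      # column name
    type: str                      # type name, e.g. "long" or "string"
    required: bool = False         # False means values can be null
    doc: Optional[str] = None      # optional documentation string
    initial_default: Any = None    # v3+: default for rows written before the field existed
    write_default: Any = None      # v3+: default for new rows that omit the field


@dataclass(frozen=True)
class Schema:
    """Hypothetical top-level struct: an ordered tuple of fields."""
    fields: Tuple[Field, ...]

    def __post_init__(self):
        # Reject schemas in which two fields share an ID.
        ids = [f.field_id for f in self.fields]
        if len(ids) != len(set(ids)):
            raise ValueError("field IDs must be unique within a schema")


schema = Schema(fields=(
    Field(1, "id", "long", required=True, doc="Primary identifier"),
    Field(2, "name", "string"),
    Field(3, "created_at", "timestamptz", required=True),
))
```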
Primitive Types
Iceberg supports a rich set of primitive types:
Numeric Types
- boolean - True or false
- int - 32-bit signed integers (can promote to long)
- long - 64-bit signed integers
- float - 32-bit IEEE 754 floating point (can promote to double)
- double - 64-bit IEEE 754 floating point
- decimal(P,S) - Fixed-point decimal with precision P and scale S (max precision 38)
Date and Time Types
- date - Calendar date without timezone or time
- time - Time of day without date or timezone (microsecond precision)
- timestamp - Timestamp without timezone (microsecond precision)
- timestamptz - Timestamp with timezone (stored as UTC, microsecond precision)
- timestamp_ns - Timestamp without timezone (nanosecond precision, v3+)
- timestamptz_ns - Timestamp with timezone (nanosecond precision, v3+)
String and Binary Types
- string - Arbitrary-length UTF-8 encoded character sequences
- uuid - Universally unique identifiers (stored as 16-byte fixed)
- fixed(L) - Fixed-length byte array of length L
- binary - Arbitrary-length byte array
Special Types (v3+)
- variant - Semi-structured JSON-like data with flexible schema
- geometry(C) - Geospatial features in coordinate reference system C, with linear edge interpolation
- geography(C, A) - Geospatial features in coordinate reference system C, with edge-interpolation algorithm A
- unknown - Placeholder type for undetermined columns (must be optional)
Nested Types
Iceberg supports three nested types:
Structs
A struct is a tuple of typed, named fields. Each field:
- Has a unique field ID
- Can be required or optional
- Can be any type (including other structs)
- Can have a default value (v3+)
Lists
A list is a collection of values with a single element type. List elements:
- Have a unique field ID
- Can be required or optional
- Can be any type (including nested types)
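Because struct fields and list elements each carry their own ID, a nested schema can be walked recursively to gather every ID it uses. The sketch below assumes a dictionary layout loosely modeled on a JSON schema representation; the key names (`fields`, `element-id`, `element`) and the helper `collect_ids` are assumptions made for illustration.

```python
def collect_ids(type_):
    """Yield every field ID found in a (possibly nested) type description."""
    if isinstance(type_, str):
        # Primitive types such as "string" carry no IDs of their own.
        return
    if type_["type"] == "struct":
        for field in type_["fields"]:
            yield field["id"]
            yield from collect_ids(field["type"])
    elif type_["type"] == "list":
        # The list element has its own ID, separate from the list field's ID.
        yield type_["element-id"]
        yield from collect_ids(type_["element"])


schema = {
    "type": "struct",
    "fields": [
        {"id": 1, "name": "id", "required": True, "type": "long"},
        {"id": 2, "name": "address", "required": False, "type": {
            "type": "struct",
            "fields": [
                {"id": 3, "name": "city", "required": False, "type": "string"},
            ],
        }},
        {"id": 4, "name": "tags", "required": False, "type": {
            "type": "list",
            "element-id": 5,
            "element-required": True,
            "element": "string",
        }},
    ],
}

ids = list(collect_ids(schema))
```

Every ID in `ids` is distinct, including the one assigned to the element of the `tags` list.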
Maps
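A map is a collection of key-value pairs. The key and the value each have a unique field ID; keys are always required, while values can be required or optional.
Field IDs: The Key to Evolution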
Every field in an Iceberg schema has a unique integer ID that:
- Never changes - the ID follows the field through renames
- Is never reused - even if a field is deleted, its ID is retired
- Identifies the column in data files - not the name or position
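The three properties above can be illustrated with a toy ID allocator. This is a minimal sketch, not how any Iceberg implementation is written: the class `EvolvingSchema` and its methods are hypothetical, and the counter simply stands in for the idea that the highest assigned ID is remembered and never decreases.

```python
class EvolvingSchema:
    """Toy schema that hands out field IDs and never reuses them."""

    def __init__(self):
        self.columns = {}         # field ID -> current column name
        self.last_column_id = 0   # highest ID ever assigned; never decreases

    def add(self, name):
        # A new column always gets a fresh ID, even if the name was used before.
        self.last_column_id += 1
        self.columns[self.last_column_id] = name
        return self.last_column_id

    def rename(self, field_id, new_name):
        # Only the name changes; the ID stays attached to the field.
        self.columns[field_id] = new_name

    def drop(self, field_id):
        # The ID is retired: the counter is not rolled back.
        del self.columns[field_id]


s = EvolvingSchema()
id_col = s.add("id")
email_col = s.add("email")
s.rename(email_col, "contact_email")   # same ID, new name
s.drop(email_col)                      # ID retired
new_email_col = s.add("email")         # same name, different ID
```

Here the second `email` column receives a new ID, so data written under the dropped column is never mistaken for data belonging to the new one.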
Column Projection
Iceberg reads data files using field IDs, not names or positions:
- Read the data file’s schema (embedded in Parquet/ORC/Avro)
- Map field IDs from the read schema to the data file schema
- Project columns by matching IDs
- Handle missing fields with defaults or nulls
Because columns are matched by ID:
- Renaming columns doesn’t require rewriting data
- Reordering columns is a metadata-only operation
- Adding columns doesn’t affect existing files
- Dropping columns doesn’t break old data files
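ID-based projection can be sketched in a few lines. The function `project` and its argument shapes are assumptions made for this example: the read schema is a list of `(field_id, name, default)` tuples, and the file schema maps each field ID to the column name stored in that file.

```python
def project(read_schema, file_schema, row):
    """Resolve a row from a data file against the current read schema.

    read_schema: list of (field_id, current_name, default) tuples
    file_schema: dict mapping field_id -> column name as written in the file
    row:         dict mapping the file's column names -> values
    """
    result = {}
    for field_id, name, default in read_schema:
        file_name = file_schema.get(field_id)
        if file_name is None:
            # The file predates this field: fall back to the default (or None).
            result[name] = default
        else:
            # Match by ID; the name stored in the file is irrelevant.
            result[name] = row[file_name]
    return result


# The file was written when field 2 was called "email" and field 3 did not exist.
file_schema = {1: "id", 2: "email"}
row = {"id": 7, "email": "a@example.com"}

# The current schema has renamed field 2 and added field 3.
read_schema = [
    (1, "id", None),
    (2, "contact_email", None),
    (3, "country", "unknown"),
]

projected = project(read_schema, file_schema, row)
```

The old file is read without rewriting it: the renamed column resolves through its ID, and the added column is filled from its default.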
Type Promotion
Iceberg supports safe type promotions:
| From Type | To Type (v1, v2) | To Type (v3+) |
|---|---|---|
| int | long | long |
| float | double | double |
| decimal(P,S) | decimal(P’,S) where P’ > P | decimal(P’,S) where P’ > P |
| date | - | timestamp, timestamp_ns |
| unknown | - | any type |
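The table above can be encoded as a small lookup. This is a sketch that covers only the rows listed here; the function name `can_promote` and the string spelling of type names are assumptions for illustration.

```python
import re

_DECIMAL = re.compile(r"decimal\((\d+),\s*(\d+)\)")


def can_promote(src, dst, format_version=2):
    """Return True if the table above allows promoting `src` to `dst`."""
    if (src, dst) in {("int", "long"), ("float", "double")}:
        return True

    s, d = _DECIMAL.fullmatch(src), _DECIMAL.fullmatch(dst)
    if s and d:
        src_p, src_s = int(s.group(1)), int(s.group(2))
        dst_p, dst_s = int(d.group(1)), int(d.group(2))
        # Precision may only widen; scale must stay the same.
        return dst_p > src_p and dst_s == src_s

    if format_version >= 3:
        if src == "date" and dst in {"timestamp", "timestamp_ns"}:
            return True
        if src == "unknown":
            return True

    return False
```

For example, `can_promote("decimal(10,2)", "decimal(18,2)")` is allowed, while `can_promote("decimal(10,2)", "decimal(18,4)")` is not, because the scale changes.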
Promotion from timestamp to timestamptz is not allowed as it would change the semantic meaning of values.
Default Values (v3+)
Format version 3 adds support for default values:
- initial-default - Used for rows written before the field was added
- write-default - Used for new rows if the writer doesn’t supply a value
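The difference between the two defaults is easiest to see side by side. In this sketch, a row or record that lacks the field's key stands for "the file predates the field" on the read path and "the writer did not supply a value" on the write path. The helper names and the dictionary layout are hypothetical.

```python
_MISSING = object()  # distinguishes "absent" from an explicit None

# Hypothetical field description using the two default names from the text.
status = {"name": "status", "initial-default": "unknown", "write-default": "active"}


def value_on_read(row, field):
    """Read path: rows from files written before the field existed get initial-default."""
    value = row.get(field["name"], _MISSING)
    return field["initial-default"] if value is _MISSING else value


def value_on_write(record, field):
    """Write path: records that omit the field get write-default."""
    value = record.get(field["name"], _MISSING)
    return field["write-default"] if value is _MISSING else value


old_row = {"id": 1}                       # written before "status" was added
new_record = {"id": 2}                    # writer did not supply "status"
explicit = {"id": 3, "status": "closed"}  # writer supplied a value

read_old = value_on_read(old_row, status)
written_new = value_on_write(new_record, status)
written_explicit = value_on_write(explicit, status)
```

The two defaults can differ, as they do here: existing rows read back as `unknown`, while new rows that omit the field are written as `active`.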
Identifier Fields
Schemas can declare which fields identify unique entities, though uniqueness is not enforced. Identifier fields:
- Must be primitive types (not float or double)
- Cannot be optional
- Cannot be nested in maps or lists
- Define row “sameness” for merge operations
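The first three rules above lend themselves to a simple validation pass. The sketch below assumes a flat, hypothetical field description in which `ancestors` lists the kinds of the enclosing nested types; neither that layout nor the function name comes from an Iceberg library.

```python
NESTED_TYPES = {"struct", "list", "map"}
FLOATING_TYPES = {"float", "double"}


def identifier_field_problems(field):
    """Return the reasons a field cannot be an identifier field (empty if none).

    `field` is a hypothetical flat description:
      type      - type name, e.g. "long" or "struct"
      required  - True if the field cannot be null
      ancestors - kinds of the enclosing nested types, outermost first
    """
    problems = []
    if field["type"] in NESTED_TYPES:
        problems.append("must be a primitive type")
    if field["type"] in FLOATING_TYPES:
        problems.append("must not be float or double")
    if not field["required"]:
        problems.append("must be required")
    if any(kind in ("list", "map") for kind in field.get("ancestors", [])):
        problems.append("must not be nested in a map or list")
    return problems


ok = identifier_field_problems(
    {"name": "id", "type": "long", "required": True})
bad = identifier_field_problems(
    {"name": "score", "type": "double", "required": False, "ancestors": ["list"]})
```

A required `long` at the top level passes, while an optional `double` inside a list fails three of the checks.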
Reserved Field IDs
Field IDs above 2147483447 are reserved for metadata columns:
| Field ID | Name | Type | Description |
|---|---|---|---|
| 2147483646 | _file | string | Path of the file containing the row |
| 2147483645 | _pos | long | Row position in the source file |
| 2147483644 | _deleted | boolean | Whether the row is deleted |
| 2147483543 | _change_type | string | Change type in changelog (INSERT, DELETE, etc.) |
| 2147483540 | _row_id | long | Unique row identifier for lineage (v3+) |
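The threshold 2147483447 equals the largest 32-bit signed integer minus 200, which makes a reserved-ID check straightforward. The constant and function names below are hypothetical; the ID values are the ones listed in the table above.

```python
INT32_MAX = 2**31 - 1                 # 2147483647
MAX_USER_FIELD_ID = INT32_MAX - 200   # 2147483447; IDs above this are reserved

# Metadata columns from the table above.
RESERVED_COLUMNS = {
    2147483646: "_file",
    2147483645: "_pos",
    2147483644: "_deleted",
    2147483543: "_change_type",
    2147483540: "_row_id",
}


def is_reserved(field_id):
    """Return True if the ID falls in the range reserved for metadata columns."""
    return field_id > MAX_USER_FIELD_ID
```

A schema tool could call `is_reserved` before assigning an ID to a user column and reject any value in the reserved range.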
Schema Evolution
Iceberg supports comprehensive schema evolution. See the Evolution guide for details on:
- Adding, dropping, and renaming columns
- Reordering fields
- Type promotion
- Modifying nested structures
- Default value management
Working with Schemas
View Current Schema
Access Schema History
Find Fields by Name
Evolve Schema
Learn More
Schema Evolution
Learn how to safely evolve schemas over time
Table Format
Understand how schemas fit into the overall table format