Installation
The latest version of Iceberg is .
Using Spark Shell
To use Iceberg in a Spark shell, add the runtime JAR using the --packages option:
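For example, for Spark 3.5 built for Scala 2.12 (adjust the artifact to your Spark and Scala versions, and replace the placeholder with a real Iceberg release version):

```shell
spark-shell --packages org.apache.iceberg:iceberg-spark-runtime-3.5_2.12:<iceberg-version>
```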
If you want to include Iceberg in your Spark installation permanently, add the iceberg-spark-runtime JAR to Spark’s jars folder.
Using Spark SQL
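A sketch of a spark-sql invocation that adds the runtime package and defines a Hadoop catalog named local (artifact coordinates, version placeholder, and warehouse path are illustrative):

```shell
spark-sql --packages org.apache.iceberg:iceberg-spark-runtime-3.5_2.12:<iceberg-version> \
    --conf spark.sql.catalog.local=org.apache.iceberg.spark.SparkCatalog \
    --conf spark.sql.catalog.local.type=hadoop \
    --conf spark.sql.catalog.local.warehouse=/tmp/warehouse
```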
For Spark SQL, launch spark-sql with the same --packages option and your catalog configuration.
Configuring Catalogs
Iceberg catalogs enable SQL commands to manage tables and load them by name. Configure catalogs using properties under spark.sql.catalog.(catalog_name).
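The general shape, e.g. in spark-defaults.conf, looks like this (catalog_name is whatever name you choose; SparkCatalog is Iceberg's Spark catalog implementation, and type selects the backing catalog):

```
spark.sql.catalog.catalog_name       org.apache.iceberg.spark.SparkCatalog
spark.sql.catalog.catalog_name.type  hive
```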
Hadoop Catalog Example
Create a path-based catalog named local for tables under a warehouse directory:
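A sketch of the corresponding properties (the warehouse path is illustrative):

```
spark.sql.catalog.local            org.apache.iceberg.spark.SparkCatalog
spark.sql.catalog.local.type       hadoop
spark.sql.catalog.local.warehouse  /tmp/warehouse
```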
Hive Metastore Catalog
A Hive-based catalog with session catalog support can be configured through the same properties.
Creating Your First Table
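A minimal sketch using the path-based local catalog defined earlier (database, table, and column names are illustrative):

```sql
CREATE TABLE local.db.events (
    id BIGINT,
    data STRING,
    ts TIMESTAMP)
USING iceberg;
```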
Row-Level Updates
Iceberg adds row-level SQL updates to Spark:
MERGE INTO
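A sketch, assuming a target table local.db.events and a source table or view named updates with a matching schema:

```sql
MERGE INTO local.db.events t
USING updates s
ON t.id = s.id
WHEN MATCHED THEN UPDATE SET *
WHEN NOT MATCHED THEN INSERT *;
```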
MERGE INTO updates existing rows and inserts new ones in a single operation.
DELETE FROM
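A sketch with an illustrative table name and predicate:

```sql
DELETE FROM local.db.events
WHERE ts < TIMESTAMP '2024-01-01 00:00:00';
```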
DELETE FROM removes rows matching a condition.
Writing with DataFrames
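A sketch using the DataFrameWriterV2 API from PySpark (assumes a SparkSession named spark with the local catalog configured; table names are illustrative):

```python
data = spark.createDataFrame([(1, "a"), (2, "b")], ["id", "data"])

# Create a new Iceberg table from the DataFrame's schema
data.writeTo("local.db.sample").using("iceberg").create()

# Append more rows to the existing table
spark.createDataFrame([(3, "c")], ["id", "data"]).writeTo("local.db.sample").append()
```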
Iceberg supports the v2 DataFrame write API for programmatic writes.
Reading with DataFrames
Load tables by name using spark.table:
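For example, assuming the local catalog and an existing table (names are illustrative):

```python
df = spark.table("local.db.sample")
df.show()

# Equivalent DataFrameReader form
df = spark.read.format("iceberg").load("local.db.sample")
```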
Inspecting Tables
Use metadata tables to inspect table history and snapshots:
View Snapshots
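A sketch querying the snapshots metadata table (the table name is illustrative; committed_at, snapshot_id, and operation are columns of the snapshots table):

```sql
SELECT committed_at, snapshot_id, operation
FROM local.db.sample.snapshots;
```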
View History
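A sketch against the history metadata table, again with an illustrative table name:

```sql
SELECT made_current_at, snapshot_id, is_current_ancestor
FROM local.db.sample.history;
```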
View Files
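A sketch against the files metadata table, which lists the data files backing the current snapshot (table name is illustrative):

```sql
SELECT file_path, file_format, record_count
FROM local.db.sample.files;
```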
Type Conversion
Spark to Iceberg
When creating tables or writing data, Spark types are automatically converted:

| Spark Type | Iceberg Type | Notes |
|---|---|---|
| boolean | boolean | |
| byte, short, integer | integer | Promoted to integer |
| long | long | |
| float | float | |
| double | double | |
| decimal | decimal | |
| timestamp | timestamp with timezone | |
| timestamp_ntz | timestamp without timezone | |
| string, char, varchar | string | |
| binary | binary | Length is asserted when writing to a fixed type |
Numeric types support promotion during writes. For example, you can write a Spark integer to an Iceberg long column.
Iceberg to Spark
When reading from Iceberg tables:

| Iceberg Type | Spark Type | Supported |
|---|---|---|
| timestamp with timezone | timestamp | ✔️ |
| timestamp without timezone | timestamp_ntz | ✔️ |
| uuid | string | ✔️ |
| time | - | ❌ Not supported |
| variant | variant | ✔️ Spark 4.0+ |
| unknown | null | ✔️ Spark 4.0+ |
Next Steps
DDL Commands: learn about CREATE, ALTER, and DROP operations.
Query Data: explore SELECT queries and metadata tables.
Write Data: master INSERT INTO and MERGE INTO operations.
Procedures: maintain tables with stored procedures.