The IncrementalAppendScan interface provides an API for reading only the data appended between two snapshots of an Iceberg table.
Overview
IncrementalAppendScan is designed for incremental processing workflows where you need to read only the new data added since the last processed snapshot. This is useful for streaming pipelines, change data capture (CDC), and incremental ETL jobs.
Interface
Core Methods
IncrementalAppendScan inherits all methods from IncrementalScan:
fromSnapshotInclusive()
Instructs this scan to look for changes starting from a particular snapshot (inclusive).
- fromSnapshotId - The start snapshot ID (inclusive)
- Throws IllegalArgumentException if the start snapshot is not an ancestor of the end snapshot
Example:
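The ancestor requirement can be illustrated with a small self-contained model. The sketch below is a toy, not Iceberg code: `ToyLineage`, `commit`, and `rangeInclusive` are hypothetical names introduced here, and snapshot history is assumed to be a plain parent-pointer map. It walks back from the end snapshot and fails with IllegalArgumentException when the requested start snapshot is never reached.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy stand-in for snapshot history; NOT part of the Iceberg API.
final class ToyLineage {
    // Maps each snapshot ID to its parent snapshot ID (null for the first snapshot).
    private final Map<Long, Long> parentOf = new HashMap<>();

    ToyLineage commit(long snapshotId, Long parentId) {
        parentOf.put(snapshotId, parentId);
        return this;
    }

    // Snapshot IDs from fromInclusive through to, oldest first.
    // Throws IllegalArgumentException if fromInclusive is not an ancestor of to.
    List<Long> rangeInclusive(long fromInclusive, long to) {
        List<Long> path = new ArrayList<>();
        Long current = to;
        while (current != null) {
            path.add(current);
            if (current == fromInclusive) {
                Collections.reverse(path); // collected newest first, return oldest first
                return path;
            }
            current = parentOf.get(current);
        }
        throw new IllegalArgumentException(
            "Snapshot " + fromInclusive + " is not an ancestor of snapshot " + to);
    }

    public static void main(String[] args) {
        // History: 1 <- 2 <- 3, plus a side commit 10 whose parent is 2.
        ToyLineage history = new ToyLineage()
            .commit(1L, null).commit(2L, 1L).commit(3L, 2L).commit(10L, 2L);
        System.out.println(history.rangeInclusive(2L, 3L)); // prints [2, 3]
    }
}
```

In this model, `rangeInclusive(10, 3)` throws because snapshot 10 is not on the path back from snapshot 3, which mirrors the documented failure case.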
fromSnapshotInclusive() with Reference
Instructs this scan to look for changes starting from a particular snapshot reference (inclusive).
- ref - The start ref name that points to a particular snapshot ID (inclusive)
- Throws IllegalArgumentException if the start snapshot is not an ancestor of the end snapshot
fromSnapshotExclusive()
Instructs this scan to look for changes starting from a particular snapshot (exclusive).
- fromSnapshotId - The start snapshot ID (exclusive)
- Throws IllegalArgumentException if the start snapshot is not an ancestor of the end snapshot
Example:
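The difference between an exclusive and an inclusive start is easiest to see side by side. The following is a minimal sketch under stated assumptions: `ToyAppendScan` and `coveredSnapshots` are hypothetical names, the class only mirrors the method names documented here, and history is assumed to be linear. With Iceberg itself the same call chain would be made on a real scan object, for example one obtained from `Table.newIncrementalAppendScan()`.

```java
import java.util.List;

// Toy stand-in that mirrors the documented method names; NOT the Iceberg class.
final class ToyAppendScan {
    private final List<Long> lineage; // snapshot IDs, oldest first (linear history)
    private final int startIndex;     // first snapshot whose appends are read
    private final int endIndex;       // last snapshot whose appends are read

    // Toy default: cover the whole history until a boundary is set.
    ToyAppendScan(List<Long> lineage) {
        this(lineage, 0, lineage.size() - 1);
    }

    private ToyAppendScan(List<Long> lineage, int startIndex, int endIndex) {
        this.lineage = lineage;
        this.startIndex = startIndex;
        this.endIndex = endIndex;
    }

    ToyAppendScan fromSnapshotInclusive(long id) {
        return new ToyAppendScan(lineage, indexOf(id), endIndex);
    }

    ToyAppendScan fromSnapshotExclusive(long id) {
        // Exclusive: skip the start snapshot itself and begin with its successor.
        return new ToyAppendScan(lineage, indexOf(id) + 1, endIndex);
    }

    ToyAppendScan toSnapshot(long id) {
        return new ToyAppendScan(lineage, startIndex, indexOf(id));
    }

    // Snapshots whose appended data this scan would cover, oldest first.
    List<Long> coveredSnapshots() {
        return List.copyOf(lineage.subList(startIndex, endIndex + 1));
    }

    private int indexOf(long id) {
        int index = lineage.indexOf(id);
        if (index < 0) {
            throw new IllegalArgumentException("Unknown snapshot: " + id);
        }
        return index;
    }

    public static void main(String[] args) {
        ToyAppendScan scan = new ToyAppendScan(List.of(1L, 2L, 3L, 4L));
        System.out.println(
            scan.fromSnapshotInclusive(2L).toSnapshot(4L).coveredSnapshots()); // prints [2, 3, 4]
        System.out.println(
            scan.fromSnapshotExclusive(2L).toSnapshot(4L).coveredSnapshots()); // prints [3, 4]
    }
}
```

The exclusive form suits the case where the start snapshot has already been processed and only later appends are wanted; the inclusive form suits the case where the start snapshot's own data should be read as well.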
fromSnapshotExclusive() with Reference
Instructs this scan to look for changes starting from a particular snapshot reference (exclusive).
- ref - The start ref name that points to a particular snapshot ID (exclusive)
toSnapshot()
Instructs this scan to look for changes up to a particular snapshot (inclusive).
- toSnapshotId - The end snapshot ID (inclusive)
toSnapshot() with Reference
Instructs this scan to look for changes up to a particular snapshot reference (inclusive).
- ref - The end snapshot ref (inclusive)
useBranch()
Specifies the branch to use for the incremental scan.
- branch - The branch name
Examples
Basic Incremental Scan
Streaming Incremental Processing
Processing Appends in Time Range
Branch-Based Incremental Scan
Incremental Scan with Filters
Reading All Historical Appends
Inclusive vs Exclusive Boundaries
Checkpointing Pattern
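One way to apply a checkpointing pattern is to persist the ID of the last processed snapshot and use it as the exclusive start of the next scan. The sketch below models only that bookkeeping. It is a toy under stated assumptions: `ToyCheckpointedReader`, `pending`, and `checkpoint` are hypothetical names, the in-memory list stands in for table history, and the field stands in for a durable checkpoint store.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of a checkpointed consumer; NOT Iceberg classes.
final class ToyCheckpointedReader {
    private final List<Long> lineage = new ArrayList<>(); // snapshot IDs, oldest first
    private Long lastProcessed = null;                    // null: nothing processed yet

    // Simulates a new append snapshot being committed to the table.
    void commit(long snapshotId) {
        lineage.add(snapshotId);
    }

    // Snapshots committed after the checkpoint (exclusive start), oldest first.
    List<Long> pending() {
        int start = lastProcessed == null ? 0 : lineage.indexOf(lastProcessed) + 1;
        return new ArrayList<>(lineage.subList(start, lineage.size()));
    }

    // Records that everything up to and including snapshotId has been processed.
    void checkpoint(long snapshotId) {
        lastProcessed = snapshotId;
    }

    public static void main(String[] args) {
        ToyCheckpointedReader reader = new ToyCheckpointedReader();
        reader.commit(1L);
        reader.commit(2L);
        System.out.println(reader.pending()); // prints [1, 2]
        reader.checkpoint(2L);                // advance only after processing succeeded
        reader.commit(3L);
        System.out.println(reader.pending()); // prints [3]
    }
}
```

Because the checkpoint advances only through an explicit `checkpoint` call, a run that fails before that call sees the same pending snapshots on retry, so downstream processing in this model needs to tolerate re-reading them.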
Use Cases
IncrementalAppendScan is ideal for:
- Streaming ETL Pipelines - Process only new data since the last run
- Change Data Capture - Track appended records over time
- Incremental Analytics - Update aggregations with new data only
- Event Processing - Process new events in chronological order
- Data Replication - Sync only new data to downstream systems