ComputePartitionStats

The ComputePartitionStats action computes and writes partition statistics for an Iceberg table. This action helps optimize query planning by providing metadata about partition-level data characteristics.

Interface

public interface ComputePartitionStats extends Action<ComputePartitionStats, ComputePartitionStats.Result>

Overview

Partition statistics provide valuable metadata that query engines can use to optimize execution plans. The action:

Computes statistics for partitions in a table snapshot
Writes statistics to a dedicated statistics file
Uses the current snapshot by default
Can target a specific snapshot if needed
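
The default-versus-explicit snapshot rule in the list above can be pictured with a small, self-contained model. This is only a sketch: `SnapshotSelection` and `resolve` are hypothetical names, not part of Iceberg, and the class encodes nothing beyond the rule stated here (use the requested snapshot if one was given, otherwise the current one).

```java
import java.util.OptionalLong;

// Hypothetical stand-in, not an Iceberg class. It models only the selection rule
// described above and does not read any table metadata.
final class SnapshotSelection {
  private SnapshotSelection() {}

  // Returns the requested snapshot ID when present, else the current snapshot ID.
  static long resolve(OptionalLong requestedSnapshotId, long currentSnapshotId) {
    return requestedSnapshotId.orElse(currentSnapshotId);
  }
}
```

In this model, an empty `OptionalLong` corresponds to never calling `snapshot(...)` on the action.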

Methods

snapshot

Selects the table snapshot for which partition statistics are computed.

ComputePartitionStats snapshot(long snapshotId)

Parameters:

snapshotId - The ID of the snapshot for which stats need to be computed

Returns: this for method chaining

Example:

action.snapshot(1234567890L);

If not specified, the action uses the current snapshot of the table.

Result

The Result interface provides information about the computed statistics.

Methods

interface Result {
  PartitionStatisticsFile statisticsFile();
}

statisticsFile

Returns the statistics file that was written, or null if no statistics were collected.

Returns: PartitionStatisticsFile or null
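
Because statisticsFile() may return null, callers may prefer to handle the missing case in one place. The following is a minimal sketch of that idea using `Optional`; `StatsFileDescriptions` and `describe` are hypothetical names introduced for illustration, and the helper is generic so that it does not depend on any Iceberg type.

```java
import java.util.Optional;
import java.util.function.Function;

// Hypothetical helper, not part of Iceberg: turns a possibly-null statistics file
// into a human-readable message without an explicit null check at the call site.
final class StatsFileDescriptions {
  private StatsFileDescriptions() {}

  // statisticsFile may be null; pathOf extracts the file's path when it is present.
  static <F> String describe(F statisticsFile, Function<F, String> pathOf) {
    return Optional.ofNullable(statisticsFile)
        .map(pathOf)
        .map(path -> "Statistics file: " + path)
        .orElse("No statistics were collected");
  }
}
```

With an action result in hand, a call might look like `StatsFileDescriptions.describe(result.statisticsFile(), PartitionStatisticsFile::path)`.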

Usage Examples

Compute Stats for Current Snapshot

// Compute partition statistics for the current snapshot
ComputePartitionStats.Result result = actions
  .computePartitionStats(table)
  .execute();

PartitionStatisticsFile statsFile = result.statisticsFile();
if (statsFile != null) {
  System.out.println("Statistics file: " + statsFile.path());
  System.out.println("Snapshot ID: " + statsFile.snapshotId());
}

Compute Stats for Specific Snapshot

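// Compute partition statistics for a specific snapshot.
// This example targets the parent of the current snapshot. parentId() returns a
// nullable Long, so check it before unboxing to avoid a NullPointerException.
Long parentId = table.currentSnapshot().parentId();
if (parentId == null) {
  throw new IllegalStateException("Current snapshot has no parent");
}
long targetSnapshotId = parentId;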

ComputePartitionStats.Result result = actions
  .computePartitionStats(table)
  .snapshot(targetSnapshotId)
  .execute();

if (result.statisticsFile() != null) {
  System.out.println("Computed stats for snapshot: " + targetSnapshotId);
}

Check Statistics File Details

// Compute stats and examine the results
ComputePartitionStats.Result result = actions
  .computePartitionStats(table)
  .execute();

PartitionStatisticsFile statsFile = result.statisticsFile();
if (statsFile != null) {
  System.out.println("Statistics Details:");
  System.out.println("  Path: " + statsFile.path());
  System.out.println("  Snapshot: " + statsFile.snapshotId());
  System.out.println("  Size: " + statsFile.fileSizeInBytes() + " bytes");
} else {
  System.out.println("No statistics were collected");
}

Best Practices

Compute after significant data changes: Run this action after major writes or rewrites to keep statistics current
Use with query optimization: Ensure your query engine is configured to use partition statistics
Monitor statistics freshness: Outdated statistics may lead to suboptimal query plans
Consider snapshot selection: For historical analysis, compute stats on specific snapshots
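
One simple way to monitor freshness is to compare the snapshot recorded in the statistics file with the table's current snapshot. The sketch below assumes that "stale" means exactly that mismatch (or no statistics at all), which is a deliberately simple definition and not one the action itself prescribes; `StatsFreshness` and `isStale` are hypothetical names, not part of Iceberg.

```java
// Hypothetical helper, not part of Iceberg. Assumes statistics are "stale" when
// they are missing or were computed for a snapshot other than the current one.
final class StatsFreshness {
  private StatsFreshness() {}

  // statsSnapshotId is null when no statistics file exists yet.
  static boolean isStale(Long statsSnapshotId, long currentSnapshotId) {
    // The Long is unboxed here, so this compares numeric values, not references.
    return statsSnapshotId == null || statsSnapshotId != currentSnapshotId;
  }
}
```

In practice the two inputs would come from `statsFile.snapshotId()` and `table.currentSnapshot().snapshotId()`, and a `true` result would be the cue to rerun the action.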

Partition statistics are most beneficial for partitioned tables with selective query patterns.

Related

ComputeTableStats - Compute table-level statistics
ExpireSnapshots - Manage old snapshots
RewriteDataFiles - Optimize data file layout
