DeleteReachableFiles
The DeleteReachableFiles action deletes all files referenced by a table metadata file. Use it to completely clean up table storage after a table has been dropped and is no longer needed.
Interface
Overview
When a table is dropped, its metadata is removed from the catalog, but the underlying data files remain in storage. This action provides a way to irreversibly delete all reachable files, including:
- Data files
- Delete files (both equality and position deletes)
- Manifest files
- Manifest list files
- Metadata JSON files
- Version hint files
Methods
deleteWith
Provides a custom delete function for file removal.
deleteFunc - A function that accepts a file path and performs the deletion
Returns this for method chaining
Example:
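A minimal sketch of a delete function, assuming `deleteWith` accepts a `java.util.function.Consumer<String>` that receives each file path. The `DryRunDelete` class and the sample paths are hypothetical; the function records paths instead of removing anything, which can help when previewing what a run would delete. The Iceberg call itself appears only in a comment, so the block compiles with the JDK alone.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Consumer;

// Hypothetical helper: a delete function that only records the paths it is given.
public class DryRunDelete {
    // Thread-safe list, since the delete function may be called from several threads.
    private final List<String> recorded = new CopyOnWriteArrayList<>();

    /** Returns a delete function that records each path instead of removing the file. */
    public Consumer<String> deleteFunc() {
        return recorded::add;
    }

    /** Returns an immutable snapshot of the paths seen so far. */
    public List<String> recordedPaths() {
        return List.copyOf(recorded);
    }

    public static void main(String[] args) {
        DryRunDelete dryRun = new DryRunDelete();
        Consumer<String> deleteFunc = dryRun.deleteFunc();

        // With Iceberg and Spark on the classpath, the function would be passed to the
        // action (assumed usage, not compiled here):
        //   SparkActions.get()
        //       .deleteReachableFiles(metadataLocation)
        //       .deleteWith(deleteFunc)
        //       .execute();
        // Below, the function is invoked by hand with made-up paths.
        deleteFunc.accept("s3://bucket/table/data/file-a.parquet");
        deleteFunc.accept("s3://bucket/table/metadata/snap-1.avro");

        System.out.println("Would delete " + dryRun.recordedPaths().size() + " files");
    }
}
```

Because a custom delete function replaces the default removal, a dry run like this leaves every file in place.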
executeDeleteWith
Provides an executor service for parallel file deletion.
executorService - The executor service to use for parallel deletes
Returns this for method chaining
This executor service is only used if a custom delete function is provided via deleteWith() or if the FileIO doesn’t support bulk operations. Otherwise, parallelism is controlled by the FileIO’s bulk delete implementation.
io
Sets the FileIO to use for file removal.
io - The FileIO instance to use for file operations
Returns this for method chaining
Example:
Result
The Result interface provides statistics about the deletion operation.
Methods
deletedDataFilesCount
Returns the number of deleted data files.
deletedEqualityDeleteFilesCount
Returns the number of deleted equality delete files.
deletedPositionDeleteFilesCount
Returns the number of deleted position delete files.
deletedManifestsCount
Returns the number of deleted manifest files.
deletedManifestListsCount
Returns the number of deleted manifest list files.
deletedOtherFilesCount
Returns the number of deleted metadata JSON and version hint files.
Usage Examples
Basic Table Cleanup
With Custom Delete Function
With Parallel Execution
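The sketch below shows one way to prepare a bounded thread pool and a per-file delete function for use with `executeDeleteWith` and `deleteWith`. The class name, pool size, thread name, and the `deleteAll` loop are assumptions of this example. `deleteAll` only stands in for the action, which is expected to distribute the delete calls itself. The example works on local temporary files so that it runs with the JDK alone.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Consumer;

public class ParallelDeleteSketch {

    /** Creates a fixed-size pool of daemon threads; the caller must shut it down. */
    public static ExecutorService newDeletePool(int threads) {
        return Executors.newFixedThreadPool(threads, runnable -> {
            Thread thread = new Thread(runnable, "delete-reachable-files");
            thread.setDaemon(true);
            return thread;
        });
    }

    /** Example delete function for local paths; a missing file is not an error. */
    public static void deleteLocalFile(String path) {
        try {
            Files.deleteIfExists(Path.of(path));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Stand-in for the action: runs deleteFunc for every path on the pool and waits. */
    public static int deleteAll(List<String> paths, Consumer<String> deleteFunc, ExecutorService pool) {
        List<Future<?>> pending = new ArrayList<>();
        for (String path : paths) {
            pending.add(pool.submit(() -> deleteFunc.accept(path)));
        }
        try {
            for (Future<?> future : pending) {
                future.get(); // propagate any failure from the delete function
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while deleting", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("A delete task failed", e.getCause());
        }
        return pending.size();
    }

    public static void main(String[] args) throws IOException {
        List<String> paths = new ArrayList<>();
        for (int i = 0; i < 4; i++) {
            paths.add(Files.createTempFile("reachable-", ".tmp").toString());
        }
        ExecutorService pool = newDeletePool(2);
        try {
            // Assumed real usage: action.deleteWith(ParallelDeleteSketch::deleteLocalFile)
            //                           .executeDeleteWith(pool).execute();
            int submitted = deleteAll(paths, ParallelDeleteSketch::deleteLocalFile, pool);
            System.out.println("Submitted " + submitted + " delete calls");
        } finally {
            pool.shutdown(); // the pool is owned by the caller, not by the action
        }
    }
}
```

The pool is paired with a custom delete function here because, as noted under executeDeleteWith, the executor is only consulted when a custom delete function is set or the FileIO lacks bulk deletes.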
With Custom FileIO
Complete Cleanup with Verification
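One possible verification step is to total the counters reported by Result and flag a run that removed nothing. The sketch below uses a hypothetical `Counts` class as a stand-in for the six Result counters, so it compiles without Iceberg. The `looksComplete` heuristic and the sample numbers are assumptions of this example, not behavior defined by the action.

```java
public class CleanupVerification {

    /** Hypothetical stand-in for the six counters exposed by the action's Result. */
    public static final class Counts {
        public final long dataFiles;
        public final long equalityDeletes;
        public final long positionDeletes;
        public final long manifests;
        public final long manifestLists;
        public final long otherFiles;

        public Counts(long dataFiles, long equalityDeletes, long positionDeletes,
                      long manifests, long manifestLists, long otherFiles) {
            this.dataFiles = dataFiles;
            this.equalityDeletes = equalityDeletes;
            this.positionDeletes = positionDeletes;
            this.manifests = manifests;
            this.manifestLists = manifestLists;
            this.otherFiles = otherFiles;
        }

        /** Sum of all six counters. */
        public long total() {
            return dataFiles + equalityDeletes + positionDeletes
                    + manifests + manifestLists + otherFiles;
        }
    }

    /**
     * Heuristic (an assumption of this sketch): a run that started from a valid
     * metadata file should report at least one deleted metadata or version hint file.
     */
    public static boolean looksComplete(Counts counts) {
        return counts.total() > 0 && counts.otherFiles > 0;
    }

    /** Formats the counters for a log line. */
    public static String summarize(Counts counts) {
        return String.format(
                "data=%d, equalityDeletes=%d, positionDeletes=%d, manifests=%d, manifestLists=%d, other=%d, total=%d",
                counts.dataFiles, counts.equalityDeletes, counts.positionDeletes,
                counts.manifests, counts.manifestLists, counts.otherFiles, counts.total());
    }

    public static void main(String[] args) {
        // In real use, the values would come from result.deletedDataFilesCount() and the
        // other Result methods; these numbers are made up for illustration.
        Counts counts = new Counts(120, 3, 5, 12, 4, 6);
        System.out.println(summarize(counts));
        if (!looksComplete(counts)) {
            System.err.println("Nothing was deleted; check the metadata file path.");
        }
    }
}
```

A zero total usually deserves a second look at the metadata location before assuming the storage is clean.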
Best Practices
- Verify before deletion: Ensure the table is truly no longer needed before running this action
- Check catalog state: Confirm the table has been dropped from the catalog
- Backup metadata: Keep a backup of the metadata file if recovery might be needed
- Use correct metadata file: Point to the latest metadata file for the dropped table
- Monitor progress: Track deletion results to ensure all files are removed
- Test in non-production: Verify the action works as expected before running in production
Important Notes
- Metadata file required: You must provide the path to a metadata file, not a table identifier
- All snapshots deleted: All data from all snapshots will be removed
- No catalog interaction: This action only deletes files; it does not interact with the catalog
- Parallel execution: FileIOs that support bulk operations will handle parallelism automatically
Related
- DeleteOrphanFiles - Remove unreferenced orphan files
- ExpireSnapshots - Remove old snapshots while keeping the table
- RemoveDanglingDeleteFiles - Remove obsolete delete files