Commit c199c73

HADOOP-19576. S3A: Disable Purging Pending MPUs Before Directory Purge (#7722)
Contributed by Syed Shameerur Rahman
1 parent f93aff5 commit c199c73

2 files changed: +24 -6 lines changed

hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java

Lines changed: 1 addition & 2 deletions
@@ -684,9 +684,8 @@ public void initialize(URI name, Configuration originalConf)
       s3ExpressStore = isS3ExpressStore(bucket, endpoint);
 
       // should the delete also purge uploads?
-      // happens if explicitly enabled, or if the store is S3Express storage.
       dirOperationsPurgeUploads = conf.getBoolean(DIRECTORY_OPERATIONS_PURGE_UPLOADS,
-          s3ExpressStore);
+          DIRECTORY_OPERATIONS_PURGE_UPLOADS_DEFAULT);
 
       this.isMultipartUploadEnabled = conf.getBoolean(MULTIPART_UPLOADS_ENABLED,
           DEFAULT_MULTIPART_UPLOAD_ENABLED);
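
The effect of this hunk on the resolved setting can be illustrated with a small standalone sketch. It is not the S3A code: the class name, the `Map`-based configuration, and the `getBoolean` stand-in are hypothetical, and the value `false` for `DIRECTORY_OPERATIONS_PURGE_UPLOADS_DEFAULT` is an assumption, since the constant's definition is not part of this diff. The key name is the one used in the documentation change in this commit.

```java
import java.util.HashMap;
import java.util.Map;

public class PurgeUploadsDefaultSketch {

  // Key name as written in the troubleshooting documentation in this commit.
  static final String DIRECTORY_OPERATIONS_PURGE_UPLOADS =
      "fs.s3a.directory.operations.purge.uploads";

  // ASSUMPTION: the constant's value is not shown in the diff; false is assumed here.
  static final boolean DIRECTORY_OPERATIONS_PURGE_UPLOADS_DEFAULT = false;

  // Hypothetical, simplified stand-in for a getBoolean(key, default) lookup:
  // an unset or unparseable value falls back to the supplied default.
  static boolean getBoolean(Map<String, String> conf, String key, boolean defaultValue) {
    String value = conf.get(key);
    if (value == null) {
      return defaultValue;
    }
    value = value.trim();
    if ("true".equalsIgnoreCase(value)) {
      return true;
    }
    if ("false".equalsIgnoreCase(value)) {
      return false;
    }
    return defaultValue;
  }

  // Before the change: the fallback depended on whether the store is S3 Express.
  static boolean purgeBefore(Map<String, String> conf, boolean s3ExpressStore) {
    return getBoolean(conf, DIRECTORY_OPERATIONS_PURGE_UPLOADS, s3ExpressStore);
  }

  // After the change: the fallback is a fixed constant, independent of store type.
  static boolean purgeAfter(Map<String, String> conf) {
    return getBoolean(conf, DIRECTORY_OPERATIONS_PURGE_UPLOADS,
        DIRECTORY_OPERATIONS_PURGE_UPLOADS_DEFAULT);
  }

  public static void main(String[] args) {
    Map<String, String> unset = new HashMap<>();
    System.out.println("unset, S3 Express, before: " + purgeBefore(unset, true));
    System.out.println("unset, any store, after:   " + purgeAfter(unset));

    Map<String, String> explicit = new HashMap<>();
    explicit.put(DIRECTORY_OPERATIONS_PURGE_UPLOADS, "true");
    System.out.println("explicit true, after:      " + purgeAfter(explicit));
  }
}
```

Under these assumptions, only the unset case on an S3 Express store changes: purging was previously implied by the store type and now has to be requested explicitly. An explicitly configured value resolves the same way before and after.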

hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/troubleshooting_s3a.md

Lines changed: 23 additions & 4 deletions
@@ -1218,10 +1218,29 @@ java.io.FileNotFoundException: Completing multi-part upload on fork-5/test/multi
 This can happen when all outstanding uploads have been aborted, including the
 active ones.
 
-If the bucket has a lifecycle policy of deleting multipart uploads, make sure
-that the expiry time of the deletion is greater than that required for all open
-writes to complete the write,
-*and for all jobs using the S3A committers to commit their work.*
+When working with S3A committers and multipart uploads (MPUs), consider these important guidelines:
+
+1. **Bucket Lifecycle Policies:**
+   - If your bucket has a lifecycle policy for deleting multipart uploads
+   - Set the deletion expiry time long enough to:
+     - Complete all open write operations
+     - Allow S3A committers to finish their commit process
+
+2. **Directory Operations and MPUs:**
+   - Setting `fs.s3a.directory.operations.purge.uploads=true` will abort all pending MPUs before directory cleanup
+   - For jobs using S3A committers:
+     - Set `fs.s3a.directory.operations.purge.uploads=false` when directories need to be overwritten before job completion
+     - This prevents accidental abortion of active uploads during the commit phase
+
+
+### S3 Express Store directory object not getting deleted
+
+When working with S3 Express store buckets (unlike standard S3 buckets), follow these steps to purge a directory object:
+
+1. Set `fs.s3a.directory.operations.purge.uploads=true` if you need to delete a directory object that has pending multipart uploads (MPUs).
+
+2. This setting ensures that all pending MPUs are aborted before the directory object is deleted, which is a requirement specific to S3 Express store buckets.
+
 
 ### Application hangs after reading a number of files
 