Skip to content

Enabling Support for Conditional Multi-Part Upload #18093

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 17 commits into
base: main
Choose a base branch
from

Conversation

x-INFiN1TY-x
Copy link

@x-INFiN1TY-x x-INFiN1TY-x commented Apr 28, 2025

Enabling Support for Conditional Multi-Part Upload (#18093)

At a Glance:
Implement both synchronous and asynchronous conditional-upload methods in S3BlobContainer, backed by logical functions in AsyncTransferManager

Method Location Purpose/Functionality
executeMultipartUploadConditionally S3BlobContainer Conditional multipart upload (sync)
asyncBlobUploadConditionally S3BlobContainer Conditional streamed upload (async)
uploadObjectConditionally AsyncTransferManager Core async conditional upload implementation
Helpers in AsyncTransferManager   uploadInOneChunkConditionally
Helpers in AsyncTransferManager   uploadInPartsConditionally
Helpers in AsyncTransferManager   doUploadInPartsConditionally
Helpers in AsyncTransferManager   handleConditionalException

Summary of Changes

  • Add executeMultipartUploadConditionally(...) to S3BlobContainer

  • Add asyncBlobUploadConditionally(...) to S3BlobContainer

  • Implement uploadObjectConditionally(...) in AsyncTransferManager

  • Introduce helper routines: uploadInOneChunkConditionally, uploadInPartsConditionally, doUploadInPartsConditionally, and handleConditionalException

  • Define ConditionalWriteOptions to express “If-Match” and “If-None-Match” conditions

  • Define ConditionalWriteResponse to wrap returned ETag or propagate failures

  • Extend tests to cover conditional success, precondition failures, data integrity, and size boundaries


Core Workflow

  1. Validate part size (≥ 5 MB) and total upload size (≤ 5 TB).

  2. Initialize upload by creating a multipart request with bucket, key, ACL, storage class, metadata, and SSE (AES256 or KMS).

  3. Transfer Data

    • Sync: iterate parts sequentially using AWS SDK, buffering if retries are enabled.

    • Async: delegate to AsyncTransferManager.uploadObjectConditionally, which chooses single-chunk or multipart flow via its helper methods.

  4. Complete Conditionally by setting either If-Match or If-None-Match: * on the complete-upload request based on ConditionalWriteOptions.

  5. Respond with ConditionalWriteResponse.success(etag) on success or translate failures:

    • HTTP 412OpenSearchException("stale_primary_shard") + abort multipart.

    • Other errorsIOException (sync) or failed CompletableFuture/listener (async) + abort if uploadId exists.

    • Abort failures are logged but do not mask the original exception.


Test Coverage

executeMultipartUploadConditionally Test Suite

  • testExecuteMultipartUploadConditionallyWithEtagMatchSuccess

  • testExecuteMultipartUploadConditionallyWithMetadataAndSSE

  • testExecuteMultipartUploadConditionallyContentIntegrity

  • testExecuteMultipartUploadConditionallySizeValidation

  • testExecuteMultipartUploadConditionallyPreconditionFailed

AsyncTransferManager Conditional Upload Tests

  • testConditionalOneChunkUpload

  • testConditionalOneChunkUploadPreconditionFailed

  • testConditionalOneChunkUploadCorruption

  • testConditionalMultipartUploadPreconditionFailed

  • testConditionalMultipartUploadCorruption

Container-Level and Retry Tests

  • testFailureWhenLargeFileRedirectedConditional

  • testLargeFileRedirectedConditional

  • testLargeFileRedirectedConditionalPreconditionFailed

  • testLargeFileRedirectedConditionalConflict

  • testWriteBlobByStreamsConditionalFailure


Related Issue

Concerned RFC : RFC #17763
Parent Meta Issue : Meta #17859

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Flow Diagram for executeMultipartUploadConditionally:

Mermaid Chart - Create complex, visual diagrams with text  A smarter way of creating diagrams -2025-05-29-085837

Copy link
Contributor

❌ Gradle check result for 2c31c83: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@x-INFiN1TY-x
Copy link
Author

Flaky Test :: #17540 | #17551

  • classMethod – org.opensearch.repositories.s3.S3BlobContainerRetriesTests

@x-INFiN1TY-x x-INFiN1TY-x changed the title Support Conditional Writes for Multi-Part S3 Uploads Enabling Support for Conditional Multi-Part Upload [AWS S3] Apr 28, 2025
@x-INFiN1TY-x x-INFiN1TY-x marked this pull request as ready for review April 28, 2025 21:54
Copy link
Contributor

@Bukhtawar Bukhtawar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We also support async multipart upload(AsyncTransferManager has the logic). Do you want to extend that capability as well?
cc: @ashking94

@@ -509,6 +514,169 @@ private String buildKey(String blobName) {
return keyPath + blobName;
}

public void executeMultipartUploadIfEtagMatches(
Copy link
Contributor

@Bukhtawar Bukhtawar May 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where is this code getting consumed? I don't see a corresponding change in the generic blob store interface

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Diag

Sorry missed this, The executeMultipartUploadIfEtagMatches method (now renamed to executeMultipartUploadConditionally) is meant to be consumed via the writeBlobWithMetadataConditionally and writeBlobConditionally methods in the generic blob store interface. These interfaces are introduced in this PR currently under review.

@ashking94
Copy link
Member

We also support async multipart upload(AsyncTransferManager has the logic). Do you want to extend that capability as well? cc: @ashking94

Check #18064 (review). I had the same question. Will let this be addressed at both places.

@x-INFiN1TY-x x-INFiN1TY-x requested a review from a team as a code owner May 28, 2025 12:59
Copy link
Contributor

❌ Gradle check result for 9fdefaa: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Copy link
Contributor

❌ Gradle check result for d8d8f9a: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@x-INFiN1TY-x
Copy link
Author

Flaky Test :: #17551

classMethod – org.opensearch.repositories.s3.S3BlobContainerRetriesTests

@x-INFiN1TY-x x-INFiN1TY-x changed the title Enabling Support for Conditional Multi-Part Upload [AWS S3] Enabling Support for Conditional Multi-Part Upload May 29, 2025
Copy link
Contributor

❌ Gradle check result for 7a7e240: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Copy link
Contributor

❌ Gradle check result for 5cfeb6e: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Copy link
Contributor

❌ Gradle check result for cbc0188: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Copy link
Contributor

❌ Gradle check result for 0eedd2c: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@x-INFiN1TY-x
Copy link
Author

Flaky Test :: #14293

ResourceAwareTasksTests #14293

@x-INFiN1TY-x
Copy link
Author

We also support async multipart upload(AsyncTransferManager has the logic). Do you want to extend that capability as well? cc: @ashking94

Check #18064 (review). I had the same question. Will let this be addressed at both places.

Hi @Bukhtawar (and thanks again @ashking94 for the nudge)—I’ve now implemented the full async conditional-upload support alongside the sync flow:

  • S3BlobContainer: introduced asyncBlobUploadConditionally(...) as the async entry point.
  • AsyncTransferManager: implemented uploadObjectConditionally(...), dynamically selecting single-chunk or multipart flows.
  • Helper Methods in AsyncTransferManager: added uploadInOneChunkConditionally, uploadInPartsConditionally, doUploadInPartsConditionally, and handleConditionalException to encapsulate the core logic and error translation.

Comprehensive tests have been added to validate every path:

Test suite

  • testConditionalOneChunkUpload
  • testConditionalOneChunkUploadPreconditionFailed
  • testConditionalOneChunkUploadCorruption
  • testConditionalMultipartUploadPreconditionFailed
  • testConditionalMultipartUploadCorruption
  • testFailureWhenLargeFileRedirectedConditional
  • testLargeFileRedirectedConditional
  • testLargeFileRedirectedConditionalPreconditionFailed
  • testLargeFileRedirectedConditionalConflict
  • testWriteBlobByStreamsConditionalFailure

CC : @ashking94 @Bukhtawar @astute-decipher

Mermaid Chart - Create complex, visual diagrams with text  A smarter way of creating diagrams -2025-05-28-172238

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants