-
Notifications
You must be signed in to change notification settings - Fork 13.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[FLINK-37021][state/forst] Fix incorrect paths when reusing files for checkpoints. #26040
Conversation
@flinkbot run azure |
565b881
to
7230e0c
Compare
7230e0c
to
b62907e
Compare
|
||
// Try path-copying first. If failed, fallback to bytes-copying | ||
StreamStateHandle targetStateHandle = | ||
tryPathCopyingToCheckpoint(sourceStateHandle, checkpointStreamFactory, stateScope); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tryPathCopyingToCheckpoint can fail with an IOException as well as returning null. It would be better to always return null for a failure I think - otherwise for an IOException we will not try bytesCopyingToCheckpoint.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the comments. I've modified the code as suggested. PTAL
filePath.getName()); | ||
LOG.trace("Write file to checkpoint: {}, {}", filePath, handleAndLocalPath.getHandle()); | ||
return handleAndLocalPath; | ||
return HandleAndLocalPath.of(targetStateHandle, dbFilePath.getName()); | ||
} | ||
|
||
/** | ||
* Duplicate file to checkpoint storage by calling {@link CheckpointStreamFactory#duplicate} if | ||
* possible. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: javadoc @params etc are missing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added params and returns in the javadoc
@Nullable ForStFlinkFileSystem forStFlinkFileSystem, | ||
boolean isDbPathUnderCheckpointPathForSnapshot) { | ||
DataTransferStrategy strategy; | ||
if (forStFlinkFileSystem == null || isDbPathUnderCheckpointPathForSnapshot) { | ||
if (sharingFilesStrategy != SnapshotType.SharingFilesStrategy.FORWARD_BACKWARD |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think for ForStIncrementalSnapshotStrategy
, the SnapshotType.SharingFilesStrategy.FORWARD
is not supported since it breaks the precondition of file sharing between CP and DB.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed. I've added a check in ForStIncrementalSnapshotStrategy#asyncSnapshot()
. It throws an IllegalArgumentException
when encountering SharingFilesStrategy.FORWARD
.
c0637a0
to
7d2656e
Compare
Reviewed by Chi on 23/01/2025 Need a committer to review. Requested minor change has been made. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR, LGTM.
...kend-forst/src/main/java/org/apache/flink/state/forst/fs/filemapping/FileMappingManager.java
Show resolved
Hide resolved
7d2656e
to
a2bd9a9
Compare
What is the purpose of the change
This pull request addresses several bugs in ForSt that may prevent proper file reuse, leading to slower or failed snapshot and restoration operations.
Brief change log
DataTransferStrategyBuilder
function properly.URI
to tell whether twoFileSystem
are the same.SharingFilesStrategy
is 'FORWARD_BACKWARD'dbFilePath
andrealSourcePath
.Verifying this change
This change is already covered by existing tests, such as (please describe tests).
Does this pull request potentially affect one of the following parts:
@Public(Evolving)
: (no)Documentation