Skip to content

feat(writer): the RollingFileWriter closes files in background when concurrency is set#1571

Closed
fvaleye wants to merge 2 commits intoapache:mainfrom
fvaleye:feat/improve-rolling-writer-closing-in-background
Closed

feat(writer): the RollingFileWriter closes files in background when concurrency is set#1571
fvaleye wants to merge 2 commits intoapache:mainfrom
fvaleye:feat/improve-rolling-writer-closing-in-background

Conversation

@fvaleye
Copy link
Copy Markdown
Contributor

@fvaleye fvaleye commented Jul 31, 2025

Which issue does this PR close?

What changes are included in this PR?

My general approach:

  • Non-blocking rollover: new files start immediately while old ones close concurrently, if we reach the limit
  • Controlled concurrency: uses max_concurrent_closes to prevent memory exhaustion. By default, the default behavior is kept by having only one background task running.
  • Resource management: Automatic cleanup and proper error propagation when calling close()

The main idea suggested by the issue is to improve the RollingFileWriter to close files in the background, improving performance by allowing writes to continue without waiting for the previous file to be fully closed.

Key changes:

  • A JoinSet of all closing futures is now used to manage background closing tasks.
  • When rolling over, the writer spawns a task to close the previous file and immediately starts writing to a new one.
  • The close method now waits for all background tasks to complete before returning.
  • Concurrency can be configured via a parameter to control how many files are closed in parallel.

Are these changes tested?

Yes, with unit-testing and local benchmark tests

…oncurrency is set

Improve the RollingFileWriter close files in the background, improving performance by allowing writes to continue without waiting for the previous file to be fully closed.

Key changes:
- A JoinSet of all closing futures is now used to manage background closing tasks.
- When rolling over, the writer spawns a task to close the previous file and immediately starts writing to a new one.
- The close method now waits for all background tasks to complete before returning.
- Concurrency can be configured via a parameter to control how many files are closed in parallel.
@fvaleye fvaleye force-pushed the feat/improve-rolling-writer-closing-in-background branch from 522c786 to c7d051c Compare July 31, 2025 14:13
@fvaleye fvaleye force-pushed the feat/improve-rolling-writer-closing-in-background branch from c7d051c to 377674b Compare July 31, 2025 14:36
@liurenjie1024
Copy link
Copy Markdown
Contributor

Thanks @fvaleye for this pr, but I don't think we should do this now. As I said in previous discussion, such kind of optimization typically consumes more memory since an unclosed writer will hold memory in it. This optimization make things difficult to predict and manage when integrated with compute engine.

@fvaleye
Copy link
Copy Markdown
Contributor Author

fvaleye commented Aug 1, 2025

Thanks @fvaleye for this pr, but I don't think we should do this now. As I said in previous discussion, such kind of optimization typically consumes more memory since an unclosed writer will hold memory in it. This optimization make things difficult to predict and manage when integrated with compute engine.

Thanks for your feedback @liurenjie1024 !
I understand, that's exactly why I took a conservative approach, using default settings and flexible concurrency parameters with safeguards.
I will close this PR for now.

@fvaleye fvaleye closed this Aug 1, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Discussion: Optimize RollingFileWriter by closing asynchronously

2 participants