eagle3 cb impl with top-1 proposal #2740
Pull Request Overview
This PR implements Eagle speculative decoding functionality for top-1 proposal generation. The implementation adds support for Eagle3 mode, which enables accelerated text generation through speculative decoding with hidden state sharing between main and draft models.
Key changes include:
- Added Eagle decoding implementation with model transformation pipelines for hidden state extraction
- Integrated safetensor parsing for Eagle3 configuration data (d2t mappings)
- Extended continuous batching pipeline to support Eagle mode with hidden state management
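As a rough illustration of what top-1 proposal means for the acceptance step, the sketch below shows the standard greedy verification used in speculative decoding: the draft model proposes a single chain of tokens, the main model scores the same positions, and the longest matching prefix is accepted along with one token from the main model. The function name `accept_top1_proposal` and its signature are hypothetical and are not taken from this PR; the hidden state sharing between the two models is not modeled here.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper, not part of the PR.
// draft_tokens: the top-1 chain proposed by the draft model.
// main_tokens:  the main model's greedy token at each proposed position,
//               plus one extra token for the position after the chain
//               (assumed size: draft_tokens.size() + 1).
// Returns the tokens to append to the sequence after one verification step.
std::vector<int64_t> accept_top1_proposal(const std::vector<int64_t>& draft_tokens,
                                          const std::vector<int64_t>& main_tokens) {
    std::vector<int64_t> accepted;
    std::size_t i = 0;
    // Accept draft tokens while they agree with the main model.
    while (i < draft_tokens.size() && i < main_tokens.size() &&
           draft_tokens[i] == main_tokens[i]) {
        accepted.push_back(draft_tokens[i]);
        ++i;
    }
    // Append the main model's token at the first mismatch, or the bonus
    // token when every draft token was accepted.
    if (i < main_tokens.size()) {
        accepted.push_back(main_tokens[i]);
    }
    return accepted;
}
```

With this scheme every verification step yields at least one token from the main model, so the output matches plain greedy decoding while accepted draft tokens save main-model iterations.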
Reviewed Changes
Copilot reviewed 24 out of 25 changed files in this pull request and generated 8 comments.
| File | Description |
|---|---|
| tools/continuous_batching/accuracy/continuous_batching_eagle_decoding.cpp | New Eagle decoding accuracy test tool |
| src/cpp/src/speculative_decoding/speculative_decoding_impl.hpp | Added Eagle decoding class definitions and model transformation passes |
| src/cpp/src/speculative_decoding/speculative_decoding_impl.cpp | Core Eagle decoding implementation with model transformations |
| src/cpp/src/continuous_batching/pipeline.cpp | Integration of Eagle mode into pipeline construction |
| src/cpp/src/continuous_batching/model_runner.hpp | Added hidden state management functionality |
| samples/cpp/text_generation/eagle_speculative_lm.cpp | New Eagle speculative decoding sample |
| src/cpp/src/safe_tensor_wrapper.hpp | New safetensor parsing utilities |
Comments suppressed due to low confidence (1)
src/cpp/src/continuous_batching/model_runner.hpp:1
- This appears to be modifying the token index without bounds checking on the d2t array. Add bounds checking to prevent potential buffer overflow.
// Copyright (C) 2023-2025 Intel Corporation
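The bounds check requested in the suppressed comment could look roughly like the sketch below. It assumes that `d2t` holds one offset per draft-vocabulary token, so that the target-vocabulary id is `draft_token + d2t[draft_token]`, as in the reference EAGLE-3 code; the layout actually used in this PR may differ. The helper name `map_draft_to_target` and the `target_vocab_size` parameter are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical helper, not part of the PR.
// Assumption: d2t[i] is the offset that maps draft token i to its id in the
// target vocabulary (target = i + d2t[i]).
// Returns std::nullopt instead of reading out of range or producing an id
// outside the target vocabulary.
std::optional<int64_t> map_draft_to_target(int64_t draft_token,
                                           const std::vector<int64_t>& d2t,
                                           int64_t target_vocab_size) {
    // Reject indices that would read outside the d2t array.
    if (draft_token < 0 || static_cast<std::size_t>(draft_token) >= d2t.size()) {
        return std::nullopt;
    }
    const int64_t target = draft_token + d2t[static_cast<std::size_t>(draft_token)];
    // Reject mapped ids that fall outside the target vocabulary.
    if (target < 0 || target >= target_vocab_size) {
        return std::nullopt;
    }
    return target;
}
```

Whether an out-of-range token should be skipped, clamped, or treated as a hard error is a design choice for the pipeline; returning `std::nullopt` simply leaves that decision to the caller.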
Signed-off-by: fishbell <[email protected]>
Pull Request Overview
Copilot reviewed 28 out of 29 changed files in this pull request and generated 10 comments.
Pull Request Overview
Copilot reviewed 21 out of 21 changed files in this pull request and generated 3 comments.
Pull Request Overview
Copilot reviewed 21 out of 21 changed files in this pull request and generated 6 comments.
Please do NOT merge until we have a reasonably clear vision of the solution on the optimum-intel side. Model preparation can affect the IRs and their structure, so the solution proposed in this PR could become incompatible. We do not yet have a full architectural proposal for eagle3, and the model preparation still needs to pass arch review. @peterchen-intel, did you have an internal code review here? Is the code clean enough? Best regards,
@rkazants All 50 comments from reviewers (including 12 from Copilot) have been resolved; the code is clean now.
Pull Request Overview
Copilot reviewed 21 out of 21 changed files in this pull request and generated 6 comments.
Pull Request Overview
Copilot reviewed 21 out of 21 changed files in this pull request and generated 9 comments.
…genai into bell/eagle_cb_impl
…vino.genai into bell/eagle_cb_impl
This is not a blocking comment and will be addressed separately.
Pull Request Overview
Copilot reviewed 21 out of 21 changed files in this pull request and generated 4 comments.
Pull Request Overview
Copilot reviewed 21 out of 21 changed files in this pull request and generated 4 comments.
Signed-off-by: xuchen-intel <[email protected]>
Co-authored-by: Copilot <[email protected]>
Replaced by #3055.
…genai into bell/eagle_cb_impl