-
Notifications
You must be signed in to change notification settings - Fork 433
main add ascend scheduler support multimodal #2844
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request adds support for multimodal models in the AscendScheduler by implementing the scheduling of encoder inputs. The changes correctly remove the previous restriction and add the necessary logic for handling encoder inputs in both prefill and decode paths. However, I've identified a significant code duplication for the encoder cache allocation logic, which is present in both the prefill and decode scheduling loops. I've left a comment suggesting to refactor this duplicated code into a helper method to improve maintainability.
👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:
If CI fails, you can run linting and testing checks locally according Contributing and Testing. |
This pull request has conflicts, please resolve those before we can evaluate the pull request. |
please rebase to main to fix the merge conflict |
fix the merge conflict |
Codecov Report❌ Patch coverage is
❌ Your patch check has failed because the patch coverage (52.00%) is below the target coverage (80.00%). You can increase the patch coverage or adjust the target coverage. Additional details and impacted files@@ Coverage Diff @@
## main #2844 +/- ##
==========================================
+ Coverage 74.76% 75.02% +0.26%
==========================================
Files 150 154 +4
Lines 20891 21297 +406
==========================================
+ Hits 15620 15979 +359
- Misses 5271 5318 +47
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
This pull request has conflicts, please resolve those before we can evaluate the pull request. |
Signed-off-by: fan2956 <zhoufan53@huawei.com>
Signed-off-by: fan2956 <zhoufan53@huawei.com>
Signed-off-by: fan2956 <zhoufan53@huawei.com>
Signed-off-by: fan2956 <zhoufan53@huawei.com>
Signed-off-by: fan2956 <zhoufan53@huawei.com>
Signed-off-by: fan2956 <zhoufan53@huawei.com>
What this PR does / why we need it?
On main, AscendScheduler does not support Multimodels, becuse of lacking of scheduled_encoder_inputs which is need on multimodels inference
Does this PR introduce any user-facing change?
No
How was this patch tested?
vLLM version: main@93e28e6862669e3b5cf47cea9f782a65ec47e155