3 files changed: +1 −50 lines changed
lines changed Original file line number Diff line number Diff line change 104104# Future Plan:
105105# Remove this patch when vllm merged them.
106106#
107- # ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
108- # 1. `vllm.v1.sample.sampler.Sampler.gather_logprobs`
109- # Why:
110- # We need to patch gather_logprobs to make sure call batched_count_greater_than
111- # with backend=current_platform.simple_compile_backend
112- # How:
113- # Patch gather_logprobs call new batched_count_greater_than
114- # Related PR (if no, explain why):
115- # - https://github.com/vllm-project/vllm/pull/21591
116- # Future Plan:
117- # Revert it when vLLM merge #21591 and release new version
118- # ** File: worker/patch_logits.py **
119- # ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
120- # 1. `vllm._custom_ops.apply_repetition_penalties`
121- # Why:
122- # apply_repetition_penalties in vLLM use tensor.is_cuda to check if tensor is on cuda. But the value is always True
123- # on ascend, thus we need to patch apply_repetition_penalties.
124- # How:
125- # Remove the related cuda check in apply_repetition_penalties.
126- # Related PR (if no, explain why):
127- # - this is a bug by Ascend only. It can' be fixed in vLLM.
128- # Future Plan:
129- # Fix this bug in torch-npu, bump torch-npu version and remove this patch.
107+ # ** File: worker/patch_roberta.py **
130108# ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
131109# 1. `vllm.model_executor.models.roberta.RobertaEmbedding.forward`
132110# Why:
  # isort: off
  import vllm_ascend.patch.platform.patch_sched_yield  # noqa
  import vllm_ascend.patch.worker.patch_distributed  # noqa
- import vllm_ascend.patch.worker.patch_logits  # noqa
  import vllm_ascend.patch.worker.patch_roberta  # noqa
  import vllm_ascend.patch.worker.patch_weight_loader  # noqa
  import vllm_ascend.patch.worker.patch_multimodal_merge  # noqa
This file was deleted.