# Feature Support
The feature support principle of vLLM Ascend is to stay aligned with vLLM. We are also actively collaborating with the community to accelerate support.
You can check the support status of the vLLM V1 Engine. Below is the feature support status of vLLM Ascend:
| Feature | Status | Next Step |
|---|---|---|
| Chunked Prefill | 🟢 Functional | Functional, see detail note: Chunked Prefill |
| Automatic Prefix Caching | 🟢 Functional | Functional, see detail note: vllm-ascend#732 |
| LoRA | 🟢 Functional | |
| Prompt adapter | 🔴 No plan | This feature has been deprecated by vLLM. |
| Speculative decoding | 🟢 Functional | Basic support |
| Pooling | 🟢 Functional | CI needed and adapting more models; V1 support relies on vLLM support. |
| Enc-dec | 🟡 Planned | vLLM should support this feature first. |
| Multi Modality | 🟢 Functional | Tutorial, optimizing and adapting more models |
| LogProbs | 🟢 Functional | CI needed |
| Prompt logProbs | 🟢 Functional | CI needed |
| Async output | 🟢 Functional | CI needed |
| Multi step scheduler | 🔴 Deprecated | vllm#8779, replaced by the vLLM V1 Scheduler |
| Best of | 🔴 Deprecated | |
| Beam search | 🟢 Functional | CI needed |
| Guided Decoding | 🟢 Functional | |
| Tensor Parallel | 🟢 Functional | Make TP > 4 work with graph mode |
| Pipeline Parallel | 🟢 Functional | Write official guide and tutorial. |
| Expert Parallel | 🟢 Functional | Dynamic EPLB support. |
| Data Parallel | 🟢 Functional | Data Parallel support for Qwen3 MoE. |
| Prefill Decode Disaggregation | 🚧 WIP | Working on 1P1D and xPyD. |
| Quantization | 🟢 Functional | W8A8 available; working on support for more quantization methods (W4A8, etc.) |
| Graph Mode | 🔵 Experimental | Experimental, see detail note: vllm-ascend#767 |
| Sleep Mode | 🟢 Functional | |
- 🟢 Functional: Fully operational, with ongoing optimizations.
- 🔵 Experimental: Experimental support; interfaces and functions may change.
- 🚧 WIP: Under active development, will be supported soon.
- 🟡 Planned: Scheduled for future implementation (some may have open PRs/RFCs).
- 🔴 No plan / Deprecated: No plan for this feature, or it has been deprecated by vLLM.
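For features marked 🟢 Functional, the standard vLLM APIs apply unchanged on Ascend. Below is a minimal sketch that exercises two of them, Tensor Parallel and Sleep Mode, through vLLM's offline `LLM` entry point; the model name and parallel size are placeholder assumptions, not recommendations.

```python
from vllm import LLM, SamplingParams

# Assumed example model; substitute any model available locally.
llm = LLM(
    model="Qwen/Qwen2.5-7B-Instruct",
    tensor_parallel_size=4,   # Tensor Parallel: marked 🟢 Functional above
    enable_sleep_mode=True,   # Sleep Mode: marked 🟢 Functional above
)

# Run a short generation to confirm the engine is working.
outputs = llm.generate(
    ["Hello, my name is"],
    SamplingParams(max_tokens=32),
)
print(outputs[0].outputs[0].text)

# Sleep Mode releases device memory between workloads, then restores it.
llm.sleep(level=1)
llm.wake_up()
```

The same features are reachable through the OpenAI-compatible server via the matching CLI flags (e.g. `--tensor-parallel-size`).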