[Feature] Prefill Context Parallel (PCP) basic support (#28718)
Signed-off-by: QiuChunshuo <qiuchunshuo@huawei.com>
Signed-off-by: FENP <yuanyongjie.yyj@antgroup.com>
Signed-off-by: LookAround <lixushi@huawei.com>
Signed-off-by: Jingchun Gao <gaojingchun1@huawei.com>
Signed-off-by: zhenwenqi2024 <zhenwenqi_2022@qq.com>
Co-authored-by: FENP <yuanyongjie.yyj@antgroup.com>
Co-authored-by: LookAround <lixushi@huawei.com>
Co-authored-by: Jingchun Gao <gaojingchun1@huawei.com>
Co-authored-by: zhenwenqi2024 <zhenwenqi_2022@qq.com>
Co-authored-by: Jingchun Gao <63247409+gjc0824@users.noreply.github.com>
Easy, fast, and cheap LLM serving for everyone
| Documentation | Blog | Paper | Twitter/X | User Forum | Developer Slack |
Join us at the PyTorch Conference, October 22-23 and Ray Summit, November 3-5 in San Francisco for our latest updates on vLLM and to meet the vLLM team! Register now for the largest vLLM community events of the year!
Latest News 🔥
Previous News
About
vLLM is a fast and easy-to-use library for LLM inference and serving.
Originally developed in the Sky Computing Lab at UC Berkeley, vLLM has evolved into a community-driven project with contributions from both academia and industry.
vLLM is fast with:
vLLM is flexible and easy to use with:
vLLM seamlessly supports most popular open-source models on HuggingFace, including:
Find the full list of supported models here.
Getting Started
Install vLLM with pip or from source.
Visit our documentation to learn more.
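Once the package is installed (for example with `pip install vllm`), a first offline generation run might look like the sketch below. It uses the `LLM` and `SamplingParams` classes from the `vllm` package. The helper names `vllm_available` and `generate_demo`, the model ID `facebook/opt-125m`, and the sampling values are assumptions chosen for illustration, not recommendations from this README.

```python
import importlib.util


def vllm_available() -> bool:
    """Return True if the vllm package can be found in this environment."""
    return importlib.util.find_spec("vllm") is not None


def generate_demo(prompts, model="facebook/opt-125m"):
    """Generate one completion per prompt with vLLM's offline LLM class.

    The default model ID is an assumption for illustration; any model
    supported by vLLM should work in its place.
    """
    # Imported lazily so this file still loads where vLLM is not installed.
    from vllm import LLM, SamplingParams

    # Illustrative sampling settings (hypothetical values, tune as needed).
    sampling_params = SamplingParams(temperature=0.8, top_p=0.95, max_tokens=32)

    llm = LLM(model=model)  # downloads (if needed) and loads the model weights
    outputs = llm.generate(prompts, sampling_params)

    # Each result carries the original prompt and a list of completions;
    # keep the first completion's text for each prompt.
    return [(out.prompt, out.outputs[0].text) for out in outputs]
```

On a machine with a supported accelerator, calling `generate_demo(["Hello, my name is"])` should return a list of `(prompt, completion)` pairs. Checking `vllm_available()` first gives a clear signal when the install step has not been done yet.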
Contributing
We welcome and value any contributions and collaborations. Please check out Contributing to vLLM for how to get involved.
Sponsors
vLLM is a community project. Our compute resources for development and testing are supported by the following organizations. Thank you for your support!
Cash Donations:
Compute Resources:
Slack Sponsor: Anyscale
We also have an official fundraising venue through OpenCollective. We plan to use the funds to support the development, maintenance, and adoption of vLLM.
Citation
If you use vLLM for your research, please cite our paper:
Contact Us
Media Kit