The vLLM-MetaX GPU adaptation plugin is a hardware plugin built for the vLLM inference framework. Developed against the vLLM plugin specification, it enables vLLM to run seamlessly and efficiently on MetaX GPUs with a near-native CUDA experience, and it is the recommended way to support the MetaX backend in the vLLM community. The project is compatible with mainstream vLLM releases, supports single-node multi-GPU inference deployment, and remains compatible with the native LLM, Engine, and Kernel APIs as well as the OpenAI server interface.
vLLM MetaX Plugin
| About MetaX | Documentation | #sig-maca |
Latest News 🔥
About
vLLM MetaX is a hardware plugin that enables vLLM to run seamlessly on MetaX GPUs. MetaX provides a CUDA-like backend through MACA, delivering a near-native CUDA experience on MetaX hardware.
It is the recommended approach for supporting the MetaX backend within the vLLM community.
The plugin is implemented in accordance with the vLLM plugin RFCs, which help ensure proper feature and functionality support when integrating MetaX GPUs with vLLM.
Prerequisites
Getting Started
vLLM MetaX currently supports deployment only with Docker images released by the MetaX developer community, which work out of the box.
If you want to develop, debug, or test the latest features in vllm-metax, you may need to build it from source. Please follow this source build tutorial.
Branch
vllm-metax has three kinds of branches.
v0.1x.0-dev, indicating that the corresponding vllm-metax development branch has been fully tested and released. Below are the maintained branches:
For more details, please check the Quickstart Guide.
License
Apache License 2.0, as found in the LICENSE file.