Efficient Inference for LLMs & MLLMs
An open-source research project from Alibaba Cloud dedicated to efficient large language model inference.
```bash
# Clone the repository
git clone https://github.com/alibaba/EfficientAI.git
cd EfficientAI

# Install dependencies (recommended: use conda)
pip install -r requirements.txt

# Optional: install with specific module support
# pip install -e ".[larosa]"    # for LaRoSa
# pip install -e ".[masquant]"  # for MASQuant
```
🚀 EfficientAI
📋 Table of Contents
✨ Key Features
EfficientAI focuses on inference-time optimizations for LLMs and MLLMs:
🔥 Latest Updates
📰 Changelog (Click to expand)
[2026-03] 🎉 MASQuant accepted to CVPR 2026
→ Multimodal LLM PTQ algorithm with SOTA accuracy-efficiency tradeoff
📄 Paper | 💻 Code
[2026-02] 🚀 D-CORE open-sourced
→ Efficient tool-use reasoning via dynamic computation routing
📄 Paper | 💻 Code | 🎮 Demo
[2026-01] 🏆 LaRoSa accepted to ICML 2025
→ Training-free activation sparsity for LLM acceleration
📄 Paper | 💻 Code
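To give a rough feel for the idea behind training-free activation sparsity, here is a generic magnitude-based sketch in NumPy. This is an illustration only, not LaRoSa's actual algorithm; the function name, the `keep_ratio` parameter, and the thresholding scheme are all hypothetical (see the paper linked above for the real method).

```python
import numpy as np

def sparsify_activations(x, keep_ratio=0.5):
    """Zero out the smallest-magnitude activations, keeping a fixed fraction.

    A generic magnitude-based sketch of activation sparsity; skipping the
    zeroed entries in subsequent matmuls is where the speedup would come from.
    """
    flat = np.abs(x).ravel()
    k = max(1, int(keep_ratio * flat.size))
    # Threshold = k-th largest magnitude; values below it are dropped.
    threshold = np.partition(flat, -k)[-k]
    return np.where(np.abs(x) >= threshold, x, 0.0)

# Toy activation tensor: half of the entries survive.
x = np.array([[0.1, -2.0, 0.3],
              [1.5, -0.05, 0.8]])
sparse = sparsify_activations(x, keep_ratio=0.5)
```

Because the rule is a fixed threshold computed at inference time, no retraining or calibration pass is required, which is the sense in which such methods are "training-free".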
📦 Installation