Supported Models
Refer to the vLLM documentation for more details.
This page lists all models that have been tested with the plugin on Maca.
Feature Status Legend
- ✅︎ indicates that the feature is supported for the model.
- 🚧 indicates that the feature is planned but not yet supported for the model.
- ⚠️ indicates that the feature is available but may have known issues or limitations.
List of Text-only Language Models
Text Generative Models
| Architecture | Models | Example HF Models | LoRA | PP |
|---|---|---|---|---|
| AquilaForCausalLM | Aquila, Aquila2 | BAAI/Aquila-7B, BAAI/AquilaChat-7B, etc. | ✅︎ | ✅︎ |
| BaiChuanForCausalLM | Baichuan2, Baichuan | baichuan-inc/Baichuan2-13B-Chat, baichuan-inc/Baichuan-7B, etc. | ✅︎ | ✅︎ |
| BloomForCausalLM | BLOOM, BLOOMZ, BLOOMChat | bigscience/bloom, bigscience/bloomz, etc. | ✅︎ | |
| ChatGLMModel, ChatGLMForConditionalGeneration | ChatGLM | zai-org/chatglm2-6b, zai-org/chatglm3-6b, etc. | ✅︎ | ✅︎ |
| DeepseekForCausalLM | DeepSeek | deepseek-ai/deepseek-llm-7b-chat, etc. | ✅︎ | ✅︎ |
| DeepseekV2ForCausalLM | DeepSeek-V2 | deepseek-ai/DeepSeek-V2, deepseek-ai/DeepSeek-V2-Chat, etc. | ✅︎ | ✅︎ |
| DeepseekV3ForCausalLM | DeepSeek-V3 | deepseek-ai/DeepSeek-V3, deepseek-ai/DeepSeek-R1, deepseek-ai/DeepSeek-V3.1, etc. | ✅︎ | ✅︎ |
| Ernie4_5ForCausalLM | Ernie4.5 | baidu/ERNIE-4.5-0.3B-PT, etc. | ✅︎ | ✅︎ |
| Ernie4_5_MoeForCausalLM | Ernie4.5MoE | baidu/ERNIE-4.5-21B-A3B-PT, baidu/ERNIE-4.5-300B-A47B-PT, etc. | ✅︎ | ✅︎ |
| FalconForCausalLM | Falcon | tiiuae/falcon-7b, tiiuae/falcon-40b, tiiuae/falcon-rw-7b, etc. | ✅︎ | |
| GlmForCausalLM | GLM-4 | zai-org/glm-4-9b-chat-hf, etc. | ✅︎ | ✅︎ |
| Glm4ForCausalLM | GLM-4-0414 | zai-org/GLM-4-32B-0414, etc. | ✅︎ | ✅︎ |
| Glm4MoeForCausalLM | GLM-4.5, GLM-4.6 | zai-org/GLM-4.5, etc. | ✅︎ | ✅︎ |
| GPTBigCodeForCausalLM | StarCoder, SantaCoder, WizardCoder | bigcode/starcoder, bigcode/gpt_bigcode-santacoder, WizardLM/WizardCoder-15B-V1.0, etc. | ✅︎ | ✅︎ |
| GPTNeoXForCausalLM | GPT-NeoX, Pythia, OpenAssistant, Dolly V2, StableLM | EleutherAI/gpt-neox-20b, EleutherAI/pythia-12b, OpenAssistant/oasst-sft-4-pythia-12b-epoch-3.5, databricks/dolly-v2-12b, stabilityai/stablelm-tuned-alpha-7b, etc. | ✅︎ | |
| InternLMForCausalLM | InternLM | internlm/internlm-7b, internlm/internlm-chat-7b, etc. | ✅︎ | ✅︎ |
| InternLM2ForCausalLM | InternLM2 | internlm/internlm2-7b, internlm/internlm2-chat-7b, etc. | ✅︎ | ✅︎ |
| InternLM3ForCausalLM | InternLM3 | internlm/internlm3-8b-instruct, etc. | ✅︎ | ✅︎ |
| LlamaForCausalLM | Llama 3.1, Llama 3, Llama 2, LLaMA, Yi | meta-llama/Meta-Llama-3.1-405B-Instruct, meta-llama/Meta-Llama-3.1-70B, meta-llama/Meta-Llama-3-70B-Instruct, meta-llama/Llama-2-70b-hf, 01-ai/Yi-34B, etc. | ✅︎ | ✅︎ |
| MixtralForCausalLM | Mixtral-8x7B, Mixtral-8x7B-Instruct | mistralai/Mixtral-8x7B-v0.1, mistralai/Mixtral-8x7B-Instruct-v0.1, mistral-community/Mixtral-8x22B-v0.1, etc. | ✅︎ | ✅︎ |
| MPTForCausalLM | MPT, MPT-Instruct, MPT-Chat, MPT-StoryWriter | mosaicml/mpt-7b, mosaicml/mpt-7b-storywriter, mosaicml/mpt-30b, etc. | ✅︎ | |
| QWenLMHeadModel | Qwen | Qwen/Qwen-7B, Qwen/Qwen-7B-Chat, etc. | ✅︎ | ✅︎ |
| Qwen2ForCausalLM | QwQ, Qwen2 | Qwen/QwQ-32B-Preview, Qwen/Qwen2-7B-Instruct, Qwen/Qwen2-7B, etc. | ✅︎ | ✅︎ |
| Qwen2MoeForCausalLM | Qwen2MoE | Qwen/Qwen1.5-MoE-A2.7B, Qwen/Qwen1.5-MoE-A2.7B-Chat, etc. | ✅︎ | ✅︎ |
| Qwen3ForCausalLM | Qwen3 | Qwen/Qwen3-8B, etc. | ✅︎ | ✅︎ |
| Qwen3MoeForCausalLM | Qwen3MoE | Qwen/Qwen3-30B-A3B, etc. | ✅︎ | ✅︎ |
| Qwen3NextForCausalLM | Qwen3NextMoE | Qwen/Qwen3-Next-80B-A3B-Instruct, etc. | ✅︎ | ✅︎ |
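
The LoRA and PP columns map onto engine options in vLLM. The following is a minimal sketch, assuming the plugin is installed and that the standard `vllm.LLM` keyword arguments `enable_lora` and `pipeline_parallel_size` behave on Maca as they do upstream (this page does not confirm that). The helper names `build_engine_kwargs` and `run` are hypothetical and exist only for illustration.

```python
from typing import Any


def build_engine_kwargs(model: str, *, enable_lora: bool = False,
                        pipeline_parallel_size: int = 1) -> dict[str, Any]:
    """Collect keyword arguments for vllm.LLM (hypothetical helper)."""
    if pipeline_parallel_size < 1:
        raise ValueError("pipeline_parallel_size must be >= 1")
    kwargs: dict[str, Any] = {"model": model}
    if enable_lora:
        # Only meaningful for models marked as supported in the LoRA column.
        kwargs["enable_lora"] = True
    if pipeline_parallel_size > 1:
        # Only meaningful for models marked as supported in the PP column.
        kwargs["pipeline_parallel_size"] = pipeline_parallel_size
    return kwargs


def run(prompt: str, **engine_kwargs: Any) -> str:
    """Generate one completion; needs vLLM and a supported device."""
    from vllm import LLM, SamplingParams  # imported lazily: heavy dependency

    llm = LLM(**engine_kwargs)
    outputs = llm.generate([prompt], SamplingParams(max_tokens=32))
    return outputs[0].outputs[0].text


engine_kwargs = build_engine_kwargs("Qwen/Qwen3-8B", enable_lora=True)
# run("Hello, my name is", **engine_kwargs)  # uncomment where vLLM is installed
```

Keeping the vLLM import inside `run` lets the argument-building step be checked on a machine without an accelerator.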
List of Multimodal Language Models
The following modalities are supported depending on the model:
- Text (T)
- Image (I)
- Video (V)
- Audio (A)
Any combination of modalities joined by + is supported.
- e.g.: T + I means that the model supports text-only, image-only, and text-with-image inputs.
On the other hand, modalities separated by / are mutually exclusive.
- e.g.: T / I means that the model supports text-only and image-only inputs, but not text-with-image inputs.
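
To make the two operators concrete, the sketch below enumerates the input combinations that a notation string allows. It is an illustration only: `allowed_inputs` and `describe` are hypothetical helpers, the parser accepts bare modality letters without annotations, and treating `/` as binding tighter than `+` in mixed expressions is an assumption rather than something this page states.

```python
from itertools import combinations, product


def allowed_inputs(spec: str) -> set[frozenset[str]]:
    """Return every allowed set of modalities for a spec such as 'T + I'."""
    # '+' separates groups that may be combined freely;
    # '/' separates mutually exclusive alternatives inside one group.
    groups = [[m.strip() for m in part.split("/")] for part in spec.split("+")]
    result: set[frozenset[str]] = set()
    for size in range(1, len(groups) + 1):
        for chosen in combinations(groups, size):
            # Pick exactly one alternative from each chosen group.
            for pick in product(*chosen):
                result.add(frozenset(pick))
    return result


def describe(spec: str) -> list[str]:
    """Render the allowed combinations as sorted, readable strings."""
    return sorted("+".join(sorted(combo)) for combo in allowed_inputs(spec))


print(describe("T + I"))  # ['I', 'I+T', 'T']
print(describe("T / I"))  # ['I', 'T']
```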
See this page on how to pass multi-modal inputs to the model.
Text Generative Models
| Architecture | Models | Inputs | Example HF Models | LoRA | PP |
|---|---|---|---|---|---|
| DeepseekVLV2ForCausalLM<sup>^</sup> | DeepSeek-VL2 | T + I<sup>+</sup> | deepseek-ai/deepseek-vl2-tiny, deepseek-ai/deepseek-vl2-small, deepseek-ai/deepseek-vl2, etc. | ✅︎ | |
| DeepseekOCRForCausalLM | DeepSeek-OCR | T + I<sup>+</sup> | deepseek-ai/DeepSeek-OCR, etc. | ✅︎ | |
| Ernie4_5_VLMoeForConditionalGeneration | Ernie4.5-VL | T + I<sup>+</sup>/ V<sup>+</sup> | baidu/ERNIE-4.5-VL-28B-A3B-PT, baidu/ERNIE-4.5-VL-424B-A47B-PT | ✅︎ | |
| GLM4VForCausalLM<sup>^</sup> | GLM-4V | T + I | zai-org/glm-4v-9b, zai-org/cogagent-9b-20241220, etc. | ✅︎ | ✅︎ |
| Glm4vForConditionalGeneration | GLM-4.1V-Thinking | T + I<sup>E+</sup> + V<sup>E+</sup> | zai-org/GLM-4.1V-9B-Thinking, etc. | ✅︎ | ✅︎ |
| Glm4vMoeForConditionalGeneration | GLM-4.5V | T + I<sup>E+</sup> + V<sup>E+</sup> | zai-org/GLM-4.5V, etc. | ✅︎ | ✅︎ |
| InternS1ForConditionalGeneration | Intern-S1 | T + I<sup>E+</sup> + V<sup>E+</sup> | internlm/Intern-S1, internlm/Intern-S1-mini, etc. | ✅︎ | ✅︎ |
| InternVLChatModel | InternVL 3.5, InternVL 3.0, InternVideo 2.5, InternVL 2.5, Mono-InternVL, InternVL 2.0 | T + I<sup>E+</sup> + (V<sup>E+</sup>) | OpenGVLab/InternVL3_5-14B, OpenGVLab/InternVL3-9B, OpenGVLab/InternVideo2_5_Chat_8B, OpenGVLab/InternVL2_5-4B, OpenGVLab/Mono-InternVL-2B, OpenGVLab/InternVL2-4B, etc. | ✅︎ | ✅︎ |
| InternVLForConditionalGeneration | InternVL 3.0 (HF format) | T + I<sup>E+</sup> + V<sup>E+</sup> | OpenGVLab/InternVL3-1B-hf, etc. | ✅︎ | ✅︎ |
| LlavaForConditionalGeneration | LLaVA-1.5, Pixtral (HF Transformers) | T + I<sup>E+</sup> | llava-hf/llava-1.5-7b-hf, TIGER-Lab/Mantis-8B-siglip-llama3 (see note), mistral-community/pixtral-12b, etc. | ✅︎ | |
| LlavaNextForConditionalGeneration | LLaVA-NeXT | T + I<sup>E+</sup> | llava-hf/llava-v1.6-mistral-7b-hf, llava-hf/llava-v1.6-vicuna-7b-hf, etc. | ✅︎ | |
| LlavaNextVideoForConditionalGeneration | LLaVA-NeXT-Video | T + V | llava-hf/LLaVA-NeXT-Video-7B-hf, etc. | ✅︎ | |
| LlavaOnevisionForConditionalGeneration | LLaVA-Onevision | T + I<sup>+</sup> + V<sup>+</sup> | llava-hf/llava-onevision-qwen2-7b-ov-hf, llava-hf/llava-onevision-qwen2-0.5b-ov-hf, etc. | ✅︎ | |
| Qwen2_5_VLForConditionalGeneration | Qwen2.5-VL | T + I<sup>E+</sup> + V<sup>E+</sup> | Qwen/Qwen2.5-VL-3B-Instruct, Qwen/Qwen2.5-VL-72B-Instruct, etc. | ✅︎ | ✅︎ |
| Qwen2_5OmniThinkerForConditionalGeneration | Qwen2.5-Omni | T + I<sup>E+</sup> + V<sup>E+</sup> + A<sup>+</sup> | Qwen/Qwen2.5-Omni-3B, Qwen/Qwen2.5-Omni-7B | ✅︎ | ✅︎ |
| Qwen3VLForConditionalGeneration | Qwen3-VL | T + I<sup>E+</sup> + V<sup>E+</sup> | Qwen/Qwen3-VL-4B-Instruct, etc. | ✅︎ | ✅︎ |
| Qwen3VLMoeForConditionalGeneration | Qwen3-VL-MOE | T + I<sup>E+</sup> + V<sup>E+</sup> | Qwen/Qwen3-VL-30B-A3B-Instruct, etc. | ✅︎ | ✅︎ |
| Qwen3OmniMoeThinkerForConditionalGeneration | Qwen3-Omni | T + I<sup>E+</sup> + V<sup>E+</sup> + A<sup>+</sup> | Qwen/Qwen3-Omni-30B-A3B-Instruct, Qwen/Qwen3-Omni-30B-A3B-Thinking | ✅︎ | ✅︎ |
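
For a model from the table above, a text-with-image request could be assembled roughly as follows. This is a sketch under stated assumptions: it relies on the upstream vLLM prompt dictionary with a `multi_modal_data` key, assumes the plugin accepts it unchanged, and uses a LLaVA-1.5 style prompt with an `<image>` placeholder that will differ for other models. `build_image_request` and `generate` are hypothetical helper names.

```python
from typing import Any


def build_image_request(prompt: str, images: list[Any]) -> dict[str, Any]:
    """Build one vLLM prompt dict; images are PIL images in real use."""
    if not images:
        return {"prompt": prompt}  # text-only request
    # A single image is passed directly, several images as a list.
    image_data = images[0] if len(images) == 1 else list(images)
    return {"prompt": prompt, "multi_modal_data": {"image": image_data}}


def generate(model: str, prompt: str, image_paths: list[str]) -> str:
    """Run one request; needs vLLM, Pillow, and a supported device."""
    from PIL import Image  # imported lazily: optional dependencies
    from vllm import LLM, SamplingParams

    images = [Image.open(path).convert("RGB") for path in image_paths]
    llm = LLM(model=model,
              limit_mm_per_prompt={"image": max(1, len(images))})
    request = build_image_request(prompt, images)
    outputs = llm.generate([request], SamplingParams(max_tokens=64))
    return outputs[0].outputs[0].text


# Assumed prompt template for llava-hf/llava-1.5-7b-hf; check the model card.
PROMPT = "USER: <image>\nWhat is shown in this image?\nASSISTANT:"
# generate("llava-hf/llava-1.5-7b-hf", PROMPT, ["example.jpg"])
```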