Registering a Model

vllm-metax reuses the models that are already registered in vLLM.

If your model is not among them, or you have customized a registered model, you must register it with vLLM. This page explains how to do so.

Out-of-tree models

You can load an external model using a plugin without modifying the vLLM codebase.

To register the model, use the following code:

# The entrypoint of your plugin
def register():
    from vllm import ModelRegistry
    from your_code import YourModelForCausalLM

    ModelRegistry.register_model("YourModelForCausalLM", YourModelForCausalLM)
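
For vLLM to call `register`, the plugin package has to expose it as an entry point. The following is a minimal `setup.py` sketch, assuming vLLM's `vllm.general_plugins` entry-point group. The package name `your_plugin`, the version, and the entry-point name `register_your_model` are hypothetical placeholders, and `register` is assumed to be defined in `your_code/__init__.py`.

```python
# setup.py for a hypothetical plugin package (names are placeholders)
from setuptools import setup

setup(
    name="your_plugin",          # hypothetical distribution name
    version="0.1.0",             # hypothetical version
    packages=["your_code"],      # package that defines register()
    entry_points={
        # Assumed group: vLLM loads general plugins from this entry-point group
        "vllm.general_plugins": [
            # "<entry point name> = <module>:<function>"
            "register_your_model = your_code:register",
        ],
    },
)
```

After installing the package into the same environment as vLLM (for example with `pip install .`), the function should be picked up when vLLM loads its plugins.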

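If your model imports modules that initialize CUDA, consider lazy-importing it to avoid errors like RuntimeError: Cannot re-initialize CUDA in forked subprocess. To do so, pass the model class as a string in the form <module>:<class> instead of importing it in the entrypoint: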

# The entrypoint of your plugin
def register():
    from vllm import ModelRegistry

    ModelRegistry.register_model(
        "YourModelForCausalLM",
        "your_code:YourModelForCausalLM",
    )
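
To illustrate why the string form defers the import, here is a minimal sketch of how a `<module>:<class>` reference can be resolved on demand. It is not vLLM's actual implementation: `resolve_lazy` is a hypothetical helper, and a standard-library class stands in for `your_code:YourModelForCausalLM`.

```python
import importlib


def resolve_lazy(ref: str):
    """Resolve a "module:ClassName" string to the object it names.

    Hypothetical helper for illustration only; the module is imported
    when this function is called, not when the string is registered.
    """
    module_name, sep, attr = ref.partition(":")
    if not sep or not module_name or not attr:
        raise ValueError(f"expected 'module:ClassName', got {ref!r}")
    module = importlib.import_module(module_name)  # the import happens here
    return getattr(module, attr)


# Standard-library stand-in for "your_code:YourModelForCausalLM"
cls = resolve_lazy("collections:OrderedDict")
instance = cls()
```

Because only a string is stored at registration time, any CUDA-initializing imports in the model's module are deferred until the class is actually needed.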

Note

This is best practice on vllm-metax. Please review it here.