Overview
The Outlines integration provides structured generation and control for models run locally or served via SGLang, using the Outlines library. There is no API key or base URL; you build an OutlinesModel from a concrete backend (Transformers, LlamaCpp, MLXLM, SGLang, or vLLM offline). Tool calls are not supported; JSON schema and JSON object output are supported.
Model Class: OutlinesModel
Authentication
No API key or environment variables are required. For SGLang, configure base_url and an optional api_key in OutlinesModel.from_sglang().
Examples
From Transformers (local):
Model Settings
You can set model parameters on the model or on the Agent. Supported parameters depend on the backend (Transformers, LlamaCpp, SGLang, vLLMOffline). On the model:
Parameters
Supported settings vary by backend. Base options:

| Parameter | Type | Description | Default | Backends |
|---|---|---|---|---|
| max_tokens | int | Maximum tokens to generate | Model default | Transformers, LlamaCpp, SGLang, vLLMOffline |
| temperature | float | Sampling temperature | 1.0 | Transformers, LlamaCpp, SGLang, vLLMOffline |
| top_p | float | Nucleus sampling | 1.0 | Transformers, LlamaCpp, SGLang, vLLMOffline |
| seed | int | Random seed | None | LlamaCpp, vLLMOffline |
| presence_penalty | float | Token presence penalty | 0.0 | LlamaCpp, SGLang, vLLMOffline |
| frequency_penalty | float | Token frequency penalty | 0.0 | LlamaCpp, SGLang, vLLMOffline |
| logit_bias | dict[str, int] | Logit bias per token | None | Transformers, LlamaCpp, vLLMOffline |
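
Because support differs per backend, it can help to check a settings dictionary against the table before passing it on. The following is a minimal sketch in plain Python: the `SUPPORTED_BACKENDS` mapping is transcribed from the table above, while `filter_settings` is a hypothetical helper written for illustration and is not part of the library. How the library itself handles an unsupported setting is not stated here, so treat this only as a pre-check.

```python
# Backend support per parameter, transcribed from the table above.
ALL_BACKENDS = {"Transformers", "LlamaCpp", "SGLang", "vLLMOffline"}
SUPPORTED_BACKENDS = {
    "max_tokens": ALL_BACKENDS,
    "temperature": ALL_BACKENDS,
    "top_p": ALL_BACKENDS,
    "seed": {"LlamaCpp", "vLLMOffline"},
    "presence_penalty": {"LlamaCpp", "SGLang", "vLLMOffline"},
    "frequency_penalty": {"LlamaCpp", "SGLang", "vLLMOffline"},
    "logit_bias": {"Transformers", "LlamaCpp", "vLLMOffline"},
}


def filter_settings(backend, settings):
    """Hypothetical helper: split settings into those the table lists for
    `backend` and the names of those it does not."""
    kept = {}
    dropped = []
    for name, value in settings.items():
        if backend in SUPPORTED_BACKENDS.get(name, set()):
            kept[name] = value
        else:
            dropped.append(name)
    return kept, dropped


requested = {
    "max_tokens": 256,
    "temperature": 0.2,
    "seed": 7,
    "presence_penalty": 0.5,
}
kept, dropped = filter_settings("Transformers", requested)
print(kept)     # {'max_tokens': 256, 'temperature': 0.2}
print(dropped)  # ['seed', 'presence_penalty']
```

Per the table, `seed` and `presence_penalty` are not listed for Transformers, so the sketch reports them separately and leaves the caller to decide whether to warn or fail.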

