
What are LLM Models?

Large Language Models (LLMs) are the foundation of the Upsonic AI Agent Framework. The framework provides a unified interface to interact with various LLM providers, allowing you to build AI agents that can leverage different models without changing your code structure.

Model Architecture

In Upsonic, all model classes inherit from the base Model class, which provides:
  • Unified Interface: Consistent API across all providers (see the sketch after this list)
  • LCEL Integration: Models implement the Runnable interface for chain composition
  • Streaming Support: Real-time response streaming for better UX
  • Tool Calling: Native function calling capabilities
  • Structured Output: Type-safe responses using Pydantic models
  • Memory Management: Built-in conversation history support
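
The listing below is a minimal sketch of the unified interface in action: the same agent code runs against different providers by changing only the model string. It uses infer_model, Agent, and Task exactly as they appear in the usage patterns later on this page; the model identifiers are just examples:
from upsonic import Agent, Task, infer_model

# The same agent code works with any provider; only the model identifier changes.
for model_id in ["openai/gpt-4o", "anthropic/claude-3-5-sonnet-20241022"]:
    model = infer_model(model_id)
    agent = Agent(model=model)
    result = agent.do(Task("Summarize the benefits of a unified model interface"))
    print(model_id, "->", result)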

Key Components

1. Model Settings

Model settings control the behavior of LLM requests:
from upsonic.models.settings import ModelSettings

settings = ModelSettings(
    max_tokens=2048,
    temperature=0.7,
    top_p=0.9,
    seed=42
)
All settings are optional, and provider-specific settings are prefixed with the provider name (e.g., openai_, anthropic_, google_).
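
For example, common settings can be combined with a provider-prefixed one. This sketch reuses the OpenAIChatModelSettings class and the openai_reasoning_effort field from the custom-settings example further down this page; treat the specific values as placeholders:
from upsonic.models.openai import OpenAIChatModelSettings

# Common settings plus an OpenAI-specific setting (note the "openai_" prefix).
settings = OpenAIChatModelSettings(
    max_tokens=512,
    temperature=0.2,
    openai_reasoning_effort="low"
)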

2. Model Profiles

Profiles define model capabilities and behaviors:
from upsonic.profiles import ModelProfile

profile = ModelProfile(
    supports_tools=True,
    supports_json_schema_output=True,
    default_structured_output_mode='native'
)
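
The profile can then be inspected before deciding how to call a model. This sketch only reads the fields set above; how Upsonic consumes a profile internally is not shown here:
# Branch on the capabilities declared in the profile above.
if profile.supports_tools:
    print("This model can call tools")
if profile.supports_json_schema_output:
    print(f"Preferred structured output mode: {profile.default_structured_output_mode}")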

3. Model Inference

Use infer_model() to automatically select the appropriate model class:
from upsonic import infer_model

# Automatic provider detection
model = infer_model("openai/gpt-4o")
model = infer_model("anthropic/claude-3-5-sonnet-20241022")
model = infer_model("google-gla/gemini-2.5-flash")

Usage Patterns

Basic Usage

from upsonic import Agent, Task, infer_model

model = infer_model("openai/gpt-4o")
agent = Agent(model=model)

task = Task("Explain quantum computing")
result = agent.do(task)

With Custom Settings

from upsonic.models.openai import OpenAIChatModel, OpenAIChatModelSettings

settings = OpenAIChatModelSettings(
    max_tokens=1024,
    temperature=0.5,
    openai_reasoning_effort="high"
)

model = OpenAIChatModel(
    model_name="gpt-4o",
    settings=settings
)

agent = Agent(model=model)

LCEL Chains

from upsonic.lcel import ChatPromptTemplate
from upsonic import infer_model

prompt = ChatPromptTemplate.from_template("Tell me about {topic}")
model = infer_model("openai/gpt-4o")

# Chain composition with pipe operator
chain = prompt | model
result = await chain.ainvoke({"topic": "AI"})

Error Handling

The framework provides comprehensive error handling for LLM operations:

Common Exceptions

ModelHTTPError

Raised when an HTTP error occurs during model requests:
from upsonic.utils.package.exception import ModelHTTPError

try:
    result = agent.do(task)
except ModelHTTPError as e:
    print(f"Status Code: {e.status_code}")
    print(f"Model: {e.model_name}")
    print(f"Body: {e.body}")

UserError

Raised for user-facing configuration or usage errors:
from upsonic.utils.package.exception import UserError

try:
    model = infer_model("unknown/model")
except UserError as e:
    print(f"Error: {e}")

UnexpectedModelBehavior

Raised when a model responds in an unexpected way:
from upsonic.utils.package.exception import UnexpectedModelBehavior

try:
    response = await model.request_stream(messages, settings, params)
except UnexpectedModelBehavior as e:
    print(f"Unexpected behavior: {e}")

Handling Rate Limits

import asyncio
from upsonic.utils.package.exception import ModelHTTPError

async def request_with_retry(agent, task, max_retries=3):
    for attempt in range(max_retries):
        try:
            return agent.do(task)
        except ModelHTTPError as e:
            if e.status_code == 429:  # Rate limit
                wait_time = 2 ** attempt  # Exponential backoff
                print(f"Rate limited. Waiting {wait_time}s...")
                await asyncio.sleep(wait_time)
            else:
                raise
    raise Exception("Max retries exceeded")
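
Calling the helper then looks like this; agent and task are assumed to come from the earlier examples:
import asyncio

async def main():
    result = await request_with_retry(agent, task)
    print(result)

asyncio.run(main())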

Handling Token Limits

from upsonic.models.settings import ModelSettings

settings = ModelSettings(
    max_tokens=4096,
    # Will raise error if input + output exceeds context window
)

try:
    model = infer_model("openai/gpt-4o")
    agent = Agent(model=model, settings=settings)
    result = agent.do(very_long_task)
except ModelHTTPError as e:
    if "context_length_exceeded" in str(e.body):
        print("Input too long, consider truncating")

Handling Invalid Responses

from pydantic import BaseModel, ValidationError

class ResponseFormat(BaseModel):
    answer: str
    confidence: float

try:
    model = infer_model("openai/gpt-4o")
    model = model.with_structured_output(ResponseFormat)
    result = await model.ainvoke("What is AI?")
except ValidationError as e:
    print(f"Invalid response format: {e}")

Global Error Handling

Disable model requests globally for testing:
from upsonic.models import override_allow_model_requests

# Disable requests (useful for testing)
with override_allow_model_requests(False):
    try:
        result = agent.do(task)
    except RuntimeError as e:
        print("Model requests are disabled")

Best Practices

  1. Always Use Environment Variables: Store API keys in environment variables, never hardcode them (see the sketch after this list)
  2. Implement Retry Logic: Network errors and rate limits are common; implement exponential backoff (see Handling Rate Limits above)
  3. Monitor Token Usage: Track usage to avoid unexpected costs
  4. Handle Timeouts: Set appropriate timeouts based on your use case
  5. Validate Outputs: Use structured output with Pydantic models for type safety
  6. Log Errors: Implement comprehensive logging for debugging
  7. Use Streaming: For better UX, use streaming responses when available
  8. Test Error Paths: Write tests that cover error scenarios
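
As an illustration of the first point, read API keys from the environment and fail fast if they are missing. OPENAI_API_KEY is the conventional variable name for the OpenAI provider; whether infer_model picks it up automatically depends on the provider integration, so check the provider documentation:
import os
from upsonic import infer_model

# Never hardcode keys in source; require them to be present in the environment.
if "OPENAI_API_KEY" not in os.environ:
    raise RuntimeError("Set OPENAI_API_KEY before creating the model")

model = infer_model("openai/gpt-4o")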

Next Steps