Overview

NVIDIA NIM (NVIDIA Inference Microservices) provides access to models from many vendors through an OpenAI-compatible API. Access models from Meta, Google, Mistral, DeepSeek, Qwen, and more through NVIDIA's optimized inference platform.

Model class: OpenAIChatModel (OpenAI-compatible API)

Authentication

export NVIDIA_API_KEY="nvapi-..."  # Required (or use NGC_API_KEY)
export NVIDIA_BASE_URL="https://integrate.api.nvidia.com/v1"  # Optional, this is the default
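The key is read from the environment, with NGC_API_KEY accepted as an alternative. A minimal sketch of that lookup order (the helper name and the precedence of NVIDIA_API_KEY over NGC_API_KEY are assumptions, not Upsonic's actual resolution code):

```python
import os

def resolve_nvidia_key(env=None):
    # Prefer NVIDIA_API_KEY; fall back to NGC_API_KEY (assumed order).
    env = os.environ if env is None else env
    return env.get("NVIDIA_API_KEY") or env.get("NGC_API_KEY")

# The fallback kicks in when only NGC_API_KEY is set.
print(resolve_nvidia_key({"NGC_API_KEY": "nvapi-example"}))  # nvapi-example
```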

Examples

from upsonic import Agent, Task
from upsonic.models.nvidia import NvidiaModel

model = NvidiaModel(model_name="meta/llama-3.1-8b-instruct")

agent = Agent(model=model)
task = Task("Hello, how are you?")
result = agent.do(task)

print(result)

Model Settings

You can set model parameters in two ways: on the model or on the Agent.

On the model:

from upsonic import Agent, Task
from upsonic.models.nvidia import NvidiaModel, NvidiaModelSettings

model = NvidiaModel(
    model_name="meta/llama-3.1-8b-instruct",
    settings=NvidiaModelSettings(max_tokens=1024, temperature=0.7)
)
agent = Agent(model=model)

On the Agent:

from upsonic import Agent, Task
from upsonic.models.nvidia import NvidiaModelSettings

agent = Agent(
    model="nvidia/meta/llama-3.1-8b-instruct",
    settings=NvidiaModelSettings(max_tokens=1024, temperature=0.7)
)
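The Agent form above addresses the model with a single provider-prefixed string rather than a model object. A sketch of how such an identifier decomposes (illustrative helper only; Upsonic's real parsing may differ):

```python
def split_model_id(model_id: str) -> tuple[str, str]:
    # Split on the first "/" only: the provider prefix,
    # then the vendor-qualified model name that NIM expects.
    provider, _, name = model_id.partition("/")
    return provider, name

print(split_model_id("nvidia/meta/llama-3.1-8b-instruct"))
# ('nvidia', 'meta/llama-3.1-8b-instruct')
```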

Parameters

| Parameter | Type | Description | Default | Source |
|---|---|---|---|---|
| max_tokens | int | Maximum tokens to generate | Model default | Base |
| temperature | float | Sampling temperature | Model default | Base |
| top_p | float | Nucleus sampling | Model default | Base |
| seed | int | Random seed | None | Base |
| stop_sequences | list[str] | Stop sequences | None | Base |
| presence_penalty | float | Token presence penalty | 0.0 | Base |
| frequency_penalty | float | Token frequency penalty | 0.0 | Base |
| parallel_tool_calls | bool | Allow parallel tools | True | Base |
| timeout | float | Request timeout (seconds) | Model default | Base |
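Because NIM speaks the OpenAI-compatible chat-completions protocol, the base parameters above map almost one-to-one onto standard request fields (stop_sequences becomes stop; timeout is a client option rather than a request field). A sketch of that mapping, using a hypothetical helper rather than Upsonic's internals:

```python
def settings_to_request_body(model_name, messages, settings):
    # Map base settings onto OpenAI-style chat-completion request fields.
    # `timeout` is deliberately absent: it configures the HTTP client,
    # not the request payload.
    body = {"model": model_name, "messages": messages}
    mapping = {
        "max_tokens": "max_tokens",
        "temperature": "temperature",
        "top_p": "top_p",
        "seed": "seed",
        "stop_sequences": "stop",  # OpenAI-compatible APIs call this "stop"
        "presence_penalty": "presence_penalty",
        "frequency_penalty": "frequency_penalty",
        "parallel_tool_calls": "parallel_tool_calls",
    }
    for ours, theirs in mapping.items():
        if settings.get(ours) is not None:
            body[theirs] = settings[ours]
    return body

body = settings_to_request_body(
    "meta/llama-3.1-8b-instruct",
    [{"role": "user", "content": "Hello"}],
    {"max_tokens": 1024, "temperature": 0.7, "stop_sequences": ["\n\n"]},
)
print(body["stop"])  # ['\n\n']
```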