Overview

Ollama provides local embedding models that run entirely on your machine, with support for models such as nomic-embed-text, mxbai-embed-large, and snowflake-arctic-embed. There are no API costs, and everything works offline; the only requirement is an Ollama server running locally.

Provider Class: OllamaEmbedding
Config Class: OllamaEmbeddingConfig

Dependencies

pip install aiohttp requests
Requires an Ollama server running locally; install it from ollama.ai.
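
Before building a knowledge base, it can help to verify that the Ollama server is reachable and that the embedding model has already been pulled. The sketch below queries Ollama's standard /api/tags model-listing endpoint; the helper name and structure are illustrative and not part of Upsonic.

```python
import requests

def ollama_model_available(model: str, base_url: str = "http://localhost:11434") -> bool:
    """Return True if the Ollama server is up and already has `model` pulled."""
    try:
        resp = requests.get(f"{base_url}/api/tags", timeout=5)
        resp.raise_for_status()
    except requests.RequestException:
        return False  # server not running or unreachable
    names = [m.get("name", "") for m in resp.json().get("models", [])]
    # Ollama tags model names like "nomic-embed-text:latest"
    return any(n == model or n.startswith(model + ":") for n in names)

if __name__ == "__main__":
    if not ollama_model_available("nomic-embed-text"):
        print("Start the server (ollama serve) or pull the model (ollama pull nomic-embed-text).")
```

If the model is missing and auto_pull_model=True, the provider will pull it on first use, so this check is a convenience rather than a requirement.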

Examples

from upsonic import Agent, Task, KnowledgeBase
from upsonic.embeddings import OllamaEmbedding, OllamaEmbeddingConfig
from upsonic.vectordb import ChromaProvider, ChromaConfig, ConnectionConfig, Mode

# Create embedding provider
embedding = OllamaEmbedding(OllamaEmbeddingConfig(
    model_name="nomic-embed-text",
    base_url="http://localhost:11434",
    auto_pull_model=True
))

# Setup KnowledgeBase with an in-memory Chroma store
# (vector_size=768 matches nomic-embed-text's output dimension)
vectordb = ChromaProvider(ChromaConfig(
    collection_name="ollama_docs",
    vector_size=768,
    connection=ConnectionConfig(mode=Mode.IN_MEMORY)
))

kb = KnowledgeBase(
    sources=["document.txt"],
    embedding_provider=embedding,
    vectordb=vectordb
)

# Query with Agent
agent = Agent("openai/gpt-4o")
task = Task("What is this document about?", context=[kb])
result = agent.do(task)
print(result)
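
Under the hood, embedding requests go to the Ollama HTTP API. A minimal sketch of a direct call to the /api/embeddings endpoint (independent of Upsonic, shown only to illustrate what the provider wraps) looks like this:

```python
import requests

def embed_text(text: str, model: str = "nomic-embed-text",
               base_url: str = "http://localhost:11434"):
    """Request an embedding straight from the Ollama HTTP API.

    Returns the embedding vector, or None if the server is unreachable.
    """
    try:
        resp = requests.post(
            f"{base_url}/api/embeddings",
            json={"model": model, "prompt": text},
            # (connect, read) timeouts mirroring the provider's defaults
            timeout=(10, 120),
        )
        resp.raise_for_status()
    except requests.RequestException:
        return None
    return resp.json().get("embedding")

vec = embed_text("What is this document about?")
if vec is not None:
    print(f"Got a {len(vec)}-dimensional embedding")  # 768 for nomic-embed-text
```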

Parameters

| Parameter | Type | Description | Default | Source |
|---|---|---|---|---|
| model_name | str | Ollama embedding model name | "nomic-embed-text" | Specific |
| base_url | str | Ollama server URL | "http://localhost:11434" | Specific |
| auto_pull_model | bool | Automatically pull model if not available | True | Specific |
| keep_alive | str \| None | Keep model loaded for duration | "5m" | Specific |
| temperature | float \| None | Model temperature | None | Specific |
| top_p | float \| None | Top-p sampling | None | Specific |
| num_ctx | int \| None | Context window size | None | Specific |
| request_timeout | float | Request timeout in seconds | 120.0 | Specific |
| connection_timeout | float | Connection timeout in seconds | 10.0 | Specific |
| enable_keep_alive | bool | Keep model loaded between requests | True | Specific |
| enable_model_preload | bool | Preload model on startup | True | Specific |
| batch_size | int | Batch size for document embedding | 100 | Base |
| max_retries | int | Maximum number of retries on failure | 3 | Base |
| normalize_embeddings | bool | Whether to normalize embeddings to unit length | True | Base |
| show_progress | bool | Whether to show progress during batch operations | True | Base |
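
With normalize_embeddings=True, each vector is scaled to unit (L2) length, so cosine similarity between two embeddings reduces to a plain dot product. A small dependency-free sketch of what normalization does (illustrative only, not Upsonic's internal code):

```python
import math

def normalize(vec):
    """Scale a vector to unit (L2) length; leave zero vectors unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    return vec if norm == 0.0 else [x / norm for x in vec]

a = normalize([3.0, 4.0])  # [0.6, 0.8]
b = normalize([1.0, 0.0])
# With unit vectors, cosine similarity is just the dot product:
cos_sim = sum(x * y for x, y in zip(a, b))
print(cos_sim)  # 0.6
```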