
Overview

FastEmbed provides fast, local embedding models powered by the ONNX runtime. It supports GPU acceleration, sparse embeddings, and multiple model architectures, including BGE, E5, and multilingual models. Because inference runs entirely locally, there are no API costs.

Provider Class: FastEmbedProvider
Config Class: FastEmbedConfig

Dependencies

pip install fastembed
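If you plan to set enable_gpu=True, the separate GPU build of FastEmbed is likely required as well (an assumption based on upstream fastembed packaging; verify the package name against the FastEmbed documentation for your version):

```shell
# GPU build of FastEmbed (assumed package name; CUDA-capable hardware required)
pip install fastembed-gpu
```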

Examples

from upsonic import Agent, Task, KnowledgeBase
from upsonic.embeddings import FastEmbedProvider, FastEmbedConfig
from upsonic.vectordb import ChromaProvider, ChromaConfig, ConnectionConfig, Mode

# Create embedding provider
embedding = FastEmbedProvider(FastEmbedConfig(
    model_name="BAAI/bge-small-en-v1.5",
    enable_gpu=True
))

# Setup KnowledgeBase
vectordb = ChromaProvider(ChromaConfig(
    collection_name="fastembed_docs",
    vector_size=384,
    connection=ConnectionConfig(mode=Mode.IN_MEMORY)
))

kb = KnowledgeBase(
    sources=["document.txt"],
    embedding_provider=embedding,
    vectordb=vectordb
)

# Query with Agent
agent = Agent("openai/gpt-4o")
task = Task("What is this document about?", context=[kb])
result = agent.do(task)
print(result)

Parameters

| Parameter | Type | Description | Default | Source |
| --- | --- | --- | --- | --- |
| model_name | str | FastEmbed model name | "BAAI/bge-small-en-v1.5" | Specific |
| cache_dir | str \| None | Model cache directory | None | Specific |
| threads | int \| None | Number of threads (auto-detected if None) | None | Specific |
| providers | list[str] | ONNX execution providers | ["CPUExecutionProvider"] | Specific |
| enable_gpu | bool | Enable GPU acceleration if available | False | Specific |
| enable_parallel_processing | bool | Enable parallel text processing | True | Specific |
| doc_embed_type | str | Document embedding type ("default" or "passage") | "default" | Specific |
| max_memory_mb | int \| None | Maximum memory usage in MB | None | Specific |
| model_warmup | bool | Warm up model on initialization | True | Specific |
| enable_sparse_embeddings | bool | Use sparse embeddings for better performance | False | Specific |
| sparse_model_name | str \| None | Sparse model name if different from dense | None | Specific |
| batch_size | int | Batch size for document embedding | 100 | Base |
| max_retries | int | Maximum number of retries on failure | 3 | Base |
| normalize_embeddings | bool | Whether to normalize embeddings to unit length | True | Base |
| show_progress | bool | Whether to show progress during batch operations | True | Base |
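The sparse-embedding and batching options above can be combined in a single config. The following is a configuration sketch, not a definitive setup; the sparse model name shown is an assumption — check the FastEmbed model list for models available in your installed version:

```python
from upsonic.embeddings import FastEmbedProvider, FastEmbedConfig

# Sketch: dense model plus a sparse model for hybrid retrieval.
# "Qdrant/bm25" is an assumed sparse model identifier -- verify it
# against the FastEmbed supported-model list before use.
embedding = FastEmbedProvider(FastEmbedConfig(
    model_name="BAAI/bge-small-en-v1.5",   # dense model
    enable_sparse_embeddings=True,          # also produce sparse vectors
    sparse_model_name="Qdrant/bm25",        # assumed; None reuses defaults
    batch_size=64,                          # documents embedded per batch
    normalize_embeddings=True,              # unit-length dense vectors
))
```

This provider can then be passed to KnowledgeBase exactly as in the example above.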