Overview

Upsonic supports 8 vector database providers out of the box. Each provider shares a common configuration interface (BaseVectorDBConfig) while exposing provider-specific options for advanced tuning. All providers support dense vector search, full-text search, and hybrid search through a unified API — so you can switch providers without changing your application logic.

Provider Comparison

| Provider | Deployment | Index Types | Hybrid Search | Best For |
| --- | --- | --- | --- | --- |
| Chroma | Embedded, Local, Cloud | HNSW, Flat | Yes | Quick prototyping, embedded apps |
| FAISS | Local file | HNSW, IVF, Flat | Yes | High-performance local search, large datasets |
| Qdrant | Embedded, Local, Cloud | HNSW, Flat | Yes | Production workloads, advanced filtering |
| Milvus | Embedded (Lite), Local, Cloud | HNSW, IVF, Flat | Yes | Scalable production, distributed search |
| Pinecone | Cloud only | Managed | Yes | Fully managed, zero-ops production |
| Weaviate | Embedded, Local, Cloud | HNSW, Flat | Yes | Schema-based collections, AI modules |
| PGVector | PostgreSQL | HNSW, IVF | Yes | Existing PostgreSQL infrastructure |
| SuperMemory | Cloud (managed) | Managed | Yes | Zero-config RAG (no embedding provider needed) |

Quick Start

Every provider follows the same pattern — swap the provider and config to switch vector databases:
```python
from upsonic import Agent, Task, KnowledgeBase
from upsonic.embeddings import OpenAIEmbedding, OpenAIEmbeddingConfig
from upsonic.vectordb import ChromaProvider, ChromaConfig, ConnectionConfig, Mode

# 1. Create embedding provider
embedding = OpenAIEmbedding(OpenAIEmbeddingConfig())

# 2. Configure and create vector database
vectordb = ChromaProvider(ChromaConfig(
    collection_name="my_kb",
    vector_size=1536,
    connection=ConnectionConfig(mode=Mode.EMBEDDED, db_path="./db")
))

# 3. Create knowledge base with documents
kb = KnowledgeBase(
    sources=["document.pdf"],
    embedding_provider=embedding,
    vectordb=vectordb
)

# 4. Query with Agent and Task
agent = Agent("anthropic/claude-sonnet-4-5")
task = Task(
    description="What are the key points in the document?",
    context=[kb]
)

result = agent.do(task)
print(result)
```

Connection Modes

Most providers support multiple connection modes via ConnectionConfig:
| Mode | Description | Use Case |
| --- | --- | --- |
| `Mode.IN_MEMORY` | Ephemeral, no persistence | Testing, prototyping |
| `Mode.EMBEDDED` | Local file storage | Development, single-node apps |
| `Mode.LOCAL` | Connect to local server | Self-hosted deployments |
| `Mode.CLOUD` | Connect to cloud service | Production, managed services |
Not all providers use ConnectionConfig. FAISS uses db_path directly, Pinecone uses api_key + environment, and PGVector uses connection_string. SuperMemory only requires an api_key. See each provider’s page for details.
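For providers that do use `ConnectionConfig`, moving between modes is a one-line change. A minimal sketch, using the Chroma classes shown in the Quick Start; note that the server-mode parameters (`host`, `port`) are illustrative assumptions, so check the Chroma provider page for the exact field names:

```python
from upsonic.vectordb import ChromaProvider, ChromaConfig, ConnectionConfig, Mode

# Development: embedded mode, vectors persisted to a local directory
dev_db = ChromaProvider(ChromaConfig(
    collection_name="my_kb",
    vector_size=1536,
    connection=ConnectionConfig(mode=Mode.EMBEDDED, db_path="./db"),
))

# Self-hosted: connect to a running Chroma server
# (host/port parameter names are assumptions, not confirmed API)
prod_db = ChromaProvider(ChromaConfig(
    collection_name="my_kb",
    vector_size=1536,
    connection=ConnectionConfig(mode=Mode.LOCAL, host="localhost", port=8000),
))
```

Because the rest of the configuration is unchanged, the `KnowledgeBase` built on top of either instance behaves identically.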

Shared Configuration

All providers inherit these base parameters from BaseVectorDBConfig:
| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| `collection_name` | `str` | `"default_collection"` | Name of the collection |
| `vector_size` | `int` | (required) | Dimension of vectors (must match your embedding model) |
| `distance_metric` | `DistanceMetric` | `COSINE` | Similarity metric: `COSINE`, `EUCLIDEAN`, or `DOT_PRODUCT` |
| `recreate_if_exists` | `bool` | `False` | Recreate collection if it already exists |
| `default_top_k` | `int` | `10` | Default number of results returned |
| `default_similarity_threshold` | `float \| None` | `None` | Minimum similarity score (0.0-1.0) |
| `dense_search_enabled` | `bool` | `True` | Enable dense vector search |
| `full_text_search_enabled` | `bool` | `True` | Enable full-text search |
| `hybrid_search_enabled` | `bool` | `True` | Enable hybrid search |
| `default_hybrid_alpha` | `float` | `0.5` | Alpha for hybrid search blending (0.0 = full-text, 1.0 = dense) |
| `default_fusion_method` | `str` | `'weighted'` | Fusion method: `'rrf'` or `'weighted'` |
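To make the two fusion methods concrete, here is a minimal, library-independent sketch of how `'weighted'` blending (controlled by `default_hybrid_alpha`) and reciprocal rank fusion (`'rrf'`) typically combine dense and full-text results. Upsonic's internal implementation may differ; the RRF constant `k=60` is the conventional default from the literature, not a confirmed Upsonic value:

```python
def weighted_fusion(dense, text, alpha=0.5):
    """Blend normalized scores: alpha=1.0 -> pure dense, alpha=0.0 -> pure full-text."""
    ids = set(dense) | set(text)
    return {
        doc: alpha * dense.get(doc, 0.0) + (1 - alpha) * text.get(doc, 0.0)
        for doc in ids
    }

def rrf_fusion(dense_ranked, text_ranked, k=60):
    """Reciprocal rank fusion: each result list contributes 1 / (k + rank)."""
    scores = {}
    for ranked in (dense_ranked, text_ranked):
        for rank, doc in enumerate(ranked, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    return scores

# Normalized scores for three hypothetical documents
dense = {"a": 0.9, "b": 0.4}
text = {"b": 0.8, "c": 0.6}

blended = weighted_fusion(dense, text, alpha=0.5)
best = max(blended, key=blended.get)  # "b": 0.5*0.4 + 0.5*0.8 = 0.6
```

Note that with weighted fusion, a document found by both search types ("b" above) can outrank one with a single high score ("a"), which is the usual motivation for hybrid search.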

Choosing a Provider

For prototyping and development:
  • Chroma (embedded mode) or FAISS — zero infrastructure, fast iteration
For production with managed infrastructure:
  • Pinecone — fully managed, auto-scaling, zero ops
  • SuperMemory — zero-config (handles embeddings too)
For production with self-hosted infrastructure:
  • Qdrant or Milvus — feature-rich, scalable, battle-tested
  • Weaviate — if you need schema-based collections and AI modules
For existing PostgreSQL setups:
  • PGVector — adds vector search to your existing database