Overview

Upsonic supports 8 vector database providers out of the box. Each provider shares a common configuration interface (BaseVectorDBConfig) while exposing provider-specific options for advanced tuning. All providers support dense vector search, full-text search, and hybrid search through a unified API — so you can switch providers without changing your application logic.

Provider Comparison

| Provider | Deployment | Index Types | Hybrid Search | Best For |
| --- | --- | --- | --- | --- |
| Chroma | Embedded, Local, Cloud | HNSW, Flat | Yes | Quick prototyping, embedded apps |
| FAISS | Local file | HNSW, IVF, Flat | Yes | High-performance local search, large datasets |
| Qdrant | Embedded, Local, Cloud | HNSW, Flat | Yes | Production workloads, advanced filtering |
| Milvus | Embedded (Lite), Local, Cloud | HNSW, IVF, Flat | Yes | Scalable production, distributed search |
| Pinecone | Cloud only | Managed | Yes | Fully managed, zero-ops production |
| Weaviate | Embedded, Local, Cloud | HNSW, Flat | Yes | Schema-based collections, AI modules |
| PGVector | PostgreSQL | HNSW, IVF | Yes | Existing PostgreSQL infrastructure |
| SuperMemory | Cloud (managed) | Managed | Yes | Zero-config RAG (no embedding provider needed) |

Quick Start

Every provider follows the same pattern — swap the provider and config to switch vector databases:
```python
from upsonic import Agent, Task, KnowledgeBase
from upsonic.embeddings import OpenAIEmbedding, OpenAIEmbeddingConfig
from upsonic.vectordb import ChromaProvider, ChromaConfig, ConnectionConfig, Mode

# 1. Create embedding provider
embedding = OpenAIEmbedding(OpenAIEmbeddingConfig())

# 2. Configure and create vector database
vectordb = ChromaProvider(ChromaConfig(
    collection_name="my_kb",
    vector_size=1536,
    connection=ConnectionConfig(mode=Mode.EMBEDDED, db_path="./db")
))

# 3. Create knowledge base with documents
kb = KnowledgeBase(
    sources=["document.pdf"],
    embedding_provider=embedding,
    vectordb=vectordb
)

# 4. Query with Agent and Task
agent = Agent("anthropic/claude-sonnet-4-5")
task = Task(
    description="What are the key points in the document?",
    context=[kb]
)

result = agent.do(task)
print(result)
```

Connection Modes

Most providers support multiple connection modes via ConnectionConfig:
| Mode | Description | Use Case |
| --- | --- | --- |
| `Mode.IN_MEMORY` | Ephemeral, no persistence | Testing, prototyping |
| `Mode.EMBEDDED` | Local file storage | Development, single-node apps |
| `Mode.LOCAL` | Connect to local server | Self-hosted deployments |
| `Mode.CLOUD` | Connect to cloud service | Production, managed services |
Not all providers use ConnectionConfig. FAISS uses db_path directly, Pinecone uses api_key + environment, and PGVector uses connection_string. SuperMemory only requires an api_key. See each provider’s page for details.
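For providers that do use `ConnectionConfig`, moving between modes is a one-line change. A minimal sketch, using the Chroma classes shown in the Quick Start; note that the server-mode parameters (`host`, `port`) are illustrative assumptions, so check the Chroma provider page for the exact field names:

```python
from upsonic.vectordb import ChromaProvider, ChromaConfig, ConnectionConfig, Mode

# Development: embedded mode, vectors persisted to a local directory
dev_db = ChromaProvider(ChromaConfig(
    collection_name="my_kb",
    vector_size=1536,
    connection=ConnectionConfig(mode=Mode.EMBEDDED, db_path="./db"),
))

# Self-hosted: connect to a running Chroma server
# (host/port parameter names are assumptions, not confirmed API)
prod_db = ChromaProvider(ChromaConfig(
    collection_name="my_kb",
    vector_size=1536,
    connection=ConnectionConfig(mode=Mode.LOCAL, host="localhost", port=8000),
))
```

Because the rest of the configuration is unchanged, the `KnowledgeBase` built on top of either instance behaves identically.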

Shared Configuration

All providers inherit these base parameters from BaseVectorDBConfig:
| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| `collection_name` | `str` | `"default_collection"` | Name of the collection |
| `vector_size` | `int` | (required) | Dimension of vectors (must match your embedding model) |
| `distance_metric` | `DistanceMetric` | `COSINE` | Similarity metric: `COSINE`, `EUCLIDEAN`, or `DOT_PRODUCT` |
| `recreate_if_exists` | `bool` | `False` | Recreate collection if it already exists |
| `default_top_k` | `int` | `10` | Default number of results returned |
| `default_similarity_threshold` | `float \| None` | `None` | Minimum similarity score (0.0-1.0) |
| `dense_search_enabled` | `bool` | `True` | Enable dense vector search |
| `full_text_search_enabled` | `bool` | `True` | Enable full-text search |
| `hybrid_search_enabled` | `bool` | `True` | Enable hybrid search |
| `default_hybrid_alpha` | `float` | `0.5` | Alpha for hybrid search blending (0.0 = full-text, 1.0 = dense) |
| `default_fusion_method` | `str` | `'weighted'` | Fusion method: `'rrf'` or `'weighted'` |
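To make the two fusion methods concrete, here is a minimal, library-independent sketch of how `'weighted'` blending (controlled by `default_hybrid_alpha`) and reciprocal rank fusion (`'rrf'`) typically combine dense and full-text results. Upsonic's internal implementation may differ; the RRF constant `k=60` is the conventional default from the literature, not a confirmed Upsonic value:

```python
def weighted_fusion(dense, text, alpha=0.5):
    """Blend normalized scores: alpha=1.0 -> pure dense, alpha=0.0 -> pure full-text."""
    ids = set(dense) | set(text)
    return {
        doc: alpha * dense.get(doc, 0.0) + (1 - alpha) * text.get(doc, 0.0)
        for doc in ids
    }

def rrf_fusion(dense_ranked, text_ranked, k=60):
    """Reciprocal rank fusion: each result list contributes 1 / (k + rank)."""
    scores = {}
    for ranked in (dense_ranked, text_ranked):
        for rank, doc in enumerate(ranked, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    return scores

# Normalized scores for three hypothetical documents
dense = {"a": 0.9, "b": 0.4}
text = {"b": 0.8, "c": 0.6}

blended = weighted_fusion(dense, text, alpha=0.5)
best = max(blended, key=blended.get)  # "b": 0.5*0.4 + 0.5*0.8 = 0.6
```

Note that with weighted fusion, a document found by both search types ("b" above) can outrank one with a single high score ("a"), which is the usual motivation for hybrid search.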

Choosing a Provider

For prototyping and development:
  • Chroma (embedded mode) or FAISS — zero infrastructure, fast iteration
For production with managed infrastructure:
  • Pinecone — fully managed, auto-scaling, zero ops
  • SuperMemory — zero-config (handles embeddings too)
For production with self-hosted infrastructure:
  • Qdrant or Milvus — feature-rich, scalable, battle-tested
  • Weaviate — if you need schema-based collections and AI modules
For existing PostgreSQL setups:
  • PGVector — adds vector search to your existing database