Overview

PGVector is a PostgreSQL extension that enables vector similarity search. It supports HNSW and IVFFlat indexes and runs inside an existing PostgreSQL database, so embeddings are stored alongside your relational data.

Provider Class: PgvectorProvider
Config Class: PgVectorConfig

Dependencies

pip install "upsonic[rag]"

Examples

from upsonic import Agent, Task, KnowledgeBase
from upsonic.embeddings.openai_provider import OpenAIEmbeddingProvider
from upsonic.vectordb import PgvectorProvider, PgVectorConfig, HNSWIndexConfig
from pydantic import SecretStr

# Setup embedding provider
embedding = OpenAIEmbeddingProvider(api_key="your-api-key")

# Create PGVector configuration
config = PgVectorConfig(
    collection_name="my_collection",
    vector_size=1536,
    connection_string=SecretStr("postgresql://user:password@localhost/dbname"),
    schema_name="public",
    index=HNSWIndexConfig(m=16, ef_construction=200)
)
vectordb = PgvectorProvider(config)

# Create knowledge base
kb = KnowledgeBase(
    sources="document.pdf",
    embedding_provider=embedding,
    vectordb=vectordb
)

# Use with Agent
agent = Agent("openai/gpt-4o")
task = Task(
    description="Search the database",
    context=[kb]
)
result = agent.do(task)
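
The example above embeds the database credentials directly in the connection string. As a minimal sketch, assuming you would rather supply them through environment variables, the URL can be assembled with the standard library alone. The helper name `build_pg_dsn` is hypothetical and not part of Upsonic; the variable names follow libpq's conventions (`PGUSER`, `PGPASSWORD`, `PGHOST`, `PGPORT`, `PGDATABASE`), and nothing here implies that Upsonic reads them itself.

```python
import os
from urllib.parse import quote


def build_pg_dsn(env=None):
    """Assemble a postgresql:// URL from environment-style settings.

    Hypothetical helper for illustration; not part of Upsonic.
    """
    env = os.environ if env is None else env
    # Percent-encode values so characters like "@" or "/" cannot break the URL.
    user = quote(env["PGUSER"], safe="")
    password = quote(env["PGPASSWORD"], safe="")
    host = env.get("PGHOST", "localhost")
    port = env.get("PGPORT", "5432")
    dbname = quote(env["PGDATABASE"], safe="")
    return f"postgresql://{user}:{password}@{host}:{port}/{dbname}"


# A plain dict stands in for os.environ so the sketch runs anywhere.
example_env = {"PGUSER": "user", "PGPASSWORD": "p@ss/word", "PGDATABASE": "dbname"}
dsn = build_pg_dsn(example_env)
print(dsn)  # postgresql://user:p%40ss%2Fword@localhost:5432/dbname
```

The resulting string can then be wrapped as `SecretStr(dsn)` and passed as `connection_string`, exactly as in the example above.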

Parameters

| Parameter | Type | Description | Default | Source |
| --- | --- | --- | --- | --- |
| collection_name | str | Name of the collection | "default_collection" | Base |
| vector_size | int | Dimension of vectors | Required | Base |
| distance_metric | DistanceMetric | Similarity metric (COSINE, EUCLIDEAN, DOT_PRODUCT) | COSINE | Base |
| recreate_if_exists | bool | Recreate collection if it exists | False | Base |
| default_top_k | int | Default number of results | 10 | Base |
| default_similarity_threshold | Optional[float] | Minimum similarity score | None | Base |
| connection_string | SecretStr | PostgreSQL connection string | Required | Specific |
| schema_name | str | PostgreSQL schema name | "public" | Specific |
| table_name | Optional[str] | Table name | Uses collection_name | Specific |
| index | Union[HNSWIndexConfig, IVFIndexConfig] | Index type configuration | HNSWIndexConfig() | Specific |
| content_language | str | Language for full-text search | "english" | Specific |
| prefix_match | bool | Enable prefix matching for full-text search | False | Specific |
| schema_version | int | Schema version for migrations | 1 | Specific |
| batch_size | int | Batch size for upsert operations | 100 | Specific |
| pool_size | int | Connection pool size | 5 | Specific |
| max_overflow | int | Maximum pool overflow | 10 | Specific |
| pool_timeout | float | Pool timeout in seconds | 30.0 | Specific |
| pool_recycle | int | Pool recycle time in seconds | 3600 | Specific |
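
To make `distance_metric`, `default_top_k`, and `default_similarity_threshold` more concrete, here is a rough sketch in plain Python of the three metrics and of an exact (brute-force) cosine search. It is an illustration only: the function names are hypothetical, and it assumes the threshold is applied as a minimum score before results are truncated to the top k. The actual PgvectorProvider runs its search inside PostgreSQL, typically through an approximate HNSW or IVFFlat index, and its scoring details may differ.

```python
import math


def dot_product(a, b):
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a, b):
    # Assumes neither vector is all zeros (the norm would be 0).
    norm = math.sqrt(dot_product(a, a)) * math.sqrt(dot_product(b, b))
    return dot_product(a, b) / norm


def euclidean_distance(a, b):
    # A distance, not a similarity: smaller means closer.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def search(query, rows, top_k=10, similarity_threshold=None):
    """Exact cosine search: score every row, filter, keep the top_k best."""
    scored = [(name, cosine_similarity(query, vec)) for name, vec in rows]
    if similarity_threshold is not None:
        scored = [(n, s) for n, s in scored if s >= similarity_threshold]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]


rows = [("a", [1.0, 0.0]), ("b", [0.0, 1.0]), ("c", [1.0, 1.0])]
hits = search([1.0, 0.0], rows, top_k=2, similarity_threshold=0.5)
print([name for name, _ in hits])  # ['a', 'c']
```

Row "b" is orthogonal to the query (score 0.0) and falls below the threshold, so only "a" and "c" are returned.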