Overview

PGVector is a PostgreSQL extension that enables vector similarity search. It supports HNSW and IVFFlat indexes and runs inside an existing PostgreSQL database, so vectors are stored alongside your relational data.

Provider Class: PgvectorProvider
Config Class: PgVectorConfig

Dependencies

pip install "upsonic[rag]"

Examples

from upsonic import Agent, Task, KnowledgeBase
from upsonic.embeddings.openai_provider import OpenAIEmbeddingProvider
from upsonic.vectordb import PgvectorProvider, PgVectorConfig, HNSWIndexConfig
from pydantic import SecretStr

# Setup embedding provider
embedding = OpenAIEmbeddingProvider(api_key="your-api-key")

# Create PGVector configuration
config = PgVectorConfig(
    collection_name="my_collection",
    vector_size=1536,
    connection_string=SecretStr("postgresql://user:password@localhost/dbname"),
    schema_name="public",
    index=HNSWIndexConfig(m=16, ef_construction=200)
)
vectordb = PgvectorProvider(config)

# Create knowledge base
kb = KnowledgeBase(
    sources="document.pdf",
    embedding_provider=embedding,
    vectordb=vectordb
)

# Use with Agent
agent = Agent("openai/gpt-4o")
task = Task(
    description="Search the database",
    context=[kb]
)
result = agent.do(task)
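
The example above hard-codes the database credentials in the connection string. As a minimal sketch of one alternative, the snippet below assembles the same `postgresql://` URL from environment variables and percent-encodes the user and password so that characters such as `@` or `/` do not break the URL. The helper name `build_pg_connection_string` is hypothetical and not part of Upsonic; the variable names follow the standard libpq convention (`PGUSER`, `PGPASSWORD`, `PGHOST`, `PGDATABASE`), and the fallback values are placeholders.

```python
import os
from urllib.parse import quote


def build_pg_connection_string(user, password, host, dbname, port=5432):
    """Return a postgresql:// URL with user and password percent-encoded.

    Hypothetical helper for illustration; not part of Upsonic.
    """
    return (
        f"postgresql://{quote(user, safe='')}:{quote(password, safe='')}"
        f"@{host}:{port}/{dbname}"
    )


# Read credentials from the environment, falling back to placeholder values.
conn_str = build_pg_connection_string(
    user=os.environ.get("PGUSER", "user"),
    password=os.environ.get("PGPASSWORD", "password"),
    host=os.environ.get("PGHOST", "localhost"),
    dbname=os.environ.get("PGDATABASE", "dbname"),
)
```

The resulting string can then be wrapped as `SecretStr(conn_str)` and passed as `connection_string`, exactly as in the example above.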

Parameters

Base Parameters (from BaseVectorDBConfig)

| Parameter | Type | Description | Default | Required |
|---|---|---|---|---|
| collection_name | str | Name of the collection | "default_collection" | No |
| vector_size | int | Dimension of vectors | - | Yes |
| distance_metric | DistanceMetric | Similarity metric (COSINE, EUCLIDEAN, DOT_PRODUCT) | COSINE | No |
| recreate_if_exists | bool | Recreate collection if it exists | False | No |
| default_top_k | int | Default number of results | 10 | No |
| default_similarity_threshold | Optional[float] | Minimum similarity score (0.0-1.0) | None | No |
| dense_search_enabled | bool | Enable dense vector search | True | No |
| full_text_search_enabled | bool | Enable full-text search | True | No |
| hybrid_search_enabled | bool | Enable hybrid search | True | No |
| default_hybrid_alpha | float | Default alpha for hybrid search (0.0-1.0) | 0.5 | No |
| default_fusion_method | Literal['rrf', 'weighted'] | Default fusion method for hybrid search | 'weighted' | No |
| provider_name | Optional[str] | Provider name | None | No |
| provider_description | Optional[str] | Provider description | None | No |
| provider_id | Optional[str] | Provider ID | None | No |
| default_metadata | Optional[Dict[str, Any]] | Default metadata for all records | None | No |
| auto_generate_content_id | bool | Auto-generate content IDs | True | No |
| indexed_fields | Optional[List[Union[str, Dict[str, Any]]]] | Fields to index for filtering | None | No |
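
The `default_hybrid_alpha` and `default_fusion_method` parameters control how dense and full-text results are combined, but the exact formulas are not given here. The sketch below shows the two techniques as they are commonly defined, under assumptions: that `alpha` weights the dense score and `1 - alpha` the full-text score, and that reciprocal rank fusion uses the conventional constant `k=60`. The function names are hypothetical and are not Upsonic APIs.

```python
def weighted_fusion(dense, text, alpha=0.5):
    """Blend two {doc_id: score} maps as alpha * dense + (1 - alpha) * text.

    Assumption: alpha weights the dense score. Missing scores count as 0.0.
    """
    ids = set(dense) | set(text)
    return {
        doc_id: alpha * dense.get(doc_id, 0.0) + (1 - alpha) * text.get(doc_id, 0.0)
        for doc_id in ids
    }


def rrf_fusion(rankings, k=60):
    """Reciprocal rank fusion: sum 1 / (k + rank) over each ranked list of ids."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return scores


# Scores are assumed to be normalized to the 0.0-1.0 range before blending.
blended = weighted_fusion({"a": 0.9, "b": 0.2}, {"b": 0.8, "c": 0.5}, alpha=0.5)
fused = rrf_fusion([["a", "b"], ["b", "c"]])
```

Weighted fusion depends on both score scales being comparable, whereas RRF uses only rank positions, which is why it is often preferred when dense and full-text scores are not normalized the same way.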

PGVector-Specific Parameters

| Parameter | Type | Description | Default | Required |
|---|---|---|---|---|
| connection_string | SecretStr | PostgreSQL connection string | - | Yes |
| schema_name | str | PostgreSQL schema name | "public" | No |
| table_name | Optional[str] | Table name (uses collection_name if not specified) | None | No |
| index | Union[HNSWIndexConfig, IVFIndexConfig] | Index type configuration (FLAT not supported) | HNSWIndexConfig() | No |
| content_language | str | Language for full-text search (e.g., 'english', 'spanish') | "english" | No |
| prefix_match | bool | Enable prefix matching for full-text search (appends * to words) | False | No |
| schema_version | int | Schema version for migrations | 1 | No |
| auto_upgrade_schema | bool | Automatically upgrade schema on version mismatch | False | No |
| batch_size | int | Batch size for upsert operations | 100 | No |
| pool_size | int | Connection pool size | 5 | No |
| max_overflow | int | Maximum pool overflow | 10 | No |
| pool_timeout | float | Pool timeout in seconds | 30.0 | No |
| pool_recycle | int | Pool recycle time in seconds | 3600 | No |
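
To make the batching and pool settings more concrete, here is a minimal sketch of what they imply, under assumptions. It assumes that `batch_size` splits records into consecutive chunks before each upsert, and that `pool_size` and `max_overflow` follow the usual SQLAlchemy-style meaning, where the pool may open up to `pool_size + max_overflow` connections at once. Both helpers are hypothetical illustrations, not Upsonic functions.

```python
from itertools import islice


def iter_batches(items, batch_size=100):
    """Yield consecutive lists of at most batch_size items from any iterable."""
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


def max_connections(pool_size=5, max_overflow=10):
    """Upper bound on simultaneous connections under SQLAlchemy-style pooling."""
    return pool_size + max_overflow


# With the defaults from the table, 250 records are split into three batches.
batch_lengths = [len(batch) for batch in iter_batches(range(250), batch_size=100)]
connection_limit = max_connections()
```

When sizing the pool, keep the combined limit across all application instances below the PostgreSQL server's `max_connections` setting.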