Overview

Weaviate is an open-source vector database with a GraphQL API. It supports embedded, local, and cloud deployments with schema-based collections and module configurations.

Provider Class: WeaviateProvider
Config Class: WeaviateConfig

Dependencies

pip install "upsonic[rag]"

Examples

from upsonic import Agent, Task, KnowledgeBase
from upsonic.embeddings.openai_provider import OpenAIEmbeddingProvider
from upsonic.vectordb import WeaviateProvider, WeaviateConfig, ConnectionConfig, Mode, HNSWIndexConfig

# Setup embedding provider
embedding = OpenAIEmbeddingProvider(api_key="your-api-key")

# Embedded mode
config = WeaviateConfig(
    collection_name="my_collection",
    vector_size=1536,
    connection=ConnectionConfig(mode=Mode.EMBEDDED, db_path="./weaviate_db"),
    index=HNSWIndexConfig(m=16, ef_construction=200)
)
vectordb = WeaviateProvider(config)

# Create knowledge base
kb = KnowledgeBase(
    sources="document.pdf",
    embedding_provider=embedding,
    vectordb=vectordb
)

# Use with Agent
agent = Agent("openai/gpt-4o")
task = Task(
    description="Query the documents",
    context=[kb]
)
result = agent.do(task)

Parameters

Base Parameters (from BaseVectorDBConfig)

| Parameter | Type | Description | Default | Required |
| --- | --- | --- | --- | --- |
| collection_name | str | Name of the collection | "default_collection" | No |
| vector_size | int | Dimension of vectors | - | Yes |
| distance_metric | DistanceMetric | Similarity metric (COSINE, EUCLIDEAN, DOT_PRODUCT) | COSINE | No |
| recreate_if_exists | bool | Recreate collection if it exists | False | No |
| default_top_k | int | Default number of results | 10 | No |
| default_similarity_threshold | Optional[float] | Minimum similarity score (0.0-1.0) | None | No |
| dense_search_enabled | bool | Enable dense vector search | True | No |
| full_text_search_enabled | bool | Enable full-text search | True | No |
| hybrid_search_enabled | bool | Enable hybrid search | True | No |
| default_hybrid_alpha | float | Default alpha for hybrid search (0.0-1.0) | 0.5 | No |
| default_fusion_method | Literal['rrf', 'weighted'] | Default fusion method for hybrid search | 'weighted' | No |
| provider_name | Optional[str] | Provider name | None | No |
| provider_description | Optional[str] | Provider description | None | No |
| provider_id | Optional[str] | Provider ID | None | No |
| default_metadata | Optional[Dict[str, Any]] | Default metadata for all records | None | No |
| auto_generate_content_id | bool | Auto-generate content IDs | True | No |
| indexed_fields | Optional[List[Union[str, Dict[str, Any]]]] | Fields to index for filtering | None | No |
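
To make default_hybrid_alpha and default_fusion_method more concrete, here is a minimal, library-independent sketch of the two fusion ideas the option names suggest. It is not Upsonic's or Weaviate's actual implementation. The function names (weighted_fusion, rrf_fusion, top_k), the RRF constant k=60, and the convention that alpha=1.0 means "dense only" are assumptions made for illustration, and the weighted variant assumes both score sets are already normalized to a comparable range.

```python
from typing import Dict, List


def weighted_fusion(
    dense: Dict[str, float], sparse: Dict[str, float], alpha: float = 0.5
) -> Dict[str, float]:
    """Blend two score maps: alpha weights dense scores, (1 - alpha) keyword scores."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0.0 and 1.0")
    ids = set(dense) | set(sparse)
    # A document missing from one result set contributes 0.0 for that side.
    return {
        doc_id: alpha * dense.get(doc_id, 0.0) + (1.0 - alpha) * sparse.get(doc_id, 0.0)
        for doc_id in ids
    }


def rrf_fusion(rankings: List[List[str]], k: int = 60) -> Dict[str, float]:
    """Reciprocal rank fusion: each ranked list adds 1 / (k + rank), rank starting at 1."""
    scores: Dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return scores


def top_k(scores: Dict[str, float], k: int = 10) -> List[str]:
    """Return the k highest-scoring document IDs, best first."""
    return sorted(scores, key=scores.get, reverse=True)[:k]


if __name__ == "__main__":
    dense_scores = {"a": 0.9, "b": 0.2}
    keyword_scores = {"b": 0.8, "c": 0.6}
    print(top_k(weighted_fusion(dense_scores, keyword_scores, alpha=0.5)))
    print(top_k(rrf_fusion([["a", "b"], ["b", "c"]])))
```

Weighted fusion depends on the raw score values, so it reacts to alpha directly. RRF only looks at rank positions, which makes it less sensitive to score scales that differ between dense and keyword search.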

Weaviate-Specific Parameters

| Parameter | Type | Description | Default | Required |
| --- | --- | --- | --- | --- |
| connection | ConnectionConfig | Connection configuration (mode, db_path, etc.) | - | Yes |
| index | Union[HNSWIndexConfig, FlatIndexConfig] | Index type configuration (IVF not supported) | HNSWIndexConfig() | No |
| description | Optional[str] | Collection description | None | No |
| namespace | Optional[str] | Tenant name for multi-tenancy (auto-enables multi_tenancy_enabled) | None | No |
| multi_tenancy_enabled | bool | Enable multi-tenancy for the collection | False | No |
| properties | Optional[List[Dict[str, Any]]] | Custom schema properties beyond standard fields | None | No |
| references | Optional[List[Dict[str, Any]]] | Cross-references to other collections | None | No |
| inverted_index_config | Optional[Dict[str, Any]] | Inverted index configuration for BM25 tuning (e.g., {'bm25': {'k1': 1.2, 'b': 0.75}}) | None | No |
| replication_config | Optional[Dict[str, Any]] | Replication configuration (e.g., {'factor': 3, 'asyncEnabled': True}) | None | No |
| sharding_config | Optional[Dict[str, Any]] | Sharding configuration (e.g., {'virtualPerPhysical': 128, 'desiredCount': 2}) | None | No |
| generative_config | Optional[Dict[str, Any]] | Generative AI module configuration (e.g., {'provider': 'openai', 'model': 'gpt-4'}) | None | No |
| reranker_config | Optional[Dict[str, Any]] | Reranker module configuration (e.g., {'provider': 'cohere', 'model': 'rerank-english-v2.0'}) | None | No |
| api_keys | Optional[Dict[str, str]] | API keys for AI modules (e.g., {'openai': 'sk-...', 'cohere': '...'}) | None | No |
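
The k1 and b values in inverted_index_config are the standard BM25 tuning knobs. The sketch below uses the textbook BM25 formula to show what they control. It is an illustration only, not Weaviate's scoring code, and the function names (bm25_idf, bm25_term_score) are hypothetical.

```python
import math


def bm25_idf(doc_count: int, docs_with_term: int) -> float:
    """Inverse document frequency in the common non-negative BM25 form."""
    return math.log(1.0 + (doc_count - docs_with_term + 0.5) / (docs_with_term + 0.5))


def bm25_term_score(
    tf: float,
    doc_len: float,
    avg_doc_len: float,
    idf: float,
    k1: float = 1.2,
    b: float = 0.75,
) -> float:
    """Score one query term against one document.

    k1 controls how quickly repeated occurrences of a term stop adding score.
    b controls how strongly long documents are penalized (0 = no penalty).
    """
    length_norm = 1.0 - b + b * (doc_len / avg_doc_len)
    return idf * (tf * (k1 + 1.0)) / (tf + k1 * length_norm)


if __name__ == "__main__":
    idf = bm25_idf(doc_count=1000, docs_with_term=10)
    short_doc = bm25_term_score(tf=3, doc_len=50, avg_doc_len=100, idf=idf)
    long_doc = bm25_term_score(tf=3, doc_len=400, avg_doc_len=100, idf=idf)
    print(f"short document: {short_doc:.3f}, long document: {long_doc:.3f}")
```

With b set to 0 the document length no longer matters, and raising k1 lets term frequency keep contributing for longer before the score saturates.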

ConnectionConfig Parameters

| Parameter | Type | Description | Default | Required |
| --- | --- | --- | --- | --- |
| mode | Mode | Connection mode (EMBEDDED, LOCAL, CLOUD, IN_MEMORY) | - | Yes |
| db_path | Optional[str] | Path for embedded/local storage | None | Required for EMBEDDED |
| host | Optional[str] | Host address | None | Required for LOCAL |
| port | Optional[int] | Port number | None | Required for LOCAL |
| api_key | Optional[SecretStr] | API key for cloud/local | None | Required for CLOUD |
| url | Optional[str] | Full connection URL | None | No |
| use_tls | bool | Use TLS encryption | True | No |
| grpc_port | Optional[int] | gRPC port | None | No |
| prefer_grpc | bool | Prefer gRPC over HTTP | False | No |
| https | Optional[bool] | Use HTTPS | None | No |
| prefix | Optional[str] | URL path prefix | None | No |
| timeout | Optional[float] | Request timeout in seconds | None | No |
| location | Optional[str] | Special location string | None | No |
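
The per-mode requirements in the table above can be checked before constructing a ConnectionConfig. The following is a small standalone sketch of such a pre-flight check. The helper missing_connection_fields and the REQUIRED_BY_MODE mapping are hypothetical and simply restate the "Required" column; they are not part of Upsonic, and treating IN_MEMORY as having no required fields is an assumption based on the table listing none.

```python
from typing import Any, Dict, List

# Hypothetical mapping that mirrors the "Required" column of the table.
REQUIRED_BY_MODE: Dict[str, List[str]] = {
    "EMBEDDED": ["db_path"],
    "LOCAL": ["host", "port"],
    "CLOUD": ["api_key"],
    "IN_MEMORY": [],  # assumption: the table lists no required fields for this mode
}


def missing_connection_fields(mode: str, **fields: Any) -> List[str]:
    """Return the names of required fields that are absent or None for the given mode."""
    if mode not in REQUIRED_BY_MODE:
        raise ValueError(f"unknown mode: {mode!r}")
    return [name for name in REQUIRED_BY_MODE[mode] if fields.get(name) is None]


if __name__ == "__main__":
    print(missing_connection_fields("EMBEDDED", db_path="./weaviate_db"))
    print(missing_connection_fields("LOCAL", host="localhost"))
    print(missing_connection_fields("CLOUD"))
```

Running a check like this first gives a clear message about which field is missing for the chosen mode, instead of relying on whatever error surfaces later when the connection is attempted.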