Overview

When a KnowledgeBase is passed as Task context, you can fine-tune retrieval behavior directly on the Task. These parameters control how many results are returned, how dense and full-text results are blended and ranked, which metadata the search is scoped to, and the minimum similarity score a result must meet.

Available Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| `vector_search_top_k` | `int \| None` | `None` (uses provider default of 10) | Number of results to retrieve |
| `vector_search_similarity_threshold` | `float \| None` | `None` | Minimum similarity score (0.0-1.0); results below this are discarded |
| `vector_search_alpha` | `float \| None` | `None` (uses provider default of 0.5) | Balance between dense and sparse search (0.0 = full-text only, 1.0 = dense only) |
| `vector_search_fusion_method` | `str \| None` | `None` (uses provider default) | How to combine search results: `'rrf'` (Reciprocal Rank Fusion) or `'weighted'` |
| `vector_search_filter` | `Dict \| None` | `None` | Metadata filter to scope results |
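
Because each parameter has a documented range or set of allowed values, it can help to check them before building a Task. The sketch below is a hypothetical helper, not part of Upsonic: `validate_search_params` and its argument names are assumptions, as is the rule that `top_k` must be at least 1. It returns a dict of Task keyword arguments and leaves out anything still set to `None`, so the provider defaults apply.

```python
from typing import Any, Dict, Optional

# Allowed values taken from the parameter list above.
ALLOWED_FUSION_METHODS = {"rrf", "weighted"}


def validate_search_params(
    top_k: Optional[int] = None,
    similarity_threshold: Optional[float] = None,
    alpha: Optional[float] = None,
    fusion_method: Optional[str] = None,
    metadata_filter: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: validate values and build Task keyword arguments."""
    if top_k is not None and top_k < 1:
        # Assumption: a result count below 1 is never meaningful.
        raise ValueError("top_k must be a positive integer")
    if similarity_threshold is not None and not 0.0 <= similarity_threshold <= 1.0:
        raise ValueError("similarity_threshold must be between 0.0 and 1.0")
    if alpha is not None and not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0.0 and 1.0")
    if fusion_method is not None and fusion_method not in ALLOWED_FUSION_METHODS:
        raise ValueError("fusion_method must be 'rrf' or 'weighted'")

    candidates = {
        "vector_search_top_k": top_k,
        "vector_search_similarity_threshold": similarity_threshold,
        "vector_search_alpha": alpha,
        "vector_search_fusion_method": fusion_method,
        "vector_search_filter": metadata_filter,
    }
    # Drop unset values so the provider defaults are used for them.
    return {name: value for name, value in candidates.items() if value is not None}


search_kwargs = validate_search_params(top_k=5, alpha=0.2, fusion_method="rrf")
print(search_kwargs)
```

The returned dict can then be unpacked into the constructor, for example `Task(description=..., context=[kb], **search_kwargs)`.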

Controlling Result Count

Increase vector_search_top_k when you need comprehensive context; decrease it for focused answers:
from upsonic import Agent, Task, KnowledgeBase
from upsonic.embeddings import OpenAIEmbedding, OpenAIEmbeddingConfig
from upsonic.vectordb import ChromaProvider, ChromaConfig, ConnectionConfig, Mode

embedding = OpenAIEmbedding(OpenAIEmbeddingConfig())
vectordb = ChromaProvider(ChromaConfig(
    collection_name="search_kb",
    vector_size=1536,
    connection=ConnectionConfig(mode=Mode.IN_MEMORY)
))

kb = KnowledgeBase(
    sources=["product_catalog.pdf"],
    embedding_provider=embedding,
    vectordb=vectordb
)

agent = Agent("anthropic/claude-sonnet-4-5")

# Retrieve more results for a broad comparison
task = Task(
    description="Compare all available pricing tiers and their features",
    context=[kb],
    vector_search_top_k=20
)

result = agent.do(task)
print(result)
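
To make the effect of the result count concrete, here is a minimal sketch of top-k selection over toy embeddings. It is not the Upsonic or Chroma implementation: `cosine_similarity`, `top_k_chunks`, and the 2-D vectors are illustrative assumptions, and a real provider would normally use an index rather than scoring every chunk.

```python
import heapq
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_chunks(
    query: Sequence[float],
    chunks: Sequence[Tuple[str, Sequence[float]]],
    k: int,
) -> List[Tuple[float, str]]:
    """Return up to k (score, text) pairs, highest similarity first."""
    scored = ((cosine_similarity(query, vector), text) for text, vector in chunks)
    return heapq.nlargest(k, scored, key=lambda pair: pair[0])


# Toy data: three chunks with made-up 2-D embeddings.
chunks = [
    ("Basic tier", [1.0, 0.0]),
    ("Pro tier", [0.8, 0.6]),
    ("Shipping policy", [0.0, 1.0]),
]
query = [1.0, 0.0]

print([text for _, text in top_k_chunks(query, chunks, k=2)])
# ['Basic tier', 'Pro tier']
```

A larger `k` never returns more chunks than exist: with `k=20` this toy store still yields three results.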

Setting Similarity Threshold

Filter out low-quality matches by setting a minimum similarity score:
from upsonic import Agent, Task, KnowledgeBase
from upsonic.embeddings import OpenAIEmbedding, OpenAIEmbeddingConfig
from upsonic.vectordb import ChromaProvider, ChromaConfig, ConnectionConfig, Mode

embedding = OpenAIEmbedding(OpenAIEmbeddingConfig())
vectordb = ChromaProvider(ChromaConfig(
    collection_name="threshold_kb",
    vector_size=1536,
    connection=ConnectionConfig(mode=Mode.IN_MEMORY)
))

kb = KnowledgeBase(
    sources=["technical_spec.pdf"],
    embedding_provider=embedding,
    vectordb=vectordb
)

agent = Agent("anthropic/claude-sonnet-4-5")

# Only return highly relevant results
task = Task(
    description="What is the maximum throughput of the system?",
    context=[kb],
    vector_search_top_k=5,
    vector_search_similarity_threshold=0.75
)

result = agent.do(task)
print(result)
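
As a rough illustration of what the threshold does, the sketch below drops every result whose score falls under the minimum. The function name `apply_similarity_threshold` and the scores are hypothetical, and how a given provider turns distances into a 0.0-1.0 similarity score is not shown here.

```python
from typing import List, Tuple


def apply_similarity_threshold(
    scored_results: List[Tuple[str, float]],
    threshold: float,
) -> List[Tuple[str, float]]:
    """Keep only results whose similarity score is at least the threshold."""
    return [(text, score) for text, score in scored_results if score >= threshold]


# Toy results as (chunk text, similarity score) pairs; the scores are made up.
scored_results = [
    ("Chunk about throughput limits", 0.91),
    ("Chunk about latency targets", 0.78),
    ("Chunk about office locations", 0.42),
]

kept = apply_similarity_threshold(scored_results, threshold=0.75)
print(len(kept))
# 2
```

Note that a threshold can leave fewer results than the requested result count, or none at all, so a very high value may starve the agent of context.
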

Hybrid Search

Control how dense vector search and full-text search are blended:
from upsonic import Agent, Task, KnowledgeBase
from upsonic.embeddings import OpenAIEmbedding, OpenAIEmbeddingConfig
from upsonic.vectordb import ChromaProvider, ChromaConfig, ConnectionConfig, Mode

embedding = OpenAIEmbedding(OpenAIEmbeddingConfig())
vectordb = ChromaProvider(ChromaConfig(
    collection_name="hybrid_kb",
    vector_size=1536,
    connection=ConnectionConfig(mode=Mode.IN_MEMORY)
))

kb = KnowledgeBase(
    sources=["api_docs/"],
    embedding_provider=embedding,
    vectordb=vectordb
)

agent = Agent("anthropic/claude-sonnet-4-5")

# Favor full-text search for exact keyword matches (alpha closer to 0)
task = Task(
    description="Find documentation about the 'X-Rate-Limit-Remaining' header",
    context=[kb],
    vector_search_alpha=0.2,
    vector_search_fusion_method="rrf"
)

result = agent.do(task)
print(result)

| Alpha Value | Behavior |
| --- | --- |
| 0.0 | Pure full-text / keyword search |
| 0.3 | Emphasis on keyword matches with some semantic understanding |
| 0.5 | Equal blend of dense + full-text (default) |
| 0.7 | Emphasis on semantic similarity |
| 1.0 | Pure dense / semantic vector search |
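
The two fusion methods can be sketched in a few lines. This is a generic illustration under stated assumptions, not the provider's code: the weighted variant assumes both score sets are already normalized to the same 0.0-1.0 scale, the RRF variant uses the conventional constant `k=60`, and the function names are hypothetical. How a provider applies alpha when `'rrf'` is selected is not covered here.

```python
from typing import Dict, List


def weighted_fusion(
    dense: Dict[str, float],
    sparse: Dict[str, float],
    alpha: float,
) -> Dict[str, float]:
    """Blend scores: alpha weights dense results, (1 - alpha) weights full-text."""
    doc_ids = set(dense) | set(sparse)
    return {
        doc_id: alpha * dense.get(doc_id, 0.0) + (1.0 - alpha) * sparse.get(doc_id, 0.0)
        for doc_id in doc_ids
    }


def reciprocal_rank_fusion(rankings: List[List[str]], k: int = 60) -> Dict[str, float]:
    """Sum 1 / (k + rank) over every ranked list a document appears in."""
    fused: Dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            fused[doc_id] = fused.get(doc_id, 0.0) + 1.0 / (k + rank)
    return fused


# Toy scores: "a" is the better semantic match, "b" the better keyword match.
dense_scores = {"a": 0.9, "b": 0.4}
sparse_scores = {"a": 0.1, "b": 0.8}

low_alpha = weighted_fusion(dense_scores, sparse_scores, alpha=0.2)
high_alpha = weighted_fusion(dense_scores, sparse_scores, alpha=0.8)
print(max(low_alpha, key=low_alpha.get))    # b
print(max(high_alpha, key=high_alpha.get))  # a

# RRF looks only at rank positions, never at the raw scores.
fused = reciprocal_rank_fusion([["a", "b", "c"], ["b", "c", "a"]])
print(sorted(fused, key=fused.get, reverse=True))
# ['b', 'a', 'c']
```

Rank-based fusion is often a reasonable choice when dense and full-text scores are on incomparable scales, while the weighted form gives alpha a direct, proportional effect.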

Metadata Filtering

Scope search to specific documents or categories using metadata filters:
from upsonic import Agent, Task, KnowledgeBase
from upsonic.embeddings import OpenAIEmbedding, OpenAIEmbeddingConfig
from upsonic.vectordb import ChromaProvider, ChromaConfig, ConnectionConfig, Mode

embedding = OpenAIEmbedding(OpenAIEmbeddingConfig())
vectordb = ChromaProvider(ChromaConfig(
    collection_name="filter_kb",
    vector_size=1536,
    connection=ConnectionConfig(mode=Mode.IN_MEMORY)
))

kb = KnowledgeBase(
    sources=["contracts/", "invoices/"],
    embedding_provider=embedding,
    vectordb=vectordb
)

agent = Agent("anthropic/claude-sonnet-4-5")

# Only search within a specific document
task = Task(
    description="What is the payment schedule?",
    context=[kb],
    vector_search_filter={"document_name": "contract_2024.pdf"}
)

result = agent.do(task)
print(result)
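
Conceptually, a filter like the one above restricts the candidate set before anything is ranked. The sketch below assumes the simplest semantics, exact equality on every key in the filter; the helper names and the chunk layout are hypothetical, and a real vector database may support richer operators.

```python
from typing import Any, Dict, List


def matches_filter(metadata: Dict[str, Any], search_filter: Dict[str, Any]) -> bool:
    """True when every filter key is present in the metadata with an equal value."""
    return all(
        key in metadata and metadata[key] == value
        for key, value in search_filter.items()
    )


def filter_chunks(
    chunks: List[Dict[str, Any]],
    search_filter: Dict[str, Any],
) -> List[Dict[str, Any]]:
    """Keep only chunks whose metadata satisfies the filter."""
    return [chunk for chunk in chunks if matches_filter(chunk["metadata"], search_filter)]


# Toy chunks; the texts and file names are illustrative only.
chunks = [
    {"text": "Payment is due in four installments.",
     "metadata": {"document_name": "contract_2024.pdf"}},
    {"text": "Invoice total and due date.",
     "metadata": {"document_name": "invoice_001.pdf"}},
]

scoped = filter_chunks(chunks, {"document_name": "contract_2024.pdf"})
print([chunk["metadata"]["document_name"] for chunk in scoped])
# ['contract_2024.pdf']
```

A filter key that no chunk carries matches nothing, so a misspelled key silently returns an empty result set under these semantics.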

Combining Parameters

All parameters can be combined for precise control:
from upsonic import Agent, Task, KnowledgeBase
from upsonic.embeddings import OpenAIEmbedding, OpenAIEmbeddingConfig
from upsonic.vectordb import ChromaProvider, ChromaConfig, ConnectionConfig, Mode

embedding = OpenAIEmbedding(OpenAIEmbeddingConfig())
vectordb = ChromaProvider(ChromaConfig(
    collection_name="combined_kb",
    vector_size=1536,
    connection=ConnectionConfig(mode=Mode.IN_MEMORY)
))

kb = KnowledgeBase(
    sources=["compliance_docs/"],
    embedding_provider=embedding,
    vectordb=vectordb
)

agent = Agent("anthropic/claude-sonnet-4-5")

task = Task(
    description="What are the GDPR compliance requirements for data retention?",
    context=[kb],
    vector_search_top_k=10,
    vector_search_similarity_threshold=0.7,
    vector_search_alpha=0.6,
    vector_search_fusion_method="weighted"
)

result = agent.do(task)
print(result)
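
When several parameters are set at once they interact, and the sketch below shows one plausible order of operations: scope by metadata, drop low scores, then cut to the result count. The order, the `retrieve` function, the `topic` metadata key, and the scores are all assumptions for illustration; the actual order is decided by the provider.

```python
from typing import Any, Dict, List, Optional


def retrieve(
    candidates: List[Dict[str, Any]],
    top_k: int = 10,
    similarity_threshold: Optional[float] = None,
    metadata_filter: Optional[Dict[str, Any]] = None,
) -> List[Dict[str, Any]]:
    """Hypothetical pipeline: filter by metadata, apply threshold, keep the top_k best."""
    results = candidates
    if metadata_filter:
        results = [
            c for c in results
            if all(k in c["metadata"] and c["metadata"][k] == v
                   for k, v in metadata_filter.items())
        ]
    if similarity_threshold is not None:
        results = [c for c in results if c["score"] >= similarity_threshold]
    results = sorted(results, key=lambda c: c["score"], reverse=True)
    return results[:top_k]


# Toy candidates with already-fused scores.
candidates = [
    {"text": "Retention periods", "score": 0.92, "metadata": {"topic": "gdpr"}},
    {"text": "Erasure requests", "score": 0.81, "metadata": {"topic": "gdpr"}},
    {"text": "Onboarding checklist", "score": 0.74, "metadata": {"topic": "hr"}},
    {"text": "Cookie banner copy", "score": 0.66, "metadata": {"topic": "gdpr"}},
]

hits = retrieve(candidates, top_k=10, similarity_threshold=0.7,
                metadata_filter={"topic": "gdpr"})
print([hit["text"] for hit in hits])
# ['Retention periods', 'Erasure requests']
```

Even with a result count of 10, only two chunks survive here, which is why the threshold and filter are usually tuned before raising the result count.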