Overview

The CSV loader processes CSV files and can create one document per row, one document per chunk of rows, or a single document for the whole file. It also supports column filtering and two content formats: concatenated text or JSON.

Loader Class: CSVLoader
Config Class: CSVLoaderConfig
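The three split modes can be pictured with a small standard-library sketch. This is not Upsonic's implementation; `split_rows` is a hypothetical helper that only shows how the same rows would be grouped under each `split_mode` value, assuming each group becomes one document.

```python
import csv
import io


def split_rows(rows, split_mode="single_document", rows_per_chunk=100):
    """Group parsed rows into per-document batches (illustrative only)."""
    if split_mode == "single_document":
        # All rows end up in one document.
        return [rows] if rows else []
    if split_mode == "per_row":
        # Each row becomes its own document.
        return [[row] for row in rows]
    if split_mode == "per_chunk":
        # Consecutive groups of rows_per_chunk rows; the last group may be shorter.
        return [rows[i:i + rows_per_chunk] for i in range(0, len(rows), rows_per_chunk)]
    raise ValueError(f"unknown split_mode: {split_mode!r}")


sample = (
    "name,description\n"
    "Laptop,13-inch ultrabook\n"
    "Mouse,Wireless mouse\n"
    "Desk,Standing desk\n"
)
rows = list(csv.DictReader(io.StringIO(sample)))

single = split_rows(rows, "single_document")
per_row = split_rows(rows, "per_row")
per_chunk = split_rows(rows, "per_chunk", rows_per_chunk=2)
print(len(single), len(per_row), len(per_chunk))
```

With three data rows, this grouping yields one, three, and two batches respectively. Per-row splitting tends to suit record lookups, while larger chunks keep neighbouring rows together.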

Dependencies

pip install "upsonic[loaders]"

Examples

from upsonic import Agent, Task, KnowledgeBase
from upsonic.loaders import CSVLoader, CSVLoaderConfig
from upsonic.embeddings import OpenAIEmbedding, OpenAIEmbeddingConfig
from upsonic.text_splitter import RecursiveChunker, RecursiveChunkingConfig
from upsonic.vectordb import ChromaProvider, ChromaConfig, ConnectionConfig, Mode

# Configure loader for per-row documents
loader_config = CSVLoaderConfig(
    split_mode="per_row",
    content_synthesis_mode="concatenated",
    include_columns=["name", "description"]
)
loader = CSVLoader(loader_config)

# Setup KnowledgeBase
embedding = OpenAIEmbedding(OpenAIEmbeddingConfig())
chunker = RecursiveChunker(RecursiveChunkingConfig())
vectordb = ChromaProvider(ChromaConfig(
    collection_name="csv_data",
    vector_size=1536,
    connection=ConnectionConfig(mode=Mode.IN_MEMORY)
))

kb = KnowledgeBase(
    sources=["data.csv"],
    embedding_provider=embedding,
    vectordb=vectordb,
    loaders=[loader],
    splitters=[chunker]
)

# Query with Agent
agent = Agent("openai/gpt-4o")
task = Task("Find products matching 'laptop'", context=[kb])
result = agent.do(task)
print(result)
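
To make the loader configuration above more concrete, the following standard-library sketch shows what per-row loading with `include_columns=["name", "description"]` might produce. The `column: value` layout is an assumption made for illustration, since the exact text generated by the "concatenated" mode is not specified here, and `row_to_text` is a hypothetical helper, not part of Upsonic.

```python
import csv
import io


def row_to_text(row, include_columns=None):
    """Render one CSV row as "column: value" pairs (assumed format)."""
    columns = include_columns if include_columns is not None else list(row)
    # Columns missing from the row are skipped rather than raising.
    return ", ".join(f"{col}: {row[col]}" for col in columns if col in row)


sample = (
    "id,name,description,price\n"
    "1,Laptop,13-inch ultrabook,999\n"
    "2,Mouse,Wireless mouse,25\n"
)

# One text document per row, restricted to the selected columns.
documents = [
    row_to_text(row, include_columns=["name", "description"])
    for row in csv.DictReader(io.StringIO(sample))
]
for doc in documents:
    print(doc)
```

Restricting the columns keeps identifiers and prices out of the embedded text, so only the descriptive fields influence retrieval.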

Parameters

| Parameter | Type | Description | Default | Source |
| --- | --- | --- | --- | --- |
| encoding | str \| None | File encoding (auto-detected if None) | None | Base |
| error_handling | "ignore" \| "warn" \| "raise" | How to handle loading errors | "warn" | Base |
| include_metadata | bool | Whether to include file metadata | True | Base |
| custom_metadata | dict | Additional metadata to include | | Base |
| max_file_size | int \| None | Maximum file size in bytes | None | Base |
| skip_empty_content | bool | Skip documents with empty content | True | Base |
| content_synthesis_mode | "concatenated" \| "json" | How to create document content from rows | "concatenated" | Specific |
| split_mode | "single_document" \| "per_row" \| "per_chunk" | How to split CSV into documents | "single_document" | Specific |
| rows_per_chunk | int | Number of rows per document (for per_chunk mode) | 100 | Specific |
| include_columns | list[str] \| None | Only include these columns | None | Specific |
| exclude_columns | list[str] \| None | Exclude these columns | None | Specific |
| delimiter | str | CSV delimiter | "," | Specific |
| quotechar | str | CSV quote character | '"' | Specific |
| has_header | bool | Whether CSV has a header row | True | Specific |
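
The parsing and filtering parameters map closely onto Python's built-in `csv` module, so their effect can be sketched without Upsonic. The helpers `read_rows` and `filter_columns` below are hypothetical, and the synthetic `column_N` names used when `has_header` is False are an assumption about how a headerless file could be handled, not documented loader behavior.

```python
import csv
import io
import json


def read_rows(text, delimiter=",", quotechar='"', has_header=True):
    """Parse CSV text into a list of dicts (illustrative only)."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter, quotechar=quotechar)
    rows = list(reader)
    if not rows:
        return []
    if has_header:
        header, data = rows[0], rows[1:]
    else:
        # Assumption: headerless files get synthetic column names.
        header, data = [f"column_{i}" for i in range(len(rows[0]))], rows
    return [dict(zip(header, row)) for row in data]


def filter_columns(row, include_columns=None, exclude_columns=None):
    """Keep included columns (all if None), then drop excluded ones."""
    excluded = set(exclude_columns or [])
    return {
        key: value
        for key, value in row.items()
        if (include_columns is None or key in include_columns) and key not in excluded
    }


# Semicolon-delimited input with single-quoted fields.
sample = "name;description;internal_id\nLaptop;'13-inch; ultrabook';A1\n"
rows = read_rows(sample, delimiter=";", quotechar="'")
filtered = [filter_columns(row, exclude_columns=["internal_id"]) for row in rows]

# A JSON rendering of each row, in the spirit of content_synthesis_mode="json".
as_json = [json.dumps(row) for row in filtered]
print(as_json[0])
```

Note that the quoted field keeps its embedded semicolon, which is why `quotechar` has to match the file's quoting style whenever the delimiter can appear inside values.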