Overview
By default, KnowledgeBase stores document chunks only in the vector database. When you pass a storage backend, KnowledgeBase also writes a document registry: a relational record of every document it has processed, including metadata, content hashes, chunk counts, and processing status.
This is useful when you need to:
- Track which documents are indexed across restarts without querying the vector database
- Share a storage backend between Memory and KnowledgeBase for a unified persistence layer
- Audit document lifecycle: see when documents were added, their status, and source paths
- Enable source removal by document ID: storage lets remove_document() look up the original file path and clean up sources
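The last two points can be illustrated with a toy registry. This is a minimal sketch, not Upsonic's implementation: ToyRegistry, RegistryEntry, and their methods are hypothetical names, and the entry is reduced to three of the registry fields. It only shows why a relational record makes it possible to map a document ID back to its source path and to list indexed documents without querying the vector database.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class RegistryEntry:
    """One tracked document (hypothetical, reduced to three fields)."""
    doc_id: str
    source: str  # original file path
    status: str  # "indexed" or "failed"


class ToyRegistry:
    """Hypothetical in-memory stand-in for a document registry."""

    def __init__(self) -> None:
        self._entries: Dict[str, RegistryEntry] = {}

    def add(self, entry: RegistryEntry) -> None:
        self._entries[entry.doc_id] = entry

    def indexed_ids(self) -> List[str]:
        # Answer "what is indexed?" without touching the vector database.
        return sorted(
            e.doc_id for e in self._entries.values() if e.status == "indexed"
        )

    def remove(self, doc_id: str) -> Optional[str]:
        # Return the source path so the caller can clean it up; None if unknown.
        entry = self._entries.pop(doc_id, None)
        return entry.source if entry else None


registry = ToyRegistry()
registry.add(RegistryEntry("a1", "docs/guide.md", "indexed"))
registry.add(RegistryEntry("b2", "docs/broken.pdf", "failed"))
removed_source = registry.remove("a1")  # the path a cleanup step would need
```

Without such a record, a removal by ID would have no way to recover the original path from the vector store's chunks alone.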
Quick Start
Pass any Upsonic storage backend as the storage parameter, for example storage=SqliteStorage(db_file="app.db").
After setup(), every processed document is recorded in the storage's knowledge table (upsonic_knowledge by default). When you call add_source(), add_text(), or remove_document(), the registry is updated automatically.
What Gets Persisted
Each processed document creates a row in the knowledge table:

| Field | Description |
|---|---|
| id | Document ID (content-based hash) |
| name | Human-readable document name |
| type | File extension (e.g., pdf, md) |
| size | File size in bytes |
| knowledge_base_id | ID of the parent KnowledgeBase |
| content_hash | MD5 hash of document content for deduplication |
| chunk_count | Number of chunks created from this document |
| source | Original file path |
| status | Processing status (indexed, failed) |
| metadata | Full document metadata as JSON |
| created_at | Timestamp of first indexing |
| updated_at | Timestamp of last update |

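To make the row layout concrete, here is a rough sketch of a registry table built with Python's standard sqlite3 module. It is an illustration under assumptions, not the actual upsonic_knowledge schema: the table name knowledge_sketch, the column types, and the record_document helper are all hypothetical, the document ID is assumed to reuse the MD5 content hash, and the encoded content length stands in for the file size.

```python
import hashlib
import json
import os
import sqlite3
from datetime import datetime, timezone

# Hypothetical schema mirroring the fields listed above; the real table
# may use different column types or constraints.
SCHEMA = """
CREATE TABLE IF NOT EXISTS knowledge_sketch (
    id TEXT PRIMARY KEY, name TEXT, type TEXT, size INTEGER,
    knowledge_base_id TEXT, content_hash TEXT, chunk_count INTEGER,
    source TEXT, status TEXT, metadata TEXT,
    created_at TEXT, updated_at TEXT
)
"""


def record_document(conn, kb_id, name, source, content, chunk_count,
                    status="indexed"):
    """Insert a registry row, or update it if the same content was seen."""
    data = content.encode("utf-8")
    content_hash = hashlib.md5(data).hexdigest()
    doc_id = content_hash  # assumption: the ID is derived from the same hash
    now = datetime.now(timezone.utc).isoformat()
    exists = conn.execute(
        "SELECT 1 FROM knowledge_sketch WHERE id = ?", (doc_id,)
    ).fetchone()
    if exists:
        # Same content again: refresh the mutable fields, keep created_at.
        conn.execute(
            "UPDATE knowledge_sketch SET chunk_count = ?, status = ?, "
            "updated_at = ? WHERE id = ?",
            (chunk_count, status, now, doc_id),
        )
    else:
        file_type = os.path.splitext(source)[1].lstrip(".")
        conn.execute(
            "INSERT INTO knowledge_sketch VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)",
            (doc_id, name, file_type, len(data), kb_id, content_hash,
             chunk_count, source, status, json.dumps({"source": source}),
             now, now),
        )
    conn.commit()
    return doc_id


conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)
first_id = record_document(conn, "kb1", "Guide", "docs/guide.md", "hello", 2)
second_id = record_document(conn, "kb1", "Guide", "docs/guide.md", "hello", 3)
```

Because the ID is derived from the content, recording the same document twice updates the existing row instead of creating a duplicate, which is the deduplication role the content_hash field describes.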
Supported Backends
Any Upsonic storage backend works, including the same ones used for Memory:

| Backend | Example |
|---|---|
| SqliteStorage | SqliteStorage(db_file="app.db") |
| PostgresStorage | PostgresStorage(db_url="postgresql://...") |
| RedisStorage | RedisStorage(db_url="redis://...") |
| MongoStorage | MongoStorage(db_url="mongodb://...") |
| JSONStorage | JSONStorage(db_path="./data") |
| InMemoryStorage | InMemoryStorage() |
| Mem0Storage | Mem0Storage(api_key="...") |

Async variants (AsyncSqliteStorage, AsyncPostgresStorage, AsyncMongoStorage, AsyncMem0Storage) are also supported.

