Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| agent | Agent | Required | Pre-configured Agent class for cognitive processing |
| config | Optional[AgenticChunkingConfig] | None | Configuration object with all settings |
Functions
__init__
Initialize agentic chunker.
Parameters:
agent
(Agent): Pre-configured Agent class for cognitive processing
config
(Optional[AgenticChunkingConfig]): Configuration object with all settings
_chunk_document
Synchronous chunking - delegates to async implementation.
Parameters:
document
(Document): Document to chunk
Returns:
List[Chunk]
: List of chunks
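The sync-to-async delegation described above can be sketched as follows. This is a minimal illustration, not the library's code: `achunk_document` and `chunk_document` are hypothetical stand-ins that work on plain strings, and the blank-line split only takes the place of the real agent calls.

```python
import asyncio
from typing import List


async def achunk_document(text: str) -> List[str]:
    """Hypothetical async pipeline; splits on blank lines as a stand-in."""
    await asyncio.sleep(0)  # placeholder for awaiting agent calls
    return [part.strip() for part in text.split("\n\n") if part.strip()]


def chunk_document(text: str) -> List[str]:
    """Synchronous wrapper that runs the async pipeline to completion."""
    # asyncio.run creates a fresh event loop. It raises RuntimeError when the
    # calling thread already has a running loop, so callers inside async code
    # should await achunk_document directly instead.
    return asyncio.run(achunk_document(text))
```

Whether the actual implementation uses `asyncio.run` or handles an already-running loop differently is not stated in this reference.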
_achunk_document
Core agentic chunking pipeline.
Parameters:
document
(Document): Document to process with AI agents
Returns:
List[Chunk]
: List of cognitively-optimized chunks with rich metadata
_get_cache_key
Generate cache key for text content.
Parameters:
text
(str): Text content to generate key for
Returns:
str
: MD5 hash of the text content
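An MD5-based cache key of this kind can be produced with the standard library. The sketch below assumes UTF-8 encoding and a hexadecimal digest; neither detail is confirmed by this reference, and `get_cache_key` here is a free function rather than the class method.

```python
import hashlib


def get_cache_key(text: str) -> str:
    """Return a 32-character hex MD5 digest of the text (UTF-8 assumed)."""
    # MD5 is used only as a fast content fingerprint for cache lookups,
    # not for any security purpose.
    return hashlib.md5(text.encode("utf-8")).hexdigest()
```

Identical text always maps to the same key, so repeated requests for the same content can reuse a cached result.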
_extract_propositions
Extract atomic propositions from document using AI agent.
Parameters:
document
(Document): Document to extract propositions from
Returns:
List[str]
: List of extracted propositions
_validate_propositions
Validate proposition quality and filter invalid ones.
Parameters:
propositions
(List[str]): List of propositions to validate
Returns:
List[str]
: List of validated propositions
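The reference does not say which quality checks are applied. As a rough sketch under assumed rules (normalize whitespace, drop very short entries, drop case-insensitive duplicates), a filter might look like this. The `min_length` parameter is hypothetical.

```python
from typing import List


def validate_propositions(propositions: List[str], min_length: int = 10) -> List[str]:
    """Keep propositions that pass simple, assumed quality checks."""
    seen = set()
    valid = []
    for prop in propositions:
        cleaned = " ".join(prop.split())  # collapse runs of whitespace
        if len(cleaned) < min_length:
            continue  # too short to be a meaningful statement (assumed rule)
        key = cleaned.lower()
        if key in seen:
            continue  # duplicate of an earlier proposition
        seen.add(key)
        valid.append(cleaned)
    return valid
```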
_group_propositions_into_topics
Group propositions into coherent thematic topics using AI agent.
Parameters:
propositions
(List[str]): List of propositions to group
document
(Document): Source document
Returns:
List[Topic]
: List of topics containing grouped propositions
_validate_topic_sizes
Validate topic sizes and split oversized topics.
Parameters:
topics
(List[Topic]): List of topics to validate
Returns:
List[Topic]
: List of validated topics
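One way to split oversized topics is to cut each topic into consecutive slices of a maximum size. The sketch below measures size as a proposition count; the real limit may be expressed in characters or tokens. The `Topic` dataclass and `max_props` are hypothetical stand-ins for the library's types and settings.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Topic:
    """Hypothetical stand-in for the library's Topic type."""
    topic_id: int
    propositions: List[str] = field(default_factory=list)


def split_oversized(topics: List[Topic], max_props: int = 3) -> List[Topic]:
    """Split any topic with more than max_props propositions into slices."""
    result: List[Topic] = []
    next_id = 0
    for topic in topics:
        props = topic.propositions
        # Topics within the limit yield one slice; empty topics yield none.
        for start in range(0, len(props), max_props):
            result.append(Topic(next_id, props[start:start + max_props]))
            next_id += 1
    return result
```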
_optimize_topics
Optimize topic assignments by merging small topics.
Parameters:
topics
(List[Topic]): List of topics to optimize
Returns:
List[Topic]
: List of optimized topics
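Merging small topics can be sketched as a single pass that folds each undersized topic into its neighbor. The merge policy (fold into the adjacent topic, keep the first title), the `Topic` dataclass, and `min_props` are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Topic:
    """Hypothetical stand-in for the library's Topic type."""
    title: str
    propositions: List[str] = field(default_factory=list)


def merge_small_topics(topics: List[Topic], min_props: int = 2) -> List[Topic]:
    """Merge topics with fewer than min_props propositions into a neighbor."""
    merged: List[Topic] = []
    for topic in topics:
        if merged and len(merged[-1].propositions) < min_props:
            # The previous topic is still too small: absorb this one into it.
            merged[-1].propositions.extend(topic.propositions)
        else:
            merged.append(Topic(topic.title, list(topic.propositions)))
    # A small topic left at the end is folded back into the one before it.
    if len(merged) > 1 and len(merged[-1].propositions) < min_props:
        tail = merged.pop()
        merged[-1].propositions.extend(tail.propositions)
    return merged
```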
_find_chunk_indices_in_document
Find start and end indices of chunk text within document.
Parameters:
chunk_text
(str): Text content of the chunk
document_content
(str): Full document content
Returns:
Tuple[int, int]
: Start and end indices
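For an exact substring match, the index lookup reduces to `str.find`. The sketch below returns an exclusive end index and `(-1, -1)` when the text is not found; both conventions are assumptions, and the library may handle rewritten or non-verbatim chunk text differently.

```python
from typing import Tuple


def find_chunk_indices(chunk_text: str, document_content: str) -> Tuple[int, int]:
    """Return (start, end) of chunk_text in the document, end exclusive."""
    start = document_content.find(chunk_text)
    if start == -1:
        # Assumed convention for "not found"; agent-rewritten text often
        # will not match the source verbatim.
        return -1, -1
    return start, start + len(chunk_text)
```

With these conventions, `document_content[start:end]` reproduces the chunk text whenever a match exists.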
_create_chunks_from_topics
Create chunks from grouped topics.
Parameters:
topics
(List[Topic]): List of topics to create chunks from
document
(Document): Source document
Returns:
List[Chunk]
: List of created chunks
_finalize_topic_chunk
Finalize a chunk from topics with metadata enrichment.
Parameters:
topics
(List[Topic]): List of topics for the chunk
document
(Document): Source document
Returns:
Chunk
: Finalized chunk with enriched metadata
_refine_topic_metadata
Refine topic metadata using AI agent.
Parameters:
chunk_text
(str): Text content of the chunk
topic
(Topic): Base topic for metadata
Returns:
RefinedTopic
: Refined topic with title and summary
_score_chunk_coherence
Score chunk coherence and add quality assessment.
Parameters:
chunks
(List[Chunk]): List of chunks to score
Returns:
List[Chunk]
: List of chunks with coherence scores
_calculate_coherence_score
Calculate coherence score for a chunk.
Parameters:
chunk
(Chunk): Chunk to calculate score for
Returns:
float
: Coherence score between 0 and 1
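The scoring formula is not given in this reference. To make the idea of a 0-to-1 coherence score concrete, here is a stand-in heuristic that averages the Jaccard word overlap between consecutive sentences. It is not the library's method, which may rely on embeddings or the agent itself.

```python
import re


def coherence_score(text: str) -> float:
    """Average Jaccard word overlap of adjacent sentences, in [0, 1]."""
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    if len(sentences) < 2:
        return 1.0  # a single sentence is treated as fully coherent
    word_sets = [set(re.findall(r"\w+", s.lower())) for s in sentences]
    sims = []
    for a, b in zip(word_sets, word_sets[1:]):
        union = a | b
        sims.append(len(a & b) / len(union) if union else 0.0)
    return sum(sims) / len(sims)
```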
_assess_chunk_quality
Assess chunk quality based on coherence score.
Parameters:
coherence_score
(float): Coherence score to assess
Returns:
str
: Quality assessment (“excellent”, “good”, “fair”, “poor”)
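Mapping a score to one of the four labels is a simple threshold ladder. The cut-off values below (0.8, 0.6, 0.4) are illustrative assumptions; the actual thresholds are not documented here.

```python
def assess_chunk_quality(coherence_score: float) -> str:
    """Map a coherence score in [0, 1] to a quality label."""
    # Threshold values are assumed for illustration only.
    if coherence_score >= 0.8:
        return "excellent"
    if coherence_score >= 0.6:
        return "good"
    if coherence_score >= 0.4:
        return "fair"
    return "poor"
```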
_fallback_chunking
Fallback to recursive chunking on agent failure.
Parameters:
document
(Document): Document to chunk with fallback method
Returns:
List[Chunk]
: List of chunks created with fallback method
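The fallback pattern can be sketched as a try/except around the agentic path, with a recursive character splitter as the safety net. Everything here is a simplified assumption: the separator list, the `max_chars` limit, and the function names are hypothetical, and the library's recursive chunker is likely more elaborate.

```python
from typing import Callable, List, Sequence


def recursive_split(text: str, max_chars: int = 200,
                    separators: Sequence[str] = ("\n\n", "\n", ". ", " ")) -> List[str]:
    """Split text into pieces of at most max_chars, coarsest separator first."""
    if len(text) <= max_chars:
        return [text] if text.strip() else []
    for i, sep in enumerate(separators):
        if sep not in text:
            continue
        chunks: List[str] = []
        current = ""
        for part in text.split(sep):
            candidate = current + sep + part if current else part
            if len(candidate) <= max_chars:
                current = candidate  # keep packing parts into this chunk
                continue
            if current.strip():
                chunks.append(current)
            if len(part) > max_chars:
                # The part alone is too long: retry with finer separators.
                chunks.extend(recursive_split(part, max_chars, separators[i + 1:]))
                current = ""
            else:
                current = part
        if current.strip():
            chunks.append(current)
        return chunks
    # No separator applies: fall back to fixed-width cuts.
    return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]


def chunk_with_fallback(text: str, agentic_chunker: Callable[[str], List[str]],
                        max_chars: int = 200) -> List[str]:
    """Try the agentic path first; use recursive splitting if it fails."""
    try:
        return agentic_chunker(text)
    except Exception:
        return recursive_split(text, max_chars)
```

Catching a broad `Exception` keeps the pipeline producing output when the agent call fails, at the cost of hiding the cause unless it is logged.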
_add_processing_metadata
Add processing metadata to chunks.
Parameters:
chunks
(List[Chunk]): List of chunks to add metadata to
processing_time
(float): Processing time in milliseconds
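As an illustration of how a millisecond timing might be measured and attached, the sketch below uses `time.perf_counter` and plain dictionaries in place of `Chunk` objects. The metadata keys (`processing_time_ms`, `chunk_index`, `total_chunks`) are hypothetical; this reference does not list the keys the library writes.

```python
import time
from typing import Any, Dict, List


def add_processing_metadata(chunks: List[Dict[str, Any]], processing_time: float) -> None:
    """Attach timing and position metadata to each chunk in place."""
    total = len(chunks)
    for index, chunk in enumerate(chunks):
        meta = chunk.setdefault("metadata", {})
        meta["processing_time_ms"] = processing_time  # hypothetical key
        meta["chunk_index"] = index                   # hypothetical key
        meta["total_chunks"] = total                  # hypothetical key


start = time.perf_counter()
chunks = [{"text": "first"}, {"text": "second"}]  # stand-in for real chunking work
elapsed_ms = (time.perf_counter() - start) * 1000.0  # seconds to milliseconds
add_processing_metadata(chunks, elapsed_ms)
```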
get_agentic_stats
Get statistics about agentic processing.
Returns:
Dict[str, Any]
: Dictionary containing processing statistics