# HuggingFace Embeddings

> Using HuggingFace embedding models with Upsonic

## Overview

The HuggingFace provider gives Upsonic access to thousands of embedding models hosted on the HuggingFace Hub. It supports both local model execution and the HuggingFace Inference API, with options for quantization, GPU acceleration, and custom pooling strategies.

**Provider Class:** `HuggingFaceEmbedding`

**Config Class:** `HuggingFaceEmbeddingConfig`

## Dependencies

```bash
uv pip install transformers torch
```

For the Inference API (optional):

```bash
uv pip install huggingface_hub
```

## Examples

```python
from upsonic import Agent, Task, KnowledgeBase
from upsonic.embeddings import HuggingFaceEmbedding, HuggingFaceEmbeddingConfig
from upsonic.vectordb import ChromaProvider, ChromaConfig, ConnectionConfig, Mode

# Create embedding provider (local)
embedding = HuggingFaceEmbedding(HuggingFaceEmbeddingConfig(
    model_name="sentence-transformers/all-MiniLM-L6-v2",
    use_local=True,
    pooling_strategy="mean"
))

# Set up the KnowledgeBase
vectordb = ChromaProvider(ChromaConfig(
    collection_name="hf_docs",
    vector_size=384,  # must match the embedding dimension (384 for all-MiniLM-L6-v2)
    connection=ConnectionConfig(mode=Mode.IN_MEMORY)
))

kb = KnowledgeBase(
    sources=["document.txt"],
    embedding_provider=embedding,
    vectordb=vectordb
)

# Query with Agent
agent = Agent("anthropic/claude-sonnet-4-5")
task = Task("What is this document about?", context=[kb])
result = agent.do(task)
print(result)
```
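
The example above runs the model locally. To use the Inference API instead, the same config class can be built with the API-related parameters from the table below. This is a minimal sketch, not a verified recipe: it assumes that `use_local` should be set to `False` whenever `use_api` is `True` (this page does not state how the two flags interact), and reading the token from an `HF_TOKEN` environment variable is an illustrative choice.

```python
import os

from upsonic.embeddings import HuggingFaceEmbedding, HuggingFaceEmbeddingConfig

# Assumption: the token is stored in an HF_TOKEN environment variable.
hf_token = os.environ.get("HF_TOKEN")

# Create embedding provider (Inference API)
embedding = HuggingFaceEmbedding(HuggingFaceEmbeddingConfig(
    model_name="sentence-transformers/all-MiniLM-L6-v2",
    use_api=True,          # send requests to the Inference API
    use_local=False,       # assumption: disable local execution explicitly
    hf_token=hf_token,
    wait_for_model=True    # wait for the hosted model to load
))
```

The resulting `embedding` object can be passed to `KnowledgeBase` as `embedding_provider` exactly as in the local example.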

## Parameters

| Parameter                       | Type          | Description                                          | Default                                    | Source   |
| ------------------------------- | ------------- | ---------------------------------------------------- | ------------------------------------------ | -------- |
| `model_name`                    | `str`         | HuggingFace model name or path                       | `"sentence-transformers/all-MiniLM-L6-v2"` | Specific |
| `hf_token`                      | `str \| None` | HuggingFace API token                                | `None`                                     | Specific |
| `use_api`                       | `bool`        | Use HuggingFace Inference API instead of local model | `False`                                    | Specific |
| `use_local`                     | `bool`        | Use local model execution                            | `True`                                     | Specific |
| `device`                        | `str \| None` | Device to run model on (auto-detected if None)       | `None`                                     | Specific |
| `torch_dtype`                   | `str`         | PyTorch data type (float16, float32, bfloat16)       | `"float32"`                                | Specific |
| `trust_remote_code`             | `bool`        | Trust remote code in model                           | `False`                                    | Specific |
| `max_seq_length`                | `int \| None` | Maximum sequence length                              | `None`                                     | Specific |
| `pooling_strategy`              | `str`         | Pooling strategy (mean, cls, max)                    | `"mean"`                                   | Specific |
| `enable_quantization`           | `bool`        | Enable model quantization                            | `False`                                    | Specific |
| `quantization_bits`             | `int`         | Quantization bits (4, 8, 16)                         | `8`                                        | Specific |
| `enable_gradient_checkpointing` | `bool`        | Enable gradient checkpointing to save memory         | `False`                                    | Specific |
| `wait_for_model`                | `bool`        | Wait for model to load if using API                  | `True`                                     | Specific |
| `timeout`                       | `int \| None` | Timeout for model                                    | `None`                                     | Specific |
| `cache_dir`                     | `str \| None` | Model cache directory                                | `None`                                     | Specific |
| `force_download`                | `bool`        | Force re-download of model                           | `False`                                    | Specific |
| `batch_size`                    | `int`         | Batch size for document embedding                    | `100`                                      | Base     |
| `normalize_embeddings`          | `bool`        | Whether to normalize embeddings to unit length       | `True`                                     | Base     |
| `show_progress`                 | `bool`        | Whether to show progress during batch operations     | `True`                                     | Base     |
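
To make the `pooling_strategy` and `normalize_embeddings` options more concrete, the sketch below shows what mean, CLS, and max pooling conventionally compute over per-token vectors, followed by unit-length normalization. It is plain Python for illustration only and is not Upsonic's implementation; the helper names `pool` and `l2_normalize` are hypothetical, and the sketch assumes right-padded input where the first token is the CLS token.

```python
import math


def pool(token_vectors, attention_mask, strategy="mean"):
    """Collapse per-token vectors into one sentence vector."""
    if strategy == "cls":
        # Assumption: the first position holds the [CLS] token.
        return list(token_vectors[0])
    # Keep only real tokens (mask value 1); padding positions are dropped.
    kept = [vec for vec, keep in zip(token_vectors, attention_mask) if keep]
    if not kept:
        raise ValueError("attention_mask selects no tokens")
    columns = list(zip(*kept))  # transpose: one tuple per embedding dimension
    if strategy == "mean":
        return [sum(col) / len(col) for col in columns]
    if strategy == "max":
        return [max(col) for col in columns]
    raise ValueError(f"unknown pooling strategy: {strategy!r}")


def l2_normalize(vector):
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector] if norm > 0 else list(vector)


# Three token vectors of dimension 2; the last position is padding.
tokens = [[1.0, 2.0], [3.0, 4.0], [9.0, 9.0]]
mask = [1, 1, 0]

mean_vec = pool(tokens, mask, "mean")
cls_vec = pool(tokens, mask, "cls")
max_vec = pool(tokens, mask, "max")
unit_vec = l2_normalize([3.0, 4.0])
```

With unit-length vectors, the dot product of two embeddings equals their cosine similarity, which is the usual reason for leaving normalization enabled.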
