Setup

Get your keys from the Langfuse dashboard → Settings → API Keys, then export them as environment variables:

export LANGFUSE_PUBLIC_KEY=pk-lf-...
export LANGFUSE_SECRET_KEY=sk-lf-...

Install the optional dependencies (the quotes keep shells such as zsh from expanding the brackets):

pip install "upsonic[langfuse]"
# or
uv sync --extra langfuse
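
Before constructing the client, it can help to confirm that both variables are actually visible to the Python process. The following is a minimal sketch using only the standard library; `missing_langfuse_keys` is a hypothetical helper written for illustration and is not part of Upsonic or Langfuse.

```python
import os
from typing import Mapping, Optional

# The two variables exported in the setup step.
REQUIRED_VARS = ("LANGFUSE_PUBLIC_KEY", "LANGFUSE_SECRET_KEY")


def missing_langfuse_keys(env: Optional[Mapping[str, str]] = None) -> list[str]:
    """Return the names of required variables that are unset or blank."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_langfuse_keys()
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
    else:
        print("Langfuse keys found in the environment.")
```

Passing a plain dict instead of relying on `os.environ` makes the check easy to exercise in tests.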

Usage with Agent / Task

When you pass instrument=langfuse to an Agent, every agent.do() or agent.print_do() call (including the async versions) is automatically traced to Langfuse.

Minimal — Keys from Environment

from upsonic import Agent, Task
from upsonic.integrations.langfuse import Langfuse

langfuse = Langfuse()
agent = Agent("anthropic/claude-sonnet-4-6", instrument=langfuse)

task = Task(description="What is 2 + 2?")
agent.print_do(task)

langfuse.shutdown()
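
If the agent call raises, a trailing langfuse.shutdown() at the end of the script is never reached. One way to guard against that is a small context manager that always calls shutdown(). This is a sketch, not part of Upsonic: `flushed` and `FakeClient` are hypothetical names, and the stand-in client exists only so the example runs without network access. The flush_on_exit option listed in the Parameters Reference may already cover normal process exits.

```python
from contextlib import contextmanager


@contextmanager
def flushed(client):
    """Yield `client`, then call its shutdown() even if the body raises."""
    try:
        yield client
    finally:
        client.shutdown()


class FakeClient:
    """Stand-in for the Langfuse object (assumption: only shutdown() is needed)."""

    def __init__(self):
        self.closed = False

    def shutdown(self):
        self.closed = True


fake = FakeClient()
try:
    with flushed(fake):
        raise RuntimeError("agent call failed")
except RuntimeError:
    pass

print(fake.closed)  # True: shutdown() ran despite the exception
```

With the real client, the same pattern would wrap the Agent construction and the agent.print_do(task) call inside `with flushed(Langfuse()) as langfuse:`.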

Full Configuration with Session & User Tracking

from upsonic import Agent, Task
from upsonic.integrations.langfuse import Langfuse
from upsonic.tools import tool

@tool
def get_weather(city: str) -> str:
    """Get the current weather for a city."""
    return f"Sunny in {city}, 22°C"

langfuse = Langfuse(
    public_key="pk-lf-abc123",
    secret_key="sk-lf-xyz789",
    region="us",                  # "eu" (default) or "us"
    service_name="my-agent",
    include_content=True,         # include prompts/responses (default)
)

agent = Agent(
    "anthropic/claude-sonnet-4-6",
    instrument=langfuse,
    session_id="chat-session-001",
    user_id="user@example.com",
    tools=[get_weather],
)

task = Task(description="What is the weather in Paris?")
agent.print_do(task)

langfuse.shutdown()

Usage with Evaluation

AccuracyEvaluator with Langfuse Datasets

When you pass langfuse to AccuracyEvaluator, evaluation results are automatically:
  1. Logged as a dataset item
  2. Linked to the agent’s trace via a dataset run item
  3. Scored on the trace with the evaluation result

import asyncio
from upsonic import Agent
from upsonic.integrations.langfuse import Langfuse
from upsonic.eval import AccuracyEvaluator

langfuse = Langfuse()

agent = Agent("anthropic/claude-sonnet-4-6", instrument=langfuse)
judge = Agent("anthropic/claude-sonnet-4-6")

evaluator = AccuracyEvaluator(
    judge_agent=judge,
    agent_under_test=agent,
    query="What is the capital of France?",
    expected_output="Paris",
    langfuse=langfuse,
    langfuse_dataset_name="my-eval-dataset",  # optional, default: "accuracy-eval"
    langfuse_run_name="run-v1",               # optional, default: auto-generated
)

result = asyncio.run(evaluator.run())
print(f"Score: {result.average_score}/10")

langfuse.shutdown()

AccuracyEvaluator with Multiple Iterations

Run the same query multiple times to average out run-to-run variation in the score:

import asyncio
from upsonic import Agent
from upsonic.integrations.langfuse import Langfuse
from upsonic.eval import AccuracyEvaluator

langfuse = Langfuse()
agent = Agent("anthropic/claude-sonnet-4-6", instrument=langfuse)
judge = Agent("anthropic/claude-sonnet-4-6")

evaluator = AccuracyEvaluator(
    judge_agent=judge,
    agent_under_test=agent,
    query="What is 10 + 5? Reply with just the number.",
    expected_output="15",
    num_iterations=3,
    langfuse=langfuse,
    langfuse_dataset_name="math-eval",
)

result = asyncio.run(evaluator.run())
print(f"Average score: {result.average_score}/10")
print(f"Iterations: {len(result.evaluation_scores)}")

langfuse.shutdown()
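
When running several iterations, the spread of the scores is often as informative as the average. The sketch below summarizes a list of scores with the standard library. It assumes the per-iteration scores have already been extracted as plain numbers on the 0 to 10 scale; the element type of result.evaluation_scores is not shown here, so the hypothetical `summarize_scores` helper takes an ordinary list.

```python
from statistics import mean, stdev


def summarize_scores(scores):
    """Return (mean, sample standard deviation) for a list of numeric scores."""
    if not scores:
        raise ValueError("at least one score is required")
    # stdev() needs at least two data points; treat a single run as zero spread.
    spread = stdev(scores) if len(scores) > 1 else 0.0
    return mean(scores), spread


avg, spread = summarize_scores([8, 9, 10])
print(f"mean={avg:.2f} stdev={spread:.2f}")  # mean=9.00 stdev=1.00
```

A large standard deviation relative to the mean suggests that num_iterations is too low to trust the average.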

AccuracyEvaluator Parameters

Parameter              Type      Default          Description
langfuse               Langfuse  None             Langfuse instance for dataset logging
langfuse_dataset_name  str       "accuracy-eval"  Name of the Langfuse dataset to create/use
langfuse_run_name      str       auto-generated   Name for the dataset run
num_iterations         int       1                Number of times to run the evaluation

Advanced APIs

For direct access to Langfuse’s scoring, datasets, and annotation queue APIs, see the Advanced Langfuse Guide.

Parameters Reference

Parameter        Type         Default                  Description
public_key       str          env LANGFUSE_PUBLIC_KEY  Langfuse public key
secret_key       str          env LANGFUSE_SECRET_KEY  Langfuse secret key
host             str          env LANGFUSE_HOST        Custom host URL
region           "eu" | "us"  "eu"                     Cloud region (ignored if host is set)
service_name     str          "upsonic"                Service name in traces
sample_rate      float        1.0                      Fraction of traces to sample (0.0–1.0)
include_content  bool         True                     Include prompts/responses in traces
flush_on_exit    bool         True                     Auto-flush on process exit
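
The interaction between host and region can be illustrated with a short sketch: an explicit host takes precedence, and the region is only consulted when no host is given. `resolve_host` is a hypothetical helper written for this example and is not Upsonic's implementation; the two URLs are the public Langfuse Cloud endpoints, and the self-hosted address is a placeholder.

```python
from typing import Optional

# Langfuse Cloud base URLs for the two supported regions.
REGION_HOSTS = {
    "eu": "https://cloud.langfuse.com",
    "us": "https://us.cloud.langfuse.com",
}


def resolve_host(host: Optional[str] = None, region: str = "eu") -> str:
    """Pick the base URL: an explicit host wins, otherwise map the region."""
    if host:
        return host
    try:
        return REGION_HOSTS[region]
    except KeyError:
        raise ValueError(f"unknown region: {region!r}") from None


print(resolve_host())                 # https://cloud.langfuse.com
print(resolve_host(region="us"))      # https://us.cloud.langfuse.com
# Placeholder self-hosted URL; the region argument is ignored here.
print(resolve_host(host="https://langfuse.internal.example", region="us"))
```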

Environment Variables

Variable             Description
LANGFUSE_PUBLIC_KEY  Langfuse public key (pk-lf-...)
LANGFUSE_SECRET_KEY  Langfuse secret key (sk-lf-...)
LANGFUSE_HOST        Custom Langfuse host URL