Setup

Get your API key from the PromptLayer dashboard → Settings → API Keys.
export PROMPTLAYER_API_KEY=pl_...
export OPENAI_API_KEY=sk-...
No extra dependencies are required; the PromptLayer integration uses httpx, which is already included.
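
You can also construct the client with an explicit key instead of relying on environment variables; a minimal sketch using the api_key parameter from the Parameters Reference below:

import os
from upsonic.integrations.promptlayer import PromptLayer

# Reads PROMPTLAYER_API_KEY from the environment by default
pl = PromptLayer()

# Or pass the key explicitly
pl = PromptLayer(api_key=os.environ["PROMPTLAYER_API_KEY"])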

Usage with Agent / Task

Every agent.do() or agent.print_do() call is automatically logged to PromptLayer when you pass promptlayer=pl.

Minimal — Key from Environment

from upsonic import Agent, Task
from upsonic.integrations.promptlayer import PromptLayer

pl = PromptLayer()
agent = Agent("anthropic/claude-sonnet-4-6", promptlayer=pl)

task = Task(description="What is 2 + 2?")
agent.print_do(task)

pl.shutdown()

Full Configuration with Tools and Prompt Registry

from upsonic import Agent, Task
from upsonic.integrations.promptlayer import PromptLayer
from upsonic.tools import tool

@tool
def get_weather(city: str) -> str:
    """Get the current weather for a city."""
    return f"Sunny in {city}, 22°C"

pl = PromptLayer()

agent = Agent(
    "anthropic/claude-sonnet-4-6",
    system_prompt=pl.get_prompt("my-agent-v2"),  # load versioned prompt from PromptLayer
    tools=[get_weather],
    promptlayer=pl,
)

task = Task(description="What is the weather in Paris?")
agent.print_do(task)

pl.shutdown()

Usage with Evaluation

AccuracyEvaluator with PromptLayer Datasets

When you pass promptlayer to AccuracyEvaluator, evaluation results are automatically:
  1. Logged as a PromptLayer request (with score, metadata, tags)
  2. Organized into a dataset group, which is created if it doesn't exist
By default, the mode is "log_only": the eval is logged as a request and the dataset group is created, but no CSV is uploaded. You control when to create dataset versions.

import asyncio
from upsonic import Agent
from upsonic.integrations.promptlayer import PromptLayer
from upsonic.eval import AccuracyEvaluator

pl = PromptLayer()

agent = Agent("anthropic/claude-sonnet-4-6", promptlayer=pl)
judge = Agent("anthropic/claude-sonnet-4-6")

evaluator = AccuracyEvaluator(
    judge_agent=judge,
    agent_under_test=agent,
    query="What is the capital of France?",
    expected_output="Paris",
    promptlayer=pl,
    promptlayer_dataset_name="my-eval-dataset",  # optional, default: "accuracy-eval"
)

result = asyncio.run(evaluator.run())
print(f"Score: {result.average_score}/10")

pl.shutdown()

Dataset Modes

Control how evaluation data is stored with promptlayer_dataset_mode:
| Mode | Behavior |
| --- | --- |
| "log_only" (default) | Creates the dataset group and logs the eval as a request. Use the PromptLayer UI or create_dataset_version_from_filter to pull logged requests into a dataset version when you're ready. |
| "new_version" | Each eval run uploads a new CSV version to the dataset group automatically. |
import asyncio
from upsonic import Agent
from upsonic.integrations.promptlayer import PromptLayer
from upsonic.eval import AccuracyEvaluator

pl = PromptLayer()

agent = Agent("anthropic/claude-sonnet-4-6", promptlayer=pl)
judge = Agent("anthropic/claude-sonnet-4-6")

# Automatically upload CSV on each eval run
evaluator = AccuracyEvaluator(
    judge_agent=judge,
    agent_under_test=agent,
    query="What is 2 + 2?",
    expected_output="4",
    promptlayer=pl,
    promptlayer_dataset_name="my-eval-dataset",
    promptlayer_dataset_mode="new_version",
)

result = asyncio.run(evaluator.run())
print(f"Score: {result.average_score}/10")

pl.shutdown()

AccuracyEvaluator with Multiple Iterations

Run the same query multiple times — all iterations land as rows in a single CSV version:
import asyncio
from upsonic import Agent
from upsonic.integrations.promptlayer import PromptLayer
from upsonic.eval import AccuracyEvaluator

pl = PromptLayer()
agent = Agent("anthropic/claude-sonnet-4-6", promptlayer=pl)
judge = Agent("anthropic/claude-sonnet-4-6")

evaluator = AccuracyEvaluator(
    judge_agent=judge,
    agent_under_test=agent,
    query="What is 10 + 5? Reply with just the number.",
    expected_output="15",
    num_iterations=3,
    promptlayer=pl,
    promptlayer_dataset_name="math-eval",
    promptlayer_dataset_mode="new_version",
)

result = asyncio.run(evaluator.run())
print(f"Average score: {result.average_score}/10")
print(f"Iterations: {len(result.evaluation_scores)}")

pl.shutdown()

AccuracyEvaluator Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| promptlayer | PromptLayer | None | PromptLayer instance for logging and datasets |
| promptlayer_dataset_name | str | "accuracy-eval" | Name of the dataset group to create/use |
| promptlayer_dataset_mode | str | "log_only" | "log_only" or "new_version" |
| num_iterations | int | 1 | Number of evaluation iterations |

Advanced APIs

For direct access to PromptLayer’s dataset, report, and evaluation APIs, see the Advanced PromptLayer Guide.

Parameters Reference

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| api_key | str | env PROMPTLAYER_API_KEY | PromptLayer API key (pl_...) |
| base_url | str | https://api.promptlayer.com | Custom API base URL |
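
For example, pointing the client at a custom endpoint (the URL below is a placeholder, not a real deployment):

from upsonic.integrations.promptlayer import PromptLayer

pl = PromptLayer(
    api_key="pl_...",
    base_url="https://promptlayer.example.internal",  # placeholder custom endpoint
)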

Environment Variables

| Variable | Description |
| --- | --- |
| PROMPTLAYER_API_KEY | PromptLayer API key (pl_...) |
| PROMPTLAYER_BASE_URL | Custom PromptLayer API base URL |
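
Both can be set in the shell before running your script; the base URL below is a placeholder:

export PROMPTLAYER_API_KEY=pl_...
export PROMPTLAYER_BASE_URL=https://promptlayer.example.internal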