Python SDK
The SupaEval Python SDK provides a simple interface for integrating agent evaluation into your Python applications. It is aimed at data scientists, ML engineers, and backend developers.
Prerequisites
Python 3.8 or higher is required. We recommend using a virtual environment to manage dependencies.
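A quick way to confirm that an interpreter meets this requirement is a short check like the one below. This is a minimal sketch using only the standard library; the helper names `meets_minimum_version` and `in_virtual_environment` are illustrative and are not part of the SDK.

```python
import sys

MIN_VERSION = (3, 8)  # minimum version stated in the prerequisites


def meets_minimum_version(version_info=sys.version_info, minimum=MIN_VERSION):
    """Return True if the (major, minor) version is at least `minimum`."""
    return tuple(version_info[:2]) >= tuple(minimum)


def in_virtual_environment():
    """Return True when running inside a venv-style virtual environment."""
    # Inside a venv, sys.prefix points at the environment while
    # sys.base_prefix still points at the base interpreter.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("Python version OK:", meets_minimum_version())
    print("Virtual environment active:", in_virtual_environment())
```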
Installation
Install the SupaEval Python SDK using pip:
bash
pip install supaeval
Authentication
You'll need an API key to use the SupaEval SDK. Get your API key from the SupaEval dashboard.
python
from supaeval import SupaEval
# Option 1: API Key in code
client = SupaEval(api_key="sk_live_...")
# Option 2: Environment variable
# Set SUPAEVAL_API_KEY in your environment
client = SupaEval()
# Option 3: Custom configuration
client = SupaEval(
    api_key="sk_live_...",
    base_url="https://api.supaeval.com",
    timeout=30
)
Keep your API key secure
Never commit your API key to version control. Use environment variables or a secrets manager in production environments.
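One way to follow this advice is to read the key from the environment and fail early when it is missing. The sketch below assumes the `SUPAEVAL_API_KEY` variable named above; `load_api_key` and `redact` are hypothetical helpers written for illustration, not SDK functions.

```python
import os


def load_api_key(env=None, var_name="SUPAEVAL_API_KEY"):
    """Return the API key from the environment, or raise if it is unset."""
    env = os.environ if env is None else env
    key = env.get(var_name, "").strip()
    if not key:
        raise RuntimeError(
            f"{var_name} is not set; export it or load it from your secrets manager"
        )
    return key


def redact(key, visible=4):
    """Mask a key for logging, keeping only the last few characters."""
    if len(key) <= visible:
        return "*" * len(key)
    return "*" * (len(key) - visible) + key[-visible:]
```

The returned value can then be passed as `api_key=` to the client constructor, so the key itself never appears in source code or logs.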
Quick Start
Here's a complete example showing how to create a dataset, add test cases, and run an evaluation:
python
from supaeval import SupaEval
# Initialize the client
client = SupaEval(api_key="your_api_key")
# Create a dataset
dataset = client.datasets.create(
    name="my-evaluation-dataset",
    description="Test dataset for agent evaluation"
)
# Add test cases
dataset.add_items([
    {
        "input": "What is the capital of France?",
        "expected_output": "Paris",
        "metadata": {"difficulty": "easy"}
    },
    {
        "input": "Explain quantum computing",
        "expected_output": "Quantum computing uses quantum bits...",
        "metadata": {"difficulty": "hard"}
    }
])
# Run evaluation
evaluation = client.evaluations.create(
    dataset_id=dataset.id,
    agent_endpoint="https://your-agent.api/chat",
    metrics=["accuracy", "relevance", "faithfulness"]
)
# Get results
results = evaluation.get_results()
print(f"Overall Score: {results.overall_score}")
print(f"Pass Rate: {results.pass_rate}%")
Async Support
The SDK provides full async/await support for non-blocking operations:
python
import asyncio
from supaeval import AsyncSupaEval
async def main():
    client = AsyncSupaEval(api_key="your_api_key")
    # Async evaluation
    evaluation = await client.evaluations.create(
        dataset_id="dataset_123",
        agent_endpoint="https://your-agent.api/chat"
    )
    # Wait for completion
    results = await evaluation.wait_for_completion()
    print(results)

asyncio.run(main())
Key Features
Type Hints
Full type annotations for better IDE support and fewer bugs
Async/Await
Native async support for high-performance applications
Automatic Retries
Built-in retry logic with exponential backoff
Streaming Support
Stream evaluation results as they become available
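To illustrate what retry logic with exponential backoff generally looks like, here is a generic sketch. It is not the SDK's internal implementation, whose retry counts and delays are not documented on this page. The names `backoff_delays` and `retry_with_backoff`, their parameters and defaults, and the choice of `ConnectionError` as the retryable error are all assumptions made for illustration.

```python
import random
import time


def backoff_delays(max_retries=3, base_delay=0.5, factor=2.0, max_delay=30.0):
    """Return the un-jittered delay (in seconds) before each retry."""
    return [
        min(base_delay * factor**attempt, max_delay)
        for attempt in range(max_retries)
    ]


def retry_with_backoff(func, max_retries=3, base_delay=0.5, factor=2.0,
                       max_delay=30.0, retry_on=(ConnectionError,),
                       sleep=time.sleep):
    """Call `func`, retrying on the given exceptions with growing delays."""
    delays = backoff_delays(max_retries, base_delay, factor, max_delay)
    for attempt in range(max_retries + 1):
        try:
            return func()
        except retry_on:
            if attempt == max_retries:
                raise  # out of retries: surface the last error
            # Full jitter: sleep a random amount up to the current delay
            sleep(random.uniform(0, delays[attempt]))
```

Each delay is the previous one multiplied by `factor`, capped at `max_delay`. The random jitter spreads out retries so that many clients failing at once do not all retry at the same moment.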