Python SDK

The SupaEval Python SDK provides a simple interface for integrating agent evaluation into your Python applications. It is aimed at data scientists, ML engineers, and backend developers.

Prerequisites
Python 3.8 or higher is required. We recommend using a virtual environment to manage dependencies.
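If you want to confirm both prerequisites from inside Python before installing, a minimal sketch could look like the following. The helper names `python_is_supported` and `running_in_virtualenv` are illustrative and not part of the SDK; only the 3.8 minimum comes from the text above.

```python
import sys

MIN_PYTHON = (3, 8)  # minimum version stated in the prerequisites


def python_is_supported(version_info=sys.version_info, minimum=MIN_PYTHON):
    """Return True if the (major, minor) version meets the minimum."""
    return tuple(version_info[:2]) >= tuple(minimum)


def running_in_virtualenv():
    """Return True when the interpreter runs inside a venv/virtualenv."""
    # Inside a virtual environment sys.prefix points at the environment,
    # while sys.base_prefix still points at the base installation.
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


if __name__ == "__main__":
    print("Python version OK:", python_is_supported())
    print("Inside a virtual environment:", running_in_virtualenv())
```

The virtual-environment check is only a hint: the SDK should install fine without one, but an isolated environment keeps its dependencies separate from other projects.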

Installation

Install the SupaEval Python SDK using pip:

bash
pip install supaeval
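To check that the install succeeded, one option is to ask the standard library which version of the `supaeval` distribution is present. This is a sketch using `importlib.metadata` (available from Python 3.8); the helper name `installed_version` is made up for illustration, and the snippet does not call the SDK itself.

```python
from importlib import metadata


def installed_version(distribution):
    """Return the installed version string, or None if it is not installed."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version("supaeval")
    if version is None:
        print("supaeval is not installed in this environment")
    else:
        print(f"supaeval {version} is installed")
```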

Authentication

You'll need an API key to use the SupaEval SDK. Get your API key from the SupaEval dashboard.

python
from supaeval import SupaEval

# Option 1: API Key in code
client = SupaEval(api_key="sk_live_...")

# Option 2: Environment variable
# Set SUPAEVAL_API_KEY in your environment
client = SupaEval()

# Option 3: Custom configuration
client = SupaEval(
    api_key="sk_live_...",
    base_url="https://api.supaeval.com",
    timeout=30
)

Keep your API key secure
Never commit your API key to version control. Use environment variables or a secrets manager in production environments.
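As one way to follow this advice, the sketch below reads the key from the `SUPAEVAL_API_KEY` environment variable mentioned above, fails early with a readable message when it is missing, and masks the key before it is logged. `load_api_key` and `mask_key` are hypothetical helpers written for this example, not SDK functions.

```python
import os


def load_api_key(env_var="SUPAEVAL_API_KEY", environ=None):
    """Return the API key from the environment, or raise a clear error."""
    environ = os.environ if environ is None else environ
    key = environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"{env_var} is not set; export it or load it from your secrets manager."
        )
    return key


def mask_key(key, visible=4):
    """Mask a key for logging, keeping only the last few characters."""
    if len(key) <= visible:
        return "*" * len(key)
    return "*" * (len(key) - visible) + key[-visible:]
```

The returned value can then be passed explicitly, as in `SupaEval(api_key=load_api_key())`, or the variable can simply be left for the client to pick up as in Option 2.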

Quick Start

Here's a complete example showing how to create a dataset, add test cases, and run an evaluation:

python
from supaeval import SupaEval

# Initialize the client
client = SupaEval(api_key="your_api_key")

# Create a dataset
dataset = client.datasets.create(
    name="my-evaluation-dataset",
    description="Test dataset for agent evaluation"
)

# Add test cases
dataset.add_items([
    {
        "input": "What is the capital of France?",
        "expected_output": "Paris",
        "metadata": {"difficulty": "easy"}
    },
    {
        "input": "Explain quantum computing",
        "expected_output": "Quantum computing uses quantum bits...",
        "metadata": {"difficulty": "hard"}
    }
])

# Run evaluation
evaluation = client.evaluations.create(
    dataset_id=dataset.id,
    agent_endpoint="https://your-agent.api/chat",
    metrics=["accuracy", "relevance", "faithfulness"]
)

# Get results
results = evaluation.get_results()
print(f"Overall Score: {results.overall_score}")
print(f"Pass Rate: {results.pass_rate}%")
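Before uploading a larger batch of test cases, it can help to check the items locally. The sketch below assumes the item shape shown in the example above (string `input` and `expected_output`, optional dict `metadata`); that schema is inferred from the example rather than from an SDK specification, and `validate_items` is a hypothetical helper.

```python
def validate_items(items):
    """Return a list of (index, problem) tuples; an empty list means no problems."""
    problems = []
    for index, item in enumerate(items):
        if not isinstance(item, dict):
            problems.append((index, "item is not a dict"))
            continue
        # Assumed required fields, taken from the Quick Start example.
        for field in ("input", "expected_output"):
            value = item.get(field)
            if not isinstance(value, str) or not value.strip():
                problems.append((index, f"'{field}' must be a non-empty string"))
        # Metadata is treated as optional, but must be a dict when present.
        if not isinstance(item.get("metadata", {}), dict):
            problems.append((index, "'metadata' must be a dict"))
    return problems
```

A typical use is to call `validate_items(items)` first and only pass the list to `dataset.add_items(...)` when the result is empty.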

Async Support

The SDK provides full async/await support for non-blocking operations:

python
import asyncio
from supaeval import AsyncSupaEval

async def main():
    client = AsyncSupaEval(api_key="your_api_key")
    
    # Async evaluation
    evaluation = await client.evaluations.create(
        dataset_id="dataset_123",
        agent_endpoint="https://your-agent.api/chat"
    )
    
    # Wait for completion
    results = await evaluation.wait_for_completion()
    print(results)

asyncio.run(main())
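The main benefit of the async client is running several evaluations concurrently. The sketch below shows that pattern with `asyncio.gather` and a semaphore that caps concurrency. To keep it runnable without network access, `fake_evaluation` is a stand-in coroutine invented for this example; in real code its body would be replaced with the awaited SDK calls shown above.

```python
import asyncio


async def fake_evaluation(dataset_id, delay=0.01):
    """Stand-in for an awaited SDK call; returns a small result dict."""
    await asyncio.sleep(delay)  # simulates network latency
    return {"dataset_id": dataset_id, "status": "completed"}


async def evaluate_all(dataset_ids, max_concurrency=2):
    """Run one evaluation per dataset, at most max_concurrency at a time."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def run_one(dataset_id):
        async with semaphore:
            return await fake_evaluation(dataset_id)

    # gather returns results in the same order as the awaitables passed in.
    return await asyncio.gather(*(run_one(d) for d in dataset_ids))


if __name__ == "__main__":
    for result in asyncio.run(evaluate_all(["dataset_1", "dataset_2", "dataset_3"])):
        print(result)
```

Capping concurrency is a design choice rather than a requirement: it keeps a large batch from opening many simultaneous requests against the agent endpoint.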

Key Features

Type Hints

Full type annotations for better IDE support and fewer bugs

Async/Await

Native async support for high-performance applications

Automatic Retries

Built-in retry logic with exponential backoff

Streaming Support

Stream evaluation results as they become available
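
To illustrate what "retry logic with exponential backoff" generally means, here is a generic sketch of the technique. It is not the SDK's internal implementation: the retry count, delays, jitter strategy, and the exception types treated as retryable are all assumptions chosen for the example.

```python
import random
import time


def backoff_delays(max_retries=4, base=0.5, factor=2.0, cap=8.0):
    """Return the un-jittered delay before each retry: base * factor**attempt, capped."""
    return [min(cap, base * factor ** attempt) for attempt in range(max_retries)]


def call_with_retries(func, max_retries=4, base=0.5, factor=2.0, cap=8.0,
                      retry_on=(ConnectionError, TimeoutError), sleep=time.sleep):
    """Call func(); on a retryable error, wait with full jitter and try again."""
    delays = backoff_delays(max_retries, base, factor, cap)
    for attempt in range(max_retries + 1):
        try:
            return func()
        except retry_on:
            if attempt == max_retries:
                raise  # out of retries: surface the last error
            # Full jitter: sleep a random amount up to the exponential delay.
            sleep(random.uniform(0, delays[attempt]))
```

With the defaults, the delay ceiling doubles on each retry (0.5 s, 1 s, 2 s, 4 s), and the random jitter spreads out retries from clients that failed at the same moment.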

Next Steps