Synkro automatically tracks LLM costs and API calls during generation. This helps you monitor usage and optimize your generation pipeline.

Viewing Costs

When generation completes, the RichReporter displays cost information:
Complete
Done!        Generated 100 traces in 2m 15s
Quality:     95% passed verification
Cost:        $0.1234
LLM Calls:   20 scenario + 100 response + 50 grading

Session API Cost Tracking

The Session API provides the cleanest way to track costs with persistent storage:
from synkro import Session

# The awaits below must run inside an async function (or a notebook with top-level await)
session = await Session.create(policy="...", session_id="my-run")
await session.extract_rules(session.policy)
await session.generate_scenarios(count=30)
await session.done()

# One-liner summary
print(session.show_cost_summary())
# Cost: $0.4265 | Calls: 75 | Time: 1m 34s

# Detailed breakdown by phase (includes HITL)
print(session.show_cost())
# Phase Breakdown:
# --------------------------------------------------
# Phase             Cost      Calls       Time
# --------------------------------------------------
# extraction       $0.0173        1      24.3s
# hitl             $0.0190        3       0.0s  <-- refine_* calls
# scenarios        $0.0083        3       6.4s
# traces           $0.0861        7      28.1s
# verification     $0.0488        7      1m 0s
# --------------------------------------------------
# Total            $0.1795       21

# Status includes cost
print(session.status())
# Rules: ✓ (18) | Scenarios: ✓ (30) | Traces: ✓ (30) | Verified: ✓ (28/30) | Cost: $0.4265

# Programmatic access
print(session.metrics.total_cost)       # 0.4265
print(session.metrics.total_calls)      # 75
print(session.metrics.breakdown)        # {'extraction': 0.012, ...}
print(session.metrics.calls_breakdown)  # {'extraction': 1, ...}
Cost data persists with the session, so you can reload it at any time to check costs:
session = await Session.load_from_db("my-run")
print(session.show_cost())  # Still available
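
To see which phase dominates spending, the per-phase dictionaries can be turned into percentage shares and a cost per call. The sketch below is plain Python and does not call Synkro. It assumes `breakdown` maps phase names to dollar costs and `calls_breakdown` maps phase names to call counts, as the comments above suggest, and it reuses the figures from the sample phase table. `phase_report` is a hypothetical helper name.

```python
def phase_report(breakdown: dict, calls: dict) -> list:
    """Return (phase, share_of_total, cost_per_call) rows, most expensive first."""
    total = sum(breakdown.values())
    rows = []
    for phase, cost in breakdown.items():
        share = cost / total if total else 0.0
        n_calls = calls.get(phase, 0)
        per_call = cost / n_calls if n_calls else 0.0
        rows.append((phase, share, per_call))
    return sorted(rows, key=lambda row: row[1], reverse=True)


# Sample figures copied from the phase breakdown shown above.
breakdown = {"extraction": 0.0173, "hitl": 0.0190, "scenarios": 0.0083,
             "traces": 0.0861, "verification": 0.0488}
calls = {"extraction": 1, "hitl": 3, "scenarios": 3,
         "traces": 7, "verification": 7}

for phase, share, per_call in phase_report(breakdown, calls):
    print(f"{phase:<14}{share:>7.1%}   ${per_call:.4f}/call")
```

With a real session, the two dictionaries would presumably come from `session.metrics.breakdown` and `session.metrics.calls_breakdown`.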

Pipeline Cost Tracking

For pipeline-based generation, the reporter displays costs, and the same metrics are available on the result object:
from synkro import create_pipeline

pipeline = create_pipeline()
result = pipeline.generate(policy, traces=100, return_logic_map=True)

# Access metrics from result
print(result.metrics.total_cost)
print(result.metrics.format_table())

Cost Breakdown by Phase

The generation pipeline has several phases, each with its own LLM calls:
Phase        Description
scenario     Generating test scenarios from policy
coverage     Coverage calculation and improvement
hitl         Human-in-the-Loop editing calls
response     Generating assistant responses
refinement   Refining failed responses
grading      Verifying response quality

Reducing Costs

Use Smaller Models for Generation

from synkro import create_pipeline
from synkro.models import OpenAI

pipeline = create_pipeline(
    model=OpenAI.GPT_5_MINI,      # Cheaper generation model
    grading_model=OpenAI.GPT_52,  # Keep strong grading model
)

Skip Grading for Exploratory Runs

from synkro import create_pipeline

pipeline = create_pipeline(skip_grading=True)  # No grading calls, so no quality check
dataset = pipeline.generate(policy, traces=100)

Generate Fewer Traces

import synkro

# Start small, scale up when satisfied
dataset = synkro.generate(policy, traces=10)  # Quick test
# dataset = synkro.generate(policy, traces=1000)  # Production run
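
A small run also gives a basis for projecting the cost of a larger one. The sketch below assumes cost scales roughly linearly with the number of traces, which is a simplification, since refinement iterations and policy length can skew it. `project_cost` and `max_traces_for_budget` are hypothetical helpers, not part of Synkro, and the pilot figures are made up for illustration.

```python
import math


def project_cost(pilot_cost: float, pilot_traces: int, target_traces: int) -> float:
    """Linearly scale the measured cost of a pilot run to a larger run."""
    if pilot_traces <= 0:
        raise ValueError("pilot_traces must be positive")
    return pilot_cost * target_traces / pilot_traces


def max_traces_for_budget(pilot_cost: float, pilot_traces: int, budget: float) -> int:
    """Estimate the largest trace count whose projected cost fits the budget."""
    if pilot_cost <= 0:
        raise ValueError("pilot_cost must be positive")
    return math.floor(budget * pilot_traces / pilot_cost)


# Hypothetical pilot: 10 traces that cost $0.02 in total.
print(f"Projected cost for 1000 traces: ${project_cost(0.02, 10, 1000):.2f}")
print(f"Traces affordable on $5.00: {max_traces_for_budget(0.02, 10, 5.00)}")
```

The pilot cost would be read from the reporter output or from `result.metrics.total_cost` after the quick test.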

Use Local Models

from synkro import create_pipeline
from synkro.models import Local

pipeline = create_pipeline(
    model=Local.llama("llama3.2:latest"),
    base_url="http://localhost:11434/v1",  # Ollama
)

Cost Estimation

Approximate costs per 100 traces (varies by policy complexity):
Model               Estimated Cost
GPT-5-mini          ~$0.05
GPT-5.2             ~$0.20
Claude 4.5 Sonnet   ~$0.15
Gemini 2.5 Flash    ~$0.03
Local (Ollama)      $0.00
Actual costs depend on policy length, response complexity, and number of refinement iterations.
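
For a quick comparison across models, the table can be turned into a back-of-the-envelope estimator. This is a minimal sketch that assumes cost scales linearly with trace count. The per-100-trace figures are copied from the table above, and `estimate_cost` is a hypothetical helper rather than a Synkro function.

```python
# Approximate dollars per 100 traces, copied from the table above.
COST_PER_100_TRACES = {
    "GPT-5-mini": 0.05,
    "GPT-5.2": 0.20,
    "Claude 4.5 Sonnet": 0.15,
    "Gemini 2.5 Flash": 0.03,
    "Local (Ollama)": 0.00,
}


def estimate_cost(model: str, traces: int) -> float:
    """Rough cost for `traces` traces, scaled linearly from the table."""
    return COST_PER_100_TRACES[model] * traces / 100


for model in COST_PER_100_TRACES:
    print(f"{model:<20} ~${estimate_cost(model, 1000):.2f} per 1,000 traces")
```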

Silent Cost Tracking

To track costs without console output, use FileLoggingReporter:
import synkro
from synkro import FileLoggingReporter, SilentReporter

reporter = FileLoggingReporter(
    delegate=SilentReporter(),  # No console output
    log_dir="./logs"
)

dataset = synkro.generate(policy, reporter=reporter)

# Check the log file for cost information
print(f"Log: {reporter.log_path}")

Callback for Real-time Cost Tracking

import synkro
from synkro import CallbackReporter

def on_progress(event: str, data: dict):
    if event == "complete":
        print(f"Total cost: ${data.get('total_cost', 0):.4f}")
        print(f"Scenario calls: {data.get('scenario_calls', 0)}")
        print(f"Response calls: {data.get('response_calls', 0)}")
        print(f"Grading calls: {data.get('grading_calls', 0)}")

reporter = CallbackReporter(on_progress=on_progress)
dataset = synkro.generate(policy, reporter=reporter)
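
When several runs share one budget, the callback can keep a running total. The sketch below reuses the `(event, data)` signature, the `"complete"` event, and the `total_cost` key from the example above. The `CostAccumulator` class, its `budget` parameter, and the `"progress"` event name are hypothetical, and the events are simulated so the block runs without Synkro.

```python
from typing import Optional


class CostAccumulator:
    """Callable that sums `total_cost` from every "complete" event it receives."""

    def __init__(self, budget: Optional[float] = None):
        self.budget = budget
        self.total_cost = 0.0
        self.runs = 0

    def __call__(self, event: str, data: dict) -> None:
        if event != "complete":
            return  # Ignore intermediate events
        self.total_cost += data.get("total_cost", 0.0)
        self.runs += 1
        if self.budget is not None and self.total_cost > self.budget:
            print(f"Budget exceeded: ${self.total_cost:.4f} > ${self.budget:.4f}")


# Simulated events standing in for what a reporter would send.
tracker = CostAccumulator(budget=0.20)
tracker("progress", {"step": 1})              # Hypothetical non-final event
tracker("complete", {"total_cost": 0.1234})
tracker("complete", {"total_cost": 0.1000})
print(f"{tracker.runs} runs, ${tracker.total_cost:.4f} total")
```

An instance could then be passed as `on_progress=tracker` to `CallbackReporter`, assuming it accepts any callable with this signature.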