Synkro automatically tracks LLM costs and API calls during generation. This helps you monitor usage and optimize your generation pipeline.
## Viewing Costs
When generation completes, the RichReporter displays cost information:
```
Complete
Done! Generated 100 traces in 2m 15s
Quality: 95% passed verification
Cost: $0.1234
LLM Calls: 20 scenario + 100 response + 50 grading
```
## Session API Cost Tracking
The Session API provides the cleanest way to track costs with persistent storage:
```python
from synkro import Session

session = await Session.create(policy="...", session_id="my-run")
await session.extract_rules(session.policy)
await session.generate_scenarios(count=30)
await session.done()

# One-liner summary
print(session.show_cost_summary())
# Cost: $0.4265 | Calls: 75 | Time: 1m 34s

# Detailed breakdown by phase (includes HITL)
print(session.show_cost())
# Phase Breakdown:
# --------------------------------------------------
# Phase          Cost      Calls   Time
# --------------------------------------------------
# extraction     $0.0173   1       24.3s
# hitl           $0.0190   3       0.0s   <-- refine_* calls
# scenarios      $0.0083   3       6.4s
# traces         $0.0861   7       28.1s
# verification   $0.0488   7       1m 0s
# --------------------------------------------------
# Total          $0.1795   21

# Status includes cost
print(session.status())
# Rules: ✓ (18) | Scenarios: ✓ (30) | Traces: ✓ (30) | Verified: ✓ (28/30) | Cost: $0.4265

# Programmatic access
print(session.metrics.total_cost)       # 0.4265
print(session.metrics.total_calls)      # 75
print(session.metrics.breakdown)        # {'extraction': 0.012, ...}
print(session.metrics.calls_breakdown)  # {'extraction': 1, ...}
```
Cost data persists with the session, so you can reload it at any time to check costs:
```python
session = await Session.load_from_db("my-run")
print(session.show_cost())  # Still available
```
## Pipeline Cost Tracking
For pipeline-based generation, costs are displayed by the reporter:
```python
from synkro import create_pipeline

pipeline = create_pipeline()
result = pipeline.generate(policy, traces=100, return_logic_map=True)

# Access metrics from result
print(result.metrics.total_cost)
print(result.metrics.format_table())
```
## Cost Breakdown by Phase
The generation pipeline has several phases, each with its own LLM calls:
| Phase | Description |
|---|---|
| scenario | Generating test scenarios from policy |
| coverage | Coverage calculation and improvement |
| hitl | Human-in-the-Loop editing calls |
| response | Generating assistant responses |
| refinement | Refining failed responses |
| grading | Verifying response quality |
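As a rough sketch, per-phase costs like these can be aggregated from the `metrics.breakdown` dict shown earlier. The dict literal below is illustrative data, not real output, and the aggregation is plain Python rather than a Synkro API:

```python
# Illustrative per-phase cost breakdown, mirroring the shape of
# session.metrics.breakdown (values here are made up).
breakdown = {
    "scenario": 0.0083,
    "coverage": 0.0021,
    "hitl": 0.0190,
    "response": 0.0861,
    "refinement": 0.0104,
    "grading": 0.0488,
}

# Total cost is the sum over phases.
total = sum(breakdown.values())

# Find the most expensive phase to target optimizations.
most_expensive = max(breakdown, key=breakdown.get)

print(f"Total: ${total:.4f}")
print(f"Most expensive phase: {most_expensive} (${breakdown[most_expensive]:.4f})")
```

In practice the response-generation and grading phases tend to dominate, which is why the cost-reduction tips below focus on the generation and grading models.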
## Reducing Costs
### Use Smaller Models for Generation
```python
from synkro import create_pipeline
from synkro.models import OpenAI

pipeline = create_pipeline(
    model=OpenAI.GPT_5_MINI,      # Cheaper generation model
    grading_model=OpenAI.GPT_52,  # Keep a strong grading model
)
```
### Skip Grading for Exploratory Runs
```python
from synkro import create_pipeline

pipeline = create_pipeline(skip_grading=True)
dataset = pipeline.generate(policy, traces=100)
```
### Generate Fewer Traces
```python
import synkro

# Start small, scale up when satisfied
dataset = synkro.generate(policy, traces=10)  # Quick test
# dataset = synkro.generate(policy, traces=1000)  # Production run
```
### Use Local Models
```python
from synkro import create_pipeline
from synkro.models import Local

pipeline = create_pipeline(
    model=Local.llama("llama3.2:latest"),
    base_url="http://localhost:11434/v1",  # Ollama
)
```
## Cost Estimation
Approximate costs per 100 traces (varies by policy complexity):
| Model | Estimated Cost |
|---|---|
| GPT-5-mini | ~$0.05 |
| GPT-5.2 | ~$0.20 |
| Claude 4.5 Sonnet | ~$0.15 |
| Gemini 2.5 Flash | ~$0.03 |
| Local (Ollama) | $0.00 |
Actual costs depend on policy length, response complexity, and number of refinement iterations.
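The per-100-trace figures above scale roughly linearly, so a quick back-of-envelope estimate is easy to compute. A minimal sketch: the rate table is copied from the estimates above, and `estimate_cost` is a hypothetical helper, not part of Synkro:

```python
# Approximate cost per 100 traces, copied from the table above.
COST_PER_100_TRACES = {
    "gpt-5-mini": 0.05,
    "gpt-5.2": 0.20,
    "claude-4.5-sonnet": 0.15,
    "gemini-2.5-flash": 0.03,
    "local-ollama": 0.00,
}

def estimate_cost(model: str, traces: int) -> float:
    """Rough linear estimate; real cost varies with policy complexity."""
    return COST_PER_100_TRACES[model] * traces / 100

print(f"1,000 traces on gpt-5-mini: ~${estimate_cost('gpt-5-mini', 1000):.2f}")
```

Treat the result as an order-of-magnitude figure only; refinement iterations on failed responses can push real costs above the linear estimate.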
## Silent Cost Tracking
To track costs without console output, use FileLoggingReporter:
```python
import synkro
from synkro import FileLoggingReporter, SilentReporter

reporter = FileLoggingReporter(
    delegate=SilentReporter(),  # No console output
    log_dir="./logs",
)

dataset = synkro.generate(policy, reporter=reporter)

# Check the log file for cost information
print(f"Log: {reporter.log_path}")
```
## Callback for Real-time Cost Tracking

To react to cost data as events arrive, use a `CallbackReporter`:
```python
import synkro
from synkro import CallbackReporter

def on_progress(event: str, data: dict):
    if event == "complete":
        print(f"Total cost: ${data.get('total_cost', 0):.4f}")
        print(f"Scenario calls: {data.get('scenario_calls', 0)}")
        print(f"Response calls: {data.get('response_calls', 0)}")
        print(f"Grading calls: {data.get('grading_calls', 0)}")

reporter = CallbackReporter(on_progress=on_progress)
dataset = synkro.generate(policy, reporter=reporter)
```