Agent simulation lets you run your agent against realistic, auto-generated test scenarios derived from a policy document. Synkro extracts rules, generates diverse scenarios, drives multi-turn conversations with a simulated user, and verifies every conversation against the policy.
```
Policy Document
        │
        ▼
1. Extract Rules ──────────────> Logic Map (DAG of rules)
        │
        ▼
2. Generate Scenarios ─────────> Positive, negative, edge cases
        │
        ▼
3. Simulate Conversations ─────> Simulated user + your agent
        │
        ▼
4. Verify ─────────────────────> Each conversation graded against rules
        │
        ▼
SimulationResults (pass/fail per scenario)
```
Stages 1, 2, and 4 reuse the same pipeline components that power synkro.generate() — the Logic Extractor, Golden Scenario Generator, and Trace Verifier.
```python
import openai
import synkro

# Your agent — any callable that takes messages and returns a string
def my_agent(messages):
    response = openai.chat.completions.create(
        model="gpt-4o",
        messages=messages,
    )
    return response.choices[0].message.content

# Run simulation
results = synkro.simulate(
    agent=my_agent,
    policy="All refunds require a receipt. Maximum refund is $500.",
    scenarios=10,
    turns=3,
)

print(f"Pass rate: {results.pass_rate:.0%}")
print(f"Passed: {results.passed}/{results.total}")
```
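Because `agent` is just a callable from a message list to a string, you can smoke-test your harness with a deterministic stub before wiring in a real model. The stub below is purely illustrative and not part of Synkro:

```python
def stub_agent(messages):
    # Deterministic agent with the same (messages) -> str contract as a real one
    last_user = [m for m in messages if m["role"] == "user"][-1]["content"]
    if "receipt" not in last_user.lower():
        return "I can only process refunds with a receipt. Do you have one?"
    return "Thanks, processing your refund now."

# Same call shape the simulator uses:
reply = stub_agent([{"role": "user", "content": "I want a refund"}])
print(reply)
```

Any callable with this signature can be passed as `agent`, so the same harness works for stubs, local models, or API-backed agents.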
```python
# Save to JSON (includes summary, all transcripts, and logic map)
results.save("simulation_results.json")

# Convert to a Dataset for JSONL export or HuggingFace upload
dataset = results.dataset
dataset.save("simulation_traces.jsonl")
```
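JSONL stores one JSON object per line, so exported traces can be read back with the standard library alone. The field names below are illustrative, not Synkro's actual trace schema:

```python
import json

# Two illustrative trace lines, one JSON object per line
jsonl = "\n".join([
    json.dumps({"scenario": "refund_with_receipt", "passed": True}),
    json.dumps({"scenario": "refund_over_limit", "passed": False}),
])

traces = [json.loads(line) for line in jsonl.splitlines()]
print(sum(t["passed"] for t in traces))  # 1 passing trace
```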
The `scenarios` parameter controls how many test cases are auto-generated from the policy. Synkro produces a balanced mix of positive, negative, edge-case, and irrelevant scenarios.
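As a toy illustration of tallying such a mix, the category names below are stand-ins; Synkro's actual scenario objects may expose this differently:

```python
from collections import Counter

# Illustrative scenario mix; field names are assumptions, not Synkro's schema
scenarios = [
    {"category": "positive"},
    {"category": "positive"},
    {"category": "negative"},
    {"category": "edge_case"},
    {"category": "irrelevant"},
]

mix = Counter(s["category"] for s in scenarios)
print(dict(mix))
```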
The `turns` parameter sets the maximum number of user-agent exchanges per scenario. The simulated user may end the conversation early if it reaches a natural conclusion.
```python
# Longer conversations for complex policies
results = synkro.simulate(agent=my_agent, policy=policy, turns=5)
```
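Each turn appends one user message and one agent reply, so with three turns the agent ultimately sees a six-message history. A plain sketch of that growth, using OpenAI-style message dicts and a toy agent (Synkro's actual simulated-user prompts are not shown here):

```python
# Messages grow by one user/assistant pair per turn
messages = []
simulated_user_turns = ["I want a refund.", "I paid $600.", "Here is my receipt."]

def toy_agent(msgs):
    # Trivial stand-in: echo the latest user message
    return f"Reply to: {msgs[-1]['content']}"

for user_text in simulated_user_turns:
    messages.append({"role": "user", "content": user_text})
    messages.append({"role": "assistant", "content": toy_agent(messages)})

print(len(messages))  # 6: three user/assistant pairs
```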
By default, Synkro auto-detects available models. You can specify which model to use for the simulated user and scenario generation, and optionally a separate (stronger) model for verification.
```python
from synkro import Simulator

sim = Simulator(model="gpt-4o-mini")

# In an async context
results = await sim.run_async(
    agent=my_async_agent,
    policy=policy,
    scenarios=20,
)

# Or use the convenience function
results = await synkro.simulate_async(
    agent=my_async_agent,
    policy=policy,
    scenarios=20,
)
```
```python
import openai
import synkro

def rag_agent(messages):
    last_user_msg = [m for m in messages if m["role"] == "user"][-1]["content"]

    # Your RAG retrieval
    docs = vector_store.similarity_search(last_user_msg, k=3)
    context = "\n".join(d.page_content for d in docs)

    # Generate response with context
    response = openai.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": f"Use this context:\n{context}"},
            *messages,
        ],
    )
    return response.choices[0].message.content

results = synkro.simulate(
    agent=rag_agent,
    policy=policy_text,
    scenarios=30,
    turns=3,
)

# Check which scenarios failed
for r in results:
    if not r.passed:
        print(f"FAILED: {r.scenario.description}")
        print(f"  Issues: {r.issues}")
```
The verifier checks each conversation against the extracted policy rules:
| Check | Description |
| --- | --- |
| Skipped rules | Rules that should have been applied but weren't |
| Hallucinated rules | Rules cited that don't exist or don't apply |
| Contradictions | Logical inconsistencies in the agent's responses |
| DAG compliance | Dependency order between rules was respected |
| Outcome alignment | Response matches the expected outcome for the scenario |
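Conceptually, the first two checks reduce to set differences between the rules a scenario expects and the rules the agent actually applied. A simplified sketch with made-up rule names, not Synkro's implementation:

```python
# Rules the extracted logic map expects for this scenario (illustrative names)
expected_rules = {"require_receipt", "cap_refund_500"}
# Rules the verifier found applied in the transcript
applied_rules = {"cap_refund_500", "offer_store_credit"}

skipped = expected_rules - applied_rules       # should apply, but didn't
hallucinated = applied_rules - expected_rules  # applied, but shouldn't exist
print(sorted(skipped), sorted(hallucinated))
```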
Each `SimulationResult` includes the full `VerificationResult` with details:
```python
for r in results:
    v = r.verification
    print(f"Passed: {v.passed}")
    print(f"Rules verified: {v.rules_verified}")
    print(f"Skipped rules: {v.skipped_rules}")
    print(f"Hallucinated: {v.hallucinated_rules}")
    print(f"Contradictions: {v.contradictions}")
```
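To spot systematic failures rather than one-off misses, you can aggregate these fields across results, for instance counting which rule is skipped most often. The dicts below are stand-ins for real verification objects:

```python
from collections import Counter

# Stand-ins for per-scenario verification results (illustrative data)
verifications = [
    {"passed": False, "skipped_rules": ["require_receipt"]},
    {"passed": False, "skipped_rules": ["require_receipt", "cap_refund_500"]},
    {"passed": True, "skipped_rules": []},
]

skip_counts = Counter(
    rule for v in verifications for rule in v["skipped_rules"]
)
print(skip_counts.most_common(1))  # [('require_receipt', 2)]
```

A rule that fails across many scenarios usually points at a gap in the agent's prompt or retrieval, not at flaky generation.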