Signature
Parameters
A callable that takes OpenAI-format messages (`[{"role": "user", "content": "..."}]`) and returns a response string. Can be sync or async; async agents are auto-detected.
Policy document text. Rules are extracted automatically.
Number of test scenarios to auto-generate. Produces a balanced mix of positive, negative, edge-case, and irrelevant scenarios.
Maximum conversation turns (user-agent exchanges) per scenario. The simulated user may end earlier if the conversation reaches a natural conclusion.
Model for the simulated user and scenario generation. Auto-detected if not specified.
Model for verification/grading. Defaults to `model`. A stronger model produces more accurate verification.
Returns
SimulationResults — aggregated results with per-scenario details.
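As a rough illustration of the agent contract and the turn cap described under Parameters, the sketch below defines one sync and one async agent and drives them through a capped conversation. Everything here is hypothetical stand-in code: `sync_agent`, `async_agent`, `call_agent`, and `run_conversation` are not part of the library, and using `inspect.iscoroutinefunction` is only one plausible way async agents could be detected.

```python
import asyncio
import inspect


def sync_agent(messages: list[dict]) -> str:
    """Hypothetical sync agent: echoes the latest user message."""
    last_user = next(m for m in reversed(messages) if m["role"] == "user")
    return f"You said: {last_user['content']}"


async def async_agent(messages: list[dict]) -> str:
    """Hypothetical async agent with the same messages-in, string-out contract."""
    await asyncio.sleep(0)  # stand-in for a real network call
    return sync_agent(messages)


async def call_agent(agent, messages: list[dict]) -> str:
    """Call either kind of agent (assumed detection strategy, not the library's)."""
    if inspect.iscoroutinefunction(agent):
        return await agent(messages)
    return agent(messages)


async def run_conversation(agent, user_turns: list[str], max_turns: int) -> list[dict]:
    """Alternate user and agent messages, stopping after max_turns exchanges
    or earlier if the scripted user runs out of things to say."""
    messages: list[dict] = []
    for user_text in user_turns[:max_turns]:
        messages.append({"role": "user", "content": user_text})
        reply = await call_agent(agent, messages)
        messages.append({"role": "assistant", "content": reply})
    return messages


# Three scripted user lines, but only two exchanges are allowed.
transcript = asyncio.run(
    run_conversation(async_agent, ["Hi", "Refund?", "Thanks"], max_turns=2)
)
```

The resulting `transcript` is a list of OpenAI-format message dicts, the same shape the `messages` field of a scenario result is documented to hold.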
SimulationResults
| Property | Type | Description |
|---|---|---|
| `pass_rate` | `float` | Fraction of scenarios that passed (0.0–1.0) |
| `total` | `int` | Total scenarios simulated |
| `passed` | `int` | Number that passed verification |
| `failed` | `int` | Number that failed verification |
| `results` | `list[SimulationResult]` | Individual scenario results |
| `logic_map` | `LogicMap` | Extracted policy rules |
| `dataset` | `Dataset` | Convert results to a Dataset for export |
- `save(path)`: Save results to JSON
- `__iter__`: Iterate over individual `SimulationResult` objects
- `__len__`: Number of results
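To make the relationship between the aggregate properties concrete, here is a minimal sketch using stand-in dataclasses. `FakeResult` and `FakeResults` are hypothetical and only mirror the documented property names; they are not the library's classes, and returning 0.0 for an empty run is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class FakeResult:
    """Stand-in for SimulationResult with only the fields used here."""
    passed: bool
    issues: list[str] = field(default_factory=list)


@dataclass
class FakeResults:
    """Stand-in for SimulationResults (illustrative, not the real class)."""
    results: list[FakeResult]

    @property
    def total(self) -> int:
        return len(self.results)

    @property
    def passed(self) -> int:
        return sum(r.passed for r in self.results)

    @property
    def failed(self) -> int:
        return self.total - self.passed

    @property
    def pass_rate(self) -> float:
        # Fraction of scenarios that passed; 0.0 for an empty run (assumed).
        return self.passed / self.total if self.total else 0.0

    def __iter__(self):
        return iter(self.results)

    def __len__(self) -> int:
        return self.total


run = FakeResults([
    FakeResult(True),
    FakeResult(False, ["quoted the wrong refund window"]),
    FakeResult(True),
    FakeResult(True),
])
print(f"{run.passed}/{len(run)} passed ({run.pass_rate:.0%})")  # prints "3/4 passed (75%)"
```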
SimulationResult
| Field | Type | Description |
|---|---|---|
| `scenario` | `GoldenScenario` | The scenario that was tested |
| `messages` | `list[dict]` | Full conversation transcript (OpenAI-format) |
| `passed` | `bool` | Whether the agent passed verification |
| `issues` | `list[str]` | Issues found during verification |
| `verification` | `VerificationResult` | Full verification details |
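One way these fields might be used together is a short failure report that pairs each issue with the agent's last reply. The sketch below uses `SimpleNamespace` objects as stand-ins for real results; the `failure_report` helper and the sample data are hypothetical, and only the `passed`, `issues`, and `messages` field names come from the table above.

```python
from types import SimpleNamespace

# Stand-in results carrying only the documented fields used below.
sample_results = [
    SimpleNamespace(
        passed=True,
        issues=[],
        messages=[
            {"role": "user", "content": "Can I return this?"},
            {"role": "assistant", "content": "Yes, within 30 days."},
        ],
    ),
    SimpleNamespace(
        passed=False,
        issues=["promised a refund outside the policy window"],
        messages=[
            {"role": "user", "content": "I bought it last year."},
            {"role": "assistant", "content": "Sure, refund approved."},
        ],
    ),
]


def failure_report(results) -> list[str]:
    """Return one line per issue on each failed result, with the last agent reply."""
    lines = []
    for index, result in enumerate(results):
        if result.passed:
            continue
        # Walk the transcript backwards to find the most recent assistant message.
        last_reply = next(
            (m["content"] for m in reversed(result.messages) if m["role"] == "assistant"),
            "",
        )
        for issue in result.issues:
            lines.append(f"scenario {index}: {issue} (last reply: {last_reply!r})")
    return lines


report = failure_report(sample_results)
```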
Examples
Basic Usage
With Model Selection
Async
Export Results
Simulator Class
For advanced control (e.g., custom concurrency), use the `Simulator` class directly.
Simulator Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| `model` | `str \| None` | auto | Model for simulated user and scenarios |
| `grading_model` | `str \| None` | `model` | Model for verification |
| `concurrency` | `int` | 5 | Max parallel scenario executions |
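To illustrate what a cap on parallel scenario executions means in practice, the sketch below bounds in-flight work with an `asyncio.Semaphore`. This is an assumption about the general technique, not a description of how `Simulator` is implemented; `run_all`, `fake_scenario`, and `stats` are hypothetical names.

```python
import asyncio


async def run_all(scenarios, run_one, concurrency: int = 5) -> list:
    """Run run_one over every scenario with at most `concurrency` in flight."""
    semaphore = asyncio.Semaphore(concurrency)

    async def guarded(scenario):
        async with semaphore:  # blocks while `concurrency` tasks are already running
            return await run_one(scenario)

    # gather preserves input order in its result list.
    return await asyncio.gather(*(guarded(s) for s in scenarios))


# Demo: record the peak number of simultaneously running fake scenarios.
stats = {"in_flight": 0, "peak": 0}


async def fake_scenario(n: int) -> int:
    stats["in_flight"] += 1
    stats["peak"] = max(stats["peak"], stats["in_flight"])
    await asyncio.sleep(0.01)  # stand-in for a simulated conversation
    stats["in_flight"] -= 1
    return n * 2


outputs = asyncio.run(run_all(range(12), fake_scenario, concurrency=5))
```

With twelve scenarios and a limit of five, `stats["peak"]` never exceeds five, while `outputs` still holds one result per scenario in input order.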