## Signature

```python
synkro.generate_scenarios(
    policy: str | Policy,
    count: int = 100,
    generation_model: str = "gpt-4o-mini",
    temperature: float = 0.8,
    reporter: ProgressReporter | None = None,
    enable_hitl: bool = False,
    base_url: str | None = None,
) -> ScenariosResult
```
## Parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| `policy` | `str \| Policy` | required | Policy text or a `Policy` object |
| `count` | `int` | `100` | Number of scenarios to generate |
| `generation_model` | `str` | `"gpt-4o-mini"` | Model used to generate the scenarios |
| `temperature` | `float` | `0.8` | Sampling temperature (higher = more diverse scenarios) |
| `reporter` | `ProgressReporter \| None` | `None` | Optional progress reporter for generation updates |
| `enable_hitl` | `bool` | `False` | Enable Human-in-the-Loop editing |
| `base_url` | `str \| None` | `None` | Optional override for the model API base URL |
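All parameters except `policy` have defaults, so a tuned call can pass just the knobs you care about. A minimal sketch (the policy string here is a placeholder):

```python
import synkro

# Placeholder policy text; in practice this is your real policy document
policy = "Agents must verify the customer's identity before sharing account details."

result = synkro.generate_scenarios(
    policy,
    count=20,         # keep runs small while iterating
    temperature=0.5,  # lower temperature = less diverse, more conservative scenarios
)
```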
## Returns

A `ScenariosResult` containing:

- `scenarios`: List of `EvalScenario` objects
- `logic_map`: Extracted Logic Map
- `distribution`: Scenario type distribution
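Each of these is a plain attribute on the result; a quick sketch of reading all three (continuing from the call above):

```python
result = synkro.generate_scenarios(policy, count=100)

print(len(result.scenarios))  # number of EvalScenario objects
print(result.logic_map)       # Logic Map extracted from the policy
print(result.distribution)    # scenario counts keyed by scenario_type
```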
## EvalScenario Fields

| Field | Type | Description |
|---|---|---|
| `user_message` | `str` | The test input |
| `expected_outcome` | `str` | Ground-truth behavior |
| `target_rule_ids` | `list[str]` | Rules being tested |
| `scenario_type` | `str` | One of `positive`, `negative`, `edge_case`, or `irrelevant` |
| `category` | `str` | Policy category |
| `context` | `str` | Additional context |
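These fields make scenario sets easy to slice. For example, a sketch that buckets scenarios by type using only the fields above (assuming `result` from a prior call):

```python
from collections import defaultdict

by_type = defaultdict(list)
for scenario in result.scenarios:
    by_type[scenario.scenario_type].append(scenario)

# Look at just the edge cases
for scenario in by_type["edge_case"]:
    print(scenario.user_message, "->", scenario.expected_outcome)
```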
## Examples

### Generate and Grade

```python
import synkro

# Generate scenarios
result = synkro.generate_scenarios(policy, count=100)

# Grade your model's responses
for scenario in result.scenarios:
    response = my_model(scenario.user_message)
    grade = synkro.grade(response, scenario, policy)
    if not grade.passed:
        print(f"Failed: {grade.feedback}")
```
### Inspect Scenarios

```python
result = synkro.generate_scenarios(policy, count=50)

for scenario in result.scenarios:
    print(f"Type: {scenario.scenario_type}")
    print(f"Input: {scenario.user_message}")
    print(f"Expected: {scenario.expected_outcome}")
    print(f"Rules: {scenario.target_rule_ids}")
    print("---")
```
### Check Distribution

```python
result = synkro.generate_scenarios(policy, count=100)

print(result.distribution)
# {'positive': 35, 'negative': 30, 'edge_case': 25, 'irrelevant': 10}
```
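Since the distribution is a plain dict of counts (as printed above), converting to proportions takes a few lines; a small sketch:

```python
total = sum(result.distribution.values())  # total scenarios across all types
for scenario_type, n in result.distribution.items():
    print(f"{scenario_type}: {n / total:.0%}")
```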