Enabling Checkpointing
How It Works
- Policy Hash: Each policy gets a unique hash based on its content
- Phase Checkpoints: Progress is saved after each major phase:
  - Logic Map extraction
  - Scenario generation
  - Response generation
  - Grading
- Automatic Resume: On restart, synkro detects existing checkpoints and continues from the last completed phase
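The flow above can be sketched in plain Python. This is an illustrative sketch, not synkro's implementation: the function names (`policy_hash`, `run_with_checkpoints`), the phase identifiers, the one-JSON-file-per-phase layout, and the choice of SHA-256 are all assumptions made for the example.

```python
import hashlib
import json
from pathlib import Path

# Hypothetical phase names mirroring the list above; not synkro identifiers.
PHASES = ["logic_map", "scenarios", "responses", "grading"]


def policy_hash(policy_text: str) -> str:
    """Derive a stable identifier from the policy content (SHA-256 is an assumption)."""
    return hashlib.sha256(policy_text.encode("utf-8")).hexdigest()[:16]


def run_with_checkpoints(policy_text, checkpoint_root, phase_fns):
    """Run each phase in order, skipping phases whose checkpoint file already exists.

    Returns the per-phase results and the list of phases executed in this call.
    """
    run_dir = Path(checkpoint_root) / policy_hash(policy_text)
    run_dir.mkdir(parents=True, exist_ok=True)
    results, executed = {}, []
    for phase in PHASES:
        path = run_dir / f"{phase}.json"
        if path.exists():
            # Resume: reuse the saved output instead of recomputing it.
            results[phase] = json.loads(path.read_text())
            continue
        results[phase] = phase_fns[phase](policy_text, results)
        # Write to a temporary file, then rename, so an interrupted write
        # does not leave a partial checkpoint behind.
        tmp = path.with_name(path.name + ".tmp")
        tmp.write_text(json.dumps(results[phase]))
        tmp.replace(path)
        executed.append(phase)
    return results, executed
```

Because the run directory is keyed by the content hash, editing the policy text selects a different directory, so stale checkpoints are never picked up for a changed policy in this sketch.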
Checkpoint Directory Structure
Clearing Checkpoints
To start fresh instead of resuming, clear the existing checkpoints before rerunning.
Use Cases
Large Production Runs
Iterative Development
CI/CD Pipelines
Best Practices
- Use unique checkpoint dirs for different policies or experiments
- Clear checkpoints when changing policy content significantly
- Enable checkpointing for jobs of more than 100 traces to avoid losing progress
- Disable HITL for batch checkpoint runs, since interactive sessions cannot be resumed
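The first two practices can be illustrated with a small helper pair. This is a sketch under stated assumptions: `checkpoint_dir_for` and `clear_checkpoints` are hypothetical names, and the `root/experiment/digest` layout is invented for the example rather than taken from synkro.

```python
import hashlib
import shutil
from pathlib import Path


def checkpoint_dir_for(root, experiment, policy_text):
    """Build a directory path unique to one experiment and one policy's content."""
    digest = hashlib.sha256(policy_text.encode("utf-8")).hexdigest()[:12]
    return Path(root) / experiment / digest


def clear_checkpoints(checkpoint_dir):
    """Delete a checkpoint directory if present; return True if anything was removed."""
    path = Path(checkpoint_dir)
    if not path.is_dir():
        return False
    shutil.rmtree(path)
    return True
```

Keeping the experiment name and the policy digest as separate path components means two experiments on the same policy never share checkpoints, and a single directory can be cleared without touching the others.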
Limitations
- HITL (Human-in-the-Loop) sessions cannot be checkpointed mid-session
- Changing model parameters between a checkpointed run and its resume may produce inconsistent results, because earlier phases were generated under the old settings
- Checkpoint format may change between synkro versions
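One way to guard against the last two limitations is to record the tool version and model parameters next to the checkpoints and compare them before resuming. The sketch below is not a synkro feature: `META_FILE`, `write_meta`, and `resume_problems` are hypothetical names, and the metadata file is an assumption made for illustration.

```python
import json
from pathlib import Path

META_FILE = "meta.json"  # hypothetical file name, not part of synkro's format


def write_meta(checkpoint_dir, tool_version, model_params):
    """Record the tool version and model parameters alongside the checkpoints."""
    meta = {"tool_version": tool_version, "model_params": model_params}
    path = Path(checkpoint_dir) / META_FILE
    path.write_text(json.dumps(meta, sort_keys=True))


def resume_problems(checkpoint_dir, tool_version, model_params):
    """Return a list of reasons why resuming from this directory looks unsafe."""
    path = Path(checkpoint_dir) / META_FILE
    if not path.exists():
        return ["no metadata recorded for this checkpoint"]
    saved = json.loads(path.read_text())
    problems = []
    if saved.get("tool_version") != tool_version:
        problems.append("checkpoint was written by a different version")
    if saved.get("model_params") != model_params:
        problems.append("model parameters changed since the checkpoint was written")
    return problems
```

An empty list means the recorded settings match the current ones; any entry is a signal to clear the checkpoints and start fresh rather than resume.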