evaluation
This skill should be used when building agent evaluation systems: deterministic checks, regression suites, multi-dimensional rubrics, quality gates, production monitoring, baseline comparison, and outcome measurement for agent pipelines.
Security score: 100/100
The evaluation skill was audited on Jun 24, 2026. Our scanner tested it across 12 threat categories and found no security issues.
Security Issues
No security issues detected
This skill passed all security checks.