Evals
Agent evaluation framework based on Anthropic's best practices. USE WHEN eval, evaluate, test agent, benchmark, verify behavior, regression test, capability tes
Security score
The Evals skill was audited on Mar 1, 2026 and we found 11 security issues across 3 threat categories. Review the findings below before installing.
Categories Tested
Security Issues
| Issue | Line | Code |
| --- | --- | --- |
| Curl to non-GitHub URL | 22 | `curl -s -X POST http://localhost:8888/notify \` |
| Access to hidden dotfiles in home directory | 11 | `~/.claude/skills/CORE/USER/SKILLCUSTOMIZATIONS/Evals/` |
| Access to hidden dotfiles in home directory | 96 | `bun run ~/.claude/skills/Evals/Tools/AlgorithmBridge.ts -s <suite>` |
| Access to hidden dotfiles in home directory | 99 | `bun run ~/.claude/skills/Evals/Tools/FailureToTask.ts log "description" -c category -s severity` |
| Access to hidden dotfiles in home directory | 102 | `bun run ~/.claude/skills/Evals/Tools/FailureToTask.ts convert-all` |
| Access to hidden dotfiles in home directory | 105 | `bun run ~/.claude/skills/Evals/Tools/SuiteManager.ts create <name> -t capability -d "description"` |
| Access to hidden dotfiles in home directory | 106 | `bun run ~/.claude/skills/Evals/Tools/SuiteManager.ts list` |
| Access to hidden dotfiles in home directory | 107 | `bun run ~/.claude/skills/Evals/Tools/SuiteManager.ts check-saturation <name>` |
| Access to hidden dotfiles in home directory | 108 | `bun run ~/.claude/skills/Evals/Tools/SuiteManager.ts graduate <name>` |
| Access to hidden dotfiles in home directory | 117 | `bun run ~/.claude/skills/Evals/Tools/AlgorithmBridge.ts -s regression-core -r 3 -u` |
| External URL reference | 22 | `curl -s -X POST http://localhost:8888/notify \` |
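
The findings above list line 22 under two categories, which suggests each line is checked against every rule independently. The sketch below illustrates that idea only; the regular expressions, the `RULES` list, and the `scan` function are hypothetical and are not the auditor's actual rules, which this report does not publish.

```python
import re

# Hypothetical rules named after the three categories in this report.
# The patterns are illustrative guesses, not the auditor's real checks.
RULES = [
    (
        "Curl to non-GitHub URL",
        # curl followed by a URL whose host is not github.com or a subdomain of it
        re.compile(r"\bcurl\b.*https?://(?!(?:[\w.-]+\.)?github\.com\b)"),
    ),
    (
        "Access to hidden dotfiles in home directory",
        # a path that starts with ~/. such as ~/.claude/...
        re.compile(r"~/\.\w"),
    ),
    (
        "External URL reference",
        # any http or https URL
        re.compile(r"https?://"),
    ),
]


def scan(lines):
    """Return (category, line_number, text) for every rule that matches a line."""
    findings = []
    for number, text in lines:
        for category, pattern in RULES:
            if pattern.search(text):
                findings.append((category, number, text))
    return findings


# Two lines copied from the findings, keyed by their reported line numbers.
sample = [
    (22, "curl -s -X POST http://localhost:8888/notify \\"),
    (106, "bun run ~/.claude/skills/Evals/Tools/SuiteManager.ts list"),
]

for category, number, text in scan(sample):
    print(f"{category}: line {number}: {text}")
```

Under these assumed patterns, the `curl` line produces two findings (one per matching rule) and the `bun run` line produces one, which matches the way the report counts line 22 twice toward its total of 11 issues.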