fair-evaluation-baseline
by HomericIntelligence, v1.0.0
Implement baseline pipeline capture and regression detection to distinguish agent-introduced failures from pre-existing issues in E2E evaluations
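As a rough illustration of the idea, the sketch below compares pass/fail results captured before the agent runs (the baseline) with results captured afterwards. It is not taken from the skill itself: the function name `classify_failures`, the check names, and the dict-of-booleans result format are all hypothetical, chosen only to show how a regression could be separated from a pre-existing failure.

```python
def classify_failures(baseline: dict[str, bool], after: dict[str, bool]) -> dict[str, list[str]]:
    """Classify each check by comparing baseline results with post-agent results.

    Both arguments map a check name to True (passed) or False (failed).
    This result format is an assumption made for illustration only.
    """
    report: dict[str, list[str]] = {
        "regressions": [],    # passed in the baseline, fails after the agent's changes
        "pre_existing": [],   # failed in the baseline and still fails
        "fixed": [],          # failed in the baseline, passes now
        "unbaselined": [],    # fails now, but no baseline result exists to compare against
    }
    for check in sorted(after):
        passed_after = after[check]
        if check not in baseline:
            if not passed_after:
                report["unbaselined"].append(check)
        elif baseline[check] and not passed_after:
            report["regressions"].append(check)
        elif not baseline[check] and not passed_after:
            report["pre_existing"].append(check)
        elif not baseline[check] and passed_after:
            report["fixed"].append(check)
    return report


# Hypothetical pipeline results: "lint" was already failing before the agent ran.
baseline_results = {"build": True, "lint": False, "tests": True}
after_results = {"build": True, "lint": False, "tests": False}

report = classify_failures(baseline_results, after_results)
print(report["regressions"])   # checks the agent should be held responsible for
print(report["pre_existing"])  # checks that were already failing
```

Under this scheme only the entries in `regressions` would count against the agent; `pre_existing` failures are reported but excluded from the score.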