evaluating-llms-harness
Evaluates LLMs on 60+ benchmarks to assess and compare model quality. The harness is widely adopted in academic and industry settings.
Install this skill with one command:
/learn @dicklesworthstone/evaluation-lm-evaluation-harness
GitHub Stars: 508
Category: development
Updated: March 29, 2026
Tags: openclaw, api, ml-ai-engineer, data-scientist, data-analyst, researcher, product-manager, huggingface, development, data analytics, education research, product
Dicklesworthstone/pi_agent_rust