hugging-face-evaluation

🏢 Official
by huggingface · v1.0.0 · Apache-2.0

Add and manage evaluation results in Hugging Face model cards. Supports extracting eval tables from README content, importing scores from the Artificial Analysis API, and running custom model evaluations.

About This Skill

What it does

This skill adds evaluation metrics directly to your Hugging Face model cards. It can extract evaluation data presented as tables in a README file, import results from the Artificial Analysis API, or run custom evaluations. Keeping the results in the model card makes a model's benchmark scores consistent and easy to find.
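To make the README extraction step concrete, here is a minimal sketch of pulling the first Markdown pipe table out of README text. It is not the skill's actual implementation: the names `extract_eval_table`, `split_row`, and `DIVIDER`, and the sample README, are hypothetical and chosen only for illustration.

```python
import re

# Matches a Markdown table divider row such as "|---|:---:|" or "| --- | --- |".
DIVIDER = re.compile(r"^\|?\s*:?-+:?\s*(\|\s*:?-+:?\s*)*\|?$")


def split_row(line):
    """Split one pipe-table row into stripped cell strings."""
    return [cell.strip() for cell in line.strip().strip("|").split("|")]


def extract_eval_table(readme_text):
    """Return the first Markdown pipe table as a list of {header: cell} dicts."""
    lines = readme_text.splitlines()
    for i in range(len(lines) - 1):
        divider = lines[i + 1].strip()
        if "|" in lines[i] and "|" in divider and DIVIDER.match(divider):
            headers = split_row(lines[i])
            rows = []
            for line in lines[i + 2:]:
                if "|" not in line:
                    break  # the table ends at the first non-table line
                cells = split_row(line)
                if len(cells) == len(headers):
                    rows.append(dict(zip(headers, cells)))
            return rows
    return []  # no table found


# Hypothetical README content used only to exercise the sketch.
sample = """# My model

| Benchmark | Score |
|-----------|-------|
| MMLU      | 71.3  |
| GSM8K     | 58.9  |

More text.
"""

rows = extract_eval_table(sample)
print(rows)
```

Cell values stay as strings here; converting scores to numbers is left to whichever step consumes the rows.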

When to use it

  • Documenting Model Performance: When releasing a new model on the Hugging Face Hub, use this skill to clearly display its evaluation scores in the model card.
  • Tracking Evaluation Progress: As you iterate on a model and run different evaluations, easily update the model card with the latest results.
  • Integrating External Analysis: When using an external service like Artificial Analysis for model benchmarking, automatically import those results into your Hugging Face model cards.
  • Sharing Custom Evaluations: If you've developed custom evaluation scripts, use this skill to add their output to the model card in a standardized format.

Key capabilities

  • Extracts evaluation tables from README content.
  • Imports scores from the Artificial Analysis API.
  • Supports running custom model evaluations.
  • Updates Hugging Face model cards with evaluation results.
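As an illustration of what updating a model card with evaluation results can involve, the sketch below builds a `model-index` metadata structure, the format the Hugging Face Hub uses for evaluation results in model card metadata. Whether the skill writes exactly this structure is an assumption. The function name `build_model_index`, its parameters, the default task and metric types, and the example scores are all hypothetical.

```python
import re


def build_model_index(model_name, scores, task_type="text-generation",
                      metric_type="accuracy"):
    """Build model-index metadata from (benchmark, value) pairs.

    The defaults for task_type and metric_type are illustrative assumptions.
    """
    results = []
    for benchmark, value in scores:
        # Derive a simple lowercase identifier from the benchmark name
        # (an illustrative convention, not a Hub requirement).
        dataset_id = re.sub(r"[^a-z0-9]+", "-", benchmark.lower()).strip("-")
        results.append({
            "task": {"type": task_type},
            "dataset": {"name": benchmark, "type": dataset_id},
            "metrics": [{"type": metric_type, "value": float(value)}],
        })
    return {"model-index": [{"name": model_name, "results": results}]}


# Hypothetical model name and scores, for illustration only.
metadata = build_model_index("my-model", [("MMLU", "71.3"), ("GSM8K (5-shot)", 58.9)])
print(metadata)
```

A dictionary of this shape corresponds to the YAML front matter at the top of a model card's README; writing it to the Hub is a separate step that needs write access to the repository.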

Example prompts

  • "Add the evaluation table from my model's README to its Hugging Face model card."
  • "Import the latest scores for this model from the Artificial Analysis API and update the model card."
  • "Run my custom evaluation script on this model and add the results to the model card."

Tips & gotchas

The skill needs access to your Hugging Face account with permission to modify the target model cards, which in practice means write access to the model repository. If you import results from an external API such as Artificial Analysis, configure its credentials before running the skill.
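A quick pre-flight check can catch missing credentials before anything is modified. The sketch below is an assumption about how such a check might look, not part of the skill. `HF_TOKEN` is the environment variable the `huggingface_hub` library reads for authentication; `ARTIFICIAL_ANALYSIS_API_KEY` is a hypothetical name, since the listing does not say which variable the skill expects.

```python
import os


def missing_credentials(required, env=None):
    """Return the names in `required` that are unset or empty in the environment."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]


# ARTIFICIAL_ANALYSIS_API_KEY is a hypothetical variable name for illustration.
required = ["HF_TOKEN", "ARTIFICIAL_ANALYSIS_API_KEY"]
missing = missing_credentials(required)
if missing:
    print("Missing credentials:", ", ".join(missing))
```

Passing a plain dictionary as `env` makes the check easy to exercise without touching the real environment.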

🛡️ TrustedSkills Verification

Unlike other registries that point to live repositories, TrustedSkills pins every skill to a verified commit hash. This protects you from malicious updates: what you install today is exactly what was reviewed and verified.

Security Audits

Gen Agent Trust Hub: Pass
Socket: Pass
Snyk: Pass

Details

Version: v1.0.0
License: Apache-2.0
Author: huggingface
Installs: 0

🏢 Official

Published by the company or team that built the technology.