Nemo Evaluator

🌐Community
by eyadsibai · vlatest

Nemo Evaluator assesses the quality of generated text based on a defined prompt, ensuring outputs align with desired criteria and improving consistency.

Install on your platform

1. Run in terminal (recommended)

claude mcp add nemo-evaluator npx -- -y @trustedskills/nemo-evaluator

2. Or manually add to ~/.claude/settings.json

{
  "mcpServers": {
    "nemo-evaluator": {
      "command": "npx",
      "args": [
        "-y",
        "@trustedskills/nemo-evaluator"
      ]
    }
  }
}

Requires Claude Code (claude CLI). Run claude --version to verify your install.
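
If the settings file already contains other keys or other MCP servers, pasting the JSON above by hand risks overwriting them. The following is a minimal sketch of a merge that preserves existing entries. The helper names add_mcp_server and update_settings_file are hypothetical and not part of Claude Code or this skill; only the server name, command, and args come from the configuration shown above.

```python
import json
from pathlib import Path


def add_mcp_server(settings: dict, name: str, command: str, args: list[str]) -> dict:
    """Return a copy of `settings` with one MCP server entry added or replaced.

    Hypothetical helper: existing top-level keys and other servers are kept.
    """
    merged = dict(settings)
    servers = dict(merged.get("mcpServers", {}))
    servers[name] = {"command": command, "args": list(args)}
    merged["mcpServers"] = servers
    return merged


def update_settings_file(path: Path) -> None:
    """Merge the nemo-evaluator entry into the JSON file at `path`."""
    # Start from an empty config if the file does not exist yet.
    current = json.loads(path.read_text()) if path.exists() else {}
    updated = add_mcp_server(
        current, "nemo-evaluator", "npx", ["-y", "@trustedskills/nemo-evaluator"]
    )
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(updated, indent=2) + "\n")
```

Calling update_settings_file(Path.home() / ".claude" / "settings.json") would apply the merge to the path named in step 2. Back up the file first, since this sketch does no validation of the existing contents.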

About This Skill

What it does

The nemo-evaluator skill evaluates text generation models using NVIDIA's NeMo framework. It runs inference on the models you specify and scores their outputs against predefined metrics, so you can compare models side by side and identify areas for improvement during development or deployment.

When to use it

  • Model Comparison: Evaluate multiple text generation models (e.g., summarization, translation) against each other to determine which performs best for a given task.
  • Performance Monitoring: Track the performance of a deployed model over time and identify potential degradation in quality.
  • A/B Testing: Compare different versions of a model or prompting strategies to optimize output quality.
  • Automated Evaluation Pipelines: Integrate into automated workflows for continuous model evaluation and improvement.
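
To make the performance-monitoring use case concrete, here is a minimal sketch of a degradation check over a history of evaluation scores. It is an illustration only and not how the skill itself works: the function name detect_degradation, the window sizes, and the tolerance are all assumptions chosen for the example.

```python
from statistics import mean


def detect_degradation(
    scores: list[float],
    baseline_n: int = 5,
    recent_n: int = 5,
    tolerance: float = 0.05,
) -> bool:
    """Flag a drop in quality between early and recent evaluation runs.

    Returns True when the mean of the last `recent_n` scores is more than
    `tolerance` below the mean of the first `baseline_n` scores.
    All thresholds here are illustrative assumptions.
    """
    if len(scores) < baseline_n + recent_n:
        return False  # not enough history to compare two windows
    baseline = mean(scores[:baseline_n])
    recent = mean(scores[-recent_n:])
    return baseline - recent > tolerance
```

In an automated pipeline, a check like this could run after each scheduled evaluation and raise an alert when it returns True. Comparing window means rather than single scores keeps one noisy run from triggering a false alarm.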

Key capabilities

  • Model inference using NVIDIA NeMo
  • Performance metric calculation
  • Comparison of model outputs
  • Integration with automated pipelines

Example prompts

  • "Evaluate the summarization performance of Model A versus Model B on this dataset."
  • "Run inference with the 'translation' model and calculate BLEU score."
  • "Compare the output quality of these two models using ROUGE metrics."
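
For readers unfamiliar with the metrics named in these prompts, the sketch below shows the idea behind a ROUGE-1 style score: unigram overlap between a model output and a reference text, combined into an F1 value. It is a simplified illustration that assumes whitespace tokenization and no stemming, so its numbers will differ from standard ROUGE implementations, and it is not the skill's own scoring code. The names rouge1_f1 and rank_models are hypothetical.

```python
from collections import Counter


def rouge1_f1(candidate: str, reference: str) -> float:
    """Simplified ROUGE-1 F1: unigram overlap on lowercased, whitespace-split text."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    # Counter intersection keeps the minimum count of each shared token.
    overlap = sum((cand & ref).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


def rank_models(outputs: dict[str, str], reference: str) -> list[tuple[str, float]]:
    """Score each model's output against one reference and sort best first."""
    scored = [(name, rouge1_f1(text, reference)) for name, text in outputs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

A comparison of "Model A versus Model B" then reduces to calling rank_models with each model's output for the same input and reading off the order. A real evaluation would average over a full dataset instead of a single reference.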

Tips & gotchas

  • Requires NVIDIA NeMo to be installed and configured.
  • Ensure that the specified models are compatible with the nemo-evaluator skill's supported architectures.

TrustedSkills Verification

Unlike other registries that point to live repositories, TrustedSkills pins every skill to a verified commit hash. This protects you from malicious updates — what you install today is exactly what was reviewed and verified.

Security Audits

Gen Agent Trust Hub: Pass
Socket: Pass
Snyk: Pass

Details

Version: vlatest
License: (not listed)
Author: eyadsibai
Installs: 27

Passed automated security scans.