Nemo Evaluator

Author: eyadsibai

🌐Community

by eyadsibai · latest

Nemo Evaluator assesses the quality of generated text against criteria defined in a prompt, which helps keep outputs consistent with those criteria.

Install on your platform

Run in terminal (recommended)

claude mcp add nemo-evaluator -- npx -y @trustedskills/nemo-evaluator

Or manually add to ~/.claude/settings.json

{
  "mcpServers": {
    "nemo-evaluator": {
      "command": "npx",
      "args": [
        "-y",
        "@trustedskills/nemo-evaluator"
      ]
    }
  }
}

Requires Claude Code (claude CLI). Run claude --version to verify your install.
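
If the settings file already contains other entries, the manual edit above amounts to merging one key into the existing "mcpServers" object rather than replacing the file. The following is a minimal Python sketch of that merge, assuming the file is plain JSON with the structure shown above. The helper names add_mcp_server and update_settings_file are hypothetical and are not part of Claude Code or this skill.

```python
from __future__ import annotations

import json
from pathlib import Path


def add_mcp_server(settings: dict, name: str, command: str, args: list[str]) -> dict:
    """Return a copy of `settings` with one entry added under "mcpServers".

    Hypothetical helper: existing top-level keys and other servers are kept.
    """
    merged = dict(settings)
    servers = dict(merged.get("mcpServers", {}))
    servers[name] = {"command": command, "args": list(args)}
    merged["mcpServers"] = servers
    return merged


def update_settings_file(path: Path) -> None:
    """Read the JSON file if it exists, add the entry, and write it back."""
    current = json.loads(path.read_text()) if path.exists() else {}
    updated = add_mcp_server(
        current, "nemo-evaluator", "npx", ["-y", "@trustedskills/nemo-evaluator"]
    )
    path.write_text(json.dumps(updated, indent=2) + "\n")
```

Calling update_settings_file(Path.home() / ".claude" / "settings.json") would apply the merge to the file named above. Back up the file first, since the sketch rewrites it in place.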

About This Skill

What it does

The nemo-evaluator skill evaluates text generation models using NVIDIA's NeMo framework. It runs inference on the models you specify and scores their outputs against predefined metrics. This makes it easier to compare models and to find areas for improvement during development or after deployment.

When to use it

Model Comparison: Evaluate multiple text generation models (e.g., summarization, translation) against each other to determine which performs best for a given task.
Performance Monitoring: Track the performance of a deployed model over time and identify potential degradation in quality.
A/B Testing: Compare different versions of a model or prompting strategies to optimize output quality.
Automated Evaluation Pipelines: Integrate into automated workflows for continuous model evaluation and improvement.
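
The A/B testing and performance monitoring cases above both reduce to comparing per-example scores once a metric has been computed. The sketch below illustrates that comparison step only. It assumes the scores are floats where higher is better, and the function names and thresholds (min_gain, tolerance) are hypothetical choices for illustration, not settings exposed by the skill.

```python
from __future__ import annotations

from statistics import mean


def compare_variants(scores_a: list[float], scores_b: list[float],
                     min_gain: float = 0.01) -> str:
    """Return "A", "B", or "tie" based on the mean per-example score.

    A difference smaller than `min_gain` is treated as a tie.
    """
    diff = mean(scores_b) - mean(scores_a)
    if diff > min_gain:
        return "B"
    if diff < -min_gain:
        return "A"
    return "tie"


def has_degraded(baseline: list[float], recent: list[float],
                 tolerance: float = 0.05) -> bool:
    """True when the recent mean falls more than `tolerance` below the baseline mean."""
    return mean(baseline) - mean(recent) > tolerance
```

A comparison of means is the simplest possible decision rule. With small evaluation sets, a significance test or bootstrap interval would give a more reliable answer than a fixed threshold.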

Key capabilities

Model inference using NVIDIA NeMo
Performance metric calculation
Comparison of model outputs
Integration with automated pipelines

Example prompts

"Evaluate the summarization performance of Model A versus Model B on this dataset."
"Run inference with the 'translation' model and calculate BLEU score."
"Compare the output quality of these two models using ROUGE metrics."
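
For readers unfamiliar with the metrics named in these prompts, the sketch below shows the unigram overlap idea behind ROUGE-1 and behind the first term of BLEU. It is a simplified illustration in plain Python, not the skill's or NeMo's implementation. It assumes whitespace tokenization and a single reference. Full BLEU also combines clipped precisions for longer n-grams and applies a brevity penalty, both omitted here.

```python
from __future__ import annotations

from collections import Counter


def _tokens(text: str) -> list[str]:
    # Assumption: lowercase whitespace tokenization, no stemming.
    return text.lower().split()


def unigram_overlap(candidate: str, reference: str) -> int:
    """Count candidate tokens found in the reference, clipped by reference counts."""
    cand, ref = Counter(_tokens(candidate)), Counter(_tokens(reference))
    return sum((cand & ref).values())


def clipped_unigram_precision(candidate: str, reference: str) -> float:
    """Clipped overlap divided by candidate length (the unigram term used in BLEU)."""
    total = len(_tokens(candidate))
    return unigram_overlap(candidate, reference) / total if total else 0.0


def rouge1_f1(candidate: str, reference: str) -> float:
    """Harmonic mean of unigram precision and recall."""
    overlap = unigram_overlap(candidate, reference)
    if overlap == 0:
        return 0.0
    precision = overlap / len(_tokens(candidate))
    recall = overlap / len(_tokens(reference))
    return 2 * precision * recall / (precision + recall)
```

For example, scoring "the cat sat on the mat" against the reference "the cat is on the mat" gives five overlapping tokens out of six on each side, so precision, recall, and F1 are all 5/6.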

Tips & gotchas

Requires NVIDIA NeMo to be installed and configured.
Check that the models you specify use an architecture the nemo-evaluator skill supports.

TrustedSkills Verification

Unlike other registries that point to live repositories, TrustedSkills pins every skill to a verified commit hash. This protects you from malicious updates — what you install today is exactly what was reviewed and verified.

Security Audits

Gen Agent Trust Hub	Pass
Socket	Pass
Snyk	Pass

Details

Version: latest
License
Author: eyadsibai
Installs: 27

Repository (canonical source) →

Passed automated security scans.