Agent Reviewer

Name: Agent Reviewer
Author: ruvnet

🌐 Community

by ruvnet · vlatest · Repository

Ruvnet's agent-reviewer analyzes agent conversations, identifying areas for improvement in response quality and adherence to guidelines.

Install on your platform

Run in terminal (recommended)

claude mcp add agent-reviewer npx -- -y @trustedskills/agent-reviewer

Or manually add to ~/.claude/settings.json

{
  "mcpServers": {
    "agent-reviewer": {
      "command": "npx",
      "args": [
        "-y",
        "@trustedskills/agent-reviewer"
      ]
    }
  }
}

Requires Claude Code (claude CLI). Run claude --version to verify your install.
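If the settings file already holds other entries, the manual edit above amounts to merging one key under "mcpServers". Here is a minimal Python sketch of that merge. The helper name add_mcp_server and the "theme" key are hypothetical and used only for illustration; the server name, command, and args are copied from the JSON above.

```python
import json


def add_mcp_server(settings: dict, name: str, command: str, args: list[str]) -> dict:
    """Return a copy of `settings` with one server entry added under "mcpServers".

    Hypothetical helper for illustration; it does not touch any file on disk.
    """
    merged = dict(settings)
    # Copy the existing server map so the caller's dict is left unchanged.
    servers = dict(merged.get("mcpServers", {}))
    servers[name] = {"command": command, "args": list(args)}
    merged["mcpServers"] = servers
    return merged


# Stand-in for whatever the settings file already contains (assumed content).
existing = {"theme": "dark"}

updated = add_mcp_server(
    existing,
    name="agent-reviewer",
    command="npx",
    args=["-y", "@trustedskills/agent-reviewer"],
)
print(json.dumps(updated, indent=2))
```

The function returns a new dictionary instead of editing in place, so the result can be inspected before anything is written back to the file.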

About This Skill

What it does

The agent-reviewer skill provides automated feedback on AI agent performance. It analyzes agent outputs against defined criteria and identifies strengths and weaknesses. That feedback supports iterative improvement: use it to make targeted adjustments to prompts or underlying models.

When to use it

Evaluating new agent versions: Quickly assess the impact of changes made to an agent's prompt or configuration.
Identifying common failure modes: Pinpoint recurring errors or weaknesses in an agent’s responses across a range of tasks.
Benchmarking agents against each other: Compare the performance of different agent setups on standardized evaluation datasets.
Automating quality assurance: Integrate into a continuous integration/continuous delivery (CI/CD) pipeline to ensure consistent agent quality.
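As a rough illustration of the version-comparison and CI/CD use cases above, the sketch below flags criteria where a candidate agent scores lower than a baseline. It assumes review results have already been reduced to one numeric score per criterion; the score values, the 1-5 scale, the tolerance, and the function name find_regressions are all hypothetical and are not part of the skill itself.

```python
def find_regressions(
    baseline: dict[str, float],
    candidate: dict[str, float],
    tolerance: float = 0.0,
) -> dict[str, float]:
    """Return {criterion: delta} for criteria where the candidate dropped
    below the baseline by more than `tolerance`."""
    regressions = {}
    for criterion, base_score in baseline.items():
        # Assumption: a criterion missing from the candidate counts as 0.0.
        delta = candidate.get(criterion, 0.0) - base_score
        if delta < -tolerance:
            regressions[criterion] = delta
    return regressions


# Hypothetical per-criterion scores on an assumed 1-5 scale.
baseline = {"accuracy": 4.0, "completeness": 3.5, "clarity": 4.5}
candidate = {"accuracy": 4.5, "completeness": 3.0, "clarity": 4.5}

regressions = find_regressions(baseline, candidate, tolerance=0.25)
for criterion, delta in regressions.items():
    print(f"{criterion}: {delta:+.2f}")

# A CI job could pass this value to sys.exit() so a regression fails the build.
exit_code = 1 if regressions else 0
```

A small tolerance keeps the gate from failing on noise between review runs; how large it should be depends on how stable the scores are in practice.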

Key capabilities

Automated feedback generation
Performance analysis based on defined criteria
Identification of strengths and weaknesses in agent outputs
Comparison of different agent versions or configurations

Example prompts

"Review this agent's response: [agent output] against the following criteria: accuracy, completeness, and clarity."
"Compare the performance of Agent A and Agent B on this task: [task description]."
"Analyze this set of agent responses for common errors related to factual recall."
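When many responses need reviewing, prompts like the first example above can be generated from a template. The Python sketch below shows one way to do that; build_review_prompt is a hypothetical helper, and the wording simply mirrors the first example prompt.

```python
def build_review_prompt(agent_output: str, criteria: list[str]) -> str:
    """Fill the review-prompt template with an agent output and a criteria list."""
    if not criteria:
        raise ValueError("at least one criterion is required")
    # Join the criteria as an English list: "a", "a and b", or "a, b, and c".
    if len(criteria) == 1:
        joined = criteria[0]
    elif len(criteria) == 2:
        joined = " and ".join(criteria)
    else:
        joined = ", ".join(criteria[:-1]) + ", and " + criteria[-1]
    return (
        f"Review this agent's response: {agent_output} "
        f"against the following criteria: {joined}."
    )


prompt = build_review_prompt(
    "[agent output]", ["accuracy", "completeness", "clarity"]
)
print(prompt)
```

Keeping the template in one function means every response is reviewed against the same wording, which makes results easier to compare.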

Tips & gotchas

The skill's effectiveness depends heavily on clearly defined evaluation criteria. Make them specific, measurable, achievable, relevant, and time-bound (SMART) so that the feedback is accurate and actionable.
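One way to keep criteria specific and measurable is to write each one down with an explicit description and scoring scale before putting it in a prompt. The Python sketch below is an assumption about how that could be organized, not a format the skill requires; the Criterion class, the 1-5 default scale, and the descriptions are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Criterion:
    """One evaluation criterion with an explicit description and score range."""

    name: str
    description: str  # what the reviewer should check, stated concretely
    scale_min: int = 1
    scale_max: int = 5

    def __post_init__(self) -> None:
        # Reject criteria too vague to score: no description or an empty scale.
        if not self.description.strip():
            raise ValueError(f"criterion {self.name!r} needs a description")
        if self.scale_min >= self.scale_max:
            raise ValueError(f"criterion {self.name!r} has an empty score range")

    def render(self) -> str:
        """Format the criterion as one rubric line for inclusion in a prompt."""
        return f"- {self.name} ({self.scale_min}-{self.scale_max}): {self.description}"


# Hypothetical rubric built from the criteria named in the example prompts.
criteria = [
    Criterion("accuracy", "Every factual claim matches the reference material."),
    Criterion("completeness", "All parts of the user's question are answered."),
    Criterion("clarity", "The answer can be followed without rereading."),
]

rubric = "\n".join(c.render() for c in criteria)
print(rubric)
```

The rendered rubric can be pasted into a review prompt in place of a bare list of criterion names.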

TrustedSkills Verification

Unlike other registries that point to live repositories, TrustedSkills pins every skill to a verified commit hash. This protects you from malicious updates — what you install today is exactly what was reviewed and verified.

Security Audits

Gen Agent Trust Hub: Pass
Socket: Pass
Snyk: Pass

Details

Version: vlatest
License
Author: ruvnet
Installs: 19

Passed automated security scans.