Agent Reviewer
Ruvnet's agent-reviewer analyzes agent conversations, identifying areas for improvement in response quality and adherence to guidelines.
Install on your platform
Run in terminal (recommended)
claude mcp add agent-reviewer npx -- -y @trustedskills/agent-reviewer
Or manually add to ~/.claude/settings.json
{
  "mcpServers": {
    "agent-reviewer": {
      "command": "npx",
      "args": [
        "-y",
        "@trustedskills/agent-reviewer"
      ]
    }
  }
}
Requires Claude Code (claude CLI). Run claude --version to verify your install.
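If you would rather script the manual edit than type it by hand, the merge step can be sketched in Python. This is a minimal sketch under stated assumptions: the helper name `add_mcp_server` is hypothetical and not part of the claude CLI, and it works on an already-parsed settings dictionary, leaving the reading and writing of the file to the caller.

```python
import json


def add_mcp_server(settings, name, command, args):
    """Return a copy of `settings` with one MCP server entry added or replaced.

    Hypothetical helper for illustration only; it is not part of the claude CLI.
    """
    merged = dict(settings)
    # Copy the existing server map so the caller's dictionary is not mutated.
    servers = dict(merged.get("mcpServers", {}))
    servers[name] = {"command": command, "args": list(args)}
    merged["mcpServers"] = servers
    return merged


# Example: an existing configuration that already defines another server
# ("other-server" is a made-up entry used only to show that it is preserved).
existing = {"mcpServers": {"other-server": {"command": "node", "args": ["server.js"]}}}
updated = add_mcp_server(
    existing, "agent-reviewer", "npx", ["-y", "@trustedskills/agent-reviewer"]
)
print(json.dumps(updated, indent=2))
```

Returning a new dictionary instead of editing in place makes it easy to diff the result against the current file before saving it.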
About This Skill
What it does
The agent-reviewer skill provides automated feedback on AI agent performance. It analyzes agent outputs against criteria you define and identifies strengths and weaknesses. You can then use that feedback to improve the agent iteratively, adjusting its prompts or underlying model.
When to use it
- Evaluating new agent versions: Quickly assess the impact of changes made to an agent's prompt or configuration.
- Identifying common failure modes: Pinpoint recurring errors or weaknesses in an agent’s responses across a range of tasks.
- Benchmarking agents against each other: Compare the performance of different agent setups on standardized evaluation datasets.
- Automating quality assurance: Integrate into a continuous integration/continuous delivery (CI/CD) pipeline to ensure consistent agent quality.
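The CI/CD use case above can be sketched as a simple quality gate. The sketch assumes a review step has already produced one score between 0 and 1 per criterion; that score format, the `gate` function, and the 0.8 threshold are illustrative assumptions, not output the skill is documented to produce.

```python
def gate(scores, threshold=0.8):
    """Return the names of criteria scoring below `threshold`, sorted by name.

    `scores` is an assumed format: a mapping of criterion name to a 0-1 score.
    """
    return sorted(name for name, value in scores.items() if value < threshold)


# Made-up scores for illustration only.
scores = {"accuracy": 0.92, "completeness": 0.74, "clarity": 0.88}
failing = gate(scores)
if failing:
    print("Quality gate failed for:", ", ".join(failing))
else:
    print("Quality gate passed")
```

In a pipeline, a non-empty result would typically be turned into a non-zero exit status (for example with `sys.exit(1)`) so that the build fails.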
Key capabilities
- Automated feedback generation
- Performance analysis based on defined criteria
- Identification of strengths and weaknesses in agent outputs
- Comparison of different agent versions or configurations
Example prompts
- "Review this agent's response: [agent output] against the following criteria: accuracy, completeness, and clarity."
- "Compare the performance of Agent A and Agent B on this task: [task description]."
- "Analyze this set of agent responses for common errors related to factual recall."
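When many responses need reviewing, prompts like the first example can be generated from a template rather than written by hand. The following is a minimal sketch; `build_review_prompt` is a hypothetical helper, and the wording simply mirrors the example prompt above.

```python
def build_review_prompt(agent_output, criteria):
    """Format a review request in the style of the first example prompt."""
    if not criteria:
        raise ValueError("at least one criterion is required")
    if len(criteria) == 1:
        joined = criteria[0]
    elif len(criteria) == 2:
        joined = " and ".join(criteria)
    else:
        # Three or more items: comma-separated with a final "and".
        joined = ", ".join(criteria[:-1]) + ", and " + criteria[-1]
    return (
        f"Review this agent's response: {agent_output} "
        f"against the following criteria: {joined}."
    )


prompt = build_review_prompt(
    "The capital of France is Paris.",
    ["accuracy", "completeness", "clarity"],
)
print(prompt)
```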
Tips & gotchas
The skill's effectiveness depends heavily on clearly defined evaluation criteria. Ensure these are specific, measurable, achievable, relevant, and time-bound (SMART) to get the most accurate and actionable feedback.
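One way to keep criteria specific and measurable is to write each one down as a checkable question with an explicit scale, then render the list into the prompt. The sketch below is illustrative only: the `Criterion` record, the `render_rubric` function, and the sample questions are assumptions, not a format the skill requires.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Criterion:
    name: str
    question: str  # a specific, checkable question
    scale: str     # how the answer is measured


def render_rubric(criteria):
    """Render criteria as a numbered list suitable for pasting into a prompt."""
    lines = []
    for index, criterion in enumerate(criteria, start=1):
        lines.append(
            f"{index}. {criterion.name}: {criterion.question} ({criterion.scale})"
        )
    return "\n".join(lines)


# Sample rubric, made up for illustration.
rubric = [
    Criterion("accuracy", "Are all factual claims supported by the provided sources?", "score 1-5"),
    Criterion("completeness", "Does the response address every part of the user's request?", "yes/no"),
]
print(render_rubric(rubric))
```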
TrustedSkills Verification
Unlike other registries that point to live repositories, TrustedSkills pins every skill to a verified commit hash. This protects you from malicious updates — what you install today is exactly what was reviewed and verified.
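The difference between a pinned reference and a live one can be illustrated with a small check. This is a sketch under assumptions, not a description of how TrustedSkills performs verification: `is_pinned` is a hypothetical helper that only tests whether a reference has the shape of a full 40-character Git SHA-1 commit hash, as opposed to a movable name such as a branch or tag.

```python
import re

# A full Git SHA-1 commit hash is 40 lowercase hexadecimal characters.
FULL_SHA = re.compile(r"[0-9a-f]{40}")


def is_pinned(ref):
    """Return True if `ref` looks like a full 40-character commit hash."""
    return FULL_SHA.fullmatch(ref) is not None


# The hash below is a placeholder, not a real commit of any skill.
print(is_pinned("0123456789abcdef0123456789abcdef01234567"))
print(is_pinned("main"))
```

A branch or tag name can later be moved to point at different code, whereas a full commit hash identifies one fixed snapshot, which is the property the pinning approach relies on.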
Security Audits
| Auditor | Result |
| --- | --- |
| Gen Agent Trust Hub | Pass |
| Socket | Pass |
| Snyk | Pass |
🌐 Community
Passed automated security scans.