Write Judge Prompt
Critically evaluates and refines prompts for large language models to maximize output quality and relevance.
Install on your platform
We auto-selected Claude Code based on this skill’s supported platforms.
Run in terminal (recommended)
claude mcp add write-judge-prompt -- npx -y @trustedskills/write-judge-prompt
Or manually add to ~/.claude/settings.json
{
  "mcpServers": {
    "write-judge-prompt": {
      "command": "npx",
      "args": [
        "-y",
        "@trustedskills/write-judge-prompt"
      ]
    }
  }
}
Requires Claude Code (claude CLI). Run claude --version to verify your install.
About This Skill
The write-judge-prompt skill generates structured evaluation instructions to assess AI agent outputs against specific criteria. It transforms raw task requirements into precise rubrics that guide an evaluator model in scoring responses accurately and consistently.
When to use it
- Automated grading: Create objective scoring rules for automated tests or benchmark suites.
- Quality assurance: Define checklists to verify if an agent's output meets safety, format, or accuracy standards.
- Model alignment: Craft specific constraints to ensure generated text adheres to complex domain rules (e.g., legal formatting).
- Iterative improvement: Generate feedback prompts to analyze why a previous agent response succeeded or failed.
Key capabilities
- Converts natural language task descriptions into formal evaluation rubrics.
- Supports multi-criteria scoring with weighted importance for different aspects of the output.
- Generates clear pass/fail conditions for binary decision-making in evaluation pipelines.
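The weighted multi-criteria scoring described above can be sketched in a few lines. This is a minimal illustration of how an evaluation pipeline might combine a judge's per-criterion scores; the rubric fields, weights, and scale are hypothetical, not the skill's actual output format.

```python
# Hypothetical rubric: criterion names, weights, and scores are illustrative.
rubric = [
    {"criterion": "accuracy", "weight": 0.5, "score": 4},  # scores on a 0-5 scale
    {"criterion": "format",   "weight": 0.3, "score": 5},
    {"criterion": "tone",     "weight": 0.2, "score": 3},
]

def weighted_score(rubric, max_score=5):
    """Combine per-criterion scores into a single 0-1 value."""
    total = sum(item["weight"] * item["score"] for item in rubric)
    return total / max_score

def passes(rubric, threshold=0.8):
    """Binary pass/fail decision for an evaluation pipeline."""
    return weighted_score(rubric) >= threshold

print(round(weighted_score(rubric), 2))  # 0.82
print(passes(rubric))                    # True
```

Weights should sum to 1.0 so the combined score stays on the same 0-1 scale regardless of how many criteria the rubric defines.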
Example prompts
- "Create a judge prompt to evaluate if a Python script correctly parses CSV files while handling missing values."
- "Write an evaluation rubric for an AI agent that summarizes medical articles, focusing on accuracy and tone neutrality."
- "Generate a scoring guide to assess whether a coding assistant's explanation is clear for a beginner audience."
Tips & gotchas
Ensure your input task description includes explicit success criteria; vague goals will result in weak judge prompts. This skill works best when paired with a separate generation prompt, creating a closed-loop evaluation system where the agent creates content and the judge validates it.
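The closed-loop pattern mentioned above can be sketched as a generate-judge-revise cycle. The `generate` and `judge` functions here are hypothetical stand-ins for calls to a generation model and to an evaluator model running a judge prompt produced by this skill.

```python
# Sketch of a closed-loop evaluation cycle, assuming two model-call stubs.
def generate(task, feedback=None):
    # Placeholder: a real implementation would call the generation model,
    # incorporating judge feedback into the prompt on revision rounds.
    return f"draft for {task!r}" + (" (revised)" if feedback else "")

def judge(output):
    # Placeholder: a real implementation would run the judge prompt and
    # return a (passed, feedback) pair from the evaluator model.
    return ("revised" in output, "tighten the summary")

def refine(task, max_rounds=3):
    """Generate, judge, and revise until the output passes or rounds run out."""
    feedback = None
    for _ in range(max_rounds):
        output = generate(task, feedback)
        passed, feedback = judge(output)
        if passed:
            return output
    return output
```

Capping the number of rounds keeps the loop from cycling indefinitely when the judge's criteria are unsatisfiable for a given task.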
TrustedSkills Verification
Unlike other registries that point to live repositories, TrustedSkills pins every skill to a verified commit hash. This protects you from malicious updates — what you install today is exactly what was reviewed and verified.
Security Audits
| Gen Agent Trust Hub | Pass |
| Socket | Pass |
| Snyk | Pass |