Tbench

🌐 Community
by coder · latest

Tbench automatically generates diverse test cases for your code, improving reliability and catching edge-case bugs quickly.

Install on your platform


1. Run in terminal (recommended):

claude mcp add tbench npx -- -y @trustedskills/tbench

2. Or manually add to ~/.claude/settings.json:

{
  "mcpServers": {
    "tbench": {
      "command": "npx",
      "args": [
        "-y",
        "@trustedskills/tbench"
      ]
    }
  }
}

Requires Claude Code (claude CLI). Run claude --version to verify your install.
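
If the settings file already holds other entries, the manual edit above amounts to merging one key into "mcpServers" without disturbing the rest. A minimal Python sketch of that merge follows. The helper name add_mcp_server is hypothetical (it is not part of tbench or Claude Code), and it works on an already-parsed settings dictionary rather than touching any file.

```python
import json
from copy import deepcopy


def add_mcp_server(settings: dict, name: str, command: str, args: list[str]) -> dict:
    """Return a copy of `settings` with one MCP server entry added or replaced.

    Hypothetical helper for illustration only; existing servers and unrelated
    top-level keys are preserved, and the input dictionary is not modified.
    """
    merged = deepcopy(settings)
    servers = merged.setdefault("mcpServers", {})
    servers[name] = {"command": command, "args": list(args)}
    return merged


# Example: start from settings that already define another (made-up) server.
existing = {"mcpServers": {"other": {"command": "node", "args": ["server.js"]}}}
updated = add_mcp_server(existing, "tbench", "npx", ["-y", "@trustedskills/tbench"])
print(json.dumps(updated, indent=2))
```

Working on a copy keeps the original settings available in case the result needs to be checked before it is written back.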

About This Skill

What it does

The tbench skill provides a simple benchmarking tool for AI agents. It allows users to run predefined tests and measure performance metrics, offering insights into an agent's capabilities. This helps evaluate and compare different agent configurations or track improvements over time.

When to use it

  • Performance Evaluation: Assess the speed and accuracy of your AI agent on specific tasks.
  • Regression Testing: Ensure new changes don’t negatively impact existing functionality by running benchmarks before and after updates.
  • Configuration Tuning: Experiment with different settings or parameters to optimize an agent's performance.
  • Comparison Across Models: Compare the effectiveness of various AI models in a standardized environment.
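
The regression-testing use case above comes down to comparing two result sets from the same benchmarks. This page does not document tbench's output format, so the sketch below assumes a hypothetical shape: a mapping from benchmark name to a single score where higher is better. The function name find_regressions and the 5% tolerance are illustrative choices, not tbench behavior.

```python
def find_regressions(
    before: dict[str, float],
    after: dict[str, float],
    tolerance: float = 0.05,
) -> dict[str, tuple[float, float]]:
    """Return benchmarks whose score dropped by more than `tolerance` (relative).

    Assumes higher scores are better. Benchmarks missing from either run,
    or with a non-positive baseline score, are skipped.
    """
    regressions = {}
    for name in sorted(before.keys() & after.keys()):
        old, new = before[name], after[name]
        if old > 0 and (old - new) / old > tolerance:
            regressions[name] = (old, new)
    return regressions


# Made-up scores; the benchmark names are borrowed from this page's example prompts.
baseline = {"basic_math": 0.90, "logic_puzzle": 0.80}
candidate = {"basic_math": 0.91, "logic_puzzle": 0.60}
print(find_regressions(baseline, candidate))
```

A relative tolerance avoids flagging small run-to-run noise; the right threshold depends on how stable the benchmark scores actually are.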

Key capabilities

  • Predefined benchmark tests
  • Performance metric measurement
  • Standardized testing environment

Example prompts

  • "Run the 'basic_math' benchmark."
  • "Execute all available benchmarks and report results."
  • "Compare the performance of agent A versus agent B on the 'logic_puzzle' test."

Tips & gotchas

The tbench skill needs a working, properly configured AI agent environment to run. Compare results only across runs made under the same testing conditions, since differences in hardware or software can skew the numbers.
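
One way to make the "same testing conditions" caveat checkable is to store a small environment fingerprint next to each set of results and compare fingerprints before comparing scores. The sketch below uses only the Python standard library. The field names and the same_conditions helper are hypothetical, and nothing here is produced by tbench itself.

```python
import platform
import sys


def environment_fingerprint() -> dict[str, str]:
    """Collect basic hardware and software facts about the current machine."""
    return {
        "os": platform.system(),
        "os_release": platform.release(),
        "machine": platform.machine(),
        "python": sys.version.split()[0],
    }


def same_conditions(a: dict[str, str], b: dict[str, str]) -> bool:
    """True when two fingerprints match on every recorded field."""
    return a == b


current = environment_fingerprint()
print(current)
print(same_conditions(current, environment_fingerprint()))
```

The fingerprint is deliberately coarse; fields such as the model version or agent configuration would need to be added by hand if they vary between runs.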

TrustedSkills Verification

Unlike other registries that point to live repositories, TrustedSkills pins every skill to a verified commit hash. This protects you from malicious updates — what you install today is exactly what was reviewed and verified.

Security Audits

Gen Agent Trust Hub: Pass
Socket: Pass
Snyk: Pass

Details

Version: latest
License: (not listed)
Author: coder
Installs: 10

Community tier: Passed automated security scans.