Prompt Injection Scanner

🌐Community
by jorgealves · vlatest · Repository

This scanner analyzes prompts for potential vulnerabilities like prompt injection attacks, safeguarding your AI applications from malicious manipulation.

Install on your platform

We auto-selected Claude Code based on this skill’s supported platforms.

1

Run in terminal (recommended)

terminal
claude mcp add prompt-injection-scanner npx -- -y @trustedskills/prompt-injection-scanner
2

Or manually add to ~/.claude/settings.json

~/.claude/settings.json
{
  "mcpServers": {
    "prompt-injection-scanner": {
      "command": "npx",
      "args": [
        "-y",
        "@trustedskills/prompt-injection-scanner"
      ]
    }
  }
}

Requires Claude Code (claude CLI). Run claude --version to verify your install.

About This Skill

The prompt-injection-scanner skill analyzes input data to detect and flag potential prompt injection attempts before they reach an AI model. It helps secure agent interactions by identifying malicious patterns designed to bypass safety filters or alter intended behavior.

When to use it

  • User-generated content: Scan comments, forum posts, or chat logs before processing them with an LLM.
  • Dynamic data ingestion: Filter untrusted external APIs or database entries that might contain hidden instructions.
  • Public-facing agents: Protect customer support bots or public assistants from adversarial attacks.
  • Pre-deployment testing: Validate new prompt templates against known injection vectors to ensure robustness.

Key capabilities

  • Identifies structural anomalies in input strings typical of injection attacks.
  • Flags suspicious keywords and formatting tricks used to override system instructions.
  • Provides clear alerts when potentially harmful payloads are detected in user inputs.

Example prompts

  • "Scan this batch of customer reviews for any prompt injection attempts before I feed them into the sentiment analysis model."
  • "Check if these dynamically generated API responses contain hidden commands trying to manipulate my agent's logic."
  • "Analyze this forum thread to detect users attempting to bypass safety filters through indirect instruction embedding."

Tips & gotchas

Ensure you have a baseline of legitimate input patterns to distinguish between false positives and actual threats. This skill is most effective when integrated early in the data pipeline, before any generative processing occurs.

Tags

🛡️

TrustedSkills Verification

Unlike other registries that point to live repositories, TrustedSkills pins every skill to a verified commit hash. This protects you from malicious updates — what you install today is exactly what was reviewed and verified.

Security Audits

Gen Agent Trust HubPass
SocketPass
SnykPass

Details

Version
vlatest
License
Author
jorgealves
Installs
64

🌐 Community

Passed automated security scans.