Local LLM Router

🌐 Community
by hoodini · latest · Repository

Provides LLMs with guidance and assistance for building AI and machine learning applications.

Install on your platform


1. Run in terminal (recommended)

   claude mcp add local-llm-router npx -- -y @trustedskills/local-llm-router
2. Or manually add to ~/.claude/settings.json

{
  "mcpServers": {
    "local-llm-router": {
      "command": "npx",
      "args": [
        "-y",
        "@trustedskills/local-llm-router"
      ]
    }
  }
}
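If ~/.claude/settings.json already has content, merge the new entry rather than overwriting the whole file. A minimal sketch of that merge (the helper function is illustrative, not part of the skill):

```python
import json

def add_mcp_server(settings: dict, name: str, command: str, args: list) -> dict:
    """Insert an MCP server entry while preserving any existing configuration."""
    settings.setdefault("mcpServers", {})[name] = {"command": command, "args": args}
    return settings

# Entry matching the config shown above.
settings = add_mcp_server({}, "local-llm-router", "npx",
                          ["-y", "@trustedskills/local-llm-router"])
print(json.dumps(settings, indent=2))
```

Loading the existing file with json.loads, passing it through this helper, and writing it back leaves any other configured servers untouched.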

Requires Claude Code (claude CLI). Run claude --version to verify your install.

About This Skill

The local-llm-router skill enables AI agents to dynamically select and route requests to specific locally hosted Large Language Models based on defined criteria. It manages connections to multiple open-source models running on your machine without requiring external API keys or cloud dependencies.

When to use it

  • Privacy-first workflows: Process sensitive data entirely offline where internet connectivity is restricted or prohibited.
  • Cost optimization: Reduce expenses by utilizing free, self-hosted models instead of paid enterprise APIs for routine tasks.
  • Custom model testing: Quickly switch between different local architectures (e.g., Llama 3 vs. Mistral) to evaluate performance on specific datasets.
  • Low-latency requirements: Minimize response times by routing queries directly to a model running on the same network as the agent.

Key capabilities

  • Routes incoming prompts to multiple configured local LLM endpoints simultaneously.
  • Supports various open-source model formats commonly used in local environments.
  • Manages connection states and health checks for locally deployed inference servers.
  • Provides a unified interface to interact with disparate local models through a single agent skill.
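The skill's internal API isn't documented on this page, but the routing idea behind these capabilities can be sketched roughly as follows (endpoint names, ports, and tags are hypothetical, not the skill's real configuration):

```python
# Illustrative sketch of criteria-based routing across local LLM endpoints.
from dataclasses import dataclass, field

@dataclass
class Endpoint:
    name: str
    url: str
    tags: set = field(default_factory=set)
    healthy: bool = True  # would be updated by periodic health checks

def route(endpoints, required_tags):
    """Return the first healthy endpoint matching all required tags,
    falling back to any healthy endpoint if none match."""
    matches = [e for e in endpoints if e.healthy and required_tags <= e.tags]
    if matches:
        return matches[0]
    fallback = [e for e in endpoints if e.healthy]
    return fallback[0] if fallback else None

endpoints = [
    Endpoint("llama3-70b", "http://localhost:11434", {"high-precision"}),
    Endpoint("mistral-7b", "http://localhost:11435", {"fast", "small"}),
]
print(route(endpoints, {"fast"}).name)  # mistral-7b
```

The fallback branch is what makes failover prompts like the ones below possible: if the preferred endpoint is marked unhealthy, the query still lands on a live model.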

Example prompts

  • "Route this legal document analysis to the high-precision local model, but keep the summary generation on the faster, smaller instance."
  • "Check if the primary local LLM is responsive; if not, automatically failover to the backup endpoint for this query."
  • "Process this batch of internal employee records using the secure, offline-only model configuration."

Tips & gotchas

Ensure your local inference server (such as Ollama or vLLM) is running and accessible via the network before attempting to route requests. Performance will vary significantly depending on your hardware specifications and the size of the selected local models.
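A quick reachability check before routing can save debugging time. This sketch probes Ollama's /api/tags endpoint on its default port; adjust the URL and path for vLLM or other servers:

```python
import urllib.request
import urllib.error

def server_reachable(base_url: str, timeout: float = 2.0) -> bool:
    """Return True if the local inference server answers HTTP at base_url."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        return False

if __name__ == "__main__":
    # Default Ollama port; change to match your own server.
    print(server_reachable("http://localhost:11434"))
```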

🛡️ TrustedSkills Verification

Unlike other registries that point to live repositories, TrustedSkills pins every skill to a verified commit hash. This protects you from malicious updates — what you install today is exactly what was reviewed and verified.

Security Audits

Gen Agent Trust Hub: Pass
Socket: Pass
Snyk: Pass

Details

Version: latest
License:
Author: hoodini
Installs: 33

🌐 Community: Passed automated security scans.