Local LLM Router
Provides LLMs with guidance and assistance for building AI and machine learning applications.
Install on your platform
Run in terminal (recommended)
claude mcp add local-llm-router -- npx -y @trustedskills/local-llm-router
Or manually add to ~/.claude/settings.json
{
  "mcpServers": {
    "local-llm-router": {
      "command": "npx",
      "args": [
        "-y",
        "@trustedskills/local-llm-router"
      ]
    }
  }
}

Requires Claude Code (the claude CLI). Run claude --version to verify your install.
About This Skill
The local-llm-router skill enables AI agents to dynamically select and route requests to specific locally hosted Large Language Models based on defined criteria. It manages connections to multiple open-source models running on your machine without requiring external API keys or cloud dependencies.
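As an illustration, criteria-based routing can be sketched as a mapping from task categories to local endpoints. The category names, ports, and schema below are hypothetical and not the skill's actual configuration format.

```typescript
// Hypothetical routing table mapping task categories to local endpoints.
// The category names and URLs are illustrative only.
const routes: Record<string, string> = {
  "high-precision": "http://localhost:11434/v1", // e.g. a large Llama 3 instance
  "fast-summary": "http://localhost:11435/v1",   // e.g. a small quantized model
};

// Resolve the endpoint for a task, failing loudly when no route exists.
function routeFor(task: string): string {
  const url = routes[task];
  if (url === undefined) {
    throw new Error(`no local endpoint configured for task "${task}"`);
  }
  return url;
}

console.log(routeFor("high-precision")); // → http://localhost:11434/v1
```

Failing loudly on an unknown category keeps misrouted prompts from silently falling through to the wrong model.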
When to use it
- Privacy-first workflows: Process sensitive data entirely offline where internet connectivity is restricted or prohibited.
- Cost optimization: Reduce expenses by utilizing free, self-hosted models instead of paid enterprise APIs for routine tasks.
- Custom model testing: Quickly switch between different local architectures (e.g., Llama 3 vs. Mistral) to evaluate performance on specific datasets.
- Low-latency requirements: Minimize response times by routing queries directly to a model running on the same network as the agent.
Key capabilities
- Routes incoming prompts across multiple configured local LLM endpoints.
- Supports various open-source model formats commonly used in local environments.
- Manages connection states and health checks for locally deployed inference servers.
- Provides a unified interface to interact with disparate local models through a single agent skill.
Example prompts
- "Route this legal document analysis to the high-precision local model, but keep the summary generation on the faster, smaller instance."
- "Check if the primary local LLM is responsive; if not, automatically failover to the backup endpoint for this query."
- "Process this batch of internal employee records using the secure, offline-only model configuration."
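The failover prompt above could translate into logic like the following sketch, which assumes each endpoint carries a boolean health flag; the skill's actual health-check mechanism and data model are not documented here.

```typescript
// Sketch of primary/backup failover. The Endpoint shape and the health flag
// are assumptions for illustration, not the skill's actual data model.
interface Endpoint {
  name: string;
  url: string;
  healthy: boolean;
}

// Return the first healthy endpoint in priority order, or undefined if none.
function selectEndpoint(endpoints: Endpoint[]): Endpoint | undefined {
  return endpoints.find((e) => e.healthy);
}

const endpoints: Endpoint[] = [
  { name: "primary", url: "http://localhost:11434", healthy: false },
  { name: "backup", url: "http://localhost:8000", healthy: true },
];

console.log(selectEndpoint(endpoints)?.name); // → backup
```

Ordering the list by priority keeps the selection logic trivial: the primary is always tried first, and the backup only serves when the primary is marked unhealthy.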
Tips & gotchas
Ensure your local inference server (such as Ollama or vLLM) is running and accessible via the network before attempting to route requests. Performance will vary significantly depending on your hardware specifications and the size of the selected local models.
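One way to verify reachability before routing is a timed probe. This sketch assumes Node 18+ (built-in fetch) and Ollama's default port 11434; /api/tags is Ollama's model-listing endpoint, so adjust the URL for vLLM or other servers.

```typescript
// Probe a local inference server with a timeout before routing to it.
// Assumes Node 18+ for the global fetch and AbortSignal.timeout.
async function isUp(url: string, timeoutMs = 2000): Promise<boolean> {
  try {
    const res = await fetch(url, { signal: AbortSignal.timeout(timeoutMs) });
    return res.ok;
  } catch {
    return false; // connection refused, DNS failure, or timeout
  }
}

// Ollama's default model-list endpoint; adjust for other inference servers.
isUp("http://localhost:11434/api/tags").then((up) =>
  console.log(up ? "server is up" : "server is down")
);
```

The short timeout matters: without it, a hung inference server would stall the router instead of triggering failover.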
TrustedSkills Verification
Unlike other registries that point to live repositories, TrustedSkills pins every skill to a verified commit hash. This protects you from malicious updates — what you install today is exactly what was reviewed and verified.
Security Audits
| Auditor | Result |
|---|---|
| Gen Agent Trust Hub | Pass |
| Socket | Pass |
| Snyk | Pass |