Parallel Data Enrichment

🌐 Community
by parallel-web · vlatest

Automatically expands limited training data with synthetic examples generated from web searches, boosting model accuracy.

Install on your platform

1. Run in terminal (recommended)

claude mcp add parallel-data-enrichment npx -- -y @trustedskills/parallel-data-enrichment

2. Or manually add to ~/.claude/settings.json

{
  "mcpServers": {
    "parallel-data-enrichment": {
      "command": "npx",
      "args": [
        "-y",
        "@trustedskills/parallel-data-enrichment"
      ]
    }
  }
}

Requires Claude Code (claude CLI). Run claude --version to verify your install.
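
If ~/.claude/settings.json already holds other settings, pasting the snippet over the file would discard them. The following is a minimal sketch of merging the entry instead. The function name add_mcp_server is hypothetical, and the sketch assumes the file, when present, contains a single JSON object.

```python
import json
from pathlib import Path

# Server name and entry copied from the manual-install snippet above.
SERVER_NAME = "parallel-data-enrichment"
SERVER_ENTRY = {
    "command": "npx",
    "args": ["-y", "@trustedskills/parallel-data-enrichment"],
}


def add_mcp_server(settings_path: Path) -> dict:
    """Merge the server entry into settings_path, keeping all existing keys.

    Hypothetical helper for illustration; assumes the file (if it exists)
    holds one JSON object.
    """
    settings = {}
    if settings_path.exists():
        text = settings_path.read_text(encoding="utf-8")
        if text.strip():
            settings = json.loads(text)

    # Create "mcpServers" if it is missing, then add or overwrite our entry.
    servers = settings.setdefault("mcpServers", {})
    servers[SERVER_NAME] = SERVER_ENTRY

    settings_path.parent.mkdir(parents=True, exist_ok=True)
    settings_path.write_text(json.dumps(settings, indent=2) + "\n", encoding="utf-8")
    return settings
```

Calling add_mcp_server(Path.home() / ".claude" / "settings.json") would apply it to the file named in step 2; other servers and unrelated settings are left in place.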

About This Skill

What it does

This skill automatically expands limited training data with synthetic examples generated from web searches. It takes existing data, supplied either inline or as a CSV file, and enriches it according to a specified intent, such as retrieving CEO names and founding years for companies. A run can take several minutes, depending on how much data is requested. Follow-up runs can use context chaining to build on entities discovered in earlier enrichment tasks.

When to use it

  • Expanding small datasets to improve model accuracy.
  • Adding missing information or details to existing records.
  • Generating synthetic training examples based on a specific intent (e.g., finding company founding dates).
  • Building upon previous enrichment tasks by leveraging context chaining.
  • Preparing data for machine learning models where more comprehensive information is needed.

Key capabilities

  • Data Enrichment: Generates additional data points based on user-defined intents.
  • CSV File Support: Accepts CSV files as input and outputs enriched data to a specified file.
  • Inline Data Support: Accepts data directly within the command line.
  • Context Chaining: Allows follow-up enrichment tasks to build upon the context of previous runs using an interaction_id.
  • Asynchronous Processing: Uses the --no-wait flag to start enrichment and return immediately, so processing continues in the background.
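
To illustrate the two input modes, here is a small sketch that turns selected CSV columns into the kind of inline JSON list used in the example prompts. The helper name rows_to_inline_json is hypothetical, and the listing does not specify how the skill parses inline data beyond that example, so treat the output format as an assumption.

```python
import csv
import io
import json


def rows_to_inline_json(csv_text: str, columns: list[str]) -> str:
    """Return a JSON array of objects holding only the requested columns.

    Hypothetical helper: the inline format is assumed to match the
    '[{"company": "Google"}, ...]' shape shown in the example prompts.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    records = [{col: row[col] for col in columns} for row in reader]
    return json.dumps(records)


sample = "company,country\nGoogle,US\nMicrosoft,US\n"
print(rows_to_inline_json(sample, ["company"]))
# prints [{"company": "Google"}, {"company": "Microsoft"}]
```

Keeping only the columns the intent needs keeps the inline payload short; larger inputs are probably better passed as a CSV file.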

Example prompts

  • "Enrich this data with CEO names and founding years: '[{"company": "Google"}, {"company": "Microsoft"}]'"
  • "Can you enrich the 'company' column in 'input.csv' to find the CEO name and founding year?"
  • "Continue enriching the data from the previous task, using interaction ID [previous_interaction_id]."

Tips & gotchas

  • Asynchronous Operation: Always use the --no-wait flag to avoid blocking the agent’s execution. You'll need to poll for results separately.
  • Timeout Limit: The polling process has a timeout of 9 minutes (--timeout 540). If it times out, re-run the parallel-cli enrich poll command to continue waiting.
  • Interaction ID: Remember and reuse the interaction_id for follow-up enrichment tasks to maintain context.
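
The "re-run on timeout" tip can be sketched as a bounded retry loop. This is only an illustration under stated assumptions: it assumes a poll attempt that times out exits with a non-zero status, and it takes the full poll command from the caller because the listing does not show how a run is identified on the command line. The names run_poll_once and poll_until_done are hypothetical.

```python
import subprocess
from typing import Callable, Sequence


def run_poll_once(poll_command: Sequence[str]) -> bool:
    """Run one poll attempt; True if the command exited with status 0.

    Assumption: a timed-out poll returns a non-zero exit status.
    """
    return subprocess.run(list(poll_command)).returncode == 0


def poll_until_done(
    poll_command: Sequence[str],
    max_attempts: int = 5,
    run_once: Callable[[Sequence[str]], bool] = run_poll_once,
) -> int:
    """Re-run the poll command until it succeeds; return the attempt count."""
    for attempt in range(1, max_attempts + 1):
        if run_once(poll_command):
            return attempt
    raise TimeoutError(f"poll did not finish after {max_attempts} attempts")


# Hypothetical usage; any run identifier the CLI needs must be added by the
# caller, since its name and position are not given in the listing:
# poll_until_done(["parallel-cli", "enrich", "poll", "--timeout", "540"])
```

With the 540-second timeout, five attempts bound the total wait at roughly 45 minutes; the cap is arbitrary and can be adjusted.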

TrustedSkills Verification

Unlike other registries that point to live repositories, TrustedSkills pins every skill to a verified commit hash. This protects you from malicious updates — what you install today is exactly what was reviewed and verified.

Security Audits

Gen Agent Trust Hub: Pass
Socket: Pass
Snyk: Pass

Details

Version: vlatest
License: (not listed)
Author: parallel-web
Installs: 81

🌐 Community: Passed automated security scans.