Senior Data Engineer

🌐 Community
by ovachiever · latest · Repository

Analyzes complex data engineering challenges, designs scalable solutions, and optimizes pipelines with senior-level expertise.

Install on your platform

We auto-selected Claude Code based on this skill’s supported platforms.

1. Run in terminal (recommended)

claude mcp add ovachiever-senior-data-engineer npx -- -y @trustedskills/ovachiever-senior-data-engineer
2. Or manually add to ~/.claude/settings.json

~/.claude/settings.json
{
  "mcpServers": {
    "ovachiever-senior-data-engineer": {
      "command": "npx",
      "args": [
        "-y",
        "@trustedskills/ovachiever-senior-data-engineer"
      ]
    }
  }
}

Requires Claude Code (the claude CLI). Run claude --version to verify your installation.

About This Skill

What it does

This skill provides AI agents with expertise equivalent to a senior data engineer, enabling them to design, implement, and optimize production-grade AI/ML/Data systems. It leverages Python scripts for pipeline orchestration, data quality validation, and ETL performance optimization. The skill also offers access to reference documentation covering advanced data pipeline architecture, data modeling patterns, and DataOps best practices.
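The bundled scripts' interfaces are not documented on this page, but the orchestration pattern the skill describes can be illustrated with a minimal sketch. The step names and dependency graph below are hypothetical and do not reflect the actual API of pipeline_orchestrator.py:

```python
# Minimal sketch of dependency-ordered pipeline orchestration.
# The steps and DAG here are hypothetical, illustrative only; the
# real pipeline_orchestrator.py may expose a different interface.
from graphlib import TopologicalSorter

def run_pipeline(steps, dependencies):
    """Run step callables in dependency order; return the order used.

    steps: {name: zero-arg callable}
    dependencies: {name: set of names that must run first}
    """
    order = list(TopologicalSorter(dependencies).static_order())
    for name in order:
        steps[name]()
    return order

if __name__ == "__main__":
    log = []
    steps = {
        "extract": lambda: log.append("extract"),
        "transform": lambda: log.append("transform"),
        "load": lambda: log.append("load"),
    }
    # transform depends on extract; load depends on transform
    deps = {"transform": {"extract"}, "load": {"transform"}}
    print(run_pipeline(steps, deps))  # ['extract', 'transform', 'load']
```

A production orchestrator (Airflow, Dagster, or the skill's own script) adds retries, scheduling, and observability on top of this same topological-ordering idea.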

When to use it

  • Designing scalable data architectures for new AI/ML projects.
  • Optimizing existing data pipelines for improved performance and efficiency.
  • Implementing MLOps and DataOps best practices within a development workflow.
  • Troubleshooting complex issues related to data quality, pipeline failures, or system bottlenecks.
  • Reviewing and improving the security and compliance of data infrastructure.

Key capabilities

  • Pipeline orchestration using pipeline_orchestrator.py
  • Data quality validation with data_quality_validator.py
  • ETL performance optimization via etl_performance_optimizer.py
  • Expertise in advanced production patterns and architectures
  • Scalable system design and implementation
  • Performance optimization at scale
  • MLOps and DataOps best practices
  • Access to reference documentation on data pipeline architecture, modeling patterns, and DataOps
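As an illustration of the kind of check the data-quality capability covers, here is a minimal null-rate validator. This is a sketch only; the function name, signature, and threshold are assumptions, not the actual interface of data_quality_validator.py:

```python
# Minimal sketch of a data-quality check: flag columns whose share of
# missing values exceeds a threshold. Purely illustrative; not the
# interface of the skill's data_quality_validator.py.
def null_rate_report(rows, threshold=0.1):
    """rows: list of dicts sharing the same keys.

    Returns {column: null_rate} for columns whose null rate
    exceeds the threshold.
    """
    if not rows:
        return {}
    report = {}
    for col in rows[0].keys():
        nulls = sum(1 for r in rows if r.get(col) is None)
        rate = nulls / len(rows)
        if rate > threshold:
            report[col] = rate
    return report

if __name__ == "__main__":
    rows = [
        {"id": 1, "amount": 10.0},
        {"id": 2, "amount": None},
        {"id": 3, "amount": None},
        {"id": 4, "amount": 5.0},
    ]
    print(null_rate_report(rows, threshold=0.25))  # {'amount': 0.5}
```

Real validators typically layer schema, range, and freshness checks on top of completeness checks like this one.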

Example prompts

  • "Can you orchestrate a data pipeline processing files from /data/ and outputting results to /results/?"
  • "Analyze the data quality of the project/ directory."
  • "Optimize the ETL process using the configuration in config.yaml and deploy the changes."

Tips & gotchas

  • This skill is intended for complex data engineering challenges, not basic scripting tasks.
  • Familiarity with Python and common data tools (Spark, Airflow, dbt) will enhance usability.
  • Refer to the reference documentation (references/data_pipeline_architecture.md, references/data_modeling_patterns.md, references/dataops_best_practices.md) for detailed guidance on specific topics.


TrustedSkills Verification

Unlike other registries that point to live repositories, TrustedSkills pins every skill to a verified commit hash. This protects you from malicious updates — what you install today is exactly what was reviewed and verified.

Security Audits

  • Gen Agent Trust Hub: Pass
  • Socket: Pass
  • Snyk: Pass

Details

  • Version: latest
  • License: not specified
  • Author: ovachiever
  • Installs: 29
