Senior Data Engineer

🌐 Community
by ovachiever · latest · Repository

Analyzes complex data engineering challenges, designs scalable solutions, and optimizes pipelines with senior-level expertise.

Install on your platform

We auto-selected Claude Code based on this skill’s supported platforms.

1. Run in terminal (recommended)

claude mcp add ovachiever-senior-data-engineer npx -- -y @trustedskills/ovachiever-senior-data-engineer
2. Or manually add to ~/.claude/settings.json

~/.claude/settings.json
{
  "mcpServers": {
    "ovachiever-senior-data-engineer": {
      "command": "npx",
      "args": [
        "-y",
        "@trustedskills/ovachiever-senior-data-engineer"
      ]
    }
  }
}

Requires Claude Code (the claude CLI). Run claude --version to verify your installation.

About This Skill

What it does

This skill provides AI agents with expertise equivalent to a senior data engineer, enabling them to design, implement, and optimize production-grade AI/ML/Data systems. It leverages Python scripts for pipeline orchestration, data quality validation, and ETL performance optimization. The skill also offers access to reference documentation covering advanced data pipeline architecture, data modeling patterns, and DataOps best practices.
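The bundled scripts' interfaces are not documented on this page, but the orchestration pattern the skill describes can be illustrated with a minimal sketch. The step names and dependency graph below are hypothetical and do not reflect the actual API of pipeline_orchestrator.py:

```python
# Minimal sketch of dependency-ordered pipeline orchestration.
# The steps and DAG here are hypothetical, illustrative only; the
# real pipeline_orchestrator.py may expose a different interface.
from graphlib import TopologicalSorter

def run_pipeline(steps, dependencies):
    """Run step callables in dependency order; return the order used.

    steps: {name: zero-arg callable}
    dependencies: {name: set of names that must run first}
    """
    order = list(TopologicalSorter(dependencies).static_order())
    for name in order:
        steps[name]()
    return order

if __name__ == "__main__":
    log = []
    steps = {
        "extract": lambda: log.append("extract"),
        "transform": lambda: log.append("transform"),
        "load": lambda: log.append("load"),
    }
    # transform depends on extract; load depends on transform
    deps = {"transform": {"extract"}, "load": {"transform"}}
    print(run_pipeline(steps, deps))  # ['extract', 'transform', 'load']
```

A production orchestrator (Airflow, Dagster, or the skill's own script) adds retries, scheduling, and observability on top of this same topological-ordering idea.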

When to use it

  • Designing scalable data architectures for new AI/ML projects.
  • Optimizing existing data pipelines for improved performance and efficiency.
  • Implementing MLOps and DataOps best practices within a development workflow.
  • Troubleshooting complex issues related to data quality, pipeline failures, or system bottlenecks.
  • Reviewing and improving the security and compliance of data infrastructure.

Key capabilities

  • Pipeline orchestration using pipeline_orchestrator.py
  • Data quality validation with data_quality_validator.py
  • ETL performance optimization via etl_performance_optimizer.py
  • Expertise in advanced production patterns and architectures
  • Scalable system design and implementation
  • Performance optimization at scale
  • MLOps and DataOps best practices
  • Access to reference documentation on data pipeline architecture, modeling patterns, and DataOps
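As an illustration of the kind of check the data-quality capability covers, here is a minimal null-rate validator. This is a sketch only; the function name, signature, and threshold are assumptions, not the actual interface of data_quality_validator.py:

```python
# Minimal sketch of a data-quality check: flag columns whose share of
# missing values exceeds a threshold. Purely illustrative; not the
# interface of the skill's data_quality_validator.py.
def null_rate_report(rows, threshold=0.1):
    """rows: list of dicts sharing the same keys.

    Returns {column: null_rate} for columns whose null rate
    exceeds the threshold.
    """
    if not rows:
        return {}
    report = {}
    for col in rows[0].keys():
        nulls = sum(1 for r in rows if r.get(col) is None)
        rate = nulls / len(rows)
        if rate > threshold:
            report[col] = rate
    return report

if __name__ == "__main__":
    rows = [
        {"id": 1, "amount": 10.0},
        {"id": 2, "amount": None},
        {"id": 3, "amount": None},
        {"id": 4, "amount": 5.0},
    ]
    print(null_rate_report(rows, threshold=0.25))  # {'amount': 0.5}
```

Real validators typically layer schema, range, and freshness checks on top of completeness checks like this one.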

Example prompts

  • "Can you orchestrate a data pipeline processing files from /data/ and outputting results to /results/?"
  • "Analyze the data quality of the project/ directory."
  • "Optimize the ETL process using the configuration in config.yaml and deploy the changes."

Tips & gotchas

  • This skill is intended for complex data engineering challenges, not basic scripting tasks.
  • Familiarity with Python and common data tools (Spark, Airflow, dbt) will enhance usability.
  • Refer to the reference documentation (references/data_pipeline_architecture.md, references/data_modeling_patterns.md, references/dataops_best_practices.md) for detailed guidance on specific topics.


TrustedSkills Verification

Unlike other registries that point to live repositories, TrustedSkills pins every skill to a verified commit hash. This protects you from malicious updates — what you install today is exactly what was reviewed and verified.

Security Audits

  • Gen Agent Trust Hub: Pass
  • Socket: Pass
  • Snyk: Pass

Details

  • Version: latest
  • License: not specified
  • Author: ovachiever
  • Installs: 29
