Senior Data Engineer
Analyzes complex data engineering challenges, designs scalable solutions, and optimizes pipelines with senior-level expertise.
Install on your platform
We auto-selected Claude Code based on this skill’s supported platforms.
Run in terminal (recommended)
claude mcp add ovachiever-senior-data-engineer npx -- -y @trustedskills/ovachiever-senior-data-engineer
Or manually add to ~/.claude/settings.json
{
  "mcpServers": {
    "ovachiever-senior-data-engineer": {
      "command": "npx",
      "args": [
        "-y",
        "@trustedskills/ovachiever-senior-data-engineer"
      ]
    }
  }
}

Requires Claude Code (claude CLI). Run claude --version to verify your install.
About This Skill
What it does
This skill provides AI agents with expertise equivalent to a senior data engineer, enabling them to design, implement, and optimize production-grade AI/ML/Data systems. It leverages Python scripts for pipeline orchestration, data quality validation, and ETL performance optimization. The skill also offers access to reference documentation covering advanced data pipeline architecture, data modeling patterns, and DataOps best practices.
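As an illustration of the pipeline work described above, here is a minimal extract-transform-load sketch of the kind of flow this skill helps design and optimize. The function names and data shapes are illustrative assumptions, not the skill's actual API.

```python
# Illustrative ETL sketch; not the skill's bundled scripts.

def extract(rows):
    """Extract: yield raw records (an in-memory list stands in for a source)."""
    yield from rows

def transform(records):
    """Transform: normalize fields and drop records missing an id."""
    for r in records:
        if r.get("id") is not None:
            yield {"id": r["id"], "value": float(r.get("value", 0))}

def load(records):
    """Load: collect into a destination (a list stands in for a warehouse)."""
    return list(records)

raw = [{"id": 1, "value": "2.5"}, {"id": None}, {"id": 2}]
result = load(transform(extract(raw)))
# result == [{"id": 1, "value": 2.5}, {"id": 2, "value": 0.0}]
```

Composing generator-based stages like this keeps memory use flat, which is the same streaming principle production pipelines scale up with frameworks like Spark or Airflow.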
When to use it
- Designing scalable data architectures for new AI/ML projects.
- Optimizing existing data pipelines for improved performance and efficiency.
- Implementing MLOps and DataOps best practices within a development workflow.
- Troubleshooting complex issues related to data quality, pipeline failures, or system bottlenecks.
- Reviewing and improving the security and compliance of data infrastructure.
Key capabilities
- Pipeline orchestration using pipeline_orchestrator.py
- Data quality validation with data_quality_validator.py
- ETL performance optimization via etl_performance_optimizer.py
- Expertise in advanced production patterns and architectures
- Scalable system design and implementation
- Performance optimization at scale
- MLOps and DataOps best practices
- Access to reference documentation on data pipeline architecture, modeling patterns, and DataOps
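To make the data quality capability concrete, the sketch below shows the kind of rule-based checks a validator might run. This is not the bundled data_quality_validator.py; the rule names and interface are illustrative assumptions.

```python
# Hedged sketch of rule-based data quality validation.

def validate(rows, rules):
    """Return a list of (row_index, rule_name) pairs for failed checks."""
    violations = []
    for i, row in enumerate(rows):
        for name, check in rules.items():
            if not check(row):
                violations.append((i, name))
    return violations

# Hypothetical rules: each maps a name to a predicate over one record.
rules = {
    "id_present": lambda r: r.get("id") is not None,
    "value_non_negative": lambda r: r.get("value", 0) >= 0,
}

rows = [{"id": 1, "value": 10}, {"id": None, "value": -3}]
issues = validate(rows, rules)
# issues == [(1, "id_present"), (1, "value_non_negative")]
```

Reporting violations as (row, rule) pairs rather than a single pass/fail makes failures actionable, which is the usual design choice in data quality tooling.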
Example prompts
- "Can you orchestrate a data pipeline processing files from
/data/and outputting results to/results/?" - "Analyze the data quality of the
project/directory." - "Optimize the ETL process using the configuration in
config.yamland deploy the changes."
Tips & gotchas
- This skill is intended for complex data engineering challenges, not basic scripting tasks.
- Familiarity with Python and common data tools (Spark, Airflow, dbt) will enhance usability.
- Refer to the reference documentation (references/data_pipeline_architecture.md, references/data_modeling_patterns.md, references/dataops_best_practices.md) for detailed guidance on specific topics.
TrustedSkills Verification
Unlike other registries that point to live repositories, TrustedSkills pins every skill to a verified commit hash. This protects you from malicious updates — what you install today is exactly what was reviewed and verified.
Security Audits
| Audit | Result |
| --- | --- |
| Gen Agent Trust Hub | Pass |
| Socket | Pass |
| Snyk | Pass |