Creating OpenLineage Extractors

Author: astronomer

🌐 Community

This skill generates OpenLineage extractor scripts to track data lineage, improving data observability and trust across your systems.

Install on your platform

Run in terminal (recommended)

claude mcp add creating-openlineage-extractors npx -- -y @trustedskills/creating-openlineage-extractors

Or manually add to ~/.claude/settings.json

{
  "mcpServers": {
    "creating-openlineage-extractors": {
      "command": "npx",
      "args": [
        "-y",
        "@trustedskills/creating-openlineage-extractors"
      ]
    }
  }
}

Requires Claude Code (claude CLI). Run claude --version to verify your install.

About This Skill

What it does

This skill generates OpenLineage extractors, which collect metadata about data processing workflows. You define how lineage is captured from each data processing system, so that the inputs and outputs of every pipeline step can be traced.
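To make the idea of an extractor concrete, here is a minimal sketch that uses only the Python standard library. It assumes a hypothetical task that copies one table into another; the names `Dataset`, `LineageMetadata`, `CopyTableTask`, and `CopyTableExtractor` are illustrative and are not OpenLineage SDK classes. A real extractor would build on the SDK's own base classes instead.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Dataset:
    """A dataset identified by a namespace plus a name."""
    namespace: str
    name: str


@dataclass
class LineageMetadata:
    """What an extractor reports for one task (hypothetical container)."""
    job_name: str
    inputs: list = field(default_factory=list)
    outputs: list = field(default_factory=list)


@dataclass
class CopyTableTask:
    """Hypothetical task that copies one table into another."""
    task_id: str
    source_table: str
    target_table: str


class CopyTableExtractor:
    """Maps a CopyTableTask to the datasets it reads and writes."""

    def __init__(self, namespace: str):
        self.namespace = namespace

    def extract(self, task: CopyTableTask) -> LineageMetadata:
        # The source table is the input; the target table is the output.
        return LineageMetadata(
            job_name=task.task_id,
            inputs=[Dataset(self.namespace, task.source_table)],
            outputs=[Dataset(self.namespace, task.target_table)],
        )


task = CopyTableTask("copy_orders", "raw.orders", "analytics.orders")
metadata = CopyTableExtractor("postgres://warehouse:5432").extract(task)
print(metadata)
```

The extractor holds no state beyond the namespace, so the same instance can be reused for every task of that type.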

When to use it

You need to implement data lineage tracking for ETL processes or data transformations.
Your organization requires audit trails or compliance reporting on data flows.
You are integrating with tools like Apache Airflow or other orchestration platforms that support OpenLineage.

Key capabilities

Generate custom extractors for different data processing frameworks.
Define metadata schemas and event types for lineage tracking.
Integrate with existing data infrastructure to capture lineage events automatically.
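The capabilities above, in particular metadata schemas and event types, can be sketched as a plain JSON payload. The example below builds a run event by hand with the standard library. The field names follow the OpenLineage run event model as commonly documented; the producer URI is a placeholder, and the spec version numbers in the schema URLs are assumptions that should be checked against the current OpenLineage specification. The helper names `dataset` and `build_run_event` are hypothetical.

```python
import json
import uuid
from datetime import datetime, timezone

# Placeholder producer URI; the spec versions below are assumed, not verified.
PRODUCER = "https://example.com/my-extractor"
RUN_EVENT_SCHEMA = "https://openlineage.io/spec/2-0-2/OpenLineage.json#/$defs/RunEvent"
SCHEMA_FACET_SCHEMA = (
    "https://openlineage.io/spec/facets/1-0-0/SchemaDatasetFacet.json"
    "#/$defs/SchemaDatasetFacet"
)

EVENT_TYPES = {"START", "RUNNING", "COMPLETE", "ABORT", "FAIL", "OTHER"}


def dataset(namespace, name, fields=None):
    """Describe a dataset, optionally with a schema facet listing its columns."""
    facets = {}
    if fields:
        facets["schema"] = {
            "_producer": PRODUCER,
            "_schemaURL": SCHEMA_FACET_SCHEMA,
            "fields": [{"name": n, "type": t} for n, t in fields],
        }
    return {"namespace": namespace, "name": name, "facets": facets}


def build_run_event(event_type, run_id, job_namespace, job_name, inputs, outputs):
    """Assemble one run event as a JSON-serializable dict."""
    if event_type not in EVENT_TYPES:
        raise ValueError(f"unknown event type: {event_type}")
    return {
        "eventType": event_type,
        "eventTime": datetime.now(timezone.utc).isoformat(),
        "run": {"runId": run_id},
        "job": {"namespace": job_namespace, "name": job_name},
        "inputs": inputs,
        "outputs": outputs,
        "producer": PRODUCER,
        "schemaURL": RUN_EVENT_SCHEMA,
    }


run_id = str(uuid.uuid4())
event = build_run_event(
    "COMPLETE",
    run_id,
    "etl",
    "copy_orders",
    inputs=[dataset("postgres://warehouse:5432", "raw.orders")],
    outputs=[
        dataset(
            "postgres://warehouse:5432",
            "analytics.orders",
            fields=[("order_id", "BIGINT"), ("total", "NUMERIC")],
        )
    ],
)
print(json.dumps(event, indent=2))
```

A START event and its matching COMPLETE event share the same `runId`, which is how a backend ties the two together.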

Example prompts

"Create an OpenLineage extractor for Apache Spark jobs running in our cluster."
"Generate a Python-based OpenLineage extractor that captures schema changes during data transformations."
"Set up an extractor to log metadata from our Flink pipelines into the OpenLineage API."

Tips & gotchas

Ensure your environment has the necessary dependencies, such as the OpenLineage SDK and compatible data processing tools.
Custom extractors may require domain-specific knowledge of the systems they are monitoring for accurate lineage tracking.
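One way to check the dependency tip above before generating an extractor is to test whether the required modules are importable. This is a small sketch using `importlib` from the standard library; the module names in `REQUIRED` are assumptions about a typical Airflow setup and should be adjusted to your environment.

```python
from importlib.util import find_spec

# Assumed module names for an Airflow-based setup; adjust as needed.
REQUIRED = ["openlineage.client", "airflow"]


def missing_modules(names):
    """Return the module names from `names` that cannot be located."""
    missing = []
    for name in names:
        try:
            found = find_spec(name) is not None
        except (ImportError, ValueError):
            # Raised when a parent package of a dotted name is absent.
            found = False
        if not found:
            missing.append(name)
    return missing


print("Missing modules:", missing_modules(REQUIRED))
```

An empty list means every listed module can be located; it does not confirm that the installed versions are compatible with each other.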

TrustedSkills Verification

Unlike other registries that point to live repositories, TrustedSkills pins every skill to a verified commit hash. This protects you from malicious updates — what you install today is exactly what was reviewed and verified.

Security Audits

Gen Agent Trust Hub: Pass
Socket: Pass
Snyk: Pass

Details

Version: latest
License
Author: astronomer
Installs: 290

Passed automated security scans.