Creating OpenLineage Extractors

🌐 Community
by astronomer · vlatest

This skill generates OpenLineage extractor scripts to track data lineage, improving data observability and trust across your systems.

Install on your platform


1. Run in terminal (recommended)

claude mcp add creating-openlineage-extractors npx -- -y @trustedskills/creating-openlineage-extractors

2. Or manually add to ~/.claude/settings.json

{
  "mcpServers": {
    "creating-openlineage-extractors": {
      "command": "npx",
      "args": [
        "-y",
        "@trustedskills/creating-openlineage-extractors"
      ]
    }
  }
}

Requires Claude Code (claude CLI). Run claude --version to verify your install.
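
If the settings file already contains other entries, the manual edit can be scripted so that nothing existing is overwritten. The following is a minimal sketch, not an official installer: the helper names `add_mcp_server` and `update_settings_file` are hypothetical, and it assumes the file holds a single JSON object with an optional "mcpServers" key, as in the snippet above.

```python
import json
from pathlib import Path


def add_mcp_server(settings: dict, name: str, command: str, args: list[str]) -> dict:
    """Return a copy of `settings` with one entry added under "mcpServers".

    Hypothetical helper: existing servers and unrelated keys are preserved,
    and the input dict is not modified.
    """
    merged = dict(settings)
    servers = dict(merged.get("mcpServers", {}))
    servers[name] = {"command": command, "args": list(args)}
    merged["mcpServers"] = servers
    return merged


def update_settings_file(path: Path) -> None:
    """Read the settings file if it exists, add the entry, and write it back."""
    current = json.loads(path.read_text()) if path.exists() else {}
    updated = add_mcp_server(
        current,
        "creating-openlineage-extractors",
        "npx",
        ["-y", "@trustedskills/creating-openlineage-extractors"],
    )
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(updated, indent=2) + "\n")
```

Calling `update_settings_file(Path.home() / ".claude" / "settings.json")` would apply the change to the path named above. Back up the file first, since the sketch rewrites it in full.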

About This Skill

What it does

This skill creates OpenLineage extractors, which collect metadata about data processing workflows. You define how lineage is captured from each data processing system, so that the inputs and outputs of every pipeline step stay traceable.
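
To make the idea concrete, here is a rough sketch of what an extractor does: it inspects a task and reports which datasets the task reads and writes. The sketch deliberately avoids the OpenLineage SDK so that it runs on its own. Every class name in it (`Dataset`, `TaskLineage`, `CopyTableTask`, `CopyTableExtractor`) is a hypothetical stand-in, not the SDK's API, and a real extractor would subclass whatever base class your OpenLineage integration provides.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Dataset:
    namespace: str  # where the data lives, e.g. a database URI
    name: str       # what the data is called, e.g. schema.table


@dataclass
class TaskLineage:
    """Lineage reported for one task: its job name, inputs, and outputs."""
    job_name: str
    inputs: list[Dataset] = field(default_factory=list)
    outputs: list[Dataset] = field(default_factory=list)


@dataclass
class CopyTableTask:
    """Hypothetical task that copies one table into another."""
    task_id: str
    connection_uri: str
    source_table: str
    target_table: str


class CopyTableExtractor:
    """Hypothetical extractor for CopyTableTask (illustration only)."""

    def __init__(self, task: CopyTableTask) -> None:
        self.task = task

    def extract(self) -> TaskLineage:
        # The source table is the task's input, the target table its output.
        t = self.task
        return TaskLineage(
            job_name=t.task_id,
            inputs=[Dataset(t.connection_uri, t.source_table)],
            outputs=[Dataset(t.connection_uri, t.target_table)],
        )
```

For example, `CopyTableExtractor(CopyTableTask("copy_orders", "postgres://warehouse:5432", "raw.orders", "analytics.orders")).extract()` returns a `TaskLineage` with `raw.orders` as its only input and `analytics.orders` as its only output.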

When to use it

  • You need to implement data lineage tracking for ETL processes or data transformations.
  • Your organization requires audit trails or compliance reporting on data flows.
  • You are integrating with tools like Apache Airflow or other orchestration platforms that support OpenLineage.

Key capabilities

  • Generate custom extractors for different data processing frameworks.
  • Define metadata schemas and event types for lineage tracking.
  • Integrate with existing data infrastructure to capture lineage events automatically.
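
As an illustration of the event types and metadata mentioned above, the sketch below builds a lineage run event as a plain dictionary using only the standard library. The field names follow the OpenLineage run event layout as I understand it (`eventType`, `eventTime`, `run`, `job`, `inputs`, `outputs`, `producer`, `schemaURL`). The `producer` value is a placeholder, the `schemaURL` version is an assumption to check against the spec release you target, and `build_run_event` is a hypothetical helper name.

```python
import json
import uuid
from datetime import datetime, timezone

# Run states defined by the OpenLineage run event model.
VALID_EVENT_TYPES = {"START", "RUNNING", "COMPLETE", "ABORT", "FAIL", "OTHER"}


def build_run_event(event_type, job_namespace, job_name, run_id, inputs=(), outputs=()):
    """Build a run event dict; inputs/outputs are (namespace, name) pairs."""
    if event_type not in VALID_EVENT_TYPES:
        raise ValueError(f"unknown event type: {event_type!r}")
    return {
        "eventType": event_type,
        "eventTime": datetime.now(timezone.utc).isoformat(),
        "run": {"runId": run_id},
        "job": {"namespace": job_namespace, "name": job_name},
        "inputs": [{"namespace": ns, "name": n} for ns, n in inputs],
        "outputs": [{"namespace": ns, "name": n} for ns, n in outputs],
        # Placeholder: identifies the code that emitted the event.
        "producer": "https://example.com/my-extractor",
        # Assumption: verify the spec version before relying on this URL.
        "schemaURL": "https://openlineage.io/spec/2-0-2/OpenLineage.json#/$defs/RunEvent",
    }


if __name__ == "__main__":
    event = build_run_event(
        "COMPLETE",
        "etl",
        "copy_orders",
        str(uuid.uuid4()),
        inputs=[("postgres://warehouse:5432", "raw.orders")],
        outputs=[("postgres://warehouse:5432", "analytics.orders")],
    )
    print(json.dumps(event, indent=2))
```

A START event and its matching COMPLETE or FAIL event should share the same `runId`, which is why the run ID is passed in rather than generated inside the helper.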

Example prompts

  • "Create an OpenLineage extractor for Apache Spark jobs running in our cluster."
  • "Generate a Python-based OpenLineage extractor that captures schema changes during data transformations."
  • "Set up an extractor to log metadata from our Flink pipelines into the OpenLineage API."

Tips & gotchas

  • Ensure your environment has the necessary dependencies, such as the OpenLineage SDK and compatible data processing tools.
  • Custom extractors may require domain-specific knowledge of the systems they are monitoring for accurate lineage tracking.
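
A quick way to act on the first tip is to check that the expected modules import before running a generated extractor. This is a minimal sketch: `missing_modules` is a hypothetical helper, and the module name `openlineage.client` is an assumption about how the OpenLineage Python client is imported in your environment. Adjust the list to match the packages you actually use.

```python
import importlib


def missing_modules(names):
    """Return the subset of module names that cannot be imported."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            # ModuleNotFoundError is a subclass of ImportError.
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Assumed import name for the OpenLineage Python client.
    required = ["openlineage.client"]
    absent = missing_modules(required)
    if absent:
        print("Missing dependencies:", ", ".join(absent))
    else:
        print("All required modules are importable.")
```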

TrustedSkills Verification

Unlike other registries that point to live repositories, TrustedSkills pins every skill to a verified commit hash. This protects you from malicious updates — what you install today is exactly what was reviewed and verified.

Security Audits

Gen Agent Trust Hub: Pass
Socket: Pass
Snyk: Pass

Details

Version: vlatest
License:
Author: astronomer
Installs: 290

🌐 Community

Passed automated security scans.