Dataset Comparer
Compares two datasets and highlights the discrepancies between them, so data consistency can be checked before analysis and decision-making.
Install on your platform
Run in terminal (recommended)
claude mcp add dataset-comparer npx -- -y @trustedskills/dataset-comparer
Or manually add to ~/.claude/settings.json
{
"mcpServers": {
"dataset-comparer": {
"command": "npx",
"args": [
"-y",
"@trustedskills/dataset-comparer"
]
}
}
}
Requires Claude Code (claude CLI). Run claude --version to verify your install.
About This Skill
What it does
The Dataset Comparer skill allows AI agents to identify differences between two CSV or Excel datasets. It can detect added, removed, and modified rows, as well as changes in values within matching rows. The tool also provides schema comparison capabilities and generates reports summarizing these discrepancies for data consistency checks and analysis.
When to use it
- Data Migration Validation: Verify that a dataset migration was successful by comparing the original and migrated datasets.
- A/B Testing Analysis: Compare datasets representing different versions of a product or service to identify performance changes.
- Data Quality Assurance: Identify inconsistencies between data sources for improved data quality.
- Regulatory Compliance: Ensure adherence to data standards by comparing datasets against expected formats and values.
Key capabilities
- Row Comparison: Identifies added, removed, and matching rows.
- Value Changes Detection: Detects changes in values within matching rows.
- Column Comparison (Schema Differences): Highlights differences in the dataset structure.
- Statistics Summary: Provides a summary of the identified differences.
- Diff Reports: Generates reports in HTML, CSV, and JSON formats.
- Flexible Matching: Allows comparison by key columns or row position.
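To make the capabilities above concrete, here is a minimal sketch of key-based comparison using only the Python standard library. It is not the skill's implementation: the function names (load_rows, compare), the report field names, and the sample data are all assumptions made for illustration.

```python
import csv
import io


def load_rows(text, key):
    """Parse CSV text and index its rows by the value in the key column."""
    reader = csv.DictReader(io.StringIO(text))
    rows = {row[key]: row for row in reader}
    return reader.fieldnames, rows


def compare(old_text, new_text, key):
    """Return added/removed/modified rows plus column (schema) differences."""
    old_cols, old = load_rows(old_text, key)
    new_cols, new = load_rows(new_text, key)
    # Value changes are only checked on columns present in both datasets.
    shared = [c for c in old_cols if c in new_cols]
    modified = {}
    for k in old.keys() & new.keys():
        changes = {
            c: (old[k][c], new[k][c]) for c in shared if old[k][c] != new[k][c]
        }
        if changes:
            modified[k] = changes
    return {
        "added_rows": sorted(new.keys() - old.keys()),
        "removed_rows": sorted(old.keys() - new.keys()),
        "modified_rows": modified,
        "added_columns": [c for c in new_cols if c not in old_cols],
        "removed_columns": [c for c in old_cols if c not in new_cols],
    }


# Illustrative sample data (hypothetical).
OLD = "id,name,price\n1,apple,1.00\n2,pear,2.50\n3,plum,0.75\n"
NEW = "id,name,price,stock\n1,apple,1.20,10\n3,plum,0.75,4\n4,kiwi,3.00,7\n"

report = compare(OLD, NEW, key="id")
print(report)
```

Indexing rows by a key column keeps the comparison independent of row order, which is why key-based matching is usually the safer choice when the files may have been re-sorted.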
Example prompts
- "Compare 'old_data.csv' and 'new_data.csv', focusing on the 'id' column as a key."
- "Generate an HTML report summarizing the differences between 'version1.xlsx' and 'version2.xlsx'."
- "Find added rows in 'source_data.csv' compared to 'baseline_data.csv'."
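As a rough illustration of what a generated report might contain, the sketch below turns a diff result into a JSON document and a small HTML summary table. The structure of the diff dictionary, the function names, and the report layout are hypothetical; the skill's actual report format is not documented here.

```python
import html
import json

# Hypothetical diff result; the field names are illustrative only.
diff = {
    "added_rows": ["4"],
    "removed_rows": ["2"],
    "modified_rows": {"1": {"price": ["1.00", "1.20"]}},
}


def summary(diff):
    """Count the entries in each category of the diff."""
    return {name: len(items) for name, items in diff.items()}


def to_json(diff):
    """Serialize the counts together with the full details."""
    return json.dumps({"summary": summary(diff), "details": diff}, indent=2)


def to_html(diff):
    """Render the counts as a minimal HTML table."""
    rows = "".join(
        f"<tr><td>{html.escape(name)}</td><td>{count}</td></tr>"
        for name, count in summary(diff).items()
    )
    return f"<table><tr><th>Change</th><th>Count</th></tr>{rows}</table>"


json_report = to_json(diff)
html_report = to_html(diff)
print(json_report)
print(html_report)
```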
Tips & gotchas
- The skill requires familiarity with CSV or Excel file formats.
- Specify key columns for accurate matching; otherwise, row position will be used.
- Consider using the --ignore flag when comparing datasets that contain irrelevant columns.
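The two gotchas above (the fallback to row position, and ignoring irrelevant columns) can be illustrated with a short standard-library sketch. The diff_by_position function and its ignore parameter are hypothetical stand-ins, not the skill's API, and the exact syntax of the skill's ignore option is not shown in this listing.

```python
import csv
import io
from itertools import zip_longest


def diff_by_position(old_text, new_text, ignore=()):
    """Compare rows pairwise by position, skipping the ignored columns."""
    old = list(csv.DictReader(io.StringIO(old_text)))
    new = list(csv.DictReader(io.StringIO(new_text)))
    diffs = []
    for index, (a, b) in enumerate(zip_longest(old, new)):
        if a is None:
            diffs.append((index, "added", b))
        elif b is None:
            diffs.append((index, "removed", a))
        else:
            changed = {
                c: (a[c], b.get(c))
                for c in a
                if c not in ignore and a[c] != b.get(c)
            }
            if changed:
                diffs.append((index, "modified", changed))
    return diffs


# Same two records, but re-sorted and with a refreshed timestamp (hypothetical data).
OLD = "id,name,updated_at\n1,apple,2024-01-01\n2,pear,2024-01-01\n"
NEW = "id,name,updated_at\n2,pear,2024-02-01\n1,apple,2024-02-01\n"

positional = diff_by_position(OLD, NEW, ignore={"updated_at"})
for entry in positional:
    print(entry)
```

Here both rows are reported as modified even though the underlying records are identical, because positional matching pairs row 1 with row 1 regardless of content. Ignoring updated_at removes the timestamp noise, but only a key column would recognize the rows as unchanged.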
TrustedSkills Verification
Unlike other registries that point to live repositories, TrustedSkills pins every skill to a verified commit hash. This protects you from malicious updates — what you install today is exactly what was reviewed and verified.
Security Audits
| Gen Agent Trust Hub | Pass |
| Socket | Pass |
| Snyk | Pass |
🌐 Community
Passed automated security scans.