Document Processing
Extracts key data like dates, names, and values from Dirnbauer-formatted documents for automated workflows.
Install on your platform
Run in terminal (recommended)
claude mcp add dirnbauer-document-processing npx -- -y @trustedskills/dirnbauer-document-processing
Or manually add to ~/.claude/settings.json
{
"mcpServers": {
"dirnbauer-document-processing": {
"command": "npx",
"args": [
"-y",
"@trustedskills/dirnbauer-document-processing"
]
}
}
}
Requires Claude Code (claude CLI). Run claude --version to verify your install.
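If the settings file already holds other keys, the manual entry can be merged in rather than pasted over them. The following is a minimal sketch, assuming the file is plain JSON at the path given above; the helper names `add_mcp_server` and `update_settings_file` are hypothetical and are not part of the Claude CLI.

```python
import json
from pathlib import Path

SERVER_NAME = "dirnbauer-document-processing"
SERVER_ENTRY = {
    "command": "npx",
    "args": ["-y", "@trustedskills/dirnbauer-document-processing"],
}


def add_mcp_server(settings: dict) -> dict:
    """Return a copy of `settings` with the server entry added under mcpServers."""
    merged = dict(settings)
    # Copy the existing server map so other registered servers are preserved.
    servers = dict(merged.get("mcpServers", {}))
    servers[SERVER_NAME] = SERVER_ENTRY
    merged["mcpServers"] = servers
    return merged


def update_settings_file(path: Path) -> None:
    """Read the settings file if it exists, merge the entry, and write it back."""
    # Assumption: an existing file contains valid, non-empty JSON.
    current = json.loads(path.read_text()) if path.exists() else {}
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(add_mcp_server(current), indent=2) + "\n")
```

Calling `update_settings_file(Path.home() / ".claude" / "settings.json")` would apply the change to the file named above; back the file up first, since the sketch rewrites it in place.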
About This Skill
What it does
This skill lets AI agents create, edit, and analyze common office document formats: PDFs, Word documents (.docx), PowerPoint presentations (.pptx), and Excel spreadsheets (.xlsx). It uses Python libraries such as pdfplumber, pypdf, pandas, and openpyxl for text extraction, table extraction, merging and splitting PDFs, filling forms, analyzing spreadsheet data, and running OCR (Optical Character Recognition) on scanned documents, so agents can change both the content and the structure of a document.
When to use it
- Automating the extraction of data from invoices or reports in PDF format.
- Creating new PowerPoint presentations based on textual instructions.
- Analyzing data stored within Excel spreadsheets, such as calculating totals or identifying trends.
- Merging multiple PDF documents into a single file for easier distribution.
- Extracting text from scanned PDFs using OCR to make them searchable and editable.
Key capabilities
- PDF Processing: Text extraction, table extraction, merging, splitting, form filling, creation, and rotation of pages.
- DOCX (Word) Processing: Text extraction, creation, editing.
- PPTX (PowerPoint) Processing: Text extraction, creation, editing.
- XLSX (Excel) Processing: Data analysis, formula manipulation, formatting.
- OCR for Scanned PDFs: Extracts text from scanned PDF documents using Optical Character Recognition.
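As an illustration of the PDF merging capability listed above, here is a minimal sketch using pypdf. It is not taken from the skill's own source; the function name `merge_pdfs` is hypothetical, and the sketch assumes a pypdf version that provides `PdfWriter.append` (3.0 or later).

```python
from typing import Iterable


def merge_pdfs(inputs: Iterable[str], output: str) -> int:
    """Append every input PDF, in order, to one output file.

    Returns the number of pages written.
    """
    # Imported here so this module still loads when pypdf is not installed.
    from pypdf import PdfWriter

    writer = PdfWriter()
    for path in inputs:
        # append() copies all pages of the file at `path` into the writer.
        writer.append(path)
    with open(output, "wb") as handle:
        writer.write(handle)
    return len(writer.pages)
```

For example, `merge_pdfs(["a.pdf", "b.pdf", "c.pdf"], "CombinedReport.pdf")` would write the three files back to back in the order given.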
Example prompts
- "Extract all tables from this document and save them to an Excel file."
- "Create a new PowerPoint presentation with the title 'Project Update' and three slides summarizing key findings."
- "Merge these three PDFs into a single document named 'CombinedReport.pdf'."
- "Perform data analysis on this spreadsheet, calculating the average value in column B."
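The last prompt above (averaging column B) could be served by something like the following. This is a sketch under stated assumptions, not the skill's actual implementation: it assumes the first row is a header, the data sits on the active sheet, and openpyxl is installed. The function name `average_column_b` is hypothetical.

```python
from statistics import mean


def average_column_b(xlsx_path: str) -> float:
    """Average the numeric cells in column B of the active sheet, skipping row 1."""
    # Imported here so this module still loads when openpyxl is not installed.
    from openpyxl import load_workbook

    # data_only=True reads cached formula results instead of formula strings.
    workbook = load_workbook(xlsx_path, data_only=True)
    sheet = workbook.active
    values = []
    for (cell,) in sheet.iter_rows(min_row=2, min_col=2, max_col=2, values_only=True):
        # Keep real numbers only; skip blanks, text, and booleans.
        if isinstance(cell, (int, float)) and not isinstance(cell, bool):
            values.append(cell)
    if not values:
        raise ValueError("column B contains no numeric values")
    return mean(values)
```

Skipping non-numeric cells is a design choice made for this sketch; a stricter version could raise an error on unexpected text instead.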
Tips & gotchas
- The skill relies on external Python libraries (pdfplumber, pypdf, pandas, and openpyxl are the ones named above); make sure they are installed in the environment where the skill runs.
- OCR accuracy depends heavily on the quality of the scanned document. Poor scans will result in inaccurate text extraction.
- Editing DOCX and PPTX files involves unpacking, modifying, and repacking the XML structure, which can be complex.
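The unpack, modify, and repack cycle mentioned in the last tip can be sketched with the standard library alone, since a .docx file is a ZIP archive whose main body lives in `word/document.xml`. The function name `replace_text_in_docx` is hypothetical, and the plain string replacement is a deliberate simplification: Word often splits visible text across several XML runs, so a phrase may not appear contiguously, and the replacement text must already be XML-safe.

```python
import zipfile


def replace_text_in_docx(src: str, dst: str, old: str, new: str) -> None:
    """Copy the archive at `src` to `dst`, replacing `old` with `new`
    inside word/document.xml and leaving every other member unchanged."""
    with zipfile.ZipFile(src) as zin, \
            zipfile.ZipFile(dst, "w", zipfile.ZIP_DEFLATED) as zout:
        for item in zin.infolist():
            data = zin.read(item.filename)
            if item.filename == "word/document.xml":
                # Naive replacement on the raw XML; see the caveats above.
                text = data.decode("utf-8").replace(old, new)
                data = text.encode("utf-8")
            # Passing the ZipInfo keeps the member name and metadata.
            zout.writestr(item, data)
```

Writing to a separate destination file keeps the original intact if the edit produces a document that Word cannot open.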
TrustedSkills Verification
Unlike other registries that point to live repositories, TrustedSkills pins every skill to a verified commit hash. This protects you from malicious updates — what you install today is exactly what was reviewed and verified.
Security Audits
| Gen Agent Trust Hub | Pass |
| Socket | Pass |
| Snyk | Pass |
🌐 Community
Passed automated security scans.