OCR Extraction
by AaaS · open-source · Last verified 2026-03-28
Extracts structured data from unstructured documents (PDFs, scanned images, email attachments) using optical character recognition with layout-aware parsing. It handles multi-page invoices, varying formats, and poor scan quality, and produces structured key-value pairs for downstream reconciliation.
https://aaas.blog/skill/ocr-extraction
B (Above Average)
Adoption: B+ · Quality: A · Freshness: A · Citations: B · Engagement: F
Specifications
- License
- MIT
- Pricing
- open-source
- Capabilities
- pdf-parsing, layout-aware-extraction, multi-format-support, key-value-structuring, quality-confidence-scoring
- Integrations
- google-document-ai, textract, tesseract
- Use Cases
- invoice-processing, receipt-scanning, contract-digitization
- API Available
- No
- Difficulty
- intermediate
- Prerequisites
- Supported Agents
- uc-invoice-reconciler, uc-lease-abstractor
- Tags
- ocr, document-processing, invoice, pdf, data-extraction
- Added
- 2026-03-28
- Completeness
- 100%
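The listing does not document an interface for this skill (API Available: No), so the following is only a minimal sketch of what the key-value-structuring and quality-confidence-scoring capabilities could look like. It assumes an upstream OCR engine has already produced text lines with a per-line confidence between 0 and 1. The names OcrLine, ExtractedField, and extract_fields are hypothetical and are not part of the skill.

```python
import re
from dataclasses import dataclass


@dataclass
class OcrLine:
    """Hypothetical OCR output: one line of text and the engine's confidence (0-1)."""
    text: str
    confidence: float


@dataclass
class ExtractedField:
    """One structured key-value pair, carrying the confidence of its source line."""
    key: str
    value: str
    confidence: float


# Matches simple "Label: value" or "Label = value" lines.
# Real invoices need layout-aware parsing; this pattern is an illustrative assumption.
KEY_VALUE = re.compile(r"^\s*([A-Za-z][A-Za-z .#/]*?)\s*[:=]\s*(.+?)\s*$")


def normalize_key(label: str) -> str:
    """Turn a printed label such as 'Invoice No' into a stable key, 'invoice_no'."""
    return re.sub(r"[^a-z0-9]+", "_", label.lower()).strip("_")


def extract_fields(lines, min_confidence=0.0):
    """Return {key: ExtractedField}, keeping the highest-confidence value per key."""
    fields = {}
    for line in lines:
        match = KEY_VALUE.match(line.text)
        if match is None or line.confidence < min_confidence:
            continue  # skip free text and lines below the confidence threshold
        key = normalize_key(match.group(1))
        current = fields.get(key)
        if current is None or line.confidence > current.confidence:
            fields[key] = ExtractedField(key, match.group(2), line.confidence)
    return fields


# Sample data invented for illustration only.
sample = [
    OcrLine("Invoice No: INV-1042", 0.97),
    OcrLine("Total Due: 1,250.00", 0.58),
    OcrLine("Thank you for your business", 0.99),
    OcrLine("Invoice No: INV-1O42", 0.41),  # a poorer re-read of the same field
]
fields = extract_fields(sample)
trusted = extract_fields(sample, min_confidence=0.6)
```

Keeping the confidence next to each value lets a downstream reconciliation step decide which fields to trust automatically and which to route for manual review. In this sketch, raising min_confidence drops the low-confidence total while keeping the invoice number.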
Index Score: 61.3
Adoption: 74 · Quality: 84 · Freshness: 88 · Citations: 62 · Engagement: 0