AnyParse

AnyParse is a multimodal document parsing and understanding engine that converts complex files into structured Markdown and JSON. It covers basic text processing, professional document conversion, and recognition based on Vision-Language Models (VLM) and OCR in a single tool.

Core Capabilities

Multimodal Document Understanding: Supports cross-modal parsing of images and documents, combining OCR and VLM technologies to extract unstructured data.
Comprehensive Format Coverage: Easily parses office documents, web pages, spreadsheets, e-books, and emails with a single tool.
Structured Output: Transforms complex files into standardized Markdown and JSON, streamlining downstream data processing and Large Language Model (LLM) applications.

Key Features

Documents & Layouts: PDF, DOCX, PPTX, XLSX, EPUB, IPYNB
Text & Markup: TXT, MD, RST, HTML/XHTML/HTM/SHTML
Spreadsheets & Data: CSV, TSV
Images & Multimedia: PNG, JPEG/JPG
Others: EML (Emails)
Built-in CLI and FastAPI interfaces
Runs in a CPU-only environment and also supports GPU
Outputs text in human reading order for single-column, multi-column, and complex layouts
Retains the original document structure, including titles, paragraphs, and lists
Extracts images, image descriptions, tables, table titles, and footnotes
Automatically identifies formulas in documents and converts them to LaTeX
Automatically identifies tables in documents and converts them to HTML
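
The format groups listed above can be checked before a file is submitted for parsing. The following is a minimal sketch and not part of AnyParse: the names `SUPPORTED_FORMATS` and `format_category` are hypothetical, and the extension sets are simply copied from the list in this section.

```python
from pathlib import Path

# Hypothetical lookup table; categories and extensions are copied from
# the feature list in this README, not read from AnyParse itself.
SUPPORTED_FORMATS = {
    "Documents & Layouts": {"pdf", "docx", "pptx", "xlsx", "epub", "ipynb"},
    "Text & Markup": {"txt", "md", "rst", "html", "xhtml", "htm", "shtml"},
    "Spreadsheets & Data": {"csv", "tsv"},
    "Images & Multimedia": {"png", "jpeg", "jpg"},
    "Others": {"eml"},
}


def format_category(path):
    """Return the listed category for a file path, or None if unlisted."""
    # Compare extensions case-insensitively and without the leading dot.
    ext = Path(path).suffix.lower().lstrip(".")
    for category, extensions in SUPPORTED_FORMATS.items():
        if ext in extensions:
            return category
    return None


if __name__ == "__main__":
    for name in ["report.PDF", "notes.md", "scan.jpg", "archive.zip"]:
        print(name, "->", format_category(name))
```

A check like this only filters by file extension; it says nothing about whether a given file's content will parse successfully.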

All Thanks To Our Contributors

License Information

This repository is licensed under the AnyParse Open Source License, based on Apache 2.0 with additional conditions.
