A collection of models, datasets, and spaces in the VDR series
Organization Card
LlamaIndex
LlamaIndex delivers the world’s most accurate agentic OCR and document-specific AI workflows, powering complete enterprise automation.
On Hugging Face and in our open-source repos, we are advancing document parsing and OCR through several projects:
- ParseBench -- A comprehensive dataset for evaluating document parsing pipelines and OCR models
- LiteParse -- Our lightweight, open-source document parser. Handles multiple formats, runs locally, and integrates with any OCR model
- Visual Document Retrieval -- An exploration into training models purely for document screenshot retrieval
LlamaParse
LlamaParse provides end-to-end document understanding with AI-powered parsing, extraction, and indexing. Transform complex layouts, tables, and handwriting into actionable insights with industry-leading accuracy.
Datasets (6)
- llamaindex/ParseBench: 169k rows, 1.34k downloads, 6 likes
- llamaindex/test-bench-blind-results: 27 downloads
- llamaindex/liteparse_cicd_data: 467 downloads
- llamaindex/liteparse_bench_small: 48 rows, 139 downloads, 1 like
- llamaindex/vdr-multilingual-train: 424k rows, 2.36k downloads, 29 likes
- llamaindex/vdr-multilingual-test: 15k rows, 114 downloads, 3 likes