AI & ML interests

Retrieval, Computer Vision, LLM

vidore's collections (11)

ViDoRe Chunk OCR (baseline)
Each page of the ViDoRe benchmark was partitioned into text chunks with Unstructured. Detected figures and tables were captioned with Claude 3 Sonnet.
ViDoRe Page OCR (artifact)
The ViDoRe benchmark with the full OCR text of each page. ⚠️ This dataset serves as an intermediate step → use "ViDoRe Chunk OCR (baseline)" for evaluation!