Model Card for sitloboi2012/donut-finetune-rvl-cdip

This model is intended as a baseline for using Donut on RVL-CDIP. It was trained on a small subset of RVL-CDIP (specifically, 100 images from the dataset).

Model Details

The model uses Donut, loaded through the VisionEncoderDecoder architecture in Transformers, as the backbone for an end-to-end document classification task.
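To illustrate how a Donut checkpoint is typically run for classification through Transformers, here is a minimal sketch. It rests on several assumptions not stated in this card: that the repository ships processor files alongside the weights, that the task prompt is `<s_rvlcdip>` (as in the original Naver RVL-CDIP checkpoint), and that the decoder emits the label in the form `<s_class><label/></s_class>`. The helper names `extract_class_label` and `classify_document` are hypothetical.

```python
import re


def extract_class_label(sequence: str):
    """Pull the label out of a Donut-style output such as
    '<s_class><invoice/></s_class>'. Returns None if no label is found.
    The output format is an assumption borrowed from the original Donut checkpoint."""
    match = re.search(r"<s_class>\s*<([^<>/]+)/>\s*</s_class>", sequence)
    return match.group(1) if match else None


def classify_document(image_path,
                      model_id="sitloboi2012/donut-finetune-rvl-cdip",
                      task_prompt="<s_rvlcdip>"):  # assumed prompt, not confirmed by this card
    # Heavy imports are kept local so the parsing helper works without them.
    import torch
    from PIL import Image
    from transformers import DonutProcessor, VisionEncoderDecoderModel

    processor = DonutProcessor.from_pretrained(model_id)
    model = VisionEncoderDecoderModel.from_pretrained(model_id)
    model.eval()

    # Encode the page image and the task prompt that starts decoding.
    image = Image.open(image_path).convert("RGB")
    pixel_values = processor(image, return_tensors="pt").pixel_values
    decoder_input_ids = processor.tokenizer(
        task_prompt, add_special_tokens=False, return_tensors="pt"
    ).input_ids

    with torch.no_grad():
        outputs = model.generate(
            pixel_values,
            decoder_input_ids=decoder_input_ids,
            max_length=model.decoder.config.max_position_embeddings,
            pad_token_id=processor.tokenizer.pad_token_id,
            eos_token_id=processor.tokenizer.eos_token_id,
            use_cache=True,
            return_dict_in_generate=True,
        )

    # Special tokens are kept in the decoded string so the label tag can be parsed.
    sequence = processor.batch_decode(outputs.sequences)[0]
    return extract_class_label(sequence)
```

Calling `classify_document("page.png")` on a local scan would return a label string, or `None` if the output does not match the assumed format. Because training used only 100 images, predictions should be treated as a rough baseline.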

Downstream Use

This model can be fine-tuned for document classification tasks in other domains, such as food documents or financial documents. For further downstream fine-tuning, please refer to the original model from Naver.
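As a rough sketch of what preparing a new domain for fine-tuning could involve, the snippet below builds Donut-style target sequences for custom class labels and registers the matching tokens with the tokenizer. The target format `<s_class><label/></s_class>` follows the convention of the original Donut classification setup and is an assumption here; the function names and the example labels (`menu`, `invoice`) are hypothetical.

```python
def _slug(label: str) -> str:
    # Normalise a human-readable label into a single token-friendly name.
    return label.strip().lower().replace(" ", "_")


def build_target_sequence(label: str) -> str:
    """Return the decoder target for one document, e.g. '<s_class><menu/></s_class>'."""
    return f"<s_class><{_slug(label)}/></s_class>"


def collect_special_tokens(labels):
    """List every token the tokenizer must know for the given label set."""
    tokens = ["<s_class>", "</s_class>"]
    for name in sorted({_slug(label) for label in labels}):
        tokens.append(f"<{name}/>")
    return tokens


def register_tokens(processor, model, labels):
    """Add the label tokens to a loaded DonutProcessor and resize the decoder
    embeddings of a VisionEncoderDecoderModel to match. Returns the number of
    tokens that were actually new."""
    added = processor.tokenizer.add_tokens(
        collect_special_tokens(labels), special_tokens=True
    )
    if added > 0:
        model.decoder.resize_token_embeddings(len(processor.tokenizer))
    return added
```

For example, `build_target_sequence("Financial Report")` yields `<s_class><financial_report/></s_class>`. Treating each class as a single special token keeps the decoded output short and unambiguous, at the cost of having to resize the embedding matrix whenever the label set changes.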


Datasets used to train sitloboi2012/donut-finetune-rvl-cdip