sharegpt4video-8b Model Card

Model details

Model type: sharegpt4video-8b is an open-source video chatbot trained by fine-tuning the entire model on open-source video instruction data. Training takes around 5 hours on 32 A100 GPUs.
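
To put the training cost in a more comparable unit, the figures above can be converted to GPU-hours. This is only a back-of-the-envelope sketch: the "around 5 hours" is approximate, and the variable names are illustrative.

```python
# Rough compute budget from the figures stated in this model card.
TRAINING_HOURS = 5   # approximate wall-clock time ("around 5 hours")
NUM_GPUS = 32        # A100 GPUs used for training

# GPU-hours = wall-clock hours multiplied by the number of GPUs.
gpu_hours = TRAINING_HOURS * NUM_GPUS
print(f"Approximate training cost: {gpu_hours} A100 GPU-hours")
```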

Model date: sharegpt4video-8b was trained in May 2024.

Paper or resources for more information: [Code] [Project Page]

Usage

You can use this model by following the instructions provided in our [repository].
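
As a starting point, the checkpoint files can be fetched locally before running the repository's code. The sketch below assumes the `huggingface_hub` package is installed and that the weights live under the Hub ID `Lin-Chen/sharegpt4video-8b`; the helper name `download_checkpoint` and the default target directory are hypothetical. It only downloads files. The actual inference entry points are defined in the repository and are not reproduced here.

```python
# Hub ID of the model repository (assumed from the model's Hub page).
REPO_ID = "Lin-Chen/sharegpt4video-8b"


def download_checkpoint(local_dir: str = "./sharegpt4video-8b") -> str:
    """Download every file in the model repo and return the local folder path.

    `local_dir` is an illustrative default, not a path the authors prescribe.
    """
    # Imported lazily so this module can be loaded without the dependency.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=REPO_ID, local_dir=local_dir)
```

Calling `download_checkpoint()` returns the folder containing the weights, which can then be passed to the repository's scripts wherever they expect a model path.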

Training dataset

All training data are open-source; usage instructions are available in our repository.

  • 153K collection of various video instruction data
  • 28K high-quality video caption data from [ShareGPT4Video]
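
The relative weight of the two sources can be worked out from the counts listed above. This is a minimal sketch that treats the rounded "153K" and "28K" figures as exact item counts, which is an assumption; the card does not state the unit or any per-source sampling ratio.

```python
# Rounded counts from the list above, treated as exact for illustration.
INSTRUCTION_ITEMS = 153_000   # various video instruction data
CAPTION_ITEMS = 28_000        # ShareGPT4Video caption data

total = INSTRUCTION_ITEMS + CAPTION_ITEMS
caption_share = CAPTION_ITEMS / total  # fraction of the mix that is captions

print(f"Total training items: {total}")
print(f"Caption share of the mix: {caption_share:.1%}")
```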

Intended use

Primary intended uses: The primary use of sharegpt4video-8b is research on large video-language models and video chatbots.

Primary intended users: The primary intended users of the model are researchers and hobbyists in computer vision, natural language processing, machine learning, and artificial intelligence.

Paper

arxiv.org/abs/2406.04325
