---
base_model:
- lmsys/vicuna-7b-v1.1
datasets:
- MovieCORE/MovieCORE
- Enxin/MovieChat-1K-test
license: mit
pipeline_tag: video-text-to-text
---
<div align="center">
<img src="https://github.com/joslefaure/MovieCORE/raw/main/assets/moviecore_icon.png" alt="MovieCORE Icon" width="150"/>
# MovieCORE: COgnitive REasoning in Movies
**A Video Question Answering Dataset for Probing Deeper Cognitive Understanding of Movie Content**
[Paper (arXiv)](https://arxiv.org/abs/2508.19026)
[Paper (Hugging Face)](https://huggingface.co/papers/2508.19026)
[Dataset](https://huggingface.co/datasets/MovieCORE/MovieCORE)
[Code](https://github.com/joslefaure/moviecore)
[Project Page](https://joslefaure.github.io/assets/html/moviecore.html)
[License](https://github.com/joslefaure/MovieCORE/blob/main/LICENSE)

</div>
## 📖 Overview
MovieCORE is a video question answering (VQA) dataset designed to probe deeper cognitive understanding of movie content. Unlike traditional VQA datasets that focus on surface-level visual understanding, MovieCORE challenges models to reason about narrative structure, character development, thematic elements, and complex temporal relationships in cinematic content.
## 🗂️ Data Preparation
The MovieCORE dataset builds upon video content from MovieChat. To get started:
### Video Data
Download the video files from MovieChat's HuggingFace repositories:
- **Training Data**: [MovieChat-1K Train](https://huggingface.co/datasets/Enxin/MovieChat-1K_train)
- **Test Data**: [MovieChat-1K Test](https://huggingface.co/datasets/Enxin/MovieChat-1K-test)
### Annotations
Access our annotations on HuggingFace:
- **MovieCORE Annotations**: [🤗 HuggingFace Dataset](https://huggingface.co/datasets/MovieCORE/MovieCORE/tree/main)
Extract and organize the data according to your model's requirements, then use our annotations for evaluation.
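The downloads above can also be scripted. The following is a minimal sketch, assuming the `huggingface_hub` package is installed; the repository IDs are taken from the links above, while the `REPOS` mapping, the `download_moviecore` helper, and the `data/` target directory are hypothetical names chosen for illustration.

```python
from pathlib import Path

# Dataset repository IDs taken from the links in this section.
REPOS = {
    "train_videos": "Enxin/MovieChat-1K_train",
    "test_videos": "Enxin/MovieChat-1K-test",
    "annotations": "MovieCORE/MovieCORE",
}


def download_moviecore(root="data", parts=("annotations",)):
    """Download the selected dataset repositories under `root`.

    Hypothetical helper: `root` and the sub-directory names are assumptions,
    not a layout required by MovieCORE.
    """
    # Imported lazily so this file can be imported without the package.
    from huggingface_hub import snapshot_download

    paths = {}
    for part in parts:
        target = Path(root) / part
        paths[part] = snapshot_download(
            repo_id=REPOS[part],
            repo_type="dataset",
            local_dir=str(target),
        )
    return paths


# Example call (downloads only the annotations):
# download_moviecore(root="data", parts=("annotations",))
```

The video repositories are large, so the sketch defaults to fetching only the annotations; pass `parts=("train_videos", "test_videos", "annotations")` to fetch everything.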
## 🚀 Quick Start
### Installation
```bash
git clone https://github.com/joslefaure/MovieCORE.git
cd MovieCORE
```
## 🎯 Baselines
- A script for running [HERMES](https://github.com/joslefaure/HERMES) (ICCV'25) on MovieCORE is provided in the linked project.
## 📊 Evaluation Dimensions
MovieCORE employs a comprehensive multi-dimensional evaluation framework to assess model performance across different aspects of cognitive understanding:
| Dimension | Description |
|-----------|-------------|
| **🎯 Accuracy** | Measures semantic similarity between predicted and ground truth answers |
| **📋 Comprehensiveness** | Assesses coverage of all key aspects mentioned in the ground truth |
| **🧠 Depth** | Evaluates level of reasoning and insight demonstrated in predictions |
| **🔍 Evidence** | Checks quality and relevance of supporting evidence provided |
| **🔗 Coherence** | Measures logical flow, organization, and clarity of responses |
Each dimension provides unique insights into different cognitive capabilities required for deep video understanding.
## 💻 Usage
### Evaluation Script
Evaluate your model's performance on MovieCORE using our evaluation script:
```bash
export OPENAI_API_KEY='your_openai_api_key'
python evaluate_moviecore.py --pred_path path/to/your/predictions.json
```
### 📝 Input Format
Your predictions should follow this JSON structure:
```json
{
  "video_1.mp4": [
    {
      "question": "How does the video depict the unique adaptations of the species in the Sahara Desert, and what roles do these species play in their ecosystem?",
      "answer": "The ground truth answer.",
      "pred": "Your model's prediction.",
      "classification": "the question classification"
    },
    {
      "question": "The second question for video 1?",
      "answer": "The ground truth answer.",
      "pred": "Your model's prediction.",
      "classification": "the question classification"
    }
  ],
  "video_2.mp4": [
    {
      "question": "The only question for video 2",
      "answer": "The ground truth answer.",
      "pred": "Your model's prediction.",
      "classification": "the question classification"
    }
  ]
}
```
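Before running the evaluation, it can help to check that a predictions file has the shape shown above. The following is a minimal sketch using only the standard library; `validate_predictions` and `REQUIRED_KEYS` are hypothetical names, and the check assumes all four fields are strings, as in the example.

```python
import json

# Keys every entry carries in the example structure above.
REQUIRED_KEYS = ("question", "answer", "pred", "classification")


def validate_predictions(preds):
    """Return a list of problems; an empty list means the structure matches."""
    if not isinstance(preds, dict):
        return ["top level must be an object keyed by video file name"]
    problems = []
    for video, entries in preds.items():
        if not isinstance(entries, list):
            problems.append(f"{video}: value must be a list of entries")
            continue
        for i, entry in enumerate(entries):
            if not isinstance(entry, dict):
                problems.append(f"{video}[{i}]: entry must be an object")
                continue
            for key in REQUIRED_KEYS:
                # Assumption: each field is a string, as in the example.
                if not isinstance(entry.get(key), str):
                    problems.append(f"{video}[{i}]: missing or non-string '{key}'")
    return problems


def validate_file(path):
    """Load a predictions JSON file and validate its structure."""
    with open(path, encoding="utf-8") as f:
        return validate_predictions(json.load(f))


example = {
    "video_2.mp4": [
        {
            "question": "The only question for video 2",
            "answer": "The ground truth answer.",
            "pred": "Your model's prediction.",
            "classification": "the question classification",
        }
    ]
}
problems = validate_predictions(example)
print(problems or "structure looks valid")
```

Calling `validate_file("path/to/your/predictions.json")` applies the same check to a file on disk.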
### 📈 Output
The evaluation script provides:
- Overall scores across all dimensions
- Classification-specific performance metrics
- Detailed breakdowns for comprehensive analysis
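To illustrate how overall and classification-specific scores relate, the following sketch averages per-question scores across the five evaluation dimensions. It is not the logic of `evaluate_moviecore.py`: the record layout, the `aggregate` helper, the classification labels, and the score values are all assumptions made for this example.

```python
from collections import defaultdict
from statistics import mean

# The five dimensions named in the evaluation table.
DIMENSIONS = ("accuracy", "comprehensiveness", "depth", "evidence", "coherence")


def aggregate(records):
    """Average each dimension overall and per classification.

    Each record is assumed (hypothetically) to look like:
    {"classification": str, "scores": {dimension: number}}
    """
    by_class = defaultdict(list)
    for rec in records:
        by_class[rec["classification"]].append(rec["scores"])

    def average(score_dicts):
        return {d: mean(s[d] for s in score_dicts) for d in DIMENSIONS}

    overall = average([rec["scores"] for rec in records])
    per_class = {label: average(scores) for label, scores in by_class.items()}
    return overall, per_class


# Hypothetical records with made-up labels and scores.
records = [
    {"classification": "type_a", "scores": {d: 4 for d in DIMENSIONS}},
    {"classification": "type_a", "scores": {d: 2 for d in DIMENSIONS}},
    {"classification": "type_b", "scores": {d: 5 for d in DIMENSIONS}},
]
overall, per_class = aggregate(records)
print(overall)
print(per_class)
```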
## 📚 Citation
If you use MovieCORE in your research, please cite our paper:
```bibtex
@misc{faure2025moviecorecognitivereasoningmovies,
title={MovieCORE: COgnitive REasoning in Movies},
author={Gueter Josmy Faure and Min-Hung Chen and Jia-Fong Yeh and Ying Cheng and Hung-Ting Su and Yung-Hao Tang and Shang-Hong Lai and Winston H. Hsu},
year={2025},
eprint={2508.19026},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2508.19026},
}
```
## 🤝 Contributing
We welcome contributions to MovieCORE! Please feel free to:
- Report issues or bugs
- Suggest improvements or new features
- Submit baseline implementations
- Provide feedback on the evaluation framework
## 📄 License
This dataset is provided under the MIT License. See [LICENSE](https://github.com/joslefaure/MovieCORE/blob/main/LICENSE) for more details.
---
<div align="center">
<p>🎬 <strong>Advancing Video Understanding Through Cognitive Evaluation</strong> 🎬</p>
**[📖 Paper](https://arxiv.org/abs/2508.19026v1) | [🤗 Dataset](https://huggingface.co/datasets/MovieCORE/MovieCORE) | [💻 Code](https://github.com/joslefaure/moviecore)**
</div>