MLAdaptiveIntelligence/LLaVAction-0.5B

Video-Text-to-Text

text-generation

text-generation-inference


mwmathis committed on Mar 24

Commit abff0ca · verified · 1 Parent(s): 19054f0

Update README.md

Files changed (1)

README.md +20 -0


@@ -10,6 +10,8 @@ tags:
 - Video
 - MQA
 - multimodal
+- MLLMs
+- LLaVAction
 metrics:
 - accuracy
 library_name: transformers
@@ -17,6 +19,24 @@ library_name: transformers
 # LLaVAction-0.5B
+<div align="center">
+<h2>LLaVAction: evaluating and training multi-modal large language models for action recognition
+</h2>
+[Shaokai Ye](https://yeshaokai.github.io/)<sup>1**</sup>&nbsp;
+[Haozhe Qi](https://people.epfl.ch/haozhe.qi)<sup>1**</sup>&nbsp;
+[Alexander Mathis](https://mathislab.org/)<sup>1</sup><sup>†</sup>&nbsp;
+[Mackenzie Weygandt Mathis](https://www.mackenziemathislab.org/mackenziemathis)<sup>1</sup><sup>†</sup><sup>‡</sup>&nbsp;
+<sup>1</sup> EPFL
+<sup>**</sup> First authors  <sup>†</sup> Senior Authors  <sup>‡</sup> Corresponding Author
+\[[arXiv Paper](https://www.arxiv.org/tbd)\] &nbsp; \[[Project Page](https://mmathislab.github.io/llavaction/)\] &nbsp; \[[Github Repo](https://github.com/AdaptiveMotorControlLab/LLaVAction)\] &nbsp;
+</div>
 ## Model Summary
 The LLaVAction-0.5B model is trained on EPIC-KITCHENS-100-MQA, based on Qwen2 language model with a context window of 32K tokens.