Update README.md

Files changed (3)

README.md CHANGED

@@ -3,4 +3,13 @@ datasets:
 - SurplusDeficit/MultiHop-EgoQA
 ---
-# Grounded Multi-Hop VideoQA in Long-Form Egocentric Videos

+# Grounded Multi-Hop VideoQA in Long-Form Egocentric Videos
+## GeLM Model
+We propose a novel architecture, termed <b><u>GeLM</u></b>, for *MH-VidQA*. It leverages the world-knowledge reasoning capabilities of multi-modal large language models (LLMs) and incorporates a grounding module that retrieves temporal evidence in the video using flexible grounding tokens.
+<div align="center">
+   <img src="./assets/architecture_v3.jpeg" style="width: 80%;">
+</div>

RTL-GeLM-7B/tokenizer.model CHANGED

Binary files a/RTL-GeLM-7B/tokenizer.model and b/RTL-GeLM-7B/tokenizer.model differ

assets/architecture_v3.jpeg ADDED