Kwai-Keye nielsr HF Staff committed on
Commit 3921b3d · verified · 1 parent: eff15e9

Improve model card with abstract summary and GitHub link (#3)


- Improve model card with abstract summary and GitHub link (25028e1171ab6df4acede6017ee62449396cdfcf)


Co-authored-by: Niels Rogge <[email protected]>

Files changed (1)
  1. README.md +8 -5
README.md CHANGED
@@ -8,16 +8,19 @@ tags:
  - multimodal
  ---
 
- # Kwai Keye-VL
+ # Kwai Keye-VL 1.5
 
 
  <div align="center">
  <img src="asset/keye_logo_2.png" width="100%" alt="Kwai Keye-VL Logo">
  </div>
 
+ Keye-VL-1.5 is a cutting-edge Multimodal Large Language Model (MLLM) that addresses fundamental challenges in video comprehension. It features a novel Slow-Fast video encoding strategy, a progressive four-stage pre-training methodology to extend context length up to 128K tokens, and a comprehensive post-training pipeline focusing on reasoning enhancement and human preference alignment. The model demonstrates significant improvements in video understanding tasks and maintains competitive performance on general multimodal benchmarks.
+
  <font size=3><div align='center' >
  [[🍎 Home Page](https://kwai-keye.github.io/)]
- [[📖 Technique Report](https://arxiv.org/abs/2509.01563)]
+ [[📖 Technical Report](https://arxiv.org/abs/2509.01563)]
+ [[💻 GitHub Repository](https://github.com/Kwai-Keye/Keye)]
  [[📊 Keye-VL-8B-Preview](https://huggingface.co/Kwai-Keye/Keye-VL-8B-Preview) ]
  [[📊 Keye-VL-1.5-8B](https://huggingface.co/Kwai-Keye/Keye-VL-1_5-8B/) ]
  [[🚀 Demo](https://huggingface.co/spaces/Kwai-Keye/Keye-VL-8B-Preview)]
@@ -38,7 +41,7 @@ tags:
 
  ## Contents <!-- omit in toc -->
 
- - [Kwai Keye-VL](#kwai-keye-vl)
+ - [Kwai Keye-VL 1.5](#kwai-keye-vl-15)
  - [🔥 News](#-news)
  - [📐 Quick Start](#-quick-start)
  - [Preprocess and Inference](#preprocess-and-inference)
@@ -409,7 +412,7 @@ def prepare_message_for_vllm(content_messages):
  new_content_list = []
  for part_message in message_content_list:
      if 'video' in part_message:
-         video_message = [{'content': [part_message]}]
+         video_message = [{'content': [part_message]}]\
          image_inputs, video_inputs, video_kwargs = process_vision_info(video_message, return_video_kwargs=True)
          assert video_inputs is not None, "video_inputs should not be None"
          video_input = (video_inputs.pop()).permute(0, 2, 3, 1).numpy().astype(np.uint8)
@@ -515,4 +518,4 @@ If you find our work helpful for your research, please consider citing our work.
 
  ## Acknowledgement
 
- Kwai Keye-VL is developed based on the codebases of the following projects: [SigLIP](https://huggingface.co/google/siglip-so400m-patch14-384), [Qwen3](https://github.com/QwenLM/Qwen3), [Qwen2.5-VL](https://github.com/QwenLM/Qwen2.5-VL), [VLMEvalKit](https://github.com/open-compass/VLMEvalKit). We sincerely thank these projects for their outstanding work.
+ Kwai Keye-VL is developed based on the codebases of the following projects: [SigLIP](https://huggingface.co/google/siglip-so400m-patch14-384), [Qwen3](https://github.com/QwenLM/Qwen3), [Qwen2.5-VL](https://github.com/QwenLM/Qwen2.5-VL), [VLMEvalKit](https://github.com/open-compass/VLMEvalKit). We sincerely thank these projects for their outstanding work.