nielsr (HF Staff) committed on
Commit 25028e1 · verified · 1 Parent(s): eff15e9

Improve model card with abstract summary and GitHub link


This PR enhances the model card for Kwai Keye-VL 1.5 by:

- Updating the main title to "Kwai Keye-VL 1.5" for better precision, aligning with the paper and news.
- Adding a concise summary of the model's key innovations and capabilities, derived from the paper's abstract, to provide an immediate overview for users.
- Integrating a direct link to the official GitHub repository (`https://github.com/Kwai-Keye/Keye`) in the prominent initial links section, improving code discoverability.
- Correcting the typo "Technique Report" to "Technical Report" in the introductory links.

All existing metadata, code snippets, and other detailed content remain unchanged to preserve functionality and user experience.

Files changed (1)
  1. README.md +8 -5
README.md CHANGED

```diff
@@ -8,16 +8,19 @@ tags:
 - multimodal
 ---
 
-# Kwai Keye-VL
+# Kwai Keye-VL 1.5
 
 
 <div align="center">
 <img src="asset/keye_logo_2.png" width="100%" alt="Kwai Keye-VL Logo">
 </div>
 
+Keye-VL-1.5 is a cutting-edge Multimodal Large Language Model (MLLM) that addresses fundamental challenges in video comprehension. It features a novel Slow-Fast video encoding strategy, a progressive four-stage pre-training methodology to extend context length up to 128K tokens, and a comprehensive post-training pipeline focusing on reasoning enhancement and human preference alignment. The model demonstrates significant improvements in video understanding tasks and maintains competitive performance on general multimodal benchmarks.
+
 <font size=3><div align='center' >
 [[🍎 Home Page](https://kwai-keye.github.io/)]
-[[📖 Technique Report](https://arxiv.org/abs/2509.01563)]
+[[📖 Technical Report](https://arxiv.org/abs/2509.01563)]
+[[💻 GitHub Repository](https://github.com/Kwai-Keye/Keye)]
 [[📊 Keye-VL-8B-Preview](https://huggingface.co/Kwai-Keye/Keye-VL-8B-Preview) ]
 [[📊 Keye-VL-1.5-8B](https://huggingface.co/Kwai-Keye/Keye-VL-1_5-8B/) ]
 [[🚀 Demo](https://huggingface.co/spaces/Kwai-Keye/Keye-VL-8B-Preview)]
@@ -38,7 +41,7 @@ tags:
 
 ## Contents <!-- omit in toc -->
 
-- [Kwai Keye-VL](#kwai-keye-vl)
+- [Kwai Keye-VL 1.5](#kwai-keye-vl-15)
 - [🔥 News](#-news)
 - [📐 Quick Start](#-quick-start)
 - [Preprocess and Inference](#preprocess-and-inference)
@@ -409,7 +412,7 @@ def prepare_message_for_vllm(content_messages):
         new_content_list = []
         for part_message in message_content_list:
             if 'video' in part_message:
-                video_message = [{'content': [part_message]}]
+                video_message = [{'content': [part_message]}]\
                 image_inputs, video_inputs, video_kwargs = process_vision_info(video_message, return_video_kwargs=True)
                 assert video_inputs is not None, "video_inputs should not be None"
                 video_input = (video_inputs.pop()).permute(0, 2, 3, 1).numpy().astype(np.uint8)
@@ -515,4 +518,4 @@ If you find our work helpful for your research, please consider citing our work.
 
 ## Acknowledgement
 
-Kwai Keye-VL is developed based on the codebases of the following projects: [SigLIP](https://huggingface.co/google/siglip-so400m-patch14-384), [Qwen3](https://github.com/QwenLM/Qwen3), [Qwen2.5-VL](https://github.com/QwenLM/Qwen2.5-VL), [VLMEvalKit](https://github.com/open-compass/VLMEvalKit). We sincerely thank these projects for their outstanding work.
+Kwai Keye-VL is developed based on the codebases of the following projects: [SigLIP](https://huggingface.co/google/siglip-so400m-patch14-384), [Qwen3](https://github.com/QwenLM/Qwen3), [Qwen2.5-VL](https://github.com/QwenLM/Qwen2.5-VL), [VLMEvalKit](https://github.com/open-compass/VLMEvalKit). We sincerely thank these projects for their outstanding work.
```
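For context on the code fragment touched in the third hunk: the README's `prepare_message_for_vllm` helper uses `qwen_vl_utils.process_vision_info` to decode a video entry from a chat message into frame tensors before handing them to vLLM. The sketch below illustrates only that decoding step, not the full helper from the model card; the video path is a hypothetical placeholder, and it assumes the `qwen-vl-utils` package is installed with support for `return_video_kwargs`.

```python
# Minimal sketch of the video-decoding step visible in the diff hunk above.
# Assumptions (not taken from the commit itself): qwen-vl-utils is installed,
# and "file:///path/to/video.mp4" is a placeholder path.
import numpy as np
from qwen_vl_utils import process_vision_info

# A single "video" content part, as it would appear inside a chat message.
part_message = {"type": "video", "video": "file:///path/to/video.mp4"}

# Wrap the part in a one-message conversation so process_vision_info can
# reuse its conversation-level API for this single video entry.
video_message = [{"content": [part_message]}]

image_inputs, video_inputs, video_kwargs = process_vision_info(
    video_message, return_video_kwargs=True
)
assert video_inputs is not None, "video_inputs should not be None"

# video_inputs is a list of (num_frames, C, H, W) tensors; convert the last one
# to (num_frames, H, W, C) uint8 frames, matching the line shown in the diff.
video_input = video_inputs.pop().permute(0, 2, 3, 1).numpy().astype(np.uint8)
print(video_input.shape, video_kwargs.get("fps"))
```

Wrapping the single content part in a one-element message list is what lets the helper call the same `process_vision_info` entry point used for whole conversations while handling each video independently.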