AIDC-AI/Ovis1.6-Llama3.2-3B

xxyyy123 committed on Oct 17, 2024

Commit 7f4d6cd · 1 Parent(s): bd81d21

Updata README.md

Files changed (1)

README.md +5 -1

README.md CHANGED

@@ -18,8 +18,9 @@ language:
 ## Introduction
 [GitHub](https://github.com/AIDC-AI/Ovis) | [Demo](https://huggingface.co/spaces/AIDC-AI/Ovis1.6-Llama3.2-3B) | [Paper](https://arxiv.org/abs/2405.20797)
-We are excited to announce the open-sourcing of **Ovis-1.6**, our latest multi-modal large language model. Ovis is a novel Multimodal Large Language Model (MLLM) architecture, designed to structurally align visual and textual embeddings.
+We are thrilled to announce the open-sourcing of **Ovis1.6-Llama3.2-3B**, an integral part of the Ovis1.6 family. This cutting-edge model currently sets the benchmark as the state-of-the-art (SOTA) solution for edge-side multimodal tasks.
+The Ovis family employs an innovative Multimodal Large Language Model (MLLM) architecture, specifically designed to structurally align visual and textual embeddings. Ovis1.6-Llama3.2-3B excels in common industry benchmarks, surpassing numerous open-source and proprietary multimodal models. Moreover, it is also particularly well-suited for local intelligence, on-device computing, and edge computing scenarios.
 <div align="center">
     <img src="https://cdn-uploads.huggingface.co/production/uploads/658a8a837959448ef5500ce5/TIlymOb86R6_Mez3bpmcB.png" width="100%" />
@@ -46,6 +47,9 @@ Below is a code snippet to run Ovis with multimodal inputs. For additional usage
 ```bash
 pip install torch==2.2.0 transformers==4.44.2 numpy==1.24.3 pillow==10.3.0
 ```
+```bash
+pip install flash-attn --no-build-isolation
+```
 ```python
 import torch
 from PIL import Image