Update README.md
README.md CHANGED
@@ -65,7 +65,7 @@ To construct this dataset, we propose an efficient data construction pipeline. S
 
 - **For samples with clear ground truths:**
 the model is prompted to first provide the reasoning process and then give the final answer in the format like `Final Answer: ***`.
-Responses matching the ground truth answer constitute the positive set \\(mathcal{Y}_p\\), while those that do not match make up the negative set \\(\mathcal{Y}_n\\). Additionally, responses that fail to provide a clear final answer are also merged into \\(\mathcal{Y}_n\\).
+Responses matching the ground truth answer constitute the positive set \\(\mathcal{Y}_p\\), while those that do not match make up the negative set \\(\mathcal{Y}_n\\). Additionally, responses that fail to provide a clear final answer are also merged into \\(\mathcal{Y}_n\\).
 Given these responses labeled as positive or negative, we build the preference pairs by selecting a chosen response \\(y_c\\) from \\(\mathcal{Y}_p\\) and a negative response \\(y_r\\) from \\(\mathcal{Y}_n\\).
 
 - **For samples without clear ground truths:**
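The selection rule quoted in this hunk (responses whose extracted final answer matches the ground truth go into \\(\mathcal{Y}_p\\); everything else, including responses with no parsable final answer, goes into \\(\mathcal{Y}_n\\); a chosen/rejected pair is then drawn from the two sets) can be sketched in a few lines of Python. This is only an illustrative sketch: `extract_final_answer`, the regex, and the random pairing strategy are assumptions, not the released data pipeline.

```python
import random
import re

def extract_final_answer(response: str):
    # Hypothetical parser: take the text after "Final Answer:" if present.
    match = re.search(r"Final Answer:\s*(.+)", response)
    return match.group(1).strip() if match else None

def build_preference_pair(responses, ground_truth):
    # Split sampled responses into the positive and negative sets.
    positives, negatives = [], []
    for resp in responses:
        answer = extract_final_answer(resp)
        # Responses without a clear final answer are merged into the negative set.
        if answer is not None and answer == ground_truth:
            positives.append(resp)
        else:
            negatives.append(resp)
    if not positives or not negatives:
        return None
    # Pair a chosen response y_c with a rejected response y_r.
    return {"chosen": random.choice(positives), "rejected": random.choice(negatives)}
```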
@@ -160,7 +160,7 @@ To comprehensively compare InternVL's performance before and after MPO, we emplo
 
 ## Quick Start
 
-We provide an example code to run `InternVL2_5-
+We provide an example code to run `InternVL2_5-4B-MPO` using `transformers`.
 
 > Please use transformers>=4.37.2 to ensure the model works normally.
 
@@ -171,7 +171,7 @@ We provide an example code to run `InternVL2_5-1B` using `transformers`.
 ```python
 import torch
 from transformers import AutoTokenizer, AutoModel
-path = "OpenGVLab/InternVL2_5-
+path = "OpenGVLab/InternVL2_5-4B-MPO"
 model = AutoModel.from_pretrained(
     path,
     torch_dtype=torch.bfloat16,
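The fragment above is cut off by the hunk boundary in the middle of the `from_pretrained` call. For orientation, the load typically finishes along the lines below; this is a hedged sketch following the usual InternVL2.5 loading pattern, and `low_cpu_mem_usage` and `use_fast=False` are assumptions rather than lines taken from this diff.

```python
import torch
from transformers import AutoTokenizer, AutoModel

path = "OpenGVLab/InternVL2_5-4B-MPO"
# trust_remote_code is required because InternVL ships custom modeling code.
model = AutoModel.from_pretrained(
    path,
    torch_dtype=torch.bfloat16,
    low_cpu_mem_usage=True,
    trust_remote_code=True).eval().cuda()
tokenizer = AutoTokenizer.from_pretrained(path, trust_remote_code=True, use_fast=False)
```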
@@ -185,7 +185,7 @@ model = AutoModel.from_pretrained(
 ```python
 import torch
 from transformers import AutoTokenizer, AutoModel
-path = "OpenGVLab/InternVL2_5-
+path = "OpenGVLab/InternVL2_5-4B-MPO"
 model = AutoModel.from_pretrained(
     path,
     torch_dtype=torch.bfloat16,
@@ -230,8 +230,8 @@ def split_model(model_name):
 
     return device_map
 
-path = "OpenGVLab/InternVL2_5-
-device_map = split_model('InternVL2_5-
+path = "OpenGVLab/InternVL2_5-4B-MPO"
+device_map = split_model('InternVL2_5-4B')
 model = AutoModel.from_pretrained(
     path,
     torch_dtype=torch.bfloat16,
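This multi-GPU variant differs from the single-GPU load only in that the `device_map` computed by `split_model` is handed to `from_pretrained`, which shards the layers across the visible GPUs. A minimal sketch, assuming the same keyword arguments as the single-GPU case:

```python
# Sketch (assumed): reuse the path and device_map defined in the hunk above.
model = AutoModel.from_pretrained(
    path,
    torch_dtype=torch.bfloat16,
    low_cpu_mem_usage=True,
    trust_remote_code=True,
    device_map=device_map).eval()
```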
@@ -327,7 +327,7 @@ def load_image(image_file, input_size=448, max_num=12):
     return pixel_values
 
 # If you want to load a model using multiple GPUs, please refer to the `Multiple GPUs` section.
-path = 'OpenGVLab/InternVL2_5-
+path = 'OpenGVLab/InternVL2_5-4B-MPO'
 model = AutoModel.from_pretrained(
     path,
     torch_dtype=torch.bfloat16,
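The inference section this hunk touches goes on to call `model.chat` with the preprocessed `pixel_values`. A hedged sketch of that usual pattern, assuming the model and tokenizer are loaded as above, `load_image` is the helper defined just before this hunk, and the image path and generation settings are placeholders:

```python
generation_config = dict(max_new_tokens=1024, do_sample=True)

# Pure-text conversation: no pixel values are passed.
question = 'Hello, who are you?'
response = model.chat(tokenizer, None, question, generation_config)
print(f'User: {question}\nAssistant: {response}')

# Single-image conversation: <image> marks where the image is inserted into the prompt.
pixel_values = load_image('./examples/image1.jpg', max_num=12).to(torch.bfloat16).cuda()
question = '<image>\nPlease describe the image shortly.'
response = model.chat(tokenizer, pixel_values, question, generation_config)
print(f'User: {question}\nAssistant: {response}')
```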
@@ -510,7 +510,7 @@ LMDeploy abstracts the complex inference process of multi-modal Vision-Language
 from lmdeploy import pipeline, TurbomindEngineConfig
 from lmdeploy.vl import load_image
 
-model = 'OpenGVLab/InternVL2_5-
+model = 'OpenGVLab/InternVL2_5-4B-MPO'
 image = load_image('https://raw.githubusercontent.com/open-mmlab/mmdeploy/main/tests/data/tiger.jpeg')
 pipe = pipeline(model, backend_config=TurbomindEngineConfig(session_len=8192))
 response = pipe(('describe this image', image))
@@ -528,7 +528,7 @@ from lmdeploy import pipeline, TurbomindEngineConfig
 from lmdeploy.vl import load_image
 from lmdeploy.vl.constants import IMAGE_TOKEN
 
-model = 'OpenGVLab/InternVL2_5-
+model = 'OpenGVLab/InternVL2_5-4B-MPO'
 pipe = pipeline(model, backend_config=TurbomindEngineConfig(session_len=8192))
 
 image_urls=[
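This hunk sits in the multi-image example, which builds a prompt containing one `IMAGE_TOKEN` per image so the model knows where each image belongs. A hedged sketch of how the example presumably continues; the URLs below are placeholders, not the ones listed in the README.

```python
# Placeholder URLs for illustration only.
image_urls = [
    'https://example.com/image1.jpg',
    'https://example.com/image2.jpg',
]

images = [load_image(url) for url in image_urls]
# One IMAGE_TOKEN per image marks its position in the prompt.
prompt = f'{IMAGE_TOKEN}\n{IMAGE_TOKEN}\nDescribe the two images in detail.'
response = pipe((prompt, images))
print(response.text)
```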
@@ -550,7 +550,7 @@ Conducting inference with batch prompts is quite straightforward; just place the
 from lmdeploy import pipeline, TurbomindEngineConfig
 from lmdeploy.vl import load_image
 
-model = 'OpenGVLab/InternVL2_5-
+model = 'OpenGVLab/InternVL2_5-4B-MPO'
 pipe = pipeline(model, backend_config=TurbomindEngineConfig(session_len=8192))
 
 image_urls=[
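As the hunk context says, batching only requires putting the prompts in a list. A minimal sketch of how the example presumably finishes, assuming `image_urls` holds the URLs listed in the README:

```python
# Each element is a (text, image) prompt; passing the list runs them as one batch.
prompts = [('describe this image', load_image(url)) for url in image_urls]
response = pipe(prompts)
print(response)
```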
@@ -570,7 +570,7 @@ There are two ways to do the multi-turn conversations with the pipeline. One is
 from lmdeploy import pipeline, TurbomindEngineConfig, GenerationConfig
 from lmdeploy.vl import load_image
 
-model = 'OpenGVLab/InternVL2_5-
+model = 'OpenGVLab/InternVL2_5-4B-MPO'
 pipe = pipeline(model, backend_config=TurbomindEngineConfig(session_len=8192))
 
 image = load_image('https://raw.githubusercontent.com/open-mmlab/mmdeploy/main/demo/resources/human-pose.jpg')
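The next hunk's context line, `print(sess.response.text)`, shows this example uses the `pipe.chat` interface, which carries the conversation state in a session object. A hedged sketch of the typical multi-turn flow; the sampling values and the follow-up question are placeholders.

```python
gen_config = GenerationConfig(top_k=40, top_p=0.8, temperature=0.8)

# First turn: text plus image; pipe.chat returns a session holding the history.
sess = pipe.chat(('describe this image', image), gen_config=gen_config)
print(sess.response.text)

# Follow-up turn: pass the previous session so the history is reused.
sess = pipe.chat('What is the woman doing?', session=sess, gen_config=gen_config)
print(sess.response.text)
```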
@@ -586,7 +586,7 @@ print(sess.response.text)
 LMDeploy's `api_server` enables models to be easily packed into services with a single command. The provided RESTful APIs are compatible with OpenAI's interfaces. Below are an example of service startup:
 
 ```shell
-lmdeploy serve api_server OpenGVLab/InternVL2_5-
+lmdeploy serve api_server OpenGVLab/InternVL2_5-4B-MPO --server-port 23333
 ```
 
 To use the OpenAI-style interface, you need to install OpenAI:
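Because the served endpoint is OpenAI-compatible, it can be queried with the standard `openai` client once the server above is running. A hedged sketch: the port matches `--server-port 23333` from the hunk, while the API key and image URL are placeholders.

```python
from openai import OpenAI

client = OpenAI(api_key='YOUR_API_KEY', base_url='http://0.0.0.0:23333/v1')
# Ask the server which model it is serving.
model_name = client.models.list().data[0].id
response = client.chat.completions.create(
    model=model_name,
    messages=[{
        'role': 'user',
        'content': [
            {'type': 'text', 'text': 'describe this image'},
            {'type': 'image_url', 'image_url': {'url': 'https://example.com/tiger.jpeg'}},
        ],
    }],
    temperature=0.8,
    top_p=0.8)
print(response)
```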
@@ -625,7 +625,7 @@ print(response)
 
 ## License
 
-This project is released under the MIT License. This project uses the pre-trained Qwen2.5-
+This project is released under the MIT License. This project uses the pre-trained Qwen2.5-3B-Instruct as a component, which is licensed under the Apache License 2.0.
 
 ## Citation
 