nielsr (HF Staff) committed to PAPOGalaxy
Commit 5415fff · verified · 1 Parent(s): f4c9b7f

Improve model card: Add pipeline tag, library, links, and usage example (#1)




Co-authored-by: Niels Rogge <[email protected]>

Files changed (1)
  1. README.md +47 -5
README.md CHANGED
@@ -1,14 +1,56 @@
  ---
- license: mit
  datasets:
  - PAPOGalaxy/PAPO_train
  ---

- # PAPO Model

- ## Model Source
- This is the official model released for our paper **PAPO: Perception-Aware Policy Optimization for Multimodal Reasoning** (arxiv.org/abs/2507.06448)

  ## Model Version
- PAPO (γ=0.01)
  ---
  datasets:
  - PAPOGalaxy/PAPO_train
+ license: mit
+ pipeline_tag: image-text-to-text
+ library_name: transformers
  ---

+ # PAPO: Perception-Aware Policy Optimization for Multimodal Reasoning

+ This is the official model released for our paper [Perception-Aware Policy Optimization for Multimodal Reasoning](https://huggingface.co/papers/2507.06448).

+ **Project Page:** [https://mikewangwzhl.github.io/PAPO/](https://mikewangwzhl.github.io/PAPO/)
+ **Code:** [https://github.com/mikewangwzhl/PAPO](https://github.com/mikewangwzhl/PAPO)

  ## Model Version
+ PAPO (γ=0.01)
+
+ ## Usage
+
+ You can use this model with the Hugging Face `transformers` library.
+
+ ```python
+ from transformers import AutoProcessor, AutoModelForImageTextToText
+ from PIL import Image
+ import requests
+
+ # Replace "PAPOGalaxy/PAPO" with the actual model ID if different,
+ # e.g. PAPOGalaxy/PAPO-7B or PAPOGalaxy/PAPO-3B
+ model_id = "PAPOGalaxy/PAPO"
+
+ processor = AutoProcessor.from_pretrained(model_id)
+ # Image-text-to-text checkpoints load via AutoModelForImageTextToText,
+ # not AutoModelForCausalLM
+ model = AutoModelForImageTextToText.from_pretrained(model_id, device_map="auto")
+
+ # Example image (replace with your own image path or URL)
+ image_url = "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/tasks/bee.JPG"
+ image = Image.open(requests.get(image_url, stream=True).raw)
+
+ # Example prompt
+ prompt = "What is in the image?"
+
+ # Prepare inputs following the model's chat template
+ messages = [
+     {"role": "user", "content": [
+         {"type": "image", "image": image},
+         {"type": "text", "text": prompt}
+     ]}
+ ]
+ text = processor.apply_chat_template(messages, add_generation_prompt=True, tokenize=False)
+ inputs = processor(text=text, images=image, return_tensors="pt").to(model.device)
+
+ # Generate a response
+ generated_ids = model.generate(**inputs, max_new_tokens=100)
+ # Trim the prompt tokens so only the model's answer is decoded
+ generated_ids = generated_ids[:, inputs["input_ids"].shape[1]:]
+ generated_text = processor.batch_decode(generated_ids, skip_special_tokens=True)[0]
+ print(generated_text)
+ ```
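Since the diff also adds `pipeline_tag: image-text-to-text` to the card metadata, the model should be loadable through the high-level `pipeline` API as well. The sketch below assumes a recent `transformers` version that ships the `image-text-to-text` pipeline, and reuses the (possibly placeholder) `PAPOGalaxy/PAPO` model ID from the README example; it is an illustration, not the authors' documented usage.

```python
def build_messages(image_url: str, question: str) -> list:
    """Chat-style input for an image-text-to-text pipeline:
    one user turn whose content mixes an image part and a text part."""
    return [
        {"role": "user", "content": [
            {"type": "image", "url": image_url},
            {"type": "text", "text": question},
        ]}
    ]


def describe_image(model_id: str, image_url: str, question: str) -> str:
    # Imported here because loading the pipeline downloads the model
    # and needs enough GPU/CPU memory
    from transformers import pipeline

    pipe = pipeline("image-text-to-text", model=model_id, device_map="auto")
    out = pipe(text=build_messages(image_url, question), max_new_tokens=100)
    return out[0]["generated_text"]


if __name__ == "__main__":
    # Hypothetical model ID, mirroring the README example above
    print(describe_image(
        "PAPOGalaxy/PAPO",
        "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/tasks/bee.JPG",
        "What is in the image?",
    ))
```

The message structure is the same one the manual `apply_chat_template` path in the README builds, so switching between the two approaches only changes how the model is loaded and invoked.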