nielsr (HF Staff) committed
Commit 8d65bb9 · verified · 1 Parent(s): cf6f7d9

Improve model card: Add metadata, usage example, and comprehensive content


This PR significantly enhances the model card for ChartCoder by:
- Adding key metadata: `pipeline_tag: image-text-to-text`, `library_name: transformers`, `license: cc-by-nc-4.0`, and `tags: - code-generation`. This improves discoverability on the Hub (e.g., at https://huggingface.co/models?pipeline_tag=image-text-to-text) and indicates compatibility with the Hugging Face Transformers library.
- Enriching the introductory section with an "About ChartCoder" summary derived from the paper's abstract.
- Integrating comprehensive details from the project's GitHub repository, including "Notes", "News", "Overview", "Models", and "Data" sections, along with updated image links to ensure proper rendering.
- Providing a clear "Installation" and "Training" guide for local setup.
- Adding a practical "Inference (Sample Usage)" code snippet using the `transformers` library, enabling users to easily load and run the model.
- Updating the top links with badges for datasets, the model itself, the arXiv paper, and the GitHub repository.

These changes provide a much richer and more actionable resource for users exploring ChartCoder on the Hugging Face Hub.

Files changed (1)
  1. README.md +145 -20
README.md CHANGED
@@ -1,46 +1,167 @@
- # ChartCoder: Advancing Multimodal Large Language Model for Chart-to-Code Generation (ACL25 Main)

- <a href="https://huggingface.co/datasets/xxxllz/Chart2Code-160k" target="_blank">🤗 Dataset(HuggingFace)</a> | <a href="https://modelscope.cn/datasets/Noct25/Chart2Code-160k" target="_blank">🤖 Dataset(ModelScope)</a> | <a href="https://arxiv.org/abs/2501.06598" target="_blank">📑 Paper </a>

- This repository contains the code to train and infer ChartCoder. **Our Github repository [ChartCoder](https://github.com/thunlp/ChartCoder) updates with more details and news.**

  ## Installation
- 1. Clone this repo
- ```
  git clone https://github.com/thunlp/ChartCoder.git
- ```
- 2. Create environment
- ```
  conda create -n chartcoder python=3.10 -y
  conda activate chartcoder
  pip install --upgrade pip # enable PEP 660 support
  pip install -e .
  ```
- 3. Additional packages required for training
- ```
  pip install -e ".[train]"
  pip install flash-attn --no-build-isolation
  ```

- ## Train
- The whole training process consists of two stages. To train the ChartCoder, ```siglip-so400m-patch14-384``` and ```deepseek-coder-6.7b-instruct``` should be downloaded first.

- For **Pre-training**, run
- ```
  bash scripts/train/pretrain_siglip.sh
  ```
- For **SFT**, run
- ```
  bash scripts/train/finetune_siglip_a4.sh
  ```
- Please change the model path to your local path. See the corresponding ```.sh ``` file for details.
- We also provide other training scripts, such as using CLIP ```_clip``` and multiple machines ```_m```. See ``` scripts/train ``` for further information.

  ## Citation
  If you find this work useful, consider giving this repository a star ⭐️ and citing 📝 our paper as follows:
- ```
  @misc{zhao2025chartcoderadvancingmultimodallarge,
  title={ChartCoder: Advancing Multimodal Large Language Model for Chart-to-Code Generation},
  author={Xuanle Zhao and Xianzhen Luo and Qi Shi and Chi Chen and Shuo Wang and Wanxiang Che and Zhiyuan Liu and Maosong Sun},
@@ -50,4 +171,8 @@ If you find this work useful, consider giving this repository a star ⭐️ and
  primaryClass={cs.AI},
  url={https://arxiv.org/abs/2501.06598},
  }
- ```
+ ---
+ pipeline_tag: image-text-to-text
+ library_name: transformers
+ license: cc-by-nc-4.0
+ tags:
+ - code-generation
+ ---
+
+ # ChartCoder: Advancing Multimodal Large Language Model for Chart-to-Code Generation (ACL 2025 Main)
+
+ [![🤗 Dataset (HuggingFace)](https://img.shields.io/badge/Dataset-HuggingFace-FFD21E.svg?logo=huggingface&logoColor=yellow)](https://huggingface.co/datasets/xxxllz/Chart2Code-160k) [![🤖 Dataset (ModelScope)](https://img.shields.io/badge/Dataset-ModelScope-00A0E9.svg)](https://modelscope.cn/datasets/Noct25/Chart2Code-160k) [![🤗 Model (HuggingFace)](https://img.shields.io/badge/Model-HuggingFace-FFD21E.svg?logo=huggingface&logoColor=yellow)](https://huggingface.co/xxxllz/ChartCoder) [![📑 Paper (arXiv:2501.06598)](https://img.shields.io/badge/arXiv-2501.06598-b31b1b.svg?logo=arXiv)](https://arxiv.org/abs/2501.06598) [![GitHub Repo](https://img.shields.io/badge/GitHub-Repo-181717.svg?logo=github)](https://github.com/thunlp/ChartCoder)
+
+ This repository is the official implementation of [ChartCoder: Advancing Multimodal Large Language Model for Chart-to-Code Generation](https://arxiv.org/abs/2501.06598).
+
+ ## About ChartCoder
+
+ ChartCoder is the first dedicated multimodal large language model (MLLM) designed for **chart-to-code generation**. It leverages Code LLMs as its language backbone to significantly enhance the executability of generated code. This model addresses two key challenges in chart interpretation:
+
+ 1. **Low executability and poor detail restoration** in generated code from existing MLLMs.
+ 2. **Lack of large-scale and diverse training data** for chart-to-code tasks.
+
+ To overcome these, ChartCoder introduces:
+ - **Chart2Code-160k**: The first large-scale and diverse dataset for chart-to-code generation.
+ - **Snippet-of-Thought (SoT)**: A method that transforms direct chart-to-code generation into a step-by-step process.
+
+ With only 7B parameters, ChartCoder surpasses existing open-source MLLMs on chart-to-code benchmarks, achieving superior chart restoration and code executability.
+
+ ## Notes
+
+ 1. ChartCoder is tested on the new version of ChartMimic, which contains 600 samples. The ICLR version of ChartMimic is available at https://huggingface.co/datasets/ChartMimic/ChartMimic/blob/main/dataset-iclr.tar.gz.
+ 2. The evaluation code we use comes from the Supplementary Material of https://openreview.net/forum?id=sGpCzsfd1K.
+
+ All results in Table 3 of the paper (both the baselines and our models) are evaluated under these two settings. Results may differ when the assessment is conducted under other settings, so to replicate the numbers reported in the paper, please use the settings above.
+
+ ## News
+
+ **[2025.5.17]** ChartCoder has been accepted by **ACL 2025 Main**.
+
+ **[2025.3.13]** We have uploaded our dataset [Chart2Code-160k (HF)](https://huggingface.co/datasets/xxxllz/Chart2Code-160k) to Hugging Face.
+
+ **[2025.2.19]** We have released our dataset [Chart2Code-160k](https://modelscope.cn/datasets/Noct25/Chart2Code-160k) on ModelScope.
+
+ **[2025.1.16]** We have updated our data generation code [data_generator](https://github.com/thunlp/ChartCoder/tree/main/data_generator), built on [Multi-modal-Self-instruct](https://github.com/zwq2018/Multi-modal-Self-instruct). Please follow their instructions and our code to generate the <chart, code> data pairs.
+
+ ## Overview
+
+ ![main](https://github.com/thunlp/ChartCoder/raw/main/fig/main.png)
+
  ## Installation
+
+ To get started with ChartCoder, clone the repository and set up the environment:
+
+ ```bash
  git clone https://github.com/thunlp/ChartCoder.git
+ cd ChartCoder
  conda create -n chartcoder python=3.10 -y
  conda activate chartcoder
  pip install --upgrade pip # enable PEP 660 support
  pip install -e .
  ```
+
+ For training, additional packages are required:
+
+ ```bash
  pip install -e ".[train]"
  pip install flash-attn --no-build-isolation
  ```

+ ## Models
+
+ | Model | Download Link |
+ |---|---|
+ | MLP Connector | [projector](https://drive.google.com/file/d/1S_LwG65TIz_miW39rFPhuEAb5ClgopYi/view?usp=drive_link) |
+ | ChartCoder | [ChartCoder](https://huggingface.co/xxxllz/ChartCoder) |
+
+ The MLP Connector is our pre-trained projector weights, which you can use directly for SFT.
+
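+ As a quick, non-authoritative example, the ChartCoder checkpoint can be fetched from the Hub with `huggingface_hub` (the repo id comes from the table above; the local directory is an arbitrary example, and the MLP projector hosted on Google Drive still has to be downloaded manually):
+
+ ```python
+ from huggingface_hub import snapshot_download
+
+ # Download the released ChartCoder weights listed in the Models table above.
+ # local_dir is an example path; the Google Drive projector link must be fetched separately.
+ local_path = snapshot_download(
+     repo_id="xxxllz/ChartCoder",
+     local_dir="checkpoints/ChartCoder",
+ )
+ print(f"ChartCoder weights downloaded to: {local_path}")
+ ```
+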
+ ## Data
+
+ | Dataset | Download Link |
+ |---|---|
+ | Chart2Code-160k | [HuggingFace](https://huggingface.co/datasets/xxxllz/Chart2Code-160k) |
+ | Chart2Code-160k | [ModelScope](https://modelscope.cn/datasets/Noct25/Chart2Code-160k) |
+
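+ A minimal sketch for pulling the Hugging Face copy of Chart2Code-160k with `huggingface_hub` (whether it also loads directly via `datasets.load_dataset` depends on the repository's file layout, so only the raw download is shown; the local directory is an example):
+
+ ```python
+ from huggingface_hub import snapshot_download
+
+ # Download the Chart2Code-160k dataset files; repo_type="dataset" targets the dataset repo.
+ data_path = snapshot_download(
+     repo_id="xxxllz/Chart2Code-160k",
+     repo_type="dataset",
+     local_dir="data/Chart2Code-160k",  # example location
+ )
+ print(f"Chart2Code-160k files downloaded to: {data_path}")
+ ```
+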
+ ## Training
+
+ The whole training process consists of two stages. Before training ChartCoder, download `siglip-so400m-patch14-384` and `deepseek-coder-6.7b-instruct` first.
+
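+ For example, both backbones can be fetched from the Hub before launching the scripts; a minimal sketch, assuming the intended checkpoints are `google/siglip-so400m-patch14-384` and `deepseek-ai/deepseek-coder-6.7b-instruct` (the local directories are arbitrary examples to point the training scripts at):
+
+ ```python
+ from huggingface_hub import snapshot_download
+
+ # Download the vision encoder and the code-LLM backbone used by ChartCoder.
+ # Afterwards, set the model paths in the training .sh scripts to these local directories.
+ for repo_id, local_dir in [
+     ("google/siglip-so400m-patch14-384", "checkpoints/siglip-so400m-patch14-384"),
+     ("deepseek-ai/deepseek-coder-6.7b-instruct", "checkpoints/deepseek-coder-6.7b-instruct"),
+ ]:
+     snapshot_download(repo_id=repo_id, local_dir=local_dir)
+ ```
+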
+ For **Pre-training**, run:
+
+ ```bash
  bash scripts/train/pretrain_siglip.sh
  ```
+
+ For **SFT**, run:
+
+ ```bash
  bash scripts/train/finetune_siglip_a4.sh
  ```
+
+ Please change the model paths to your local paths; see the corresponding `.sh` file for details. We also provide other training scripts, for example CLIP-based variants (`_clip`) and multi-machine variants (`_m`); see `scripts/train` for further information.
+
+ ## Inference (Sample Usage)
+
+ You can easily use ChartCoder with the Hugging Face `transformers` library. Ensure you have `transformers` and `torch` installed.
+
+ ```python
+ from transformers import AutoProcessor, AutoModelForCausalLM
+ import torch
+ from PIL import Image
+ import requests
+ from io import BytesIO
+
+ # Load the model and processor
+ model_id = "xxxllz/ChartCoder"  # the model's Hugging Face ID
+ model = AutoModelForCausalLM.from_pretrained(
+     model_id,
+     torch_dtype=torch.bfloat16,
+     low_cpu_mem_usage=True,
+     trust_remote_code=True,
+ )
+ processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)
+
+ # Load a chart image (replace with a real chart image path or URL), e.g.:
+ # image = Image.open("path/to/your/chart_image.png").convert("RGB")
+ # Or fetch one from a URL for demonstration:
+ image_url = "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/tasks/chart_example.png"
+ image = Image.open(BytesIO(requests.get(image_url).content)).convert("RGB")
+
+ # Define the prompt for chart-to-code generation
+ prompt = "Generate Python code to recreate the given chart. Provide only the code, no explanations."
+
+ # Prepare the messages in the chat template format (the <image> placeholder marks where the image goes)
+ messages = [
+     {"role": "user", "content": f"<image>\n{prompt}"}
+ ]
+ text = processor.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
+
+ # Preprocess and move the inputs to the model's device
+ inputs = processor(text=text, images=image, return_tensors="pt").to(model.device)
+
+ # Generate code
+ output_ids = model.generate(**inputs, max_new_tokens=512)
+ output_text = processor.batch_decode(output_ids, skip_special_tokens=True)[0].strip()
+
+ # Print the generated code
+ print(output_text)
+ ```
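+
+ The model returns chart code (typically Python/matplotlib) as text. As a follow-up sketch, continuing from the snippet above and assuming the output is a self-contained matplotlib script, you can write it to a file and execute it to re-render the chart; always review generated code before running it:
+
+ ```python
+ import subprocess
+ import sys
+ from pathlib import Path
+
+ # output_text comes from the inference snippet above and is assumed to be a runnable matplotlib script.
+ # Executing model-generated code blindly is unsafe; inspect it (or sandbox it) first.
+ script_path = Path("generated_chart.py")
+ script_path.write_text(output_text)
+ subprocess.run([sys.executable, str(script_path)], check=True)
+ ```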
+
+ ## Results
+
+ Please refer to our paper for detailed performance on the ChartMimic, Plot2Code and ChartX benchmarks. Thanks to these benchmarks for their contributions to the chart-to-code field.
+ ![results](https://github.com/thunlp/ChartCoder/raw/main/fig/results.png)
+
+ ## Contact
+
+ For any questions, you can contact [[email protected]](mailto:[email protected]).
+
  ## Citation
+
  If you find this work useful, consider giving this repository a star ⭐️ and citing 📝 our paper as follows:
+
+ ```bibtex
  @misc{zhao2025chartcoderadvancingmultimodallarge,
  title={ChartCoder: Advancing Multimodal Large Language Model for Chart-to-Code Generation},
  author={Xuanle Zhao and Xianzhen Luo and Qi Shi and Chi Chen and Shuo Wang and Wanxiang Che and Zhiyuan Liu and Maosong Sun},

  primaryClass={cs.AI},
  url={https://arxiv.org/abs/2501.06598},
  }
+ ```
+
+ ## Acknowledgement
+
+ The code is based on [LLaVA-NeXT](https://github.com/LLaVA-VL/LLaVA-NeXT). Thanks for their great work and open-sourcing!