kunato committed · verified · Commit a04476a · 1 Parent(s): 99facfd

Update README.md

Files changed (1): README.md (+29 −5)

README.md CHANGED
@@ -78,6 +78,22 @@ from typhoon_ocr import ocr_document
 markdown = ocr_document("test.png")
 print(markdown)
 ```

 **Run Manually**

 Below is a partial snippet. You can run inference using either the API or a local model.

@@ -149,7 +165,8 @@ response = openai.chat.completions.create(
 text_output = response.choices[0].message.content
 print(text_output)
 ```
- *Local Model (GPU Required)*:
 ```python
 # Initialize the model
 model = Qwen2_5_VLForConditionalGeneration.from_pretrained("scb10x/typhoon-ocr-7b", torch_dtype=torch.bfloat16).eval()

@@ -191,7 +208,7 @@ print(text_output[0])

 This model only works with the specific prompts defined below, where `{base_text}` refers to information extracted from the PDF metadata using the `get_anchor_text` function from the `typhoon-ocr` package. It will not function correctly with any other prompts.

- ```
 PROMPTS_SYS = {
 "default": lambda base_text: (f"Below is an image of a document page along with its dimensions. "
 f"Simply return the markdown representation of this document, presenting tables in markdown format as they naturally appear.\n"

@@ -212,16 +229,23 @@ PROMPTS_SYS = {
 ### Generation Parameters

 We suggest using the following generation parameters. Since this is an OCR model, we do not recommend using a high temperature. Make sure the temperature is set to 0 or 0.1, not higher.
- ```
 temperature=0.1,
 top_p=0.6,
 repetition_penalty: 1.2
 ```

 ## Hosting
- ```
 vllm serve scb10x/typhoon-ocr-7b --max-model-len 32000 # OpenAI Compatible at http://localhost:8000
- # then you can supply base_url in to ocr_document('image.jpg', base_url='http://localhost:8000/v1')
 ```

 ## **Intended Uses & Limitations**
 
 markdown = ocr_document("test.png")
 print(markdown)
 ```
+
+ **(Recommended): Local Model via vllm (GPU Required)**:
+
+ ```bash
+ pip install vllm
+ vllm serve scb10x/typhoon-ocr-7b --max-model-len 32000 --served-model-name typhoon-ocr-preview # OpenAI Compatible at http://localhost:8000 (or another port)
+ # then you can pass base_url to ocr_document
+ ```
+
+ ```python
+ from typhoon_ocr import ocr_document
+ markdown = ocr_document('image.png', base_url='http://localhost:8000/v1', api_key='anything-is-ok')
+ print(markdown)
+ ```
+ To read more, see the [vllm quickstart](https://docs.vllm.ai/en/latest/getting_started/quickstart.html).
+
 **Run Manually**

 Below is a partial snippet. You can run inference using either the API or a local model.
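`ocr_document` builds and sends the OpenAI-compatible request for you. If you ever need to call the endpoint directly, vision-style chat APIs generally expect the image as a base64 data URL; below is a minimal stdlib sketch (the helper name and the fake bytes are illustrative, not the library's internals):

```python
import base64

def to_data_url(image_bytes: bytes, mime: str = "image/png") -> str:
    # OpenAI-compatible vision endpoints accept images as base64 data URLs
    b64 = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime};base64,{b64}"

# Tiny fake byte string just to show the shape; use real PNG bytes in practice.
url = to_data_url(b"\x89PNG\r\n")
```

The resulting string goes in an `image_url` content part of a chat message.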
 
 text_output = response.choices[0].message.content
 print(text_output)
 ```
+
+ *(Not Recommended): Local Model via Transformers (GPU Required)*:
 ```python
 # Initialize the model
 model = Qwen2_5_VLForConditionalGeneration.from_pretrained("scb10x/typhoon-ocr-7b", torch_dtype=torch.bfloat16).eval()
 

 This model only works with the specific prompts defined below, where `{base_text}` refers to information extracted from the PDF metadata using the `get_anchor_text` function from the `typhoon-ocr` package. It will not function correctly with any other prompts.

+ ```python
 PROMPTS_SYS = {
 "default": lambda base_text: (f"Below is an image of a document page along with its dimensions. "
 f"Simply return the markdown representation of this document, presenting tables in markdown format as they naturally appear.\n"
 
 ### Generation Parameters

 We suggest using the following generation parameters. Since this is an OCR model, we do not recommend using a high temperature. Make sure the temperature is set to 0 or 0.1, not higher.
+ ```python
 temperature=0.1,
 top_p=0.6,
 repetition_penalty=1.2
 ```
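Translated into an OpenAI-compatible request body, the suggested parameters look like the sketch below. This assumes `repetition_penalty` is a vllm sampling extension rather than part of the standard OpenAI schema, so with the `openai` client it would typically be passed via `extra_body`; check your vllm version:

```python
# Hypothetical request body for the vllm OpenAI-compatible endpoint.
payload = {
    "model": "typhoon-ocr-preview",  # name set via --served-model-name above
    "messages": [{"role": "user", "content": "..."}],
    "temperature": 0.1,
    "top_p": 0.6,
    "repetition_penalty": 1.2,  # vllm extension, not standard OpenAI
}
```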

 ## Hosting
+
+ We recommend serving typhoon-ocr with [vllm](https://github.com/vllm-project/vllm) rather than Hugging Face transformers, and using the typhoon-ocr library to OCR documents. To read more, see the [vllm quickstart](https://docs.vllm.ai/en/latest/getting_started/quickstart.html).
+ ```bash
 vllm serve scb10x/typhoon-ocr-7b --max-model-len 32000 # OpenAI Compatible at http://localhost:8000
+ # then you can pass base_url to ocr_document
+ ```
+
+ ```python
+ from typhoon_ocr import ocr_document
+ ocr_document('image.jpg', base_url='http://localhost:8000/v1')
 ```

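To recap the prompt requirement, the sketch below mirrors the shape of the `PROMPTS_SYS` table: a dict mapping task names to lambdas that splice `base_text` into a fixed instruction. The prompt string is abbreviated and the `build_prompt` helper is hypothetical; use the exact prompts from the README in real use.

```python
# Illustrative sketch of the prompt-table pattern; the full official
# prompt strings live in PROMPTS_SYS (abbreviated here).
PROMPTS_SYS = {
    "default": lambda base_text: (
        "Below is an image of a document page along with its dimensions. "
        "Simply return the markdown representation of this document, "
        "presenting tables in markdown format as they naturally appear.\n"
        f"{base_text}"
    ),
}

def build_prompt(task: str, base_text: str) -> str:
    # base_text comes from typhoon-ocr's get_anchor_text in real use
    return PROMPTS_SYS[task](base_text)

prompt = build_prompt("default", "ANCHOR TEXT FROM PDF METADATA")
```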
  ## **Intended Uses & Limitations**