Update README.md
README.md
CHANGED
@@ -167,7 +167,7 @@ The architecture of granite-vision-3.1-2b-preview consists of the following comp
 
 (3) Large language model: granite-3.1-2b-instruct with 128k context length (https://huggingface.co/ibm-granite/granite-3.1-2b-instruct).
 
-We built upon
+We built upon LLaVA (https://llava-vl.github.io) to train our model. We use multi-layer encoder features and a denser grid resolution in AnyRes to enhance the model's ability to understand nuanced visual content, which is essential for accurately interpreting document images.
 
 
 **Training Data:**
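The added README line mentions "a denser grid resolution in AnyRes". As a rough illustration of why a denser grid can help with document images, the sketch below picks the tiling resolution that keeps the most image detail, in the general style of LLaVA's AnyRes. It is not the granite-vision implementation: the function names `make_grid` and `select_best_resolution`, the 384-pixel tile size, and the grid limits are all assumptions made for this example.

```python
from typing import Iterable, List, Tuple

Size = Tuple[int, int]  # (width, height) in pixels


def make_grid(tile: int, max_cols: int, max_rows: int) -> List[Size]:
    """Hypothetical helper: every cols x rows tiling up to the given limits."""
    return [
        (cols * tile, rows * tile)
        for cols in range(1, max_cols + 1)
        for rows in range(1, max_rows + 1)
    ]


def select_best_resolution(image_size: Size, candidates: Iterable[Size]) -> Size:
    """Pick the candidate that preserves the most image pixels.

    Ties are broken by the least wasted (padded) canvas area.
    """
    orig_w, orig_h = image_size
    best = None
    best_effective = -1
    least_wasted = 0
    for cand_w, cand_h in candidates:
        # Fit the image inside the candidate canvas, keeping its aspect ratio.
        # Integer arithmetic avoids floating-point rounding surprises.
        if cand_w * orig_h <= cand_h * orig_w:
            scaled_w, scaled_h = cand_w, orig_h * cand_w // orig_w  # width-limited
        else:
            scaled_w, scaled_h = orig_w * cand_h // orig_h, cand_h  # height-limited
        # Upscaling adds no real detail, so cap at the original pixel count.
        effective = min(scaled_w * scaled_h, orig_w * orig_h)
        wasted = cand_w * cand_h - effective
        if effective > best_effective or (
            effective == best_effective and wasted < least_wasted
        ):
            best, best_effective, least_wasted = (cand_w, cand_h), effective, wasted
    if best is None:
        raise ValueError("candidates must not be empty")
    return best


# A portrait, document-like page under a coarse and a denser grid.
page = (800, 1200)
coarse = select_best_resolution(page, make_grid(384, 2, 2))
dense = select_best_resolution(page, make_grid(384, 3, 3))
print(coarse, dense)
```

Under these assumed settings, the coarse 2x2 grid has to squeeze the portrait page into a square 768x768 canvas, while the 3x3 grid offers a 768x1152 canvas that matches the page's aspect ratio and keeps more than twice as many effective pixels.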