---
license: cc-by-nc-4.0
base_model:
- Qwen/Qwen3-1.7B
- google/siglip2-so400m-patch14-384
---

# [Isaac-0.1 by Perceptron](https://www.perceptron.inc/blog/introducing-isaac-0-1)

*Note: this is the post-trained model.* [Try out the model on our playground](https://www.perceptron.inc/demo)

We're introducing Isaac 0.1, our first perceptive-language model and a major step toward building AI systems that can understand and interact with the physical world. Isaac 0.1 is an open-source, 2B-parameter model built for real-world applications. It sets a new standard for efficiency, delivering capabilities that meet or exceed those of models over 50 times its size.

Founded by the team behind Meta's Chameleon multimodal models, Perceptron is tackling a fundamental challenge: bringing the power of physical AI to the dynamic, multimodal, and real-time environments we live and work in.

Isaac 0.1 is the first in our family of models built to be the intelligence layer for the physical world. It's now available open source for researchers and developers everywhere.

## What’s new in Isaac 0.1

**Visual QA, simply trained**
Strong results on standard understanding benchmarks with a straightforward, reproducible training recipe.

**Grounded spatial intelligence**
Precise pointing and localization with robust spatial reasoning. Ask “what’s broken in this machine?” and get grounded answers with highlighted regions—handling occlusions, relationships, and object interactions.

**In-context learning for perception**
Show a few annotated examples (defects, safety conditions, etc.) in the prompt and the model adapts—no YOLO-style fine-tuning or custom detector stacks required.
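
As a sketch of what that few-shot pattern can look like at the prompt level (the interleaved image/text message schema below follows the generic multimodal chat convention and is an assumption, not Isaac's documented format):

```python
# Minimal sketch of an in-context perception prompt, assuming a generic
# interleaved image/text chat schema (not Isaac's documented format).
examples = [
    ("scratched_part.jpg", "Defect: surface scratch along the left edge."),
    ("clean_part.jpg", "No defect: part passes inspection."),
]

content = []
for image_path, annotation in examples:
    # Each annotated exemplar pairs an image with its label text.
    content.append({"type": "image", "path": image_path})
    content.append({"type": "text", "text": annotation})

# The query image comes last, with the task instruction.
content.append({"type": "image", "path": "new_part.jpg"})
content.append({"type": "text", "text": "Inspect this part the same way and point to any defect."})

messages = [{"role": "user", "content": content}]
```

The point is the shape of the prompt: annotated exemplars in context stand in for a fine-tuned detector.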

**OCR & fine-grained detail**
Reads small text and dense scenes reliably, across resolutions, with dynamic image handling for tiny features and cluttered layouts.

**Conversational pointing**
A new interaction pattern where language and vision stay in lockstep: every claim is grounded and visually cited, reducing hallucinations and making reasoning auditable.

## Benchmarks




## Example

```bash
pip install perceptron
```
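
Beyond installation, here is a minimal sketch of running the model through `transformers`; the repository id, `trust_remote_code` loading path, and prompt format are assumptions rather than the official API, so treat the example repo linked below as authoritative:

```python
# Hedged sketch: the repo id, remote-code path, and chat format are assumptions.
from PIL import Image
from transformers import AutoModelForCausalLM, AutoProcessor

model_id = "PerceptronAI/Isaac-0.1"  # hypothetical repository id
processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

image = Image.open("machine.jpg")  # any local test image
messages = [{"role": "user", "content": [
    {"type": "image"},
    {"type": "text", "text": "What's broken in this machine? Point to it."},
]}]
prompt = processor.apply_chat_template(messages, add_generation_prompt=True)
inputs = processor(text=prompt, images=[image], return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=256)
print(processor.decode(outputs[0], skip_special_tokens=True))
```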

[Huggingface Example Repo](https://github.com/perceptron-ai-inc/perceptron/tree/main/huggingface)