metadata
license: apache-2.0
base_model:
- apple/aimv2-large-patch14-448
pipeline_tag: zero-shot-object-detection
tags:
- t5gemma
- vision
TMoon
- 3D positional embedding (as in the text encoder replacement, object detection and others)
- Supports multiple image aspect ratios
- Contrastive prediction model
- AIMv2 vision model
- T5Gemma text encoder
- Moondream2 dataset