This model is the stage 2 checkpoint for CLIP+DINOv2@336, one of the thirteen settings used in "Law of Vision Representation in MLLMs".

Format: Safetensors
Model size: 7.37B params
Tensor type: BF16
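
The parameter count and tensor type above give a rough lower bound on the memory needed just to hold the weights. The sketch below assumes every parameter is stored in BF16 (2 bytes each) and ignores activations, KV cache, and framework overhead; `weight_bytes` is a hypothetical helper written for this illustration, not part of any library.

```python
def weight_bytes(num_params: float, bytes_per_param: int = 2) -> float:
    """Estimate raw weight storage, assuming a uniform dtype (BF16 = 2 bytes)."""
    return num_params * bytes_per_param


NUM_PARAMS = 7.37e9  # "7.37B params" from the model card

total = weight_bytes(NUM_PARAMS)
print(f"approx. {total / 1e9:.2f} GB ({total / 2**30:.2f} GiB) for weights alone")
```

Actual usage at inference time will be higher, so treat this figure only as a starting point when choosing hardware.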
