ViG Model Card

Model Details

ViG is a generic backbone for vision tasks, trained on the ImageNet-1K dataset.

  • Developed by: HUST, Horizon Robotics
  • Model type: A generic vision backbone based on the Gated Linear Attention (GLA) architecture.
  • License: Non-commercial license
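For readers unfamiliar with the model type named above, the following is a minimal sketch of the recurrent form of gated linear attention for a single head, written in NumPy. It is an illustration of the generic GLA recurrence, not ViG's actual implementation, and the card itself does not describe how the model applies GLA to images. The function name `gated_linear_attention` and the argument layout are assumptions made for this example.

```python
import numpy as np

def gated_linear_attention(q, k, v, alpha):
    """Recurrent form of gated linear attention for one head (illustrative only).

    q, k, alpha: arrays of shape (T, d_k); v: array of shape (T, d_v).
    alpha holds per-step, per-channel forget gates, typically in (0, 1).
    """
    T, d_k = q.shape
    d_v = v.shape[1]
    state = np.zeros((d_k, d_v))   # fixed-size memory, independent of T
    out = np.empty((T, d_v))
    for t in range(T):
        # Decay each row of the state by its gate, then add the new key-value outer product.
        state = alpha[t][:, None] * state + np.outer(k[t], v[t])
        # Read from the state with the current query.
        out[t] = q[t] @ state
    return out
```

Because the state has a fixed size, the cost grows linearly with the sequence length T. With all gates set to 1 the recurrence reduces to plain causal linear attention.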

Model Sources

Uses

The primary use of ViG is research on vision tasks, e.g., classification, segmentation, detection, and instance segmentation, with a GLA-based backbone. The primary intended users of the model are researchers and hobbyists in computer vision, machine learning, and artificial intelligence.

Training Details

ViG is pretrained on ImageNet-1K with classification supervision. The training data consists of around 1.3M images from the ImageNet-1K dataset. See the paper (arXiv:2405.18425) for more details.

Evaluation

ViG is evaluated on the ImageNet-1K val set; more details can be found in the paper (arXiv:2405.18425).
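This card does not name the evaluation metric. Assuming the usual ImageNet-1K classification metric, top-1 accuracy, a minimal sketch of the computation could look like the following. The function name `top1_accuracy` and the toy inputs are hypothetical and are not taken from the ViG code base.

```python
import numpy as np

def top1_accuracy(logits, labels):
    """Fraction of samples whose highest-scoring class equals the label.

    logits: array of shape (N, num_classes); labels: integer array of shape (N,).
    """
    preds = np.argmax(logits, axis=1)
    return float(np.mean(preds == labels))

# Toy usage with made-up scores for three samples and two classes.
toy_logits = np.array([[0.1, 0.9], [0.8, 0.2], [0.3, 0.7]])
toy_labels = np.array([1, 0, 0])
toy_acc = top1_accuracy(toy_logits, toy_labels)
```

In a real evaluation the logits would come from running the model over the 50,000 validation images; the toy arrays here only show the expected shapes.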

Additional Information

Citation Information

@article{vig,
  title={ViG: Linear-complexity Visual Sequence Learning with Gated Linear Attention},
  author={Bencheng Liao and Xinggang Wang and Lianghui Zhu and Qian Zhang and Chang Huang},
  journal={arXiv preprint arXiv:2405.18425},
  year={2024}
}