ZeroI2V: Zero-Cost Adaptation of Pre-trained Transformers from Image to Video (ECCV 2024)

This repo hosts the official model checkpoints for "ZeroI2V: Zero-Cost Adaptation of Pre-trained Transformers from Image to Video" (ECCV 2024).

Models

We provide the checkpoints before reparameterization. To reparameterize the weights, refer to tools\weight_reparam.py in our code.
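To illustrate what reparameterization means here, the following is a minimal sketch, assuming the adapter is a purely linear residual branch placed after a frozen linear layer. It is not the logic of tools\weight_reparam.py; the function name `merge_linear_adapter` and the matrices `W_down` and `W_up` are hypothetical and exist only for this example. The idea is that a linear adapter can be folded into the layer it follows, so the merged model has the same outputs and no extra parameters.

```python
import numpy as np

def merge_linear_adapter(W, b, W_down, W_up):
    """Fold a residual linear adapter into the preceding linear layer.

    Original computation:  y = W @ x + b;  out = y + W_up @ (W_down @ y)
    Merged computation:    out = W_merged @ x + b_merged
    (Hypothetical helper for illustration, not the repository's script.)
    """
    # (I + W_up W_down) is the total linear map applied to y.
    merge = np.eye(W.shape[0]) + W_up @ W_down
    return merge @ W, merge @ b

# Toy shapes, chosen arbitrarily for the demonstration.
rng = np.random.default_rng(0)
d_in, d_out, rank = 8, 6, 2
W = rng.standard_normal((d_out, d_in))
b = rng.standard_normal(d_out)
W_down = rng.standard_normal((rank, d_out))
W_up = rng.standard_normal((d_out, rank))
x = rng.standard_normal(d_in)

# Output with the adapter kept as a separate branch.
y = W @ x + b
before = y + W_up @ (W_down @ y)

# Output after folding the adapter into the layer weights.
W_merged, b_merged = merge_linear_adapter(W, b, W_down, W_up)
after = W_merged @ x + b_merged

print(np.allclose(before, after))
```

The merged weight has the same shape as the original one, which is consistent with the "New Param (M)" column reporting 0 in the tables below.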

Kinetics 400

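| Backbone | Pretrain | GFLOPs | Param (M) | New Param (M) | acc@1 | Views |
|----------|----------|--------|-----------|---------------|-------|-------|
| ViT-B/16 | CLIP | 422 | 86 | 0 | 83.0 | 8x1x3 |
| ViT-L/14 | CLIP | 1946 | 304 | 0 | 86.3 | 8x1x3 |
| ViT-L/14 | CLIP | 7783 | 304 | 0 | 87.2 | 32x1x3 |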

Something Something V2

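| Backbone | Pretrain | GFLOPs | Param (M) | New Param (M) | acc@1 | Views |
|----------|----------|--------|-----------|---------------|-------|-------|
| ViT-L/14 | CLIP | 7783 | 304 | 0 | 72.2 | 32x3x1 |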

If you find our work useful in your research, please cite:

@article{li2023zeroi2v,
  title={ZeroI2V: Zero-Cost Adaptation of Pre-trained Transformers from Image to Video},
  author={Li, Xinhao and Zhu, Yuhan and Wang, Limin},
  journal={arXiv preprint arXiv:2310.01324},
  year={2023}
}