Usage

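from transformers import AutoModel
from PIL import Image  # a bare "import PIL" does not reliably expose PIL.Image
import torch

# Load the crops to embed as PIL images.
batch_of_images = [Image.open("image1.jpg"), Image.open("image2.jpg")]
# trust_remote_code=True lets transformers run the custom model code shipped in the model repository.
# .cuda() requires a CUDA-capable GPU.
model = AutoModel.from_pretrained("ragavsachdeva/magiv2-crop-embedder", trust_remote_code=True).cuda().eval()
# Inference only, so gradient tracking is disabled.
with torch.no_grad():
    embeddings = model(batch_of_images)

print(embeddings.shape)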
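
One plausible next step is to compare the returned embeddings with each other, for example to judge whether two crops look alike. The sketch below is not part of the model's API: it assumes `embeddings` is a 2-D tensor with one row per image (the model card does not state the output shape), and `cosine_similarity_matrix` is a hypothetical helper written in plain Python so it runs without a GPU or the model itself.

```python
import math


def cosine_similarity_matrix(vectors):
    """Return pairwise cosine similarities for a list of equal-length vectors."""
    norms = [math.sqrt(sum(x * x for x in v)) for v in vectors]
    sims = []
    for i, a in enumerate(vectors):
        row = []
        for j, b in enumerate(vectors):
            dot = sum(x * y for x, y in zip(a, b))
            denom = norms[i] * norms[j]
            # Guard against zero vectors, whose cosine similarity is undefined.
            row.append(dot / denom if denom > 0 else 0.0)
        sims.append(row)
    return sims


# Stand-in values for illustration only. With the real model, assuming a
# (batch, dim) tensor, use: vectors = embeddings.cpu().tolist()
vectors = [[1.0, 0.0], [0.0, 2.0], [3.0, 0.0]]
similarity = cosine_similarity_matrix(vectors)
```

Entry `similarity[i][j]` compares image `i` with image `j`; values near 1 indicate embeddings pointing in nearly the same direction. For large batches, the same computation is faster with normalized tensors and a matrix product in torch.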

License and Citation

The provided model is available for unrestricted use in personal, research, non-commercial, and not-for-profit endeavors. For any other usage scenarios, kindly contact me via email, providing a detailed description of your requirements, to establish a tailored licensing arrangement. My contact information can be found on my website: ragavsachdeva [dot] github [dot] io

@misc{magiv2,
      title={Tails Tell Tales: Chapter-Wide Manga Transcriptions with Character Names}, 
      author={Ragav Sachdeva and Gyungin Shin and Andrew Zisserman},
      year={2024},
      eprint={2408.00298},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2408.00298}, 
}