Hello everyone, am I wrong, or is zero-shot classification not really working? Take a look at the provided example: we replaced the original image with another one and added the text "golf" to the text corpus. With other examples and different backbones, zero-shot classification also does not seem to work.