CLIP ViT Base Patch32 Fine-Tuned on PatchCamelyon (PCAM)

Overview

This repository contains a version of the CLIP ViT Base Patch32 model fine-tuned on the PatchCamelyon (PCAM) dataset. The model classifies histopathology image patches as tumor or non-tumor.

Model Details

  • Base Model: openai/clip-vit-base-patch32
  • Dataset: PatchCamelyon (PCAM)
  • Fine-tuned for: Medical image classification (tumor vs. non-tumor)
  • Model size: 87.5M parameters (F32, safetensors)
  • Hardware: NVIDIA A100 GPU

Training Performance

  • Train Loss: 0.1520
  • Train Accuracy: 94.35%
  • Validation Accuracy: 95.16%

Usage

Installation

Ensure you have transformers, torch, and safetensors installed:

pip install transformers torch safetensors

Loading the Model

from transformers import CLIPProcessor, CLIPModel
import torch

# Load the fine-tuned CLIP model and its paired processor from the Hugging Face Hub.
model_path = "lens-ai/clip-vit-base-patch32_pcam_finetuned"
model = CLIPModel.from_pretrained(model_path)
processor = CLIPProcessor.from_pretrained(model_path)
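
Optionally, if a GPU is available, you can move the model there and switch it to evaluation mode (a minimal sketch; inference also works on CPU):

# Optional: run on GPU when available and disable dropout for inference.
device = "cuda" if torch.cuda.is_available() else "cpu"
model = model.to(device).eval()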

Running Inference

from PIL import Image
import torch

# Load a patch and extract its CLIP image embedding.
image = Image.open("sample_image.png").convert("RGB")
inputs = processor(images=image, return_tensors="pt")
inputs = {k: v.to(model.device) for k, v in inputs.items()}  # match the model's device
with torch.no_grad():
    image_features = model.get_image_features(**inputs)
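
get_image_features returns an image embedding rather than a class prediction. For a quick tumor vs. non-tumor score, you can compare the image against text prompts through the standard CLIP similarity head. The prompts below are illustrative assumptions, not the labels used during fine-tuning, so treat this as a sketch and adjust them to your setup:

# Zero-shot style classification via CLIP image-text similarity.
# NOTE: these prompts are assumptions for illustration only.
candidate_labels = [
    "a histopathology patch of normal tissue",
    "a histopathology patch containing tumor tissue",
]

inputs = processor(text=candidate_labels, images=image, return_tensors="pt", padding=True)
inputs = {k: v.to(model.device) for k, v in inputs.items()}
with torch.no_grad():
    outputs = model(**inputs)

probs = outputs.logits_per_image.softmax(dim=-1)  # shape: (1, 2)
for label, prob in zip(candidate_labels, probs[0].tolist()):
    print(f"{label}: {prob:.3f}")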

Evaluation

We plan to release additional metrics, including robustness evaluation under adversarial attacks, in future updates.
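
Until those metrics are published, here is a minimal accuracy sketch you can adapt. It reuses the prompts from the inference example above and assumes a hypothetical local folder layout (eval_patches/normal and eval_patches/tumor) rather than the official PCAM splits:

import os
import torch
from PIL import Image

# Hypothetical layout: eval_patches/normal/*.png and eval_patches/tumor/*.png
data_root = "eval_patches"
classes = ["normal", "tumor"]  # order must match candidate_labels above

correct, total = 0, 0
for class_idx, class_name in enumerate(classes):
    class_dir = os.path.join(data_root, class_name)
    for fname in os.listdir(class_dir):
        image = Image.open(os.path.join(class_dir, fname)).convert("RGB")
        inputs = processor(text=candidate_labels, images=image, return_tensors="pt", padding=True)
        inputs = {k: v.to(model.device) for k, v in inputs.items()}
        with torch.no_grad():
            logits = model(**inputs).logits_per_image
        correct += int(logits.argmax(dim=-1).item() == class_idx)
        total += 1

print(f"Accuracy: {correct / total:.4f}")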

License

This model is released under the MIT License.

Contact

For any questions, please reach out to Venkata Tej at LensAI.
