CLIP ViT Base Patch32 Fine-Tuned on PatchCamelyon (PCAM)

Overview

This repository contains a version of the CLIP ViT Base Patch32 model fine-tuned on the PatchCamelyon (PCAM) dataset. The model classifies histopathology image patches as tumor or non-tumor.

Model Details

  • Base Model: openai/clip-vit-base-patch32
  • Dataset: PatchCamelyon (PCAM)
  • Fine-tuned for: Medical image classification (tumor vs. non-tumor)
  • Model size: 87.5M parameters (F32, safetensors)
  • Hardware: NVIDIA A100 GPU

Training Performance

  • Train Loss: 0.1520
  • Train Accuracy: 94.35%
  • Validation Accuracy: 95.16%

Usage

Installation

Ensure you have transformers, torch, and safetensors installed:

pip install transformers torch safetensors

Loading the Model

from transformers import CLIPProcessor, CLIPModel
import torch

# Load the fine-tuned CLIP model and its paired processor from the Hugging Face Hub.
model_path = "lens-ai/clip-vit-base-patch32_pcam_finetuned"
model = CLIPModel.from_pretrained(model_path)
processor = CLIPProcessor.from_pretrained(model_path)
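
Optionally, if a GPU is available, you can move the model there and switch it to evaluation mode (a minimal sketch; inference also works on CPU):

# Optional: run on GPU when available and disable dropout for inference.
device = "cuda" if torch.cuda.is_available() else "cpu"
model = model.to(device).eval()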

Running Inference

from PIL import Image
import torch

# Load a patch and extract its CLIP image embedding.
image = Image.open("sample_image.png").convert("RGB")
inputs = processor(images=image, return_tensors="pt")
inputs = {k: v.to(model.device) for k, v in inputs.items()}  # match the model's device
with torch.no_grad():
    image_features = model.get_image_features(**inputs)
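
get_image_features returns an image embedding rather than a class prediction. For a quick tumor vs. non-tumor score, you can compare the image against text prompts through the standard CLIP similarity head. The prompts below are illustrative assumptions, not the labels used during fine-tuning, so treat this as a sketch and adjust them to your setup:

# Zero-shot style classification via CLIP image-text similarity.
# NOTE: these prompts are assumptions for illustration only.
candidate_labels = [
    "a histopathology patch of normal tissue",
    "a histopathology patch containing tumor tissue",
]

inputs = processor(text=candidate_labels, images=image, return_tensors="pt", padding=True)
inputs = {k: v.to(model.device) for k, v in inputs.items()}
with torch.no_grad():
    outputs = model(**inputs)

probs = outputs.logits_per_image.softmax(dim=-1)  # shape: (1, 2)
for label, prob in zip(candidate_labels, probs[0].tolist()):
    print(f"{label}: {prob:.3f}")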

Evaluation

We plan to release additional metrics, including robustness evaluation under adversarial attacks, in future updates.
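
Until those metrics are published, here is a minimal accuracy sketch you can adapt. It reuses the prompts from the inference example above and assumes a hypothetical local folder layout (eval_patches/normal and eval_patches/tumor) rather than the official PCAM splits:

import os
import torch
from PIL import Image

# Hypothetical layout: eval_patches/normal/*.png and eval_patches/tumor/*.png
data_root = "eval_patches"
classes = ["normal", "tumor"]  # order must match candidate_labels above

correct, total = 0, 0
for class_idx, class_name in enumerate(classes):
    class_dir = os.path.join(data_root, class_name)
    for fname in os.listdir(class_dir):
        image = Image.open(os.path.join(class_dir, fname)).convert("RGB")
        inputs = processor(text=candidate_labels, images=image, return_tensors="pt", padding=True)
        inputs = {k: v.to(model.device) for k, v in inputs.items()}
        with torch.no_grad():
            logits = model(**inputs).logits_per_image
        correct += int(logits.argmax(dim=-1).item() == class_idx)
        total += 1

print(f"Accuracy: {correct / total:.4f}")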

License

This model is released under the MIT License.

Contact

For any questions, please reach out to Venkata Tej at LensAI.
