metadata
license: apache-2.0
datasets:
- prithivMLmods/Deepfake-vs-Real
language:
- en
base_model:
- prithivMLmods/Deepfake-Detection-Exp-02-21
pipeline_tag: image-classification
library_name: transformers
tags:
- Deepfake
Deepfake-Detection-Exp-02-21-ONNX
Deepfake-Detection-Exp-02-21 is a ViT-based image classification model, trained on a minimalist, high-quality dataset, that distinguishes deepfake images from real ones. The model is based on Google's google/vit-base-patch16-224-in21k.
Mapping of IDs to Labels: {0: 'Deepfake', 1: 'Real'}
Mapping of Labels to IDs: {'Deepfake': 0, 'Real': 1}
Classification report:

              precision    recall  f1-score   support

    Deepfake     0.9962    0.9806    0.9883      1600
        Real     0.9809    0.9962    0.9885      1600

    accuracy                         0.9884      3200
   macro avg     0.9886    0.9884    0.9884      3200
weighted avg     0.9886    0.9884    0.9884      3200
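As a sanity check on the report, the per-class counts it implies can be recovered from recall and support. The sketch below is only a reconstruction: the counts are inferred by rounding and are not stated in the model card.

```python
# Values copied from the classification report.
support = {"Deepfake": 1600, "Real": 1600}
recall = {"Deepfake": 0.9806, "Real": 0.9962}

# Correctly classified images per class, inferred by rounding recall * support.
correct = {name: round(recall[name] * support[name]) for name in support}
# Images of each class that were assigned the other label.
missed = {name: support[name] - correct[name] for name in support}


def precision(name, other):
    # Everything predicted as `name` = its correct hits + the other class's misses.
    return correct[name] / (correct[name] + missed[other])


accuracy = sum(correct.values()) / sum(support.values())
p_fake = precision("Deepfake", "Real")
p_real = precision("Real", "Deepfake")

print("correct:", correct)
print("missed:", missed)
print("accuracy:", round(accuracy, 4))
print("precision:", round(p_fake, 4), round(p_real, 4))
```

Under this reconstruction, about 31 of the 1600 deepfake images were labeled Real and about 6 of the 1600 real images were labeled Deepfake, which reproduces the reported accuracy and precision figures.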
Inference with Hugging Face Pipeline
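import torch
from transformers import pipeline
# Load the model (GPU 0 if CUDA is available, otherwise CPU)
device = 0 if torch.cuda.is_available() else -1
pipe = pipeline("image-classification", model="prithivMLmods/Deepfake-Detection-Exp-02-21", device=device)
# Predict on an image; the result is a list of {"label", "score"} dicts
result = pipe("path_to_image.jpg")
print(result)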
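The pipeline returns a list of dictionaries, each with a `label` and a `score`. A minimal sketch of turning that list into a single decision might look like the following. The `summarize` helper, the 0.9 threshold, and the example scores are all hypothetical and do not come from the model card.

```python
def summarize(results, threshold=0.9):
    """Return (label, score, needs_review) for one image's pipeline output.

    `results` is the list of {"label": ..., "score": ...} dicts that the
    image-classification pipeline returns for a single image.
    `threshold` is a hypothetical cutoff, not a value from the model card.
    """
    best = max(results, key=lambda r: r["score"])
    needs_review = best["score"] < threshold
    return best["label"], best["score"], needs_review


# Hypothetical output, shaped like what pipe("path_to_image.jpg") returns.
example = [
    {"label": "Deepfake", "score": 0.83},
    {"label": "Real", "score": 0.17},
]
print(summarize(example))
```

Low-confidence predictions are flagged rather than trusted outright, so they can be routed to a human reviewer.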
Inference with PyTorch
from transformers import ViTForImageClassification, ViTImageProcessor
from PIL import Image
import torch
# Load the model and processor
model = ViTForImageClassification.from_pretrained("prithivMLmods/Deepfake-Detection-Exp-02-21")
processor = ViTImageProcessor.from_pretrained("prithivMLmods/Deepfake-Detection-Exp-02-21")
# Load and preprocess the image
image = Image.open("path_to_image.jpg").convert("RGB")
inputs = processor(images=image, return_tensors="pt")
# Perform inference without tracking gradients
with torch.no_grad():
    outputs = model(**inputs)
    logits = outputs.logits
    predicted_class = torch.argmax(logits, dim=1).item()
# Map class index to label
label = model.config.id2label[predicted_class]
print(f"Predicted Label: {label}")
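The argmax above yields only the winning class. If a confidence score is also wanted, the logits can be passed through a softmax. The sketch below uses plain Python so it runs without the model, and the logit values are hypothetical stand-ins for `logits[0].tolist()`.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a flat list of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Label mapping from the model card.
id2label = {0: "Deepfake", 1: "Real"}

# Hypothetical logits, standing in for logits[0].tolist().
example_logits = [2.0, -1.0]
probs = softmax(example_logits)
for idx, p in enumerate(probs):
    print(f"{id2label[idx]}: {p:.4f}")
```

On the tensor itself, `torch.softmax(logits, dim=1)` computes the same probabilities.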
Limitations
- Generalization Issues: The model may not perform well on deepfake images generated by unseen or novel deepfake techniques.
- Dataset Bias: The training data might not cover all variations of real and fake images, leading to biased predictions.
- Resolution Constraints: Since the model is based on vit-base-patch16-224-in21k, it is optimized for 224x224 image resolution, which may limit its effectiveness on high-resolution images.
- Adversarial Vulnerabilities: The model may be susceptible to adversarial attacks designed to fool vision transformers.
- False Positives & False Negatives: The model may occasionally misclassify real images as deepfake and vice versa, requiring human validation in critical applications.
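To make the resolution constraint concrete, the arithmetic below works out how many patches the model sees and how much detail a large image loses when it is shrunk to the model's input size. It assumes the processor resizes the whole image to 224x224 without cropping, which is an assumption about the default preprocessing rather than something the model card states. The 1920x1080 input is hypothetical.

```python
PATCH = 16        # patch size, from "patch16" in the base model name
MODEL_SIDE = 224  # input resolution, from "224" in the base model name

patches_per_side = MODEL_SIDE // PATCH
num_patches = patches_per_side ** 2


def downscale_factors(width, height, side=MODEL_SIDE):
    """Source pixels collapsed into one model pixel along each axis."""
    return width / side, height / side


# Hypothetical 1920x1080 input image.
fx, fy = downscale_factors(1920, 1080)
print("patches:", num_patches)
print("downscale:", round(fx, 2), round(fy, 2))
```

Under that assumption, each pixel the model sees summarizes a block of roughly 8.6 by 4.8 source pixels, so fine-grained manipulation artifacts in a high-resolution image may be averaged away before classification.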
Intended Use
- Deepfake Detection: Designed for identifying deepfake images in media, social platforms, and forensic analysis.
- Research & Development: Useful for researchers studying deepfake detection and improving ViT-based classification models.
- Content Moderation: Can be integrated into platforms to detect and flag manipulated images.
- Security & Forensics: Assists in cybersecurity applications where verifying the authenticity of images is crucial.
- Educational Purposes: Can be used in training AI practitioners and students in the field of computer vision and deepfake detection.