Towards Training-free Anomaly Detection with Vision and Language Foundation Models (CVPR 2025)
System Requirements
Hardware Requirements:
- GPU Memory: 32GB VRAM (for running complete experiments)
- Storage: 70GB free disk space (for models, datasets, and results)
- CUDA: GPU compatible with CUDA 12.1
Software Requirements:
- Python 3.10
- Conda (recommended for environment management)
- CUDA 12.1 runtime
Note: The memory and storage requirements above are for running the full experimental pipeline on all categories with visualization enabled. Smaller experiments on individual categories may require fewer resources.
Installation
Automated Setup (Recommended)
Run the setup script to automatically configure the complete environment:
bash scripts/setup_environment.sh
This script will:
- Create a conda environment named logsad with Python 3.10
- Install PyTorch with CUDA 12.1 support
- Install all required dependencies from requirements.txt
- Configure numpy compatibility
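For reference, a minimal sketch of what such a setup script typically looks like is shown below; the exact PyTorch packages, the CUDA wheel index, and the numpy pin are assumptions, so treat scripts/setup_environment.sh in the repository as the source of truth.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of scripts/setup_environment.sh -- package choices and pins are assumed.
set -e

# Create the conda environment with Python 3.10 and activate it
conda create -n logsad python=3.10 -y
source "$(conda info --base)/etc/profile.d/conda.sh"
conda activate logsad

# Install PyTorch built against CUDA 12.1 (standard PyTorch cu121 wheel index)
pip install torch torchvision --index-url https://download.pytorch.org/whl/cu121

# Install the remaining dependencies
pip install -r requirements.txt

# Pin numpy to a 1.x release for compatibility (assumed constraint)
pip install "numpy<2"
```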
Manual Setup
If you prefer a manual setup, download the checkpoint for the ViT-H SAM model and place it in the checkpoint folder.
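For example, the ViT-H SAM checkpoint can be fetched as sketched below; the checkpoint/ directory name and the official Segment Anything download URL are assumptions and should be checked against the repository layout.

```bash
# Sketch of the manual checkpoint download -- target folder and URL are assumed.
mkdir -p checkpoint
wget -P checkpoint https://dl.fbaipublicfiles.com/segment_anything/sam_vit_h_4b8939.pth
```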
After installation, activate the environment:
conda activate logsad
Instructions for MVTEC LOCO dataset
Quick Start (Recommended)
Run evaluation for all categories using the provided shell scripts:
Few-shot Protocol:
bash scripts/run_few_shot.sh
Full-data Protocol:
bash scripts/run_full_data.sh
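These scripts are not reproduced here, but they presumably loop over every category using the per-category commands shown in the next section. A rough, assumed sketch of what run_few_shot.sh might contain is given below (run_full_data.sh would add the coreset step); the DATASET_PATH handling and loop structure are guesses, not the actual script.

```bash
#!/usr/bin/env bash
# Assumed sketch of scripts/run_few_shot.sh -- the real script may differ.
DATASET_PATH=${1:-/path/to/mvtec_loco}   # placeholder default dataset location

for CATEGORY in breakfast_box juice_bottle pushpins screw_bag splicing_connectors; do
    python evaluation.py \
        --module_path model_ensemble_few_shot \
        --category "$CATEGORY" \
        --dataset_path "$DATASET_PATH"
done
```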
Manual Execution
Few-shot Protocol
Run the script for the few-shot protocol:
python evaluation.py --module_path model_ensemble_few_shot --category CATEGORY --dataset_path DATASET_PATH
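For instance, evaluating the breakfast_box category with a placeholder dataset path (the path below is illustrative only):

```bash
python evaluation.py --module_path model_ensemble_few_shot --category breakfast_box --dataset_path /path/to/mvtec_loco
```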
Full-data Protocol
Run the script to compute the coreset for full-data scenarios:
python compute_coreset.py --module_path model_ensemble --category CATEGORY --dataset_path DATASET_PATH
Then run the evaluation script for the full-data protocol:
python evaluation.py --module_path model_ensemble --category CATEGORY --dataset_path DATASET_PATH
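As a concrete illustration for a single category (the dataset path is again a placeholder), the full-data protocol is a two-step run: build the coreset first, then evaluate with the same module and category:

```bash
# Full-data protocol for one category; /path/to/mvtec_loco is a placeholder path.
python compute_coreset.py --module_path model_ensemble --category juice_bottle --dataset_path /path/to/mvtec_loco
python evaluation.py --module_path model_ensemble --category juice_bottle --dataset_path /path/to/mvtec_loco
```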
Available categories: breakfast_box, juice_bottle, pushpins, screw_bag, splicing_connectors
Acknowledgement
We are grateful to the following awesome projects, which we drew on when implementing LogSAD:
Citation
If you find our paper helpful in your research or applications, please cite it with:
@inproceedings{zhang2025logsad,
title={Towards Training-free Anomaly Detection with Vision and Language Foundation Models},
author={Jinjin Zhang and Guodong Wang and Yizhou Jin and Di Huang},
year={2025},
booktitle={IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
}