
Image Similarity Search Engine

A deep learning-based image similarity search engine that uses EfficientNetB0 for feature extraction and FAISS for fast similarity search. The application provides a web interface built with Streamlit for easy interaction.

Features

  • Deep Feature Extraction: Uses EfficientNetB0 (pre-trained on ImageNet) to extract meaningful features from images
  • Fast Similarity Search: Implements FAISS for efficient nearest-neighbor search
  • Interactive Web Interface: User-friendly interface built with Streamlit
  • Real-time Processing: Shows progress and time estimates during feature extraction
  • Scalable Architecture: Designed to handle large image datasets efficiently

Installation

Prerequisites

  • Python 3.8 or higher
  • pip package manager

Setup

  1. Clone the repository:
git clone https://github.com/yourusername/image-similarity-search.git
cd image-similarity-search
  2. Create and activate a virtual environment:
python -m venv venv
source venv/bin/activate  # On Windows use: venv\Scripts\activate
  3. Install required packages:
pip install -r requirements.txt

Project Structure

image-similarity-search/
├── data/
│   ├── images/                     # Directory for train dataset images
│   ├── sample-test-images/         # Directory for test dataset images
│   └── embeddings.pkl              # Pre-computed image embeddings
├── src/
│   ├── feature_extractor.py    # EfficientNetB0 feature extraction
│   ├── preprocessing.py        # Image preprocessing and embedding computation
│   ├── similarity_search.py    # FAISS-based similarity search
│   └── main.py                 # Streamlit web interface
├── requirements.txt
├── README.md
└── .gitignore

Usage

  1. Prepare Your Dataset: Download the training image dataset from Google Drive:
https://drive.google.com/file/d/1U2PljA7NE57jcSSzPs21ZurdIPXdYZtN/view?usp=drive_link

Place your image dataset in the data/images directory. Supported formats: JPG, JPEG, PNG.

  2. Generate Embeddings:
python -m src.preprocessing

This will:

  • Process all images in the dataset
  • Show progress and time estimates
  • Save embeddings to data/embeddings.pkl

  3. Run the Web Interface:
streamlit run src/main.py

  4. Using the Interface:
  • Upload a query image using the file uploader
  • Click "Search Similar Images"
  • View the most similar images from your dataset
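
The embedding-generation step above can be pictured with the following minimal sketch. It is not the code in src/preprocessing.py: the function names, the recursive directory scan, and the {path: vector} pickle layout are assumptions made for illustration, and embed_fn stands in for whatever feature extractor is used.

```python
import pickle
import time
from pathlib import Path

# Formats listed under "Prepare Your Dataset".
SUPPORTED_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def list_images(image_dir):
    """Return sorted paths of supported image files under image_dir."""
    return sorted(
        p for p in Path(image_dir).rglob("*")  # recursive scan is an assumption
        if p.is_file() and p.suffix.lower() in SUPPORTED_EXTENSIONS
    )


def build_embeddings(image_dir, embed_fn, output_path):
    """Embed every image and pickle a {path: vector} dict (hypothetical layout)."""
    paths = list_images(image_dir)
    embeddings = {}
    start = time.time()
    for i, path in enumerate(paths, start=1):
        embeddings[str(path)] = embed_fn(path)
        # Simple time estimate: average time per image times images left.
        elapsed = time.time() - start
        remaining = elapsed / i * (len(paths) - i)
        print(f"{i}/{len(paths)} done, ~{remaining:.0f}s remaining")
    with open(output_path, "wb") as f:
        pickle.dump(embeddings, f)
    return embeddings
```

Calling build_embeddings("data/images", embed_fn, "data/embeddings.pkl") with a real extractor as embed_fn would mirror the three bullets listed for that step.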

Technical Details

Feature Extraction

  • Uses EfficientNetB0 without top layers
  • Input image size: 224x224 pixels
  • Output feature dimension: 1280
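
A sketch of how such an extractor might be built with the Keras applications API is shown here. It is an illustration, not the contents of src/feature_extractor.py: the function names are hypothetical, and global average pooling is an assumption (it is one way to obtain a 1280-dimensional vector from the headless network).

```python
INPUT_SIZE = (224, 224)  # width and height fed to the network
FEATURE_DIM = 1280       # length of the pooled EfficientNetB0 feature vector


def build_extractor(weights="imagenet"):
    """Build EfficientNetB0 without its classification head."""
    # Imported lazily so this module can be imported before TensorFlow loads.
    import tensorflow as tf

    return tf.keras.applications.EfficientNetB0(
        include_top=False,              # drop the ImageNet classifier layers
        weights=weights,
        input_shape=INPUT_SIZE + (3,),
        pooling="avg",                  # assumption: average pooling -> (batch, 1280)
    )


def extract_features(model, image_path):
    """Return a float32 vector of length FEATURE_DIM for one image."""
    import numpy as np
    from PIL import Image

    image = Image.open(image_path).convert("RGB").resize(INPUT_SIZE)
    # Keras EfficientNet models rescale inputs internally, so raw 0-255
    # pixel values are passed in unchanged.
    batch = np.asarray(image, dtype=np.float32)[np.newaxis, ...]
    features = model.predict(batch, verbose=0)
    return features[0].astype(np.float32)
```

Building the model once and reusing it for every image avoids reloading the weights on each call.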

Similarity Search

  • Uses FAISS IndexFlatL2 for L2 distance-based search
  • Returns top-k most similar images (default k=5)
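
To make the search step concrete, the sketch below pairs a pure-Python brute-force search with the equivalent FAISS calls. IndexFlatL2 performs an exact, exhaustive comparison and reports squared L2 distances, so smaller values mean more similar images. The function names are hypothetical and this is not the code in src/similarity_search.py.

```python
import heapq


def squared_l2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def brute_force_search(query, vectors, k=5):
    """Return (distance, index) pairs for the k vectors nearest to query."""
    scored = ((squared_l2(query, v), i) for i, v in enumerate(vectors))
    return heapq.nsmallest(k, scored)


def faiss_search(query, vectors, k=5):
    """Same search using FAISS; requires faiss and numpy to be installed."""
    import faiss
    import numpy as np

    data = np.ascontiguousarray(vectors, dtype=np.float32)  # FAISS expects float32
    index = faiss.IndexFlatL2(data.shape[1])                # exact L2 index
    index.add(data)
    q = np.ascontiguousarray([query], dtype=np.float32)
    distances, indices = index.search(q, k)
    return list(zip(distances[0].tolist(), indices[0].tolist()))
```

For a handful of vectors the two functions should agree on the ranking; FAISS becomes worthwhile as the dataset grows, because the comparison runs in optimized native code.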

Web Interface

  • Responsive design with Streamlit
  • Displays original and similar images with similarity scores
  • Progress tracking during processing

Dependencies

  • TensorFlow 2.x
  • FAISS-cpu (or FAISS-gpu for GPU support)
  • Streamlit
  • Pillow
  • NumPy
  • tqdm

Performance

  • Feature extraction: ~1 second per image on CPU
  • Similarity search: Near real-time for datasets up to 100k images
  • Memory usage depends on dataset size (approximately 5KB per image embedding)
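
The 5KB figure follows from the feature dimension, assuming each of the 1280 values is stored as a 4-byte float32 (the type FAISS works with). A small helper, with a hypothetical name, makes the arithmetic explicit:

```python
def embedding_memory_bytes(num_images, dim=1280, bytes_per_value=4):
    """Raw size of the embedding matrix, ignoring container overhead."""
    return num_images * dim * bytes_per_value


per_image = embedding_memory_bytes(1)    # 1280 * 4 = 5120 bytes
total = embedding_memory_bytes(100_000)  # 512,000,000 bytes
print(f"{per_image / 1024:.0f} KiB per image, "
      f"{total / 1024**2:.0f} MiB for 100k images")
# prints "5 KiB per image, 488 MiB for 100k images"
```

Pickle and index overhead add to this, so the result is a lower bound on actual memory use.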