---
library_name: transformers
pipeline_tag: feature-extraction
license: apache-2.0
tags:
  - autoencoder
  - pytorch
  - reconstruction
  - preprocessing
  - normalizing-flow
  - scaler
---

# Autoencoder Implementation for Hugging Face Transformers

A complete autoencoder implementation that integrates seamlessly with the Hugging Face Transformers ecosystem, providing all the standard functionality you expect from transformer models.

## Install-and-Use from the Hub (code repo)

If you want to use the implementation directly from the Hub code repository (without a packaged pip install), you can download the repo and add it to `sys.path`:

```python
from huggingface_hub import snapshot_download
import sys, torch

# 1) Download the code+weights for your repo "as is"
repo_dir = snapshot_download(
    repo_id="amaye15/autoencoder",
    repo_type="model",
    allow_patterns=["*.py", "config.json", "*.safetensors"],  # note the * wildcards
)

# 2) Add to import path so plain imports work
sys.path.append(repo_dir)

# 3) Import your classes from the repo code
from configuration_autoencoder import AutoencoderConfig
from modeling_autoencoder import AutoencoderForReconstruction

# 4) Load the placeholder weights from the local folder (no internet, no code refresh)
model = AutoencoderForReconstruction.from_pretrained(repo_dir)

# 5) Quick smoke test
x = torch.randn(8, 20)
out = model(input_values=x)
print("latent:", out.last_hidden_state.shape, "reconstructed:", out.reconstructed.shape)
```

## 🚀 Features

- **Full Hugging Face Integration**: Compatible with `AutoModel`, `AutoConfig`, and `AutoTokenizer` patterns
- **Standard Training Workflows**: Works with `Trainer`, `TrainingArguments`, and all HF training utilities
- **Model Hub Compatible**: Save and share models on the Hugging Face Hub with `push_to_hub()`
- **Flexible Architecture**: Configurable encoder-decoder architecture with various activation functions
- **Multiple Loss Functions**: MSE, BCE, L1, Huber, Smooth L1, KL Divergence, Cosine, Focal, Dice, Tversky, SSIM, and Perceptual loss
- **Multiple Autoencoder Types (7)**: Classic, Variational (VAE), Beta-VAE, Denoising, Sparse, Contractive, and Recurrent autoencoders
- **Extended Activation Functions**: 18+ activation functions including ReLU, GELU, Swish, Mish, ELU, and more
- **Learnable Preprocessing**: Neural Scaler, Normalizing Flow, learnable MinMax Scaler, learnable Robust Scaler, and Yeo-Johnson preprocessors (2D and 3D tensors)
- **Extensible Design**: Easy to extend with new autoencoder variants and custom loss functions
- **Production Ready**: Proper serialization, checkpointing, and inference support

## 🏗️ Architecture

The implementation consists of three main components:

### 1. AutoencoderConfig

Configuration class that inherits from `PretrainedConfig`:

- Defines model architecture parameters
- Handles validation and serialization
- Enables `AutoConfig.from_pretrained()` functionality

### 2. AutoencoderModel

Base model class that inherits from `PreTrainedModel`:

- Implements the encoder-decoder architecture
- Provides the latent space representation
- Returns structured outputs with `AutoencoderOutput`

### 3. AutoencoderForReconstruction

Task-specific model for reconstruction:

- Adds reconstruction loss calculation
- Compatible with `Trainer` for easy training
- Returns `AutoencoderForReconstructionOutput` with loss

## 🔧 Quick Start

### Basic Usage

```python
from configuration_autoencoder import AutoencoderConfig
from modeling_autoencoder import AutoencoderForReconstruction
import torch

# Create configuration
config = AutoencoderConfig(
    input_dim=784,              # Input dimensionality (e.g., 28x28 images flattened)
    hidden_dims=[512, 256],     # Encoder hidden layers
    latent_dim=64,              # Latent space dimension
    activation="gelu",          # Activation function (18+ options available)
    reconstruction_loss="mse",  # Loss function (12+ options available)
    autoencoder_type="classic", # Autoencoder type (7 types available)
    # Optional learnable preprocessing
    use_learnable_preprocessing=True,
    preprocessing_type="neural_scaler",  # or "normalizing_flow", "minmax_scaler", "robust_scaler", "yeo_johnson"
)

# Create model
model = AutoencoderForReconstruction(config)

# Forward pass
input_data = torch.randn(32, 784)  # Batch of 32 samples
outputs = model(input_values=input_data)

print(f"Reconstruction loss: {outputs.loss}")
print(f"Latent shape: {outputs.last_hidden_state.shape}")
print(f"Reconstructed shape: {outputs.reconstructed.shape}")
```

### Training with Hugging Face Trainer

```python
from transformers import Trainer, TrainingArguments
from torch.utils.data import Dataset
import torch

class AutoencoderDataset(Dataset):
    def __init__(self, data):
        self.data = torch.FloatTensor(data)

    def __len__(self):
        return len(self.data)

    def __getitem__(self, idx):
        return {
            "input_values": self.data[idx],
            "labels": self.data[idx]  # For autoencoders, input = target
        }

# Prepare data
train_dataset = AutoencoderDataset(your_training_data)
val_dataset = AutoencoderDataset(your_validation_data)

# Training arguments
training_args = TrainingArguments(
    output_dir="./autoencoder_output",
    num_train_epochs=10,
    per_device_train_batch_size=64,
    per_device_eval_batch_size=64,
    warmup_steps=500,
    weight_decay=0.01,
    logging_dir="./logs",
    evaluation_strategy="steps",  # renamed to `eval_strategy` in newer transformers releases
    eval_steps=500,
    save_steps=1000,
    load_best_model_at_end=True,
)

# Create trainer
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    eval_dataset=val_dataset,
)

# Train
trainer.train()

# Save model
model.save_pretrained("./my_autoencoder")
config.save_pretrained("./my_autoencoder")
```

### Using the AutoModel Framework

```python
from register_autoencoder import register_autoencoder_models
from transformers import AutoConfig, AutoModel

# Register models with the AutoModel framework
register_autoencoder_models()

# Now you can use standard HF patterns
config = AutoConfig.from_pretrained("./my_autoencoder")
model = AutoModel.from_pretrained("./my_autoencoder")

# Use the model
outputs = model(input_values=your_data)
```

## ⚙️ Configuration Options

The `AutoencoderConfig` class supports extensive customization:

```python
config = AutoencoderConfig(
    input_dim=784,                    # Input dimension
    hidden_dims=[512, 256, 128],      # Encoder hidden layers
    latent_dim=64,                    # Latent space dimension
    activation="gelu",                # Activation function (see full list below)
    dropout_rate=0.1,                 # Dropout rate (0.0 to 1.0)
    use_batch_norm=True,              # Use batch normalization
    tie_weights=False,                # Tie encoder/decoder weights
    reconstruction_loss="mse",        # Loss function (see full list below)
    autoencoder_type="variational",   # Autoencoder type (see types below)
    beta=0.5,                         # Beta parameter for β-VAE
    temperature=1.0,                  # Temperature for Gumbel softmax
    noise_factor=0.1,                 # Noise factor for denoising AE
    # Recurrent autoencoder parameters
    rnn_type="lstm",                  # RNN type: "lstm", "gru", "rnn"
    num_layers=2,                     # Number of RNN layers
    bidirectional=True,               # Bidirectional encoding
    sequence_length=None,             # Fixed sequence length (None for variable)
    teacher_forcing_ratio=0.5,        # Teacher forcing ratio during training
    # Learnable preprocessing parameters
    use_learnable_preprocessing=False, # Enable learnable preprocessing
    preprocessing_type="none",        # "none", "neural_scaler", "normalizing_flow",
                                      # "minmax_scaler", "robust_scaler", "yeo_johnson"
    preprocessing_hidden_dim=64,      # Hidden dimension for preprocessing networks
    preprocessing_num_layers=2,       # Number of layers in preprocessing networks
    learn_inverse_preprocessing=True, # Learn inverse transformation
    flow_coupling_layers=4,           # Number of coupling layers for flows
)
```

## 🎛️ Available Activation Functions

**Standard Activations:**

- `relu`, `leaky_relu`, `relu6`, `elu`, `prelu`
- `tanh`, `sigmoid`, `hardsigmoid`, `hardtanh`
- `gelu`, `swish`, `silu`, `hardswish`
- `mish`, `softplus`, `softsign`, `tanhshrink`, `threshold`
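As a sketch, any name from the list above should plug into the `activation` field used in Quick Start (assuming all listed names are accepted verbatim):

```python
from configuration_autoencoder import AutoencoderConfig
from modeling_autoencoder import AutoencoderForReconstruction

# Swapping activations is a one-line config change.
for act in ["relu", "gelu", "swish", "mish"]:
    config = AutoencoderConfig(input_dim=784, latent_dim=64, activation=act)
    model = AutoencoderForReconstruction(config)
```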

## 📊 Available Loss Functions

**Regression Losses:**

- `mse` - Mean Squared Error
- `l1` - L1/MAE Loss
- `huber` - Huber Loss
- `smooth_l1` - Smooth L1 Loss

**Classification/Probability Losses:**

- `bce` - Binary Cross Entropy (see the example after these lists)
- `kl_div` - KL Divergence
- `focal` - Focal Loss

**Similarity Losses:**

- `cosine` - Cosine Similarity Loss
- `ssim` - Structural Similarity Loss
- `perceptual` - Perceptual Loss

**Segmentation Losses:**

- `dice` - Dice Loss
- `tversky` - Tversky Loss
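One hedged note on pairing losses with data: `bce` assumes targets in [0, 1], so a config like the following (illustrative) fits normalized pixel data:

```python
from configuration_autoencoder import AutoencoderConfig

# "bce" assumes inputs scaled to [0, 1] (e.g., flattened, normalized images).
config = AutoencoderConfig(
    input_dim=784,
    latent_dim=64,
    reconstruction_loss="bce",
)
```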

## 🏗️ Available Autoencoder Types

### Classic Autoencoder (`classic`)

- Standard encoder-decoder architecture
- Direct reconstruction loss minimization

### Variational Autoencoder (`variational`)

- Probabilistic latent space with mean and variance
- KL divergence regularization
- Reparameterization trick for sampling (sketched below)
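For reference, a minimal sketch of the reparameterization trick as it is usually written (illustrative; the repo's VAE code may differ):

```python
import torch

def reparameterize(mu: torch.Tensor, logvar: torch.Tensor) -> torch.Tensor:
    # Draw z ~ N(mu, sigma^2) differentiably: sample eps ~ N(0, I), then shift/scale.
    std = torch.exp(0.5 * logvar)  # log-variance -> standard deviation
    eps = torch.randn_like(std)
    return mu + eps * std
```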

### Beta-VAE (`beta_vae`)

- Variational autoencoder with an adjustable β parameter
- Better disentanglement of latent factors

### Denoising Autoencoder (`denoising`)

- Adds noise to the input during training
- Learns robust representations
- Configurable noise factor (sketched below)
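A minimal sketch of the usual corruption step (illustrative; assumes `noise_factor` scales additive Gaussian noise, which may differ from the repo's exact scheme):

```python
import torch

def corrupt(x: torch.Tensor, noise_factor: float = 0.1) -> torch.Tensor:
    # Add scaled Gaussian noise; the model is trained to reconstruct the clean x.
    return x + noise_factor * torch.randn_like(x)
```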

### Sparse Autoencoder (`sparse`)

- Encourages sparse latent representations
- L1 regularization on latent activations (sketched below)
- Useful for feature selection
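A hedged sketch of how the L1 term is commonly folded into the total loss (`sparsity_weight` is a hypothetical knob for illustration, not a documented config field):

```python
import torch

def sparse_total_loss(reconstruction_loss: torch.Tensor,
                      latent: torch.Tensor,
                      sparsity_weight: float = 1e-3) -> torch.Tensor:
    # The L1 penalty pushes most latent activations toward zero.
    return reconstruction_loss + sparsity_weight * latent.abs().mean()
```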

### Contractive Autoencoder (`contractive`)

- Penalizes large gradients of the latent code w.r.t. the input (sketched below)
- Learns smooth manifold representations
- Robust to small input perturbations
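For intuition, a sketch of the contractive penalty as the squared Frobenius norm of the Jacobian of the latent code w.r.t. the input, accumulated one latent unit at a time with autograd (straightforward but unoptimized; the repo's implementation may differ):

```python
import torch

def contractive_penalty(latent: torch.Tensor, inputs: torch.Tensor) -> torch.Tensor:
    # inputs must be created with requires_grad=True before the forward pass.
    penalty = inputs.new_zeros(())
    for j in range(latent.shape[1]):
        grads, = torch.autograd.grad(
            latent[:, j].sum(), inputs, create_graph=True, retain_graph=True
        )
        penalty = penalty + (grads ** 2).sum()
    return penalty
```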

### Recurrent Autoencoder (`recurrent`)

- LSTM/GRU/RNN encoder-decoder architecture
- Bidirectional encoding for better sequence representations
- Variable-length sequence support with padding
- Teacher forcing during training for stable learning
- Sequence-to-sequence reconstruction

## 📊 Model Outputs

### AutoencoderOutput

The base model `AutoencoderModel` returns the following output:

```python
from dataclasses import dataclass
from typing import Tuple
import torch
from transformers.utils import ModelOutput

@dataclass
class AutoencoderOutput(ModelOutput):
    last_hidden_state: torch.FloatTensor = None    # Latent representation
    reconstructed: torch.FloatTensor = None        # Reconstructed input
    hidden_states: Tuple[torch.FloatTensor] = None # Intermediate states
    attentions: Tuple[torch.FloatTensor] = None    # Not used
```

### AutoencoderForReconstructionOutput

```python
@dataclass
class AutoencoderForReconstructionOutput(ModelOutput):
    loss: torch.FloatTensor = None                 # Reconstruction loss
    reconstructed: torch.FloatTensor = None        # Reconstructed input
    last_hidden_state: torch.FloatTensor = None    # Latent representation
    hidden_states: Tuple[torch.FloatTensor] = None # Intermediate states
```

## 🔬 Advanced Usage

### Custom Loss Functions

You can easily extend the model with custom loss functions:

```python
class CustomAutoencoder(AutoencoderForReconstruction):
    def _compute_reconstruction_loss(self, reconstructed, target):
        # Custom loss implementation
        return your_custom_loss(reconstructed, target)
```
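For example, a concrete subclass using a log-cosh reconstruction error (an illustrative choice, assuming the override hook shown above):

```python
import torch

class LogCoshAutoencoder(AutoencoderForReconstruction):
    def _compute_reconstruction_loss(self, reconstructed, target):
        # Log-cosh behaves like MSE for small errors and like L1 for large ones.
        return torch.log(torch.cosh(reconstructed - target)).mean()
```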

### Recurrent Autoencoder for Sequences

Perfect for time series, text, and sequential data:

```python
config = AutoencoderConfig(
    input_dim=50,              # Feature dimension per timestep
    latent_dim=32,             # Compressed representation size
    autoencoder_type="recurrent",
    rnn_type="lstm",           # or "gru", "rnn"
    num_layers=2,              # Number of RNN layers
    bidirectional=True,        # Bidirectional encoding
    teacher_forcing_ratio=0.7, # Teacher forcing during training
    sequence_length=None       # Variable length sequences
)

# Usage with sequence data: (batch, seq_len, input_dim)
model = AutoencoderForReconstruction(config)
sequence_data = torch.randn(16, 100, 50)  # e.g., batch of 16, 100 timesteps, 50 features
outputs = model(input_values=sequence_data)
```

### Learnable Preprocessing

Deep learning-based data normalization that adapts to your data:

```python
# Neural Scaler - Learnable alternative to StandardScaler
config = AutoencoderConfig(
    input_dim=20,
    latent_dim=10,
    use_learnable_preprocessing=True,
    preprocessing_type="neural_scaler",
    preprocessing_hidden_dim=64
)

# Normalizing Flow - Invertible transformations
config = AutoencoderConfig(
    input_dim=20,
    latent_dim=10,
    use_learnable_preprocessing=True,
    preprocessing_type="normalizing_flow",
    flow_coupling_layers=4
)

# Works with all autoencoder types and sequence data
model = AutoencoderForReconstruction(config)
outputs = model(input_values=data)
print(f"Preprocessing loss: {outputs.preprocessing_loss}")

# Learnable MinMax Scaler - scales to [0, 1] with learnable bounds
config = AutoencoderConfig(
    input_dim=20,
    latent_dim=10,
    use_learnable_preprocessing=True,
    preprocessing_type="minmax_scaler",
)

# Learnable Robust Scaler - robust to outliers using median/IQR
config = AutoencoderConfig(
    input_dim=20,
    latent_dim=10,
    use_learnable_preprocessing=True,
    preprocessing_type="robust_scaler",
)

# Learnable Yeo-Johnson - power transform for skewed distributions
config = AutoencoderConfig(
    input_dim=20,
    latent_dim=10,
    use_learnable_preprocessing=True,
    preprocessing_type="yeo_johnson",
)
```

### Variational Autoencoder Extension

The configuration supports variational autoencoders:

```python
config = AutoencoderConfig(
    input_dim=784,
    latent_dim=64,
    autoencoder_type="variational",
    beta=0.5,  # β-VAE parameter
    # ... other parameters
)
```

### Integration with the Datasets Library

```python
from datasets import Dataset

# Convert your data to an HF Dataset
dataset = Dataset.from_dict({
    "input_values": your_data_list
})

# Use with Trainer
trainer = Trainer(
    model=model,
    train_dataset=dataset,
    # ... other arguments
)
```

## 📁 Project Structure

```
autoencoder/
├── __init__.py                    # Package initialization
├── configuration_autoencoder.py   # Configuration class
├── modeling_autoencoder.py        # Model implementations
├── register_autoencoder.py        # AutoModel registration
├── pyproject.toml                 # Project metadata and dependencies
└── README.md                      # This file
```

## 🤝 Contributing

This implementation follows Hugging Face conventions and can be easily extended:

1. **Adding new architectures**: Extend `AutoencoderModel` or create new model classes
2. **Custom configurations**: Add parameters to `AutoencoderConfig`
3. **Task-specific heads**: Create new classes like `AutoencoderForReconstruction`
4. **Integration**: Register new models with the `AutoModel` framework


## 🎯 Use Cases

This autoencoder implementation is perfect for:

- **Dimensionality Reduction**: Compress high-dimensional data to lower dimensions
- **Anomaly Detection**: Identify outliers based on reconstruction error (see the sketch after this list)
- **Data Denoising**: Remove noise from corrupted data
- **Feature Learning**: Learn meaningful representations for downstream tasks
- **Data Generation**: Generate new samples similar to training data
- **Pretraining**: Initialize encoders for other tasks
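For the anomaly-detection case, a minimal sketch that scores samples by reconstruction error (assuming a trained `model` with the API shown in Quick Start; the 3-sigma cutoff is illustrative):

```python
import torch

model.eval()
with torch.no_grad():
    batch = torch.randn(256, 784)  # stand-in for real data
    out = model(input_values=batch)
    errors = ((out.reconstructed - batch) ** 2).mean(dim=1)  # per-sample MSE

threshold = errors.mean() + 3 * errors.std()  # simple cutoff; tune on held-out data
anomalies = errors > threshold
print(f"Flagged {int(anomalies.sum())} of {len(errors)} samples")
```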

## 🔍 Model Comparison

| Feature | Standard PyTorch | This Implementation |
|---|---|---|
| HF Integration | ❌ | ✅ |
| AutoModel Support | ❌ | ✅ |
| Trainer Compatible | ❌ | ✅ |
| Hub Integration | ❌ | ✅ |
| Config Management | Manual | ✅ Automatic |
| Serialization | Manual | ✅ Built-in |
| Checkpointing | Manual | ✅ Built-in |

## 🚀 Performance Tips

1. **Batch Size**: Use larger batch sizes for better GPU utilization
2. **Learning Rate**: Start with 1e-3 and adjust based on convergence (see the sketch after this list)
3. **Architecture**: Gradually decrease hidden dimensions for better compression
4. **Regularization**: Use dropout and batch normalization for better generalization
5. **Loss Function**: Choose an appropriate loss for your data type
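A sketch putting several of these tips together (the values are starting points, not benchmarks):

```python
from transformers import TrainingArguments
from configuration_autoencoder import AutoencoderConfig

# Funnel-shaped encoder (tip 3), dropout + batch norm (tip 4).
config = AutoencoderConfig(
    input_dim=784,
    hidden_dims=[512, 256, 128],
    latent_dim=64,
    dropout_rate=0.1,
    use_batch_norm=True,
)
# Larger batches (tip 1) and a 1e-3 starting learning rate (tip 2).
training_args = TrainingArguments(
    output_dir="./autoencoder_output",
    per_device_train_batch_size=128,
    learning_rate=1e-3,
)
```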

## 📄 License

This implementation is provided as an example and follows the same license terms as Hugging Face Transformers.