ComplexityDiT - Diffusion Transformer with INL Dynamics

Diffusion Transformer enhanced with PID-style dynamics control for smoother denoising.

Architecture

Input -> [Attention -> MLP -> Dynamics] x 12 -> Output

Core equations:

  • Attention: softmax(QK^T/sqrt(d)) * V
  • MLP: W2 * GELU(W1 * x)
  • Dynamics: h += dt * gate * (alpha*v - beta*(h - mu))
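
A minimal PyTorch sketch of one block, assuming a standard pre-norm layout: v is taken to be the MLP output (the "velocity" driving the state), dt is a fixed step size, and mu, alpha, beta, gate are learnable per-channel parameters. The card lists 4 experts, but a single MLP stands in here for brevity; class and argument names are illustrative, not the repository's actual API.

import torch
import torch.nn as nn

class DiTBlockSketch(nn.Module):
    """Illustrative [Attention -> MLP -> Dynamics] block; not the repo's code."""
    def __init__(self, dim=384, heads=6, dt=0.1):
        super().__init__()
        self.norm1 = nn.LayerNorm(dim)
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm2 = nn.LayerNorm(dim)
        self.mlp = nn.Sequential(
            nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim)
        )
        self.dt = dt
        # Learnable per-channel dynamics parameters (see INL Dynamics below)
        self.mu = nn.Parameter(torch.zeros(dim))    # equilibrium
        self.alpha = nn.Parameter(torch.ones(dim))  # inertia
        self.beta = nn.Parameter(torch.ones(dim))   # correction strength
        self.gate = nn.Parameter(torch.ones(dim))   # amplitude control

    def forward(self, h):
        # h: (batch, tokens, dim)
        a = self.norm1(h)
        h = h + self.attn(a, a, a, need_weights=False)[0]
        v = self.mlp(self.norm2(h))  # assumption: v is the MLP output
        # Dynamics: h += dt * gate * (alpha*v - beta*(h - mu))
        return h + self.dt * self.gate * (self.alpha * v - self.beta * (h - self.mu))

Stacking twelve such blocks, e.g. nn.ModuleList(DiTBlockSketch() for _ in range(12)), reproduces the x 12 structure above.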

Model Details

Parameter      Value
Architecture   ComplexityDiT-S
Parameters     114M
Layers         12
Hidden dim     384
Heads          6
Experts        4
Dynamics       Enabled

Training

  • Dataset: huggan/wikiart
  • Steps: 20,000
  • Batch size: 16
  • Mixed precision: FP16
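
A sketch of the training step under these settings. Only the FP16 mechanics, step count, batch size, and dataset name come from this card; the model, data batch, loss, and learning rate below are stand-ins, since the actual training script is not included here.

import torch
import torch.nn as nn
from datasets import load_dataset

ds = load_dataset("huggan/wikiart", split="train")  # images would come from here; preprocessing omitted

model = nn.Linear(384, 384).cuda()                  # stand-in for ComplexityDiT-S
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)  # lr is an assumption
scaler = torch.cuda.amp.GradScaler()                # FP16 mixed precision, as listed

for step in range(20_000):                          # 20,000 steps
    x = torch.randn(16, 384, device="cuda")         # stand-in batch of 16
    with torch.autocast(device_type="cuda", dtype=torch.float16):
        loss = (model(x) - x).pow(2).mean()         # stand-in denoising loss
    optimizer.zero_grad(set_to_none=True)
    scaler.scale(loss).backward()
    scaler.step(optimizer)
    scaler.update()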

Usage

from safetensors.torch import load_file
from complexity_diffusion import ComplexityDiT

# Build the S configuration and load the released weights
model = ComplexityDiT.from_config('S', context_dim=768)
state_dict = load_file('model.safetensors')
model.load_state_dict(state_dict)
model.eval()  # inference mode
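
This card does not document the sampling interface, so the loop below is only a generic DDPM-style sketch: it assumes a hypothetical call model(x, t, context) that predicts noise, a linear beta schedule, and placeholder input shapes (the context width matches context_dim=768). Adapt it to the repository's actual API.

import torch

T = 1000
betas = torch.linspace(1e-4, 0.02, T)   # assumed linear schedule
alphas = 1.0 - betas
alpha_bar = torch.cumprod(alphas, dim=0)

x = torch.randn(1, 3, 64, 64)           # assumed image/latent shape
context = torch.zeros(1, 77, 768)       # assumed conditioning, context_dim=768

with torch.no_grad():
    for t in reversed(range(T)):
        eps = model(x, torch.tensor([t]), context)  # hypothetical signature
        # Standard DDPM posterior mean, then noise injection for t > 0
        x = (x - betas[t] / torch.sqrt(1 - alpha_bar[t]) * eps) / torch.sqrt(alphas[t])
        if t > 0:
            x = x + torch.sqrt(betas[t]) * torch.randn_like(x)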

INL Dynamics

The dynamics layer applies a PID-style control update, borrowed from robotics, to stabilize the denoising trajectory:

  • mu - learnable equilibrium (target position)
  • alpha - inertia (momentum)
  • beta - correction strength (spring constant)
  • gate - amplitude control

Together these act like a PID controller, producing smooth, stable trajectories that guide the model toward clean images.
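
A toy scalar run of the update (with v = 0 and illustrative constants) shows why it is stable: each step shrinks the error h - mu by the factor (1 - dt*gate*beta).

# Toy scalar version of the update with v = 0: the state relaxes toward mu.
dt, gate, alpha, beta, mu = 0.1, 1.0, 1.0, 1.0, 0.0
h, v = 5.0, 0.0
for _ in range(5):
    h += dt * gate * (alpha * v - beta * (h - mu))
    print(round(h, 3))  # ~ 4.5, 4.05, 3.645, 3.28, 2.95
# Each step multiplies the error (h - mu) by (1 - dt*gate*beta) = 0.9,
# so the discrete update is stable whenever 0 < dt*gate*beta < 2.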
