Example Prompts

  • A studio portrait of a woman with bold makeup, highly detailed eyelashes, and glossy red lips, dramatic directional lighting creating high contrast shadows, perfect catchlights in the eyes, smooth skin texture without blemishes
  • A dramatic full-body shot of a ballet dancer tying her pointe shoes backstage, perfect detail in the satin ribbons, diffused warm light from dressing room bulbs, realistic texture in tutu layers and skin
  • A detailed environmental portrait of a scientist in a high-tech lab, subtle reflections on safety goggles, complex machinery softly blurred in the background, crisp rendering of lab coat fabric and illuminated screens
  • A sharply detailed full-body shot of a male dancer mid-leap against a minimalist concrete backdrop, muscles and sweat droplets rendered with photorealistic clarity, dramatic side lighting creating bold shadows, sense of motion frozen in time, 8K resolution
  • A lifelike environmental portrait of an elderly man sitting on a wooden bench in a sun-dappled park, finely rendered wrinkles and silver hair strands, warm afternoon light filtering through leaves, natural pastel color palette, shallow depth of field
  • A photorealistic street-style portrait of a man leaning against a graffiti wall, crisp textures of leather jacket and denim, moody overcast lighting, vibrant but natural color tones, shallow focus on his face with bokeh city lights behind
  • A detailed lifestyle portrait of a mother and child playing in a sunlit living room, lifelike skin tones, soft natural window light, intricate details in clothing textures and hair, emotive candid moment captured with a fast shutter look
  • A hyper-realistic editorial-style shot of a man in a tailored suit walking along a rain-slicked street, reflections on the wet pavement, overcast sky mood, crisp drops of rain frozen mid-air, rich tonal contrast
  • A cinematic close-up of a male violinist playing on stage, wood grain of the instrument and fine hairs on his bow tie rendered in sharp detail, warm spotlight glow, rich color depth, black background fading into darkness

flux.1-lite-8B-GRPO (DanceGRPO + PickScore)

A reward-aligned transformer for Flux 1‑Lite 8B, fine-tuned with DanceGRPO using PickScore as the reward signal.
This repository hosts the FluxTransformer2DModel weights only; load them on top of the original Freepik/flux.1-lite-8B base pipeline.

Highlights

  • 🔁 RL post-training (DanceGRPO) for preference alignment
  • 🏆 PickScore as the reward model to boost aesthetics and prompt faithfulness
  • ⚡ Drop-in replacement for the base model’s transformer in Diffusers

Quick Start (Diffusers)

import torch
from diffusers.pipelines.flux.pipeline_flux import FluxPipeline
from diffusers.models.transformers.transformer_flux import FluxTransformer2DModel

# 1) Load the base Flux 1-Lite pipeline
base_model = "Freepik/flux.1-lite-8B"
pipe = FluxPipeline.from_pretrained(base_model, torch_dtype=torch.bfloat16).to("cuda")

# 2) Swap in the GRPO-tuned transformer from this repo
grpo_transformer = "Owen777/flux.1-lite-8B-GRPO"
transformer = FluxTransformer2DModel.from_pretrained(grpo_transformer, torch_dtype=torch.bfloat16).to("cuda")
pipe.transformer = transformer  # replace the base transformer in place

# 3) Generate an image
prompt = "A studio portrait of a woman with bold makeup, highly detailed eyelashes, and glossy red lips, dramatic directional lighting creating high contrast shadows, perfect catchlights in the eyes, smooth skin texture without blemishes"
image = pipe(
    prompt,
    num_inference_steps=50,
    guidance_scale=2.5,
    height=1024,
    width=1024,
    generator=torch.Generator(device="cuda").manual_seed(42),
).images[0]
image.save("flux_1_lite_grpo.png")

Tips

  • Start with num_inference_steps=28–60, guidance_scale=2.5–4.5.
  • Use BF16 on modern NVIDIA GPUs (A100/H100/H200/4090).
  • For strict reproducibility, set a fixed generator seed; a short seed-sweep sketch follows these tips.
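
As a concrete example of the ranges and seeding advice above, the sketch below reuses `pipe` and `torch` from the Quick Start and renders the same prompt across a few fixed seeds; the settings and file names are illustrative, not tuned recommendations.

# Assumes `pipe` (Quick Start) is already loaded on CUDA with the GRPO transformer.
prompt = "A cinematic close-up of a male violinist playing on stage, warm spotlight glow"
for seed in (0, 1, 2, 3):
    generator = torch.Generator(device="cuda").manual_seed(seed)
    image = pipe(
        prompt,
        num_inference_steps=40,
        guidance_scale=3.5,
        height=1024,
        width=1024,
        generator=generator,
    ).images[0]
    image.save(f"violinist_seed{seed}.png")  # one file per seed, for side-by-side comparison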

What’s in this repo?

  • Weights: FluxTransformer2DModel only (no VAE, no tokenizer, no text encoder).
  • How to use: Instantiate the base Flux 1‑Lite 8B pipeline and replace its transformer with ours (see Quick Start).

Training Summary

  • Base model: Freepik/flux.1-lite-8B
  • Algorithm: DanceGRPO (a GRPO-style preference optimization method)
  • Reward model: PickScore (image–text aesthetic & preference scoring); a minimal scoring sketch follows
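
For reference, here is a minimal sketch of how candidate images can be scored with PickScore via transformers. It follows the publicly released PickScore usage; the model IDs belong to the PickScore authors, not to this repository, and the exact scoring used during training may differ.

# Illustrative PickScore reward scoring (not this repo's training code).
import torch
from transformers import AutoProcessor, AutoModel

device = "cuda"
processor = AutoProcessor.from_pretrained("laion/CLIP-ViT-H-14-laion2B-s32B-b79K")
reward_model = AutoModel.from_pretrained("yuvalkirstain/PickScore_v1").eval().to(device)

def pickscore(prompt, images):
    # Embed the prompt and candidate PIL images with the CLIP-style reward model,
    # then use the scaled cosine similarity as the per-image reward.
    image_inputs = processor(images=images, return_tensors="pt").to(device)
    text_inputs = processor(text=prompt, padding=True, truncation=True,
                            max_length=77, return_tensors="pt").to(device)
    with torch.no_grad():
        img = reward_model.get_image_features(**image_inputs)
        txt = reward_model.get_text_features(**text_inputs)
        img = img / img.norm(dim=-1, keepdim=True)
        txt = txt / txt.norm(dim=-1, keepdim=True)
        return (reward_model.logit_scale.exp() * (txt @ img.T))[0]  # shape: (num_images,)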

Objective (high level)

We frame post-training as preference optimization: for a batch of prompts, multiple candidates are generated and scored by PickScore. DanceGRPO then updates the policy (the transformer) to increase the relative likelihood of higher-scoring samples while decreasing it for lower-scoring ones. This tends to:

  • Improve aesthetic quality favored by PickScore,
  • Encourage closer adherence to prompt intent, and
  • Stabilize detail rendition at higher resolutions.

Note: Dataset specifics, compute budget, and exact hyperparameters are not disclosed here. The method is model-agnostic and can be applied to other Flux variants.
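
To make the update direction concrete, the following is a minimal sketch of the group-relative advantage computation that GRPO-style methods build on; the full DanceGRPO loop (trajectory sampling, log-probability estimation over the denoising steps, clipping) is intentionally omitted, and the numbers are made up.

import torch

def group_relative_advantages(rewards: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    # rewards: (num_prompts, group_size) PickScore values, one row per prompt
    # and one column per candidate image generated from that prompt.
    mean = rewards.mean(dim=1, keepdim=True)
    std = rewards.std(dim=1, keepdim=True)
    # Above-mean candidates get positive advantages (their likelihood is pushed up);
    # below-mean candidates get negative advantages (pushed down).
    return (rewards - mean) / (std + eps)

# Toy example: 2 prompts x 4 candidates each
advantages = group_relative_advantages(
    torch.tensor([[0.21, 0.27, 0.19, 0.25],
                  [0.30, 0.28, 0.33, 0.29]])
)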


Inference Recommendations

  • Resolution: 768–1024 on the long side is a good default.
  • Guidance: Lower guidance can look more natural; higher guidance is more literal but may reduce diversity.
  • Seeds: Fix a seed for deterministic results; sweep seeds for variety.
  • Batching: Use smaller batches when memory is tight; enable torch.backends.cuda.matmul.allow_tf32 = True if your environment supports it (see the snippet below).
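
A small optional setup snippet for the points above; enable_model_cpu_offload is a standard Diffusers memory-saving helper and is only worth enabling when VRAM is the bottleneck.

import torch

# Allow TF32 matmuls on Ampere-or-newer GPUs (ignored elsewhere).
torch.backends.cuda.matmul.allow_tf32 = True

# Optional: trade speed for lower peak VRAM by offloading idle submodules to CPU.
# pipe.enable_model_cpu_offload()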

Limitations & Notes

  • Reward alignment reflects what PickScore prefers; it may over-optimize for aesthetics in some edge cases.
  • The model may still struggle with complex spatial compositing, tiny text, or very long prompts.
  • Safety and content filtering should be handled upstream (prompting) and/or downstream (moderation) according to your use case.

Compatibility

  • Diffusers Flux pipeline with transformer swap-in (as shown above).
  • Torch dtype: BF16 preferred on modern NVIDIA GPUs (a fallback sketch follows this list).
  • OS/Env: Standard PyTorch CUDA environments; no custom CUDA ops are required for basic inference.
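
If you are unsure whether your GPU supports BF16, a simple fallback looks like this (a sketch reusing FluxPipeline from the Quick Start imports; adjust to your environment):

import torch

dtype = (torch.bfloat16
         if torch.cuda.is_available() and torch.cuda.is_bf16_supported()
         else torch.float16)
pipe = FluxPipeline.from_pretrained("Freepik/flux.1-lite-8B", torch_dtype=dtype).to("cuda")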

Ethical & Responsible Use

This model inherits capabilities and risks from the base Flux family. You are responsible for complying with local laws, platform policies, and the license of the base model. Avoid generating harmful, biased, or misleading content. Apply additional filters if deploying to end users.


License

Usage of these weights inherits the license and usage restrictions of:

  • the base model (Freepik/flux.1-lite-8B), and
  • the reward model used during alignment (PickScore).

Review those licenses before commercial use or redistribution.

Acknowledgements

  • Freepik / Flux team for the original flux.1-lite-8B.
  • PickScore authors for open-sourcing a practical preference signal.
  • The broader community for tools in Diffusers & PyTorch that make rapid iteration possible.

FAQ

Q: Why only the transformer?
A: Shipping only the transformer keeps downloads light and makes it easy to swap into existing Flux pipelines.

Q: Do I need to change schedulers or tokenizers?
A: No. Use the same components shipped with the base Flux 1‑Lite 8B pipeline.
