Can we teach a model to think completely on its own without reinforcement learning? Actually, yes.
We can do straightforward supervised fine-tuning with a relatively simple trick: blurring part of the chain-of-thought (CoT). But why is this effective?
We observed that different models differ in their thinking processes, and fine-tuning one model on another model's thoughts (CoT) can sometimes be inefficient, often resulting in the model simply memorizing the reasoning rather than learning how to actually think.
I discovered that this process can still be efficient if we clearly mark when the model should start and stop thinking, and then uncover only part of the CoT together with the expected answer while blurring the rest of the CoT. This way, the model learns only a portion of the thought process yet still has to arrive at the expected answer.
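Roughly, the data preparation looks like this (a simplified Python sketch of the idea, not my exact training code; the <think> delimiters, 50% blur ratio, and GPT-2 tokenizer are only for illustration):

```python
# Blurred-thoughts SFT data prep (illustrative sketch): wrap the CoT in
# explicit think delimiters, supervise the answer in full, but hide a
# random slice of the CoT from the loss so only part of the reasoning
# is learned.
import random
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")  # stand-in for the target model's tokenizer

def build_example(prompt: str, cot: str, answer: str, blur_ratio: float = 0.5):
    prompt_ids = tokenizer(prompt, add_special_tokens=False).input_ids
    open_ids   = tokenizer("<think>", add_special_tokens=False).input_ids
    cot_ids    = tokenizer(cot, add_special_tokens=False).input_ids
    close_ids  = tokenizer("</think>", add_special_tokens=False).input_ids
    answer_ids = tokenizer(answer, add_special_tokens=False).input_ids

    input_ids = prompt_ids + open_ids + cot_ids + close_ids + answer_ids
    labels = list(input_ids)

    # Never compute loss on the prompt itself.
    for i in range(len(prompt_ids)):
        labels[i] = -100

    # "Blur" a random contiguous slice of the CoT by dropping it from the
    # loss (-100 is ignored by PyTorch's cross-entropy). The think delimiters
    # and the answer stay fully supervised, so the model still learns when to
    # start and stop thinking and what to answer.
    blur_len = int(len(cot_ids) * blur_ratio)
    if blur_len:
        start = len(prompt_ids) + len(open_ids) + random.randrange(len(cot_ids) - blur_len + 1)
        for i in range(start, start + blur_len):
            labels[i] = -100

    return {"input_ids": input_ids, "labels": labels}
```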
To see this in action, check out my experimental BT-SFT (blurred-thoughts SFT) of the meditsolutions/Llama-3.2-SUN-2.5B-chat model, which was fine-tuned on 151 million tokens from the Magpie-Align/Magpie-Reasoning-V2-250K-CoT-Deepseek-R1-Llama-70B dataset.
Enjoy! 🚀
PS. If you were curious enough to read this far, leave me a comment. It's always nice to chat with open-minded and intelligent people.
FLUX Tools Complete Tutorial with SwarmUI (as easy as Automatic1111 or Forge): Outpainting, Inpainting, Redux Style Transfer + Re-Imagine + Combining Multiple Images, Depth and Canny. More info in the oldest comment. No paywall: https://youtu.be/hewDdVJEqOQ
FLUX.1 Tools by BlackForestLabs changed the #AI field forever, making them the number 1 open-source community provider after this massive release. In this tutorial, I show you step by step how to use the FLUX.1 Fill model (an inpainting model) to do perfect inpainting and outpainting (yes, this model handles outpainting too). Moreover, I show all the features of the FLUX Redux model for style transfer and re-imagining, including combining more than one image. Furthermore, I show step by step how to convert an input image into Depth or Canny maps and then how to use them with the #FLUX Depth and Canny models, covering both the LoRA and full-checkpoint versions of each.
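If you prefer scripting to a UI, the same Fill model can also be driven from Python with Hugging Face diffusers. A rough sketch, separate from the video's SwarmUI workflow (input.png and mask.png are placeholder paths):

```python
# FLUX.1 Fill via diffusers (sketch, not the video's workflow). White
# pixels in the mask mark the region to repaint; for outpainting, pad the
# image and mask the padded border instead.
import torch
from diffusers import FluxFillPipeline
from diffusers.utils import load_image

pipe = FluxFillPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-Fill-dev", torch_dtype=torch.bfloat16
).to("cuda")

result = pipe(
    prompt="a cozy cabin in a snowy forest",  # what the masked region should become
    image=load_image("input.png"),            # placeholder input image
    mask_image=load_image("mask.png"),        # placeholder mask image
    guidance_scale=30,                        # the Fill model is tuned for high guidance
    num_inference_steps=50,
).images[0]
result.save("filled.png")
```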
Preparing this tutorial took more than a week, and it should be the easiest one to follow because it is made with the famous #SwarmUI. SwarmUI is as easy to use and as advanced as the Automatic1111 SD Web UI. Its biggest advantage is that it uses ComfyUI as a back-end, so it is extremely fast, VRAM-optimized, and supports all of the newest SOTA models as soon as they are published.
So in this tutorial I show you how to set up SwarmUI and the FLUX Dev tools on your Windows computer, Massed Compute, RunPod, and Kaggle. I explain everything step by step and share all the tips and tricks you need to properly do style transfer, re-imagining, inpainting, outpainting, depth, and canny with FLUX.
I trained this model on a new spot I'm really excited to share (soon!)
This Monday I will be posting my first beginning-to-end blog post showing the tools I used, the dataset, captioning techniques, and the parameters to fine-tune this LoRA.