|
--- |
|
base_model: |
|
- saishshinde15/Clyrai_Base_Reasoning |
|
tags: |
|
- text-generation-inference |
|
- transformers |
|
- qwen2 |
|
- trl |
|
- reasoning |
|
- deepseekR1 |
|
- advanced-finetuning |
|
license: apache-2.0 |
|
language: |
|
- en |
|
pipeline_tag: text-generation |
|
--- |
|
|
|
# Clyrai Vortex Reasoning |
|
|
|
- **Developed by:** clyrai |
|
- **License:** apache-2.0 |
|
- **Fine-tuned from:** [saishshinde15/Clyrai_Base_Reasoning](https://huggingface.co/saishshinde15/TethysAI_Base_Reasoning) |
|
- **Category:** Experimental, Research |
|
|
|
## **Introduction** |
|
|
|
Clyrai Vortex Reasoning is an **experimental model** that advances the structured reasoning capabilities pioneered by [Clyrai_Base_Reasoning](https://huggingface.co/saishshinde15/TethysAI_Base_Reasoning). While the Base Reasoning model used **Group Relative Policy Optimization (GRPO)** to strengthen step-by-step logical thought processes, similar to **DeepSeek-R1**, this model takes a different approach: **it eliminates GRPO and relies instead on high-quality Supervised Fine-Tuning (SFT)**.
|
|
|
The core objective was to investigate whether **deep reasoning and self-questioning behavior could emerge purely through SFT on high-quality datasets**. The results were highly promising: the model successfully **questions itself internally**, improves reasoning depth, and consistently generates structured, logical responses. |
|
|
|
--- |
|
|
|
## **Key Features** |
|
|
|
### **1️⃣ Advanced Reasoning Without GRPO** |
|
This model **does not rely on GRPO** yet **achieves similar self-reflective thought processes**, proving that structured reasoning can be induced through **high-quality SFT alone**. |
|
|
|
### **2️⃣ Self-Questioning and Iterative Thinking** |
|
The model **actively asks itself intermediate questions before answering**, mimicking the deep **reflection-based thought process** of models like DeepSeek-R1. This leads to **more reliable** and **well-structured** responses. |
|
|
|
### **3️⃣ High-Quality SFT on a Curated Dataset** |
|
To compensate for the lack of reinforcement learning, we used an **extensive dataset** tailored for deep reasoning. This dataset includes: |
|
- **Mathematical proofs & logical puzzles** |
|
- **Complex multi-step problem-solving tasks** |
|
- **Philosophical and ethical reasoning** |
|
- **Scientific hypothesis evaluation** |
|
|
|
### **4️⃣ Implicit Use of `<think>` and `<answer>` Tokens** |
|
The model internally uses **special reasoning markers** (`<think>` and `<answer>`) to structure its responses, though these markers may not always be visible in the final output. This encourages a **consistent and methodical approach** to answering questions.
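When the markers do surface in the decoded text, they can be separated with a small helper. The sketch below is illustrative, not part of the model's tooling; it assumes the tags appear as literal `<think>…</think>` and `<answer>…</answer>` spans and falls back to treating the whole text as the answer when they are absent, since the card notes they may not always be visible.

```python
import re

def split_reasoning(text: str):
    """Split a response into (reasoning, answer) parts.

    Hypothetical helper: assumes literal <think>/<answer> tags.
    Falls back to the full text as the answer when markers are absent.
    """
    think = re.search(r"<think>(.*?)</think>", text, re.DOTALL)
    answer = re.search(r"<answer>(.*?)</answer>", text, re.DOTALL)
    reasoning = think.group(1).strip() if think else ""
    if answer is None:
        return reasoning, text.strip()
    return reasoning, answer.group(1).strip()

sample = "<think>x + 3 = 10, so x = 10 - 3.</think><answer>x = 7</answer>"
reasoning, answer = split_reasoning(sample)
print(answer)  # x = 7
```

This keeps downstream code robust whether or not the reasoning trace is emitted explicitly.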
|
|
|
### **5️⃣ Part of the Clyrai Vortex Family**
|
This model belongs to the **Clyrai Vortex series**, a collection of fine-tuned models pushing the boundaries of **SFT-based reasoning without reinforcement learning**. |
|
|
|
--- |
|
|
|
## **Breakthrough Insights** |
|
|
|
| Feature | Base Reasoning (GRPO) | Vortex Reasoning (SFT-Only) |
|
|----------------------------------|------------------------|----------------------------| |
|
| Structured Thought Process | ✅ Yes (GRPO) | ✅ Yes (SFT) | |
|
| Self-Reflection & Questioning | ✅ Strong | ✅ Equally Strong | |
|
| GRPO-Free Optimization | ❌ No | ✅ Achieved via SFT | |
|
| Step-by-Step Problem Solving | ✅ Yes | ✅ Yes | |
|
| Use of `<think>` and `<answer>` | ✅ Explicit | ✅ Implicit (Internal Use) | |
|
|
|
**Key Takeaway:** This experiment confirms that **reinforcement learning is not the only pathway to advanced reasoning capabilities**—with the right dataset and SFT strategies, models can **self-reflect and logically deduce answers** in a structured manner. |
|
|
|
--- |
|
|
|
## **How to Use** |
|
|
|
### **Running with Transformers** |
|
|
|
```python |
|
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

# Load model & tokenizer
model_name = "saishshinde15/Clyrai_Vortex_Reasoning"
device = "cuda" if torch.cuda.is_available() else "cpu"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype="auto").to(device)

# Prepare input prompt
messages = [
    {"role": "system", "content": "You are an advanced AI assistant. Provide answers in a clear, step-by-step manner."},
    {"role": "user", "content": "If x + 3 = 10, what is x?"},
]

# Apply chat template and tokenize
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to(device)

# Generate and decode only the newly generated tokens (skip the echoed prompt)
outputs = model.generate(input_ids, max_new_tokens=512)
response = tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True)

print(response)
|
``` |