Model Card for Stefan Zweig Language Model

This model is a fine-tuned version of ibm-granite/granite-3.1-2b-instruct. It has been trained using TRL.

Model Details

This model is designed to emulate Stefan Zweig's distinctive writing and conversational style in chat format. The fine-tuning approach is based on the methodology described in the DeepSeek-V3 technical report and uses a two-stage training process: Supervised Fine-Tuning (SFT) followed by Group Relative Policy Optimization (GRPO).

Quick start

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# Load the fine-tuned model and tokenizer
device = "cuda" if torch.cuda.is_available() else "cpu"
model = AutoModelForCausalLM.from_pretrained("Chan-Y/Stefan-Zweig-Granite", device_map=device)
tokenizer = AutoTokenizer.from_pretrained("Chan-Y/Stefan-Zweig-Granite")

input_text = "As an experienced and famous writer Stefan Zweig, what's your opinion on artificial intelligence?"
inputs = tokenizer(input_text, return_tensors="pt").to(device)

# Sample a completion (max_length counts prompt plus generated tokens)
with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_length=512,
        num_return_sequences=1,
        do_sample=True,
        temperature=0.7,
        top_p=0.9,
    )

# Decode and strip the echoed prompt from the output
generated_text = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(generated_text.split(input_text)[-1])
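
Since the model was trained on chat-formatted conversations, prompting through the tokenizer's chat template (inherited from the Granite base model) may yield more consistent results. A minimal sketch, assuming the tokenizer ships a chat template:

messages = [
    {"role": "user", "content": "As Stefan Zweig, what's your opinion on artificial intelligence?"}
]
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
).to(device)

with torch.no_grad():
    outputs = model.generate(input_ids, max_new_tokens=512, do_sample=True, temperature=0.7, top_p=0.9)

# Decode only the newly generated tokens, skipping the prompt
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))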

Training procedure

(Figure: training loss curve)

  • Dataset: Custom synthetic dataset generated using argilla/synthetic-data-generator with Qwen2.5:14b

  • Data Format: Structured conversations with specific role markers and custom tokens

  • Data Processing: Implementation of special tokens and role markers for style consistency

  • Training Type: Two-stage training pipeline (a minimal sketch follows this list)

    1. Supervised Fine-Tuning (SFT)
    2. Group Relative Policy Optimization (GRPO)
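
The sketch below outlines the two stages with TRL's SFTTrainer and GRPOTrainer. The dataset ids, reward function, and hyperparameters are illustrative assumptions, not the actual training configuration, and the GRPOTrainer API shown follows recent TRL releases, which may differ slightly from the 0.14.0.dev0 build listed below.

from datasets import load_dataset
from trl import SFTConfig, SFTTrainer, GRPOConfig, GRPOTrainer

# Hypothetical dataset ids; the real synthetic dataset is not reproduced here
sft_data = load_dataset("path/to/zweig-conversations", split="train")  # "messages" column
grpo_data = load_dataset("path/to/zweig-prompts", split="train")       # "prompt" column

# Stage 1: Supervised Fine-Tuning on the chat-formatted conversations
sft_trainer = SFTTrainer(
    model="ibm-granite/granite-3.1-2b-instruct",
    args=SFTConfig(output_dir="zweig-sft", num_train_epochs=1),
    train_dataset=sft_data,
)
sft_trainer.train()
sft_trainer.save_model("zweig-sft")

# Placeholder reward: a real run would score stylistic consistency,
# e.g. with a classifier trained to recognize Zweig's prose
def style_reward(completions, **kwargs):
    return [min(len(c) / 200.0, 1.0) for c in completions]

# Stage 2: Group Relative Policy Optimization on top of the SFT checkpoint
grpo_trainer = GRPOTrainer(
    model="zweig-sft",
    reward_funcs=style_reward,
    args=GRPOConfig(output_dir="zweig-grpo"),
    train_dataset=grpo_data,
)
grpo_trainer.train()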

Framework versions

  • TRL: 0.14.0.dev0
  • Transformers: 4.48.1
  • PyTorch: 2.5.1+cu124
  • Datasets: 3.2.0
  • Tokenizers: 0.21.0