# WeDLM-7B

WeDLM-7B is a diffusion language model that performs parallel decoding under standard causal attention, initialized from Qwen2.5-7B.

This is the base (pretrained) version. For the instruction-tuned version, see WeDLM-7B-Instruct.

📄 Paper (Coming Soon) | 🌐 Project Page | 💻 GitHub

## Model Details

| Attribute | Value |
|---|---|
| Initialized From | Qwen2.5-7B |
| Parameters | 7B |
| Context Length | 32,768 |

## Quick Start (Recommended)

For fast inference, use the wedlm engine:

```bash
pip install git+https://github.com/tencent/WeDLM.git
```

```python
from wedlm import LLM, SamplingParams

llm = LLM(model="tencent/WeDLM-7B")

prompt = "The theory of relativity states that"
outputs = llm.generate([prompt], SamplingParams(temperature=0.2, max_tokens=256))

print(outputs[0]["text"])
```
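The `temperature=0.2` in `SamplingParams` controls output randomness. As a generic illustration of standard temperature scaling (not a claim about wedlm's internals), a sampler divides logits by the temperature before the softmax, which concentrates probability on the highest-scoring tokens:

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of logits.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]                         # toy logits for three candidate tokens
p_default = softmax([x / 1.0 for x in logits])   # temperature 1.0
p_sharp = softmax([x / 0.2 for x in logits])     # temperature 0.2

# Lower temperature moves probability mass onto the highest-logit token.
assert p_sharp[0] > p_default[0]
```

At temperature 0.2 the top token's probability rises from roughly 0.63 to over 0.99 in this toy example, which is why low temperatures give near-deterministic completions.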

## HuggingFace Transformers

For training or simple forward passes, the model can be loaded via Transformers:

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("tencent/WeDLM-7B", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    "tencent/WeDLM-7B",
    trust_remote_code=True,
    torch_dtype="auto",
    device_map="auto"
)

inputs = tokenizer("The theory of relativity", return_tensors="pt").to(model.device)
outputs = model(**inputs)
```
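The forward pass above returns per-position logits over the vocabulary; a greedy next-token choice is an argmax at the last sequence position. A minimal pure-Python stand-in (toy shapes and a placeholder vocab size, so no model download is needed):

```python
import random

random.seed(0)
SEQ_LEN, VOCAB = 5, 100  # toy sizes; the real vocab size comes from the tokenizer

# Stand-in for outputs.logits[0]: one row of logits per input position.
logits = [[random.gauss(0.0, 1.0) for _ in range(VOCAB)] for _ in range(SEQ_LEN)]

# Greedy continuation: pick the highest-logit token id at the last position;
# tokenizer.decode([next_id]) would map it back to text.
last_row = logits[-1]
next_id = max(range(VOCAB), key=last_row.__getitem__)
assert 0 <= next_id < VOCAB
```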

⚠️ Note: The HuggingFace interface is for training/forward pass convenience. For optimized inference throughput, use the wedlm engine above.

## Performance

| Benchmark | Qwen2.5-7B | WeDLM-7B |
|---|---|---|
| ARC-C (0-shot) | 89.93 | 90.70 |
| GSM8K (3-shot) | 79.23 | 84.76 |
| MATH (4-shot) | 43.40 | 48.20 |
| HumanEval (4-shot) | 59.14 | 68.90 |
| MMLU (5-shot) | 71.62 | 71.93 |

## Citation

```bibtex
@article{liu2025wedlm,
  title={WeDLM: Reconciling Diffusion Language Models with Standard Causal Attention for Fast Inference},
  author={Liu, Aiwei and He, Minghua and Zeng, Shaoxun and Zhang, Linhao and Wu, Chuhan and Jia, Wei and Liu, Yuan and Yu, Yang and Zhou, Xiao and Zhou, Jie},
  year={2025}
}
```

## License

Apache 2.0
