Qwen2.5-1.5B-DeepSeek-R1-Instruct

This model is a merged pre-trained language model created using MergeKit with the TIES merge method. It uses Qwen/Qwen2.5-1.5B-Instruct as the base and combines Qwen/Qwen2.5-1.5B and deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B with equal weight and density. The merge configuration includes normalization, int8 masking, and bfloat16 precision for optimized performance.

Merge

This is a merge of pre-trained language models created using mergekit.

Merge Method

This model was merged using the TIES merge method using Qwen/Qwen2.5-1.5B-Instruct as a base.

Models Merged

The following models were included in the merge:

Configuration

The following YAML configuration was used to produce this model:

models:
  - model: deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B
  - model: Qwen/Qwen2.5-1.5B
    parameters:
      weight: 1
      density: 1
merge_method: ties
base_model: Qwen/Qwen2.5-1.5B-Instruct
parameters:
  weight: 1
  density: 1
  normalize: true
  int8_mask: true
dtype: bfloat16
Downloads last month
362
Safetensors
Model size
1.78B params
Tensor type
BF16
ยท
Inference Providers NEW
This model is not currently available via any of the supported Inference Providers.

Model tree for prithivMLmods/Qwen2.5-1.5B-DeepSeek-R1-Instruct

Spaces using prithivMLmods/Qwen2.5-1.5B-DeepSeek-R1-Instruct 3