Add library name and pipeline tag

#1 by nielsr - opened
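
This PR adds `library_name: transformers` and `pipeline_tag: text-generation` to the model card metadata. The library tag lets the Hub surface the standard Transformers usage snippet for these checkpoints, and the pipeline tag makes them discoverable under the text-generation filter. As a minimal sketch of what the new metadata advertises (assuming the checkpoint loads through the standard text-generation pipeline, as the README's compatibility claims suggest):

```python
# Minimal sketch of the usage enabled by `pipeline_tag: text-generation`;
# assumes the checkpoint is loadable via the standard Transformers pipeline.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="dmis-lab/OSP-1.4B-1T-Muon-SSNorm-EmbProj",
)
out = generator("Outlier-safe pre-training is", max_new_tokens=32)
print(out[0]["generated_text"])
```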
Files changed (1): README.md (+3 -4)
@@ -3,7 +3,10 @@ datasets:
 - HuggingFaceTB/smollm-corpus
 language:
 - en
+library_name: transformers
+pipeline_tag: text-generation
 ---
+
 # Outlier-Safe Pre-Training
 
 [![arXiv](https://img.shields.io/badge/arXiv-2506.19697-b31b1b?style=flat-square)](https://arxiv.org/abs/2506.19697)
@@ -25,8 +28,6 @@ A method that prevents outliers but significantly reduces efficiency is unlikely
 3. 🧩**Ensuring full compatibility with existing inference pipelines**<br/>
 We prioritize compatibility with widely adopted inference frameworks such as vLLM and SGLang. Rather than introducing architectural changes that break compatibility, OSP preserves computational invariance, allowing models to be directly integrated into existing pipelines without additional effort.
 
-
-
 ## Model Checkpoints
 
 ### Final Models
@@ -36,7 +37,6 @@ The models were trained on 1 trillion tokens, following the pre-training recipe
 - [🤗 OSP-1.4B-1T-Adam](https://huggingface.co/dmis-lab/OSP-1.4B-1T-Adam): Trained on the standard Adam optimizer, without any modifications.
 - [🤗 OSP-1.4B-1T-Muon-SSNorm-EmbProj](https://huggingface.co/dmis-lab/OSP-1.4B-1T-Muon-SSNorm-EmbProj): Trained on the OSP framework. This is our final model.
 
-
 ### Ablation Models
 
 <table>
@@ -177,7 +177,6 @@ The models were trained on 1 trillion tokens, following the pre-training recipe
 </table>
 &dagger;Model configuration that disables decoupled embedding optimization by training with Muon optimizer without Adam optimization on embedding layers
 
-
 ## Training
 
 ### Model
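
Beyond the metadata, the README's claim that OSP models drop into existing inference stacks can be illustrated with vLLM, one of the frameworks it names. A hedged sketch, assuming vLLM accepts the checkpoint through its standard entry point as the card's compatibility claim implies:

```python
# Hedged sketch of the vLLM path the README mentions; assumes the checkpoint
# is directly consumable by vLLM, per the card's compatibility claim.
from vllm import LLM, SamplingParams

llm = LLM(model="dmis-lab/OSP-1.4B-1T-Muon-SSNorm-EmbProj")
params = SamplingParams(temperature=0.8, max_tokens=32)
outputs = llm.generate(["Outlier-safe pre-training is"], params)
print(outputs[0].outputs[0].text)
```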