RabotniKuma/Fast-Math-R1-14B

@@ -16,23 +16,53 @@ which achieves up to 60% (on average approx. 30%) faster inference while maintai
 Technical details can be found in [Kaggle Discussion](https://www.kaggle.com/competitions/ai-mathematical-olympiad-progress-prize-2/discussion/571252) and [Github](https://github.com/analokmaus/kaggle-aimo2-fast-math-r1).
-<img src="https://www.googleapis.com/download/storage/v1/b/kaggle-forum-message-attachments/o/inbox%2F1973217%2F4f221ab914f3e950fa35bdab5723d462%2Fpass1_aime_all.png?generation=1744851665782759&alt=media" max-height="300px">
-|                              |              | AIME 2024        |               | AIME 2025        |               |
-| ---------------------------- | ------------ | ---------------- | ------------- | ---------------- | ------------- |
-| Model                        | Token budget | Pass@1 (avg. 64) | Output tokens | Pass@1 (avg. 64) | Output tokens |
-| DeepSeek-R1-Distill-Qwen-14B | 16384        | 63.3             | 9590          | 46.7             | 10602         |
-|                              | 12800        | 58               | 8632          | 41.9             | 9363          |
-|                              | 8192         | 45.6             | 6638          | 30.6             | 6897          |
-| Light-R1-14B-DS              | 16384        | **66.8**             | 10146         | **51.3**             | 11308         |
-|                              | 12800        | 59.2             | 9110          | 43.8             | 9834          |
-|                              | 8192         | 42.4             | 7020          | 30.4             | 7124          |
-| Fast-Math-R1-14B             | 16384        | 66               | **7932**          | 49.2             | **9066**          |
-|                              | 12800        | **63**               | **7449**          | **46.1**             | **8282**          |
-|                              | 8192         | **51.4**             | **5963**          | **37.2**             | **6256**          |
-| Fast-Math-R1-14B-SFT Only    | 16384        | 65.2             | 10268         | 49.7             | 11264         |
-|                              | 12800        | 57.2             | 9180          | 42.8             | 9805          |
-|                              | 8192         | 41.3             | 7015          | 30.1             | 7074          |
 # Dataset
@@ -61,7 +91,7 @@ sampling_params = SamplingParams(
     top_p=0.90,
     min_p=0.05,
     max_tokens=8192,
-    stop='</think>',  # Important!: early stop at </think> to save output tokens
 )
 messages = [
     {

 Technical details can be found in [Kaggle Discussion](https://www.kaggle.com/competitions/ai-mathematical-olympiad-progress-prize-2/discussion/571252) and [Github](https://github.com/analokmaus/kaggle-aimo2-fast-math-r1).
+# Evaluation
+<img src="https://github.com/analokmaus/kaggle-aimo2-fast-math-r1/blob/master/assets/pass1_aime_all.png?raw=true" max-height="400px">
+## DS-R1-Qwen-14B vs Fast-Math-R1-14B (Ours)
+|                              |              | AIME 2024        |                    | AIME 2025        |                    |
+| ---------------------------- | ------------ | ---------------- | ------------------ | ---------------- | ------------------ |
+| Model                        | Token budget | Pass@1 (avg. 64) | Mean output tokens | Pass@1 (avg. 64) | Mean output tokens |
+| DeepSeek-R1-Distill-Qwen-14B | 32000        | 66.9             | 11026              | 49.9             | 12310              |
+|                              | 24000        | 65.7             | 10784              | 49.7             | 11978              |
+|                              | 16000        | 61               | 9708               | 46.2             | 10567              |
+|                              | 12000        | 53.7             | 8472               | 39.9             | 9008               |
+|                              | 8000         | 41.8             | 6587               | 31.1             | 6788               |
+| Fast-Math-R1-14B             | 32000        | 68               | 8217               | 49.6             | 9663               |
+|                              | 24000        | 67.9             | 8209               | 49.6             | 9627               |
+|                              | 16000        | 66.7             | 8017               | 48.4             | 9083               |
+|                              | 12000        | 61.9             | 7362               | 45.2             | 8048               |
+|                              | 8000         | 51.4             | 5939               | 36.3             | 6174               |
+## OpenMath-Nemotron-14B vs Fast-OpenMath-Nemotron-14B (Ours)
+|                            |              | AIME 2024        |                    | AIME 2025        |                    |
+| -------------------------- | ------------ | ---------------- | ------------------ | ---------------- | ------------------ |
+| Model                      | Token budget | Pass@1 (avg. 64) | Mean output tokens | Pass@1 (avg. 64) | Mean output tokens |
+| OpenMath-Nemotron-14B      | 32000        | 76.2             | 11493              | 64.5             | 13414              |
+|                            | 24000        | 75.4             | 11417              | 63.4             | 13046              |
+|                            | 16000        | 66               | 10399              | 54.2             | 11422              |
+|                            | 12000        | 55               | 9053               | 40               | 9609               |
+|                            | 8000         | 36               | 6978               | 27.2             | 7083               |
+| [Fast-OpenMath-Nemotron-14B](https://huggingface.co/RabotniKuma/Fast-OpenMath-Nemotron-14B) | 32000        | 70.7             | 9603               | 61.4             | 11424              |
+|                            | 24000        | 70.6             | 9567               | 60.9             | 11271              |
+|                            | 16000        | 66.6             | 8954               | 55.3             | 10190              |
+|                            | 12000        | 59.4             | 7927               | 45.6             | 8752               |
+|                            | 8000         | 47.6             | 6282               | 33.8             | 6589               |
+## Qwen3-14B vs Fast-Math-Qwen3-14B (Ours)
+|                     |              | AIME 2024        |                    | AIME 2025        |                    |
+| ------------------- | ------------ | ---------------- | ------------------ | ---------------- | ------------------ |
+| Model               | Token budget | Pass@1 (avg. 64) | Mean output tokens | Pass@1 (avg. 64) | Mean output tokens |
+| Qwen3-14B           | 32000        | 79.3             | 13669              | 69.5             | 16481              |
+|                     | 24000        | 75.9             | 13168              | 65.6             | 15235              |
+|                     | 16000        | 64.5             | 11351              | 50.4             | 12522              |
+|                     | 12000        | 49.7             | 9746               | 36.3             | 10353              |
+|                     | 8000         | 28.4             | 7374               | 19.5             | 7485               |
+| [Fast-Math-Qwen3-14B](https://huggingface.co/RabotniKuma/Fast-Math-Qwen3-14B) | 32000        | 77.6             | 9740               | 66.6             | 12281              |
+|                     | 24000        | 76.5             | 9634               | 65.3             | 11847              |
+|                     | 16000        | 72.6             | 8793               | 60.1             | 10195              |
+|                     | 12000        | 65.1             | 7775               | 49.4             | 8733               |
+|                     | 8000         | 50.7             | 6260               | 36               | 6618               |
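As a reading aid for the tables above, the following minimal sketch computes how much each fast variant shortens its mean output relative to its baseline at the 32000 token budget. The numbers are copied from the tables; the names `ROWS_32K` and `token_reduction_pct` are hypothetical helpers introduced only for this illustration. Note that a reduction in output tokens is not the same quantity as an inference speedup, so these percentages should not be read as the speed figures quoted elsewhere in the card.

```python
# Mean output tokens at a 32000 token budget, copied from the tables above.
# Layout: comparison -> benchmark -> (baseline tokens, fast-variant tokens).
# ROWS_32K and token_reduction_pct are illustrative names, not part of the model card.
ROWS_32K = {
    "DeepSeek-R1-Distill-Qwen-14B vs Fast-Math-R1-14B": {
        "AIME 2024": (11026, 8217),
        "AIME 2025": (12310, 9663),
    },
    "OpenMath-Nemotron-14B vs Fast-OpenMath-Nemotron-14B": {
        "AIME 2024": (11493, 9603),
        "AIME 2025": (13414, 11424),
    },
    "Qwen3-14B vs Fast-Math-Qwen3-14B": {
        "AIME 2024": (13669, 9740),
        "AIME 2025": (16481, 12281),
    },
}


def token_reduction_pct(baseline_tokens: float, fast_tokens: float) -> float:
    """Return the relative reduction in mean output tokens, in percent."""
    if baseline_tokens <= 0:
        raise ValueError("baseline_tokens must be positive")
    return 100.0 * (baseline_tokens - fast_tokens) / baseline_tokens


if __name__ == "__main__":
    for comparison, benchmarks in ROWS_32K.items():
        print(comparison)
        for benchmark, (baseline, fast) in benchmarks.items():
            reduction = token_reduction_pct(baseline, fast)
            print(f"  {benchmark}: {reduction:.1f}% fewer output tokens")
```

The same function can be applied to any other budget row by substituting the corresponding pair of token counts from the tables.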
 # Dataset
     top_p=0.90,
     min_p=0.05,
     max_tokens=8192,
+    stop='</think>',  # Recommended for even faster inference: stop early at the </think> tag and extract the final boxed answer.
 )
 messages = [
     {