SparseLLM/BlockFFN-Large


Raincleared committed on Jul 14

Commit a8325ae · verified · 1 Parent(s): 78fce82

Update README.md

Files changed (1)

README.md +15 -1


@@ -11,4 +11,18 @@ pipeline_tag: text-generation
 This is the original 0.8B BlockFFN checkpoint used in the paper *BlockFFN: Towards End-Side Acceleration-Friendly Mixture-of-Experts with Chunk-Level Activation Sparsity* for acceleration tests.
 You can load and use this model simply by using `AutoTokenizer` and `AutoModelForCausalLM`.
-Links: [[Paper](https://arxiv.org/pdf/2507.08771)] [[Codes](https://github.com/thunlp/BlockFFN)]

+Links: [[Paper](https://arxiv.org/pdf/2507.08771)] [[Codes](https://github.com/thunlp/BlockFFN)]
+### Citation
+If you find our work useful for your research, please kindly cite our paper as follows:
+```
+@article{song2025blockffn,
+      title={{BlockFFN}: Towards End-Side Acceleration-Friendly Mixture-of-Experts with Chunk-Level Activation Sparsity},
+      author={Chenyang Song and Weilin Zhao and Xu Han and Chaojun Xiao and Yingfa Chen and Yuxuan Li and Zhiyuan Liu and Maosong Sun},
+      journal={arXiv preprint arXiv:2507.08771},
+      year={2025},
+      url={https://arxiv.org/pdf/2507.08771},
+}
+```
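
The README says the checkpoint can be loaded with `AutoTokenizer` and `AutoModelForCausalLM`. A minimal sketch of that, not part of this commit, might look like the following. It assumes the repo id `SparseLLM/BlockFFN-Large` (taken from the page header), that `transformers` and `torch` are installed, and that the checkpoint may ship custom modeling code, which is why `trust_remote_code=True` is passed; that flag is an assumption and may be unnecessary. The helper names `load_blockffn` and `generate` are hypothetical.

```python
# Repo id as shown in the page header; adjust if you use a local path.
MODEL_ID = "SparseLLM/BlockFFN-Large"


def load_blockffn(model_id: str = MODEL_ID, trust_remote_code: bool = True):
    """Return (tokenizer, model) for the given Hugging Face repo id.

    trust_remote_code=True is an assumption: it is only needed if the
    repository provides its own modeling code.
    """
    # Imported lazily so that importing this module does not load transformers.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(
        model_id, trust_remote_code=trust_remote_code
    )
    model = AutoModelForCausalLM.from_pretrained(
        model_id, trust_remote_code=trust_remote_code
    )
    model.eval()  # inference mode: disables dropout
    return tokenizer, model


def generate(prompt: str, max_new_tokens: int = 32) -> str:
    """Generate a continuation of `prompt` and return the decoded text."""
    tokenizer, model = load_blockffn()
    inputs = tokenizer(prompt, return_tensors="pt")  # PyTorch tensors
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate("Hello")` downloads the checkpoint on first use, so it needs network access and enough memory for a 0.8B-parameter model.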