The Stochastic Parrot on LLM's Shoulder: A Summative Assessment of Physical Concept Understanding Paper • 2502.08946 • Published 11 days ago • 181
DrugChat: Towards Enabling ChatGPT-Like Capabilities on Drug Molecule Graphs Paper • 2309.03907 • Published May 18, 2023 • 12
Mobile V-MoEs: Scaling Down Vision Transformers via Sparse Mixture-of-Experts Paper • 2309.04354 • Published Sep 8, 2023 • 15
Towards Practical Capture of High-Fidelity Relightable Avatars Paper • 2309.04247 • Published Sep 8, 2023 • 10
FIAT: Fusing learning paradigms with Instruction-Accelerated Tuning Paper • 2309.04663 • Published Sep 9, 2023 • 6
Uncovering mesa-optimization algorithms in Transformers Paper • 2309.05858 • Published Sep 11, 2023 • 13
Optimize Weight Rounding via Signed Gradient Descent for the Quantization of LLMs Paper • 2309.05516 • Published Sep 11, 2023 • 10
When Less is More: Investigating Data Pruning for Pretraining LLMs at Scale Paper • 2309.04564 • Published Sep 8, 2023 • 16
Large Language Model for Science: A Study on P vs. NP Paper • 2309.05689 • Published Sep 11, 2023 • 21
Neurons in Large Language Models: Dead, N-gram, Positional Paper • 2309.04827 • Published Sep 9, 2023 • 17
Learning Disentangled Avatars with Hybrid 3D Representations Paper • 2309.06441 • Published Sep 12, 2023 • 6
AstroLLaMA: Towards Specialized Foundation Models in Astronomy Paper • 2309.06126 • Published Sep 12, 2023 • 17
LEAP Hand: Low-Cost, Efficient, and Anthropomorphic Hand for Robot Learning Paper • 2309.06440 • Published Sep 12, 2023 • 11
huihui-ai/DeepSeek-R1-Distill-Qwen-14B-abliterated-v2 Text Generation • Updated 8 days ago • 5.94k • 113