Building on HF · 126.0 TFLOPS
Zixi "Oz" Li
PRO
OzTianlu
Follow
robtacconelli's profile picture
kaveeshwaran's profile picture
bjrise's profile picture
27 followers
·
30 following
https://github.com/lizixi-0x2F
lizixi-0x2F
AI & ML interests
My research focuses on deep reasoning with small language models, Transformer architecture innovation, and knowledge distillation for efficient alignment and transfer.
Recent Activity
reacted to reaperdoesntknow's post with 👍 about 12 hours ago
We present a methodology for training small language models on CPU at FP32 precision that achieves capability-per-dollar efficiency orders of magnitude beyond GPU-based training. Across 15 models spanning four novel architecture families (Mixture of Attentions (MoA), cross-architecture fusion (Qemma), swarm intelligence (SAGI), and metric-space causal language models (DiscoverLM)), total compute cost was $24 on a single AMD EPYC 9454P processor. We introduce seven methodological pillars:
(1) FP32 precision preservation, with experiments demonstrating a 5,810× single-operation error and a 23,225× compounding error ratio for FP16 at network depth;
(2) sparse cognitive architectures in which 0.02–7% of parameters activate per token, matching CPU branching rather than GPU SIMD;
(3) developmental curriculum training progressing from language to logic to transfer to depth;
(4) continuous belt-fed data ingestion eliminating truncation waste;
(5) hardware-native optimization for AMD Zen 4 via AOCL/OpenMP/NUMA-aware allocation;
(6) self-regulating thermodynamic governance with emergent temperature measurement grounded in L2-star discrepancy; and
(7) open-standard compute (AVX2 SIMD at FP32) free of proprietary vendor dependency.
We argue that transformers were designed for GPU hardware rather than mathematical optimality, and that architectures designed for geometric correctness (metric-space attention, triangle inequality enforcement, sparse expert routing) naturally favor CPU execution. For sub-2B parameter models, CPU training produces more capable models at a fraction of the cost.
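The precision claim in pillar (1) is easy to sanity-check with a minimal sketch (not from the post; the matrix size, depth, and tanh nonlinearity are illustrative assumptions): run the same layer chain at FP16 and FP32 and compare both against an FP64 reference. Exact error ratios will differ from the post's reported numbers depending on setup.

```python
import numpy as np

# Minimal sketch: how rounding error compounds with network depth
# when the same matmul chain runs in FP16 versus FP32, measured
# against an FP64 reference run. Sizes and depth are arbitrary.
rng = np.random.default_rng(0)
d, depth = 256, 32

# Random weights scaled to keep activations roughly unit-norm.
weights = [rng.standard_normal((d, d)) / np.sqrt(d) for _ in range(depth)]
x = rng.standard_normal(d)

def run_chain(dtype):
    h = x.astype(dtype)
    for w in weights:
        h = np.tanh(w.astype(dtype) @ h)  # one "layer" at the given precision
    return h.astype(np.float64)

ref = run_chain(np.float64)
err16 = np.linalg.norm(run_chain(np.float16) - ref)
err32 = np.linalg.norm(run_chain(np.float32) - ref)
print(f"FP16 error: {err16:.3e}  FP32 error: {err32:.3e}  ratio: {err16 / err32:.0f}x")
```

At depth 32 the FP16 deviation from the reference is typically several orders of magnitude larger than the FP32 deviation, which is the compounding effect the post is arguing about.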
updated a model 3 days ago: NoesisLab/Kai-30B-Instruct
reacted to danielhanchen's post with 🔥 14 days ago
We collaborated with NVIDIA to teach you about Reinforcement Learning and RL environments. 💚 Learn:
• Why RL environments matter + how to build them
• When RL is better than SFT
• GRPO and RL best practices
• How verifiable rewards and RLVR work
Blog: https://unsloth.ai/blog/rl-environments
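As a rough illustration of the "verifiable rewards" idea mentioned in the post (a sketch, not Unsloth's or NVIDIA's implementation; the function name and the last-number answer-extraction rule are assumptions): RLVR replaces a learned reward model with a programmatic check of the completion against ground truth.

```python
import re

# Hypothetical verifiable reward in the RLVR sense: score a completion
# by checking it against a known answer, rather than with a reward model.
def verifiable_reward(completion: str, gold_answer: str) -> float:
    """Return 1.0 if the completion's final number matches the gold answer."""
    numbers = re.findall(r"-?\d+(?:\.\d+)?", completion)
    if not numbers:
        return 0.0  # no numeric answer found; no reward
    return 1.0 if numbers[-1] == gold_answer else 0.0

# Example: reward for a GSM8K-style math problem.
print(verifiable_reward("Think... 7 * 6 = 42. The answer is 42", "42"))  # 1.0
print(verifiable_reward("The answer is 41", "42"))                       # 0.0
```

Because the check is deterministic, such rewards are hard to game and cheap to compute, which is why they pair well with GRPO-style policy optimization.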
Organizations
OzTianlu's activity
authored a paper 2 months ago
Reasoning: From Reflection to Solution
Paper • 2511.11712 • Published Nov 12, 2025