Deqing Fu PRO
deqing
Recent Activity
updated a model 1 day ago: deqing/fone-llama-3.2-1B-dclm-100BT-fone3d-hybrid-tile-v1
updated a dataset 2 days ago: deqing/addition_dataset
published a model 3 days ago: deqing/convergent-llama-300M-adamw-addition_7digit
Convergent Evolution (Addition)
- deqing/convergent-llama-300M-muon-addition_3digit
  Text Generation • 0.3B • Updated • 14
- deqing/convergent-llama-300M-muon-addition_3digit_seed123
  0.3B • Updated • 1
- deqing/convergent-llama-300M-muon-addition
  Text Generation • 0.3B • Updated • 30
- deqing/convergent-llama-300M-adamw-addition_3digit
  Text Generation • 0.3B • Updated • 12
Convergent Evolution (Data)
- deqing/convergent-llama-300M-muon-original
  Text Generation • 0.3B • Updated • 27
- deqing/convergent-llama-300M-muon-unigram
  Text Generation • 0.3B • Updated
- deqing/convergent-llama-300M-muon-isolate-1
  Text Generation • 0.3B • Updated • 24
- deqing/convergent-llama-300M-muon-swap_numbers
  Text Generation • 0.3B • Updated • 1
Convergent Evolution
Convergent Evolution (Architecture and Optimizer)
- deqing/convergent-llama-300M-muon-original
  Text Generation • 0.3B • Updated • 27
- deqing/convergent-gdn-300M-muon-original
  Text Generation • 0.3B • Updated • 5
- deqing/convergent-mamba2-300M-muon-original
  Text Generation • 0.3B • Updated • 3
- deqing/convergent-lstm-4layer-muon-original
  Text Generation • 0.2B • Updated • 2
Fourier Language Model