Asankhaya Sharma
AI & ML interests
Creator of OptiLLM, OpenEvolve, Adaptive Classifier, and Ellora. Pioneering a new category in AI infrastructure: inference-time compute for LLMs.
Recent Activity
- Commented on their article "Reverse Engineering a $500M Mystery: From HashHop to Memory-Augmented Language Models" (3 days ago)
- Updated a Space: algorithmicsuperintelligence/README (4 days ago)
- Commented on their article "Reverse Engineering a $500M Mystery: From HashHop to Memory-Augmented Language Models" (8 days ago)
Collections
Ellora
Ellora: Enhancing LLMs with LoRA - Standardized Recipes for Capability Enhancement (a minimal LoRA sketch follows this list).
- codelion/Qwen3-0.6B-accuracy-recovery-lora
  Text Generation • Updated • 176 • 4
- codelion/Qwen3-0.6B-magpie
  Viewer • Updated • 735 • 4 • 1
- codelion/gemma-3-1b-it-reasoning-grpo-lora
  Text Generation • Updated • 1 • 5
- codelion/gemma-3-1b-it-magpie-reasoning
  Viewer • Updated • 131 • 44 • 2
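LoRA itself is a known quantity even though the Ellora recipes are not reproduced here: a frozen weight matrix gets a trainable low-rank update scaled by alpha/r. A minimal PyTorch sketch of that standard formulation (the rank and alpha values are illustrative, not Ellora's settings):

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen linear layer plus a trainable low-rank update (standard LoRA)."""

    def __init__(self, base: nn.Linear, r: int = 8, alpha: int = 16):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad_(False)  # the pretrained layer stays frozen
        # delta_W = B @ A has rank <= r; B starts at zero so training
        # begins from the unmodified base model.
        self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, r))
        self.scale = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)
```

Only A and B train, so a recipe can target, say, the attention projections of a 0.6B model while updating a fraction of a percent of its parameters.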
Internal Coherence Maximization
Internal Coherence Maximization (ICM): A Label-Free, Unsupervised Training Framework for LLMs
Pre-training Dataset Samples
A collection of pre-training dataset samples in sizes of 10M, 100M, and 1B tokens. Ideal for quick experimentation and ablations.
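For an ablation run, streaming one of the samples avoids downloading a full corpus. A minimal sketch using the datasets library; the repo id and the "text" column are placeholders, not names taken from the collection:

```python
from datasets import load_dataset

# Hypothetical repo id for illustration; substitute an actual dataset
# from the collection.
ds = load_dataset("codelion/pretrain-sample-10M", split="train", streaming=True)

for i, example in enumerate(ds):
    print(example["text"][:80])  # assumes a "text" column, typical for pre-training corpora
    if i == 2:
        break
```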
Pivotal Token Search
Pivotal Token Search (PTS) identifies tokens in a language model's generation that significantly impact the probability of success (see the sketch after this list).
- PTS Visualizer
  Space • Running • Featured • 14
  Visualize pivotal tokens and thought anchors in language models
- codelion/Qwen3-0.6B-pts
  Viewer • Updated • 1.38k • 74 • 2
- codelion/Qwen3-0.6B-pts-thought-anchors
  Viewer • Updated • 148 • 11 • 2
- codelion/Qwen3-0.6B-pts-steering-vectors
  Viewer • Updated • 1.38k • 14 • 5
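The description implies a rollout-based estimate: a token is pivotal when appending it visibly moves the chance that the completed generation succeeds. A minimal sketch of that reading, where sample_completion and is_success are caller-supplied placeholders and the 0.2 threshold is an arbitrary illustrative value:

```python
def estimate_p_success(prefix, sample_completion, is_success, n_rollouts=32):
    """Monte Carlo estimate of P(success | prefix) from sampled completions."""
    wins = sum(is_success(prefix + sample_completion(prefix)) for _ in range(n_rollouts))
    return wins / n_rollouts

def pivotal_tokens(tokens, sample_completion, is_success, threshold=0.2):
    """Flag positions where appending one more token of the recorded
    generation shifts the estimated success probability by >= threshold."""
    pivots = []
    p_prev = estimate_p_success([], sample_completion, is_success)
    for i in range(len(tokens)):
        p_cur = estimate_p_success(tokens[: i + 1], sample_completion, is_success)
        if abs(p_cur - p_prev) >= threshold:
            pivots.append((i, tokens[i], p_cur - p_prev))
        p_prev = p_cur
    return pivots
```

Here tokens is the recorded generation as a list, so prefixes are plain list slices; the per-prefix rollout count trades compute for variance in the estimate.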
Securade.ai
All models, datasets and tools related to https://securade.ai
Dhara Foundational Models
Diffusion Language Models combining deep narrow networks, Canon layers (depthwise causal convolutions), and WSD (Warmup-Stable-Decay) training.
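Two of the named ingredients can be sketched directly from the description: a Canon layer as a depthwise causal 1-D convolution over the sequence, and a WSD schedule that warms the learning rate up, holds it flat, then decays it. A minimal PyTorch sketch under those readings; the kernel size, residual connection, and linear decay shape are assumptions, not Dhara's actual choices:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CanonLayer(nn.Module):
    """Depthwise causal 1-D convolution: each channel is convolved
    independently, with left-only padding so position t sees only <= t."""

    def __init__(self, dim: int, kernel_size: int = 4):
        super().__init__()
        self.kernel_size = kernel_size
        self.conv = nn.Conv1d(dim, dim, kernel_size, groups=dim)  # groups=dim -> depthwise

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq, dim); convolve over the sequence axis
        h = x.transpose(1, 2)                    # (batch, dim, seq)
        h = F.pad(h, (self.kernel_size - 1, 0))  # pad on the left only: causal
        return x + self.conv(h).transpose(1, 2)  # residual connection (assumed)

def wsd_lr(step, max_lr, warmup, stable, decay):
    """Warmup-Stable-Decay: linear warmup, constant plateau, linear decay to 0."""
    if step < warmup:
        return max_lr * step / max(1, warmup)
    if step < warmup + stable:
        return max_lr
    return max_lr * max(0.0, 1 - (step - warmup - stable) / decay)
```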