Asankhaya Sharma
AI & ML interests
Creator of OptiLLM, OpenEvolve, Adaptive Classifier, and Ellora. Pioneering a new category in AI infrastructure: inference-time compute for LLMs.
Recent Activity
- updated the collection Dhara Foundational Models 2 days ago
- updated the model mlx-community/NVIDIA-Nemotron-3-Nano-4B-BF16 2 days ago
- published the model mlx-community/NVIDIA-Nemotron-3-Nano-4B-BF16 2 days ago
Nano Language Models
A collection of very small language models pre-trained from scratch on open data. Ideal for experimentation and evaluations.
Ellora
Ellora: Enhancing LLMs with LoRA - Standardized Recipes for Capability Enhancement
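As background on the LoRA technique these recipes build on, here is a minimal sketch of a low-rank forward pass. The shapes, scaling factor, and initialization below are illustrative of LoRA in general, not Ellora's actual recipes:

```python
import numpy as np

def lora_forward(x, W, A, B, alpha=16.0):
    """LoRA forward pass: frozen base weight W plus a trainable
    low-rank update B @ A, scaled by alpha / rank."""
    r = A.shape[0]  # rank of the adapter
    return W @ x + (alpha / r) * (B @ (A @ x))

# Toy shapes: d_out=4, d_in=3, rank r=2 (all values illustrative).
rng = np.random.default_rng(0)
W = rng.standard_normal((4, 3))
A = rng.standard_normal((2, 3))  # A starts random
B = np.zeros((4, 2))             # B starts at zero, so the adapter is a no-op initially
x = rng.standard_normal(3)
y = lora_forward(x, W, A, B)
```

Because B is zero-initialized, the adapted model initially reproduces the frozen base model exactly; training then only updates A and B, which is what makes LoRA cheap to fine-tune and to ship as a recipe.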
Internal Coherence Maximization
Internal Coherence Maximization (ICM): A Label-Free, Unsupervised Training Framework for LLMs
Sutra Pedagogical Datasets
High-quality synthetic educational datasets designed for LLM pretraining with structured pedagogical content across 9 knowledge domains.
Pre-training Dataset Samples
A collection of pre-training dataset samples of 10M, 100M, and 1B tokens. Ideal for quick experimentation and ablations.
Pivotal Token Search
Pivotal Token Search (PTS) identifies tokens in a language model's generation that significantly impact the probability of success.
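The core idea can be sketched with a toy helper: flag tokens whose emission shifts the estimated probability of eventually solving the task. The rollout-based estimation of P(success | prefix) is elided, and the threshold and probabilities below are made up:

```python
def pivotal_tokens(tokens, success_probs, threshold=0.2):
    """Flag tokens whose emission shifts the estimated probability of
    success by at least `threshold`. success_probs[i] estimates
    P(success | tokens[:i]), so it has len(tokens) + 1 entries."""
    pivots = []
    for i, tok in enumerate(tokens):
        delta = success_probs[i + 1] - success_probs[i]
        if abs(delta) >= threshold:
            pivots.append((i, tok, delta))
    return pivots

# Toy trace: emitting "x=2" sharply raises the success estimate,
# while emitting "therefore 5" sharply lowers it.
toks = ["Let", "x=2", ",", "therefore 5"]
probs = [0.40, 0.42, 0.75, 0.74, 0.30]
pivots = pivotal_tokens(toks, probs)
```

In a real run the probabilities would come from sampling many completions from each prefix and measuring the task success rate, which is the expensive part the search has to manage.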
PTS Visualizer: Visualize pivotal tokens and thought anchors in language models
Datasets:
- codelion/Qwen3-0.6B-pts
- codelion/Qwen3-0.6B-pts-thought-anchors
- codelion/Qwen3-0.6B-pts-steering-vectors
Securade.ai
All models, datasets and tools related to https://securade.ai
Dhara Foundational Models
Diffusion Language Models combining deep narrow networks, Canon layers (depthwise causal convolutions), and WSD (Warmup-Stable-Decay) training.
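For reference, the WSD (Warmup-Stable-Decay) schedule named here can be sketched as a piecewise learning-rate function. The phase lengths and peak learning rate below are illustrative defaults, not Dhara's actual hyperparameters:

```python
def wsd_lr(step, peak_lr=3e-4, warmup=1_000, stable=8_000, decay=1_000):
    """Warmup-Stable-Decay: linear warmup to peak_lr, a long constant
    plateau, then a short linear decay to zero."""
    if step < warmup:
        return peak_lr * step / warmup          # linear warmup
    if step < warmup + stable:
        return peak_lr                          # stable plateau
    remaining = warmup + stable + decay - step
    return peak_lr * max(remaining, 0) / decay  # linear cooldown
```

The appeal of WSD over a single cosine curve is the long constant plateau: training can be extended or checkpointed mid-plateau, with the short decay applied only when a final model is needed.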