---
license: apache-2.0
datasets:
- sequelbox/Celestia3-DeepSeek-R1-0528
base_model:
- HuggingFaceTB/SmolLM2-135M-Instruct
language:
- en
pipeline_tag: text-generation
library_name: transformers
tags:
- trl
- text-generation-inference
- re-think
- reasoning
---
# **SmolLM2-Rethink-135M**

> **SmolLM2-Rethink-135M** is an experimental lightweight model fine-tuned on the **Celestia3-DeepSeek-R1-0528** reasoning dataset. Built on **SmolLM2-135M-Instruct**, it is optimized for reasoning, structured outputs, and efficient small-scale deployment. Despite its compact size (135M parameters), it demonstrates strong logical deduction, conversational coherence, and fast lightweight inference.

---

## **Key Highlights**

1. **Compact & Efficient**
   Lightweight architecture (135M parameters) suitable for fast inference, mobile applications, and edge deployment.

2. **Reasoning-Centric Training**
   Fine-tuned on high-quality reasoning and instruction data such as **Celestia3-DeepSeek-R1-0528**, with a focus on multi-step logical thinking.

3. **Low-Resource Optimization**
   Designed to run effectively on CPUs or single-GPU setups with a minimal memory footprint.

4. **Structured Outputs**
   Supports generation of clean, structured content, including lists, steps, tables, and JSON-like responses (see the sketch after this list).

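As a quick illustration of the last two points, the sketch below loads the checkpoint on CPU and asks for a JSON-only reply. This is a minimal sketch, not an official recipe: the checkpoint name is taken from the Quickstart below, the prompt is illustrative, and a 135M model may occasionally drift off-format.

```python
import json
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "prithivMLmods/SmolLM2-Rethink-135M"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint)  # small enough to run on CPU

# Illustrative prompt asking for machine-parseable output.
messages = [{
    "role": "user",
    "content": "List three primary colors as a JSON array of strings. Reply with JSON only.",
}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)
outputs = model.generate(inputs, max_new_tokens=64, do_sample=False)

# Decode only the newly generated tokens (the reply).
reply = tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True)

try:
    print(json.loads(reply))  # e.g. ["red", "yellow", "blue"]
except json.JSONDecodeError:
    print("Non-JSON reply:", reply)  # small models can drift off-format
```
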
---

## **Quickstart with 🤗 Transformers**

```python
%%capture
!pip install transformers
```

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "prithivMLmods/SmolLM2-Rethink-135M"
device = "cuda"  # or "cpu"

tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint).to(device)

messages = [{"role": "user", "content": "What is gravity?"}]

# Build the chat-formatted prompt; add_generation_prompt appends the
# assistant-turn marker so the model starts a fresh reply.
input_text = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
print(input_text)

inputs = tokenizer.encode(input_text, return_tensors="pt").to(device)
outputs = model.generate(
    inputs,
    max_new_tokens=1024,
    temperature=0.2,
    top_p=0.9,
    do_sample=True,
)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

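As an alternative to the manual loop above, the same checkpoint can be driven through the high-level `pipeline` API. A minimal sketch, assuming a recent `transformers` release with chat support in `pipeline` and `accelerate` installed for `device_map="auto"`:

```python
from transformers import pipeline

# Chat-style text generation via the pipeline API; device_map="auto"
# falls back to CPU automatically when no GPU is available.
generator = pipeline(
    "text-generation",
    model="prithivMLmods/SmolLM2-Rethink-135M",
    device_map="auto",
)

messages = [{"role": "user", "content": "What is gravity?"}]
result = generator(messages, max_new_tokens=256, do_sample=True, temperature=0.2, top_p=0.9)

# The pipeline returns the whole conversation; the last message is the reply.
print(result[0]["generated_text"][-1]["content"])
```
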
---

## **Intended Use**
* **Instruction Following & QA**
  Good for answering simple questions, following short instructions, and general user interaction.

* **Educational Tools**
  Suitable for lightweight tutoring bots or classroom assistants on low-compute setups.

* **Reasoning Tasks**
  Handles logic puzzles, multi-step reasoning, and chain-of-thought style queries (see the sketch below).

* **Prototype Agents & Microservices**
  Can be deployed in memory-efficient environments or as a modular AI component.

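For the reasoning use case, an explicit step-by-step instruction is usually enough to elicit a visible reasoning chain. A minimal sketch reusing the Quickstart checkpoint; the arithmetic question is illustrative:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "prithivMLmods/SmolLM2-Rethink-135M"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint)  # runs on CPU

# Asking for step-by-step work nudges the model toward multi-step reasoning.
messages = [{
    "role": "user",
    "content": "A train leaves at 9:15 and the journey takes 2 hours 50 minutes. "
               "When does it arrive? Think step by step.",
}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)
outputs = model.generate(inputs, max_new_tokens=256, do_sample=True, temperature=0.2, top_p=0.9)

# Decode only the newly generated tokens (the reply).
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```
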
---

## **Limitations**
1. **Limited Knowledge Capacity**
   Due to its small parameter count, it lacks the depth and breadth of large-scale models.

2. **Short Context Handling**
   Performs best with short to moderate-length prompts; it does not offer extended-context support.

3. **Creative Generation Limitations**
   Output may lack diversity or depth in open-ended storytelling or imaginative tasks.

4. **Token Budget**
   Optimized for shorter, structured completions rather than long-form output.

5. **Basic Multilingual Support**
   Offers some support for multilingual input, but is less accurate than larger multilingual models.