Neural-Hacker
/

OpenMath

@@ -20,126 +20,102 @@ tags:
 Fine-tuning a Small Language Model (SLM) for Step-by-Step Math Reasoning
 ## Overview
-OpenMath is an open-source project focused on fine-tuning a small language model for math reasoning using QLoRA (4-bit LoRA).
-This repository contains only a LoRA adapter trained on GSM8K. Users must load the base model separately and attach the adapter.
-The latest version of this model was trained on an AMD MI300X GPU using ROCm, showing that modern non-NVIDIA accelerators can successfully support large-scale fine-tuning with Hugging Face and PyTorch.
 ---
 ## Base Model
-Qwen/Qwen2.5-Math-1.5B
-This repository does not contain the base model weights — they must be loaded from Hugging Face.
 ---
 ## Hardware Used (Latest Training Run)
-GPU: AMD MI300X (ROCm 7.0)
-VRAM: 192 GB
-Operating System: Ubuntu 24.04
-Framework: PyTorch + Hugging Face
-Backend: ROCm
 ---
 ## Dataset
-GSM8K (Grade School Math 8K)
-Training samples: 1,000
-Evaluation: Full GSM8K test split (1,319 problems)
-Only the solution portion of each example was used for loss computation through loss masking.
 ---
 ## Training Configuration
-## Training Configuration (MI300X Run)
 **Method:** LoRA (unquantized, bfloat16)
-**Precision:** bfloat16 (no 4-bit quantization)
-**LoRA settings**
 - Rank: 16
 - Alpha: 32
 - Dropout: 0.05
 - Target modules: `q_proj`, `k_proj`, `v_proj`, `o_proj`
-**Data & sequence**
 - Max sequence length: 1024
-**Optimization**
-- Batch size: 2
 - Gradient accumulation: 8
-- **Effective batch size:** 16
 - Learning rate: 1e-4
 - Optimizer: `adamw_torch`
 - Scheduler: cosine
 - Warmup: 5%
-**Training**
-- Epochs: 3
 ---
 ## Results
-GSM8K Accuracy (Full Test Set):
-750 out of 1319 correct, which equals 56.86 percent accuracy.
-This is significantly stronger than the earlier Colab T4 run and is a strong result for a 1.5B model trained with LoRA.
----
-## What This Repository Contains
-adapter_model.safetensors — LoRA weights
-adapter_config.json — LoRA configuration
-chat_template.jinja — chat formatting template
-tokenizer.json — tokenizer file
-tokenizer_config.json — tokenizer settings
-README.md — documentation
-This repository does not include checkpoints, optimizer states, or full base model weights.
 ---
 ## How to Use This Model
-Load the base model Qwen/Qwen2.5-Math-1.5B from Hugging Face, then attach this LoRA adapter using PEFT. Generate answers using a prompt that includes an instruction, problem, and solution section.
 ---
 ## Why This Matters
-This project demonstrates that AMD MI300X can train modern language models with Hugging Face and QLoRA.
-It shows that high-quality math reasoning is possible at 1.5B parameters using efficient fine-tuning.
-It provides a lightweight adapter instead of requiring users to download a massive full model.
 ---
 ## Limitations
-The model can make reasoning mistakes.
-It should not be used for exams, assignments, or professional decisions.
-Performance depends heavily on prompt formatting.
----
-## Future Work
-Train on 3,000 to 5,000 GSM8K samples.
-Add SVAMP and ASDiv datasets.
-Improve decoding to reduce repetition.
-Experiment with multi-GPU scaling on MI300X.
-Add a Streamlit demo for interactive use.
 ---
 ## License
-cc-by-nc-4.0

 Fine-tuning a Small Language Model (SLM) for Step-by-Step Math Reasoning
 ## Overview
+OpenMath is an open-source project focused on fine-tuning a small language model for mathematical reasoning using parameter-efficient LoRA training.
+This repository contains **only a LoRA adapter** trained on the full GSM8K training split. Users must load the base model separately and attach the adapter using PEFT.
+The latest version of this model was trained on an **AMD MI300X GPU using ROCm**, demonstrating that high-performance non-NVIDIA accelerators can successfully support modern large language model fine-tuning with PyTorch and Hugging Face.
 ---
 ## Base Model
+**Qwen/Qwen2.5-Math-1.5B**
+This repository **does not contain the base model weights** — they must be loaded directly from Hugging Face before applying this LoRA adapter.
 ---
 ## Hardware Used (Latest Training Run)
+- **GPU:** AMD MI300X (ROCm 7.0)
+- **VRAM:** 192 GB
+- **OS:** Ubuntu 24.04
+- **Framework:** PyTorch + Hugging Face
+- **Backend:** ROCm
 ---
 ## Dataset
+**GSM8K (Grade School Math 8K)**
+- **Training samples:** 7,473 (full training split)
+- **Evaluation:** Full GSM8K test split (1,319 problems)
+Loss was computed only on the solution tokens of each example. The prompt tokens were masked out of the loss so that training focuses on producing the reasoning steps.
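The loss masking described above can be sketched as follows. This is a minimal illustration, not the project's training code: the token IDs are made-up integers, `mask_prompt_labels` is a hypothetical helper, and `-100` is the label value that PyTorch's cross-entropy loss ignores by default.

```python
IGNORE_INDEX = -100  # label value ignored by PyTorch cross-entropy by default


def mask_prompt_labels(prompt_ids, solution_ids, max_length=1024):
    """Build input_ids and labels so only solution tokens contribute to the loss.

    prompt_ids and solution_ids are lists of token IDs (hypothetical values
    here; in practice they would come from the tokenizer).
    """
    input_ids = (prompt_ids + solution_ids)[:max_length]
    # Prompt positions get IGNORE_INDEX; solution positions keep their token IDs.
    labels = ([IGNORE_INDEX] * len(prompt_ids) + solution_ids)[:max_length]
    return {"input_ids": input_ids, "labels": labels}


example = mask_prompt_labels(prompt_ids=[11, 12, 13], solution_ids=[21, 22])
print(example["labels"])  # [-100, -100, -100, 21, 22]
```

The model still sees the full prompt as input; only the gradient signal is restricted to the solution tokens.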
 ---
 ## Training Configuration
 **Method:** LoRA (unquantized, bfloat16)
+**Precision:** bfloat16 (no 4-bit quantization in this run)
+### LoRA settings
 - Rank: 16
 - Alpha: 32
 - Dropout: 0.05
 - Target modules: `q_proj`, `k_proj`, `v_proj`, `o_proj`
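As a rough sketch, the LoRA settings listed above map onto a PEFT `LoraConfig` like this. The `task_type` value is an assumption (it is not stated in this README), and `build_lora_config` is a hypothetical helper name.

```python
LORA_SETTINGS = {
    "r": 16,
    "lora_alpha": 32,
    "lora_dropout": 0.05,
    "target_modules": ["q_proj", "k_proj", "v_proj", "o_proj"],
}

# Standard LoRA scales the low-rank update by lora_alpha / r.
lora_scaling = LORA_SETTINGS["lora_alpha"] / LORA_SETTINGS["r"]


def build_lora_config():
    """Create a PEFT LoraConfig from the settings above (requires `peft`)."""
    from peft import LoraConfig  # lazy import so the dict is usable without peft

    # task_type is an assumption; the README does not state it.
    return LoraConfig(task_type="CAUSAL_LM", **LORA_SETTINGS)


print(lora_scaling)  # 2.0
```

With these values the adapter update is scaled by 2, and only the four attention projection matrices receive trainable low-rank weights.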
+### Data & sequence
 - Max sequence length: 1024
+### Optimization
+- Per-device batch size: 2
 - Gradient accumulation: 8
+- Effective batch size: 16
 - Learning rate: 1e-4
 - Optimizer: `adamw_torch`
 - Scheduler: cosine
 - Warmup: 5%
+### Training
+- **Epochs:** 3
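The arithmetic behind these settings can be worked through in a few lines. This sketch assumes a single GPU and a linear warmup followed by cosine decay to zero, which is how the Hugging Face `cosine` scheduler is commonly configured. The exact step count in a real run may differ by a few steps depending on how the trainer handles the last partial batch.

```python
import math

TRAIN_SAMPLES = 7473
PER_DEVICE_BATCH = 2
GRAD_ACCUM = 8
NUM_DEVICES = 1  # assumption: single GPU
EPOCHS = 3
PEAK_LR = 1e-4
WARMUP_RATIO = 0.05

effective_batch = PER_DEVICE_BATCH * GRAD_ACCUM * NUM_DEVICES
steps_per_epoch = math.ceil(TRAIN_SAMPLES / effective_batch)
total_steps = steps_per_epoch * EPOCHS
warmup_steps = math.ceil(total_steps * WARMUP_RATIO)


def lr_at(step):
    """Linear warmup to PEAK_LR, then cosine decay to zero."""
    if step < warmup_steps:
        return PEAK_LR * step / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return PEAK_LR * 0.5 * (1.0 + math.cos(math.pi * progress))


print(effective_batch, steps_per_epoch, total_steps, warmup_steps)  # 16 468 1404 71
```

Under these assumptions the run takes roughly 1,400 optimizer steps, of which about the first 70 are warmup.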
 ---
 ## Results
+**GSM8K Accuracy (Full Test Set):**
+750 / 1319 = **56.86% accuracy**
+This is a substantial improvement over the earlier small-scale Colab experiments and a strong result for a 1.5B model trained with LoRA on the full GSM8K training split.
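This README does not describe how answers were scored, so the following is only one common approach, sketched under that assumption: take the last number in the model output and compare it with the number after `####` in the GSM8K reference solution. The helper names are hypothetical.

```python
import re


def extract_final_number(text):
    """Return the last number in the text as a string without commas, or None."""
    matches = re.findall(r"-?\d[\d,]*(?:\.\d+)?", text)
    if not matches:
        return None
    return matches[-1].replace(",", "")


def is_correct(prediction, reference):
    """Compare the last number in the prediction with the reference answer."""
    pred = extract_final_number(prediction)
    gold = extract_final_number(reference.split("####")[-1])
    if pred is None or gold is None:
        return False
    return float(pred) == float(gold)


def accuracy(num_correct, num_total):
    """Accuracy as a percentage."""
    return 100.0 * num_correct / num_total


print(is_correct("So she earns 18 dollars.", "9 * 2 = 18\n#### 18"))  # True
print(round(accuracy(750, 1319), 2))  # 56.86
```

Comparing as floats treats `18` and `18.0` as the same answer, and stripping commas handles references such as `1,234`.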
 ---
 ## How to Use This Model
+1. Load the base model **Qwen/Qwen2.5-Math-1.5B** from Hugging Face.
+2. Attach this LoRA adapter using PEFT.
+3. Use a structured prompt that includes an instruction, problem, and solution section for best results.
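The steps above can be sketched with `transformers` and `peft`. Several details here are assumptions: the adapter ID is inferred from the repository name, the prompt layout in `build_prompt` is a hypothetical stand-in because the exact template is not given in this README, and the generation settings are arbitrary defaults.

```python
BASE_MODEL_ID = "Qwen/Qwen2.5-Math-1.5B"
ADAPTER_ID = "Neural-Hacker/OpenMath"  # assumed from the repository name


def build_prompt(problem):
    """Hypothetical instruction / problem / solution layout.

    The exact template used in training is not given in this README.
    """
    return (
        "### Instruction:\nSolve the math problem step by step.\n\n"
        f"### Problem:\n{problem}\n\n"
        "### Solution:\n"
    )


def load_model():
    """Load the base model in bfloat16 and attach the LoRA adapter."""
    import torch
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL_ID)
    base = AutoModelForCausalLM.from_pretrained(BASE_MODEL_ID, torch_dtype=torch.bfloat16)
    model = PeftModel.from_pretrained(base, ADAPTER_ID)
    return tokenizer, model.eval()


def solve(problem, max_new_tokens=256):
    """Generate a step-by-step solution for a single problem."""
    tokenizer, model = load_model()
    inputs = tokenizer(build_prompt(problem), return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output[0], skip_special_tokens=True)
```

For repeated queries, call `load_model()` once and reuse the returned tokenizer and model rather than calling `solve` in a loop.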
 ---
 ## Why This Matters
+- Demonstrates that **AMD MI300X** can effectively train modern LLMs with Hugging Face + LoRA.
+- Shows strong math reasoning at **1.5B parameters** with lightweight fine-tuning.
+- Provides a compact adapter instead of requiring users to download a massive full model.
 ---
 ## Limitations
+- The model can make reasoning mistakes.
+- It should not be used for exams, assignments, or professional decisions.
+- Performance depends heavily on prompt formatting.
 ---
 ## License
+**cc-by-nc-4.0**