Neural-Hacker committed on
Commit 89f5a4c · verified · 1 Parent(s): 3540770

Update README.md

Files changed (1)
  1. README.md +35 -59
README.md CHANGED
@@ -20,126 +20,102 @@ tags:
  Fine-tuning a Small Language Model (SLM) for Step-by-Step Math Reasoning

  ## Overview
- OpenMath is an open-source project focused on fine-tuning a small language model for math reasoning using QLoRA (4-bit LoRA).

- This repository contains only a LoRA adapter trained on GSM8K. Users must load the base model separately and attach the adapter.

- The latest version of this model was trained on an AMD MI300X GPU using ROCm, showing that modern non-NVIDIA accelerators can successfully support large-scale fine-tuning with Hugging Face and PyTorch.

  ---

  ## Base Model
- Qwen/Qwen2.5-Math-1.5B

- This repository does not contain the base model weights — they must be loaded from Hugging Face.

  ---

  ## Hardware Used (Latest Training Run)

- GPU: AMD MI300X (ROCm 7.0)
- VRAM: 192 GB
- Operating System: Ubuntu 24.04
- Framework: PyTorch + Hugging Face
- Backend: ROCm

  ---

  ## Dataset

- GSM8K (Grade School Math 8K)
- Training samples: 1,000
- Evaluation: Full GSM8K test split (1,319 problems)

- Only the solution portion of each example was used for loss computation through loss masking.

  ---

  ## Training Configuration

- ## Training Configuration (MI300X Run)
-
  **Method:** LoRA (full precision, bfloat16)
- **Precision:** bfloat16 (no 4-bit quantization)

- **LoRA settings**
  - Rank: 16
  - Alpha: 32
  - Dropout: 0.05
  - Target modules: `q_proj`, `k_proj`, `v_proj`, `o_proj`

- **Data & sequence**
  - Max sequence length: 1024

- **Optimization**
- - Batch size: 2
  - Gradient accumulation: 8
- - **Effective batch size:** 16
  - Learning rate: 1e-4
  - Optimizer: `adamw_torch`
  - Scheduler: cosine
  - Warmup: 5%

- **Training**
- - Epochs: 3
-

  ---

  ## Results

- GSM8K Accuracy (Full Test Set):
- 750 out of 1319 correct, which equals 56.86 percent accuracy.

- This is significantly stronger than the earlier Colab T4 run and is a strong result for a 1.5B model trained with LoRA.
-
- ---
-
- ## What This Repository Contains
-
- adapter_model.safetensors — LoRA weights
- adapter_config.json — LoRA configuration
- chat_template.jinja — chat formatting template
- tokenizer.json — tokenizer file
- tokenizer_config.json — tokenizer settings
- README.md — documentation
-
- This repository does not include checkpoints, optimizer states, or full base model weights.

  ---

  ## How to Use This Model

- Load the base model Qwen/Qwen2.5-Math-1.5B from Hugging Face, then attach this LoRA adapter using PEFT. Generate answers using a prompt that includes an instruction, problem, and solution section.

  ---

  ## Why This Matters

- This project demonstrates that AMD MI300X can train modern language models with Hugging Face and QLoRA.
- It shows that high-quality math reasoning is possible at 1.5B parameters using efficient fine-tuning.
- It provides a lightweight adapter instead of requiring users to download a massive full model.

  ---

  ## Limitations

- The model can make reasoning mistakes.
- It should not be used for exams, assignments, or professional decisions.
- Performance depends heavily on prompt formatting.
-
- ---
-
- ## Future Work
-
- Train on 3,000 to 5,000 GSM8K samples.
- Add SVAMP and ASDiv datasets.
- Improve decoding to reduce repetition.
- Experiment with multi-GPU scaling on MI300X.
- Add a Streamlit demo for interactive use.

  ---

  ## License

- cc-by-nc-4.0

  Fine-tuning a Small Language Model (SLM) for Step-by-Step Math Reasoning

  ## Overview
+ OpenMath is an open-source project focused on fine-tuning a small language model for mathematical reasoning using parameter-efficient LoRA training.

+ This repository contains **only a LoRA adapter** trained on the full GSM8K dataset. Users must load the base model separately and attach the adapter using PEFT.

+ The latest version of this model was trained on an **AMD MI300X GPU using ROCm**, demonstrating that high-performance non-NVIDIA accelerators can successfully support modern large language model fine-tuning with PyTorch and Hugging Face.

  ---

  ## Base Model
+ **Qwen/Qwen2.5-Math-1.5B**

+ This repository **does not contain the base model weights** — they must be loaded directly from Hugging Face before applying this LoRA adapter.

  ---

  ## Hardware Used (Latest Training Run)

+ - **GPU:** AMD MI300X (ROCm 7.0)
+ - **VRAM:** 192 GB
+ - **OS:** Ubuntu 24.04
+ - **Framework:** PyTorch + Hugging Face
+ - **Backend:** ROCm

  ---

  ## Dataset

+ **GSM8K (Grade School Math 8K)**
+ - **Training samples:** 7,473 (full training split)
+ - **Evaluation:** Full GSM8K test split (1,319 problems)

+ Only the solution portion of each example was used for loss computation via loss masking to encourage stronger reasoning behavior.

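+ The snippet below is a minimal sketch of that masking step, included for illustration only; the prompt format is an assumption, not the project's actual preprocessing code.
+
+ ```python
+ # Illustrative loss-masking sketch: prompt tokens get label -100 so the
+ # cross-entropy loss is computed only on the step-by-step solution.
+ from datasets import load_dataset
+ from transformers import AutoTokenizer
+
+ gsm8k = load_dataset("openai/gsm8k", "main")   # train: 7,473 / test: 1,319
+ tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-Math-1.5B")
+
+ def build_example(question: str, answer: str, max_len: int = 1024):
+     prompt = f"### Problem:\n{question}\n\n### Solution:\n"   # assumed format
+     prompt_ids = tokenizer(prompt, add_special_tokens=False)["input_ids"]
+     answer_ids = tokenizer(answer + tokenizer.eos_token,
+                            add_special_tokens=False)["input_ids"]
+     input_ids = (prompt_ids + answer_ids)[:max_len]
+     labels = ([-100] * len(prompt_ids) + answer_ids)[:max_len]
+     return {"input_ids": input_ids,
+             "attention_mask": [1] * len(input_ids),
+             "labels": labels}
+
+ example = build_example(**gsm8k["train"][0])
+ ```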
  ---

  ## Training Configuration

  **Method:** LoRA (full precision, bfloat16)
+ **Precision:** bfloat16 (no 4-bit quantization in this run)

+ ### LoRA settings
  - Rank: 16
  - Alpha: 32
  - Dropout: 0.05
  - Target modules: `q_proj`, `k_proj`, `v_proj`, `o_proj`

+ ### Data & sequence
  - Max sequence length: 1024

+ ### Optimization
+ - Per-device batch size: 2
  - Gradient accumulation: 8
+ - Effective batch size: 16
  - Learning rate: 1e-4
  - Optimizer: `adamw_torch`
  - Scheduler: cosine
  - Warmup: 5%

+ ### Training
+ - **Epochs:** 3

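+ The settings above map roughly onto the following PEFT + `TrainingArguments` setup. This is a sketch for orientation rather than the exact training script; `output_dir` and the data pipeline are placeholders.
+
+ ```python
+ # Rough reconstruction of the configuration listed above (a sketch, not the
+ # original training script).
+ import torch
+ from peft import LoraConfig, get_peft_model
+ from transformers import AutoModelForCausalLM, TrainingArguments
+
+ model = AutoModelForCausalLM.from_pretrained(
+     "Qwen/Qwen2.5-Math-1.5B", torch_dtype=torch.bfloat16
+ )
+
+ lora_cfg = LoraConfig(
+     r=16,
+     lora_alpha=32,
+     lora_dropout=0.05,
+     target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
+     task_type="CAUSAL_LM",
+ )
+ model = get_peft_model(model, lora_cfg)
+
+ args = TrainingArguments(
+     output_dir="openmath-lora",          # placeholder
+     per_device_train_batch_size=2,
+     gradient_accumulation_steps=8,       # effective batch size 16
+     learning_rate=1e-4,
+     lr_scheduler_type="cosine",
+     warmup_ratio=0.05,
+     num_train_epochs=3,
+     optim="adamw_torch",
+     bf16=True,
+ )
+ ```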
  ---

  ## Results

+ **GSM8K Accuracy (Full Test Set):**
+ 750 / 1319 = **56.86% accuracy**

+ This represents a substantial improvement over earlier small-scale Colab experiments and is a strong result for a 1.5B model trained with LoRA on the full dataset.

  ---

  ## How to Use This Model

+ 1. Load the base model **Qwen/Qwen2.5-Math-1.5B** from Hugging Face.
+ 2. Attach this LoRA adapter using PEFT.
+ 3. Use a structured prompt that includes an instruction, problem, and solution section for best results.

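+ For example (a minimal sketch; the adapter repo id is a placeholder and the exact prompt wording is an assumption):
+
+ ```python
+ # Load the base model, attach the LoRA adapter with PEFT, and generate.
+ import torch
+ from peft import PeftModel
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+
+ base_id = "Qwen/Qwen2.5-Math-1.5B"
+ adapter_id = "<this-adapter-repo-id>"  # placeholder for this repository
+
+ tokenizer = AutoTokenizer.from_pretrained(base_id)
+ model = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.bfloat16)
+ model = PeftModel.from_pretrained(model, adapter_id)
+ model.eval()
+
+ prompt = (
+     "### Instruction:\nSolve the problem step by step.\n\n"
+     "### Problem:\nA box holds 12 pencils. How many pencils are in 7 boxes?\n\n"
+     "### Solution:\n"
+ )
+ inputs = tokenizer(prompt, return_tensors="pt")
+ with torch.no_grad():
+     output = model.generate(**inputs, max_new_tokens=256)
+ print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:],
+                        skip_special_tokens=True))
+ ```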
  ---

  ## Why This Matters

+ - Demonstrates that **AMD MI300X** can effectively train modern LLMs with Hugging Face + LoRA.
+ - Shows strong math reasoning at **1.5B parameters** with lightweight fine-tuning.
+ - Provides a compact adapter instead of requiring users to download a massive full model.

  ---

  ## Limitations

+ - The model can make reasoning mistakes.
+ - It should not be used for exams, assignments, or professional decisions.
+ - Performance depends heavily on prompt formatting.

  ---

  ## License

+ **cc-by-nc-4.0**