CompactAI committed
Commit d244c8b · verified · 1 Parent(s): 1f1306f

Upload folder using huggingface_hub
README.md CHANGED
@@ -5,44 +5,36 @@ tags:
  - python
  - optimized
  - wanda
- - activation-pruning
  base_model: Qwen/Qwen2.5-3B-Instruct
  pipeline_tag: text-generation
  ---

  # Qwen2.5-3B-Instruct-python-aggressive

- > 🎯 **PYTHON-optimized** | 📦 **Aggressive** pruning | ⚡ **20% weights pruned**

- This model is an **aggressively pruned** version of [Qwen/Qwen2.5-3B-Instruct](https://huggingface.co/Qwen/Qwen2.5-3B-Instruct), specialized for **PYTHON** tasks using activation-aware weight pruning (Wanda-style).

- ## ✨ Key Features
-
- - **Specialization**: Optimized for Python tasks
- - **Pruning Method**: Wanda-style (|W| × |activation|) importance scoring
- - **Size Reduction**: 20% weights pruned
- - **Use Case**: Maximum compression for edge deployment
-
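The `|W| × |activation|` scoring named above can be sketched roughly as follows. This is an illustrative reconstruction of Wanda-style pruning, not the actual code used to produce this model, and it uses a single global threshold for simplicity where the published method ranks weights within per-output comparison groups:

```python
import numpy as np

def wanda_prune(W, X, ratio):
    """Zero out the `ratio` fraction of weights with the smallest
    |W| * ||activation|| importance score (simplified, global threshold)."""
    act_norm = np.linalg.norm(X, axis=0)       # per-input-feature norm from calibration data
    score = np.abs(W) * act_norm               # broadcasts the norm over output rows
    k = int(ratio * W.size)
    threshold = np.partition(score.ravel(), k)[k]  # (k+1)-th smallest score
    return np.where(score >= threshold, W, 0.0)

# Toy weight matrix and calibration activations (stand-ins for real layers).
rng = np.random.default_rng(0)
W = rng.normal(size=(8, 16))
X = rng.normal(size=(32, 16))
W_pruned = wanda_prune(W, X, ratio=0.2)
```

The key property is that a small weight multiplied by a large typical activation can outrank a large weight on a dead input, which is why calibration activations matter.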
- ## 📊 Performance Comparison

  | Category | Original | Pruned | Change |
  |----------|----------|--------|--------|
- | **Python** | 40.0% | 13.3% ⭐ | ↓ 26.7% |
- | Html | 6.7% | 0.0% | ↓ 6.7% |
- | Trivia | 88.9% | 73.3% | ↓ 15.6% |
- | Math | 57.8% | 62.2% | ↑ 4.4% |
- | Reasoning | 33.3% | 28.9% | ↓ 4.4% |
- | Medical | 93.3% | 84.4% | ↓ 8.9% |
- | Linux | 95.6% | 93.3% | ↓ 2.2% |
- | Writing | 62.2% | 60.0% | ↓ 2.2% |

- **Average**: 59.7% → 51.9% (-7.8%)

- **Python Retention**: 33.3% of original performance

  ![Comparison Graph](comparison_graph.png)

- ## 🚀 Quick Start

  ```python
  from transformers import AutoModelForCausalLM, AutoTokenizer
@@ -50,31 +42,20 @@ from transformers import AutoModelForCausalLM, AutoTokenizer
  model = AutoModelForCausalLM.from_pretrained("CompactAI/Qwen2.5-3B-Instruct-python-aggressive")
  tokenizer = AutoTokenizer.from_pretrained("CompactAI/Qwen2.5-3B-Instruct-python-aggressive")

- # Example usage
  inputs = tokenizer("Your prompt here", return_tensors="pt")
  outputs = model.generate(**inputs, max_new_tokens=100)
  print(tokenizer.decode(outputs[0], skip_special_tokens=True))
  ```

- ## 📋 Technical Details

  | Property | Value |
  |----------|-------|
  | Base Model | [Qwen/Qwen2.5-3B-Instruct](https://huggingface.co/Qwen/Qwen2.5-3B-Instruct) |
  | Specialization | Python |
  | Prune Mode | Aggressive |
- | Pruning Method | Activation-based weight pruning (Wanda) |
- | Weight Reduction | 20% weights pruned |
-
- ## 🔗 Related Models

- This model is part of the **Qwen2.5-3B-Instruct** pruned model collection. Variants:
- - **Safe** - Conservative pruning (~10-20%), high accuracy retention
- - **Aggressive** - Maximum compression (~40-50%), best for edge deployment

- ## 📜 License
-
- This model inherits the license from the base model [Qwen/Qwen2.5-3B-Instruct](https://huggingface.co/Qwen/Qwen2.5-3B-Instruct).
-
- ---
- *Generated by ZANNPS [Zeto Automatic Neural Network Pruning System]*
 
  - python
  - optimized
  - wanda
  base_model: Qwen/Qwen2.5-3B-Instruct
  pipeline_tag: text-generation
  ---

  # Qwen2.5-3B-Instruct-python-aggressive

+ > 🎯 **PYTHON-optimized** | 📦 **Aggressive** pruning | ⚡ **35% weights pruned**

+ This model is an **aggressively pruned** version of [Qwen/Qwen2.5-3B-Instruct](https://huggingface.co/Qwen/Qwen2.5-3B-Instruct).

+ ## Performance Comparison

  | Category | Original | Pruned | Change |
  |----------|----------|--------|--------|
+ | **Python** | 92.3% | 84.6% ⭐ | ↓ 7.7% |
+ | Html | 40.0% | 30.0% | ↓ 10.0% |
+ | Trivia | 100.0% | 86.7% | ↓ 13.3% |
+ | Math | 100.0% | 100.0% | → |
+ | Reasoning | 91.7% | 83.3% | ↓ 8.3% |
+ | Medical | 64.3% | 35.7% | ↓ 28.6% |
+ | Linux | 69.2% | 61.5% | ↓ 7.7% |
+ | Writing | 54.5% | 36.4% | ↓ 18.2% |

+ **Average**: 76.5% → 64.8% (-11.7%)

+ **Python Retention**: 91.7%
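Read as pruned-over-original score for the Python row, the retention figure checks out; a quick sanity check (this is my reading of the metric, the card does not publish the formula):

```python
# Retention = pruned Python score / original Python score, as percent.
original, pruned = 92.3, 84.6
retention = pruned / original * 100
print(f"Python retention: {retention:.1f}%")  # → Python retention: 91.7%
```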
  ![Comparison Graph](comparison_graph.png)

+ ## Quick Start

  ```python
  from transformers import AutoModelForCausalLM, AutoTokenizer

  model = AutoModelForCausalLM.from_pretrained("CompactAI/Qwen2.5-3B-Instruct-python-aggressive")
  tokenizer = AutoTokenizer.from_pretrained("CompactAI/Qwen2.5-3B-Instruct-python-aggressive")

  inputs = tokenizer("Your prompt here", return_tensors="pt")
  outputs = model.generate(**inputs, max_new_tokens=100)
  print(tokenizer.decode(outputs[0], skip_special_tokens=True))
  ```
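To confirm the advertised 35% pruning ratio locally, counting exactly-zero entries in the loaded weights is a quick check. A minimal sketch — `zero_fraction` is our helper, not part of transformers, and the toy array stands in for real `state_dict()` tensors:

```python
import numpy as np

def zero_fraction(arrays):
    """Fraction of exactly-zero entries across a list of weight arrays."""
    total = sum(a.size for a in arrays)
    zeros = sum(int((a == 0).sum()) for a in arrays)
    return zeros / total

# Toy stand-in for real weights: zero out 35% of a random matrix.
rng = np.random.default_rng(0)
w = rng.normal(size=(20, 20))
w.ravel()[: int(0.35 * w.size)] = 0.0
print(f"sparsity: {zero_fraction([w]):.2f}")  # → sparsity: 0.35
```

On the real checkpoint you would pass something like `[p.detach().numpy() for p in model.parameters()]`, expecting a value near 0.35 across the pruned linear layers.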

+ ## Technical Details

  | Property | Value |
  |----------|-------|
  | Base Model | [Qwen/Qwen2.5-3B-Instruct](https://huggingface.co/Qwen/Qwen2.5-3B-Instruct) |
  | Specialization | Python |
  | Prune Mode | Aggressive |
+ | Weight Reduction | 35% weights pruned |

+ ## License

+ This model inherits the license from the base model.
comparison_graph.png CHANGED
model-00001-of-00002.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:3053d788c32f12320275840c58dd6893508dda79b6ae02d90eb1bfbfa1c29393
+ oid sha256:f1f0de7a63cc12c5699235711a61bd485a2406bb055b05cd548cc55d2cf73c1c
  size 3995916600
model-00002-of-00002.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:b3e7bf62f1bb6cfb8dcaef62af3bfcbc739ea854e47f2955093ce867f12a2634
+ oid sha256:de96d4bc81ea58673a20ed6314e52c0eb10fcd5bf44d38f000338af96084b70c
  size 2176009944
tokenizer.json CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:51354673edf4300eb841665e1fb684cc1badea87c49d5de6ef09981151683508
+ oid sha256:7b3e3adf18710ac3bd97b384b0d01b58205c4c5cd37c6c56d24c8fff86b0561d
  size 11422159