---
library_name: peft
pipeline_tag: text-generation
license: apache-2.0
language:
- code
base_model:
- ibm-granite/granite-20b-code-instruct-8k
tags:
- securecode
- security
- owasp
- code-generation
- secure-coding
- lora
- qlora
- vulnerability-detection
- cybersecurity
datasets:
- scthornton/securecode
model-index:
- name: granite-20b-code-securecode
results: []
---
# Granite 20B Code SecureCode
[![Parameters](https://img.shields.io/badge/parameters-20B-blue.svg)](#model-details) [![Dataset](https://img.shields.io/badge/dataset-2,185_examples-green.svg)](https://huggingface.co/datasets/scthornton/securecode) [![OWASP](https://img.shields.io/badge/OWASP-Top_10_2021_+_LLM_Top_10-red.svg)](#security-coverage) [![Method](https://img.shields.io/badge/method-QLoRA-purple.svg)](#training-details) [![License](https://img.shields.io/badge/license-Apache_2.0-orange.svg)](https://opensource.org/licenses/Apache-2.0)
**The flagship of the SecureCode collection: a security-aware code generation model built on IBM's Granite Code family. Fine-tuned on 2,185 real-world vulnerability examples covering the OWASP Top 10 2021 and the OWASP LLM Top 10 2025.**
[Dataset](https://huggingface.co/datasets/scthornton/securecode) | [Paper](https://huggingface.co/papers/2512.18542) | [Model Collection](https://huggingface.co/collections/scthornton/securecode) | [perfecXion.ai](https://perfecxion.ai) | [Blog Post](https://huggingface.co/blog/scthornton/securecode-models)
---
## What This Model Does
Granite 20B Code SecureCode is fine-tuned to recognize vulnerability patterns and produce secure implementations. Every training example includes:
- **Real-world incident grounding** — Tied to documented CVEs and breach reports
- **Vulnerable + secure implementations** — Side-by-side comparison
- **Attack demonstrations** — Concrete exploit code
- **Defense-in-depth guidance** — SIEM rules, logging, monitoring, infrastructure hardening
---
## Model Details
| Property | Value |
|----------|-------|
| **Base Model** | [ibm-granite/granite-20b-code-instruct-8k](https://huggingface.co/ibm-granite/granite-20b-code-instruct-8k) |
| **Parameters** | 20B |
| **Architecture** | GPT (IBM Granite Code) |
| **Method** | QLoRA (4-bit quantization + LoRA) |
| **LoRA Rank** | 16 |
| **LoRA Alpha** | 32 |
| **Training Data** | [scthornton/securecode](https://huggingface.co/datasets/scthornton/securecode) (2,185 examples) |
| **Training Time** | ~1h 19min |
| **Hardware** | 2x NVIDIA A100 40GB (GCP) |
| **Framework** | PEFT 0.18.1, Transformers 5.1.0, PyTorch 2.7.1 |
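The LoRA settings above correspond to a PEFT adapter configuration along these lines. This is a sketch, not the exact training config: the `lora_dropout` value and `target_modules` list are assumptions not stated in the card.

```python
from peft import LoraConfig

# Sketch of a QLoRA adapter config matching the card's stated rank/alpha.
lora_config = LoraConfig(
    r=16,              # LoRA rank (from Model Details)
    lora_alpha=32,     # LoRA alpha (from Model Details)
    lora_dropout=0.05, # assumed; not stated in the card
    bias="none",
    task_type="CAUSAL_LM",
    # assumed attention projections for Granite; verify against the base model
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
)
```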
---
## Quick Start
```python
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel

# Load the base model in 4-bit, matching the QLoRA training setup
base_model = AutoModelForCausalLM.from_pretrained(
    "ibm-granite/granite-20b-code-instruct-8k",
    device_map="auto",
    quantization_config=BitsAndBytesConfig(load_in_4bit=True),
)

# Attach the SecureCode LoRA adapter
model = PeftModel.from_pretrained(base_model, "scthornton/granite-20b-code-securecode")
tokenizer = AutoTokenizer.from_pretrained("scthornton/granite-20b-code-securecode")

prompt = "Write a secure JWT authentication handler in Python with proper token validation"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512, temperature=0.7, do_sample=True)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
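For deployment without the PEFT wrapper, the adapter can be folded into the base weights. This is a sketch, not an official workflow from the card: it assumes the base model is loaded unquantized (merging into 4-bit weights is not supported), which for a 20B model requires substantial GPU or CPU memory.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load the base model in half precision (merging requires unquantized weights)
base = AutoModelForCausalLM.from_pretrained(
    "ibm-granite/granite-20b-code-instruct-8k",
    torch_dtype=torch.float16,
    device_map="auto",
)
model = PeftModel.from_pretrained(base, "scthornton/granite-20b-code-securecode")
merged = model.merge_and_unload()  # folds the LoRA deltas into the base weights

# Save a standalone checkpoint that no longer needs peft at inference time
merged.save_pretrained("granite-20b-code-securecode-merged")
AutoTokenizer.from_pretrained("scthornton/granite-20b-code-securecode").save_pretrained(
    "granite-20b-code-securecode-merged"
)
```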
---
## Training Details
| Hyperparameter | Value |
|----------------|-------|
| Learning Rate | 2e-4 |
| Batch Size | 1 |
| Gradient Accumulation | 16 |
| Epochs | 3 |
| Scheduler | Cosine |
| Warmup Steps | 100 |
| Optimizer | paged_adamw_8bit |
| Final Loss | 1.639 |
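Assuming data-parallel training across both A100s, these hyperparameters imply an effective batch size of 32 and roughly 207 optimizer steps over the full run. A quick back-of-the-envelope check:

```python
import math

# Back-of-the-envelope step count from the hyperparameters above,
# assuming data-parallel training across both A100s.
per_device_batch = 1
grad_accum = 16
num_gpus = 2      # 2x NVIDIA A100 40GB
examples = 2185   # scthornton/securecode
epochs = 3

effective_batch = per_device_batch * grad_accum * num_gpus
steps_per_epoch = math.ceil(examples / effective_batch)
total_steps = steps_per_epoch * epochs

print(effective_batch)  # 32
print(total_steps)      # 207
```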
---
## SecureCode Model Collection
| Model | Parameters | Base | Training Time | Link |
|-------|------------|------|---------------|------|
| Llama 3.2 3B | 3B | Meta Llama 3.2 | 1h 5min | [scthornton/llama-3.2-3b-securecode](https://huggingface.co/scthornton/llama-3.2-3b-securecode) |
| Qwen Coder 7B | 7B | Qwen 2.5 Coder | 1h 24min | [scthornton/qwen-coder-7b-securecode](https://huggingface.co/scthornton/qwen-coder-7b-securecode) |
| CodeGemma 7B | 7B | Google CodeGemma | 1h 27min | [scthornton/codegemma-7b-securecode](https://huggingface.co/scthornton/codegemma-7b-securecode) |
| DeepSeek Coder 6.7B | 6.7B | DeepSeek Coder | 1h 15min | [scthornton/deepseek-coder-6.7b-securecode](https://huggingface.co/scthornton/deepseek-coder-6.7b-securecode) |
| CodeLlama 13B | 13B | Meta CodeLlama | 1h 32min | [scthornton/codellama-13b-securecode](https://huggingface.co/scthornton/codellama-13b-securecode) |
| Qwen Coder 14B | 14B | Qwen 2.5 Coder | 1h 19min | [scthornton/qwen2.5-coder-14b-securecode](https://huggingface.co/scthornton/qwen2.5-coder-14b-securecode) |
| StarCoder2 15B | 15B | BigCode StarCoder2 | 1h 40min | [scthornton/starcoder2-15b-securecode](https://huggingface.co/scthornton/starcoder2-15b-securecode) |
| **Granite 20B** | **20B** | **IBM Granite Code** | **1h 19min** | **This model** |
---
## Citation
```bibtex
@misc{thornton2025securecode,
  title={SecureCode v2.0: A Production-Grade Dataset for Training Security-Aware Code Generation Models},
  author={Thornton, Scott},
  year={2025},
  publisher={perfecXion.ai},
  url={https://perfecxion.ai/articles/securecode-v2-dataset-paper.html},
  note={Model: https://huggingface.co/scthornton/granite-20b-code-securecode}
}
```
---
## Links
- **Dataset**: [scthornton/securecode](https://huggingface.co/datasets/scthornton/securecode) (2,185 examples)
- **Paper**: [SecureCode v2.0](https://huggingface.co/papers/2512.18542)
- **Model Collection**: [SecureCode Models](https://huggingface.co/collections/scthornton/securecode) (8 models)
- **Blog Post**: [Training Security-Aware Code Models](https://huggingface.co/blog/scthornton/securecode-models)
- **Publisher**: [perfecXion.ai](https://perfecxion.ai)
---
## License
Apache 2.0