Instructions to use SL-AI/CRePE-Mini with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use SL-AI/CRePE-Mini with Transformers:
```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("image-text-to-text", model="SL-AI/CRePE-Mini")
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"},
        ],
    },
]
pipe(text=messages)
```

```python
# Load model directly
from transformers import AutoProcessor, AutoModelForImageTextToText

processor = AutoProcessor.from_pretrained("SL-AI/CRePE-Mini")
model = AutoModelForImageTextToText.from_pretrained("SL-AI/CRePE-Mini")

messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"},
        ],
    },
]
inputs = processor.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(processor.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
```

- Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use SL-AI/CRePE-Mini with vLLM:
Install from pip and serve model
```shell
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "SL-AI/CRePE-Mini"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "SL-AI/CRePE-Mini",
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of France?"
      }
    ]
  }'
```

Use Docker

```shell
docker model run hf.co/SL-AI/CRePE-Mini
```
- SGLang
How to use SL-AI/CRePE-Mini with SGLang:
Install from pip and serve model
```shell
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
  --model-path "SL-AI/CRePE-Mini" \
  --host 0.0.0.0 \
  --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "SL-AI/CRePE-Mini",
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of France?"
      }
    ]
  }'
```

Use Docker images

```shell
docker run --gpus all \
  --shm-size 32g \
  -p 30000:30000 \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  --env "HF_TOKEN=<secret>" \
  --ipc=host \
  lmsysorg/sglang:latest \
  python3 -m sglang.launch_server \
    --model-path "SL-AI/CRePE-Mini" \
    --host 0.0.0.0 \
    --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "SL-AI/CRePE-Mini",
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of France?"
      }
    ]
  }'
```

- Docker Model Runner
How to use SL-AI/CRePE-Mini with Docker Model Runner:
```shell
docker model run hf.co/SL-AI/CRePE-Mini
```
The Code Reasoning Expert (for) Project Exploration
The Model
| Model | Size | Modalities | Domain |
|---|---|---|---|
| CRePE Mini | 3B | Text + Image + Video in, Text out | FIM / Autocomplete |
Capabilities
CRePE Mini was trained on a massive corpus of code samples, from The Trellis dataset as well as other open datasets.
This model is built for code assistance, not agentic coding. It is capable of FIM tasks, light code review, and writing smaller utility files.
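FIM (fill-in-the-middle) completion is driven by sentinel tokens in the prompt rather than a chat template. The exact sentinel tokens CRePE Mini uses are not documented here, so the tokens in this sketch (`<|fim_prefix|>` etc.) are placeholders only; check the model's tokenizer config for the real ones before using it:

```python
# Sketch of building a FIM prompt. The sentinel tokens below are
# PLACEHOLDERS, not CRePE Mini's actual tokens -- substitute the ones
# defined in the model's tokenizer config.

def build_fim_prompt(prefix: str, suffix: str) -> str:
    """Arrange the code before and after the cursor in prefix-suffix-middle
    order; the model then generates the missing middle section."""
    return f"<|fim_prefix|>{prefix}<|fim_suffix|>{suffix}<|fim_middle|>"

# Code surrounding the cursor position:
before_cursor = "def reverse_string(s):\n    return "
after_cursor = "\n\nprint(reverse_string('abc'))\n"

prompt = build_fim_prompt(before_cursor, after_cursor)
print(prompt)
```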
How to Run
I recommend using LM Studio for running GRaPE / CRePE Models, and have generally found these sampling parameters to work best:
| Name | Value |
|---|---|
| Temperature | 0.6 |
| Top K Sampling | 40 |
| Repeat Penalty | 1 |
| Top P Sampling | 0.85 |
| Min P Sampling | 0.05 |
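If you serve the model through an OpenAI-compatible endpoint instead (as in the vLLM / SGLang sections above), the table translates into request fields. This is a sketch: `top_k`, `min_p`, and `repetition_penalty` are server-side extensions supported by vLLM and SGLang, not part of the base OpenAI API, and the example prompt is made up.

```python
import json

# The recommended sampling parameters from the table above, expressed as an
# OpenAI-compatible /v1/chat/completions request body.
# top_k, min_p, and repetition_penalty are vLLM/SGLang extensions.
payload = {
    "model": "SL-AI/CRePE-Mini",
    "messages": [
        {"role": "user", "content": "Write a small utility that reverses a string."}
    ],
    "temperature": 0.6,
    "top_p": 0.85,
    "top_k": 40,
    "min_p": 0.05,
    "repetition_penalty": 1.0,
}

# Serialize for POSTing to http://localhost:8000/v1/chat/completions (vLLM)
# or http://localhost:30000/v1/chat/completions (SGLang).
body = json.dumps(payload, indent=2)
print(body)
```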
CRePE Mini as a Model
CRePE Mini is more of an experiment than anything. It was trained on code samples from The Trellis dataset, as well as all code examples from the GRaPE Instruct dataset, and has thus become an apt coder for light tasks. It is in no way designed to replace coders, only to empower them.
Architecture
- CRePE Mini: Built on GRaPE Mini's architecture
Notes
The GRaPE family started all the way back in August of 2025, meaning these models are severely out of date in both architecture and training data.
GRaPE 2 will arrive sooner than the GRaPE 1 family did, and will show multiple improvements.
There are no benchmarks for GRaPE 1 models due to the costly nature of running them, as well as the prioritization of newer models.
Updates for GRaPE 2 models will be posted here on Hugging Face, as well as on Skinnertopia.
