Instructions to use SL-AI/CRePE-Mini with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use SL-AI/CRePE-Mini with Transformers:
```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("image-text-to-text", model="SL-AI/CRePE-Mini")
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"},
        ],
    },
]
pipe(text=messages)
```

```python
# Load model directly
from transformers import AutoProcessor, AutoModelForImageTextToText

processor = AutoProcessor.from_pretrained("SL-AI/CRePE-Mini")
model = AutoModelForImageTextToText.from_pretrained("SL-AI/CRePE-Mini")

messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"},
        ],
    },
]
inputs = processor.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(processor.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
```

- Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use SL-AI/CRePE-Mini with vLLM:
Install from pip and serve model
```shell
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "SL-AI/CRePE-Mini"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "SL-AI/CRePE-Mini",
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of France?"
      }
    ]
  }'
```

Use Docker

```shell
docker model run hf.co/SL-AI/CRePE-Mini
```
- SGLang
How to use SL-AI/CRePE-Mini with SGLang:
Install from pip and serve model
```shell
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
  --model-path "SL-AI/CRePE-Mini" \
  --host 0.0.0.0 \
  --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "SL-AI/CRePE-Mini",
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of France?"
      }
    ]
  }'
```

Use Docker images

```shell
docker run --gpus all \
  --shm-size 32g \
  -p 30000:30000 \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  --env "HF_TOKEN=<secret>" \
  --ipc=host \
  lmsysorg/sglang:latest \
  python3 -m sglang.launch_server \
    --model-path "SL-AI/CRePE-Mini" \
    --host 0.0.0.0 \
    --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "SL-AI/CRePE-Mini",
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of France?"
      }
    ]
  }'
```

- Docker Model Runner
How to use SL-AI/CRePE-Mini with Docker Model Runner:
```shell
docker model run hf.co/SL-AI/CRePE-Mini
```
The Code Reasoning Expert (for) Project Exploration
The Model
| Model | Size | Modalities | Domain |
|---|---|---|---|
| CRePE Mini | 3B | Text + Image + Video in, Text out | FIM / Autocomplete |
Capabilities
CRePE Mini was trained on a massive corpus of code samples, from The Trellis dataset as well as other open datasets.
This model is built for code assistance, not agentic coding. It is capable of FIM tasks, light code review, and writing smaller utility files.
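FIM (fill-in-the-middle) completion is driven by sentinel tokens in the prompt rather than a chat template. The exact sentinel tokens CRePE Mini uses are not documented here, so the tokens in this sketch (`<|fim_prefix|>` etc.) are placeholders only; check the model's tokenizer config for the real ones before using it:

```python
# Sketch of building a FIM prompt. The sentinel tokens below are
# PLACEHOLDERS, not CRePE Mini's actual tokens -- substitute the ones
# defined in the model's tokenizer config.

def build_fim_prompt(prefix: str, suffix: str) -> str:
    """Arrange the code before and after the cursor in prefix-suffix-middle
    order; the model then generates the missing middle section."""
    return f"<|fim_prefix|>{prefix}<|fim_suffix|>{suffix}<|fim_middle|>"

# Code surrounding the cursor position:
before_cursor = "def reverse_string(s):\n    return "
after_cursor = "\n\nprint(reverse_string('abc'))\n"

prompt = build_fim_prompt(before_cursor, after_cursor)
print(prompt)
```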
How to Run
I recommend using LM Studio for running GRaPE / CRePE Models, and have generally found these sampling parameters to work best:
| Name | Value |
|---|---|
| Temperature | 0.6 |
| Top K Sampling | 40 |
| Repeat Penalty | 1 |
| Top P Sampling | 0.85 |
| Min P Sampling | 0.05 |
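If you serve the model through an OpenAI-compatible endpoint instead (as in the vLLM / SGLang sections above), the table translates into request fields. This is a sketch: `top_k`, `min_p`, and `repetition_penalty` are server-side extensions supported by vLLM and SGLang, not part of the base OpenAI API, and the example prompt is made up.

```python
import json

# The recommended sampling parameters from the table above, expressed as an
# OpenAI-compatible /v1/chat/completions request body.
# top_k, min_p, and repetition_penalty are vLLM/SGLang extensions.
payload = {
    "model": "SL-AI/CRePE-Mini",
    "messages": [
        {"role": "user", "content": "Write a small utility that reverses a string."}
    ],
    "temperature": 0.6,
    "top_p": 0.85,
    "top_k": 40,
    "min_p": 0.05,
    "repetition_penalty": 1.0,
}

# Serialize for POSTing to http://localhost:8000/v1/chat/completions (vLLM)
# or http://localhost:30000/v1/chat/completions (SGLang).
body = json.dumps(payload, indent=2)
print(body)
```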
CRePE Mini as a Model
CRePE Mini is more of an experiment than anything. It was trained on code samples from The Trellis dataset, as well as all code examples from the GRaPE Instruct dataset, and has thus become an apt coder for light tasks. It is in no way designed to replace coders, only to empower them.
Architecture
- CRePE Mini: Built on GRaPE Mini's architecture
Notes
The GRaPE family started all the way back in August of 2025, meaning these models are severely out of date in both architecture and training data.
GRaPE 2 will arrive sooner than the GRaPE 1 family did, and will show multiple improvements.
There are no benchmarks for GRaPE 1 models due to the costly nature of running them, as well as the prioritization of newer models.
Updates for GRaPE 2 models will be posted here on Hugging Face, as well as on Skinnertopia.
