Instructions for using lmstudio-community/stable-code-instruct-3b-GGUF with libraries, inference providers, notebooks, and local apps.
- Libraries
- Transformers
How to use lmstudio-community/stable-code-instruct-3b-GGUF with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="lmstudio-community/stable-code-instruct-3b-GGUF")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoModel

model = AutoModel.from_pretrained("lmstudio-community/stable-code-instruct-3b-GGUF", dtype="auto")
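The snippets above are the generic auto-generated examples. Because this repository contains GGUF files rather than standard safetensors weights, transformers usually needs an explicit `gguf_file` argument naming one of the quantized files, and GGUF dequantization support depends on the model architecture. A minimal sketch, assuming the IQ3_M file listed in this repo:

# Load a GGUF quant by dequantizing it into torch weights.
# Assumes transformers' GGUF support covers this architecture.
from transformers import AutoTokenizer, AutoModelForCausalLM

repo_id = "lmstudio-community/stable-code-instruct-3b-GGUF"
gguf_file = "stable-code-instruct-3b-IQ3_M.gguf"

tokenizer = AutoTokenizer.from_pretrained(repo_id, gguf_file=gguf_file)
model = AutoModelForCausalLM.from_pretrained(repo_id, gguf_file=gguf_file)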
- llama-cpp-python
How to use lmstudio-community/stable-code-instruct-3b-GGUF with llama-cpp-python:
# !pip install llama-cpp-python
from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="lmstudio-community/stable-code-instruct-3b-GGUF",
    filename="stable-code-instruct-3b-IQ3_M.gguf",
)
llm.create_chat_completion(
    messages=[
        {"role": "user", "content": "What is the capital of France?"}
    ]
)
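create_chat_completion returns an OpenAI-style response dictionary, so the generated text can be read out of the first choice. A minimal sketch:

# The (non-streaming) response follows the OpenAI chat-completion schema.
response = llm.create_chat_completion(
    messages=[{"role": "user", "content": "What is the capital of France?"}]
)
print(response["choices"][0]["message"]["content"])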
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- llama.cpp
How to use lmstudio-community/stable-code-instruct-3b-GGUF with llama.cpp:
Install from brew
brew install llama.cpp

# Start a local OpenAI-compatible server with a web UI:
llama-server -hf lmstudio-community/stable-code-instruct-3b-GGUF:Q4_K_M

# Run inference directly in the terminal:
llama-cli -hf lmstudio-community/stable-code-instruct-3b-GGUF:Q4_K_M
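Once llama-server is running, it exposes an OpenAI-compatible API (port 8080 by default), so any OpenAI-style client can talk to it. A minimal sketch with curl; the "model" field is informational for a single-model server:

curl -X POST "http://localhost:8080/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "stable-code-instruct-3b",
    "messages": [
      { "role": "user", "content": "What is the capital of France?" }
    ]
  }'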
Install from WinGet (Windows)
winget install llama.cpp

# Start a local OpenAI-compatible server with a web UI:
llama-server -hf lmstudio-community/stable-code-instruct-3b-GGUF:Q4_K_M

# Run inference directly in the terminal:
llama-cli -hf lmstudio-community/stable-code-instruct-3b-GGUF:Q4_K_M
Use pre-built binary
# Download a pre-built binary from:
# https://github.com/ggerganov/llama.cpp/releases

# Start a local OpenAI-compatible server with a web UI:
./llama-server -hf lmstudio-community/stable-code-instruct-3b-GGUF:Q4_K_M

# Run inference directly in the terminal:
./llama-cli -hf lmstudio-community/stable-code-instruct-3b-GGUF:Q4_K_M
Build from source code
git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
cmake -B build
cmake --build build -j --target llama-server llama-cli

# Start a local OpenAI-compatible server with a web UI:
./build/bin/llama-server -hf lmstudio-community/stable-code-instruct-3b-GGUF:Q4_K_M

# Run inference directly in the terminal:
./build/bin/llama-cli -hf lmstudio-community/stable-code-instruct-3b-GGUF:Q4_K_M
Use Docker
docker model run hf.co/lmstudio-community/stable-code-instruct-3b-GGUF:Q4_K_M
- LM Studio
- Jan
- vLLM
How to use lmstudio-community/stable-code-instruct-3b-GGUF with vLLM:
Install from pip and serve model
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "lmstudio-community/stable-code-instruct-3b-GGUF"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "lmstudio-community/stable-code-instruct-3b-GGUF",
    "messages": [
      { "role": "user", "content": "What is the capital of France?" }
    ]
  }'

Use Docker
docker model run hf.co/lmstudio-community/stable-code-instruct-3b-GGUF:Q4_K_M
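Because the vLLM server speaks the OpenAI API, the official openai Python client can be pointed at it as well. A minimal sketch, assuming the pip-installed server above is listening on port 8000:

# pip install openai
from openai import OpenAI

# Any placeholder API key works for a local vLLM server.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed")

completion = client.chat.completions.create(
    model="lmstudio-community/stable-code-instruct-3b-GGUF",
    messages=[{"role": "user", "content": "What is the capital of France?"}],
)
print(completion.choices[0].message.content)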
- SGLang
How to use lmstudio-community/stable-code-instruct-3b-GGUF with SGLang:
Install from pip and serve model
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
  --model-path "lmstudio-community/stable-code-instruct-3b-GGUF" \
  --host 0.0.0.0 \
  --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "lmstudio-community/stable-code-instruct-3b-GGUF",
    "messages": [
      { "role": "user", "content": "What is the capital of France?" }
    ]
  }'

Use Docker images
docker run --gpus all \
  --shm-size 32g \
  -p 30000:30000 \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  --env "HF_TOKEN=<secret>" \
  --ipc=host \
  lmsysorg/sglang:latest \
  python3 -m sglang.launch_server \
    --model-path "lmstudio-community/stable-code-instruct-3b-GGUF" \
    --host 0.0.0.0 \
    --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "lmstudio-community/stable-code-instruct-3b-GGUF",
    "messages": [
      { "role": "user", "content": "What is the capital of France?" }
    ]
  }'

- Ollama
How to use lmstudio-community/stable-code-instruct-3b-GGUF with Ollama:
ollama run hf.co/lmstudio-community/stable-code-instruct-3b-GGUF:Q4_K_M
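Ollama also exposes an OpenAI-compatible endpoint on port 11434, so the pulled model can be queried over HTTP. A minimal sketch; the model name matches the hf.co reference used above:

curl -X POST "http://localhost:11434/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "hf.co/lmstudio-community/stable-code-instruct-3b-GGUF:Q4_K_M",
    "messages": [
      { "role": "user", "content": "What is the capital of France?" }
    ]
  }'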
- Unsloth Studio
How to use lmstudio-community/stable-code-instruct-3b-GGUF with Unsloth Studio:
Install Unsloth Studio (macOS, Linux, WSL)
curl -fsSL https://unsloth.ai/install.sh | sh

# Run Unsloth Studio:
unsloth studio -H 0.0.0.0 -p 8888

# Then open http://localhost:8888 in your browser
# Search for lmstudio-community/stable-code-instruct-3b-GGUF to start chatting
Install Unsloth Studio (Windows)
irm https://unsloth.ai/install.ps1 | iex

# Run Unsloth Studio:
unsloth studio -H 0.0.0.0 -p 8888

# Then open http://localhost:8888 in your browser
# Search for lmstudio-community/stable-code-instruct-3b-GGUF to start chatting
Use Hugging Face Spaces for Unsloth
# No setup required
# Open https://huggingface.co/spaces/unsloth/studio in your browser
# Search for lmstudio-community/stable-code-instruct-3b-GGUF to start chatting
- Docker Model Runner
How to use lmstudio-community/stable-code-instruct-3b-GGUF with Docker Model Runner:
docker model run hf.co/lmstudio-community/stable-code-instruct-3b-GGUF:Q4_K_M
- Lemonade
How to use lmstudio-community/stable-code-instruct-3b-GGUF with Lemonade:
Pull the model
# Download Lemonade from https://lemonade-server.ai/
lemonade pull lmstudio-community/stable-code-instruct-3b-GGUF:Q4_K_M
Run and chat with the model
lemonade run user.stable-code-instruct-3b-GGUF-Q4_K_M
List all available models
lemonade list
💫 Community Model> Stable Code Instruct 3B by Stability AI
👾 LM Studio Community models highlights program. Highlighting new & noteworthy models by the community. Join the conversation on Discord.
Model creator: Stability AI
Original model: stable-code-instruct-3b
GGUF quantization: provided by bartowski based on llama.cpp release b2440
Model Summary:
Stable Code Instruct 3B is part of the Stable Code family, instruction tuned using Direct Preference Optimization (DPO) from the base Stable Code 3B model.
This model is great for general-purpose conversational coding, as well as SQL generation.
Prompt Template:
Choose the ChatML preset in LM Studio.
Under the hood, the model will see a prompt that's formatted like so:
<|im_start|>system
A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions.<|im_end|>
<|im_start|>user
{prompt}<|im_end|>
<|im_start|>assistant
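For use outside LM Studio, the same template is easy to reproduce by hand. A minimal sketch in Python, using the system message shown above:

# Build the ChatML-formatted prompt string the model expects.
SYSTEM = (
    "A chat between a curious user and an artificial intelligence assistant. "
    "The assistant gives helpful, detailed, and polite answers to the user's questions."
)

def format_chatml(prompt: str) -> str:
    return (
        f"<|im_start|>system\n{SYSTEM}<|im_end|>\n"
        f"<|im_start|>user\n{prompt}<|im_end|>\n"
        f"<|im_start|>assistant\n"
    )

print(format_chatml("Who are you?"))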
Use case and examples
Stable Code Instruct 3B is trained primarily on coding tasks, with a bit of conversational tuning to provide some assistant behavior. It also excels at SQL generation. This makes it great for lightweight coding work, and its small size makes it extremely fast for code completion.
SQL generation with explanation:
<|im_start|>system
A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions.<|im_end|>
<|im_start|>user
Using python, how can I create an SQL table where the primary key is a combination of two columns?<|im_end|>
<|im_start|>assistant
To create an SQL table with a composite primary key in Python, you can use the `sqlite3` module. Here's an example code snippet:
import sqlite3
# Connect to the database (or create it if it doesn't exist)
conn = sqlite3.connect('example.db')
# Create a cursor object to execute SQL commands
c = conn.cursor()
# Define the table schema with a composite primary key
schema = '''
CREATE TABLE example_table (
column1 TEXT,
column2 INTEGER,
PRIMARY KEY (column1, column2)
);
'''
# Execute the SQL command to create the table
c.execute(schema)
# Commit the changes to the database
conn.commit()
# Close the database connection
conn.close()
In this example, we first import the `sqlite3` module and create a connection to the database (or create it if it doesn't exist). We then create a cursor object to execute SQL commands.
Next, we define the table schema with a composite primary key using the `CREATE TABLE` command. In this example, the table is named `example_table` and has two columns: `column1` (text data type) and `column2` (integer data type). The `PRIMARY KEY` constraint is used to specify that the combination of `column1` and `column2` should be unique for each row in the table.
After defining the table schema, we execute the SQL command using the cursor object. We then commit the changes to the database and close the database connection.
Note that this example assumes that you have already installed the `sqlite3` module in your Python environment. If you haven't done so already, you can install it by running the following command in your terminal or command prompt:
pip install sqlite3
This will install the `sqlite3` module in your Python environment.<|im_end|>
General python coding questions:
<|im_start|>system
A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions.<|im_end|>
<|im_start|>user
Using python, how can I sort a list of integer tuples based on the second item, and then the first?<|im_end|>
<|im_start|>assistant
You can achieve this by using Python's built-in `sort()` function or `sorted()` function. Both these functions allow you to specify multiple sorting conditions.
Here is an example code snippet that demonstrates how to sort a list of integer tuples based on the second item, then the first:
# Sample list of integer tuples
my_list = [(1, 2), (3, 1), (2, 3)]
# Sort the list by the second item, then the first
my_list.sort(key=lambda x: (x[1], x[0]))
# Print the sorted list
print(my_list)
In this example, we use a lambda function as the `key` argument to the `sort()` function. The lambda function takes each tuple in the list and returns a tuple of the second item and the first item of the original tuple. This way, the tuples are sorted based on the second item first, then by the first item if there is a tie.
The output of this code will be:
[(3, 1), (1, 2), (2, 3)]
This means that the list has been sorted in ascending order based on the second item of each tuple, and if there is a tie, it is resolved by sorting based on the first item of each tuple.<|im_end|>
General C coding questions:
<|im_start|>system
A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions.<|im_end|>
<|im_start|>user
Can you write a C program that takes two numbers and prints their sum to the console?<|im_end|>
<|im_start|>assistant
Sure! Here is a simple C program that does exactly what you asked for:
#include <stdio.h>
int main() {
int num1, num2, sum;
printf("Enter first number: ");
scanf("%d", &num1);
printf("Enter second number: ");
scanf("%d", &num2);
sum = num1 + num2;
printf("The sum of %d and %d is %d\n", num1, num2, sum);
return 0;
}
This program first prompts the user to enter two numbers. It then calculates the sum of those two numbers and prints the result to the console.<|im_end|>
Technical Details
The base model was trained on the following datasets:
- tiiuae/falcon-refinedweb
- bigcode/the-stack-github-issues
- bigcode/commitpackft
- bigcode/starcoderdata
- EleutherAI/proof-pile-2
- meta-math/MetaMathQA
Specializing in the following languages:
- C
- CPP
- Java
- JavaScript
- CSS
- Go
- HTML
- Ruby
- Rust
- Markdown
- Shell
- Php
- Sql
- R
- Typescript
- Python
- Jupyter-Clean
- RestructuredText
Special thanks
🙏 Special thanks to Georgi Gerganov and the whole team working on llama.cpp for making all of this possible.
Disclaimers
LM Studio is not the creator, originator, or owner of any Model featured in the Community Model Program. Each Community Model is created and provided by third parties. LM Studio does not endorse, support, represent or guarantee the completeness, truthfulness, accuracy, or reliability of any Community Model. You understand that Community Models can produce content that might be offensive, harmful, inaccurate or otherwise inappropriate, or deceptive. Each Community Model is the sole responsibility of the person or entity who originated such Model. LM Studio may not monitor or control the Community Models and cannot, and does not, take responsibility for any such Model. LM Studio disclaims all warranties or guarantees about the accuracy, reliability or benefits of the Community Models. LM Studio further disclaims any warranty that the Community Model will meet your requirements, be secure, uninterrupted or available at any time or location, or error-free, viruses-free, or that any errors will be corrected, or otherwise. You will be solely responsible for any damage resulting from your use of or access to the Community Models, your downloading of any Community Model, or use of any other Community Model provided by or through LM Studio.
Available GGUF quantizations: 2-bit, 3-bit, 4-bit, 5-bit, 6-bit, and 8-bit.
Evaluation results
Self-reported pass@1 scores on MultiPL-HumanEval:
- Python: 32.4
- C++: 30.9
- Java: 32.1
- JavaScript: 32.1
- PHP: 24.2
- Rust: 23.0