About This Model
This model is google/functiongemma-270m-it converted to MLX format for native acceleration on Apple Silicon (M1/M2/M3/M4) Macs.
| Property | Value |
|---|---|
| Base Model | google/functiongemma-270m-it |
| Format | MLX |
| Quantization | Q4_K_M |
| License | apache-2.0 |
| Created With | QuantLLM |
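To put the 4-bit quantization in the table above in perspective, here is a rough back-of-the-envelope estimate of weight storage. It is only a sketch under stated assumptions: the parameter count (270 million) is taken from the model name, and the group size of 64 with a 16-bit scale and bias per group mirrors MLX's default group-wise quantization, which this card does not confirm for this export. The helper `quantized_size_mb` is hypothetical and not part of any library.

```python
def quantized_size_mb(n_params: int, bits: int = 4, group_size: int = 64,
                      scale_bits: int = 16, bias_bits: int = 16) -> float:
    """Approximate weight storage in MB (10**6 bytes) for group-wise quantization."""
    # Each group of `group_size` weights shares one scale and one bias,
    # so their cost is spread across the group.
    overhead_bits = (scale_bits + bias_bits) / group_size
    bits_per_weight = bits + overhead_bits
    return n_params * bits_per_weight / 8 / 1e6


# Assumption: nominal parameter count read off the model name.
N_PARAMS = 270_000_000

print(f"4-bit (assumed group size 64): {quantized_size_mb(N_PARAMS):.0f} MB")
print(f"fp16 baseline: {N_PARAMS * 16 / 8 / 1e6:.0f} MB")
```

Under these assumptions the weights come to roughly 152 MB, against about 540 MB at 16-bit precision. The actual files may differ, since some layers can be stored at higher precision.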
Quick Start
Generate Text with mlx-lm
from mlx_lm import load, generate
# Load the model
model, tokenizer = load("QuantLLM/functiongemma-270m-it-4bit-mlx")
# Simple generation
prompt = "Explain quantum computing in simple terms"
messages = [{"role": "user", "content": prompt}]
prompt_formatted = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
)
# Generate response
text = generate(model, tokenizer, prompt=prompt_formatted, verbose=True)
print(text)
Streaming Generation
from mlx_lm import load, stream_generate
model, tokenizer = load("QuantLLM/functiongemma-270m-it-4bit-mlx")
prompt = "Write a haiku about coding"
messages = [{"role": "user", "content": prompt}]
prompt_formatted = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
)
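# Stream text as it is generated; recent mlx-lm releases yield response
# objects whose .text attribute holds the newly decoded chunk
for response in stream_generate(model, tokenizer, prompt=prompt_formatted, max_tokens=200):
    print(response.text, end="", flush=True)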
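When streaming, it is often handy to keep the full response as well as printing it chunk by chunk. The following is a minimal sketch, assuming each streamed item exposes its new text as a `.text` attribute (as recent mlx-lm releases do) or is a plain string. `collect_stream` is a hypothetical helper, and the fake stream stands in for `stream_generate(...)` so the example runs without downloading a model.

```python
from types import SimpleNamespace
from typing import Iterable


def collect_stream(responses: Iterable, echo: bool = True) -> str:
    """Optionally print each streamed chunk as it arrives; return the full text."""
    pieces = []
    for response in responses:
        # Assumption: items carry their text in `.text`; plain strings pass through.
        chunk = response if isinstance(response, str) else response.text
        if echo:
            print(chunk, end="", flush=True)
        pieces.append(chunk)
    if echo:
        print()  # finish the line once the stream ends
    return "".join(pieces)


# Stand-in for stream_generate(...): four chunks with a .text attribute.
fake_stream = (SimpleNamespace(text=t) for t in ["Bugs ", "hide ", "in ", "loops"])
full_text = collect_stream(fake_stream, echo=False)
```

With a real model, the same helper could be called as `collect_stream(stream_generate(model, tokenizer, prompt=prompt_formatted, max_tokens=200))`.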
Command Line Interface
# Install mlx-lm
pip install mlx-lm
# Generate text
python -m mlx_lm.generate --model QuantLLM/functiongemma-270m-it-4bit-mlx --prompt "Hello!"
# Interactive chat
python -m mlx_lm.chat --model QuantLLM/functiongemma-270m-it-4bit-mlx
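For scripted or batch use, the generate command above can be assembled in Python and handed to `subprocess`. This is a sketch only: `build_generate_command` is a hypothetical helper, and the default of 200 tokens is an arbitrary choice passed through the CLI's `--max-tokens` option.

```python
import shlex
import sys


def build_generate_command(model: str, prompt: str, max_tokens: int = 200) -> list[str]:
    """Return the argv list for one `mlx_lm.generate` invocation."""
    return [
        sys.executable, "-m", "mlx_lm.generate",
        "--model", model,
        "--prompt", prompt,
        "--max-tokens", str(max_tokens),
    ]


cmd = build_generate_command("QuantLLM/functiongemma-270m-it-4bit-mlx", "Hello!")
# Show the shell-quoted command; run it with subprocess.run(cmd, check=True).
print(shlex.join(cmd))
```

Passing the arguments as a list avoids shell-quoting problems when prompts contain quotes or spaces.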
System Requirements
| Requirement | Minimum |
|---|---|
| Chip | Apple Silicon (M1/M2/M3/M4) |
| macOS | 13.0 (Ventura) or later |
| Python | 3.10+ |
| RAM | 8GB+ (16GB recommended) |
# Install dependencies
pip install mlx-lm
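The requirements in the table above can be checked before installing anything. The sketch below uses only the Python standard library; `check_requirements` is a hypothetical helper, and it does not check RAM because the standard library has no portable way to query it.

```python
import platform
import sys


def check_requirements(min_python=(3, 10), min_macos=(13, 0)) -> dict:
    """Compare the current machine against the minimums in the table."""
    is_macos = sys.platform == "darwin"
    # platform.mac_ver() returns an empty version string on non-macOS systems.
    version = platform.mac_ver()[0]
    parts = [int(p) for p in version.split(".")[:2]] if version else [0, 0]
    parts += [0] * (2 - len(parts))  # pad "14" to (14, 0)
    return {
        "apple_silicon": is_macos and platform.machine() == "arm64",
        "macos_ok": is_macos and tuple(parts) >= tuple(min_macos),
        "python_ok": sys.version_info[:2] >= tuple(min_python),
    }


for name, ok in check_requirements().items():
    print(f"{name}: {'ok' if ok else 'not met'}")
```

Note that a Python interpreter running under Rosetta reports `x86_64`, so `apple_silicon` would come back as not met even on an M-series Mac.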
Model Details
| Property | Value |
|---|---|
| Original Model | google/functiongemma-270m-it |
| Format | MLX |
| Quantization | Q4_K_M |
| License | apache-2.0 |
| Export Date | 2025-12-21 |
| Exported By | QuantLLM v2.0 |
Created with QuantLLM
Convert any model to GGUF, ONNX, or MLX in one line!
from quantllm import turbo
# Load any HuggingFace model
model = turbo("google/functiongemma-270m-it")
# Export to any format
model.export("mlx", quantization="Q4_K_M")
# Push to HuggingFace
model.push("your-repo", format="mlx")
Documentation · Report Issue · Request Feature
Model size: 0.3B params
Tensor type: F32 · F16 · I8