MiniCPM5-1B-Hindi-Instruct v1 — GGUF Quantizations

GGUF quantized versions of pankajpandey-dev/MiniCPM5-1B-Hindi-Instruct for efficient local inference with llama.cpp, Ollama, LM Studio, and other GGUF-compatible runtimes.

Part of the 🇮🇳 Hindi LLM Series by @pankajpandey-dev.

Available Quantizations

| File | Quant | Size | Recommended Use |
|---|---|---|---|
| MiniCPM5-1B-Hindi-Instruct-Q3_K_M.gguf | Q3_K_M | ~560 MB | Mobile, low-RAM devices, fast inference |
| MiniCPM5-1B-Hindi-Instruct-Q4_K_M.gguf | Q4_K_M | ~670 MB | Recommended: best size/quality balance |
| MiniCPM5-1B-Hindi-Instruct-Q5_K_M.gguf | Q5_K_M | ~790 MB | Better quality, slightly larger |
| MiniCPM5-1B-Hindi-Instruct-Q6_K.gguf | Q6_K | ~900 MB | Near-lossless quality |
| MiniCPM5-1B-Hindi-Instruct-Q8_0.gguf | Q8_0 | ~1.2 GB | Highest quality, essentially full precision |

Quick Start

llama.cpp

huggingface-cli download pankajpandey-dev/MiniCPM5-1B-Hindi-Instruct-v1-GGUF MiniCPM5-1B-Hindi-Instruct-Q4_K_M.gguf --local-dir .

./llama-cli -m MiniCPM5-1B-Hindi-Instruct-Q4_K_M.gguf \
    -p "नमस्ते! बारिश के दिन पर एक छोटी कविता लिखो।" \
    -n 256 --temp 0.7 --top-p 0.9 --repeat-penalty 1.1

Ollama

Create a Modelfile:

FROM ./MiniCPM5-1B-Hindi-Instruct-Q4_K_M.gguf
PARAMETER temperature 0.7
PARAMETER top_p 0.9
PARAMETER repeat_penalty 1.1

Then run:

ollama create hindi-minicpm5 -f Modelfile
ollama run hindi-minicpm5 "मशीन लर्निंग क्या है?"
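
If you switch between quants often, the Modelfile can also be generated rather than edited by hand. The following is a minimal sketch, not part of this repo: `build_modelfile` is a hypothetical helper that only emits the `FROM` and `PARAMETER` lines used in the Modelfile for this model.

```python
def build_modelfile(gguf_path, parameters):
    """Return Modelfile text with one FROM line and one PARAMETER line per entry."""
    lines = [f"FROM {gguf_path}"]
    lines += [f"PARAMETER {name} {value}" for name, value in parameters.items()]
    return "\n".join(lines) + "\n"


# Values taken from the Modelfile in this section.
modelfile_text = build_modelfile(
    "./MiniCPM5-1B-Hindi-Instruct-Q4_K_M.gguf",
    {"temperature": 0.7, "top_p": 0.9, "repeat_penalty": 1.1},
)
print(modelfile_text)
```

Save the printed text as `Modelfile` and run the same `ollama create` command as before.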

LM Studio

  1. Download any .gguf file from this repo
  2. Open LM Studio → Local Models → load the file
  3. Use chat template: ChatML (<|im_start|> / <|im_end|>)
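
For runtimes that take a raw prompt string instead of applying a chat template, a ChatML prompt can be assembled by hand. This is a sketch assuming the conventional ChatML layout (one `<|im_start|>role ... <|im_end|>` block per message); the template stored in the GGUF metadata remains the authoritative version, and `format_chatml` is a hypothetical helper name.

```python
def format_chatml(messages, add_generation_prompt=True):
    """Render a list of {"role", "content"} dicts as a ChatML prompt string."""
    parts = []
    for message in messages:
        parts.append(
            f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n"
        )
    if add_generation_prompt:
        # Leave an open assistant turn for the model to complete.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


prompt = format_chatml([{"role": "user", "content": "मशीन लर्निंग क्या है?"}])
print(prompt)
```

When sending a prompt built this way, it usually helps to pass `<|im_end|>` as a stop sequence so generation ends at the close of the assistant turn.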

Python (llama-cpp-python)

from llama_cpp import Llama

llm = Llama(
    model_path = "MiniCPM5-1B-Hindi-Instruct-Q4_K_M.gguf",
    n_ctx      = 2048,
    n_threads  = 4,
)

response = llm.create_chat_completion(
    messages = [
        {"role": "user", "content": "भारत के बारे में एक रोचक तथ्य बताइए।"}
    ],
    temperature = 0.7,
    top_p       = 0.9,
    max_tokens  = 256,
)
print(response["choices"][0]["message"]["content"])

Recommended Generation Parameters

  • temperature: 0.7 (range 0.5–0.9)
  • top_p: 0.9
  • repeat_penalty: 1.1
  • max_tokens: 256–512 depending on task
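
To keep these settings in one place, they can be collected into a dictionary and unpacked into a llama-cpp-python call. The sketch below is illustrative: `generation_kwargs` and `outside_recommended` are hypothetical helpers, and the ranges are simply the ones listed above, not limits enforced by any runtime.

```python
RECOMMENDED = {
    "temperature": 0.7,
    "top_p": 0.9,
    "repeat_penalty": 1.1,
    "max_tokens": 256,
}

# Ranges copied from the list above: (low, high), inclusive.
RECOMMENDED_RANGES = {
    "temperature": (0.5, 0.9),
    "max_tokens": (256, 512),
}


def generation_kwargs(**overrides):
    """Return the recommended settings with any per-call overrides applied."""
    return {**RECOMMENDED, **overrides}


def outside_recommended(params):
    """List the parameter names whose values fall outside the suggested ranges."""
    return [
        name
        for name, (low, high) in RECOMMENDED_RANGES.items()
        if name in params and not low <= params[name] <= high
    ]


params = generation_kwargs(max_tokens=512)
print(params, outside_recommended(params))
```

With the `llm` object from the Python example, the call would then be `llm.create_chat_completion(messages=messages, **params)`.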

Choosing the Right Quant

  • Phone / Raspberry Pi / 2 GB RAM: Q3_K_M or Q4_K_M
  • Laptop / desktop CPU: Q4_K_M or Q5_K_M (best default)
  • Quality-focused workflows: Q6_K or Q8_0
  • Research / reproducibility: Q8_0
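
The guidance above can be encoded as a small lookup, for example in a download script. This is only a sketch: the profile keys (`low_ram`, `cpu`, `quality`, `research`) are hypothetical labels for the four bullets, and the filenames follow the naming pattern in the quantization table.

```python
# Profile labels are illustrative; the quant lists mirror the bullets above.
RECOMMENDED_QUANTS = {
    "low_ram": ["Q3_K_M", "Q4_K_M"],   # phone / Raspberry Pi / 2 GB RAM
    "cpu": ["Q4_K_M", "Q5_K_M"],       # laptop / desktop CPU
    "quality": ["Q6_K", "Q8_0"],       # quality-focused workflows
    "research": ["Q8_0"],              # research / reproducibility
}


def recommended_files(profile):
    """Return the GGUF filenames suggested for a profile, in listed order."""
    return [
        f"MiniCPM5-1B-Hindi-Instruct-{quant}.gguf"
        for quant in RECOMMENDED_QUANTS[profile]
    ]


print(recommended_files("cpu"))
```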

Base Model & Training

These quants are derived from the full-precision merged 16-bit model at pankajpandey-dev/MiniCPM5-1B-Hindi-Instruct, which was fine-tuned from openbmb/MiniCPM5-1B on AI4Bharat's Hindi instruction datasets.

See the main model card for full training details.

Acknowledgements

Citation

@misc{pandey2026minicpm5hindigguf,
  title  = {MiniCPM5-1B-Hindi-Instruct v1 GGUF},
  author = {Pankaj Pandey},
  year   = {2026},
  url    = {https://huggingface.co/pankajpandey-dev/MiniCPM5-1B-Hindi-Instruct-v1-GGUF}
}

Part of an ongoing effort to bring strong open-source LLMs to Indian languages. Feedback welcome via the community tab.
