How to use from Docker Model Runner:
docker model run hf.co/Jackrong/Qwopus3.6-35B-A3B-v1-MTP-GGUF

Jackrong/Qwopus3.6-35B-A3B-v1-MTP-GGUF

⚡ What is MTP (Multi-Token Prediction)?
MTP (Multi-Token Prediction) is a technique introduced in the Qwen3.6 architecture that lets the model predict several future tokens at once. Its dedicated MTP heads enable speculative decoding: a draft model proposes multiple tokens in one step and the target model verifies them in parallel, which speeds up inference without sacrificing output quality.
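
The draft-then-verify loop described above can be sketched with toy stand-ins for the two models. This is a minimal illustration under stated assumptions, not how llama.cpp or this model implements it: `draft_next` and `target_next` are hypothetical functions that return the next token for a context, acceptance is greedy (a drafted token is kept only if it equals the target's own choice), and verification is written as a loop where a real engine would use one batched forward pass.

```python
from typing import Callable, List

Token = int
NextToken = Callable[[List[Token]], Token]  # hypothetical "model": context -> next token


def speculative_step(context: List[Token], draft_next: NextToken,
                     target_next: NextToken, k: int) -> List[Token]:
    """Draft k tokens, then keep the longest prefix the target agrees with."""
    # Draft phase: the cheap model proposes k tokens autoregressively.
    draft: List[Token] = []
    for _ in range(k):
        draft.append(draft_next(context + draft))

    # Verify phase: compare each drafted token with the target's greedy choice.
    accepted: List[Token] = []
    for tok in draft:
        expected = target_next(context + accepted)
        if tok != expected:
            # First mismatch: emit the target's token instead and stop.
            accepted.append(expected)
            return accepted
        accepted.append(tok)

    # Every drafted token was accepted, so the target adds one more token.
    accepted.append(target_next(context + accepted))
    return accepted


def generate(prompt: List[Token], draft_next: NextToken,
             target_next: NextToken, k: int, max_new: int) -> List[Token]:
    """Run speculative steps until max_new tokens have been produced."""
    out = list(prompt)
    while len(out) - len(prompt) < max_new:
        out.extend(speculative_step(out, draft_next, target_next, k))
    return out[:len(prompt) + max_new]


def greedy(prompt: List[Token], target_next: NextToken, max_new: int) -> List[Token]:
    """Plain one-token-at-a-time decoding with the target model only."""
    out = list(prompt)
    for _ in range(max_new):
        out.append(target_next(out))
    return out


def toy_target(ctx: List[Token]) -> Token:
    # Toy target: counts upward modulo 10.
    return (ctx[-1] + 1) % 10


def toy_draft(ctx: List[Token]) -> Token:
    # Toy draft: agrees with the target except right after token 4.
    return 0 if ctx[-1] == 4 else (ctx[-1] + 1) % 10
```

With greedy acceptance the speculative output matches plain target decoding token for token, so `generate([0], toy_draft, toy_target, 3, 12)` and `greedy([0], toy_target, 12)` return the same sequence. The speedup comes only from accepting several tokens per verification step.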

This GGUF release preserves the MTP heads from unsloth/Qwen3.6-35B-A3B, making it compatible with mainstream inference frameworks that support MTP-based speculative decoding (such as llama.cpp and its derivatives). For optimal throughput, pair this MTP-enabled GGUF with a corresponding draft model.

Source model: Jackrong/Qwopus3.6-35B-A3B-v1
MTP source: unsloth/Qwen3.6-35B-A3B

Uploaded GGUF variants:

  • Qwopus3.6-35B-A3B-v1-MTP-Q2_K.gguf
  • Qwopus3.6-35B-A3B-v1-MTP-Q3_K_S.gguf
  • Qwopus3.6-35B-A3B-v1-MTP-Q3_K_M.gguf
  • Qwopus3.6-35B-A3B-v1-MTP-Q3_K_L.gguf
  • Qwopus3.6-35B-A3B-v1-MTP-IQ4_XS.gguf
  • Qwopus3.6-35B-A3B-v1-MTP-Q4_K_S.gguf
  • Qwopus3.6-35B-A3B-v1-MTP-Q4_K_M.gguf
  • Qwopus3.6-35B-A3B-v1-MTP-Q5_K_S.gguf
  • Qwopus3.6-35B-A3B-v1-MTP-Q5_K_M.gguf
  • Qwopus3.6-35B-A3B-v1-MTP-Q6_K.gguf
  • Qwopus3.6-35B-A3B-v1-MTP-Q8_0.gguf
  • Qwopus3.6-35B-A3B-v1-MTP-BF16.gguf
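
The variant names above follow one pattern, so a file can be selected by its quantization tag. The sketch below builds the filename and, optionally, downloads it with `hf_hub_download` from the `huggingface_hub` package. It assumes the files sit at the root of the repository, which this page does not state. The helper names `gguf_filename` and `download_variant` are hypothetical.

```python
REPO_ID = "Jackrong/Qwopus3.6-35B-A3B-v1-MTP-GGUF"

# Quantization tags taken from the list of uploaded variants.
QUANTS = [
    "Q2_K", "Q3_K_S", "Q3_K_M", "Q3_K_L", "IQ4_XS", "Q4_K_S",
    "Q4_K_M", "Q5_K_S", "Q5_K_M", "Q6_K", "Q8_0", "BF16",
]


def gguf_filename(quant: str) -> str:
    """Return the GGUF filename for a quantization tag from the list above."""
    if quant not in QUANTS:
        raise ValueError(f"unknown variant {quant!r}; expected one of {QUANTS}")
    return f"Qwopus3.6-35B-A3B-v1-MTP-{quant}.gguf"


def download_variant(quant: str) -> str:
    """Download one variant and return its local path.

    Assumption: the file is stored at the repository root.
    Requires the third-party huggingface_hub package.
    """
    from huggingface_hub import hf_hub_download

    return hf_hub_download(repo_id=REPO_ID, filename=gguf_filename(quant))
```

For example, `gguf_filename("Q4_K_M")` yields `Qwopus3.6-35B-A3B-v1-MTP-Q4_K_M.gguf`, and `download_variant("Q4_K_M")` would fetch that file into the local Hugging Face cache.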

This release was prepared by validating or injecting Qwen MTP/nextn tensors before GGUF conversion.
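
One way to check that a downloaded file still carries such tensors is to list its tensor names and filter for MTP-related ones. The sketch below uses `GGUFReader` from the `gguf` Python package. The substrings `nextn` and `mtp` are assumed markers based on the wording above, not confirmed tensor names for this release, and the example name in the usage note is hypothetical.

```python
from typing import Iterable, List, Sequence


def find_mtp_tensors(names: Iterable[str],
                     markers: Sequence[str] = ("nextn", "mtp")) -> List[str]:
    """Return the tensor names that contain any of the assumed marker substrings."""
    return [n for n in names if any(m in n.lower() for m in markers)]


def tensor_names(path: str) -> List[str]:
    """List all tensor names stored in a GGUF file.

    Requires the third-party gguf package.
    """
    from gguf import GGUFReader

    reader = GGUFReader(path)
    return [t.name for t in reader.tensors]
```

On a local file this could be used as `find_mtp_tensors(tensor_names("Qwopus3.6-35B-A3B-v1-MTP-Q4_K_M.gguf"))`. An empty result would suggest either that the MTP tensors are missing or that they use names other than the assumed markers.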
