Carbon-3B-GGUF

GGUF quantizations of HuggingFaceBio/Carbon-3B, a generative DNA foundation model, for efficient inference with llama.cpp.

Carbon-3B is a 3B-parameter autoregressive genomic model that uses a hybrid tokenizer (6-mer chunks for DNA inside <dna> tags, BPE for English text). It matches the performance of Evo2-7B while running over 250× faster at inference.

📦 Available Quantizations

| File | Quant Method | Size | Use Case |
|---|---|---|---|
| carbon-3b-Q2_K.gguf | Q2_K | 1.4 GB | Smallest; for very constrained edge devices |
| carbon-3b-Q3_K_M.gguf | Q3_K_M | 1.8 GB | Small with reasonable quality |
| carbon-3b-Q4_K_M.gguf | Q4_K_M | 2.1 GB | ⭐ Recommended: best size/quality balance |
| carbon-3b-Q5_K_M.gguf | Q5_K_M | 2.4 GB | Higher quality |
| carbon-3b-Q6_K.gguf | Q6_K | 2.7 GB | Very close to f16 quality |
| carbon-3b-Q8_0.gguf | Q8_0 | 3.5 GB | Highest quality, near-lossless |
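As a rough aid for choosing a file, the sketch below picks the largest quant whose file size fits a given budget. `pick_quant` and the `QUANTS` list are hypothetical helpers written for this example; the sizes are copied from the table above and expressed in tenths of a GB to stay in integer arithmetic. File size is only a lower bound on memory use, since the KV cache and runtime buffers come on top.

```shell
#!/usr/bin/env bash
# File sizes from the table above, in tenths of a GB (1.4 GB -> 14),
# ordered from smallest to largest so the last match is the best fit.
QUANTS="Q2_K:14 Q3_K_M:18 Q4_K_M:21 Q5_K_M:24 Q6_K:27 Q8_0:35"

# pick_quant BUDGET_TENTHS_GB  (hypothetical helper)
# Prints the file name of the largest quant whose file size fits the budget,
# or returns 1 if even Q2_K is too large.
pick_quant() {
    local budget="$1" best="" entry name size
    for entry in $QUANTS; do
        name="${entry%%:*}"   # text before the colon, e.g. Q4_K_M
        size="${entry##*:}"   # text after the colon, e.g. 21
        if [ "$size" -le "$budget" ]; then
            best="$name"
        fi
    done
    [ -n "$best" ] || return 1
    echo "carbon-3b-${best}.gguf"
}

pick_quant 25   # 2.5 GB budget
```

With a 2.5 GB budget this selects the Q5_K_M file; leave headroom below your actual memory limit when setting the budget.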

⚙️ Requirements

You must use a recent build of llama.cpp that includes PR #23410, which added support for Carbon's custom HybridDNATokenizer. Any build of master made after that PR was merged works.

git clone https://github.com/ggml-org/llama.cpp
cd llama.cpp && cmake -B build && cmake --build build -j

🚀 Usage

Download a quant

huggingface-cli download pankajpandey-dev/Carbon-3B-GGUF carbon-3b-Q4_K_M.gguf --local-dir .

DNA sequence continuation

Wrap DNA sequences in <dna> tags so the tokenizer switches into 6-mer mode:

./build/bin/llama-completion -m carbon-3b-Q4_K_M.gguf \
    -p '<dna>ATGCGCTAGCTACGATCGATCGTAGCTAGCTAGCTAGCTACG' \
    -n 64 --temp 0 -no-cnv

Conditional generation with metadata tags

./build/bin/llama-completion -m carbon-3b-Q4_K_M.gguf \
    -p '<vertebrate_mammalian><protein_coding_region><dna>ATGCGCTAG' \
    -n 64 --temp 0 -no-cnv

Long-context generation (up to 131k 6-mer tokens ≈ 786 kbp)

./build/bin/llama-completion -m carbon-3b-Q4_K_M.gguf \
    -c 65536 --rope-scaling yarn --rope-scale 4 --yarn-orig-ctx 32768 \
    -p '<dna>...' -n 64 --temp 0 -no-cnv
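The numbers in this section follow from simple arithmetic, sketched below. `tokens_for_bp` and `bp_for_tokens` are hypothetical helpers for this example. They assume exactly one token per 6 bases, round a trailing partial 6-mer up to a full token (how the tokenizer actually treats a partial chunk is not stated here), and ignore tag and special tokens.

```shell
#!/usr/bin/env bash
# tokens_for_bp BP  (hypothetical helper)
# Number of 6-mer tokens needed to cover BP base pairs, rounded up.
tokens_for_bp() {
    echo $(( ($1 + 5) / 6 ))
}

# bp_for_tokens TOKENS  (hypothetical helper)
# Number of base pairs covered by TOKENS 6-mer tokens.
bp_for_tokens() {
    echo $(( $1 * 6 ))
}

bp_for_tokens 131072     # 786432 bp, the ~786 kbp quoted in the heading
tokens_for_bp 300000     # 50000 tokens, which fits inside -c 65536
echo $(( 32768 * 4 ))    # 131072: --yarn-orig-ctx multiplied by --rope-scale
```

A quick check like this before a run helps size `-c` for a given input: the value passed to `-c` must cover the prompt tokens plus the `-n` tokens to be generated.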

⚠️ Important Notes

  • DNA tag is mandatory. Raw DNA without the <dna> tag will be BPE-tokenized as English text, producing meaningless results.
  • 6-mer constraint. DNA sequences must contain only [ATCG] characters. Any other letters trigger an <oov> token.
  • fns revision is not supported. This GGUF is converted from the main revision only. The fns branch uses a factorized nucleotide supervision head that requires the Python transformers path.
  • Carbon-500M (also available as GGUF) can be used as a draft model for speculative decoding to speed up Carbon-3B inference with no quality loss.
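The first two notes can be enforced before a prompt ever reaches the model. The sketch below is a hypothetical helper, `dna_prompt`, that rejects anything other than uppercase A, C, G, T and prepends the `<dna>` tag. Treating lowercase bases as invalid is an assumption of this sketch; the notes above only say that sequences must contain [ATCG] characters.

```shell
#!/usr/bin/env bash
# dna_prompt SEQUENCE  (hypothetical helper)
# Prints "<dna>SEQUENCE" if SEQUENCE is non-empty and contains only the
# uppercase letters A, C, G, T; otherwise writes an error to stderr and
# returns 1.
dna_prompt() {
    local seq="$1"
    case "$seq" in
        ""|*[!ACGT]*)
            echo "error: sequence must be non-empty and contain only A, C, G, T" >&2
            return 1
            ;;
    esac
    printf '<dna>%s\n' "$seq"
}

# Example use with the command from the Usage section (not run here):
#   prompt="$(dna_prompt ATGCGCTAG)" && \
#     ./build/bin/llama-completion -m carbon-3b-Q4_K_M.gguf \
#         -p "$prompt" -n 64 --temp 0 -no-cnv
```

Failing early this way avoids silent `<oov>` tokens or BPE-tokenized DNA, both of which would otherwise only show up as poor output.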

📊 Quantization Method

These quants were produced from the original bf16 HuggingFaceBio/Carbon-3B safetensors using convert_hf_to_gguf.py followed by llama-quantize from llama.cpp (build b9330). Standard K-quants and Q8_0 were used. No imatrix calibration was applied, as standard text calibration corpora are not appropriate for DNA-pretrained models.
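A minimal sketch of that two-step pipeline follows. It is not necessarily the exact set of commands used for these files: the local directory `./Carbon-3B`, the intermediate file name `carbon-3b-bf16.gguf`, and the helper `print_quantize_cmds` are assumptions made for illustration. The helper is a dry run that only prints the `llama-quantize` invocations so they can be reviewed first.

```shell
#!/usr/bin/env bash
# print_quantize_cmds SRC_GGUF  (hypothetical helper)
# Dry run: prints one llama-quantize command per quant type listed in this
# card, without executing anything.
print_quantize_cmds() {
    local src="$1" q
    for q in Q2_K Q3_K_M Q4_K_M Q5_K_M Q6_K Q8_0; do
        echo "./build/bin/llama-quantize $src carbon-3b-$q.gguf $q"
    done
}

# Step 1 (run manually): convert the HF checkpoint to a bf16 GGUF.
# ./Carbon-3B is an assumed local directory holding the safetensors.
#   python convert_hf_to_gguf.py ./Carbon-3B \
#       --outtype bf16 --outfile carbon-3b-bf16.gguf

# Step 2: print the quantize commands for review.
print_quantize_cmds carbon-3b-bf16.gguf
```

Once the printed commands look right, piping them to `sh` from the llama.cpp directory runs them in sequence.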

📜 License

Apache 2.0, inherited from the source model.

🙏 Credits

🔗 Related


Part of my GGUF Quantizations collection, making open models accessible across hardware tiers.
