Quantized version of ai-sage/GigaChat3-10B-A1.8B-bf16

This model requires llama.cpp with PR#17420 applied so that the architecture is properly detected as a DeepSeek Lite (deepseek2) model.
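
If you need a build of llama.cpp with that PR, a minimal sketch (the `pull/17420/head` fetch ref is standard GitHub behavior; the CUDA flag is an assumption, omit it for a CPU-only build):

```bash
# Fetch the PR branch and build llama.cpp
git clone https://github.com/ggml-org/llama.cpp
cd llama.cpp
git fetch origin pull/17420/head:pr-17420
git checkout pr-17420
cmake -B build -DGGML_CUDA=ON   # assumption: CUDA build
cmake --build build --config Release -j
```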

It also requires the chat_template.jinja file from this repository, passed via --chat-template-file in the Quick Start below.
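
One way to fetch the model and template into the ./output directory used below, a sketch assuming the repo's filenames match those in the Quick Start:

```bash
# Download the GGUF and chat template from this repo
# (filenames are assumptions; check the repo's file list)
huggingface-cli download evilfreelancer/GigaChat3-10B-A1.8B-GGUF \
  GigaChat3-10B-A1.8B-f16.gguf chat_template.jinja \
  --local-dir ./output
```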

Quick Start

```bash
export CUDA_VISIBLE_DEVICES=0
export model="./output/GigaChat3-10B-A1.8B-f16.gguf"

# -cmoe keeps the MoE expert weights on the CPU; --jinja enables the
# Jinja chat template passed via --chat-template-file
./build/bin/llama-server \
  --model "$model" \
  --ctx-size 2000 \
  --parallel 1 \
  --threads 8 \
  --host 0.0.0.0 \
  --port 8088 \
  -cmoe \
  --jinja \
  --chat-template-file ./output/chat_template.jinja
```
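
Once the server is up, you can smoke-test it through llama-server's OpenAI-compatible chat endpoint (the port matches --port 8088 above):

```bash
curl http://localhost:8088/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"messages": [{"role": "user", "content": "Hello!"}]}'
```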

Model details

- Format: GGUF
- Architecture: deepseek2
- Model size: 11B params
- Available quantizations: 8-bit, 16-bit