deepseek-ai-deepseek-r1-distill-qwen-1.5b-f16

This repository contains a GGUF conversion (F16, half precision) of deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B.

Model Details

  • Base model: deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B
  • Architecture: qwen2
  • Format: GGUF

Quantization Information

  • F16: Half precision (16-bit floating point); preserves essentially the original model quality at roughly 2 bytes per parameter

Usage

With llama.cpp

# Download the model
huggingface-cli download Kaleemullah/deepseek-ai-deepseek-r1-distill-qwen-1.5b-f16 deepseek-ai-deepseek-r1-distill-qwen-1.5b-f16.gguf --local-dir ./models

# Run inference
./llama-cli -m ./models/deepseek-ai-deepseek-r1-distill-qwen-1.5b-f16.gguf -p "Your prompt here"
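
Context size and GPU offload can be adjusted on the command line. A minimal example, assuming a recent llama.cpp build where the -c (context size) and -ngl (GPU layer offload) flags are available; check ./llama-cli --help for your build:

# Larger context window; offload all layers to the GPU if one is available
./llama-cli -m ./models/deepseek-ai-deepseek-r1-distill-qwen-1.5b-f16.gguf -p "Your prompt here" -c 4096 -ngl 99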

With Python (llama-cpp-python)

from llama_cpp import Llama

# Load the model
llm = Llama(
    model_path="./models/deepseek-ai-deepseek-r1-distill-qwen-1.5b-f16.gguf",
    n_ctx=2048,  # Context window
    n_gpu_layers=-1  # Use GPU if available
)

# Generate text
output = llm("Your prompt here", max_tokens=100)
print(output['choices'][0]['text'])
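
The R1-Distill models are chat-tuned, so chat-style calls usually work better than raw completion. A minimal sketch using llama-cpp-python's create_chat_completion, assuming the GGUF file carries the chat template (which the library can pick up automatically):

# Chat-style generation using the model's built-in chat template
response = llm.create_chat_completion(
    messages=[
        {"role": "user", "content": "Explain what the GGUF format is in one sentence."}
    ],
    max_tokens=256
)
print(response["choices"][0]["message"]["content"])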

With Ollama

# Create a Modelfile
echo 'FROM ./models/deepseek-ai-deepseek-r1-distill-qwen-1.5b-f16.gguf' > Modelfile

# Create the model
ollama create my-model -f Modelfile

# Run the model
ollama run my-model
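
The Modelfile can also carry sampling settings. A sketch using Ollama's PARAMETER directives; the 0.6 temperature follows the usage recommendation on the upstream DeepSeek-R1 model card:

# Extended Modelfile with explicit sampling and context settings
cat > Modelfile <<'EOF'
FROM ./models/deepseek-ai-deepseek-r1-distill-qwen-1.5b-f16.gguf
PARAMETER temperature 0.6
PARAMETER num_ctx 4096
EOF

Then recreate the model with the same ollama create command as above.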

Model Architecture

This is a GGUF export (F16) of deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B, converted for efficient inference with llama.cpp-compatible runtimes while preserving the original model quality.
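
You can verify the embedded metadata (architecture, context length, tensor count) directly from the GGUF file. A small sketch using the gguf Python package (pip install gguf); the exact field names are read from the file itself:

# Inspect GGUF metadata without loading the model
from gguf import GGUFReader

reader = GGUFReader("./models/deepseek-ai-deepseek-r1-distill-qwen-1.5b-f16.gguf")
for name in reader.fields:
    print(name)  # e.g. general.architecture, qwen2.context_length, ...
print(len(reader.tensors), "tensors")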

License

This model inherits the license of the original model, deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B; see the upstream model card for the exact terms.

Citation

If you use this model, please cite the original model:

@misc{deepseek-ai-DeepSeek-R1-Distill-Qwen-1.5B,
  author = {DeepSeek-AI},
  title = {DeepSeek-R1-Distill-Qwen-1.5B},
  year = {2025},
  publisher = {Hugging Face},
  url = {https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B}
}

Converted with

This model was converted using llama.cpp's convert_hf_to_gguf.py script.
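
For reference, the conversion step looks roughly like this (a sketch; paths are placeholders and flag names should be checked against your llama.cpp checkout):

# Convert the original Hugging Face weights to an F16 GGUF file
python convert_hf_to_gguf.py /path/to/DeepSeek-R1-Distill-Qwen-1.5B \
    --outtype f16 \
    --outfile deepseek-ai-deepseek-r1-distill-qwen-1.5b-f16.gguf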


Note: GGUF models are compatible with llama.cpp, Ollama, LM Studio, and other GGUF-compatible inference engines.
