Safetensors please?

#2 opened by aimeri

For best performance with vLLM, safetensors would be ideal. Would it be possible to release the model as safetensors, or as a vLLM-compatible quantization (preferably at a high bpw)? Thank you!
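
For reference, a minimal sketch of serving a safetensors checkpoint with vLLM (the repo id is a placeholder, not this model's actual name):

```python
from vllm import LLM, SamplingParams

# "org/model-name" is a placeholder; vLLM loads safetensors checkpoints directly
llm = LLM(model="org/model-name", dtype="bfloat16")

outputs = llm.generate(["Hello, world!"], SamplingParams(max_tokens=64))
print(outputs[0].outputs[0].text)
```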

+1 - it's also very easy to use mlx_lm.convert with the original safetensors.
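
As a rough sketch of that conversion via the mlx_lm Python API (repo id and output path are placeholders, and defaults may differ across mlx-lm versions):

```python
from mlx_lm import convert

# "org/model-name" is a placeholder repo id;
# quantize=True produces a 4-bit MLX model by default
convert("org/model-name", mlx_path="model-mlx-4bit", quantize=True)
```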

Safetensors would be the best option. I would like to quantize the model to various formats, especially AWQ.
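
For illustration, a minimal AWQ quantization sketch using AutoAWQ, assuming safetensors weights are available (paths and config values below are placeholders):

```python
from awq import AutoAWQForCausalLM
from transformers import AutoTokenizer

model_path = "org/model-name"  # placeholder repo id
quant_path = "model-awq"       # placeholder output directory
quant_config = {"zero_point": True, "q_group_size": 128, "w_bit": 4, "version": "GEMM"}

model = AutoAWQForCausalLM.from_pretrained(model_path)
tokenizer = AutoTokenizer.from_pretrained(model_path)

# Calibrate and quantize the weights, then save the AWQ checkpoint
model.quantize(tokenizer, quant_config=quant_config)
model.save_quantized(quant_path)
tokenizer.save_pretrained(quant_path)
```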
