Safetensors please?
#2 · opened by aimeri
For best performance with vLLM, safetensors would be ideal. Would it be possible to release the model as safetensors, or as a vLLM-compatible quantization (preferably at a high bpw)? Thank you!
+1. It would also be very easy to run mlx_lm.convert on the original safetensors.
Safetensors would be the best option. I would like to quantize the model to various formats, especially AWQ.
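For context on why safetensors is the convenient starting point for all of these conversions: the format is just an 8-byte little-endian header size, a JSON index of tensor dtypes/shapes/offsets, and the raw tensor bytes, so loaders can memory-map weights directly. A minimal stdlib-only sketch of that layout (the helper names `save_safetensors` and `read_header` are illustrative, not a real API; in practice you would use `safetensors.torch.save_file` or `model.save_pretrained(out_dir, safe_serialization=True)` in transformers):

```python
import json
import struct
import tempfile

def save_safetensors(path, tensors):
    """Sketch writer. tensors: name -> (dtype_str, shape, raw LE bytes)."""
    header, payload, offset = {}, b"", 0
    for name, (dtype, shape, raw) in tensors.items():
        header[name] = {"dtype": dtype, "shape": shape,
                        "data_offsets": [offset, offset + len(raw)]}
        offset += len(raw)
        payload += raw
    hbytes = json.dumps(header).encode("utf-8")
    with open(path, "wb") as f:
        f.write(struct.pack("<Q", len(hbytes)))  # 8-byte LE header size
        f.write(hbytes)                          # JSON tensor index
        f.write(payload)                         # raw tensor data

def read_header(path):
    """Sketch reader: parse just the JSON index, as a loader would."""
    with open(path, "rb") as f:
        (n,) = struct.unpack("<Q", f.read(8))
        return json.loads(f.read(n))

# Two tiny F32 tensors (4 bytes per element, little-endian).
weights = {
    "w": ("F32", [2, 2], struct.pack("<4f", 1, 2, 3, 4)),
    "b": ("F32", [2], struct.pack("<2f", 0.5, -0.5)),
}
path = tempfile.mktemp(suffix=".safetensors")
save_safetensors(path, weights)
hdr = read_header(path)
print(hdr["w"]["data_offsets"])  # [0, 16]
```

Because the header alone describes every tensor, downstream tools (vLLM, mlx_lm.convert, AWQ quantizers) can read shapes and dtypes without deserializing a pickle, which is the safety and speed argument for releasing safetensors in the first place.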