Hello, thanks for the amazing work!
I'm seeing very slow loading and inference with this model using transformers (4.52.4) and PyTorch (2.7.0) on an RTX 5090: generation runs at less than 1 token/s. Have you experienced this?
Yeah, same here.