Hi, I'm trying to evaluate some QuIP-quantized Llama models, but I couldn't find any evaluation code or scripts in your repo. Could you let me know how to do this? Is there a part of your code that handles it? If not, how can I create a fake-quantized model so that I can load it via transformers without problems?