GPTQ: ACCURATE POST-TRAINING QUANTIZATION FOR GENERATIVE PRE-TRAINED TRANSFORMERS
IN THIS PAPER, WE ADDRESS THIS CHALLENGE AND PROPOSE GPTQ, A NEW ONE-SHOT WEIGHT QUANTIZATION METHOD BASED ON APPROXIMATE SECOND-ORDER INFORMATION THAT IS BOTH HIGHLY ACCURATE AND HIGHLY EFFICIENT.
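
As a rough illustration of what quantization "based on approximate second-order information" can look like, the sketch below quantizes one linear layer column by column and uses the inverse Hessian of the layer-wise squared error to push each column's rounding error onto the columns not yet quantized. It is a minimal sketch under stated assumptions, not the reference implementation: the function name `quantize_with_error_feedback`, the per-row min/max grid, the damping factor `damp`, and the calibration matrix `X` (input features by samples) are all hypothetical choices made for this example.

```python
import numpy as np


def quantize_with_error_feedback(W, X, bits=4, damp=0.01):
    """Quantize W (rows x cols) one column at a time.

    X holds calibration inputs with shape (cols, n_samples).
    Returns integer codes plus the per-row scale and offset of the grid.
    All names and defaults here are illustrative assumptions.
    """
    W = W.astype(np.float64).copy()
    rows, cols = W.shape
    levels = 2 ** bits - 1

    # Per-row asymmetric uniform grid (an assumed, deliberately simple choice).
    w_min = W.min(axis=1, keepdims=True)
    w_max = W.max(axis=1, keepdims=True)
    scale = np.maximum(w_max - w_min, 1e-12) / levels

    # Second-order information: Hessian of ||W X - W_q X||^2 w.r.t. one row of W.
    H = 2.0 * X @ X.T
    # Damping keeps H well conditioned before inversion.
    H += damp * np.mean(np.diag(H)) * np.eye(cols)
    # Upper-triangular Cholesky factor U of H^{-1}, so that H^{-1} = U^T U.
    U = np.linalg.cholesky(np.linalg.inv(H)).T

    codes = np.zeros((rows, cols), dtype=np.int64)
    for j in range(cols):
        col = W[:, j]
        # Round the current column to the nearest grid point.
        c = np.clip(np.round((col - w_min[:, 0]) / scale[:, 0]), 0, levels)
        codes[:, j] = c.astype(np.int64)
        q = c * scale[:, 0] + w_min[:, 0]
        # Spread the rounding error over the remaining columns, weighted by U.
        err = (col - q) / U[j, j]
        W[:, j:] -= np.outer(err, U[j, j:])
    return codes, scale, w_min
```

The dequantized weights can be recovered as `codes * scale + w_min`. The quantization is one-shot in the sense that the loop visits each column exactly once and needs only a small calibration set, with no retraining.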