LLM Compression: Enhancing AWQ

Graduation project focused on improving AWQ (Activation-aware Weight Quantization) with extra scaling.
- Obtained lower perplexity for INT3-quantized OPT and Llama 2 models.
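
For context, the baseline idea behind AWQ can be sketched as follows: scale each input channel of a weight matrix by a factor derived from its average activation magnitude, quantize, then undo the scale, choosing the scaling exponent that minimizes output error on calibration data. This is a minimal NumPy sketch of that baseline only, not the project's extra scaling step, which is not described here. The function names `quantize_rtn` and `awq_style_quantize`, the per-row quantization (AWQ normally quantizes in groups), and the grid size are illustrative assumptions.

```python
import numpy as np

def quantize_rtn(w, n_bits=3):
    """Asymmetric round-to-nearest quantization per output row.

    Returns the dequantized (float) weights so errors can be compared directly.
    """
    q_max = 2 ** n_bits - 1
    w_min = w.min(axis=1, keepdims=True)
    w_max = w.max(axis=1, keepdims=True)
    step = np.maximum(w_max - w_min, 1e-8) / q_max
    q = np.clip(np.round((w - w_min) / step), 0, q_max)
    return q * step + w_min

def awq_style_quantize(w, x, n_bits=3, n_grid=20):
    """Grid-search an exponent alpha for per-input-channel scales s = mean|x|^alpha.

    w: (out_features, in_features) weights; x: (n_samples, in_features) calibration activations.
    alpha = 0 gives s = 1, i.e. plain round-to-nearest, so the result is never worse than that.
    """
    act_mag = np.maximum(np.abs(x).mean(axis=0), 1e-8)  # per input channel
    ref = x @ w.T                                       # full-precision output
    best = (np.inf, None, None)
    for alpha in np.linspace(0.0, 1.0, n_grid, endpoint=False):
        s = act_mag ** alpha
        s = s / np.sqrt(s.max() * s.min())              # keep scales centered around 1
        w_q = quantize_rtn(w * s, n_bits) / s           # scale, quantize, fold scale back
        err = np.mean((x @ w_q.T - ref) ** 2)
        if err < best[0]:
            best = (err, w_q, alpha)
    err, w_q, alpha = best
    return w_q, alpha, err
```

Calling `awq_style_quantize(w, x)` on a weight matrix and a batch of calibration activations returns the dequantized weights, the chosen exponent, and the mean squared output error. In a real model the inverse scale would be fused into the preceding layer rather than divided back into the weights.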
