Hardware-Aware Quantization for LLM Inference Optimization

Research on hardware-aware quantization techniques to optimize the inference performance of large language models.

  • This project is funded by Jeonbuk National University.
  • Period: Apr. 2026 – Feb. 2027.
  • Total Grant: 30,000,000 KRW.

The goal of this project is to develop hardware-aware quantization techniques that optimize the inference performance of large language models (LLMs). By accounting for the characteristics of the target hardware, the proposed methods aim to reduce computational overhead and memory footprint while preserving model accuracy, enabling efficient deployment of LLMs on resource-constrained platforms.
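For readers unfamiliar with quantization, the basic trade-off between memory footprint and accuracy can be sketched as follows. This is a generic illustration of symmetric per-tensor int8 quantization, not the method proposed in this project; the function names and weight values are made up for the example, and no hardware-specific behavior is modeled.

```python
# Illustrative only: generic symmetric per-tensor int8 quantization.
# Function names and weight values are hypothetical.

def quantize_int8(weights):
    """Map floats to integer codes in [-127, 127] using one shared scale."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    codes = [max(-127, min(127, round(w / scale))) for w in weights]
    return codes, scale

def dequantize(codes, scale):
    """Recover approximate float values from the integer codes."""
    return [c * scale for c in codes]

weights = [0.42, -1.27, 0.05, 0.9, -0.33]  # made-up example values
codes, scale = quantize_int8(weights)
restored = dequantize(codes, scale)

# Rounding to the nearest code bounds the error by half a quantization step.
max_error = max(abs(w, ) if False else abs(w - r) for w, r in zip(weights, restored))
print(codes, scale, max_error)
```

Storing each weight as an 8-bit code instead of a 32-bit float cuts weight storage to roughly a quarter, at the cost of a rounding error of at most half the scale per weight. A hardware-aware approach would presumably also choose bit widths and data layouts that the target device executes efficiently, which this sketch does not attempt to show.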