Present

Hardware-Aware Quantization for LLM Inference Optimization
Research on hardware-aware quantization techniques to optimize the inference performance of large language models.

Implementation of Core Operation Kernels Based on Low-Level API for AI Semiconductor Support
Research on implementing core operation kernels using low-level APIs to support AI semiconductors.

Past

Development of open-edge AI SoC hardware and software platform
The compiler is based on MLIR.

Deep Learning Compiler for NPU
The compiler is designed to allow state-of-the-art compiler optimizations and code generation of neural network graphs.

SuggestBot
Development of a context-based smart interaction service platform