The 104th Osaka University AI & Data Seminar (March 27)

Title: The Quest for Super-Efficiency: Workflow and Research Status of Large-Scale Model Quantization
Speaker: Yuma Ichikawa (Principal Researcher, Fujitsu Laboratories Ltd.; Project Researcher, RIKEN AI Research Center)
Abstract: In recent years, as Large Language Models (LLMs) have grown in scale, challenges such as reduced inference speed, increased memory usage, and rising power costs have become apparent. To address these issues, quantization techniques that discretize real-number representations to improve computational efficiency have garnered attention. However, achieving ultra-low-bit quantization in LLMs has been reported to be difficult. We tackled this challenge and succeeded in maintaining an average of 90% performance on standard benchmarks even with 1-bit ultra-low-bit quantization. This presentation systematically overviews the LLM quantization workflow that enabled this achievement and the latest research trends.
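To make the idea of 1-bit quantization mentioned in the abstract concrete, here is a minimal illustrative sketch in Python. It quantizes a weight matrix to the values {-1, +1} with a single scale factor set to the mean absolute weight, a common scheme in the 1-bit LLM literature; this is only a generic example, not necessarily the method the speaker will present.

```python
import numpy as np

def quantize_1bit(w):
    """Quantize a weight matrix to {-1, +1} plus one scale factor.

    Illustrative 1-bit scheme (an assumption, not the speaker's exact
    method): the mean absolute value is kept as a scale so that the
    dequantized matrix roughly matches the original in magnitude.
    """
    scale = float(np.abs(w).mean())
    q = np.where(w >= 0, 1.0, -1.0)
    return scale, q

def dequantize(scale, q):
    """Reconstruct an approximate weight matrix from scale and signs."""
    return scale * q

# Example: quantize a small random weight matrix.
rng = np.random.default_rng(0)
w = rng.normal(size=(4, 4)).astype(np.float32)
scale, q = quantize_1bit(w)
w_hat = dequantize(scale, q)
print(sorted(np.unique(q)))  # every entry is exactly -1.0 or +1.0
```

Each entry of `q` needs only one bit of storage, while `scale` is a single floating-point number per matrix, which is where the memory savings come from.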

Date

March 27, 2026 (Fri.), 18:00–20:00

Venue

Held online

Organizer

Co-organizers: HRAM (The Japan Society for Industrial and Applied Mathematics), D-DRIVE National Network

Participation Fee

Free (advance registration required)

https://www-mmds.sigmath.es.osaka-u.ac.jp/structure/activity/ai_data.php?id=106


Contact

Takashi Suzuki
suzuki@sigmath.es.osaka-u.ac.jp