Anda: Unlocking Efficient LLM Inference with a Variable-Length Grouped Activation Data Format

Summary

Authors: Chao Fang, Man Shi, Robin Geens, Arne Symons, Zhongfeng Wang, Marian Verhelst

Journal title: 2025 IEEE International Symposium on High Performance Computer Architecture (HPCA)

Journal publisher: IEEE

Published year: 2025

DOI identifier: 10.1109/HPCA61900.2025.00110