Large Language Models are Strong Audio-Visual Speech Recognition Learners

Summary

Authors: Umberto Cappellazzo, Minsu Kim, Honglie Chen, Pingchuan Ma, Stavros Petridis, Daniele Falavigna, Alessio Brutti, Maja Pantic

Journal title: ICASSP 2025 - 2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Journal publisher: IEEE

Published year: 2025

DOI identifier: 10.1109/ICASSP49660.2025.10889251