Improving End-to-end Mixed-case ASR with Knowledge Distillation and Integration of Voice Activity Cues. Sashi Novitasari, Takashi Fukuda, et al. INTERSPEECH 2025.
Voice Activity-based Text Segmentation for ASR Text Denormalization. Sashi Novitasari, Takashi Fukuda, et al. INTERSPEECH 2025.
Knowledge Distillation Based Training of Unified Conformer CTC Models for Multi-form ASR. Takashi Fukuda, Gakuto Kurata, et al. ICASSP 2025.
LLM based Text Generation for Improved Low-resource Speech Recognition Models. Tohru Nagano, Gakuto Kurata, et al. ICASSP 2025.
SocialStigmaQA Spanish and Japanese - Towards Multicultural Adaptation of Social Bias Benchmarks. Clara Higuera Cabañes, Ryo Iwaki, et al. NeurIPS 2024.
Robust ASR Error Correction with Conservative Data Filtering. Takuma Udagawa, Masayuki Suzuki, et al. EMNLP 2024.
Multiple Representation Transfer from Large Language Models to End-to-End ASR Systems. Takuma Udagawa, Masayuki Suzuki, et al. ICASSP 2024.
Speech-enriched Memory for Inference-time Adaptation of ASR Models to Word Dictionaries. Ashish Mittal, Sunita Sarawagi, et al. EMNLP 2023.
Improving Generalization of Deep Neural Network Acoustic Models with Length Perturbation and N-best Based Label Smoothing. Xiaodong Cui, George Saon, et al. INTERSPEECH 2022.
Global RNN Transducer Models For Multi-dialect Speech Recognition. Takashi Fukuda, Samuel Thomas, et al. INTERSPEECH 2022.