Towards Truly Multilingual ASR: Generalizing Code-Switching ASR to Unseen Language Pairs

Published in ICML Workshop on Machine Learning for Audio, 2026

This work studies whether code-switching automatic speech recognition (CS-ASR) can move beyond pair-specific systems and generalize to language pairs that were not seen during training.

The paper addresses a scalability challenge in multilingual ASR: as the number of supported languages grows, building separate CS-ASR support for every bilingual pair becomes increasingly impractical. It evaluates whether bilingual CS-ASR capabilities can transfer across language pairs through model merging and domain generalization methods.

Key highlights:

Frames unseen language-pair generalization as a central challenge for truly multilingual CS-ASR.
Studies whether capabilities learned from limited seen language pairs transfer to unseen language pairs.
Evaluates model merging and domain generalization approaches for improving scalability.
Finds that merged bilingual CS-ASR models modestly generalize to unseen pairs, indicating limited but meaningful transfer.

Recommended citation: Gio Paik, Hyunseo Shin, Soungmin Lee. (2026). "Towards Truly Multilingual ASR: Generalizing Code-Switching ASR to Unseen Language Pairs." ICML 2026 Workshop on Machine Learning for Audio. arXiv:2606.05846.
Download Paper

Share on

Bluesky Facebook LinkedIn X (formerly Twitter)

Soungmin (Min) Lee

Share on