Summary and Scope
This proposal introduces Speaker-Controlled and Zero-Shot Language Recognition: a Special Session and community challenge that targets language recognition in the realistic regime where each speaker contributes speech in multiple languages and models are tested on unseen languages. The central scientific question is: How can we develop language recognition systems that disentangle speaker identity from linguistic structure to ensure robust generalization across individuals and languages?
We aim to catalyze discussion and benchmarking around:
Representation Disentanglement: Designing training objectives that decouple speaker-specific acoustic traits from language-discriminative phonetic and phonotactic patterns.
Zero-Shot Generalization: Evaluating the limits of modern multilingual and self-supervised models when encountering languages absent from the training set.
Mitigating Shortcut Learning: Identifying and reducing the reliance on "shortcuts" (such as speaker-specific artifacts or recording conditions) to improve real-world reliability.
Fairness and Trustworthiness: Ensuring that language recognition performance remains robust and independent of speaker identity, aligning with the Odyssey 2026 theme of trustworthy identity and speech technology.
Submission Information
Deadline: March 15th, 2026
Subject Area: 6.04 – TidyLR Challenge: Speaker-Controlled Language Recognition
CMT: https://cmt3.research.microsoft.com/ODYSSEY2026/
Templates: https://odyssey2026.inesc-id.pt/preparation-guidelines-and-templates/
More Information: https://tidylang2026.github.io
Organizers
Aref Farhadipour (University of Zurich)
Jan Marquenie (Otto-von-Guericke University Magdeburg)
Srikanth Madikeri (University of Zurich)
Volker Dellwo (University of Zurich)
Teodora Vukovic (University of Zurich)
Kathy Reid (Australian National University)
Francis M. Tyers (Indiana University)
Ingo Siegert (Otto-von-Guericke University Magdeburg)
Eleanor Chodroff (University of Zurich)
