Odyssey 2026

The Full Schedule


Workshop Schedule

Tuesday 23rd

  • 08:00 – 09:00 Registration
  • 09:00 – 09:30 Opening
  • 09:30 – 10:30 Keynote 1: Rigorous Forensic Automatic Speaker Recognition: Bayesian Decision Theory, Probabilistic Calibration and Case-Specific Validation – Daniel Ramos
  • 10:30 – 11:00 Coffee Break
  • 11:00 – 12:20 Oral Presentations 1.1 – Deepfake and Spoofing Detection
    Session Chair: TBD
    • 11:00 An Intervention-Based Framework for Shortcut Diagnosis in Spoofing Countermeasures Santiago Rubio (Universidad de Zaragoza)*; Pilar Bello (BTS, Business Telecommunications Services); Dayana Ribas (BTS, Business Telecommunications Services); Antonio Miguel (Universidad de Zaragoza); Eduardo Lleida (Universidad de Zaragoza); Alfonso Ortega (Universidad de Zaragoza)
    • 11:20 Domain Adaptation for Deepfake Audio Detection under Degraded Channel Conditions Ayuto Tsutsumi (Tokyo Metropolitan University)*; Akira Gotoh (NEC Corporation); Yuko Saito (NEC Corporation); Hiroki Matsuura (NEC Corporation); Sayaka Shiota (Tokyo Metropolitan University)
    • 11:40 Speaker-Invariant Representation Learning for Spoofing Detection via Gradient Reversal and Information Bottleneck Anh-Tuan DAO (LIA)*; Driss Matrouf (LIA); Mickael Rouvier (LIA); Nicholas Evans (EURECOM)
    • 12:00 Large-Kernel 1D CNN for Raw Waveform Spoofing Countermeasures Guy Perets (Ben Gurion University of the Negev)*; Yehuda Ben-Shimol (Ben Gurion University of the Negev); Itshak Lapidot (Afeka, Tel-Aviv College of Engineering)
  • 12:20 – 13:45 Lunch
  • 13:45 – 15:30 Early-Stage Researcher Symposium (ESRS) – Poster Session
    Session Chair: TBD
    • Why Do You Say It Like That? A Phoneme-Level Framework for Explainable Speech Deepfake Detection Anna Taylor (EURECOM)*; Michele Panariello (EURECOM); Massimiliano Todisco (EURECOM); Chiara Galdi (EURECOM); Nicholas Evans (EURECOM)
    • Interpreting SSL Representations for Spoof Detection: a WavLM Study Mallat Mohamed (Eurecom)*; Michele Panariello (EURECOM); Massimiliano TODISCO (Eurecom); Nicholas EVANS (Eurecom); Anthony LARCHER (LIUM)
    • Identity Disambiguation in Common Voice: Enabling Fairness Evaluation Across Demographic Subgroups Chenyi Lin (Aalto University)*; Dāvis Šterns (Aalto University); Tom Bäckström (Aalto University); Nicholas Evans (EURECOM)
    • Machine-Learning Benchmarking of Voice-Based Biomarkers for Parkinson’s Disease Xiaowen Luo (Maastricht University)*; Ryszard Auksztulewicz (Maastricht University); Sonja Kotz (Maastricht University)
    • The Role of Voice Source and Filter in Speech Emotion Recognition Yuhan Huang (University College London)*; Josef Schlittenlacher (University College London); Chris Carignan (University College London)
    • Transparent Exchange of Speaker Attributes Jiusi Zheng (Radboud University)*; Martha Larson (Radboud University); Tom Bäckström (Aalto University)
    • Limitations of WER for Intelligibility Evaluation in Speech Anonymization Victor Ménestrel (Technische Universität Berlin)*
    • Controllable Voice Anonymization for Privacy-Preserving Disease Detection from Speech Ben Luks (INESC-ID/Instituto Superior Técnico, University of Lisbon, Technical University of Berlin)*; Francisco Teixeira (INESC-ID); Alberto Abad (INESC-ID/Instituto Superior Técnico, University of Lisbon); Isabel Trancoso (INESC-ID)
    • Challenges in Multi-Speaker Privacy Anastasiia Korenevskaia (Radboud University)*
    • Studying Voice Privacy Risks with Side Information through Partially Synthetic Data Eulalie Thiombiano (Radboud University)*; Martha Larson (Radboud University); Vincent Colotte (Université de Lorraine); Emmanuel Vincent (Université de Lorraine)
    • Challenges in Protection against Deepfakes in Speech Priyanshi Pal (Aalto University)*; Lauri Juvela (Aalto University); Isabel Trancoso (INESC-ID, IST); Alberto Abad (INESC-ID, IST)
    • Linkage-Based Adversarial Framework for Voice Privacy Evaluation Dāvis Šterns (Aalto University)*; Tom Bäckström (Aalto University); Catuscia Palamidessi (INRIA); Natasha Fernandes (Macquarie University); Konstantinos Drosos (Nokia)
    • Why Voice Privacy Researchers Should Worry About Attribute Inference? Mehtab Rahman (Radboud University)*; Eulalie Thiombiano (Radboud University); Martha Larson (Radboud University)
    • How Bilingual Are SSL Speech Models? Cross-Lingual Probing of Articulatory Encoding with Finnish and Russian EMA Ailín Pollio (University of Eastern Finland)*; Tomi Kinnunen (University of Eastern Finland); Alexandre Nikolaev (University of Eastern Finland); Ruchi Pandey (University of Eastern Finland)
  • 15:30 – 15:50 Coffee Break
  • 15:50 – 16:50 Oral Presentations 1.2 – Privacy-Aware Speech Processing and Watermarking
    Session Chair: TBD
    • 15:50 Analysis of embedding-based emotional preservation metrics for voice conversion models Théo Nguyen (Aalto University)*; Tom Bäckström (Aalto University); Rainer Martin (Ruhr-Universität Bochum)
    • 16:10 Latent Secret Spin: Keyed Orthogonal Rotations for Blind Speech Watermarking in Anisotropic Latent Spaces Emma Coletta (EURECOM)*; Massimiliano Todisco (EURECOM); Michele Panariello (EURECOM); Antonio Faonio (EURECOM); Nicholas Evans (EURECOM)
    • 16:30 Sensitive Speaker Attribute Leakage in Speech–LLM Pipelines Siavosh Sepanta (Fondazione Bruno Kessler)*; Alessio Brutti (Fondazione Bruno Kessler)
  • 16:50 – 17:50 Oral Presentations 1.3 – Tools and Methods for Speaker Verification
    Session Chair: TBD
    • 16:50 Kiwano: A Cutting-Edge Open-Source Toolkit for Speaker Verification Mickael Rouvier (LIA – Avignon University)*; Pierre Michel Bousquet (LIA – Avignon University)
    • 17:10 Beyond CosFace: Analysing Sparsity-Inducing Losses in Speaker Verification Ladislav Mosner (Brno University of Technology)*; Dimitrios Koutsianos (Athens University of Economics and Business); Themos Stafylakis (Omilia)
    • 17:30 FM-SEE: Flow Matching-based Generative Model For Speaker Embedding Enhancement Sergey Novoselov (ITMO University)*; Vladimir Volokhov (STC-Innovations, ITMO University); Nikita Khmelev (STC-Innovations, ITMO University); Anikin Alexandr (STC-Innovations, ITMO University); Anastasia Zorkina (STC-Innovations, ITMO University); Anastasia Korenevskaya (ITMO University)
  • 18:30 – 20:00 Welcome Reception

Wednesday 24th

  • 08:30 – 09:30 Oral Presentations 2.1 – Diarization
    Session Chair: TBD
    • 08:30 Scaling self-supervised pretraining for speaker diarization Antoine Laurent (pyannoteAI)*; Joonas Kalda (pyannoteAI); Hervé Bredin (pyannoteAI)
    • 08:50 Adapting Speaker Diarization to Code-Switched Medical Conversations: AUDIAS-UAM at the DISPLACE-M Challenge Sara Barahona (AUDIAS Research Group, Universidad Autónoma de Madrid)*; Laura Herrera-Alarcón (AUDIAS Research Group, Universidad Autónoma de Madrid); Juan-Ignacio Alvarez-Trejos (AUDIAS Research Group, Universidad Autónoma de Madrid); Alicia Lozano-Diez (AUDIAS Research Group, Universidad Autónoma de Madrid)
    • 09:10 Augmented State Space Speaker Clustering: Reformulating HMM Based Clustering To Improve Speaker Diarization Anurag Chowdhury (Solventum)*; Abhinav Misra (Solventum); Yinong Wang (Solventum); Bongjun Kim (Solventum); Mark Fuhs (Solventum); Monika Woszczyna (Solventum)
  • 09:30 – 10:30 Keynote 2: Genetic information in human voice: how much do we know today and how much more will technology uncover? – Rita Singh
  • 10:30 – 11:00 Coffee Break
  • 11:00 – 12:20 Oral Presentations 2.2 – Speech Privacy and Anonymization
    Session Chair: TBD
    • 11:00 Quantized Approximate Signal Processing (QASP): Towards Homomorphic Encryption for Audio Tu Duyen Nguyen (Callyope); Adrien Lesage (Callyope); Clotilde Cantini (Callyope); Rachid Riad (Callyope)*
    • 11:20 Privacy in Spoken Interaction: An Overview of Inferable Attributes Eline Bijmold (Radboud University); Anastasiia Korenevskaia (Radboud University)*; Martha Larson (Radboud University)
    • 11:40 Joint Timbral and Non-Timbral Speaker Anonymisation Rayane Bakari (Orange)*; Olivier Le Blouch (Orange); Nicolas Gengembre (Orange); Nicholas Evans (Eurecom)
    • 12:00 Evaluating voice anonymisation using similarity rank disclosure Shilpa Chandra (EURECOM)*; Matteo Petteno (EURECOM); Michele Panariello (EURECOM); Nicholas Evans (EURECOM); Massimiliano Todisco (EURECOM); Tom Bäckström (Aalto University); Dorothea Kolossa (Technische Universität Berlin); Rainer Martin (Ruhr-Universität Bochum); Themos Stafylakis (Omilia); Nicolas Gengembre (Orange)
  • 12:20 – 13:45 Lunch
  • 13:45 – 15:30 Special Sessions – Oral Overviews (10 min. each) and parallel Poster Session
    • 13:45 [SS1] Special Session on Speech and Language Technologies in Healthcare – Oral Overview J.A. Gonzalez-Lopez (Univ. of Granada)
    • 13:55 [SS2] Special Session on Model Fairness Meets Source Tracing: Toward Trustworthy AI for Manipulated Speech Attribution – Oral Overview Nicolas Müller (Fraunhofer AISEC / Resemble AI)
    • 14:05 [SS3] Special Session on NIST SRE24 Deeper Analysis – Oral Overview Craig Greenberg (NIST)
    • 14:15 [SS4] Special Session on TidyLang Challenge: Speaker-Controlled Language Recognition – Oral Overview Aref Farhadipour (University of Zurich)
    • [SS1] MEGConformer: Conformer-Based MEG Decoder for Robust Speech and Phoneme Classification Xabier de Zuazo (HiTZ Center, University of the Basque Country – UPV/EHU)*; Ibon Saratxaga (HiTZ Center, University of the Basque Country – UPV/EHU); Eva Navas (HiTZ Center, University of the Basque Country – UPV/EHU)
    • [SS1] Adaptive Phone-Wise Weighted Loss for Silent Speech Restoration in Continuous Spanish Eder del Blanco Sierra (University of the Basque Country (UPV/EHU))*; David Gimeno-Gómez (Universitat Politècnica de València); Ibon Saratxaga (University of the Basque Country (UPV/EHU)); Eva Navas (University of the Basque Country (UPV/EHU)); Inma Hernáez (University of the Basque Country (UPV/EHU))
    • [SS1] Comparator Loss: An Ordinal Contrastive Loss to Derive a Severity Score for Speech-based Health Monitoring Jacob Webber (SpeakUnique); Oliver Watts (SpeakUnique); Lovisa Wihlborg (SpeakUnique); Johnny Tam (Anne Rowling Regenerative Neurology Clinic, University of Edinburgh); Christine Weaver (Anne Rowling Regenerative Neurology Clinic, University of Edinburgh); Suvankar Pal (Anne Rowling Regenerative Neurology Clinic, University of Edinburgh); Siddharthan Chandran (Anne Rowling Regenerative Neurology Clinic, University of Edinburgh); Cassia Valentini (SpeakUnique)*
    • [SS1] Rapid Calibration for Cross-Subject Imagined Speech Decoding Toward Restoring Communication Sanae Belfrouh (National School of Applied Sciences, University of Chouaib Doukkali)*; Rahhal Errattahi (National School of Applied Sciences, University of Chouaib Doukkali); Fatima zahra Salmam (National School of Applied Sciences, University of Chouaib Doukkali)
    • [SS1] Deep learning based analysis of spontaneous speech for diagnostic classification and biomarker prediction in Alzheimer’s disease and primary progressive aphasia Roger Esteve (Universitat Politecnica de Catalunya)*; Pilar Armas (Universitat Politècnica de Catalunya and Sant Pau Memory Unit, IR SANT PAU, Hospital de la Santa Creu i Sant Pau); Marc Casals-Salvador (Barcelona Supercomputing Center and Universitat Politècnica de Catalunya); Miguel A Santos-Santos (Sant Pau Memory Unit, IR SANT PAU, Hospital de la Santa Creu i Sant Pau); Alexandre Bejanin (Sant Pau Memory Unit, IR SANT PAU, Hospital de la Santa Creu i Sant Pau); Javier Hernando (Barcelona Supercomputing Center and Universitat Politècnica de Catalunya)
    • [SS1] Vocal markers of Turner syndrome: a preliminary analysis of sustained vowel recordings Marc Freixes (HER Human Environment Research Group, La Salle – URL)*; Jordi Sanz (HER Human Environment Research Group, La Salle – URL); Joan Claudi Socoró (HER Human Environment Research Group, La Salle – URL); Jordi Margalef (HER Human Environment Research Group, La Salle – URL); Isabella Monlleó (Universidade Federal de Alagoas); Debora Michelatto (Universidade Federal de Alagoas); Francesc Alías-Pujol (HER Human Environment Research Group, La Salle – URL); Neus Martínez-Abadías (Universitat de Barcelona); Xavier Sevillano (HER Human Environment Research Group, La Salle – URL)
    • [SS2] The Effect of Telephony Transmission on Source Tracing of Audio Deepfakes Nicholas Klein (Pindrop Security)*; Hemlata Tak (Pindrop Security); Nikolay Gaubitch (Pindrop Security); David Looney (Pindrop Security); Tianxiang Chen (Pindrop Security); Elie Khoury (Pindrop Security)
    • [SS2] Advancing Zero-Shot Open-Set Speech Deepfake Source Tracing Manasi Chhibber (University of Eastern Finland); Jagabandhu Mishra (University of Eastern Finland)*; Tomi Kinnunen (University of Eastern Finland)
    • [SS3] I4U’s Official and Streamlined Audio Systems for NIST SRE24 Daniele Colibro (Microsoft)*; Claudio Vair (Microsoft); Youzhi Tu (The Hong Kong Polytechnic University); Junjie Li (The Hong Kong Polytechnic University); Zilong Huang (The Hong Kong Polytechnic University); Yijia Chen (The Hong Kong Polytechnic University); Kong Aik Lee (The Hong Kong Polytechnic University); Man-Wai Mak (The Hong Kong Polytechnic University); Jagabandhu Mishra (School of Computing, University of Eastern Finland); Vishwanath Singh (School of Computing, University of Eastern Finland); Xi Xuan (School of Computing, University of Eastern Finland); Manasi Chhibber (School of Computing, University of Eastern Finland); Oguzhan Kurnaz (School of Computing, University of Eastern Finland); Tomi Kinnunen (School of Computing, University of Eastern Finland); Suyeon Lee (Korea Advanced Institute of Science and Technology); Chaeyoung Jung (Korea Advanced Institute of Science and Technology); Kihyun Nam (Korea Advanced Institute of Science and Technology); Joon Son Chung (Korea Advanced Institute of Science and Technology); Shuai Wang (Nanjing University)
    • [SS3] Analysis of the NIST 2024 Speaker Recognition Evaluation Elliot Singer (MIT Lincoln Lab)*; Craig Greenberg (NIST); Lukas Diduch (NIST); Trang Nguyen (MIT Lincoln Lab); Lisa Mason (US Government); Beth Matys (US Government); Bob Dunn (MIT Lincoln Lab); Audrey Tong (NIST)
    • [SS4] Spoken Language Identification with Pre-trained Models and Margin Loss Zhihua Fang (Xinjiang University)*; Liang He (Tsinghua University); Weiwu Jiang (AgiBot)
    • [SS4] Disentangled Speech Encoder: A Robust Encoder with Dynamic Adapter for Language Identification Barathi Ganesh HB (Kitami Institute of Technology); Jairam R (Amrita Vishwa Vidyapeetham)*; Ptaszynski Michal (Kitami Institute of Technology); Reshma Unnikrishnan (Resilience Business Grids); Jyothish Lal G (Amrita Vishwa Vidyapeetham); Premjith B (Amrita Vishwa Vidyapeetham)
    • [SS4] LLM-Based Language Verification and Multimodal Ensemble for Spoken Language Recognition Aivo Olev (Tallinn University of Technology)*; Tanel Alumäe (Tallinn University of Technology)
    • [SS4] Speaker-Aware Language Verification Based on Attentive Pooling, Mixture of Experts and Neural PLDA Mikel Penagarikano (University of the Basque Country); Luis Javier Rodriguez-Fuentes (University of the Basque Country)*; Amparo Varona (University of the Basque Country); Germán Bordel (University of the Basque Country)
  • 15:30 – 15:50 Coffee Break
  • 15:50 – 16:50 Oral Presentations 2.3 – Optimization and Efficiency in Speaker Recognition
    Session Chair: TBD
    • 15:50 Harmonizing data augmentation and loss function for speaker recognition: examples with speed perturbation, mixup and mixout Pierre-Michel Bousquet (Avignon University)*; Mickaël Rouvier (Avignon University)
    • 16:10 Assessing the Energy and Carbon Emissions of Neural Speaker Verification Model in Training and Inference Hugo Leguillier (LIA – Avignon University); Driss Matrouf (LIA – Avignon University); Guillaume Lechien (Aday); Mickael Rouvier (LIA – Avignon University)*
    • 16:30 On Low-Bit Quantization Errors in Speaker Verification: Diagnostic and Mitigation Hugo LEGUILLIER (LIA (Laboratoire informatique d’Avignon))*; Driss Matrouf (LIA (Laboratoire informatique d’Avignon)); Guillaume LECHIEN (ADAY); Mickael ROUVIER (LIA (Laboratoire informatique d’Avignon))
  • 16:50 – 17:50 Oral Presentations 2.4 – Biomarkers
    Session Chair: TBD
    • 16:50 SLAP: Learning Speaker and Health-Related Representations from Natural Language Supervision Angelika Andò (Callyope); Auguste Crabeil (Callyope); Quentin Spinat (Callyope.com); Adrien Lesage (Callyope); Rachid Riad (Callyope)*
    • 17:10 Speech Quality Embeddings for Improved Detection and Classification of Degradations in Speech Signals Michael Kuhlmann (Paderborn University)*; Tobias Cord-Landwehr (Paderborn University); Reinhold Haeb-Umbach (Paderborn University)
    • 17:30 Dysarthria Severity Classification on the HeyJay! Dataset: A Parameter-Efficient Approach Using Self-Supervised Speech Representations Davide Lillini (Department of Information Engineering, Università Politecnica delle Marche)*; Thomas Thebaud (Department of Electrical and Computer Engineering, Johns Hopkins University); Lucia Migliorelli (Department of Political Science, Università degli Studi di Teramo); Najim Dehak (Department of Electrical and Computer Engineering, Johns Hopkins University); Stefano Squartini (Department of Information Engineering, Università Politecnica delle Marche); Laureano Moro Velazquez (Department of Electrical and Computer Engineering, Johns Hopkins University)
  • 19:30 – 23:30 Gala Dinner

Thursday 25th

  • 08:30 – 09:30 Oral Presentations 3.1 – Spoofing
    Session Chair: TBD
    • 08:30 A comparison of SSL-Based Feature Extractors and Back-End Classifiers for Spoofing Detection: A Multi-Corpus Training and Cross-Linguistic Analysis Anh-Tuan DAO (LIA)*; Driss Matrouf (LIA); Mickael Rouvier (LIA); Nicholas Evans (Eurecom)
    • 08:50 From Self-Supervised Speech Models to Mixture-of-Experts for Robust Anti-Spoofing Hugo Daumain (LIA – Avignon University, Airbus Defence & Space)*; Driss Matrouf (LIA – Avignon University); Khaled Khelif (Airbus Defence & Space); Mickael Rouvier (LIA – Avignon University)
    • 09:10 Can SSL Frontend Generalize to All-Type Audio Spoofing? Arnab Das (Deutsches Forschungszentrum für Künstliche Intelligenz)*; Yassine El Kheir (Deutsches Forschungszentrum für Künstliche Intelligenz); Fabian Ritter Guttierez (Nanyang Technological University); Tim Polzehl (Deutsches Forschungszentrum für Künstliche Intelligenz); Sebastian Möller (TU Berlin)
  • 09:30 – 10:30 Keynote 3: Every breath you take: From Vocal Chords to Health Scores – Björn Schuller
  • 10:30 – 11:00 Coffee Break
  • 11:00 – 12:20 Oral Presentations 3.2 – Backend and Generalization in Speaker Verification
    Session Chair: TBD
    • 11:00 Spherical-Gaussian TPSDA: combining PLDA, T-PSDA and duration models for speaker verification Sandro Cumani (Politecnico di Torino)*
    • 11:20 Condition-Aware System Fusion for Speaker Verification Jonas Borgstrom (MIT Lincoln Laboratory)*
    • 11:40 Towards Language-Agnostic Speaker Verification: A Cross-Lingual Transfer Study of Architectures Pol Buitrago (Universitat Politècnica de Catalunya – Barcelona Supercomputing Center)*; Javier Hernando (Universitat Politècnica de Catalunya – Barcelona Supercomputing Center)
    • 12:00 Subtract to Clean, Add to Enrich: Dual-Path Disentanglement for Speaker and Language Recognition Aref Farhadipour (University of Zurich)*
  • 12:20 – 13:45 Lunch
  • 14:00 – 19:30 Tour to Cascais and Sintra

Friday 26th

  • 08:30 – 09:30 Oral Presentations 4.1 – Representation Learning in Speaker and Language
    Session Chair: TBD
    • 08:30 Functionally-grounded evaluation of dimensional interpretability in sparse speaker representations Félix Saget (LIUM)*; Nicolas Dugué (LIUM); Marie Tahon (LIUM); Anthony Larcher (LIUM)
    • 08:50 Multi-Axis Speech Similarity via Factor-Partitioned Embeddings Jim O’Regan (KTH Royal Institute of Technology)*; Jens Edlund (KTH Royal Institute of Technology)
    • 09:10 Flow-Enhanced Language Embeddings for Robust Language Recognition Tianyu Cao (Johns Hopkins University)*; Laureano Moro-Velazquez (Johns Hopkins University); Jesús Villalba (Johns Hopkins University); Thomas Thebaud (Johns Hopkins University); Najim Dehak (Johns Hopkins University)
  • 09:30 – 10:30 Keynote 4: From Single-Channel Foundations to Multi-Speaker and Multi-Modal Understanding – Lukáš Burget
  • 10:30 – 11:00 Coffee Break
  • 11:00 – 12:20 Oral Presentations 4.2 – Spoofing Detection and Robust ASV
    Session Chair: TBD
    • 11:00 Sparse deepfake detection promotes better disentanglement Marie Tahon (LIUM)*; Antoine Tessier (LIUM); Nicolas Dugué (LIUM); Aghilas Sini (LIUM)
    • 11:20 I Hear, Therefore I Trust: A Socio-Technical Investigation of Humans as Synthetic Speech Detectors Lelia Erscoi (University of Eastern Finland)*; Tomi Kinnunen (University of Eastern Finland)
    • 11:40 PLDA Scoring for Spoofing-Robust Automatic Speaker Verification Shani Budilovsky (Ben Gurion University of the Negev)*; Yehuda Ben-Shimol (Ben Gurion University of the Negev); Itshak Lapidot (Afeka the Academic College of Engineering in Tel Aviv)
    • 12:00 J-SPAW2: A Japanese Corpus for Speaker Verification and Anti-Spoofing with Challenging Replay and Speech Synthesis Attacks Sayaka Shiota (Tokyo Metropolitan University)*; Suzuka Horie (Tokyo Metropolitan University); Sawato Furubayashi (Tokyo Metropolitan University); Shinnosuke Takamichi (Tokyo Metropolitan University)
  • 12:20 – 13:00 Closing Ceremony