Apple is sponsoring the International Conference on Acoustics, Speech and Signal Processing (ICASSP), which will take place in person from June 4 to 10 in Rhodes Island, Greece. ICASSP is the IEEE Signal Processing Society's flagship conference on signal processing and its applications. Below is the schedule of Apple-sponsored workshops and events at ICASSP 2023.
Schedule
Tuesday, June 6
- I See What You Hear: A Vision-inspired Method to Localize Words
- 10:50 AM – 12:20 PM LT in Salon des Roses A
- Mohammad Samragh, Arnav Kundu, Ting-Yao Hu, Aman Chadha, Ashish Srivastava, Minsik Cho, Oncel Tuzel, Devang Naik
- Variable Attention Masking for Configurable Transformer Transducer Speech Recognition
- 10:50 AM – 12:20 PM LT in Poster Area 4 – Garden
- Pawel Swietojanski, Stefan Braun, Dogan Can, Thiago Fraga da Silva, Arnab Ghoshal, Takaaki Hori, Roger Hsiao, Henry Mason, Erik McDermott, Honza Silovsky, Ruchir Travadi, Xiaodan Zhuang
- Text is All You Need: Personalizing ASR Models using Controllable Speech Synthesis
- 2:00 – 3:30 PM LT in Poster Area 2 – Garden
- Karren Yang, Ting-Yao Hu, Jen-Hao Rick Chang, Hema Swetha Koppula, Oncel Tuzel
- Neural Transducer Training: Reduced Memory Consumption with Sample-wise Computation
- 2:00 – 3:30 PM LT in Poster Area 3 – Garden
- Stefan Braun, Erik McDermott, Roger Hsiao
- More Speaking or More Speakers?
- 2:00 – 3:30 PM LT in Poster Area 3 – Garden
- Dan Berrebbi, Ronan Collobert, Navdeep Jaitly, Tatiana Likhomanenko
- Audio-to-Intent Using Acoustic-Textual Subword Representations from End-to-End ASR
- 2:00 – 3:30 PM LT in Poster Area 4 – Garden
- Pranay Dighe, Prateeth Nayak, Oggi Rudovic, Erik Marchi, Xiaochuan Niu, Ahmed Tewfik
Wednesday, June 7
- HEiMDaL: Highly Efficient Method for Detection and Localization of wake-words
- 8:15 – 9:45 AM LT in Poster Area 8 – Dome
- Arnav Kundu, Mohammad Samragh Razlighi, Minsik Cho, Priyanka Padmanabhan, Devang Naik
- Women in Signal Processing
- 12:20 – 2:20 PM LT at the Ambrosia Restaurant
Thursday, June 8
- Naturalistic Head Motion Generation From Speech
- 10:50 AM – 12:20 PM LT in Salon des Roses A
- Trisha Mittal, Zakaria Aldeneh, Masha Fedzechkina, Anurag Ranjan, Barry-John Theobald
- Student Job Fair and Luncheon
- 12:00 – 3:00 PM LT at the Ambrosia Restaurant
- Pre-trained Model Representations and their Robustness against Noise for Speech Emotion Analysis
- 2:00 – 3:30 PM LT in Poster Area 4 – Garden
- Vikramjit Mitra, Vasudha Kowtha, Hsiang-Yun Sherry Chien, Erdrin Azemi, Carlos Avendano
- On the Role of Lip Articulation in Visual Speech Perception
- 2:00 – 3:30 PM LT in Poster Area 10 – Dome
- Zakaria Aldeneh, Masha Fedzechkina, Skyler Seto, Katherine Metcalf, Miguel Sarabia, Nicholas Apostoloff, Barry-John Theobald
- Learning to Detect Novel and Fine-Grained Acoustic Sequences Using Pretrained Audio Representations
- 3:35 – 5:05 PM LT in Poster Area 2 – Garden
- Vasudha Kowtha, Miquel Espi, Jonathan J Huang, Yichi Zhang, Carlos Avendano
Friday, June 9
- Improvements to Embedding-Matching Acoustic-to-Word ASR Using Multiple-Hypothesis Pronunciation-Based Embeddings
- 8:15 – 9:45 AM LT in Poster Area 4 – Garden
- Hao Yen, Woojay Jeon
Accepted Papers
Audio-to-Intent Using Acoustic-Textual Subword Representations from End-to-End ASR
Pranay Dighe, Prateeth Nayak, Oggi Rudovic, Erik Marchi, Xiaochuan Niu, Ahmed Tewfik
HEiMDaL: Highly Efficient Method for Detection and Localization of wake-words
Arnav Kundu, Mohammad Samragh Razlighi, Minsik Cho, Priyanka Padmanabhan, Devang Naik
I See What You Hear: A Vision-inspired Method to Localize Words
Mohammad Samragh, Arnav Kundu, Ting-Yao Hu, Aman Chadha, Ashish Srivastava, Minsik Cho, Oncel Tuzel, Devang Naik
Improvements to Embedding-Matching Acoustic-to-Word ASR Using Multiple-Hypothesis Pronunciation-Based Embeddings
Hao Yen, Woojay Jeon
Learning to Detect Novel and Fine-Grained Acoustic Sequences Using Pretrained Audio Representations
Vasudha Kowtha, Miquel Espi, Jonathan J Huang, Yichi Zhang, Carlos Avendano
More Speaking or More Speakers?
Dan Berrebbi, Ronan Collobert, Navdeep Jaitly, Tatiana Likhomanenko
Naturalistic Head Motion Generation From Speech
Trisha Mittal, Zakaria Aldeneh, Masha Fedzechkina, Anurag Ranjan, Barry-John Theobald
Neural Transducer Training: Reduced Memory Consumption with Sample-wise Computation
Stefan Braun, Erik McDermott, Roger Hsiao
On the Role of Lip Articulation in Visual Speech Perception
Zakaria Aldeneh, Masha Fedzechkina, Skyler Seto, Katherine Metcalf, Miguel Sarabia, Nicholas Apostoloff, Barry-John Theobald
Pre-trained Model Representations and their Robustness against Noise for Speech Emotion Analysis
Vikramjit Mitra, Vasudha Kowtha, Hsiang-Yun Sherry Chien, Erdrin Azemi, Carlos Avendano
Text is All You Need: Personalizing ASR Models using Controllable Speech Synthesis
Karren Yang, Ting-Yao Hu, Jen-Hao Rick Chang, Hema Swetha Koppula, Oncel Tuzel
Variable Attention Masking for Configurable Transformer Transducer Speech Recognition
Pawel Swietojanski, Stefan Braun, Dogan Can, Thiago Fraga da Silva, Arnab Ghoshal, Takaaki Hori, Roger Hsiao, Henry Mason, Erik McDermott, Honza Silovsky, Ruchir Travadi, Xiaodan Zhuang
Demo
Contextual Understanding in Siri
This is a demonstration of the contextual understanding technology shipped in Siri. Users can refer to a previously mentioned entity using anaphora or nominal ellipsis, refer to an entity on screen, or correct a previous error by Siri or the user. Contextual understanding in Siri leverages several backend ML solutions, such as query rewriting and reference resolution. This work is a step toward more natural conversations with Siri and shipped in iOS 16.
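To make the idea of query rewriting concrete, below is a minimal, purely illustrative sketch: a follow-up query containing a pronoun is rewritten into a self-contained query by substituting the most recently mentioned entity. The `DialogContext` class, `rewrite_query` function, and rule-based substitution are hypothetical stand-ins for exposition only; Siri's production system relies on ML models for query rewriting and reference resolution, not hand-written rules like these.

```python
# Illustrative, rule-based sketch of query rewriting for anaphora.
# All names here (DialogContext, rewrite_query) are hypothetical; the
# production system uses ML models rather than rules like these.

from dataclasses import dataclass
from typing import Optional

# Pronouns that may refer back to an earlier entity (anaphora).
PRONOUNS = {"it", "that", "there", "them"}


@dataclass
class DialogContext:
    """Tracks the most recently mentioned entity in the conversation."""
    last_entity: Optional[str] = None


def rewrite_query(query: str, context: DialogContext) -> str:
    """Rewrite a context-dependent follow-up query into a self-contained
    one by substituting the last mentioned entity for a pronoun."""
    if context.last_entity is None:
        return query
    out = []
    for token in query.split():
        core = token.rstrip("?.,!").lower()
        if core in PRONOUNS:
            # Keep any trailing punctuation when substituting the entity.
            out.append(context.last_entity + token[len(core):])
        else:
            out.append(token)
    return " ".join(out)


# After "How tall is the Eiffel Tower?", the follow-up "How old is it?"
# is rewritten so a downstream system can answer it without dialog state.
ctx = DialogContext(last_entity="the Eiffel Tower")
print(rewrite_query("How old is it?", ctx))  # How old is the Eiffel Tower?
```

The rewritten query can then be handled by the same downstream pipeline as any standalone request, which is one reason query rewriting is an attractive way to add contextual understanding to an assistant.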
All ICASSP attendees are invited to stop by the Apple booth (booth number 16, located next to the Dome Bar main entrance of the Rodos Palace Luxury Convention Resort) to experience this demo in person.
Acknowledgements
Tatiana Likhomanenko, Arnav Kundu, Stefan Braun, Vikramjit Mitra, and Pawel Swietojanski are reviewers for ICASSP 2023.
Yannis Stylianou is a Seasonal School & Short Course Chair for ICASSP 2023.
Let’s innovate together. Build amazing machine-learned experiences with Apple. Discover opportunities for researchers, students, and developers by visiting our Work with us page.