Audio Imagination Workshop
Generative AI has been at the forefront of AI research in recent times, with numerous studies showcasing remarkable and surprising generation capabilities across modalities such as text, image, and audio. The Audio Imagination Workshop at NeurIPS 2024 aims to bring together the latest advancements in generative AI, with a focus on audio generation. Audio generation presents unique challenges due to the nature of the audio signal, its perception by humans, and its relationship with other modalities such as text and visuals. Modern generative methods have created new opportunities for solving well-studied audio generation problems, such as text-to-speech synthesis, while also opening up exciting new problems. The workshop seeks to bring together researchers working on different audio generation problems and to facilitate focused discussions on the topic. It will feature invited talks, high-quality papers presented through oral and poster sessions, and a demo session showcasing the current state of audio generation methods. The Audio Imagination Workshop will take place on December 14, 2024.
Call For Papers
Update: The deadline has been extended to September 20, 2024, 11:59 pm AoE.
Update: ICASSP 2025 Submissions: Authors of ICASSP 2025 submissions are welcome to also submit their work to the Audio Imagination workshop.
We invite submissions for the Main Paper and Demo Tracks. Please go to the Submission Page for submission instructions and further details.
Feel free to contact the organizers if you have any questions regarding the workshop.
The Audio Imagination Workshop at the Thirty-Eighth Annual Conference on Neural Information Processing Systems (NeurIPS 2024) aims to bring together researchers working on generative AI for audio, speech, and music, including multimodal generative AI with audio as one of the modalities.
We invite researchers to submit papers focusing on, but not limited to, the following topics related to audio generation:
- Generation and editing of audio from textual prompts and natural language inputs, such as text-to-speech (i.e., speech synthesis), text-to-music, and text-to-sound
- Audio/Speech in LLMs/Multimodal LLMs
- Connections between audio generation and text generation, including similarities and differences
- Video to Audio/Speech/Music Generation
- Multimodal generation of audio: going beyond unimodal inputs (text/video/audio) by using multiple modalities to generate audio
- Data for audio/speech/music generative AI
- Methods for Evaluation of Generated Audio
- Generative methods for, and their impact on, established speech tasks such as speech enhancement, source separation, voice conversion, and speech-to-speech translation, to mention a few
- Generation of spatial audio and experiences driven by spatial audio
- Generation of audio for virtual or augmented reality (VR/AR)
- Synchronized generation of audio along with visuals
- Impact of generative audio on media and content creation technologies
- Interpretability in generative AI for audio/speech/music
- Responsibility in generative AI for audio/speech/music
- Novel applications of audio/speech/music generation
We welcome submissions from researchers in academia and industry. The workshop will provide a platform for discussing the latest advances in the field and identifying future research directions.
We invite submissions in two tracks, the Main Paper Track and the Demo Track. The submission process and details are outlined below. Please reach out to the organizers with any questions.
Main paper track
The main paper track is the primary submission track for the Audio Imagination workshop and will facilitate discussions on relevant topics. Accepted papers will be presented as oral talks or in poster sessions. Please note that Audio Imagination is an in-person workshop and papers are expected to be presented in person.
Demo Session
A key component of the Audio Imagination workshop is the demo session, where participants will have a chance to showcase their audio generation methods and technologies. The demo track will enable listening experiences for workshop participants, which are critical for understanding, evaluating, and contextualizing generated audio. The demo session will be conducted alongside the poster sessions.
Please check out the Submissions Page for details on paper formatting and the submission process.
Important Dates
- September 20, 2024 - Main Paper Submission Deadline
- September 20, 2024 - Demo Paper Submission Deadline
- October 9, 2024 - Paper & Demo Acceptance Notification
- December 14, 2024 - Workshop