Interspeech 2023 Special Session: Biosignal-enabled Spoken Communication


  • Siqi Cai (Human Language Technology Laboratory, National University of Singapore, Singapore)
  • Kevin Scheck (Cognitive Systems Lab, University of Bremen, Bremen, Germany)
  • Hiroki Tanaka (Augmented Human Communication Labs, Nara Institute of Science and Technology, Japan)
  • Tanja Schultz (Cognitive Systems Lab, University of Bremen, Bremen, Germany)
  • Haizhou Li (Human Language Technology Laboratory, National University of Singapore, Singapore)

Scope of the Special Session

Biosignals, such as those arising from articulatory or neurological activity, provide information about the human speech process and can thus serve as an alternative modality to the acoustic speech signal. As such, they can be the primary driver for speech-driven human-computer interfaces intended to support humans when acoustic speech is not available or perceivable. For instance, articulatory-related biosignals, such as Electromyography (EMG) or Electromagnetic Articulography (EMA), can be leveraged to synthesize the acoustic speech signal from silent articulation. By the same token, neuro-steered hearing aids process neural activity, reflected in signals such as Electroencephalography (EEG), to detect a listener's selective auditory attention and thereby single out and enhance the attended speech stream. Progress in the field of speech-related biosignal processing will lead to the design of novel biosignal-enabled speech communication devices and speech rehabilitation for everyday situations.

With the special session "Biosignal-enabled Spoken Communication", we aim to bring together researchers working on biosignals and speech processing to exchange ideas on these interdisciplinary topics.

Topics and Session Format

Topics of interest for this special session include, but are not limited to:

  • Processing of biosignals related to spoken communication, such as brain activity captured by, e.g., EEG, Electrocorticography (ECoG), or functional magnetic resonance imaging (fMRI).
  • Processing of biosignals stemming from respiratory, laryngeal, or articulatory activity, represented by, e.g., EMA, EMG, videos, or similar.
  • Application of biosignals for speech processing, e.g., speech recognition, synthesis, enhancement, voice conversion, or auditory attention detection.
  • Utilization of biosignals to increase the explainability or performance of acoustic speech processing methods.
  • Development of novel machine learning algorithms, feature representations, model architectures, as well as training and evaluation strategies for improved performance or to address common challenges.
  • Applications such as speech restoration, training and therapy, speech-related brain-computer interfaces (BCIs), speech communication in noisy environments, or acoustic-free speech communication for preserving privacy.

Paper submissions must conform to the format defined in the Interspeech paper preparation guidelines and detailed in the author's kit, which can be found on the Interspeech website. When submitting the paper in the Interspeech electronic paper submission system, please indicate that the paper should be included in the Special Session "Biosignal-enabled Spoken Communication". All submissions will take part in the normal paper review process.

The session format will be either a poster session or oral presentations, depending on the number of accepted papers. We will therefore inform participants about the format shortly after the acceptance notification (May 17th, 2023).

Important Dates

Submission opened: January 18th, 2023
Paper submission deadline: March 1st, 2023
Paper update deadline: March 8th, 2023
Acceptance notification: May 17th, 2023