Estimating Breathing Pattern and Parameters from Speech Waveform

Alterations in the respiratory system and the speech production system result in changes in speech. The speech signal, which can be acquired non-invasively, could therefore be used to predict breathing patterns. Interest in this direction is growing and has gained further momentum with the COVID-19 pandemic.

Research in the speech processing domain has mainly focused on modeling the voice source and the articulation. However, the speech production system combines several complex physiological systems, such as the muscular, respiratory, cognitive, and autonomic nervous systems. Variations in these systems can affect speech. For example, people with respiratory or heart conditions can experience shortness of breath, which in turn changes how they speak. Parkinson's disease (PD) is a neurodegenerative disease that can disrupt speech by affecting the muscles required for speaking. Even variations in cognitive load due to mental stress can have a noticeable effect on a person's speech.

The respiratory system is one of the major components of the speech production system. Any alteration in breathing can result in changes in speech. Specific breathing characteristics, such as breathing rate and tidal volume, can indicate a person's pathological condition.
More recently, neural network-based methods have started to emerge for predicting the breathing signal from the speech signal. These networks are trained and evaluated with objective measures such as mean squared error (MSE) and Pearson's correlation.
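
The two objective measures named above can be illustrated with a minimal sketch. It assumes that the reference and the estimated breathing signals are available as equal-length NumPy arrays; the function names, the 25 Hz sampling rate, and the synthetic sinusoidal "breathing" signal are hypothetical choices made for illustration and are not taken from this work.

```python
import numpy as np


def mse(reference, estimate):
    """Mean squared error between two equal-length signals."""
    reference = np.asarray(reference, dtype=float)
    estimate = np.asarray(estimate, dtype=float)
    return float(np.mean((reference - estimate) ** 2))


def pearson_correlation(reference, estimate):
    """Pearson's correlation coefficient between two equal-length signals."""
    reference = np.asarray(reference, dtype=float)
    estimate = np.asarray(estimate, dtype=float)
    # Remove the mean so that only the shared variation is compared.
    ref_centered = reference - reference.mean()
    est_centered = estimate - estimate.mean()
    denom = np.sqrt(np.sum(ref_centered ** 2) * np.sum(est_centered ** 2))
    if denom == 0.0:
        # The correlation is undefined when either signal is constant.
        return float("nan")
    return float(np.sum(ref_centered * est_centered) / denom)


# Synthetic example (assumption): 60 s sampled at 25 Hz, 0.25 Hz breathing cycle.
t = np.linspace(0.0, 60.0, 1500, endpoint=False)
reference = np.sin(2.0 * np.pi * 0.25 * t)
# A hypothetical estimate with the right shape but wrong scale and offset.
estimate = 0.8 * reference + 0.1

error = mse(reference, estimate)
correlation = pearson_correlation(reference, estimate)
print(f"MSE = {error:.4f}, Pearson r = {correlation:.4f}")
```

In this toy case the estimate follows the reference perfectly in shape, so the correlation is 1, while the MSE is still non-zero because of the scale and offset mismatch. The two measures therefore capture different aspects of prediction quality.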

In this work, we investigate the estimation of the respiratory signal and breathing parameters through (a) raw waveform modeling and (b) modeling of short-term spectral features, using deep learning techniques. This work is supported by the Swiss National Science Foundation under the project "Towards Integrated processing of Physiological and Speech signals" (TIPS).

 


[Figure: (a) raw waveform-based modeling; (b) short-term spectral-based modeling]