Speech Intelligibility Enhancement using Microphone Array via Intra-Vehicular Beamforming

Problem

According to the National Safety Council, there are approximately 1.6 million crashes each year due to distracted driving involving mobile phones. Drivers often hold their phone while making or taking a call, which causes their eyes to leave the road. In an attempt to discourage the handheld use of mobile phones while driving, hands-free Bluetooth calling connectivity has become the auto-industry standard. This hasn’t entirely solved the problem, however.

Several noise sources reduce the intelligibility of the near-end speech that is transmitted to the far-end listener. Some originate outside the car cabin, such as engine noise, wind noise, conductive vibration, and road noise from tires against pavement. Others originate inside the cabin, including talking passengers, air conditioning, and music. Regardless of where they originate, these noise sources combine to reduce the intelligibility of phone conversations. This causes frustration and often affects the driver's concentration. As a result, many drivers simply pick up the phone and use it as they normally would, which defeats the purpose of the hands-free system.

In audio signal processing applications, beamforming can be applied to selectively emphasize audio signals based on their direction of arrival (DOA) relative to an array of microphones. Acoustic beamforming is a process in which the signals from a microphone array are filtered and combined to increase the amplitude of a target source’s signal at a fixed DOA without increasing the amplitude of signals arriving from other DOAs.
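To make the link between DOA and the array concrete, the following is a minimal sketch of how arrival times could be modeled for evenly spaced microphones on a line. It assumes a far-field plane wave, a speed of sound of 343 m/s, and an angle measured from directly in front of the array; the function name `ula_delays` and the example spacing of 5 cm are illustrative assumptions, not values from this project.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed approximate value for air at room temperature


def ula_delays(num_mics, spacing_m, doa_deg):
    """Relative arrival delay (seconds) of a plane wave at each microphone.

    Assumes a far-field source and evenly spaced microphones on a line.
    doa_deg is measured from directly in front of the array (0 degrees).
    Microphone 0 is the time reference.
    """
    sin_theta = math.sin(math.radians(doa_deg))
    return [m * spacing_m * sin_theta / SPEED_OF_SOUND for m in range(num_mics)]


# A source directly in front reaches every microphone at the same time,
# while an off-axis source reaches each microphone progressively later.
print(ula_delays(4, 0.05, 0.0))
print(ula_delays(4, 0.05, 30.0))
```

These per-microphone time differences are what a beamformer exploits: they depend on the DOA, so signals from different directions combine differently when the channels are added together.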

Solution

This project aims to enhance speech intelligibility using a microphone array via intra-vehicular beamforming: a uniform linear array (ULA) of microphones acquires the audio, and beamforming is used to combat near-end interference. The processed signal, with increased near-end speech intelligibility, is then sent to the far-end user over a hands-free Bluetooth system.

The proposed solution for this project is beamforming. Specifically, we will use a technique called delay-and-sum beamforming. This type of beamforming takes advantage of the fact that the microphones in a uniform linear array detect a signal at different times because of the spacing between them. A signal arriving from directly in front of the array reaches every microphone at the same time, so the microphone signals are most strongly correlated with one another. If the microphone signals are summed and then normalized by the number of microphones in the array, any signal coming from directly in front of the array stays at its original volume. A signal arriving from an angle reaches each microphone at a slightly different time, so the copies are partially out of phase when summed and the signal is attenuated.
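The sum-and-normalize behavior described above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the project's implementation: it models a pure 2 kHz tone as a far-field plane wave, uses a hypothetical 4-microphone array with 5 cm spacing sampled at 16 kHz, assumes a speed of sound of 343 m/s, and applies no steering delays, since the target is taken to be directly in front of the array. The helper names (`simulate_ula`, `sum_and_normalize`, `rms`) are illustrative.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed


def simulate_ula(freq_hz, doa_deg, num_mics=4, spacing_m=0.05, fs=16000, n=1600):
    """Samples of a unit-amplitude tone as seen by each microphone.

    Each microphone sees the same tone, delayed according to its position
    and the direction of arrival (0 degrees = directly in front).
    """
    sin_theta = math.sin(math.radians(doa_deg))
    channels = []
    for m in range(num_mics):
        tau = m * spacing_m * sin_theta / SPEED_OF_SOUND  # arrival delay in seconds
        channels.append(
            [math.sin(2 * math.pi * freq_hz * (t / fs - tau)) for t in range(n)]
        )
    return channels


def sum_and_normalize(channels):
    """Sum the channels sample by sample and divide by the number of microphones."""
    num_mics = len(channels)
    return [sum(ch[t] for ch in channels) / num_mics for t in range(len(channels[0]))]


def rms(samples):
    """Root-mean-square level of a list of samples."""
    return math.sqrt(sum(v * v for v in samples) / len(samples))


front = rms(sum_and_normalize(simulate_ula(2000.0, 0.0)))   # source in front
side = rms(sum_and_normalize(simulate_ula(2000.0, 60.0)))   # source off to the side

print(f"front RMS: {front:.4f}")
print(f"side RMS:  {side:.4f}")
```

In this setup the tone from the front keeps the RMS level of a single unit-amplitude sine (about 0.707), while the off-axis tone comes out much quieter. The amount of attenuation depends on frequency, angle, microphone spacing, and the number of microphones, so the figures from this toy example should not be read as a prediction of in-vehicle performance. Aiming the array at a direction other than straight ahead would require delaying each channel before summing, which is the "delay" part of delay-and-sum.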