Sensor for locating sound sources

From PC5271 wiki

Team Members

Wang Liyue, Li Jiaqi, Su Qiqi, Wang Tongxu

Introduction

The ability to accurately locate sound sources has extensive applications ranging from audio surveillance to enhancement in hearing aids. This project aims to develop a sensor capable of pinpointing sound origins using a microphone array system.

Our approach involves the construction of an array consisting of multiple microphones strategically positioned to capture sound waves from various directions. The core of our sensor's computational framework relies on the cross-correlation function. This algorithm determines the time differences of arrival (TDOAs) of sound waves at different microphones. By computing these TDOAs, we can effectively estimate the direction and distance of the sound source relative to the array, enabling precise localization. Upon assembling the experimental setup, the performance of the sensor was initially validated in a one-dimensional context. Subsequently, the theoretical groundwork and algorithmic coding were extended to more complex two-dimensional and three-dimensional scenarios, enhancing the sensor's applicability across different environments.

Future enhancements will involve expanding the microphone array configuration. In two-dimensional setups, a minimum of three microphones is essential for accurate localization, while three-dimensional configurations require at least four. Beyond merely identifying sound locations, we aim to integrate machine learning to distinguish specific sounds and enable mechanical responses to auditory stimuli. To illustrate the potential of this technology: it could, for instance, discern and locate different individuals by their voice signatures. Integrating this capability with devices like cameras could achieve a sophisticated level of 'voiceprint monitoring/detection', showcasing the broad utility of our approach.

Theory

Acoustic source location identification

1-Dimensional



Let the two microphones M1 and M2 lie on a line at a distance L apart, with the source S between them at a distance x from M1, and let v denote the speed of sound in air (~340 m/s). If the sound arrives at M1 at time t1 and at M2 at time t2, the path-length difference is

v·Δt = r2 − r1 = (L − x) − x = L − 2x, where Δt = t2 − t1,

so the position of the source follows from

x = (L − v·Δt)/2.

However, this has a drawback: when the sound source is not between M1 and M2, the path-length difference equals ±L regardless of where the source is, so only the side on which it lies can be determined, not its position.
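
As a minimal sketch of this relation (in Python rather than the project's MATLAB; the function name and signature are ours, not part of the project code), the source position can be computed directly from a measured time difference:

```python
# Illustrative sketch of x = (L - v*dt)/2; the names here are ours,
# not part of the project's MATLAB code.
V_SOUND = 340.0  # approximate speed of sound in air, m/s

def locate_1d(delta_t, L):
    """Position of a source between M1 and M2, measured from M1.

    delta_t : arrival-time difference t2 - t1 in seconds
    L       : microphone separation in metres
    """
    delta_x = V_SOUND * delta_t  # path-length difference r2 - r1
    return (L - delta_x) / 2.0
```

For example, on a 2 m baseline a sound 0.5 m from M1 reaches M2 1/340 s later, and locate_1d(1/340, 2) returns 0.5.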

2-Dimensional



Place three microphones M1, M2, M3 at known positions (x1, y1), (x2, y2), (x3, y3), and let the source be S = (x, y). The distance from S to each microphone is

ri = sqrt((x − xi)² + (y − yi)²), i = 1, 2, 3.

The measured arrival-time differences Δt21 = t2 − t1, Δt31 = t3 − t1 and Δt32 = t3 − t2 give the distance differences

r2 − r1 = v·Δt21,
r3 − r1 = v·Δt31,
r3 − r2 = v·Δt32.

Each of these equations defines one branch of a hyperbola whose foci are the corresponding pair of microphones; the sign of the distance difference selects the branch, i.e. the side of the array on which the source lies. Since we can measure the time differences and hence calculate the distance differences, the first and second relationships can be solved as joint equations for x and y, which gives the position of the source. Only two of the three differences are independent (Δt32 = Δt31 − Δt21), so the third serves as a consistency check on the measurement.
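
The joint hyperbola equations have no simple closed form, so a numerical method is needed. The following is a minimal Python sketch (our own illustration, not the project's code) that recovers (x, y) by brute-force grid search, minimising the squared residuals of the first two distance-difference equations:

```python
# Hypothetical brute-force solver for the joint hyperbola equations;
# the function names and parameters are our own illustration.
import math

def r(p, m):
    """Distance from point p to microphone m (both 2-D tuples)."""
    return math.hypot(p[0] - m[0], p[1] - m[1])

def locate_2d(mics, d21, d31, span=2.0, step=0.02):
    """Grid-search (x, y) minimising the residuals of
    r2 - r1 = d21 and r3 - r1 = d31 (d's in metres, i.e. v * TDOA)."""
    best, best_err = None, float('inf')
    n = int(2 * span / step)
    for i in range(n + 1):
        for j in range(n + 1):
            p = (-span + i * step, -span + j * step)
            err = ((r(p, mics[1]) - r(p, mics[0]) - d21) ** 2 +
                   (r(p, mics[2]) - r(p, mics[0]) - d31) ** 2)
            if err < best_err:
                best, best_err = p, err
    return best
```

A least-squares or Newton-type solver would be used in practice; the grid search is only meant to make the geometry concrete.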

3-Dimensional


With four microphones Mi at known positions (xi, yi, zi), i = 1, …, 4, and the source at S = (x, y, z), the distances are

ri = sqrt((x − xi)² + (y − yi)² + (z − zi)²), i = 1, 2, 3, 4.

In the same way as before, the arrival-time differences give

r2 − r1 = v·Δt21,
r3 − r1 = v·Δt31,
r4 − r1 = v·Δt41.

Squaring the range expressions and subtracting them pairwise eliminates the common quadratic term x² + y² + z², leaving relations that are linear in x, y and z. Two of these linear relations give the relationship between x and y; combining them with the third yields x and y. By inserting (x, y) into the distance difference formula, we can then find the z value and obtain the voice position S = (x, y, z).
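
A quick way to check the relations above is a forward model: given assumed microphone positions and a known source, compute the distance differences the array would measure. The sketch below (Python, with hypothetical positions, not the project's code) does exactly that:

```python
# Forward model with hypothetical microphone positions: the TDOAs a
# 4-microphone array would measure for a known source S, i.e. the
# right-hand sides v*dt of the relations above.
import math

V_SOUND = 340.0  # approximate speed of sound in air, m/s

def dist(p, q):
    """Euclidean distance between 3-D points p and q."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def tdoas(mics, src):
    """Arrival-time differences t_i - t_1 for microphones 2..n."""
    r1 = dist(src, mics[0])
    return [(dist(src, m) - r1) / V_SOUND for m in mics[1:]]

mics = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1)]  # assumed layout
src = (0.3, 0.4, 0.5)                                # assumed source
dts = tdoas(mics, src)  # three TDOAs, one per relation
```

Feeding these simulated TDOAs back into a solver should reproduce src, which makes such a forward model useful for testing localization code.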

Cross-Correlation Function

The correlation function is a concept in signal analysis that quantifies the degree of correlation between two time series, that is, the degree of correlation between the values of the signals x(t) and y(t) at any two different times t1 and t2. The two signals may both be random, or they may be deterministic.

The correlation function can be used to calculate the arrival-time difference of two sound signals. Assume the signal received by microphone M1 is x(t), the signal received by microphone M2 is y(t) = A·x(t − t0), and the difference in arrival time between the sound waves reaching M1 and M2 is t0. The cross-correlation function of x(t) and y(t) is defined as

Rxy(τ) = ∫ x(t) y(t + τ) dt.

Substituting the expression for y(t) yields

Rxy(τ) = A ∫ x(t) x(t + τ − t0) dt = A·Rxx(τ − t0),

where Rxx is the autocorrelation function of x(t). Since an autocorrelation is maximal at zero argument, Rxy(τ) attains its maximum at τ = t0: the location of the cross-correlation peak gives the time delay directly. Furthermore, by the correlation theorem,

Rxy(τ) = F⁻¹[X*(f)·Y(f)](τ),

where X(f) and Y(f) are the Fourier transforms of x(t) and y(t), and X*(f) is the complex conjugate of X(f). Correlations can therefore be calculated quickly by means of Fourier transforms and conjugate Fourier transforms.
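
The peak-picking idea can be sketched in a few lines of Python (a direct O(N²) sum for illustration only; practical implementations typically use the FFT route described above, as MATLAB's xcorr does for long signals):

```python
# Direct cross-correlation peak search (illustration only; the function
# name and signature are ours).

def xcorr_delay(x, y, fs):
    """Delay of y relative to x, in seconds: the lag m/fs that
    maximises R_xy(m) = sum_n x[n] * y[n + m]."""
    n = len(x)
    best_lag, best_val = 0, float('-inf')
    for m in range(-(n - 1), n):
        s = 0.0
        for i in range(n):
            j = i + m
            if 0 <= j < n:
                s += x[i] * y[j]
        if s > best_val:
            best_val, best_lag = s, m
    return best_lag / fs
```

A copy of a pulse delayed by 30 samples at a 1000 Hz sampling rate, for instance, yields an estimated delay of 0.03 s.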

Experiment Setup

1. LeCroy - WaveSurfer - 64Xs - 600MHz Oscilloscope


2. 3 integrated microphones (Adafruit AGC Electret Microphone Amplifier - MAX9814)


3. Breadboard
4. Cables
5. 4.5V power supply

Complete built device

Measurements

1-Dimensional

1. Experiment Setup: One microphone, placed at the 0.00 cm mark, was set as both the reference and the trigger. The two microphones were connected to an oscilloscope, and the parameters were adjusted to display the waveforms clearly.

2. Microphone Placement: The microphones were aligned at a predetermined distance from each other (e.g., 200.00 cm), and their positions were carefully measured and noted.
3. Sound Emission: At various designated points along the line between the two microphones (e.g., at 0, 50, 100, 150, and 200 cm), different sounds were produced, including clapping and vocal sounds such as "a", "hello", and "yes". The waveforms captured by the microphones were recorded on the oscilloscope.
4. Data Analysis: MATLAB was used to analyze the recorded data. This involved calculating the time differences of arrival (TDOAs) of the sounds at the two microphones, from which the locations of the sounds were determined.

Data Processing

1-Dimensional

Data Collection

  • Raw audio data were collected with two microphones, each recording saved as a '.dat' file.
  • MATLAB was utilized for data importation and preliminary analysis.

Data Analysis Steps

  1. Data Loading: Each microphone's data was loaded into MATLAB.
  2. Distance Setting: A fixed distance of 'L' meters was maintained between both microphones.
  3. Time and Amplitude Extraction: Time-stamped amplitude data were extracted from the audio files.
  4. Amplitude Normalization: DC offset was mitigated by subtracting the mean amplitude from each signal.
  5. Time Vector Adjustment: The time vector was recalibrated to start at zero, synchronizing the datasets.
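
The normalization and time-adjustment steps above can be sketched as follows (Python for illustration; the project's own implementation is the MATLAB script in the MATLAB code section, and a [time, amplitude] column layout of the '.dat' traces is assumed):

```python
# Sketch of steps 3-5 above; the function name and the (time, amplitude)
# row format are our own assumptions about the '.dat' traces.

def preprocess(rows):
    """rows: list of (time, amplitude) pairs from one microphone trace."""
    times = [t for t, _ in rows]
    ampls = [a for _, a in rows]
    mean_a = sum(ampls) / len(ampls)   # step 4: remove the DC offset
    ampls = [a - mean_a for a in ampls]
    t0 = times[0]                      # step 5: time axis starts at zero
    times = [t - t0 for t in times]
    return times, ampls
```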

Visualization

  • Amplitude offsets for each microphone were plotted in a shared graph with distinct colors (red for Microphone 1 and blue for Microphone 2) to ensure clear differentiation.
Sound source type: Clap
Sound source type: 'Hello'
Sound source type: 'Yes'
Sound source type: 'A'

Cross-Correlation Analysis

  • Temporal delays between signals were quantified through cross-correlation, pinpointing the sound's origination point relative to the microphones.

Sound Source Localization

  • Employing the speed of sound (340 m/s), the source's location was computed based on the calculated time lag between the two signals.

Documentation and Output

  • MATLAB commands were scripted to automate the processing steps, including normalization and cross-correlation plotting.
  • The calculated time differences, delta distances, and source positions were output to the MATLAB console.

MATLAB code

1-Dimensional

 ch_0 = load('C1Trace00000.dat');
 ch_1 = load('C2Trace00000.dat'); % trigger, reference 
 L = 2; % dist btw 2 microphones
 time = ch_0(:, 1);
 dtime = ch_0(2, 1)- ch_0(1, 1);
 ch_0_ampl = ch_0(:, 2);
 ch_1_ampl = ch_1(:, 2);
 ch_0_ampl_mean = mean(ch_0_ampl);
 ch_1_ampl_mean = mean(ch_1_ampl);
 ch_0_ampl_offset = ch_0_ampl - ch_0_ampl_mean;
 ch_1_ampl_offset = ch_1_ampl - ch_1_ampl_mean;
 % Adjust time vector to start from 0
 time_adj = time - time(1);  % Subtract the first time value from all elements
 % Create a new figure window showing the first figure
 figure;
 plot(time_adj, ch_0_ampl_offset, 'r', time_adj, ch_1_ampl_offset, 'b');
 grid on;
 title('Amplitude Offset for M1 and M2');
 xlim([min(time_adj) max(time_adj)]);
 xlabel('Time (s)');
 ylabel('Amplitude Offset (V)');
 legend('Microphone 1', 'Microphone 2');
 [c, lag] = xcorr(ch_0_ampl_offset, ch_1_ampl_offset);
 % Create new figure window showing cross-correlation graphs
 figure;
 plot(dtime * lag, c);
 grid on;
 title('Cross-Correlation between M1 and M2');
 xlabel('Time Lag (s)');
 ylabel('Cross-Correlation');
 VS = 340; % speed of sound in m/s
 [val, index] = max(c);
 time_diff = dtime * lag(index);
 delta_x = time_diff * VS;
 x_pos = (L - delta_x) / 2;
 % Output calculation results
 disp(['Time Difference: ', num2str(time_diff), ' seconds']);
 disp(['Delta x: ', num2str(delta_x), ' meters']);
 disp(['Source Position: ', num2str(x_pos), ' meters']);
 % Compute normalized cross-correlation
 [c_norm, lag_norm] = xcorr(ch_0_ampl_offset, ch_1_ampl_offset, 'coeff');
 % Create a new graphics window to display the normalized cross-correlation graph
 figure;
 plot(dtime * lag_norm, c_norm);
 grid on;
 title('Normalized Cross-Correlation between M1 and M2');
 xlabel('Time Lag (s)');
 ylabel('Normalized Cross-Correlation');
 % Find the maximum value of the normalized cross-correlation and calculate the time difference
 [val_norm, index_norm] = max(c_norm);
 time_diff_norm = dtime * lag_norm(index_norm);
 % Output normalized calculation results
 disp(['Normalized Time Difference: ', num2str(time_diff_norm), ' seconds']);

Results

1-Dimensional

Source type

Absolute-error plots: quantitative comparison of the calculated, actual, and manually read results.

Clap
'Hello'
'a'

Analysis of xcorr results

Compare the magnitudes of the absolute correlation coefficients across the measurement results. Conclusion: the xxx sound shows the highest degree of correlation.

Clap
'Hello'
'a'

Summary

Error Analysis and Discussions

In the process of sound source localization using cross-correlation techniques in MATLAB, various factors can introduce significant errors. These include human-induced errors during operation and readings, as well as the intrinsic limitations of the `xcorr` function.

Human factors

  • Manual Readings and Subjectivity:
    • It is often challenging to accurately determine the relative positions of two sound waves, leading to large errors. The direct reading of time differences (Δt) from the oscilloscope may therefore have limited reliability.
    • Comparative charts should be used to check for noticeable discrepancies in these readings.
  • Optimal Parameter Settings:
    • Finding the most accurate setting for measurement parameters is crucial. Incorrect settings can lead to insufficient precision, significantly impacting the calculated results and the actual sound source location.
    • Time Division Settings: Larger time divisions may produce sharper peaks but decrease time precision, affecting overall location accuracy. Conversely, smaller divisions can increase precision but limit the measurable sound duration.
    • Voltage Division Settings: The ratio of the sound wave to the oscilloscope display scale must be optimally set to ensure that waveforms are neither too compressed nor too elongated, as both extremes can distort correlation accuracy.

Limitations of Cross-Correlation (xcorr)

  • Sampling Frequency and Data Point Quantity:
    • Sampling Rate and Data Points: The accuracy of the time delay measured by the cross-correlation function (`xcorr`) is contingent upon the sampling rate and the number of data points.
      • If the sampling interval is too large or if the signal is not continuously recorded, the time delay derived may not be very precise.
      • Higher sampling rates can provide more detailed data and potentially more accurate delay estimates.
  • Signal Characteristics and Noise:
    • Waveform Integrity and Noise: An incomplete waveform or significant noise within the signal can lead to erroneous interpretations by the cross-correlation algorithm.
      • Low-frequency oscillations or small amplitude variations can make it difficult for `xcorr` to accurately identify the delay corresponding to the peak correlation.
      • Ensuring high signal integrity and minimizing noise are crucial for reliable cross-correlation analysis.
    • Sound Signal Attenuation: The natural decay of the sound signal with distance may influence the code's ability to judge and compare waveforms accurately.
      • Sound intensity diminishes as it travels through the medium, which could lead to variations in the recorded waveforms at different distances.
  • Boundary Effects and Window Size:
    • Effects of Signal Length and Window Size: The outcome of cross-correlation analysis is also influenced by the length of the signal and the size of the processing window.
      • Short signals or inappropriate window sizes can lead to misinterpretation of delays, affecting the overall results.
      • Adjusting the window size to adequately capture the dynamics of the signal without truncating important features is essential.


It is imperative to meticulously consider and, where possible, quantify each potential source of error. By doing so, the interpretation of data can be significantly enhanced, improving the reliability of the sound source localization methodology.

Room for Improvement

Technical and operational aspects

Algorithm improvements

Additional functionality

Logs

A link to the experiment log will be added here.

References

Sensor that recognizes specific sounds and steers toward the source

This is the name and link of the previous version of this page. The final project name displayed on the main page was adjusted to reflect the final results and conditions.