How to Compute a Spectrogram of Audio in Signal Processing
To compute a spectrogram of audio, split the audio signal into short overlapping segments, apply a window function to each segment, and compute the Fourier transform of each windowed segment. The result shows how the frequency content changes over time. In Python, the scipy.signal.spectrogram function performs all of these steps in a single call.
Syntax
The typical function to compute a spectrogram in Python is scipy.signal.spectrogram. It takes the audio signal and sampling rate as main inputs, along with optional parameters like window type, segment length, and overlap.
- signal: The audio data array.
- fs: Sampling frequency of the audio.
- window: Type of window to apply (e.g., 'hann').
- nperseg: Number of samples per segment.
- noverlap: Number of overlapping samples between segments.
- nfft: Number of FFT points.
The function returns frequencies, times, and the spectrogram matrix.
```python
import scipy.signal

frequencies, times, Sxx = scipy.signal.spectrogram(signal, fs, window='hann', nperseg=256, noverlap=128, nfft=256)
```
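To make the segment, window, and transform steps concrete, here is a minimal NumPy sketch of roughly what the call above computes under its default settings (constant detrending, density scaling, one-sided output). The helper name `manual_spectrogram` is hypothetical, and the sketch assumes an even `nperseg` with `nfft` equal to `nperseg`. It is meant for intuition, not as a replacement for the library function.

```python
import numpy as np
from scipy.signal import get_window, spectrogram


def manual_spectrogram(x, fs, nperseg=256, noverlap=128):
    """Hypothetical helper: hand-rolled power spectral density spectrogram."""
    step = nperseg - noverlap
    win = get_window('hann', nperseg)          # periodic Hann window
    n_segments = (len(x) - noverlap) // step   # only full segments are kept

    # 1. Split the signal into overlapping segments (one per row).
    frames = np.stack([x[i * step:i * step + nperseg] for i in range(n_segments)])
    # 2. Remove each segment's mean (constant detrend), then apply the window.
    frames = (frames - frames.mean(axis=1, keepdims=True)) * win
    # 3. FFT of each segment, keeping only the non-negative frequencies.
    spectra = np.fft.rfft(frames, axis=1)

    # Scale to a power spectral density and fold in the negative frequencies:
    # every bin except DC and Nyquist is doubled (assumes even nperseg).
    psd = np.abs(spectra) ** 2 / (fs * np.sum(win ** 2))
    psd[:, 1:-1] *= 2

    freqs = np.fft.rfftfreq(nperseg, d=1 / fs)
    times = (np.arange(n_segments) * step + nperseg / 2) / fs  # segment centres
    return freqs, times, psd.T                 # shape: (frequencies, times)


# Compare against SciPy on a two-second, 100 Hz test tone.
fs = 1000
t = np.arange(2 * fs) / fs
x = np.sin(2 * np.pi * 100 * t)

f_manual, t_manual, S_manual = manual_spectrogram(x, fs)
f_ref, t_ref, S_ref = spectrogram(x, fs, window='hann', nperseg=256, noverlap=128)
print(S_manual.shape, np.allclose(S_manual, S_ref))
```

Each column of the returned matrix is the spectrum of one segment, which is why the matrix has one row per frequency bin and one column per time frame.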
Example
This example shows how to compute and plot the spectrogram of a sample audio signal using scipy.signal.spectrogram and matplotlib. It demonstrates the frequency content changing over time.
```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.signal import spectrogram

# Create a sample signal: 2 seconds, 1000 Hz sample rate
fs = 1000
T = 2
t = np.linspace(0, T, int(T*fs), endpoint=False)

# Signal with two frequencies changing over time
signal = np.sin(2*np.pi*100*t) * (t < 1) + np.sin(2*np.pi*300*t) * (t >= 1)

# Compute spectrogram
frequencies, times, Sxx = spectrogram(signal, fs, window='hann', nperseg=256, noverlap=128)

# Plot spectrogram in dB (the small offset avoids log10(0))
plt.pcolormesh(times, frequencies, 10 * np.log10(Sxx + 1e-12), shading='gouraud')
plt.ylabel('Frequency [Hz]')
plt.xlabel('Time [sec]')
plt.title('Spectrogram of Sample Signal')
plt.colorbar(label='Intensity [dB]')
plt.show()
```
Output
A plot window showing a spectrogram with frequency on the vertical axis and time on the horizontal axis, displaying two distinct frequency bands: one at 100 Hz during the first second and one at 300 Hz during the second.
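The plot can also be checked numerically. The sketch below rebuilds the same sample signal, picks the strongest frequency bin in each time frame, and prints it. Frames that straddle the switch at 1 second are skipped because they contain both tones. The reported values should land within one bin width (fs / nperseg) of 100 Hz and 300 Hz, since the bin centres do not fall exactly on those frequencies.

```python
import numpy as np
from scipy.signal import spectrogram

fs = 1000
t = np.linspace(0, 2, 2 * fs, endpoint=False)
signal = np.sin(2 * np.pi * 100 * t) * (t < 1) + np.sin(2 * np.pi * 300 * t) * (t >= 1)

nperseg = 256
frequencies, times, Sxx = spectrogram(signal, fs, window='hann', nperseg=nperseg, noverlap=128)

# Strongest frequency bin in each time frame (one column of Sxx per frame).
dominant = frequencies[np.argmax(Sxx, axis=0)]

# Skip frames centred within half a segment of t = 1 s; they mix both tones.
half_segment = nperseg / 2 / fs
clean = np.abs(times - 1.0) >= half_segment

for frame_time, freq in zip(times[clean], dominant[clean]):
    print(f"t = {frame_time:.3f} s  ->  {freq:.1f} Hz")
```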
Common Pitfalls
Common mistakes when computing spectrograms include:
- Setting nperseg too short or too long: short segments reduce frequency resolution, while long segments reduce time resolution.
- Not applying a window function, which causes spectral leakage.
- Choosing noverlap too small, which makes the spectrogram less smooth along the time axis.
- Misinterpreting the spectrogram scale (linear vs dB).
Always balance segment length and overlap for your signal's characteristics.
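```python
import numpy as np
from scipy.signal import spectrogram

# Same test signal as in the example above
fs = 1000
t = np.linspace(0, 2, 2 * fs, endpoint=False)
signal = np.sin(2*np.pi*100*t) * (t < 1) + np.sin(2*np.pi*300*t) * (t >= 1)

# Wrong: rectangular ('boxcar') window, i.e. no tapering, and a very short segment
frequencies, times, Sxx_wrong = spectrogram(signal, fs, window='boxcar', nperseg=50, noverlap=10)

# Right: hann window and moderate segment length
frequencies, times, Sxx_right = spectrogram(signal, fs, window='hann', nperseg=256, noverlap=128)
```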
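The trade-off behind the first pitfall can be put into numbers. Assuming nfft equals nperseg, the spacing between frequency bins is fs / nperseg and the spacing between time frames is (nperseg - noverlap) / fs. The helper `resolution` below is a hypothetical name used only for this illustration. The bin spacing is only a rough guide to frequency resolution, because the window's main-lobe width also matters.

```python
def resolution(fs, nperseg, noverlap):
    """Hypothetical helper: (frequency bin spacing in Hz, frame spacing in s)."""
    freq_step = fs / nperseg                 # assumes nfft == nperseg
    time_step = (nperseg - noverlap) / fs
    return freq_step, time_step


fs = 44100  # a common audio sample rate
for nperseg in (256, 1024, 4096):
    freq_step, time_step = resolution(fs, nperseg, noverlap=nperseg // 2)
    print(f"nperseg={nperseg:5d}: {freq_step:7.2f} Hz per bin, "
          f"{time_step * 1000:6.2f} ms per frame")
```

Longer segments shrink the bin spacing but stretch the frame spacing, so neither can be improved without giving up some of the other.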
Quick Reference
| Parameter | Description | Typical Values |
|---|---|---|
| signal | Input audio data array | Array of samples |
| fs | Sampling frequency in Hz | Audio sample rate (e.g., 44100) |
| window | Window function to reduce spectral leakage | 'hann', 'hamming', 'blackman' |
| nperseg | Samples per segment (affects resolution) | 256, 512, 1024 |
| noverlap | Overlap between segments | Usually 50% of nperseg |
| nfft | Number of FFT points | Equal to or greater than nperseg |
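As a small illustration of the nfft row, the sketch below compares the default (nfft equal to nperseg) with a larger nfft. Setting nfft above nperseg zero-pads each segment, which yields more, finer-spaced frequency bins but leaves the number of time frames unchanged. The finer grid interpolates the spectrum and does not add information beyond what the 256-sample segment contains.

```python
import numpy as np
from scipy.signal import spectrogram

fs = 1000
t = np.arange(2 * fs) / fs
x = np.sin(2 * np.pi * 100 * t)

# Default: nfft equals nperseg.
f_plain, t_plain, S_plain = spectrogram(x, fs, window='hann', nperseg=256, noverlap=128)
# Zero-padded: each 256-sample segment is padded to 1024 points before the FFT.
f_padded, t_padded, S_padded = spectrogram(x, fs, window='hann', nperseg=256, noverlap=128, nfft=1024)

print("bins:", len(f_plain), "spacing:", f_plain[1] - f_plain[0], "Hz")
print("bins:", len(f_padded), "spacing:", f_padded[1] - f_padded[0], "Hz")
print("time frames:", len(t_plain), len(t_padded))
```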
Key Takeaways
Use scipy.signal.spectrogram to compute spectrograms easily from audio data.
Choose segment length and overlap carefully to balance time and frequency resolution.
Apply a window function like 'hann' to reduce spectral leakage.
Spectrogram intensity is often easier to interpret in decibels (dB) than on a linear scale.
Plotting the spectrogram helps visualize how frequencies change over time.
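To tie these points to an actual audio file rather than a synthetic array, here is a minimal end-to-end sketch using scipy.io.wavfile. So that it runs on its own, it first writes a short 440 Hz test tone to a temporary file; the file name `example_tone.wav` is hypothetical, and the normalisation step assumes 16-bit PCM samples. With a real recording, only the path would change.

```python
import os
import tempfile

import numpy as np
from scipy.io import wavfile
from scipy.signal import spectrogram

# Write a one-second 440 Hz test tone so the sketch is self-contained.
fs_out = 44100
t = np.arange(fs_out) / fs_out
tone = (0.5 * np.sin(2 * np.pi * 440 * t) * 32767).astype(np.int16)
path = os.path.join(tempfile.mkdtemp(), "example_tone.wav")  # hypothetical file name
wavfile.write(path, fs_out, tone)

# Read the file: returns the sample rate and an array of integer samples.
fs, audio = wavfile.read(path)
if audio.ndim > 1:
    audio = audio.mean(axis=1)                # mix stereo down to mono
audio = audio.astype(np.float64) / 32768.0    # assumes 16-bit PCM input

# Hann window, 50% overlap, then convert power to decibels.
frequencies, times, Sxx = spectrogram(audio, fs, window='hann', nperseg=1024, noverlap=512)
Sxx_db = 10 * np.log10(Sxx + 1e-12)           # small offset avoids log10(0)

# Strongest frequency bin on average across all frames.
peak = frequencies[np.argmax(Sxx.mean(axis=1))]
print(f"sample rate: {fs} Hz, strongest frequency: {peak:.1f} Hz")
```

The reported peak is the centre of the nearest frequency bin, so it will differ from 440 Hz by up to one bin width (fs / nperseg).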