
How to Compute Spectrogram of Audio in Signal Processing

To compute a spectrogram of audio, split the signal into short overlapping segments, multiply each segment by a window function, and take the Fourier transform of each one. Placing the resulting spectra side by side shows how the frequency content changes over time. In Python, the scipy.signal.spectrogram function performs all of these steps in a single call.
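The three steps above can be sketched by hand with NumPy. This is only an illustration, not how SciPy implements it: the function name manual_spectrogram is hypothetical, the output is an unnormalized power value, there is no detrending, and np.hanning gives a symmetric Hann window, so the numbers will not match scipy.signal.spectrogram exactly.

```python
import numpy as np

def manual_spectrogram(x, fs, nperseg=256, noverlap=128):
    """Illustrative power spectrogram: segment, window, FFT (no normalization)."""
    step = nperseg - noverlap
    window = np.hanning(nperseg)  # symmetric Hann window (assumption for this sketch)
    n_segments = (len(x) - noverlap) // step
    # Step 1: slice the signal into overlapping segments, one per row
    segments = np.stack([x[i * step : i * step + nperseg] for i in range(n_segments)])
    # Step 2: taper each segment; Step 3: one-sided FFT of each segment
    spectrum = np.fft.rfft(segments * window, axis=1)
    power = np.abs(spectrum) ** 2
    freqs = np.fft.rfftfreq(nperseg, d=1 / fs)
    times = (np.arange(n_segments) * step + nperseg / 2) / fs  # segment centers
    return freqs, times, power.T  # rows: frequencies, columns: segments

# Test tone: 1 second of a 100 Hz sine sampled at 1000 Hz
fs = 1000
t = np.arange(fs) / fs
x = np.sin(2 * np.pi * 100 * t)

freqs, times, power = manual_spectrogram(x, fs)
peak_freqs = freqs[power.argmax(axis=0)]  # strongest frequency in each segment
print(power.shape)
print(peak_freqs)
```

Every segment should peak in the frequency bin closest to 100 Hz. The bins are spaced fs / nperseg apart, so the reported peak is only accurate to within one bin.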

📐

Syntax

The typical function to compute a spectrogram in Python is scipy.signal.spectrogram. It takes the audio signal and sampling rate as main inputs, along with optional parameters like window type, segment length, and overlap.

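signal: The audio data array (passed as the first positional argument, which SciPy names x).
fs: Sampling frequency of the audio, in Hz.
window: Type of window to apply to each segment (e.g., 'hann').
nperseg: Number of samples per segment.
noverlap: Number of samples shared by consecutive segments.
nfft: Length of the FFT. If it is larger than nperseg, each segment is zero-padded; if omitted, it defaults to nperseg.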

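The function returns an array of sample frequencies, an array of segment times, and the spectrogram matrix, whose rows correspond to frequencies and whose columns correspond to times.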

python

import scipy.signal

frequencies, times, Sxx = scipy.signal.spectrogram(signal, fs, window='hann', nperseg=256, noverlap=128, nfft=256)
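To make the three return values concrete, the following sketch runs the call on placeholder data and prints the array shapes. The noise input and the 1000 Hz sample rate are assumptions chosen only for illustration.

```python
import numpy as np
from scipy.signal import spectrogram

fs = 1000  # assumed sample rate in Hz
# 2 seconds of placeholder noise standing in for real audio
x = np.random.default_rng(0).standard_normal(2 * fs)

frequencies, times, Sxx = spectrogram(
    x, fs, window='hann', nperseg=256, noverlap=128, nfft=256
)

print(frequencies.shape)  # one entry per frequency bin: nfft // 2 + 1
print(times.shape)        # one entry per segment
print(Sxx.shape)          # (number of frequency bins, number of segments)
print(frequencies[-1])    # highest frequency is the Nyquist frequency, fs / 2
```

For real-valued input the function returns a one-sided spectrum, so the frequency axis runs from 0 Hz up to fs / 2.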

💻

Example

This example shows how to compute and plot the spectrogram of a sample audio signal using scipy.signal.spectrogram and matplotlib. It demonstrates the frequency content changing over time.

python

import numpy as np
import matplotlib.pyplot as plt
from scipy.signal import spectrogram

# Create a sample signal: 2 seconds, 1000 Hz sample rate
fs = 1000
T = 2
t = np.linspace(0, T, int(T*fs), endpoint=False)

# Signal with two frequencies changing over time
signal = np.sin(2*np.pi*100*t) * (t < 1) + np.sin(2*np.pi*300*t) * (t >= 1)

# Compute spectrogram
frequencies, times, Sxx = spectrogram(signal, fs, window='hann', nperseg=256, noverlap=128)

# Plot spectrogram in dB; the small offset avoids log10(0) for empty bins
plt.pcolormesh(times, frequencies, 10 * np.log10(Sxx + 1e-12), shading='gouraud')
plt.ylabel('Frequency [Hz]')
plt.xlabel('Time [sec]')
plt.title('Spectrogram of Sample Signal')
plt.colorbar(label='Intensity [dB]')
plt.show()

Output

A plot window shows the spectrogram with time on the horizontal axis and frequency on the vertical axis. Two distinct bands are visible: one at 100 Hz from 0 s to 1 s, and one at 300 Hz from 1 s to 2 s.
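The same result can be checked numerically, without a plot, by finding the strongest frequency bin in each time slice. This is a small sketch that rebuilds the example signal; the 0.8 s and 1.2 s cut-offs are an assumption chosen to skip the segments that straddle the change at 1 s.

```python
import numpy as np
from scipy.signal import spectrogram

# Rebuild the example signal: 100 Hz for the first second, 300 Hz for the second
fs = 1000
t = np.arange(2 * fs) / fs
x = np.sin(2 * np.pi * 100 * t) * (t < 1) + np.sin(2 * np.pi * 300 * t) * (t >= 1)

frequencies, times, Sxx = spectrogram(x, fs, window='hann', nperseg=256, noverlap=128)

# Strongest frequency in each time slice (column of Sxx)
dominant = frequencies[Sxx.argmax(axis=0)]

# Ignore segments that overlap the transition at t = 1 s
first_half = dominant[times < 0.8]
second_half = dominant[times > 1.2]
bin_width = fs / 256  # spacing between frequency bins in Hz

print(first_half)
print(second_half)
```

The printed values land on the bins nearest 100 Hz and 300 Hz rather than on those exact frequencies, because the frequency axis is quantized in steps of bin_width.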

⚠️

Common Pitfalls

Common mistakes when computing spectrograms include:

Choosing nperseg poorly: a segment that is too short gives coarse frequency resolution, while one that is too long blurs changes over time.
Using a rectangular window (no tapering), which causes spectral leakage.
Choosing noverlap too small, so the time axis is sampled coarsely and looks blocky.
Misinterpreting the spectrogram scale (linear power vs dB).

Always balance segment length and overlap for your signal's characteristics.

python

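import numpy as np
from scipy.signal import spectrogram

# Same test signal as in the example above: 100 Hz, then 300 Hz
fs = 1000
t = np.arange(2 * fs) / fs
signal = np.sin(2*np.pi*100*t) * (t < 1) + np.sin(2*np.pi*300*t) * (t >= 1)

# Wrong: rectangular ('boxcar') window, i.e. no tapering, and a very short segment
frequencies, times, Sxx_wrong = spectrogram(signal, fs, window='boxcar', nperseg=50, noverlap=10)

# Right: hann window and moderate segment length
frequencies, times, Sxx_right = spectrogram(signal, fs, window='hann', nperseg=256, noverlap=128)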
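The trade-off behind these settings comes down to simple arithmetic. The helper below, with the hypothetical name stft_resolution, computes the spacing between frequency bins and the time step between segments. It assumes nfft equals nperseg, and it reports bin spacing only; the window's main-lobe width, which also limits how well close frequencies separate, is left out.

```python
def stft_resolution(fs, nperseg, noverlap):
    """Return (frequency bin spacing in Hz, hop between segments in seconds)."""
    freq_spacing = fs / nperseg             # assumes nfft == nperseg
    hop_seconds = (nperseg - noverlap) / fs
    return freq_spacing, hop_seconds

fs = 1000  # sample rate in Hz, matching the example signal
for nperseg, noverlap in [(50, 10), (256, 128), (1024, 512)]:
    df, dt = stft_resolution(fs, nperseg, noverlap)
    print(
        f"nperseg={nperseg:5d}: bin spacing {df:6.2f} Hz, "
        f"hop {dt * 1000:4.0f} ms, segment length {nperseg / fs * 1000:5.0f} ms"
    )
```

With nperseg=50 the bins are 20 Hz apart, which is too coarse for many audio tasks, while nperseg=1024 spans more than a second of this signal and would smear the change at 1 s.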

📊

Quick Reference

Parameter	Description	Typical Values
signal (x in SciPy)	Input audio data array	1-D array of samples
fs	Sampling frequency in Hz	Audio sample rate (e.g., 44100)
window	Window function to reduce spectral leakage	'hann', 'hamming', 'blackman'
nperseg	Samples per segment (sets the time/frequency resolution trade-off)	256, 512, 1024
noverlap	Samples shared by consecutive segments	Usually 50% of nperseg
nfft	Number of FFT points (zero-pads when larger than nperseg)	Equal to or greater than nperseg
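The effect of setting nfft larger than nperseg can be seen by comparing the frequency axes of two calls. This is a minimal sketch on an assumed 100 Hz test tone. Zero-padding makes the frequency grid finer, but it interpolates the spectrum rather than adding information, so it does not help separate frequencies that the segment length cannot resolve.

```python
import numpy as np
from scipy.signal import spectrogram

fs = 1000
t = np.arange(2 * fs) / fs
x = np.sin(2 * np.pi * 100 * t)  # assumed test tone

# nfft left at its default (equal to nperseg)
f_plain, t_plain, S_plain = spectrogram(x, fs, window='hann', nperseg=256, noverlap=128)

# nfft doubled: each 256-sample segment is zero-padded to 512 points
f_padded, t_padded, S_padded = spectrogram(
    x, fs, window='hann', nperseg=256, noverlap=128, nfft=512
)

print(len(f_plain), f_plain[1] - f_plain[0])     # number of bins and bin spacing in Hz
print(len(f_padded), f_padded[1] - f_padded[0])  # twice as fine a grid
print(S_plain.shape[1] == S_padded.shape[1])     # time axis is unchanged
```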

✅

Key Takeaways

Use scipy.signal.spectrogram to compute spectrograms easily from audio data.

Choose segment length and overlap carefully to balance time and frequency resolution.

Apply a window function like 'hann' to reduce spectral leakage.

Display spectrogram intensity in decibels (dB) when the dynamic range is large, since a linear scale tends to hide weak components.

Plotting the spectrogram helps visualize how frequencies change over time.