
How to Process Audio Signal Using Python: Simple Guide

To process audio signals in Python, load the file with librosa or scipy.io.wavfile, then apply operations such as filtering or feature extraction. Both libraries return the audio as a NumPy array that you can analyze, modify, or play back.

📐

Syntax

Here is the basic syntax to load and process audio using librosa and scipy:

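librosa.load(path, sr=None): Loads an audio file and returns the audio time series and the sample rate, in that order. With sr=None the file's native sample rate is kept; by default librosa resamples to 22050 Hz and mixes down to mono.
scipy.io.wavfile.read(path): Reads a WAV file and returns the sample rate and the data array, in that order (the reverse of librosa).
Audio data is a numeric array representing sound amplitude over time. librosa returns floating-point samples, while scipy returns the data type stored in the file (often 16-bit integers), with stereo data shaped (n_samples, n_channels).
You can apply filters or transformations to this array, or extract features from it.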

python

import librosa
from scipy.io import wavfile

# Load audio with librosa
audio_data, sample_rate = librosa.load('audio.wav', sr=None)

# Load audio with scipy
sample_rate_scipy, data_scipy = wavfile.read('audio.wav')
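
Once the audio is loaded as an array, filtering is an ordinary array operation. The following is a minimal sketch, assuming a Butterworth low-pass filter from scipy.signal is a suitable choice. The helper name lowpass, the 1000 Hz cutoff, and the synthesized test signal are illustrative assumptions, not part of the original example.

```python
import numpy as np
from scipy import signal


def lowpass(audio, sample_rate, cutoff_hz=1000.0, order=4):
    """Zero-phase Butterworth low-pass filter (hypothetical helper)."""
    # Second-order sections are numerically more stable than (b, a) coefficients
    sos = signal.butter(order, cutoff_hz, btype="lowpass", fs=sample_rate, output="sos")
    # sosfiltfilt runs the filter forward and backward, so no phase delay is added
    return signal.sosfiltfilt(sos, audio)


# Synthesize one second of audio: a 220 Hz tone plus an unwanted 4000 Hz tone
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
low_tone = np.sin(2 * np.pi * 220 * t)
high_tone = 0.5 * np.sin(2 * np.pi * 4000 * t)
mixed = low_tone + high_tone

filtered = lowpass(mixed, sample_rate)

# RMS difference from the clean 220 Hz tone, before and after filtering
error_before = np.sqrt(np.mean((mixed - low_tone) ** 2))
error_after = np.sqrt(np.mean((filtered - low_tone) ** 2))
print(f"RMS error before: {error_before:.4f}, after: {error_after:.4f}")
```

The same helper can be applied to audio_data and sample_rate returned by librosa.load, since those are also a one-dimensional array and an integer rate.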

💻

Example

This example loads an audio file, computes its duration, and plots the waveform using matplotlib.

python

import librosa
import matplotlib.pyplot as plt

# Load audio file
audio_path = 'audio.wav'
audio_data, sample_rate = librosa.load(audio_path, sr=None)

# Calculate duration in seconds
duration = len(audio_data) / sample_rate
print(f'Duration: {duration:.2f} seconds')

# Plot waveform
plt.figure(figsize=(10, 4))
plt.plot(audio_data)
plt.title('Audio Waveform')
plt.xlabel('Samples')
plt.ylabel('Amplitude')
plt.show()

Output

Duration: 3.45 seconds
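
The example above needs an existing audio.wav, and its output depends on that file. To try the duration calculation without a recording, one option is to synthesize a test tone, write it as a WAV file, and read it back. The sketch below assumes a 2-second, 440 Hz tone at 22050 Hz; the file name test_tone.wav and all variable names are hypothetical.

```python
import os
import tempfile

import numpy as np
from scipy.io import wavfile

# Synthesize a 2-second, 440 Hz sine tone
sample_rate = 22050
duration_s = 2.0
t = np.arange(int(sample_rate * duration_s)) / sample_rate
tone = 0.5 * np.sin(2 * np.pi * 440.0 * t)

# Convert to 16-bit PCM, the most common WAV sample format
pcm = (tone * 32767).astype(np.int16)

# Write the file to a temporary directory and read it back
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "test_tone.wav")
    wavfile.write(path, sample_rate, pcm)
    rate_read, data_read = wavfile.read(path)

# Duration is the number of samples divided by the sample rate
duration = len(data_read) / rate_read
# A time axis in seconds, usable as x values instead of sample indices
time_axis = np.arange(len(data_read)) / rate_read

print(f"Duration: {duration:.2f} seconds")  # Duration: 2.00 seconds
```

Passing time_axis as the first argument to plt.plot labels the waveform in seconds rather than in samples.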

⚠️

Common Pitfalls

Common mistakes when processing audio in Python include:

Combining audio signals without first matching their sample rates.
Using scipy.io.wavfile.read on non-WAV files; it reads only WAV, not MP3 or other formats.
Ignoring stereo vs mono channels; some functions expect mono audio.
Not normalizing audio data, which can cause clipping or incorrect analysis. This matters most with scipy, which returns the integer samples stored in the file rather than floats.

Always check the audio format and sample rate before processing.

python

import librosa

# Risky: mono=False keeps the file's channels, so the shape depends on the file
audio_data, sr = librosa.load('audio.wav', mono=False)
print(audio_data.shape)  # (n,) for a mono file, (2, n) for a stereo file

# Right: convert to mono if needed
audio_mono, sr = librosa.load('audio.wav', mono=True)
print(audio_mono.shape)  # (n,)

Output

(220500,)
(220500,)
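
The other pitfalls (integer sample data, channel layout, and mismatched sample rates) can be handled with NumPy and scipy.signal. The following is a sketch under stated assumptions: the helper names to_mono_float and match_sample_rate are hypothetical, the input arrays are synthesized in place of real files, and polyphase resampling with scipy.signal.resample_poly is one possible resampling method among several.

```python
from math import gcd

import numpy as np
from scipy import signal


def to_mono_float(data):
    """Convert scipy-style WAV data to mono float32 in [-1, 1] (hypothetical helper)."""
    if data.dtype == np.uint8:
        # 8-bit WAV is unsigned with the zero level at 128
        data = (data.astype(np.float32) - 128.0) / 128.0
    elif np.issubdtype(data.dtype, np.integer):
        # Scale by the integer type's full-scale value, e.g. 32768 for int16
        data = data.astype(np.float32) / (float(np.iinfo(data.dtype).max) + 1.0)
    else:
        data = data.astype(np.float32)
    if data.ndim == 2:
        # scipy stores stereo as (n_samples, n_channels); average the channels
        data = data.mean(axis=1)
    return data


def match_sample_rate(audio, orig_sr, target_sr):
    """Resample audio from orig_sr to target_sr (hypothetical helper)."""
    if orig_sr == target_sr:
        return audio
    common = gcd(orig_sr, target_sr)
    return signal.resample_poly(audio, target_sr // common, orig_sr // common)


# Clip A: one second of stereo int16 at 44100 Hz, as wavfile.read might return
sr_a, sr_b = 44100, 22050
t_a = np.arange(sr_a) / sr_a
channel = np.sin(2 * np.pi * 440 * t_a)
stereo_int16 = (np.stack([channel, channel], axis=1) * 16000).astype(np.int16)

# Clip B: one second of mono float32 at 22050 Hz, as librosa.load might return
t_b = np.arange(sr_b) / sr_b
mono_b = 0.25 * np.sin(2 * np.pi * 660 * t_b).astype(np.float32)

# Bring clip A to mono float at clip B's sample rate, then mix
clip_a = match_sample_rate(to_mono_float(stereo_int16), sr_a, sr_b)
mix = clip_a + mono_b

# Peak-normalize so the loudest sample sits at full scale without clipping
peak = np.max(np.abs(mix))
if peak > 0:
    mix = mix / peak

print(clip_a.shape, mono_b.shape, float(np.max(np.abs(mix))))
```

Peak normalization is the simplest choice here; loudness-based normalization may be more appropriate when comparing recordings.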

📊

Quick Reference

Function	Description
librosa.load(path, sr=None, mono=True)	Load audio file as array, optional sample rate and mono conversion
scipy.io.wavfile.read(path)	Read WAV file, returns sample rate and data array
librosa.feature.mfcc(y=y, sr=sr)	Extract MFCC features from audio (librosa 0.10 and later require keyword arguments)
librosa.effects.trim(y)	Trim silence from start and end of audio
matplotlib.pyplot.plot(data)	Plot audio waveform or features

✅

Key Takeaways

Use librosa or scipy to load audio files as numeric arrays for processing.

Always check and handle sample rate and mono/stereo format correctly.

Visualize audio waveforms to understand signal shape and duration.

Normalize audio data to avoid clipping and ensure consistent analysis.

Use built-in feature extraction functions like MFCC for audio analysis.