How to Process Audio Signals Using Python: Simple Guide
To process audio signals in Python, use libraries like librosa or scipy.io.wavfile to load audio files, then apply operations like filtering or feature extraction. These tools let you read audio data as arrays and manipulate them for analysis or playback.
Syntax
Here is the basic syntax to load audio using librosa and scipy:
- librosa.load(path, sr=None): Loads an audio file and returns the audio time series and the sample rate, in that order. Passing sr=None keeps the file's native sample rate; by default librosa resamples to 22050 Hz.
- scipy.io.wavfile.read(path): Reads a WAV file and returns the sample rate and the data array, in that order (the reverse of librosa).
- Audio data is a numeric array representing sound amplitude over time.
- You can apply filters, transformations, or extract features from this array.
```python
import librosa
from scipy.io import wavfile

# Load audio with librosa
audio_data, sample_rate = librosa.load('audio.wav', sr=None)

# Load audio with scipy
sample_rate_scipy, data_scipy = wavfile.read('audio.wav')
```
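The Syntax notes mention applying filters to the loaded array. As a minimal sketch (not part of the original guide), the snippet below uses a synthetic two-tone signal in place of a real file and applies a low-pass Butterworth filter from scipy.signal. The `lowpass` helper, the 1000 Hz cutoff, and the filter order are illustrative assumptions.

```python
import numpy as np
from scipy import signal

def lowpass(audio, sample_rate, cutoff_hz=1000.0, order=4):
    """Zero-phase low-pass filter (hypothetical helper, illustrative defaults)."""
    sos = signal.butter(order, cutoff_hz, btype="lowpass",
                        fs=sample_rate, output="sos")
    return signal.sosfiltfilt(sos, audio)

# Synthetic stand-in for loaded audio: 1 second of 200 Hz + 5000 Hz tones
sample_rate = 22050
t = np.arange(sample_rate) / sample_rate
audio_data = np.sin(2 * np.pi * 200 * t) + np.sin(2 * np.pi * 5000 * t)

filtered = lowpass(audio_data, sample_rate)

# With exactly 1 second of audio, FFT bin k corresponds to k Hz,
# so the strength of each tone can be read off directly.
spectrum_before = np.abs(np.fft.rfft(audio_data))
spectrum_after = np.abs(np.fft.rfft(filtered))
kept_low = spectrum_after[200] / spectrum_before[200]
kept_high = spectrum_after[5000] / spectrum_before[5000]
print(f"Fraction of 200 Hz tone kept:  {kept_low:.3f}")
print(f"Fraction of 5000 Hz tone kept: {kept_high:.6f}")
```

The same function should work on the array returned by librosa.load, since that is also a one-dimensional float array when the audio is mono.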
Example
This example loads an audio file, computes its duration, and plots the waveform using matplotlib.
```python
import librosa
import matplotlib.pyplot as plt

# Load audio file
audio_path = 'audio.wav'
audio_data, sample_rate = librosa.load(audio_path, sr=None)

# Calculate duration in seconds
duration = len(audio_data) / sample_rate
print(f'Duration: {duration:.2f} seconds')

# Plot waveform
plt.figure(figsize=(10, 4))
plt.plot(audio_data)
plt.title('Audio Waveform')
plt.xlabel('Samples')
plt.ylabel('Amplitude')
plt.show()
```
Output
Duration: 3.45 seconds
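The duration is the sample count divided by the sample rate, and the same relation gives each sample a timestamp, so the waveform can be plotted against seconds instead of sample indices. A small sketch, with a synthetic array standing in for audio_data (the 8000 Hz rate and 2-second length are assumptions for illustration):

```python
import numpy as np

sample_rate = 8000                      # assumed rate for this illustration
audio_data = np.zeros(2 * sample_rate)  # stand-in for a loaded 2-second mono clip

# Duration in seconds = number of samples / samples per second
duration = len(audio_data) / sample_rate

# Timestamp of every sample; usable as the x-axis, e.g. plt.plot(times, audio_data)
times = np.arange(len(audio_data)) / sample_rate

print(f"Duration: {duration:.2f} seconds")
print(f"Last sample at {times[-1]:.6f} s")
```

Note that len(audio_data) only counts samples for mono audio; for a librosa-style stereo array of shape (2, n), the sample count is audio_data.shape[-1].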
Common Pitfalls
Common mistakes when processing audio in Python include:
- Not matching sample rates when combining audio signals.
- Using scipy.io.wavfile.read, which only supports WAV files, not MP3 or others.
- Ignoring stereo vs mono channels; some functions expect mono audio.
- Not normalizing audio data, which can cause clipping or incorrect analysis.
Always check the audio format and sample rate before processing.
```python
import librosa

# Wrong: assuming stereo audio without checking
audio_data, sr = librosa.load('audio.wav', mono=False)
print(audio_data.shape)  # Could be (n,) mono or (2, n) stereo

# Right: convert to mono if needed
audio_mono, sr = librosa.load('audio.wav', mono=True)
print(audio_mono.shape)  # (n,)
```
Output
(220500,)
(220500,)
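The other pitfalls (mismatched sample rates and unnormalized data) can be handled with a few small helpers. The sketch below is one possible approach, not the only one: `to_mono`, `peak_normalize`, and `resample` are hypothetical helper names, the input arrays are synthetic, and the stereo layout is assumed to be librosa's (channels, n). Arrays from scipy.io.wavfile.read are laid out as (n, channels) and would need transposing first.

```python
from math import gcd

import numpy as np
from scipy import signal

def to_mono(audio):
    """Average the channels of a (channels, n) array; pass mono through."""
    return audio if audio.ndim == 1 else audio.mean(axis=0)

def peak_normalize(audio, peak=1.0):
    """Scale so the largest absolute sample equals `peak`; leave silence as-is."""
    max_abs = np.max(np.abs(audio))
    return audio if max_abs == 0 else audio * (peak / max_abs)

def resample(audio, orig_sr, target_sr):
    """Resample a mono signal with polyphase filtering."""
    g = gcd(orig_sr, target_sr)
    return signal.resample_poly(audio, target_sr // g, orig_sr // g)

# Synthetic stand-ins: a quiet stereo clip at 44100 Hz and a mono clip at 22050 Hz
stereo = 0.1 * np.random.default_rng(0).standard_normal((2, 44100))
other = np.zeros(22050)

mono = to_mono(stereo)               # (2, 44100) -> (44100,)
mono = resample(mono, 44100, 22050)  # match the other clip's sample rate
mono = peak_normalize(mono)          # largest absolute sample becomes 1.0

# Both clips now share a sample rate and length, so they can be combined
mixed = 0.5 * (mono + other)
print(mono.shape, other.shape)
```

Normalizing after resampling, rather than before, keeps the final peak at the chosen level, since resampling can change peak values slightly.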
Quick Reference
| Function | Description |
|---|---|
| librosa.load(path, sr=None, mono=True) | Load audio file as array, optional sample rate and mono conversion |
| scipy.io.wavfile.read(path) | Read WAV file, returns sample rate and data array |
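| librosa.feature.mfcc(y=y, sr=sr) | Extract MFCC features from audio (arguments are keyword-only in librosa 0.10 and later) |
| librosa.effects.trim(y) | Trim silence from start and end of audio; returns the trimmed audio and the kept sample interval |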
| matplotlib.pyplot.plot(data) | Plot audio waveform or features |
Key Takeaways
- Use librosa or scipy to load audio files as numeric arrays for processing.
- Always check and handle sample rate and mono/stereo format correctly.
- Visualize audio waveforms to understand signal shape and duration.
- Normalize audio data to avoid clipping and ensure consistent analysis.
- Use built-in feature extraction functions like MFCC for audio analysis.