How to Use Matplotlib with Pandas for Data Visualization
You can use matplotlib directly with pandas by calling the plot() method on a pandas DataFrame or Series, which uses matplotlib under the hood. This allows you to create various charts like line, bar, and scatter plots easily with simple syntax.
Syntax
The basic syntax to plot data from a pandas DataFrame or Series using matplotlib is:
- DataFrame.plot(kind='line'): Plots a line chart; 'line' is the default kind.
- The kind parameter can be 'line', 'bar', 'scatter', 'hist', etc.
- You can pass parameters like title, xlabel, ylabel, and color to customize the plot (xlabel and ylabel require pandas 1.1 or later).
python
df.plot(kind='line', title='Title', xlabel='X-axis', ylabel='Y-axis', color='blue')
Example
This example shows how to create a simple line plot from a pandas DataFrame using matplotlib through pandas' plot() method.
python
import pandas as pd
import matplotlib.pyplot as plt

# Create sample data
data = {'Year': [2018, 2019, 2020, 2021], 'Sales': [250, 300, 400, 350]}
df = pd.DataFrame(data)
df.set_index('Year', inplace=True)

# Plot sales over years
ax = df.plot(kind='line', title='Yearly Sales', xlabel='Year', ylabel='Sales', color='green')
plt.show()
Output
A line chart showing sales values on the Y-axis and years on the X-axis with a green line and the title 'Yearly Sales'.
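The example above stores the return value of plot() in ax. That object is a regular matplotlib Axes, so it can be customized further with ordinary matplotlib calls. The following is a minimal sketch under stated assumptions: the Agg backend is selected only so the code runs without a display, and the extra styling (grid, one tick per year) is illustrative rather than part of the original example.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so this runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Same sample data as the example above
df = pd.DataFrame(
    {"Year": [2018, 2019, 2020, 2021], "Sales": [250, 300, 400, 350]}
).set_index("Year")

# plot() returns a matplotlib Axes object
ax = df.plot(kind="line", title="Yearly Sales", color="green")

# Regular matplotlib Axes methods work on the returned object
ax.set_xlabel("Year")
ax.set_ylabel("Sales")
ax.set_xticks(df.index)  # one tick per year instead of fractional values
ax.grid(True)
```

In an interactive session, drop the matplotlib.use("Agg") line and finish with plt.show() to open the chart window.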
Common Pitfalls
- Not importing matplotlib.pyplot as plt raises a NameError when calling plt.show().
- For scatter plots, you must specify x and y columns explicitly.
- Calling plot() without setting an index uses the default row numbers (0, 1, 2, ...) on the X-axis, which may produce unexpected labels.
- For multiple plots, forgetting to call plt.show() can prevent the plots from displaying, especially when running a script outside a notebook.
python
import pandas as pd
import matplotlib.pyplot as plt

data = {'A': [1, 2, 3], 'B': [4, 5, 6]}
df = pd.DataFrame(data)

# Wrong: scatter plot without x and y
# df.plot(kind='scatter')  # This will raise an error

# Right: specify x and y
ax = df.plot(kind='scatter', x='A', y='B', color='red')
plt.show()
Output
A scatter plot with points colored red, plotting column 'A' on X-axis and 'B' on Y-axis.
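The index pitfall listed above can be illustrated with a short sketch. It reuses the Year/Sales sample data from the earlier example and assumes the Agg backend only so that it runs without a display. Passing x and y to plot() is shown as an alternative to set_index(); both are standard pandas options, but which one fits better depends on your data.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so this runs without a display
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame({"Year": [2018, 2019, 2020, 2021], "Sales": [250, 300, 400, 350]})

# No index set: the X-axis uses row numbers 0..3, and Year is drawn as a second line
ax_default = df.plot(kind="line")

# Option 1: move Year into the index so it becomes the X-axis
ax_indexed = df.set_index("Year").plot(kind="line")

# Option 2: keep the default index and name the columns explicitly
ax_xy = df.plot(kind="line", x="Year", y="Sales")
```

Each call creates its own figure here because no ax argument is passed.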
Quick Reference
| Plot Type | kind Parameter | Notes |
|---|---|---|
| Line Plot | line | Default plot type, good for trends |
| Bar Plot | bar | Vertical bars, good for categories |
| Horizontal Bar Plot | barh | Horizontal bars |
| Scatter Plot | scatter | Requires x and y columns |
| Histogram | hist | Distribution of data |
| Box Plot | box | Shows data spread and outliers |
| Area Plot | area | Stacked area chart |
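To see how a few of the kind values in the table behave, the sketch below draws the same data three ways on one figure. It is illustrative only: the sample data is reused from the earlier example, the choice of 'bar', 'hist', and 'box' is arbitrary, and the Agg backend is assumed so the code runs without a display. The ax argument tells pandas to draw onto an existing matplotlib Axes instead of creating a new figure.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so this runs without a display
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame(
    {"Sales": [250, 300, 400, 350]},
    index=pd.Index([2018, 2019, 2020, 2021], name="Year"),
)

# One row of three Axes, one per plot kind
fig, axes = plt.subplots(1, 3, figsize=(12, 3))
for ax, kind in zip(axes, ["bar", "hist", "box"]):
    # ax= draws onto the given Axes; title labels each panel with its kind
    df.plot(kind=kind, ax=ax, title=kind)

fig.tight_layout()
```

In an interactive session, drop the matplotlib.use("Agg") line and call plt.show() once at the end to display all three panels together.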
Key Takeaways
- Use the pandas DataFrame or Series plot() method to create matplotlib plots easily.
- Specify the kind parameter to choose the plot type like 'line', 'bar', or 'scatter'.
- Always import matplotlib.pyplot as plt and call plt.show() to display plots.
- For scatter plots, explicitly provide x and y columns to avoid errors.
- Setting the DataFrame index helps control the X-axis labels in plots.