How to Use the filter Method in pandas for Data Selection
The filter method in pandas selects rows or columns by their labels, either by exact name or by pattern. It keeps only the rows or columns whose labels match and does not look at the values stored in the DataFrame.
Syntax
The filter method has these main parameters:
- items: List of labels to keep.
- like: String pattern to match labels.
- regex: Regular expression to match labels.
- axis: Choose 0 for rows or 1 for columns (default is columns).
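Use exactly one of items, like, or regex to filter labels. The three are mutually exclusive: passing more than one, or none of them, raises a TypeError.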
```python
DataFrame.filter(items=None, like=None, regex=None, axis=None)
```
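As a rough illustration of how the three selectors differ, the sketch below applies each one to a small made-up DataFrame. The column names (price_usd, price_eur, qty) are invented for this example. Note that regex is matched with re.search, so the pattern can match anywhere in a label unless it is anchored.

```python
import pandas as pd

# Hypothetical data, used only to illustrate the three selectors
df = pd.DataFrame(
    {"price_usd": [10, 20], "price_eur": [9, 18], "qty": [1, 2]},
    index=["r1", "r2"],
)

# items: exact label names
by_items = df.filter(items=["price_usd", "qty"])

# like: any label containing the substring "price"
by_like = df.filter(like="price")

# regex: any label ending in "_eur" (the $ anchors the match at the end)
by_regex = df.filter(regex=r"_eur$")

print(list(by_items.columns))
print(list(by_like.columns))   # ['price_usd', 'price_eur']
print(list(by_regex.columns))  # ['price_eur']
```

All three calls leave axis unset, so they operate on column labels.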
Example
This example shows how to filter columns by name and rows by index labels using filter.
```python
import pandas as pd

data = {
    'apple': [1, 2, 3],
    'banana': [4, 5, 6],
    'cherry': [7, 8, 9],
    'date': [10, 11, 12],
}
index_labels = ['a', 'b', 'c']
df = pd.DataFrame(data, index=index_labels)

# Keep columns whose name contains 'a' (apple, banana, and date)
filtered_columns = df.filter(like='a', axis=1)

# Keep rows whose index label is 'a' or 'c'
filtered_rows = df.filter(items=['a', 'c'], axis=0)

print('Filtered columns (contain "a"):')
print(filtered_columns)
print('\nFiltered rows (labels a and c):')
print(filtered_rows)
```
Output
```
Filtered columns (contain "a"):
   apple  banana  date
a      1       4    10
b      2       5    11
c      3       6    12

Filtered rows (labels a and c):
   apple  banana  cherry  date
a      1       4       7    10
c      3       6       9    12
```
Common Pitfalls
Common mistakes when using filter include:
- Not setting axis correctly (the default is columns).
- Passing more than one of items, like, and regex at once, which raises a TypeError because the parameters are mutually exclusive.
- Trying to filter by values instead of labels (filter works on labels only).
Always check if you want to filter rows (axis=0) or columns (axis=1).
```python
import pandas as pd

data = {'A': [1, 2], 'B': [3, 4]}
df = pd.DataFrame(data, index=['x', 'y'])

# Wrong: 'x' is a row label, but axis is not set, so filter looks for
# a column named 'x' and returns a DataFrame with no columns
wrong_filter = df.filter(items=['x'])

# Right: specify axis=0 to filter rows
right_filter = df.filter(items=['x'], axis=0)

print('Wrong filter result:')
print(wrong_filter)
print('\nRight filter result:')
print(right_filter)
```
Output
```
Wrong filter result:
Empty DataFrame
Columns: []
Index: [x, y]

Right filter result:
   A  B
x  1  3
```
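The other two pitfalls can be sketched in the same way. This is a minimal illustration rather than a complete treatment: it reuses the small A/B DataFrame from the example, and the threshold A > 1 is an arbitrary choice made for this sketch. It shows that combining selectors is rejected with a TypeError, and that selecting by values calls for boolean indexing instead of filter.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2], "B": [3, 4]}, index=["x", "y"])

# Combining selectors: pandas rejects this with a TypeError
try:
    df.filter(items=["A"], like="B")
    combined_error = None
except TypeError as exc:
    combined_error = str(exc)

# Selecting by values is done with boolean indexing, not filter
rows_with_large_a = df[df["A"] > 1]

print("Error from combined selectors:", combined_error)
print(rows_with_large_a)
```

Here rows_with_large_a keeps only row y, because that is the only row where column A is greater than 1.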
Quick Reference
| Parameter | Description | Default |
|---|---|---|
| items | List of labels to keep | None |
| like | Substring to match labels | None |
| regex | Regular expression to match labels | None |
| axis | 0 for rows, 1 for columns | None (columns for a DataFrame) |
Key Takeaways
- Use pandas filter to select rows or columns by label names or patterns.
- Set axis=0 to filter rows and axis=1 to filter columns (default).
- Pass exactly one of items, like, or regex; combining them raises a TypeError.
- filter works on labels, not on the data values inside the DataFrame.
- Check your filter results to catch empty DataFrames caused by a wrong axis or wrong labels.