How to Select Columns by dtype in pandas: Simple Guide

Use the DataFrame.select_dtypes() method to select columns by their data type in pandas. Pass the desired data types as a list to the include or exclude parameter to filter columns accordingly.

Syntax

The main method to select columns by data type in pandas is DataFrame.select_dtypes(). It has two key parameters:

  • include: a single dtype or list of dtypes to keep.
  • exclude: a single dtype or list of dtypes to drop.

You can specify data types as strings such as 'number' (all numeric types), 'int', 'float', 'bool', or 'object' (the usual dtype for strings), or as NumPy types such as np.number.

python
DataFrame.select_dtypes(include=None, exclude=None)
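
The signature above can be exercised in a few ways. The following is a minimal sketch, using a made-up DataFrame (the column names are illustrative only), that passes a single string, a NumPy type, and the exclude parameter:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, used only for illustration
df = pd.DataFrame({
    "count": [1, 2, 3],               # int64
    "price": [9.5, 12.0, 7.25],       # float64
    "in_stock": [True, False, True],  # bool
})

# A single dtype can be passed without wrapping it in a list
floats = df.select_dtypes(include="float")

# NumPy types are accepted as well as strings
ints = df.select_dtypes(include=[np.integer])

# exclude drops the matching columns and keeps everything else
non_bool = df.select_dtypes(exclude=["bool"])

print(list(floats.columns))    # ['price']
print(list(ints.columns))      # ['count']
print(list(non_bool.columns))  # ['count', 'price']
```

The original column order is preserved in each result.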

Example

This example shows how to select only numeric columns and then only object (string) columns from a DataFrame.

python
import pandas as pd

data = {
    'age': [25, 32, 40],
    'name': ['Alice', 'Bob', 'Charlie'],
    'height': [5.5, 6.0, 5.8],
    'member': [True, False, True]
}
df = pd.DataFrame(data)

# Select numeric columns (int and float)
numeric_cols = df.select_dtypes(include=['number'])

# Select only object (string) columns
string_cols = df.select_dtypes(include=['object'])

print('Numeric columns:')
print(numeric_cols)
print('\nString columns:')
print(string_cols)
Output
Numeric columns:
   age  height
0   25     5.5
1   32     6.0
2   40     5.8

String columns:
      name
0    Alice
1      Bob
2  Charlie
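
Often only the column names are needed rather than a new DataFrame. As a short sketch built on the same sample data, the names can be read from the result's .columns attribute and then used to work on just those columns. Note that the 'member' column is absent from the numeric output above: bool is not covered by 'number', so it has to be requested explicitly.

```python
import pandas as pd

df = pd.DataFrame({
    "age": [25, 32, 40],
    "name": ["Alice", "Bob", "Charlie"],
    "height": [5.5, 6.0, 5.8],
    "member": [True, False, True],
})

# Names of the numeric columns as a plain Python list
numeric_names = df.select_dtypes(include="number").columns.tolist()
print(numeric_names)  # ['age', 'height']

# bool is not part of 'number', so ask for it by name
bool_names = df.select_dtypes(include="bool").columns.tolist()
print(bool_names)  # ['member']

# Use the names to operate on just those columns, e.g. column means
numeric_means = df[numeric_names].mean()
print(numeric_means)
```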

Common Pitfalls

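One common mistake is to pass a dtype string that pandas does not recognize. A typo such as 'strng' does not return an empty DataFrame; it raises a TypeError. The include and exclude parameters also have rules: at least one of them must be given, and the same dtype cannot appear in both, otherwise pandas raises a ValueError.

Remember that object dtype usually means strings, but can also hold mixed types. Check df.dtypes to confirm which dtype your text columns actually have before selecting on it.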
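
To see that an object column is not necessarily a string column, the sketch below builds a small hypothetical DataFrame with a deliberately mixed column and counts the Python types stored in it. The dtype is forced to object here so the result does not depend on how a given pandas version infers text columns.

```python
import pandas as pd

# Hypothetical data; dtype=object is set explicitly for the mixed column
df = pd.DataFrame({
    "id": [1, 2, 3],
    "mixed": pd.Series([1, "two", 3.0], dtype=object),
})

print(df.dtypes)

obj_cols = df.select_dtypes(include="object")

# Count the Python types actually stored in each object column
for col in obj_cols.columns:
    type_counts = obj_cols[col].map(type).value_counts().to_dict()
    print(col, type_counts)
```

A check like this can help decide whether an object column is safe to treat as text.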

python
import pandas as pd

data = {'a': [1, 2], 'b': ['x', 'y']}
df = pd.DataFrame(data)

# Wrong: a typo in the dtype string raises TypeError
try:
    wrong = df.select_dtypes(include=['strng'])
except TypeError as err:
    print('Wrong selection raised', type(err).__name__)

# Right: use 'object' for strings
right = df.select_dtypes(include=['object'])

print('\nRight selection:')
print(right)
Output
Wrong selection raised TypeError

Right selection:
   b
0  x
1  y
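
Combining include and exclude has its own failure modes. The sketch below, on a made-up two-column DataFrame, shows the two ValueError cases (the same dtype in both parameters, and neither parameter given) followed by a valid combination. The `errors` list is just a helper for this illustration.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2], "b": [0.5, 1.5]})
errors = []

# The same dtype in both parameters is rejected
try:
    df.select_dtypes(include=["number"], exclude=["number"])
except ValueError as err:
    errors.append(str(err))
    print("overlap:", err)

# Calling with neither parameter set is rejected too
try:
    df.select_dtypes()
except ValueError as err:
    errors.append(str(err))
    print("empty:", err)

# Valid combination: numeric columns, but not floats
ints_only = df.select_dtypes(include=["number"], exclude=["float"])
print(list(ints_only.columns))  # ['a']
```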

Quick Reference

| Parameter | Description | Example Values |
| --- | --- | --- |
| include | Data types to keep | ['number'], ['object'], ['float'], ['bool'] |
| exclude | Data types to drop | ['number'], ['object'] |
| Data types | Common dtype strings | 'int', 'float', 'bool', 'object', 'category' |
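
The dtype strings in the table go beyond numbers and text. As a brief sketch with hypothetical columns, 'category' selects categorical columns and 'datetime' selects datetime columns:

```python
import pandas as pd

# Hypothetical data with a categorical and a datetime column
df = pd.DataFrame({
    "grade": pd.Categorical(["a", "b", "a"]),
    "joined": pd.to_datetime(["2021-01-05", "2022-03-10", "2023-07-21"]),
    "score": [88.0, 92.5, 79.0],
})

cats = df.select_dtypes(include=["category"])
dates = df.select_dtypes(include=["datetime"])

print(list(cats.columns))   # ['grade']
print(list(dates.columns))  # ['joined']
```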

Key Takeaways

Use DataFrame.select_dtypes(include=...) to select columns by data type.
Pass data types as strings like 'number', 'object', or numpy dtypes.
Avoid typos in dtype names; an unrecognized dtype string raises a TypeError.
Combine include and exclude only with different dtypes; naming the same dtype in both raises a ValueError.
Remember 'object' dtype often means string columns in pandas.