
How to Use str.split in pandas for Splitting Strings

Use pandas.Series.str.split() to split the strings in a DataFrame column on a delimiter. By default it returns a Series of lists; with expand=True it returns a DataFrame with one column per piece.

Syntax

The basic syntax of str.split() in pandas is:

  • Series.str.split(pat=None, n=-1, expand=False)

Where:

  • pat is the delimiter string to split on (default splits on whitespace).
  • n is the maximum number of splits (-1 means no limit).
  • expand controls the return type: if True, the result is a DataFrame with each piece in its own column; if False, it is a Series of lists.
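The three parameters described above can be tried side by side. The following is a minimal sketch on a made-up Series (the sample strings and variable names are illustrative assumptions, not part of the original example):

```python
import pandas as pd

# Hypothetical sample data, chosen so the rows have different numbers of pieces.
s = pd.Series(["a b c", "d e", "f"])

# Default pat=None splits on whitespace and returns a Series of lists.
lists = s.str.split()

# n=1 performs at most one split, so the remainder stays in the last piece.
limited = s.str.split(" ", n=1)

# expand=True returns a DataFrame; shorter rows are padded with missing values.
wide = s.str.split(" ", expand=True)

print(lists.tolist())    # [['a', 'b', 'c'], ['d', 'e'], ['f']]
print(limited.tolist())  # [['a', 'b c'], ['d', 'e'], ['f']]
print(wide.shape)        # (3, 3)
```

Passing n and expand by keyword, as here, keeps the calls readable and avoids depending on argument order.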

Example

This example shows how to split a column of full names into first and last names using str.split() with expand=True to get separate columns.

python
import pandas as pd

data = {'Name': ['Alice Smith', 'Bob Johnson', 'Charlie Lee']}
df = pd.DataFrame(data)

# Split 'Name' column by space into two columns
split_names = df['Name'].str.split(' ', n=1, expand=True)
split_names.columns = ['First Name', 'Last Name']

# Join back to original DataFrame
result = df.join(split_names)
print(result)
Output
          Name First Name Last Name
0  Alice Smith      Alice     Smith
1  Bob Johnson        Bob   Johnson
2  Charlie Lee    Charlie       Lee
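As a variation on the example above, the expanded columns can also be assigned straight onto the original DataFrame instead of being renamed and joined. This is a sketch under the assumption that every name splits into at most two pieces; the single-word name "Cher" is a hypothetical row added to show what happens when a piece is missing:

```python
import pandas as pd

# Hypothetical data: the last row has no space, so it yields only one piece.
df = pd.DataFrame({"Name": ["Alice Smith", "Bob Johnson", "Cher"]})

# n=1 guarantees at most two pieces, matching the two target columns.
df[["First Name", "Last Name"]] = df["Name"].str.split(" ", n=1, expand=True)

print(df)
```

Rows with fewer pieces than columns get a missing value in the extra column, so it is worth checking for them (for example with isna()) before using the result.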

Common Pitfalls

Common mistakes when using str.split() include:

  • Not setting expand=True when you want separate columns, resulting in a Series of lists instead.
  • Using the wrong delimiter pat, which causes unexpected splits or no splits.
  • Forgetting that n limits the number of splits: if it is set too low, nothing is dropped, but the unsplit remainder stays together in the last piece.

Example of wrong and right usage:

python
import pandas as pd

data = {'Code': ['A-1-2', 'B-3-4', 'C-5-6']}
df = pd.DataFrame(data)

# Wrong: returns lists, not separate columns
wrong = df['Code'].str.split('-')
print('Wrong output:')
print(wrong)

# Right: expand=True splits into columns
right = df['Code'].str.split('-', expand=True)
print('\nRight output:')
print(right)
Output
Wrong output:
0    [A, 1, 2]
1    [B, 3, 4]
2    [C, 5, 6]
Name: Code, dtype: object

Right output:
   0  1  2
0  A  1  2
1  B  3  4
2  C  5  6
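The other two pitfalls, a low n and a delimiter that does not match, can be checked the same way. The sketch below reuses the Code values from the example; the variable names and the "_" delimiter are illustrative assumptions:

```python
import pandas as pd

codes = pd.Series(["A-1-2", "B-3-4", "C-5-6"], name="Code")

# n=1 stops after the first split; the rest stays together in the last piece.
limited = codes.str.split("-", n=1, expand=True)
print(limited.iloc[0].tolist())  # ['A', '1-2']

# A delimiter that never occurs produces no split: one column with the whole string.
unsplit = codes.str.split("_", expand=True)
print(unsplit.shape)  # (3, 1)

# Without expand=True, .str[i] still picks one element out of each list.
letters = codes.str.split("-").str[0]
print(letters.tolist())  # ['A', 'B', 'C']
```

Checking the shape of the expanded result is a quick way to notice a wrong delimiter before the data is used further.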

Quick Reference

Parameter | Description | Default
pat | Delimiter string to split on | None (whitespace)
n | Maximum number of splits | -1 (no limit)
expand | Return DataFrame if True, else Series of lists | False

Key Takeaways

Use pandas Series.str.split() to split strings in DataFrame columns by a delimiter.
Set expand=True to get separate columns instead of lists.
Specify the delimiter with pat to control how strings are split.
Use n to limit the number of splits if needed.
Common mistake: forgetting expand=True when separate columns are desired.