How to Use str.split in pandas for Splitting Strings
Use pandas.Series.str.split() to split the strings in a DataFrame column by a delimiter. It returns a Series of lists by default, or a DataFrame with one column per piece when expand=True.
Syntax
The basic syntax of str.split() in pandas is:
```python
Series.str.split(pat=None, n=-1, expand=False)
```
Where:
- pat is the delimiter string to split on (the default, None, splits on whitespace).
- n is the maximum number of splits (-1 means no limit).
- expand controls the return type: if True, a DataFrame with each piece in its own column; if False, a Series of lists.
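To make the three parameters concrete, here is a minimal sketch on made-up data. The Series contents and variable names are illustrative assumptions, not taken from a real dataset. It contrasts the default whitespace split, a split limited with n=1, and the same split with expand=True.

```python
import pandas as pd

# Hypothetical sample data, chosen only for illustration
dates = pd.Series(["2024-01-15", "2023-12-31"])
words = pd.Series(["a  b c"])

# pat=None (the default) splits on runs of whitespace
by_whitespace = words.str.split()
# first element: ['a', 'b', 'c']

# n=1 performs at most one split; the rest of the string stays together
one_split = dates.str.split("-", n=1)
# first element: ['2024', '01-15']

# expand=True returns a DataFrame with one column per piece
expanded = dates.str.split("-", n=1, expand=True)
print(expanded.shape)  # (2, 2)
```

Passing n and expand by keyword, as above, keeps the call readable and works across pandas versions.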
Example
This example shows how to split a column of full names into first and last names using str.split() with expand=True to get separate columns.
```python
import pandas as pd

data = {'Name': ['Alice Smith', 'Bob Johnson', 'Charlie Lee']}
df = pd.DataFrame(data)

# Split 'Name' column by space into two columns
split_names = df['Name'].str.split(' ', n=1, expand=True)
split_names.columns = ['First Name', 'Last Name']

# Join back to original DataFrame
result = df.join(split_names)
print(result)
```
Output
```
          Name First Name Last Name
0  Alice Smith      Alice     Smith
1  Bob Johnson        Bob   Johnson
2  Charlie Lee    Charlie       Lee
```
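Real name columns are rarely this tidy, so it may help to see how the same call behaves on irregular input. The sketch below uses hypothetical names (one with no space, one with three words) and assumes the same n=1, expand=True settings as the example above.

```python
import pandas as pd

# Hypothetical names: a single-word name and a three-word name
names = pd.Series(["Alice Smith", "Cher", "Mary Ann Lee"], name="Name")

# n=1 guarantees at most two pieces, so two column names always fit
parts = names.str.split(" ", n=1, expand=True)
parts.columns = ["First Name", "Last Name"]

# "Cher" has no space, so its Last Name is a missing value
print(parts["Last Name"].isna().tolist())  # [False, True, False]

# "Mary Ann Lee" is split only at the first space
print(parts.loc[2, "Last Name"])  # Ann Lee
```

Rows with fewer pieces than columns are padded with missing values, which can then be handled with the usual tools such as fillna().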
Common Pitfalls
Common mistakes when using str.split() include:
- Not setting expand=True when you want separate columns, resulting in a Series of lists instead.
- Using the wrong delimiter pat, which causes unexpected splits or no splits at all.
- Forgetting that n limits the number of splits. Nothing is discarded when n is too low, but the remainder of the string stays unsplit in the last piece.
Example of wrong and right usage:
```python
import pandas as pd

data = {'Code': ['A-1-2', 'B-3-4', 'C-5-6']}
df = pd.DataFrame(data)

# Wrong: returns lists, not separate columns
wrong = df['Code'].str.split('-')
print('Wrong output:')
print(wrong)

# Right: expand=True splits into columns
right = df['Code'].str.split('-', expand=True)
print('\nRight output:')
print(right)
```
Output
```
Wrong output:
0    [A, 1, 2]
1    [B, 3, 4]
2    [C, 5, 6]
Name: Code, dtype: object

Right output:
   0  1  2
0  A  1  2
1  B  3  4
2  C  5  6
```
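The delimiter and n pitfalls can be sketched in the same way. This is a minimal illustration on the same kind of made-up codes; the variable names are assumptions introduced here.

```python
import pandas as pd

# Hypothetical codes, similar to the example above
codes = pd.Series(["A-1-2", "B-3-4"])

# Wrong delimiter: "_" never occurs, so each string comes back
# as a one-item list and nothing is actually split
no_match = codes.str.split("_")
# first element: ['A-1-2']

# Low n: only the first "-" is used, the rest stays in the last piece
limited = codes.str.split("-", n=1, expand=True)
print(limited[1].tolist())  # ['1-2', '3-4']
```

A quick way to catch either problem is to check the number of resulting columns (or the list lengths) against what you expect before renaming or joining.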
Quick Reference
| Parameter | Description | Default |
|---|---|---|
| pat | Delimiter string to split on | None (whitespace) |
| n | Maximum number of splits | -1 (no limit) |
| expand | Return DataFrame if True, else Series of lists | False |
Key Takeaways
- Use pandas Series.str.split() to split strings in DataFrame columns by a delimiter.
- Set expand=True to get separate columns instead of lists.
- Specify the delimiter with pat to control how strings are split.
- Use n to limit the number of splits if needed.
- Common mistake: forgetting expand=True when separate columns are desired.