How to Concatenate Columns in pandas: Simple Guide
To concatenate columns in pandas, you can use the + operator for string columns or the .str.cat() method for more control. These methods combine values from two or more columns into a single column.
Syntax
Here are two common ways to concatenate columns in pandas:
- df['new_col'] = df['col1'] + df['col2']: Adds string columns directly.
- df['new_col'] = df['col1'].str.cat(df['col2'], sep=' '): Concatenates with a separator and can fill in missing values through the na_rep parameter.
python
# Option 1: the + operator
df['new_col'] = df['col1'] + df['col2']
# Option 2: .str.cat() with a separator
df['new_col'] = df['col1'].str.cat(df['col2'], sep=' ')
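The + operator can also insert a separator, because a plain string literal is applied to every row. The following is a minimal sketch with made-up column values (col1 and col2 here are illustrative data, not taken from a real dataset), showing that both forms give the same result:

```python
import pandas as pd

# Illustrative data only.
df = pd.DataFrame({"col1": ["a", "b"], "col2": ["x", "y"]})

# A string literal is broadcast to every row, so it works as a separator.
df["plus_sep"] = df["col1"] + "-" + df["col2"]

# The same result using .str.cat() with an explicit separator.
df["cat_sep"] = df["col1"].str.cat(df["col2"], sep="-")

print(df["plus_sep"].tolist())  # ['a-x', 'b-y']
print(df["cat_sep"].tolist())   # ['a-x', 'b-y']
```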
Example
This example shows how to combine two text columns into one with a space between the values.
python
import pandas as pd

data = {'first_name': ['John', 'Jane', 'Alice'],
        'last_name': ['Doe', 'Smith', 'Johnson']}
df = pd.DataFrame(data)

# Join first and last name with a space in between
df['full_name'] = df['first_name'].str.cat(df['last_name'], sep=' ')
print(df)
Output
  first_name last_name      full_name
0       John       Doe       John Doe
1       Jane     Smith     Jane Smith
2      Alice   Johnson  Alice Johnson
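The same approach extends to more than two columns. As a sketch, assuming a hypothetical middle_name column that is not part of the example above, .str.cat() accepts a list of Series, and a row-wise join with agg is a common alternative:

```python
import pandas as pd

df = pd.DataFrame({
    "first_name": ["John", "Jane"],
    "middle_name": ["Q.", "R."],  # hypothetical extra column for illustration
    "last_name": ["Doe", "Smith"],
})

# Pass a list of Series to .str.cat() to join several columns in one call.
df["full_name"] = df["first_name"].str.cat(
    [df["middle_name"], df["last_name"]], sep=" "
)

# Alternative: join the string values of each row with the separator.
name_cols = ["first_name", "middle_name", "last_name"]
df["full_name_agg"] = df[name_cols].agg(" ".join, axis=1)

print(df["full_name"].tolist())  # ['John Q. Doe', 'Jane R. Smith']
```

The agg variant only works when every selected column already holds strings with no missing values, since str.join rejects anything else.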
Common Pitfalls
Common mistakes include:
- Adding a string column to a non-string column (for example, integers) with +, which raises a TypeError.
- Not handling missing values, which can result in NaN in the output.
Use .astype(str) to convert non-string columns before concatenation and .str.cat() with na_rep='' to handle missing data.
python
import pandas as pd

data = {'A': ['foo', None, 'bar'], 'B': [1, 2, 3]}
df = pd.DataFrame(data)

# Wrong: adding string and int columns directly causes an error
# df['concat'] = df['A'] + df['B']  # This will raise TypeError

# Right: fill missing values and convert int to string first
s1 = df['A'].fillna('')
s2 = df['B'].astype(str)
df['concat'] = s1.str.cat(s2, na_rep='')
print(df)
Output
     A  B concat
0   foo  1   foo1
1   NaN  2      2
2   bar  3   bar3
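To see why the missing-value handling matters, the sketch below (using illustrative data with two string columns) compares the + operator, .str.cat() with its default settings, and .str.cat() with na_rep. Without na_rep, a missing value in either column makes the whole result missing for that row:

```python
import pandas as pd

# Illustrative data: column A has one missing value.
df = pd.DataFrame({"A": ["foo", None, "bar"], "B": ["x", "y", "z"]})

# With +, a missing value on either side makes the result missing.
plus_result = df["A"] + df["B"]

# .str.cat() behaves the same way unless na_rep is given.
cat_default = df["A"].str.cat(df["B"])

# na_rep substitutes a placeholder for missing values before joining.
cat_filled = df["A"].str.cat(df["B"], na_rep="")

print(plus_result.isna().tolist())  # [False, True, False]
print(cat_default.isna().tolist())  # [False, True, False]
print(cat_filled.tolist())          # ['foox', 'y', 'barz']
```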
Quick Reference
| Method | Description | Example |
|---|---|---|
| + operator | Concatenate string columns directly | df['new'] = df['col1'] + df['col2'] |
| .str.cat() | Concatenate with separator and handle missing values | df['new'] = df['col1'].str.cat(df['col2'], sep=' ') |
| .astype(str) | Convert non-string columns before concatenation | df['col2'].astype(str) |
| fillna('') | Replace missing values to avoid NaN in output | df['col1'].fillna('') |
Key Takeaways
- Use the + operator or .str.cat() to concatenate string columns in pandas.
- Convert non-string columns to string with .astype(str) before concatenation.
- Handle missing values with fillna('') or the na_rep parameter to avoid NaN results.
- .str.cat() lets you add a separator between the concatenated columns through sep.
- Adding a string column to a non-string column raises a TypeError; convert types first.
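These points can be combined into one small helper. The function below is a hypothetical sketch, not a pandas API: the name concat_columns, its parameters, and the sample data are all assumptions made for illustration.

```python
import pandas as pd


def concat_columns(df, columns, sep=" ", na_rep=""):
    """Hypothetical helper: join the given columns row by row into strings."""
    # Convert each column to strings, then put na_rep where the original
    # value was missing, so missing values never leak into the result.
    parts = [df[c].astype(str).where(df[c].notna(), na_rep) for c in columns]
    first, rest = parts[0], parts[1:]
    if not rest:
        return first
    return first.str.cat(rest, sep=sep)


# Illustrative data: a string column with a missing value and an int column.
df = pd.DataFrame({"city": ["Paris", None], "zip": [75001, 10115]})
result = concat_columns(df, ["city", "zip"], sep="-")
print(result.tolist())  # ['Paris-75001', '-10115']
```

Note that the separator is still emitted for rows where a value was missing; whether that is desirable depends on the use case.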