concat() for stacking DataFrames in Data Analysis Python

Introduction

We use concat() to combine two or more tables (DataFrames) by stacking them, either one below the other or side by side. It helps us bring data from different sources into one big table.

You have sales data from January and February in separate tables and want to see all months together.
You collected survey results from different cities and want to analyze them as one dataset.
You want to add new rows of data to an existing table without changing the columns.
You want to stack data vertically or horizontally to prepare for analysis.
Syntax
Data Analysis Python
import pandas as pd

# Placeholder data so the snippet runs as written
data1 = {'A': [1, 2]}
data2 = {'A': [3, 4]}

# Create DataFrames
df1 = pd.DataFrame(data1)
df2 = pd.DataFrame(data2)

# Concatenate DataFrames vertically (stack rows)
result = pd.concat([df1, df2], axis=0, ignore_index=True)

# Concatenate DataFrames horizontally (stack columns)
result = pd.concat([df1, df2], axis=1)

axis=0 stacks rows (one below the other).

axis=1 stacks columns (side by side).

Examples
Shows what happens when one DataFrame is empty and when the columns differ. A column that is missing from one DataFrame is filled with NaN for that DataFrame's rows.
Data Analysis Python
import pandas as pd

df_empty = pd.DataFrame()
df_one = pd.DataFrame({'A': [1]})
df_two = pd.DataFrame({'A': [2], 'B': [3]})

# Concatenate empty and one-row DataFrame
result1 = pd.concat([df_empty, df_one], ignore_index=True)

# Concatenate one-row and two-column DataFrame
result2 = pd.concat([df_one, df_two], ignore_index=True)

print(result1)
print(result2)
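When the columns differ, as in result2 above, concat() keeps every column by default and fills the gaps with NaN. As a minimal sketch, the join parameter can be set to 'inner' to keep only the columns that every DataFrame shares. The names outer_result and inner_result are made up for this illustration.

```python
import pandas as pd

df_one = pd.DataFrame({'A': [1]})
df_two = pd.DataFrame({'A': [2], 'B': [3]})

# Default (join='outer'): keep all columns, fill missing values with NaN
outer_result = pd.concat([df_one, df_two], ignore_index=True)

# join='inner': keep only the columns present in every DataFrame
inner_result = pd.concat([df_one, df_two], join='inner', ignore_index=True)

print(outer_result)
print(inner_result)
```

Because NaN is a floating-point value, column B in the first result is stored as float rather than int.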
Stacks two DataFrames vertically, showing how rows from the second come after the first.
Data Analysis Python
import pandas as pd

df_start = pd.DataFrame({'A': [10, 20]})
df_end = pd.DataFrame({'A': [30, 40]})

# Stack df_end below df_start
result = pd.concat([df_start, df_end], ignore_index=True)

print(result)
Stacks two DataFrames horizontally, adding columns side by side.
Data Analysis Python
import pandas as pd

df_left = pd.DataFrame({'A': [1, 2]})
df_right = pd.DataFrame({'B': [3, 4]})

# Stack columns side by side
result = pd.concat([df_left, df_right], axis=1)

print(result)
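Horizontal stacking matches rows by their index labels, not by their position. The sketch below uses made-up index values to show what happens when the labels only partly overlap, and one possible way to line rows up by position instead.

```python
import pandas as pd

df_left = pd.DataFrame({'A': [1, 2]}, index=[0, 1])
df_right = pd.DataFrame({'B': [3, 4]}, index=[1, 2])

# Rows are matched by index label: only label 1 exists in both tables,
# so labels 0 and 2 each get a NaN in one column
result = pd.concat([df_left, df_right], axis=1)
print(result)

# Resetting the index first lines the rows up by position instead
aligned = pd.concat(
    [df_left.reset_index(drop=True), df_right.reset_index(drop=True)],
    axis=1,
)
print(aligned)
```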
Sample Program

This program creates two small tables for January and February sales. Then it stacks them vertically to see all sales together.

Data Analysis Python
import pandas as pd

# Create two DataFrames with sales data
sales_jan = pd.DataFrame({
    'Product': ['Apple', 'Banana'],
    'Sales': [100, 150]
})

sales_feb = pd.DataFrame({
    'Product': ['Apple', 'Banana'],
    'Sales': [120, 130]
})

print('Sales January:')
print(sales_jan)
print('\nSales February:')
print(sales_feb)

# Stack the two DataFrames vertically to combine months
all_sales = pd.concat([sales_jan, sales_feb], axis=0, ignore_index=True)

print('\nCombined Sales:')
print(all_sales)
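After stacking, the combined table no longer shows which month each row came from. One option, sketched here, is the keys parameter of concat(), which adds an outer index level naming the source of each row. The labels 'Jan' and 'Feb' and the name labeled_sales are chosen only for this example.

```python
import pandas as pd

sales_jan = pd.DataFrame({'Product': ['Apple', 'Banana'], 'Sales': [100, 150]})
sales_feb = pd.DataFrame({'Product': ['Apple', 'Banana'], 'Sales': [120, 130]})

# keys adds an outer index level that records the source of each row
labeled_sales = pd.concat([sales_jan, sales_feb], keys=['Jan', 'Feb'])
print(labeled_sales)

# Select only the February rows through the outer level
print(labeled_sales.loc['Feb'])
```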
Important Notes

Time complexity: Concatenation copies the data, so it runs in roughly O(n), where n is the total amount of data (rows or columns) being combined.

Space complexity: concat() returns a new DataFrame, so it needs extra memory to hold the combined data.
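Because every concat() call builds a new DataFrame, calling it again and again inside a loop copies the earlier rows on each pass. A common pattern, sketched below with made-up monthly data, is to collect the pieces in a list and concatenate once at the end.

```python
import pandas as pd

# Hypothetical monthly tables, built here only for illustration
monthly_frames = []
for month, sales in [('Jan', 100), ('Feb', 120), ('Mar', 90)]:
    monthly_frames.append(pd.DataFrame({'Month': [month], 'Sales': [sales]}))

# A single concat call copies each piece only once
all_months = pd.concat(monthly_frames, ignore_index=True)
print(all_months)
```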

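Common mistake: Forgetting ignore_index=True keeps each DataFrame's original row labels, so the combined table can contain duplicate index values (for example 0, 1, 0, 1), causing confusion.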
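The short sketch below shows the difference, using two small illustrative tables. The names kept and reset are made up for this example.

```python
import pandas as pd

df_start = pd.DataFrame({'A': [10, 20]})
df_end = pd.DataFrame({'A': [30, 40]})

# Without ignore_index, the original row labels are kept
kept = pd.concat([df_start, df_end])
print(kept.index.tolist())   # [0, 1, 0, 1]

# With ignore_index=True, rows are renumbered from 0
reset = pd.concat([df_start, df_end], ignore_index=True)
print(reset.index.tolist())  # [0, 1, 2, 3]

# With duplicate labels, .loc[0] returns two rows instead of one
print(kept.loc[0])
```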

Use concat() when you want to stack DataFrames. Use merge() or join() when you want to combine rows based on matching column or index values.
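To make the difference concrete, here is a small sketch that applies both to the January and February tables. The suffixes '_jan' and '_feb' are chosen only for this illustration.

```python
import pandas as pd

sales_jan = pd.DataFrame({'Product': ['Apple', 'Banana'], 'Sales': [100, 150]})
sales_feb = pd.DataFrame({'Product': ['Apple', 'Banana'], 'Sales': [120, 130]})

# concat: stack the rows, giving 4 rows and the same 2 columns
stacked = pd.concat([sales_jan, sales_feb], ignore_index=True)
print(stacked)

# merge: match rows on Product, giving 1 row per product and 1 Sales column per month
merged = pd.merge(sales_jan, sales_feb, on='Product', suffixes=('_jan', '_feb'))
print(merged)
```

The stacked table suits questions about all sales together, while the merged table puts the two months next to each other for each product.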

Summary

concat() stacks DataFrames vertically or horizontally.

Use axis=0 to add rows, axis=1 to add columns.

Remember to use ignore_index=True to reset row numbers after stacking rows.