How to Use sum() in pandas: Syntax, Examples, and Tips
Use the sum() function in pandas to add values down columns or across rows of a DataFrame, or over all values of a Series. For a DataFrame, axis=0 sums each column and axis=1 sums each row. It is intended for numeric data and skips missing values (NaN) by default.
Syntax
The basic syntax of sum() in pandas is:
DataFrame.sum(axis=None, skipna=True, numeric_only=None)
Series.sum(skipna=True, numeric_only=None)
Explanation:
- axis: Choose 0 to sum each column (this is also what a DataFrame does when axis is omitted) or 1 to sum each row.
- skipna: If True, ignore missing values (NaN).
- numeric_only: If True, sum only numeric data.
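As a minimal sketch of how these parameters behave, the snippet below uses a small made-up DataFrame. The column names price, qty, and label are illustrative assumptions and do not come from the examples in this article.

```python
import pandas as pd

# Hypothetical data: one float column with a missing value,
# one integer column, and one text column
df = pd.DataFrame({
    'price': [10.0, 20.0, None],
    'qty': [1, 2, 3],
    'label': ['a', 'b', 'c'],
})

# skipna=True (the default) ignores the NaN in 'price'
default_sum = df['price'].sum()

# skipna=False lets the NaN propagate, so the result is NaN
strict_sum = df['price'].sum(skipna=False)

# numeric_only=True leaves the text column out of the result
numeric_sum = df.sum(numeric_only=True)

print(default_sum)   # 30.0
print(strict_sum)    # nan
print(numeric_sum)   # contains only 'price' and 'qty'
```

Passing skipna=False can be useful when a missing value should make the total visibly invalid instead of being silently skipped.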
Example
This example shows how to sum columns and rows in a pandas DataFrame.
python
import pandas as pd

data = {
    'A': [1, 2, 3],
    'B': [4, 5, None],
    'C': [7, 8, 9]
}
df = pd.DataFrame(data)

# Sum each column (axis=0)
column_sum = df.sum(axis=0)

# Sum each row (axis=1)
row_sum = df.sum(axis=1)

print('Sum of each column:')
print(column_sum)
print('\nSum of each row:')
print(row_sum)
Output
Sum of each column:
A 6.0
B 9.0
C 24.0
dtype: float64
Sum of each row:
0 12.0
1 15.0
2 12.0
dtype: float64
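The same method also works on a Series or on a selection of columns. The following sketch reuses the A, B, and C columns from the example; the variable names are illustrative only.

```python
import pandas as pd

# Summing a Series returns a single value; the missing entry is skipped
s = pd.Series([1, 2, None])
series_total = s.sum()

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, None], 'C': [7, 8, 9]})

# Selecting one column gives a Series, so sum() returns one number
single_column_total = df['B'].sum()

# Selecting several columns first restricts a row-wise sum to those columns
partial_row_sum = df[['A', 'C']].sum(axis=1)

print(series_total)         # 3.0
print(single_column_total)  # 9.0
print(partial_row_sum)      # row values 8, 10, 12
```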
Common Pitfalls
Common mistakes when using sum() in pandas include:
- Not specifying axis and getting unexpected results.
- Summing columns with non-numeric data, causing errors or unexpected output.
- Not handling NaN values properly, which can affect sums.
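To illustrate the non-numeric pitfall, here is a small sketch with numbers stored as text. The Series name codes is hypothetical, and pd.to_numeric is one possible way to convert the values before summing, not the only one.

```python
import pandas as pd

# Digits stored as strings in an object-dtype Series
codes = pd.Series(['1', '2', '3'], dtype=object)

# sum() joins the strings instead of adding numbers
concatenated = codes.sum()

# Converting to a numeric dtype first gives the arithmetic total
numeric_total = pd.to_numeric(codes).sum()

print(concatenated)   # 123 (a string)
print(numeric_total)  # 6
```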
Example of a common mistake and fix:
python
import pandas as pd

df = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})

# Pitfall: without axis, df.sum() does not return one grand total.
# It sums each column (the same result as axis=0) and returns a Series.
total_sum_wrong = df.sum()

# Better: state the axis explicitly, axis=0 to sum columns or axis=1 to sum rows
sum_columns = df.sum(axis=0)
sum_rows = df.sum(axis=1)

print('Sum without axis (wrong):')
print(total_sum_wrong)
print('\nSum of columns (correct):')
print(sum_columns)
print('\nSum of rows (correct):')
print(sum_rows)
Output
Sum without axis (wrong):
A 3
B 7
dtype: int64
Sum of columns (correct):
A 3
B 7
dtype: int64
Sum of rows (correct):
0 4
1 6
dtype: int64
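If a single grand total over the whole DataFrame is what you want, one possible approach is to sum twice, or to sum the underlying NumPy array. This is a sketch using the same two-column DataFrame; the variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})

# First sum() gives one value per column, the second adds those values
grand_total = df.sum().sum()

# Alternative: sum the underlying NumPy array directly.
# Note that NumPy's sum does not skip NaN the way pandas does.
grand_total_np = df.to_numpy().sum()

print(grand_total)     # 10
print(grand_total_np)  # 10
```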
Quick Reference
Summary tips for using sum() in pandas:
- Use axis=0 to sum down columns.
- Use axis=1 to sum across rows.
- Missing values (NaN) are ignored by default.
- Works best with numeric data; use numeric_only=True if needed.
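One consequence of skipping NaN by default is that a column containing only missing values sums to 0. The sketch below shows this along with min_count, a sum() parameter not covered above, which requires a minimum number of valid values before a sum is returned. The Series name empty_readings is hypothetical.

```python
import pandas as pd

# A Series where every value is missing
empty_readings = pd.Series([float('nan'), float('nan')])

# With the defaults, all NaN values are skipped and the sum is 0.0
default_total = empty_readings.sum()

# min_count=1 requires at least one valid value, otherwise the result is NaN
guarded_total = empty_readings.sum(min_count=1)

print(default_total)  # 0.0
print(guarded_total)  # nan
```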
Key Takeaways
- Use pandas sum() to add values across rows or columns in DataFrames or Series.
- Specify axis=0 to sum columns and axis=1 to sum rows.
- Missing values are ignored by default, so sums skip NaN automatically.
- Ensure data is numeric or use numeric_only=True to avoid errors.
- Always check the axis parameter to avoid unexpected sum results.