
How to Select Rows by Condition in pandas DataFrame

Use df[condition] to select rows in a pandas DataFrame, where condition is a boolean Series holding one True/False value per row. For example, df[df['column'] > value] returns the rows whose value in that column is greater than value.

Syntax

The basic syntax to select rows by condition in pandas is:

  • df[condition]: Returns rows where the condition is True.
  • condition is a boolean expression involving DataFrame columns, e.g., df['age'] > 30.
python
df[condition]
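
To make the syntax concrete, the condition can be built as a separate variable and inspected before it is used to filter. This is a minimal sketch, assuming a small hypothetical DataFrame with a single age column; the names mask and selected are illustrative only.

```python
import pandas as pd

# Hypothetical sample data for illustration
df = pd.DataFrame({'age': [25, 35, 30, 40]})

# The condition evaluates to a boolean Series, one value per row
mask = df['age'] > 30
print(mask.tolist())  # [False, True, False, True]

# Passing the mask to df[...] keeps only the rows marked True
selected = df[mask]
print(selected['age'].tolist())  # [35, 40]
```

Naming the mask is optional, but it can make longer filters easier to read and debug.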

Example

This example shows how to select rows where the 'age' column is greater than 30.

python
import pandas as pd

data = {'name': ['Alice', 'Bob', 'Charlie', 'David'],
        'age': [25, 35, 30, 40]}
df = pd.DataFrame(data)

# Select rows where age is greater than 30
result = df[df['age'] > 30]
print(result)
Output
    name  age
1    Bob   35
3  David   40
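
The same boolean condition can also be passed to df.loc, which additionally accepts a column selection in the same step. This goes slightly beyond the example above and is offered only as a sketch on the same data; the variable names rows and names are illustrative.

```python
import pandas as pd

data = {'name': ['Alice', 'Bob', 'Charlie', 'David'],
        'age': [25, 35, 30, 40]}
df = pd.DataFrame(data)

# Same row filter as df[df['age'] > 30], written with .loc
rows = df.loc[df['age'] > 30]

# .loc also accepts a column label after the row condition
names = df.loc[df['age'] > 30, 'name']
print(names.tolist())  # ['Bob', 'David']
```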

Common Pitfalls

Common mistakes when selecting by condition include:

  • Using df['column'] = value (assignment) instead of df['column'] == value (comparison).
  • Passing a condition that is not a boolean Series of the same length as the DataFrame.
  • Using and / or instead of & / | for element-wise logical operations, or leaving out the parentheses around each condition.

Example of wrong and right usage:

python
import pandas as pd

data = {'score': [80, 90, 70, 85]}
df = pd.DataFrame(data)

# Wrong: Using 'and' instead of '&'
# df[(df['score'] > 75) and (df['score'] < 90)]  # Raises ValueError: the truth value of a Series is ambiguous

# Right: Use '&' with parentheses
result = df[(df['score'] > 75) & (df['score'] < 90)]
print(result)
Output
   score
0     80
3     85
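
To see why these pitfalls matter, the failing forms can be run inside try/except blocks. The sketch below reuses the score data; the variables error_message and precedence_failed are illustrative names, not part of pandas.

```python
import pandas as pd

df = pd.DataFrame({'score': [80, 90, 70, 85]})

# 'and' asks Python for a single True/False from a whole Series,
# which pandas refuses with a ValueError
try:
    df[(df['score'] > 75) and (df['score'] < 90)]
    error_message = None
except ValueError as exc:
    error_message = str(exc)
print(error_message)

# Without parentheses, & binds more tightly than > and <, so Python
# evaluates 75 & df['score'] first and then a chained comparison,
# which fails with the same kind of ValueError
try:
    df[df['score'] > 75 & df['score'] < 90]
    precedence_failed = False
except ValueError:
    precedence_failed = True
print(precedence_failed)  # True
```

Both failures come from Python trying to reduce a whole Series to one boolean, which is why each comparison needs its own parentheses and the element-wise operators.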

Quick Reference

Operation            Syntax Example                                     Description
Equal to             df[df['col'] == value]                             Select rows where column equals value
Greater than         df[df['col'] > value]                              Select rows where column is greater than value
Less than            df[df['col'] < value]                              Select rows where column is less than value
Multiple conditions  df[(df['col1'] > val1) & (df['col2'] < val2)]      Select rows matching all conditions
OR condition         df[(df['col1'] == val1) | (df['col2'] == val2)]    Select rows matching any condition
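
Each row of the reference above can be tried on a small DataFrame. The following is a sketch with made-up values; col1 and col2 stand in for whatever columns your data has, and the result names are illustrative.

```python
import pandas as pd

# Hypothetical data for illustration
df = pd.DataFrame({'col1': [1, 5, 8, 3], 'col2': [10, 2, 7, 4]})

equal = df[df['col1'] == 5]                           # equal to
greater = df[df['col1'] > 3]                          # greater than
less = df[df['col1'] < 3]                             # less than
both = df[(df['col1'] > 3) & (df['col2'] < 5)]        # all conditions
either = df[(df['col1'] == 1) | (df['col2'] == 4)]    # any condition

# Print the index labels of the rows each filter keeps
print(equal.index.tolist())    # [1]
print(greater.index.tolist())  # [1, 2]
print(less.index.tolist())     # [0]
print(both.index.tolist())     # [1]
print(either.index.tolist())   # [0, 3]
```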

Key Takeaways

  • Use df[condition], where condition is a boolean expression, to select rows.
  • Combine multiple conditions with & (and) and | (or), wrapping each condition in its own parentheses.
  • Use == for comparison; = is assignment.
  • The condition must evaluate to a boolean Series that matches the DataFrame's length.
  • The parentheses are required for correctness, not just clarity, because & and | bind more tightly than comparison operators such as > and ==.