How to Select Rows by Condition in a pandas DataFrame
Use `df[condition]` to select rows in a pandas DataFrame, where `condition` is a boolean expression. For example, `df[df['column'] > value]` returns the rows whose values in `column` are greater than `value`.
Syntax
The basic syntax to select rows by condition in pandas is:
```python
df[condition]
```
- `df[condition]`: returns the rows where `condition` is True.
- `condition`: a boolean expression involving DataFrame columns, e.g., `df['age'] > 30`.
Example
This example shows how to select rows where the 'age' column is greater than 30.
```python
import pandas as pd

data = {'name': ['Alice', 'Bob', 'Charlie', 'David'], 'age': [25, 35, 30, 40]}
df = pd.DataFrame(data)

# Select rows where age is greater than 30
result = df[df['age'] > 30]
print(result)
```
Output
```
    name  age
1    Bob   35
3  David   40
```
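To make the role of the condition more concrete, the sketch below stores it in a variable before using it. The name `mask` is only an illustrative choice, not something pandas requires. It shows that the condition is a boolean Series with one entry per row, and that the original row labels (1 and 3) are preserved in the result.

```python
import pandas as pd

df = pd.DataFrame({'name': ['Alice', 'Bob', 'Charlie', 'David'],
                   'age': [25, 35, 30, 40]})

# The condition on its own is a boolean Series aligned with the DataFrame's index
mask = df['age'] > 30
print(mask.tolist())          # [False, True, False, True]
print(len(mask) == len(df))   # True

# Indexing with the mask keeps only the rows where it is True
result = df[mask]
print(result.index.tolist())  # [1, 3]
```

Naming the mask can also make longer filters easier to read, since each condition can be built and inspected separately before it is applied.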
Common Pitfalls
Common mistakes when selecting by condition include:
- Using `df['column'] = value` instead of `df['column'] == value` for comparison.
- Forgetting that the condition must return a boolean Series of the same length as the DataFrame.
- Using `and`/`or` instead of `&`/`|` for element-wise logical operations.
Example of wrong and right usage:
```python
import pandas as pd

data = {'score': [80, 90, 70, 85]}
df = pd.DataFrame(data)

# Wrong: using 'and' instead of '&'
# df[(df['score'] > 75) and (df['score'] < 90)]  # This raises a ValueError

# Right: use '&' with parentheses around each condition
result = df[(df['score'] > 75) & (df['score'] < 90)]
print(result)
```
Output
```
   score
0     80
3     85
```
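The pitfalls listed in this section can be reproduced safely in a short sketch. It reuses the `score` data from the example; the names `scratch`, `and_failed`, and `no_parens_failed` are hypothetical and exist only for illustration. The exact error message may vary between pandas versions, so only the exception type is checked.

```python
import pandas as pd

df = pd.DataFrame({'score': [80, 90, 70, 85]})

# Pitfall 1: '=' assigns instead of comparing, so every row is overwritten.
# A copy is used here so the original data stays intact.
scratch = df.copy()
scratch['score'] = 80
print(scratch['score'].tolist())  # [80, 80, 80, 80]

# Pitfall 2: 'and' asks for a single True/False from a whole Series,
# which pandas refuses with a ValueError.
and_failed = False
try:
    df[(df['score'] > 75) and (df['score'] < 90)]
except ValueError:
    and_failed = True
print(and_failed)  # True

# Pitfall 3: without parentheses, '&' binds tighter than '>' and '<', so the
# expression is read as df['score'] > (75 & df['score']) < 90. That chained
# comparison triggers the same ValueError.
no_parens_failed = False
try:
    df[df['score'] > 75 & df['score'] < 90]
except ValueError:
    no_parens_failed = True
print(no_parens_failed)  # True
```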
Quick Reference
| Operation | Syntax Example | Description |
|---|---|---|
| Equal to | df[df['col'] == value] | Select rows where column equals value |
| Greater than | df[df['col'] > value] | Select rows where column is greater than value |
| Less than | df[df['col'] < value] | Select rows where column is less than value |
| Multiple conditions | df[(df['col1'] > val1) & (df['col2'] < val2)] | Select rows matching all conditions |
| OR condition | df[(df['col1'] == val1) \| (df['col2'] == val2)] | Select rows matching any condition |
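Each row of the table can be tried on a small DataFrame. The sketch below uses made-up values for `col1` and `col2`, and the variable names (`equal`, `greater`, and so on) are hypothetical; the comments show which row labels each selection returns for this particular data.

```python
import pandas as pd

# Hypothetical sample data, used only for illustration
df = pd.DataFrame({'col1': [1, 5, 10, 15],
                   'col2': [100, 50, 20, 5]})

equal = df[df['col1'] == 5]                          # row 1
greater = df[df['col1'] > 5]                         # rows 2 and 3
less = df[df['col1'] < 5]                            # row 0
both = df[(df['col1'] > 1) & (df['col2'] < 50)]      # rows 2 and 3
either = df[(df['col1'] == 1) | (df['col2'] == 5)]   # rows 0 and 3

for label, subset in [('equal', equal), ('greater', greater), ('less', less),
                      ('both', both), ('either', either)]:
    print(label, subset.index.tolist())
```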
Key Takeaways
- Use `df[condition]`, where `condition` is a boolean expression, to select rows.
- Combine multiple conditions with `&` (and) and `|` (or), always inside parentheses.
- Avoid using `=` for comparison; use `==` instead.
- Conditions must return a boolean Series matching the DataFrame's length.
- Use parentheses to group conditions for clarity and correctness.