
query() for fast filtering in Pandas - Step-by-Step Execution

Concept Flow - query() for fast filtering
1. Start with DataFrame
2. Write query string
3. Apply df.query(query_string)
4. Filter rows matching condition
5. Return filtered DataFrame
The flow shows how a DataFrame is filtered by writing a query string and applying df.query() to get matching rows.
Execution Sample
Pandas
import pandas as pd

df = pd.DataFrame({
    'age': [25, 30, 35, 40],
    'city': ['NY', 'NY', 'NY', 'SF']
})

filtered = df.query('age > 30 and city == "NY"')
This code filters rows where age is greater than 30 and city is NY using df.query().
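For comparison, the same filter can be written with boolean indexing. The sketch below reuses the sample DataFrame; the names mask and filtered_bool are introduced here for illustration only.

```python
import pandas as pd

df = pd.DataFrame({
    'age': [25, 30, 35, 40],
    'city': ['NY', 'NY', 'NY', 'SF']
})

# Filter with a query string, as in the sample above
filtered = df.query('age > 30 and city == "NY"')

# The same condition as a boolean mask (illustrative names)
mask = (df['age'] > 30) & (df['city'] == 'NY')
filtered_bool = df[mask]

print(filtered)
# Both approaches select the same rows
print(filtered.equals(filtered_bool))
```

Both forms keep the original index labels, so the surviving row is still labelled 2.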
Execution Table
Step | Action | Query Condition | Rows Checked | Rows Matching | Resulting DataFrame
1 | Start with full DataFrame | age > 30 and city == 'NY' | All rows (4) | N/A | [All 4 rows]
2 | Check row 0 | age=25, city=NY | Row 0 | False (25 > 30? No) | Exclude row 0
3 | Check row 1 | age=30, city=NY | Row 1 | False (30 > 30? No) | Exclude row 1
4 | Check row 2 | age=35, city=NY | Row 2 | True (35 > 30 and city NY) | Include row 2
5 | Check row 3 | age=40, city=SF | Row 3 | False (city SF != NY) | Exclude row 3
6 | Return filtered DataFrame | N/A | N/A | Rows matching: 1 | DataFrame with row 2 only
💡 All rows checked; only row 2 matches the query condition.
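The per-row checks in the table can be reproduced with a small loop. This is only an illustrative sketch: pandas evaluates the condition on whole columns at once rather than looping row by row, and the names checks and vectorized are introduced here for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'age': [25, 30, 35, 40],
    'city': ['NY', 'NY', 'NY', 'SF']
})

# Illustration only: walk the rows one at a time, as the table does
checks = []
for row in df.itertuples():
    matches = bool(row.age > 30 and row.city == 'NY')
    checks.append(matches)
    print(f"row {row.Index}: age={row.age}, city={row.city} -> {matches}")

# The column-wise mask gives the same True/False values per row
vectorized = ((df['age'] > 30) & (df['city'] == 'NY')).tolist()
print(checks == vectorized)
```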
Variable Tracker
Variable | Start | After Step 2 | After Step 3 | After Step 4 | After Step 5 | Final
filtered | Empty | Empty | Empty | Contains row 2 | Contains row 2 | Contains row 2
Key Moments - 2 Insights
Why does df.query('age > 30 and city == "NY"') exclude rows where age is exactly 30?
Because the condition uses '>', which means strictly greater than 30. A row with age 30 does not satisfy 'age > 30', as shown in step 3 of the execution table, where row 1 is excluded.
How does df.query handle string comparisons like city == 'NY'?
It compares the string values exactly, including case. Only rows where city equals 'NY' can match: in step 4 of the execution table, row 2 has city 'NY' and is included, while in step 5, row 3 has city 'SF' and is excluded.
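Both points can be checked on the sample data. The sketch below is a minimal illustration; the names strict and lower_case are introduced here and are not part of the original sample.

```python
import pandas as pd

df = pd.DataFrame({
    'age': [25, 30, 35, 40],
    'city': ['NY', 'NY', 'NY', 'SF']
})

# '>' is strict, so the row with age exactly 30 is left out
strict = df.query('age > 30')
print(strict['age'].tolist())

# String comparison is exact: lowercase "ny" does not equal "NY"
lower_case = df.query('city == "ny"')
print(len(lower_case))
```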
Visual Quiz - 3 Questions
Test your understanding
According to the execution table, which row is included in the filtered DataFrame?
A. Row 0
B. Row 1
C. Row 2
D. Row 3
💡 Hint
Check the 'Rows Matching' column for steps 2-5 of the execution table.
At which step does the query condition evaluate to False for row 1?
A. Step 3
B. Step 4
C. Step 2
D. Step 5
💡 Hint
Find row 1 in the 'Rows Checked' column of the execution table, then read the step number and the 'Rows Matching' value on that line.
If the query were changed to 'age >= 30 and city == "NY"', which additional row would be included?
A. Row 0
B. Row 1
C. Row 3
D. Row 1 and Row 3
💡 Hint
Check the execution table for rows that have age exactly 30 and city 'NY'.
Concept Snapshot
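df.query('condition') filters DataFrame rows using a condition string.
Use a string condition like 'age > 30 and city == "NY"'.
It returns the rows where the condition is True.
Supports logical operators: and, or, not (also &, |, ~).
Column names are used directly inside the string.
Often cleaner than boolean indexing; it may be faster on large DataFrames when the optional numexpr library is installed.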
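A short sketch of the other logical operators on the same sample data. The names either and not_over_30 are introduced here for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'age': [25, 30, 35, 40],
    'city': ['NY', 'NY', 'NY', 'SF']
})

# 'or' keeps rows that satisfy at least one condition
either = df.query('age > 30 or city == "SF"')
print(either)

# 'not' inverts a condition
not_over_30 = df.query('not (age > 30)')
print(not_over_30)
```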
Full Transcript
This visual execution shows how pandas df.query() filters rows by a condition string. Starting with a DataFrame of ages and cities, the query 'age > 30 and city == "NY"' is applied. Each row is checked: rows 0 and 1 fail because age is not greater than 30, and row 3 fails because city is not NY. Only row 2 satisfies both conditions and is included in the filtered DataFrame. The variable tracker shows the filtered result as it builds up. The key moments clarify why the strict comparison excludes age 30 and how strings are matched exactly. The quiz tests which rows pass or fail and how changing the condition affects the result. The snapshot summarizes usage: df.query() is a readable way to filter DataFrames with string conditions, and it can also be fast on large DataFrames.