query() for fast filtering in Pandas
The query() method helps you quickly find rows in a table that match certain conditions. It makes filtering data easy and readable.
    DataFrame.query('condition')
The condition is a string that looks like a simple expression, e.g., 'age > 30'.
You can use column names directly inside the string without extra brackets.
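To illustrate what using column names directly means, the sketch below compares query() with the equivalent bracket-based filter. The sample data and variable names are invented for this illustration.

```python
import pandas as pd

# Hypothetical sample data, invented for this illustration
df = pd.DataFrame({
    'name': ['Alice', 'Bob', 'Charlie'],
    'age': [25, 32, 18],
})

# Bracket style: the DataFrame name is repeated inside the mask
with_brackets = df[df['age'] > 20]

# query() style: the column name appears directly in the string
with_query = df.query('age > 20')

# Both approaches select the same rows
print(with_query.equals(with_brackets))
```

The two forms return the same rows; the query() version simply avoids repeating df inside the condition.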
Rows where the age column is greater than 25:
    df.query('age > 25')
Rows where the city column equals 'New York':
    df.query('city == "New York"')
Rows where age is over 20 and city is 'Chicago':
    df.query('age > 20 and city == "Chicago"')
Rows where score is at least 80 or grade is 'A':
    df.query('score >= 80 or grade == "A"')

This code creates a small table of people with their ages and cities. It then uses query() to find people older than 30, and separately, people in New York who are younger than 30.
    import pandas as pd

    data = {
        'name': ['Alice', 'Bob', 'Charlie', 'David'],
        'age': [25, 32, 18, 47],
        'city': ['New York', 'Chicago', 'New York', 'Chicago']
    }
    df = pd.DataFrame(data)

    # Filter people older than 30
    older_than_30 = df.query('age > 30')
    print(older_than_30)

    # Filter people in New York younger than 30
    ny_young = df.query('city == "New York" and age < 30')
    print(ny_young)
String values inside the query must be quoted with a different quote character than the one wrapping the whole query string, e.g., double quotes inside single quotes: 'city == "New York"'.
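query() can be faster than bracket-based filtering on large DataFrames, mainly when the optional numexpr library is installed. On small tables the cost of parsing the string usually cancels out any gain.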
Column names with spaces or special characters need backticks, e.g., df.query('`my column` > 5').
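The notes above on quoting and backticks can be sketched as follows. The DataFrame and the column name 'my column' are made up for the example, and backtick support assumes a reasonably recent pandas version.

```python
import pandas as pd

# Hypothetical data; 'my column' contains a space on purpose
df = pd.DataFrame({
    'city': ['New York', 'Chicago', 'New York'],
    'my column': [3, 8, 10],
})

# Outer single quotes, inner double quotes for the string value
ny_rows = df.query('city == "New York"')

# The reverse also works: outer double quotes, inner single quotes
ny_rows_alt = df.query("city == 'New York'")

# Backticks let query() refer to a column whose name has a space
big_rows = df.query('`my column` > 5')

# Quoted values and backticked names can be combined in one condition
ny_big = df.query('city == "New York" and `my column` > 5')

print(ny_rows)
print(big_rows)
print(ny_big)
```

Which quote style goes outside is a matter of taste; what matters is that the inner and outer quotes differ so the Python string is not closed early.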
query() lets you filter rows using simple, readable strings.
It works well for conditions with one or more columns combined with and / or.
It can make your code cleaner and sometimes faster for big tables.