
How to Use loc in pandas for Data Selection and Filtering

In pandas, loc selects rows and columns by their labels. You can pass row labels, column labels, or both as df.loc[row_label, column_label] to read or assign data based on the index and column names rather than on integer positions.

Syntax

The basic syntax of loc is df.loc[row_indexer, column_indexer]. Here:

  • row_indexer: label(s) of the row(s) you want to select.
  • column_indexer: label(s) of the column(s) you want to select.

You can use single labels, lists of labels, slices, or boolean arrays for both row and column indexers.

python
df.loc[row_label, column_label]
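
The indexer forms listed above can be sketched as follows. This is a minimal illustration, and the DataFrame and its values are hypothetical sample data rather than anything from a real dataset.

```python
import pandas as pd

# Hypothetical sample data, used only for illustration
df = pd.DataFrame(
    {"Age": [25, 30, 35], "City": ["NY", "LA", "Chicago"]},
    index=["a", "b", "c"],
)

# Single row label and single column label: returns one value
single = df.loc["a", "Age"]

# List of row labels with one column label: returns a Series
rows = df.loc[["a", "c"], "City"]

# Label slices on both axes: both end labels are included
block = df.loc["a":"b", "Age":"City"]

# Boolean list with one entry per row: keeps rows marked True
masked = df.loc[[True, False, True], ["Age"]]
```

Passing a list such as ["Age"] as the column indexer keeps the result a DataFrame, while a bare label such as "City" reduces it to a Series.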

Example

This example uses loc to select a single row by its label, then a label slice of rows combined with a list of column labels.

python
import pandas as pd

# Create a sample DataFrame
data = {'Name': ['Alice', 'Bob', 'Charlie', 'David'],
        'Age': [25, 30, 35, 40],
        'City': ['NY', 'LA', 'Chicago', 'Houston']}

df = pd.DataFrame(data, index=['a', 'b', 'c', 'd'])

# Select row with label 'b'
row_b = df.loc['b']

# Select rows 'a' to 'c' and columns 'Name' and 'Age'
subset = df.loc['a':'c', ['Name', 'Age']]

print("Row with label 'b':")
print(row_b)
print("\nSubset of rows 'a' to 'c' and columns 'Name' and 'Age':")
print(subset)
Output
Row with label 'b':
Name    Bob
Age      30
City     LA
Name: b, dtype: object

Subset of rows 'a' to 'c' and columns 'Name' and 'Age':
      Name  Age
a    Alice   25
b      Bob   30
c  Charlie   35

Common Pitfalls

One common mistake is confusing loc with iloc: loc uses labels, while iloc uses integer positions. An integer passed to loc is treated as a label, so it only works if that integer actually appears in the index. Another pitfall is requesting a label that does not exist, which raises a KeyError. Finally, slicing with loc includes the end label, unlike Python's usual slicing.

python
import pandas as pd

data = {'Value': [10, 20, 30]}

df = pd.DataFrame(data, index=['x', 'y', 'z'])

# Wrong: using integer position with loc (raises KeyError)
try:
    print(df.loc[1])
except KeyError as e:
    print(f"KeyError: {e}")

# Right: use label with loc
print(df.loc['y'])

# Slicing includes end label
print(df.loc['x':'y'])
Output
KeyError: 1
Value    20
Name: y, dtype: int64
   Value
x     10
y     20
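
The loc versus iloc confusion is easiest to hit when the index itself holds integers. The sketch below assumes a hypothetical DataFrame whose index labels are 1, 2, 3, so labels and positions are off by one.

```python
import pandas as pd

# Hypothetical data: integer labels 1..3, positions 0..2
df_int = pd.DataFrame({"Value": [10, 20, 30]}, index=[1, 2, 3])

# loc treats 1 as a label: the row labelled 1 is the first row
by_label = df_int.loc[1, "Value"]

# iloc treats 1 as a position: position 1 is the second row
by_position = df_int.iloc[1, 0]

# Label slice 1:2 includes the end label, so it returns two rows
label_slice = df_int.loc[1:2]

# Position slice 1:2 excludes the end, so it returns one row
position_slice = df_int.iloc[1:2]
```

Here by_label is 10 and by_position is 20, even though both calls were given the integer 1.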

Quick Reference

| Usage | Description | Example |
| --- | --- | --- |
| Select single row by label | Returns a Series for the row | df.loc['row_label'] |
| Select multiple rows by labels | Returns DataFrame for rows | df.loc[['row1', 'row2']] |
| Select rows and columns | Returns subset DataFrame | df.loc['row1':'row3', ['col1', 'col2']] |
| Boolean indexing | Select rows where condition is True | df.loc[df['Age'] > 30] |
| Set values | Assign new values to subset | df.loc['a', 'Age'] = 26 |
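
The last two usages, boolean indexing and setting values, can be combined. The following is a small sketch on hypothetical sample data modelled on the example earlier in this article.

```python
import pandas as pd

# Hypothetical sample data for illustration
people = pd.DataFrame(
    {"Name": ["Alice", "Bob", "Charlie", "David"], "Age": [25, 30, 35, 40]},
    index=["a", "b", "c", "d"],
)

# Boolean indexing plus a column label: names of people older than 30
older = people.loc[people["Age"] > 30, "Name"]

# Set a single cell by row label and column label
people.loc["a", "Age"] = 26

# Update only the rows that match a condition
people.loc[people["Age"] > 30, "Age"] += 1
```

Assigning through a single loc call with both the row condition and the column label modifies the original DataFrame directly, which avoids the ambiguity of chained indexing such as people[people["Age"] > 30]["Age"] = ....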

Key Takeaways

  • Use df.loc[row_label, column_label] to select data by labels in pandas.
  • loc includes the end label when slicing rows or columns.
  • loc always interprets its arguments as labels, including integers; use iloc when you mean integer positions.
  • You can select single or multiple rows and columns with loc.
  • loc accepts boolean conditions, which makes filtering rows straightforward.