
How to Use str.startswith in pandas for String Filtering

In pandas, use Series.str.startswith() to check if each string in a column starts with a specific prefix. It returns a boolean Series that you can use to filter rows or analyze data based on string beginnings.

Syntax

The str.startswith() method is used on a pandas Series containing strings. It checks if each string starts with the given prefix and returns a Series of True or False values.

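  • pat: The string or tuple of strings to look for at the start of each value. It is matched literally; regular expressions are not accepted.
  • na: Optional value to use in the result for missing values; default is None, which leaves the handling to the column's dtype (on object-dtype columns, missing values stay missing in the result).

There is no case parameter: matching is always case sensitive. To ignore case, normalize the strings first, for example with Series.str.lower().
python
Series.str.startswith(pat, na=None)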
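The tuple form mentioned above can be sketched as follows. This is a minimal illustration, assuming pandas 1.5 or later (where tuple arguments are accepted); the sample data and variable names are made up for the example.

```python
import pandas as pd

# Hypothetical sample data for illustration
names = pd.Series(['Alice', 'Bob', 'Anna', 'Mike'])

# A tuple matches strings that start with ANY of the listed prefixes
# (assumes pandas >= 1.5, where tuple arguments are supported)
mask = names.str.startswith(('A', 'M'))

print(names[mask].tolist())  # ['Alice', 'Anna', 'Mike']
```

Passing a tuple avoids chaining several startswith calls together with the | operator.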

Example

This example shows how to filter a DataFrame to keep only rows where the 'Name' column starts with 'A'.

python
import pandas as pd

data = {'Name': ['Alice', 'Bob', 'Anna', 'Mike', 'Amanda'], 'Age': [25, 30, 22, 32, 28]}
df = pd.DataFrame(data)

# Check which names start with 'A'
starts_with_a = df['Name'].str.startswith('A')

# Filter rows where 'Name' starts with 'A'
filtered_df = df[starts_with_a]

print(filtered_df)
Output
     Name  Age
0   Alice   25
2    Anna   22
4  Amanda   28
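Because the result is an ordinary boolean Series, it can also be inverted or combined with other conditions. The sketch below reuses the same sample data; the variable names not_a and a_over_24 are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Alice', 'Bob', 'Anna', 'Mike', 'Amanda'],
                   'Age': [25, 30, 22, 32, 28]})

mask = df['Name'].str.startswith('A')

# ~ inverts the mask: names that do NOT start with 'A'
not_a = df.loc[~mask, 'Name']

# & combines masks: names starting with 'A' AND age above 24
a_over_24 = df.loc[mask & (df['Age'] > 24), 'Name']

print(not_a.tolist())      # ['Bob', 'Mike']
print(a_over_24.tolist())  # ['Alice', 'Amanda']
```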

Common Pitfalls

Common mistakes include:

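  • Using startswith on non-string columns without converting them first; the .str accessor raises an AttributeError on numeric data.
  • Forgetting that matching is always case sensitive: 'anna' does not start with 'A'.
  • Not handling missing (NaN) values. On object-dtype columns they stay missing in the result, and such a mask raises a ValueError when used to filter rows.

Always ensure the column holds strings, pass na=False when missing rows should count as non-matches, and normalize case (for example with str.lower()) when the match should ignore case. Note that str.startswith has no case parameter.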

python
import pandas as pd

data = {'Name': ['Alice', None, 'anna', 'Mike', 'Amanda'], 'Age': [25, 30, 22, 32, 28]}
df = pd.DataFrame(data)

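# Wrong: no handling of None, and the match is case sensitive
# On object-dtype columns the None row stays missing in the result; 'anna' is False
print(df['Name'].str.startswith('A'))

# Right: treat missing values as False and lowercase before matching
# (str.startswith has no case parameter, so case=False would raise a TypeError)
print(df['Name'].str.lower().str.startswith('a', na=False))
Output
The first call marks 'anna' as False and, on object-dtype columns, leaves row 1 as a missing value instead of False. The second call prints:
0     True
1    False
2     True
3    False
4     True
Name: Name, dtype: bool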
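The first pitfall, calling startswith on a non-string column, can be sketched as below. The Code column is a hypothetical example, and converting with astype(str) is one possible approach rather than the only one; note that astype(str) turns missing values into the literal text 'nan', so it suits columns without gaps.

```python
import pandas as pd

# Hypothetical numeric column for illustration
df = pd.DataFrame({'Code': [101, 205, 110]})

# The .str accessor is only available on string data
try:
    df['Code'].str.startswith('1')
    accessor_failed = False
except AttributeError:
    accessor_failed = True

# Convert to text first, then match on the prefix
mask = df['Code'].astype(str).str.startswith('1')

print(accessor_failed)            # True
print(df.loc[mask, 'Code'].tolist())  # [101, 110]
```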

Quick Reference

Parameter | Description | Default
pat | String or tuple of strings to check at the start | Required
na | Value to use in the result for missing data (NaN) | None

There is no case parameter; lowercase the column first (str.lower()) for case-insensitive matching.

Key Takeaways

Use Series.str.startswith(prefix) to get a boolean mask for strings starting with prefix.
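Pass na=False (or na=True) to decide how missing values appear in the mask, so the result is a clean boolean Series.
str.startswith is always case sensitive and has no case parameter; lowercase the column with str.lower() first to ignore letter case.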
Always ensure the column is string type before using str.startswith.
Use the boolean result to filter DataFrame rows easily.