Extracting with str.extract (regex) in Data Analysis Python - Interactive Code Practice

Practice - 5 Tasks
Answer the questions below
1. Fill in the blank (easy)

Complete the code to extract the first word from each string in the 'text' column.

Data Analysis Python
df['first_word'] = df['text'].str.[1]('(\\w+)')
A) findall
B) extract
C) split
D) replace
Common Mistakes
Using str.findall returns a list of all matches, not a single extracted group.
Using str.split splits by delimiter, not regex groups.
Using str.replace changes text instead of extracting.
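
To make the difference between these methods concrete, here is a minimal sketch that runs str.extract and str.findall on the same column. The sample rows are hypothetical, and expand=False is used only so that the result is a Series rather than a one-column DataFrame.

```python
import pandas as pd

# Hypothetical sample data; the 'text' column name follows the task above.
df = pd.DataFrame({'text': ['hello world', 'data analysis', '42 is the answer']})

# str.extract keeps the first match of the capture group: one value per row.
first_word = df['text'].str.extract(r'(\w+)', expand=False)

# str.findall keeps every match: one list per row.
all_words = df['text'].str.findall(r'\w+')

print(first_word.tolist())  # ['hello', 'data', '42']
print(all_words.tolist()[0])  # ['hello', 'world']
```

With the default expand=True, str.extract returns a DataFrame with one column per capture group.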
2. Fill in the blank (medium)

Complete the code to extract the year (4 digits) from the 'date' column strings.

Data Analysis Python
df['year'] = df['date'].str.[1]('(\\d{4})')
A) contains
B) match
C) extract
D) findall
Common Mistakes
Using str.contains returns True/False, not extracted text.
Using str.match only matches from the start of the string.
Using str.findall returns lists, not single extracted values.
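
The following sketch compares what str.contains, str.match, and str.extract return for the same four-digit pattern. The sample strings are made up for illustration; only the method names come from the task above.

```python
import pandas as pd

# Hypothetical sample values: a year mid-string, a year at the start, no year.
dates = pd.Series(['Released 2021-03-15', '2019-11-02', 'no date here'])

# str.contains answers "is there a match anywhere?" with booleans.
has_year = dates.str.contains(r'\d{4}')

# str.match also returns booleans, but only tests from the start of the string.
starts_with_year = dates.str.match(r'\d{4}')

# str.extract returns the captured text, or NaN when nothing matches.
year = dates.str.extract(r'(\d{4})', expand=False)

print(has_year.tolist())          # [True, True, False]
print(starts_with_year.tolist())  # [False, True, False]
print(year.tolist()[:2])          # ['2021', '2019']
```

The extracted years are strings; converting them to numbers is a separate step.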
3. Fill in the blank (hard)

Fix the error in the code to extract the domain name from email addresses in the 'email' column.

Data Analysis Python
df['domain'] = df['email'].str.extract(r'@(\w+\[1]\w+)')
A) .
B) *
C) +
D) ?
Common Mistakes
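Choosing '*', '+' or '?' after the backslash produces an escaped quantifier (for example '\+'), which matches that literal character instead of the dot in the domain.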
Not escaping the dot causes it to match any character.
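
A small sketch of why the escape matters: the same pattern is run with and without a backslash before the dot. The addresses are hypothetical, and the second one deliberately has no dot after the @ sign.

```python
import pandas as pd

# Hypothetical addresses; the second one contains no dot in the domain part.
emails = pd.Series(['ann@example.com', 'bob@mail_server'])

# Escaped dot: requires a literal '.' between the two word runs.
escaped = emails.str.extract(r'@(\w+\.\w+)', expand=False)

# Unescaped dot: '.' matches any character, so the dotless string still matches.
unescaped = emails.str.extract(r'@(\w+.\w+)', expand=False)

print(escaped.iloc[0])    # example.com
print(unescaped.iloc[1])  # mail_server
```

Note that this pattern captures only two dot-separated parts, so a longer domain such as one with a country suffix would be cut short.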
4. Fill in the blank (hard)

Fill both blanks to extract the area code (3 digits) and the phone number (7 digits) from 'phone' column strings.

Data Analysis Python
df[['area_code', 'number']] = df['phone'].str.extract(r'\(([1])\) ([2])')
A) \d{3}
B) \w{3}
C) \d{7}
D) \w{7}
Common Mistakes
Using \w matches letters, digits, and underscores, not only digits.
Not matching the exact number of digits causes wrong extraction.
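
As a variation on the task, the sketch below uses named capture groups, so the columns returned by str.extract are already labelled. Named groups are not part of the original exercise, and the phone strings are hypothetical.

```python
import pandas as pd

# Hypothetical sample values: one well-formed number, one with letters in the
# area code, and one in a different layout.
df = pd.DataFrame({'phone': ['(555) 1234567', '(abc) 1234567', '555-1234567']})

# Each named group (?P<name>...) becomes a column with that name.
pattern = r'\((?P<area_code>\d{3})\) (?P<number>\d{7})'
parts = df['phone'].str.extract(pattern)

print(list(parts.columns))         # ['area_code', 'number']
print(parts.loc[0, 'area_code'])   # 555
print(parts.loc[0, 'number'])      # 1234567
```

Rows that do not match the whole pattern get NaN in every extracted column, which is what happens to the second and third rows here.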
5. Fill in the blank (hard)

Fill all three blanks to create a dictionary comprehension that maps words to their lengths only if length is greater than 3.

Data Analysis Python
lengths = { [1] : [2] for [3] in words if len([3]) > 3 }
A) word
B) len(word)
D) w
Common Mistakes
Using a key variable that differs from the loop variable raises a NameError unless that name happens to be defined elsewhere.
Using the wrong expression for length causes wrong results.
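
For reference, a filtered dictionary comprehension of this shape can be sketched as follows. The word list is hypothetical; the point is that the key, the value expression, and the filter all use the same loop variable.

```python
# Hypothetical input list.
words = ['data', 'is', 'fun', 'pandas']

# Key and value both come from the loop variable; the 'if' clause filters
# out words of three characters or fewer before they reach the dictionary.
lengths = {word: len(word) for word in words if len(word) > 3}

print(lengths)  # {'data': 4, 'pandas': 6}
```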