Complete the code to extract the first word from each string in the 'text' column.
df['first_word'] = df['text'].str.[1]('(\\w+)')
The str.extract method extracts groups from the first match of the regex pattern. Here, it extracts the first word.
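As a minimal sketch of the completed exercise, assuming a small made-up DataFrame (the sample strings are illustrative only), the call might look like this. A raw string is used so the backslash need not be doubled, and expand=False is passed so the result is a Series rather than a one-column DataFrame.

```python
import pandas as pd

# Hypothetical sample data, not taken from the exercise
df = pd.DataFrame({"text": ["hello world", "pandas is fun", "  leading spaces"]})

# r"(\w+)" is equivalent to '(\\w+)'; the single capture group holds the first word
df["first_word"] = df["text"].str.extract(r"(\w+)", expand=False)

first_words = df["first_word"].tolist()
print(first_words)
```

Because the pattern is searched for anywhere in the string, leading whitespace is skipped and the first run of word characters is returned.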
Complete the code to extract the year (4 digits) from the 'date' column strings.
df['year'] = df['date'].str.[1]('(\\d{4})')
str.extract extracts the first group matching the regex pattern, here the 4-digit year.
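One detail worth illustrating is what happens when a row contains no match. The sketch below assumes some made-up date strings; rows without four consecutive digits come back as missing values, and the extracted years are strings, not integers.

```python
import pandas as pd

# Hypothetical sample data: the last row has no 4-digit run
df = pd.DataFrame({"date": ["2021-03-15", "15 March 1999", "no year here"]})

df["year"] = df["date"].str.extract(r"(\d{4})", expand=False)

years = df["year"].tolist()
missing = df["year"].isna().tolist()  # True where the pattern did not match
print(df)
```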
Fix the error in the code to extract the domain name from email addresses in the 'email' column.
df['domain'] = df['email'].str.extract('@(\w+\[1]\w+)')
In a regex, an unescaped dot (.) matches any character except a newline. To match a literal dot, escape it with a backslash: \. Here, the escape is needed so that the pattern matches only the actual dot in a domain name.
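The difference between the escaped and unescaped dot can be sketched with two made-up addresses, one of which is deliberately malformed (both are assumptions for illustration). The unescaped pattern accepts the malformed address, while the escaped one rejects it.

```python
import pandas as pd

# Hypothetical sample data: the second address has no dot in its domain
df = pd.DataFrame({"email": ["alice@example.com", "bob@examplexcom"]})

# Escaped dot: requires a literal "." between the two word runs
escaped = df["email"].str.extract(r"@(\w+\.\w+)", expand=False)

# Unescaped dot: any character is accepted in that position
unescaped = df["email"].str.extract(r"@(\w+.\w+)", expand=False)

print(escaped.tolist())
print(unescaped.tolist())
```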
Fill both blanks to extract the area code (3 digits) and the phone number (7 digits) from 'phone' column strings.
df[['area_code', 'number']] = df['phone'].str.extract('\(([1])\) ([2])')
\w matches letters, digits, and underscores, not only digits, so it is too permissive here. The area code is 3 digits, so \d{3} matches it. The phone number is 7 digits, so \d{7} matches it.
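With both blanks filled in, a pattern with two capture groups yields a two-column result, which can be assigned to two new columns at once. The following is a sketch under the assumption that phone numbers are formatted like "(415) 5551234"; the sample values are made up.

```python
import pandas as pd

# Hypothetical sample data in the "(AAA) NNNNNNN" format
df = pd.DataFrame({"phone": ["(415) 5551234", "(212) 5559876"]})

# \( and \) match literal parentheses; each unescaped (...) is a capture group
df[["area_code", "number"]] = df["phone"].str.extract(r"\((\d{3})\) (\d{7})")

area_codes = df["area_code"].tolist()
numbers = df["number"].tolist()
print(df)
```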
Fill all three blanks to create a dictionary comprehension that maps words to their lengths only if length is greater than 3.
lengths = { [1] : [2] for [3] in words if len([3]) > 3 }
The dictionary comprehension uses word as key, len(word) as value, and iterates over word in words. The condition filters words longer than 3.
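Filled in, the comprehension could look like the sketch below. The contents of words are an assumption for illustration, since the exercise does not define the list.

```python
# Hypothetical word list; the exercise does not specify one
words = ["cat", "house", "tree", "a", "python"]

# Keep only words with more than 3 characters, mapping each to its length
lengths = {word: len(word) for word in words if len(word) > 3}

print(lengths)
```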