Hint: Apply strip, lower, then replace to clean text [OK]
Common Mistakes:
Forgetting that strip() removes leading and trailing whitespace
Not removing the comma correctly
Confusing case conversion order
4. Identify the error in this preprocessing code snippet:
text = "Example Text!"
clean_text = text.lower().strip().remove('!')
print(clean_text)
medium
A. remove() is not a string method
B. strip() should be called before lower()
C. lower() does not change the text
D. print() is missing parentheses
Solution
Step 1: Check string methods used
Python strings do not have a remove() method, so calling it raises an AttributeError; to remove characters, use replace() with an empty replacement string.
Step 2: Verify other method usage
strip() and lower() are valid string methods and give the same result in either order here; print() is called correctly with parentheses.
Final Answer:
remove() is not a string method -> Option A
Quick Check:
remove() invalid for strings = A [OK]
Hint: Use replace() to remove chars, not remove() [OK]
Common Mistakes:
Using remove() instead of replace()
Thinking strip() must come before lower()
Assuming the print() call has a syntax error
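The fix described in the solution to question 4 can be sketched as follows. This is a minimal illustration, assuming the only goal is to drop the exclamation mark; the variable error_message is a hypothetical name added here to capture the failure of the original chain.

```python
text = "Example Text!"

# str has no remove() method, so the original chain raises AttributeError.
try:
    text.lower().strip().remove('!')
except AttributeError as exc:
    error_message = str(exc)  # hypothetical name, kept only for inspection

# replace() with an empty replacement string deletes every '!' instead.
clean_text = text.lower().strip().replace('!', '')
print(clean_text)  # example text
```

Note that replace() removes every occurrence of the character, not just one at the end of the string.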
5. You have a dataset with inconsistent casing, extra spaces, and punctuation. Which sequence of preprocessing steps best cleans the text for a machine learning model?
hard
A. Convert to lowercase, strip spaces, remove punctuation
B. Strip spaces, remove punctuation, convert to lowercase
C. Remove punctuation, convert to lowercase, strip spaces
D. Remove punctuation, strip spaces, convert to uppercase
Solution
Step 1: Start by removing extra spaces
Stripping first removes leading and trailing whitespace, so punctuation removal works on the actual content.
Step 2: Remove punctuation and convert to lowercase
Removing punctuation next clears the noise characters; converting to lowercase last gives the text uniform casing.
Final Answer:
Strip spaces, remove punctuation, convert to lowercase -> Option B
Quick Check:
Clean edges, remove noise, unify case = B [OK]
Hint: Strip spaces first, then remove punctuation, then lowercase [OK]
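The Option B sequence can be sketched as a small helper. This is one possible implementation under stated assumptions, not the only way to do it: the function name clean, the sample strings, and the choice of string.punctuation (ASCII punctuation only) are all illustrative and not part of the question.

```python
import string

# Translation table that deletes every ASCII punctuation character.
# Assumption: ASCII punctuation is enough for the dataset at hand.
_PUNCT_TABLE = str.maketrans('', '', string.punctuation)

def clean(text: str) -> str:
    """Strip edges, remove ASCII punctuation, then lowercase (Option B order)."""
    stripped = text.strip()                      # 1. strip leading/trailing whitespace
    no_punct = stripped.translate(_PUNCT_TABLE)  # 2. remove punctuation
    return no_punct.lower()                      # 3. convert to lowercase

# Hypothetical sample rows with inconsistent casing, spaces, and punctuation.
samples = ["  Hello, World!  ", "PYTHON rocks.", "\tData-Science?\n"]
cleaned = [clean(s) for s in samples]
print(cleaned)  # ['hello world', 'python rocks', 'datascience']
```

Because string.punctuation covers only ASCII characters, text containing other punctuation (for example curly quotes) would need extra handling.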