Experiment - Tokenization (word and sentence)
Problem: You want to split text into words and sentences reliably in order to prepare data for NLP tasks.
Current Metrics: The current tokenization splits words and sentences, but it sometimes merges punctuation into adjacent tokens or misses sentence boundaries.
Issue: This inconsistent tokenization propagates errors into downstream tasks such as sentiment analysis and translation. A baseline word- and sentence-tokenization setup is sketched below.
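The section does not name a specific tokenizer, so as one possible baseline, here is a minimal sketch using NLTK's `sent_tokenize` and `word_tokenize` (the `nltk` package and its Punkt sentence models are assumed to be installed). Punkt-style tokenizers typically keep abbreviations like "Dr." inside a sentence and split punctuation into separate tokens, which addresses the merged-punctuation and missed-boundary symptoms described above.

```python
# Minimal word/sentence tokenization sketch with NLTK (assumed library choice).
import nltk

# The Punkt models are required for sent_tokenize; newer NLTK versions use
# the "punkt_tab" resource, older ones use "punkt", so try both quietly.
nltk.download("punkt", quiet=True)
nltk.download("punkt_tab", quiet=True)

from nltk.tokenize import sent_tokenize, word_tokenize

# Hypothetical sample text with an abbreviation, a contraction, and quotes.
text = "Dr. Smith isn't convinced. He said, 'The results look off!' We retrain tomorrow."

# Sentence tokenization: Punkt usually avoids splitting on "Dr." while still
# ending sentences at ".", "!", and "?".
sentences = sent_tokenize(text)
for s in sentences:
    print(s)

# Word tokenization: punctuation becomes its own token, and contractions are
# split (e.g. "isn't" -> "is", "n't"), instead of being merged with words.
for s in sentences:
    print(word_tokenize(s))
```

If tokenization still misfires on domain-specific text (e.g. URLs, tickers, or unusual abbreviations), a rule-based or statistical tokenizer from another library, or a custom regex pass before tokenization, may be worth comparing against this baseline.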