## Model Pipeline: Tokenization (Word and Sentence)
This pipeline breaks text into smaller units called tokens. It first splits the text into sentences, then splits each sentence into words. Downstream models operate on these tokens rather than on raw character streams.
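A minimal sketch of the two-stage split, using naive regular-expression rules (real tokenizers handle abbreviations, quotes, and edge cases that this sketch ignores):

```python
import re

def sent_tokenize(text):
    # Naive rule: a sentence ends at ., !, or ? followed by whitespace.
    return [s for s in re.split(r'(?<=[.!?])\s+', text.strip()) if s]

def word_tokenize(sentence):
    # Runs of word characters, or single punctuation marks, become tokens.
    return re.findall(r"\w+|[^\w\s]", sentence)

text = "Tokenization is rule-based. No training is needed!"
sentences = sent_tokenize(text)
tokens = [word_tokenize(s) for s in sentences]
# sentences -> ["Tokenization is rule-based.", "No training is needed!"]
# tokens[1] -> ["No", "training", "is", "needed", "!"]
```

Because both stages are deterministic rules applied to the input text, the same input always yields the same tokens, which is why no training is involved.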
There is no training loss to report: tokenization is a fixed, rule-based process.
| Epoch | Loss ↓ | Accuracy ↑ | Observation |
|---|---|---|---|
| 1 | N/A | N/A | Tokenization is a rule-based process, no training needed. |