NLP Program to Detect Language with Python
Use langdetect to detect a text's language by calling detect(text), which returns a language code such as 'en' for English.
Examples
How to Think About It
Algorithm
Code
    from langdetect import detect


    def detect_language(text: str) -> str:
        return detect(text)


    sample_text = "Hello, how are you?"
    language = detect_language(sample_text)
    print(f"Detected language: {language}")
Dry Run
Let's trace the input 'Hello, how are you?' through the code.
Input text
text = 'Hello, how are you?'
Call detect function
detect('Hello, how are you?')
Analyze text
detect() compares the text's character patterns against its built-in language profiles and finds English to be the closest match.
Return result
The function returns 'en' as the detected language code.
| Step | Action | Value |
|---|---|---|
| 1 | Input text | Hello, how are you? |
| 2 | Call detect() | detect('Hello, how are you?') |
| 3 | Match language profile | English (en) |
| 4 | Return language code | en |
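
The "Match language profile" step in the trace can be illustrated with a toy model. The sketch below is not langdetect's implementation: it assumes a few hand-written sample sentences as stand-in profiles and scores each language by the character trigrams it shares with the input. The names `SAMPLES`, `PROFILES`, `trigrams`, and `toy_detect` are hypothetical.

```python
from collections import Counter

# Hypothetical miniature "profiles": one short sample sentence per language.
SAMPLES = {
    "en": "hello how are you the weather is nice today and this is a test",
    "fr": "bonjour comment allez vous le temps est beau aujourd'hui et ceci est un test",
    "es": "hola como estas el tiempo es bueno hoy y esto es una prueba",
}


def trigrams(text: str) -> Counter:
    """Count overlapping 3-character sequences in lowercased, space-padded text."""
    padded = f" {text.lower()} "
    return Counter(padded[i:i + 3] for i in range(len(padded) - 2))


# Build one trigram profile per language from its sample sentence.
PROFILES = {lang: trigrams(sample) for lang, sample in SAMPLES.items()}


def toy_detect(text: str) -> str:
    """Return the language whose profile shares the most trigrams with the text."""
    grams = trigrams(text)

    def overlap(profile: Counter) -> int:
        return sum(min(count, profile[gram]) for gram, count in grams.items())

    return max(PROFILES, key=lambda lang: overlap(PROFILES[lang]))


print(toy_detect("Hello, how are you?"))  # prints "en"
```

Real detectors build their profiles from large corpora and use probabilistic scoring, so this only conveys the general idea of profile matching.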
Why This Works
Step 1: Importing langdetect
The langdetect library provides a simple detect() function to identify language from text.
Step 2: Detecting language
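The detect() function compares character sequences (n-grams) in the input text against stored language profiles and picks the most probable language. On short or ambiguous text the result can vary between runs unless DetectorFactory.seed is set.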
Step 3: Returning language code
It returns a short code like en for English or fr for French, which is easy to use in programs.
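
Because the result is a plain string, it plugs directly into lookups and branching logic. The following is a minimal sketch, assuming a small hand-made `LANGUAGE_NAMES` table and a hypothetical helper `describe_language`. The optional `detector` parameter is an illustrative addition so the helper can be tried with a stub function; when it is omitted, the helper falls back to langdetect and seeds `DetectorFactory` for repeatable results.

```python
from typing import Callable, Optional

# Small illustrative lookup (an assumption); extend it with the codes you need.
LANGUAGE_NAMES = {"en": "English", "fr": "French", "es": "Spanish", "de": "German"}


def describe_language(text: str, detector: Optional[Callable[[str], str]] = None) -> str:
    """Detect the language of text and return a readable name when one is known."""
    if detector is None:
        # Imported lazily so the helper also works with a stub detector.
        from langdetect import DetectorFactory, detect

        DetectorFactory.seed = 0  # makes langdetect's results repeatable
        detector = detect
    if not text.strip():
        # Blank input has nothing to analyze, so skip detection entirely.
        return "Unknown"
    code = detector(text)
    # Fall back to the raw code for languages missing from the lookup table.
    return LANGUAGE_NAMES.get(code, code)
```

With langdetect installed, `describe_language("Bonjour tout le monde")` returns the readable name for whichever code detect() produces, or the raw code if the table has no entry for it.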
Alternative Approaches
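    import fasttext

    # Requires the pretrained lid.176.ftz model file, downloaded beforehand
    model = fasttext.load_model('lid.176.ftz')


    def detect_language_fasttext(text: str) -> str:
        # predict() expects a single line, so replace newlines
        labels, _ = model.predict(text.replace('\n', ' '))
        return labels[0].replace('__label__', '')


    print(detect_language_fasttext('Hello, how are you?'))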
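    from textblob import TextBlob


    def detect_language_textblob(text: str) -> str:
        # detect_language() was deprecated in TextBlob 0.16.0 and removed in
        # later releases; it only works on older versions with internet access
        return TextBlob(text).detect_language()


    print(detect_language_textblob('Bonjour tout le monde'))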
Complexity: O(n) time, O(1) space
Time Complexity
The detection scans the input text once, so time grows linearly with text length.
Space Complexity
The language profiles are loaded once and have a fixed size, so extra memory stays roughly constant as the input grows.
Which Approach is Fastest?
fastText is typically faster, especially on large volumes of text, but its pretrained model must be downloaded and loaded first; langdetect is simpler to set up and works well for small to medium texts.
| Approach | Time | Space | Best For |
|---|---|---|---|
| langdetect | O(n) | O(1) | Simple scripts and small texts |
| fasttext | O(n) | O(m) model size | High speed and many languages |
| TextBlob | O(n) | O(1) | Older TextBlob versions with internet access |
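
Speed differences depend heavily on your texts and hardware, so they are best measured directly. Below is a minimal sketch of a timing harness; the `benchmark` function and its parameters are hypothetical, and it assumes you pass in detector functions such as `detect_language` or `detect_language_fasttext` from the code above.

```python
import time
from typing import Callable, Dict, List


def benchmark(
    detectors: Dict[str, Callable[[str], str]],
    texts: List[str],
    repeats: int = 3,
) -> Dict[str, float]:
    """Return the best wall-clock time (seconds) each detector needs for all texts."""
    results: Dict[str, float] = {}
    for name, detect_fn in detectors.items():
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            for text in texts:
                detect_fn(text)
            # Keep the fastest run to reduce noise from other processes.
            best = min(best, time.perf_counter() - start)
        results[name] = best
    return results
```

A call might look like `benchmark({"langdetect": detect_language}, texts)`, where `texts` is a list of representative strings from your own data. Load any model before timing so that one-off startup cost is not counted as detection time.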
Run pip install langdetect before running the program.