
NLP Program to Detect Language with Python

Use the Python library langdetect to detect the language of a text: calling detect(text) returns a language code such as 'en' for English.

Examples

Input: Hello, how are you?
Output: en

Input: Bonjour tout le monde
Output: fr

Input: これは日本語の文章です。
Output: ja

How to Think About It

To detect the language of a text, the program looks for patterns and clues in the words and characters that match known languages. It compares the input text against models built from many languages and picks the best match.
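To make the idea of comparing text against language models concrete, here is a toy sketch that builds character-trigram profiles and picks the most similar one. It is only an illustration under stated assumptions: the training snippets in SAMPLES are made up, the names trigram_profile, similarity and toy_detect are hypothetical, and real libraries build their profiles from large corpora with more careful statistics.

```python
from collections import Counter

def trigram_profile(text: str) -> Counter:
    """Count overlapping character trigrams in lowercased, space-padded text."""
    padded = f" {text.lower()} "
    return Counter(padded[i:i + 3] for i in range(len(padded) - 2))

def similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two trigram count vectors."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = sum(c * c for c in a.values()) ** 0.5
    norm_b = sum(c * c for c in b.values()) ** 0.5
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# Tiny, made-up training snippets (assumption for illustration only).
SAMPLES = {
    "en": "the quick brown fox jumps over the lazy dog and how are you today",
    "fr": "bonjour tout le monde comment allez vous le renard brun saute",
}
PROFILES = {lang: trigram_profile(text) for lang, text in SAMPLES.items()}

def toy_detect(text: str) -> str:
    """Return the code of the profile most similar to the input text."""
    query = trigram_profile(text)
    return max(PROFILES, key=lambda lang: similarity(query, PROFILES[lang]))

print(toy_detect("Hello, how are you?"))    # en
print(toy_detect("Bonjour tout le monde"))  # fr
```

With only two tiny profiles this toy is easy to fool, which is why a trained library is the practical choice.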

Algorithm

1. Get the input text from the user or source.
2. Use a language detection tool or library to analyze the text.
3. The tool compares text features to known language profiles.
4. Return the language code that best matches the text.

Code

python
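from langdetect import DetectorFactory, detect

# langdetect is non-deterministic by default; a fixed seed makes results repeatable.
DetectorFactory.seed = 0

def detect_language(text: str) -> str:
    """Return the language code (for example 'en') that langdetect guesses for text."""
    return detect(text)

sample_text = "Hello, how are you?"
language = detect_language(sample_text)
print(f"Detected language: {language}")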
Output
Detected language: en

Dry Run

Let's trace the code for the input 'Hello, how are you?'.

Step 1: Input text

text = 'Hello, how are you?'

Step 2: Call detect function

detect('Hello, how are you?')

Step 3: Analyze text

The function compares the text's character patterns with its language profiles and finds English the most probable match.

Step 4: Return result

Returns 'en' as the detected language code.

| Step | Action | Value |
| --- | --- | --- |
| 1 | Input text | Hello, how are you? |
| 2 | Call detect() | detect('Hello, how are you?') |
| 3 | Match language profile | English (en) |
| 4 | Return language code | en |

Why This Works

Step 1: Importing langdetect

The langdetect library provides a simple detect() function to identify language from text.

Step 2: Detecting language

The detect() function compares the character n-grams (short character sequences) of the input text with per-language profiles and returns the most probable language.

Step 3: Returning language code

It returns a short code like en for English or fr for French, which is easy to use in programs.
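Because the result is a plain string, it can be mapped to a readable name with an ordinary dictionary. The sketch below is a minimal example: LANGUAGE_NAMES and language_name are hypothetical names introduced here, and the table covers only the three codes used in this article.

```python
# Hypothetical lookup table covering only the codes used in this article.
LANGUAGE_NAMES = {"en": "English", "fr": "French", "ja": "Japanese"}

def language_name(code: str) -> str:
    """Return a readable name for a language code, or the code itself if unknown."""
    return LANGUAGE_NAMES.get(code, code)

print(language_name("fr"))  # French
print(language_name("xx"))  # xx
```

Falling back to the raw code keeps the program working when the detector returns a language the table does not list.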


Alternative Approaches

Using fasttext language identification
python
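import fasttext

# Requires the pretrained lid.176.ftz model file in the working directory.
model = fasttext.load_model('lid.176.ftz')

def detect_language_fasttext(text: str) -> str:
    # predict() handles one line at a time, so replace newlines first.
    labels, _probabilities = model.predict(text.replace('\n', ' '))
    # Labels look like '__label__en'; strip the prefix to get the code.
    return labels[0].replace('__label__', '')

print(detect_language_fasttext('Hello, how are you?'))
fastText is faster than langdetect and its lid.176 model covers 176 languages, but it requires downloading the model file first.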
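fastText can also return several candidate labels with scores, for example via model.predict(text, k=2), which yields a pair of labels and probabilities. The sketch below shows one way to turn that pair into (code, probability) tuples. The helper name parse_fasttext_prediction is hypothetical, and the example values are invented stand-ins so the snippet runs without the model file.

```python
from typing import List, Sequence, Tuple

PREFIX = "__label__"

def parse_fasttext_prediction(
    labels: Sequence[str], probs: Sequence[float]
) -> List[Tuple[str, float]]:
    """Pair each label (prefix stripped) with its probability as a float."""
    return [
        (label[len(PREFIX):] if label.startswith(PREFIX) else label, float(prob))
        for label, prob in zip(labels, probs)
    ]

# Invented stand-in for the (labels, probabilities) pair returned by predict().
example = (("__label__en", "__label__de"), [0.97, 0.02])
print(parse_fasttext_prediction(*example))  # [('en', 0.97), ('de', 0.02)]
```

Keeping the probabilities makes it possible to reject low-confidence guesses instead of always trusting the top label.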
Using TextBlob library
python
from textblob import TextBlob

def detect_language_textblob(text: str) -> str:
    return TextBlob(text).detect_language()

print(detect_language_textblob('Bonjour tout le monde'))
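TextBlob's detect_language() is easy to use but relied on an online translation service, so it needs an internet connection. The method was deprecated in TextBlob 0.16.0 and removed in later releases, so this code only works with older versions.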

Complexity: O(n) time, O(1) space

Time Complexity

The detection scans the input text once, so time grows linearly with text length.

Space Complexity

The internal language profiles have a fixed size, so the extra space does not grow with the length of the text.

Which Approach is Fastest?

FastText is faster for large texts but needs model loading; langdetect is simple and good for small to medium texts.

| Approach | Time | Space | Best For |
| --- | --- | --- | --- |
| langdetect | O(n) | O(1) | Simple scripts and small texts |
| fasttext | O(n) | O(m) model size | High speed and many languages |
| TextBlob | O(n) | O(1) | Easy use with internet access |
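Speed claims depend on your texts and machine, so it is worth measuring. Here is a minimal timing sketch, assuming each detector is a function that takes a string and returns a code. average_seconds and fake_detector are hypothetical names, and fake_detector is only a placeholder so the snippet runs without any detection library; pass a real detector function in its place.

```python
import time
from typing import Callable

def average_seconds(detector: Callable[[str], str], text: str, repeats: int = 100) -> float:
    """Return the average wall-clock time of one detector call on text."""
    start = time.perf_counter()
    for _ in range(repeats):
        detector(text)
    return (time.perf_counter() - start) / repeats

def fake_detector(text: str) -> str:
    """Placeholder that touches every character once; swap in a real detector."""
    return "en" if sum(ch.isalpha() for ch in text) else "unknown"

# Time the same detector on inputs of growing length.
for size in (1_000, 10_000, 100_000):
    sample = "hello world " * (size // 12)
    seconds = average_seconds(fake_detector, sample, repeats=20)
    print(f"{size:>7} chars: {seconds:.6f} s per call")
```

If the time per call grows roughly in proportion to the input size, that matches the O(n) estimate above.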
Tip: Install langdetect with pip install langdetect before running the program.
Warning: Short or ambiguous texts are often misdetected, so check the input or handle uncertain results before relying on the detected code.
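One simple guard is to skip detection when the text has too few letters to judge. This also avoids the exception langdetect raises when a text contains no usable characters. The sketch below is an assumption-laden example: detect_or_none, min_letters and the threshold of 10 are hypothetical choices to tune for your data, and always_english is a stand-in detector so the snippet runs without extra packages.

```python
from typing import Callable, Optional

def detect_or_none(
    text: str, detector: Callable[[str], str], min_letters: int = 10
) -> Optional[str]:
    """Return the detector's code, or None when the text is too short to trust."""
    letters = sum(ch.isalpha() for ch in text)
    if letters < min_letters:
        return None  # too little evidence for a reliable guess
    return detector(text)

def always_english(text: str) -> str:
    """Stand-in detector for the example; use a real one such as langdetect's detect."""
    return "en"

print(detect_or_none("Hi", always_english))                   # None
print(detect_or_none("Hello, how are you?", always_english))  # en
```

Returning None forces the calling code to decide what to do with uncertain input, such as asking for more text or using a default language.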