
Testing analyzers (_analyze API) in Elasticsearch - Step-by-Step Execution

Concept Flow - Testing analyzers (_analyze API)
1. Send text to the _analyze API
2. Elasticsearch applies the analyzer
3. Text is tokenized
4. Tokens are filtered
5. Tokens and details are returned
6. Output is displayed
The flow shows how text is sent to the _analyze API, processed by the analyzer, tokenized, filtered, and then returned as tokens.
Execution Sample
POST _analyze
{
  "analyzer": "standard",
  "text": "Quick Brown Foxes"
}
This request tests the 'standard' analyzer on the text 'Quick Brown Foxes' and returns the tokens.
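When a live cluster isn't at hand, the standard analyzer's behavior on this input can be approximated in a few lines of Python. This is a rough sketch only: the real standard analyzer uses Unicode text segmentation (UAX #29), not a simple regex.

```python
import re

def standard_analyzer_sketch(text):
    # Approximation of the standard analyzer: split on runs of
    # non-word characters, then lowercase each token.
    return [token.lower() for token in re.findall(r"\w+", text)]

print(standard_analyzer_sketch("Quick Brown Foxes"))
# → ['quick', 'brown', 'foxes']
```

The output matches the tokens shown in the execution table below: uppercase input, lowercase tokens.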
Execution Table
| Step | Action | Input Text | Analyzer Used | Tokens Produced |
|------|--------|------------|---------------|-----------------|
| 1 | Send request to _analyze API | Quick Brown Foxes | standard | — |
| 2 | Apply standard analyzer | Quick Brown Foxes | standard | — |
| 3 | Tokenize text | Quick Brown Foxes | standard | ["quick", "brown", "foxes"] |
| 4 | Return tokens | Quick Brown Foxes | standard | ["quick", "brown", "foxes"] |
| 5 | Display output | Quick Brown Foxes | standard | ["quick", "brown", "foxes"] |
💡 All tokens produced and returned by the analyzer; execution ends.
Variable Tracker
| Variable | Start | After Step 2 | After Step 3 | Final |
|----------|-------|--------------|--------------|-------|
| Input Text | "Quick Brown Foxes" | "Quick Brown Foxes" | "Quick Brown Foxes" | "Quick Brown Foxes" |
| Analyzer | none | "standard" | "standard" | "standard" |
| Tokens | none | none | ["quick", "brown", "foxes"] | ["quick", "brown", "foxes"] |
Key Moments - 2 Insights
Why are the tokens all lowercase even though the input text has uppercase letters?
Because the 'standard' analyzer includes a lowercase token filter, every token it emits is lowercased, as shown at step 3 of the execution table.
What happens if we change the analyzer to 'whitespace'?
The 'whitespace' analyzer splits only on whitespace and applies no lowercasing, so tokens keep their original case; step 3 would then produce ["Quick", "Brown", "Foxes"] instead.
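The contrast between the two analyzers can be sketched in Python as well. These are simplified models (the real whitespace analyzer splits on whitespace and keeps case; the real standard analyzer is Unicode-aware), but they show the key difference:

```python
import re

def analyze_sketch(text, analyzer="standard"):
    # Simplified models of two built-in analyzers:
    # - whitespace: split on whitespace only, keep original case
    # - standard: split on non-word characters, then lowercase
    if analyzer == "whitespace":
        return text.split()
    return [token.lower() for token in re.findall(r"\w+", text)]

print(analyze_sketch("Quick Brown Foxes", "whitespace"))
# → ['Quick', 'Brown', 'Foxes']
print(analyze_sketch("Quick Brown Foxes", "standard"))
# → ['quick', 'brown', 'foxes']
```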
Visual Quiz - 3 Questions
Test your understanding
Looking at the execution table, what tokens are produced at step 3?
A. ["Quick", "Brown", "Foxes"]
B. ["quickbrownfoxes"]
C. ["quick", "brown", "foxes"]
D. ["QuickBrownFoxes"]
💡 Hint
Check the 'Tokens Produced' column at step 3 of the execution table.
At which step does the analyzer actually split the text into tokens?
A. Step 3
B. Step 1
C. Step 2
D. Step 5
💡 Hint
Look for the step labeled 'Tokenize text' in the execution table.
If the input text were 'Quick-Brown Foxes' and the analyzer 'standard', how would the tokens change?
A. ["quick-brown", "foxes"]
B. ["quick", "brown", "foxes"]
C. ["quickbrown", "foxes"]
D. ["Quick-Brown", "Foxes"]
💡 Hint
The 'standard' analyzer splits on punctuation and lowercases tokens, just as it did for the tokens at step 3.
Concept Snapshot
Testing analyzers with _analyze API:
- Send POST request to _analyze with 'text' and 'analyzer'
- Elasticsearch tokenizes and filters text
- Returns tokens array
- Useful to see how text is broken down
- Helps debug indexing and search behavior
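To make the "returns tokens array" point concrete, here is the general shape of an _analyze response and how the token strings are typically extracted. The response below is hand-written to match the documented shape, not captured from a live cluster; a live call would be an HTTP POST of the request body shown earlier to your cluster's `_analyze` endpoint.

```python
# Request body for POST _analyze, as in the execution sample above.
payload = {"analyzer": "standard", "text": "Quick Brown Foxes"}

# Typical shape of an _analyze response: a "tokens" array where each
# entry carries the token text plus its offsets, type, and position.
response = {
    "tokens": [
        {"token": "quick", "start_offset": 0, "end_offset": 5,
         "type": "<ALPHANUM>", "position": 0},
        {"token": "brown", "start_offset": 6, "end_offset": 11,
         "type": "<ALPHANUM>", "position": 1},
        {"token": "foxes", "start_offset": 12, "end_offset": 17,
         "type": "<ALPHANUM>", "position": 2},
    ]
}

# Pull out just the token strings, which is usually all you need
# when debugging indexing or search behavior.
tokens = [t["token"] for t in response["tokens"]]
print(tokens)
# → ['quick', 'brown', 'foxes']
```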
Full Transcript
This visual trace shows how the Elasticsearch _analyze API processes input text. First, the text 'Quick Brown Foxes' is sent with the 'standard' analyzer. The analyzer lowercases and splits the text into tokens 'quick', 'brown', and 'foxes'. These tokens are returned and displayed. The trace highlights key steps: sending request, applying analyzer, tokenizing, and returning tokens. It also clarifies why tokens are lowercase and how changing analyzers affects output.