Overview - Text vs keyword field types

What is it?

In Elasticsearch, string fields in documents can be mapped as either text or keyword types. Text fields are used for full-text search, where the content is analyzed and broken into words. Keyword fields index exact values without analysis, which makes them useful for filtering, sorting, and aggregations. Understanding the difference helps you choose the right field type for your search needs.

Why it matters

Without knowing the difference, you might store data in a way that makes searching slow or inaccurate. For example, an exact-match query on a text field can fail because the indexed value has been split into separate tokens. This can cause wrong search results or inefficient queries, hurting user experience and system performance.

Where it fits

Before this, you should understand basic Elasticsearch concepts like documents, fields, and indexing. After this, you can learn about analyzers, mappings, and how to optimize search queries for performance and relevance.

Mental Model

Core Idea

Text fields break content into searchable words for full-text search, while keyword fields store exact values for precise matching and sorting.

Think of it like...

Think of text fields like a book index that breaks down topics into words to find pages easily, and keyword fields like a library catalog number that points exactly to one book without breaking it down.

┌───────────────┐       ┌───────────────┐
│   Text Field  │──────▶│ Analyzed into │
│ (Full-text)   │       │  words/tokens │
└───────────────┘       └───────────────┘
         │                        │
         ▼                        ▼
┌───────────────┐       ┌───────────────┐
│ Keyword Field │──────▶│ Stored as is  │
│ (Exact match) │       │ (No analysis) │
└───────────────┘       └───────────────┘

Build-Up - 7 Steps

1

Foundation: What is a Text Field?

Concept: Text fields store strings that are analyzed for full-text search.

A text field takes the input string and breaks it into smaller parts called tokens or words. For example, the sentence 'Fast cars are cool' becomes ['fast', 'cars', 'are', 'cool']. This allows Elasticsearch to find documents matching any of these words when you search.

Result

You can search for any word in the text and find matching documents, even if the exact phrase isn't typed.

Understanding that text fields break content into words explains why they are great for searching by individual words or parts of a text.
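The tokenization described in this step can be approximated in a few lines of Python. This is only a rough sketch, not Elasticsearch's actual analyzer: `analyze` is a hypothetical helper name, and it merely imitates the lowercase-and-split behavior shown in the example sentence.

```python
import re

def analyze(text: str) -> list[str]:
    """Toy stand-in for an analyzer: lowercase the input and split it into word tokens."""
    # \w+ grabs runs of letters, digits, and underscores; punctuation is dropped.
    return re.findall(r"\w+", text.lower())

tokens = analyze("Fast cars are cool")
print(tokens)  # ['fast', 'cars', 'are', 'cool']

# A search for a single word matches because that word is one of the stored tokens.
print("cars" in tokens)  # True
```

Real analyzers are configurable and can do more (stemming, stop words, synonyms), so treat the output here as an illustration of the idea only.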

2

Foundation: What is a Keyword Field?

3

Intermediate: How Text Fields Are Analyzed

4

Intermediate: When to Use Keyword Fields

5

Intermediate: Multi-fields: Combining Text and Keyword

6

Advanced: Impact on Performance and Storage

7

Expert: Surprises in Keyword Field Limits

Under the Hood

When indexing, Elasticsearch processes text fields through analyzers that typically tokenize, lowercase, and filter the input, creating an inverted index that maps tokens to documents. Keyword fields skip analysis: the whole string is indexed as a single term for exact-match queries and filters, and by default it is also written to doc values, a columnar structure used for sorting and aggregations.

Why designed this way?

This design balances flexibility and performance. Full-text search needs tokenization for relevance and partial matches, while exact matches require fast, precise lookups. Separating these types avoids slowing down queries and keeps storage efficient.

┌───────────────┐        ┌───────────────┐
│ Input String  │        │ Input String  │
└──────┬────────┘        └──────┬────────┘
       │                        │
       ▼                        ▼
┌───────────────┐        ┌───────────────┐
│ Analyzer      │        │ No Analyzer   │
│ (tokenizes)   │        │ (stores raw)  │
└──────┬────────┘        └──────┬────────┘
       │                        │
       ▼                        ▼
┌───────────────┐        ┌───────────────┐
│ Inverted Index│        │ Exact Storage │
│ (tokens → doc)│        │ (keyword data)│
└───────────────┘        └───────────────┘
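The two indexing paths in the diagram can be modeled with a toy inverted index. This is a simplified sketch, not how Lucene stores data: `analyze` and `build_index` are hypothetical helpers, and the sample documents are made up. Analyzed values contribute one entry per token, while unanalyzed values contribute a single entry for the whole string.

```python
import re
from collections import defaultdict

def analyze(text):
    """Toy analyzer: lowercase and split into word tokens."""
    return re.findall(r"\w+", text.lower())

def build_index(docs, analyzed):
    """Map each term to the set of document IDs that contain it."""
    index = defaultdict(set)
    for doc_id, value in docs.items():
        # Text path: one term per token. Keyword path: the whole string is one term.
        terms = analyze(value) if analyzed else [value]
        for term in terms:
            index[term].add(doc_id)
    return index

docs = {1: "Elasticsearch Basics", 2: "Advanced Elasticsearch"}
text_index = build_index(docs, analyzed=True)
keyword_index = build_index(docs, analyzed=False)

print(sorted(text_index["elasticsearch"]))                    # [1, 2]
print(sorted(keyword_index["Elasticsearch Basics"]))          # [1]
print(sorted(text_index.get("Elasticsearch Basics", set())))  # []
```

The last lookup returns nothing because the exact string never appears as a term in the analyzed index, which is the behavior the pitfalls later in this page build on.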

Myth Busters - 4 Common Misconceptions

Quick: Do you think searching a keyword field finds partial matches inside the string? Commit yes or no.

Common Belief: Keyword fields support partial matching like text fields do.

Quick: Do you think text fields store the original string exactly as is? Commit yes or no.

Common Belief: Text fields keep the original string intact for retrieval and search.

Quick: Do you think keyword fields can store very long strings without limits? Commit yes or no.

Common Belief: Keyword fields can store strings of any length without issue.

Quick: Do you think multi-fields duplicate data and waste storage? Commit yes or no.

Common Belief: Using multi-fields to store both text and keyword versions wastes a lot of storage.

Expert Zone

1

Keyword fields are case-sensitive by default, so filtering on 'Active' vs 'active' differs unless normalized.

2

Text fields can use custom analyzers to control tokenization, which strongly affects search precision and recall.

3

Multi-fields allow different analyzers on the same data, enabling complex search patterns without sending the value twice in the document. Each subfield is still indexed separately, so it adds some index size.
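The case-sensitivity point in item 1 is usually handled with a normalizer on the keyword field. The request body below is a minimal sketch built as a Python dict: the normalizer name `lowercase_normalizer` and the field name `status` are made-up examples, and the small `normalize` function only imitates what a lowercase filter would do.

```python
import json

# Hypothetical index-creation body; names are illustrative, not from a real index.
index_body = {
    "settings": {
        "analysis": {
            "normalizer": {
                "lowercase_normalizer": {
                    "type": "custom",
                    "filter": ["lowercase"],
                }
            }
        }
    },
    "mappings": {
        "properties": {
            "status": {
                "type": "keyword",
                "normalizer": "lowercase_normalizer",
            }
        }
    },
}

print(json.dumps(index_body, indent=2))

def normalize(value: str) -> str:
    """Toy imitation of the lowercase filter applied to the whole keyword value."""
    return value.lower()

# With normalization, both spellings map to the same indexed term.
print(normalize("Active") == normalize("active"))  # True
```

Unlike an analyzer, a normalizer keeps the value as a single term, so exact matching, sorting, and aggregations still work on the whole string.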

When NOT to use

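Avoid using text fields for exact filtering, sorting, or aggregations: their values are tokenized, so exact filters are imprecise, and sorting or aggregating on them is disabled by default because it requires loading costly fielddata into memory. Use keyword fields instead. Conversely, do not use keyword fields for full-text search or relevance ranking; use text fields with full-text queries such as 'match'.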

Production Patterns

In production, it’s common to define fields as multi-fields with a text type for search and a keyword subfield for filtering and sorting. This pattern balances flexibility and performance. Also, adjusting keyword length limits and analyzer settings is standard to handle real-world data.
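A minimal sketch of this multi-field pattern, written as Python dicts that could be serialized into request bodies. The field name `title` and the search strings are illustrative assumptions; the `ignore_above` value of 256 is just an example limit, to be sized for the actual data.

```python
import json

# Mapping: "title" is analyzed for search; "title.keyword" keeps the exact value.
mapping = {
    "mappings": {
        "properties": {
            "title": {
                "type": "text",
                "fields": {
                    "keyword": {"type": "keyword", "ignore_above": 256}
                },
            }
        }
    }
}

# Full-text search on the text field, sorted by the exact value in the subfield.
search_body = {
    "query": {"match": {"title": "elasticsearch basics"}},
    "sort": [{"title.keyword": {"order": "asc"}}],
}

# Exact filter on the keyword subfield (no scoring needed, so it goes in "filter").
filter_body = {
    "query": {
        "bool": {
            "filter": [{"term": {"title.keyword": "Elasticsearch Basics"}}]
        }
    }
}

for body in (mapping, search_body, filter_body):
    print(json.dumps(body))
```

Keeping the analyzed field and the exact subfield under one name means clients send the value once and pick `title` or `title.keyword` depending on the kind of query.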

Connections

Inverted Index

Text fields build inverted indexes from tokens; keyword fields index each exact value as a single term, without tokenization.

Understanding inverted indexes clarifies why text fields support full-text search and keyword fields do not.

Data Normalization

Text field analyzers normalize data (lowercase, remove punctuation); keyword fields do not, unless a normalizer is configured.

Knowing normalization helps explain why searches on text fields are case-insensitive but keyword filters are case-sensitive.

Library Cataloging Systems

Keyword fields act like catalog numbers for exact identification, text fields like subject indexes for searching topics.

This cross-domain link shows how organizing information for exact lookup versus flexible search is a common challenge.

Common Pitfalls

#1 Filtering on a text field expecting exact matches.

Wrong approach: GET /books/_search { "query": { "term": { "title": "Elasticsearch Basics" } } }

Correct approach: GET /books/_search { "query": { "term": { "title.keyword": "Elasticsearch Basics" } } }

Root cause: A term query is not analyzed, so it looks for the literal term "Elasticsearch Basics", but the text field indexed the lowercase tokens "elasticsearch" and "basics". The keyword subfield, provided the mapping defines one, holds the whole original string, so the term query matches there.

#2 Using keyword fields for full-text search queries.

Wrong approach: GET /articles/_search { "query": { "match": { "content.keyword": "fast cars" } } }

Correct approach: GET /articles/_search { "query": { "match": { "content": "fast cars" } } }

Root cause: Keyword fields are not tokenized, so a match query on one compares the whole query string against the whole stored value. It only finds documents whose content is exactly "fast cars", not documents that merely contain those words.
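The difference at query time can be simulated with a toy matcher. This is a sketch under simplifying assumptions: `analyze` and `match` are hypothetical helpers, the matcher treats the query as "any term matches", and real scoring is ignored entirely.

```python
import re

def analyze(text):
    """Toy analyzer: lowercase and split into word tokens."""
    return re.findall(r"\w+", text.lower())

def match(query, stored_terms, analyzed):
    """Toy 'match': True if any query term is among the stored terms."""
    # On a text field the query is analyzed too; on a keyword field it stays one term.
    query_terms = analyze(query) if analyzed else [query]
    return any(term in stored_terms for term in query_terms)

value = "Fast cars are cool"
text_terms = set(analyze(value))  # what the text field would index
keyword_terms = {value}           # what the keyword subfield would index

print(match("fast cars", text_terms, analyzed=True))               # True
print(match("fast cars", keyword_terms, analyzed=False))           # False
print(match("Fast cars are cool", keyword_terms, analyzed=False))  # True
```

Only the full, exactly cased string matches on the keyword side, which is why a full-text query there appears to find nothing.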

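#3 Ignoring keyword field length limits, so long values silently drop out of the index.

Wrong approach: Relying on dynamic mapping, which adds a keyword subfield with a low limit: "tags": { "type": "text", "fields": { "keyword": { "type": "keyword", "ignore_above": 256 } } }

Correct approach: Explicit mapping with a limit sized for the data: "tags": { "type": "keyword", "ignore_above": 512 }

Root cause: Strings longer than ignore_above are not truncated; they are skipped when the keyword field is indexed. The value stays in _source but is invisible to term queries, sorting, and aggregations on that field. Dynamic mapping sets ignore_above to 256 on the keyword subfield, whereas an explicitly mapped keyword field has no such limit by default unless you set one.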
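The effect of `ignore_above` can be simulated in a few lines. This is a toy model: `index_keyword` is a hypothetical helper, and it only captures the length check, not the fact that Elasticsearch still keeps the original value in `_source`.

```python
def index_keyword(value: str, ignore_above: int):
    """Return the term that would be indexed, or None if the value is skipped."""
    # The limit is compared against the character count of the whole string.
    if len(value) > ignore_above:
        return None  # not indexed: term queries and aggregations will not see it
    return value

short_tag = "search"
long_tag = "x" * 300

print(index_keyword(short_tag, 256))          # search
print(index_keyword(long_tag, 256) is None)   # True
print(index_keyword(long_tag, 512) is None)   # False
```

Raising the limit from 256 to 512 is enough for the 300-character value here; for real data, the limit should be chosen from the longest value you expect to filter or aggregate on.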

Key Takeaways

Text fields are designed for full-text search by breaking content into searchable words.

Keyword fields store exact values for precise filtering, sorting, and aggregations.

Multi-fields let you store the same data as both text and keyword for flexible queries.

Choosing the right field type affects search accuracy, performance, and storage.

Understanding analyzer behavior and keyword limits prevents common search mistakes.