Percolate queries (reverse search) in Elasticsearch - Deep Dive

Overview - Percolate queries (reverse search)

What is it?

Percolate queries in Elasticsearch let you store queries in an index first and then check which of those stored queries match a new document. Instead of searching documents with a query, you search queries with a document. This reverse search finds every stored query that is interested in a given piece of data. It is useful for alerting, notifications, and matching new data against saved criteria.

Why it matters

Without percolate queries, you would have to run every saved query against each new document yourself, and that cost grows with the number of saved queries. Percolate queries solve this by indexing the queries and quickly finding the ones that match when a new document arrives. This saves time and resources, enabling real-time matching and alerting in applications like monitoring, recommendation, and security.

Where it fits

Before learning percolate queries, you should understand basic Elasticsearch concepts like indexing, documents, and standard queries. After mastering percolate queries, you can explore advanced alerting systems, real-time data processing, and integrating Elasticsearch with event-driven architectures.

Mental Model

Core Idea

Percolate queries flip the usual search: instead of searching documents with queries, you search queries with documents.

Think of it like...

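Imagine a job board where, instead of you searching the job listings, you save a job alert that describes what you want. Each time a new job is posted, the board checks it against all saved alerts to find whose criteria it matches. The saved alerts are the stored queries and the new posting is the document.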

┌───────────────┐       ┌───────────────┐
│ Registered    │       │ New Document  │
│ Queries       │◄──────│ (to match)    │
└───────────────┘       └───────────────┘
        ▲                      │
        │                      ▼
   Percolate Engine  ──────► Matches Queries
        │                      │
        └───────────────► Output: List of matching queries
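The flow in the diagram can be sketched as a toy in-memory model. This is not how Elasticsearch is implemented; the `match` and `percolate` helpers, the query ids, and the sample document are all hypothetical and exist only to show the direction of the lookup: one document in, a list of matching stored queries out.

```python
from typing import Callable, Dict, List

Document = Dict[str, str]
# A "stored query" here is just a predicate over a document
# (a toy stand-in for a real Elasticsearch query).
StoredQuery = Callable[[Document], bool]


def match(field: str, word: str) -> StoredQuery:
    """Toy 'match' query: true if `word` is one of the field's words."""
    def predicate(doc: Document) -> bool:
        return word.lower() in doc.get(field, "").lower().split()
    return predicate


def percolate(registered: Dict[str, StoredQuery], doc: Document) -> List[str]:
    """Return the ids of all registered queries that match `doc`."""
    return sorted(qid for qid, query in registered.items() if query(doc))


# Hypothetical registered queries.
registered_queries = {
    "alert-errors": match("message", "error"),
    "alert-timeouts": match("message", "timeout"),
    "alert-db": match("title", "database"),
}

new_document = {
    "title": "Database node 3",
    "message": "Connection timeout after 30s",
}
matching_ids = percolate(registered_queries, new_document)
print(matching_ids)
```

The result is a list of query ids rather than a list of documents, which is the key difference from a normal search.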

Build-Up - 7 Steps

Foundation: Basic Elasticsearch Query Concept

Concept: Understand how Elasticsearch normally searches documents using queries.

In Elasticsearch, you store documents in an index. To find documents, you write queries that describe what you want. For example, a query might look for documents where the 'title' contains 'database'. Elasticsearch returns documents matching that query.

Result

You get a list of documents matching your search criteria.

Knowing how normal queries work is essential because percolate queries reverse this process.
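As a point of comparison, the 'title' contains 'database' example above corresponds to a standard `match` query. The sketch below only builds the request body as a Python dictionary; the helper name `build_match_search` is made up for illustration, and the body would be sent to the `_search` endpoint of whichever index holds the documents.

```python
import json


def build_match_search(field: str, text: str) -> dict:
    """Build the body of a standard search request using a `match` query."""
    return {"query": {"match": {field: text}}}


search_body = build_match_search("title", "database")
print(json.dumps(search_body, indent=2))
```

Here the query is the input and documents are the output; a percolate request swaps those roles.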

Foundation: What is a Percolate Query?

Intermediate: Setting Up a Percolator Index

Intermediate: Registering Queries in the Percolator

Intermediate: Running a Percolate Query

Advanced: Performance Considerations and Scaling

Expert: Advanced Use Cases and Internals

Under the Hood

Elasticsearch stores each query in a field of the special 'percolator' type. At index time it parses the query and extracts the terms it contains into hidden indexed fields. When a document is percolated, Elasticsearch indexes that document into a temporary in-memory index, uses the extracted terms to shortlist the stored queries that could possibly match, and then runs only those candidate queries against the in-memory document.
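The two-phase idea (cheap candidate selection through an inverted index of query terms, followed by full verification) can be illustrated with a deliberately simplified model. Everything here is an assumption made for the sketch: the `ToyPercolator` class, whitespace tokenization, and the rule that a stored query matches only when all of its terms appear in one field. Elasticsearch's real term extraction and matching are considerably more involved.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

# Toy stored query: every listed term must appear in the given field.
StoredQuery = Tuple[str, List[str]]  # (field, required terms)


def tokenize(text: str) -> Set[str]:
    """Lowercase and split on whitespace (a stand-in for a real analyzer)."""
    return set(text.lower().split())


class ToyPercolator:
    def __init__(self) -> None:
        self.queries: Dict[str, StoredQuery] = {}
        # Inverted index: (field, term) -> ids of queries that mention the term.
        self.term_index: Dict[Tuple[str, str], Set[str]] = defaultdict(set)

    def register(self, query_id: str, field: str, terms: List[str]) -> None:
        normalized = [t.lower() for t in terms]
        self.queries[query_id] = (field, normalized)
        for term in normalized:
            self.term_index[(field, term)].add(query_id)

    def candidates(self, doc: Dict[str, str]) -> Set[str]:
        """Phase 1: queries that share at least one term with the document."""
        found: Set[str] = set()
        for field, text in doc.items():
            for term in tokenize(text):
                found |= self.term_index.get((field, term), set())
        return found

    def percolate(self, doc: Dict[str, str]) -> List[str]:
        """Phase 2: fully verify each candidate against the document."""
        matches = []
        for query_id in self.candidates(doc):
            field, terms = self.queries[query_id]
            if set(terms) <= tokenize(doc.get(field, "")):
                matches.append(query_id)
        return sorted(matches)


percolator = ToyPercolator()
percolator.register("disk-full", "message", ["disk", "full"])
percolator.register("disk-slow", "message", ["disk", "slow"])
percolator.register("login-failed", "message", ["login", "failed"])

doc = {"message": "Disk almost full on node 7"}
candidate_ids = sorted(percolator.candidates(doc))
matched_ids = percolator.percolate(doc)
print(candidate_ids, matched_ids)
```

In this run the `login-failed` query is never evaluated at all, which is the point of the shortlist: most stored queries are skipped without being executed.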

Why designed this way?

Percolate queries were designed to invert the search process for efficiency in real-time matching scenarios. Traditional search indexes documents for queries; percolate indexes queries for documents. This design allows fast matching of many queries against incoming data, which is essential for alerting and notification systems.

┌───────────────┐      ┌───────────────┐      ┌───────────────┐
│ Stored Queries│─────▶│ Query Index   │─────▶│ Matching      │
│ (percolator)  │      │ (inverted)    │      │ Engine        │
└───────────────┘      └───────────────┘      └───────────────┘
                                                   ▲
                                                   │
                                           ┌───────────────┐
                                           │ New Document  │
                                           └───────────────┘

Myth Busters - 4 Common Misconceptions

Quick: Do you think percolate queries search documents with queries like normal? Commit yes or no.

Common Belief: Percolate queries work like normal queries but just with a different name.

Reality: They run in the opposite direction. You supply a document and get back the stored queries that match it.

Quick: Do you think you can store any kind of query in a percolator index? Commit yes or no.

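Common Belief: Any Elasticsearch query can be stored and percolated without restrictions.

Reality: Only queries that can be evaluated against a single document can be percolated. Parent/child queries such as has_child and has_parent, for example, are not supported.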

Quick: Do you think percolate queries scale easily to millions of stored queries? Commit yes or no.

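Common Belief: Percolate queries scale linearly and can handle millions of stored queries without issue.

Reality: Performance depends on the number and complexity of the stored queries, so large query sets need deliberate design and optimization.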

Quick: Do you think percolate queries return the matching documents? Commit yes or no.

Common Belief: Percolate queries return documents that match the stored queries.

Reality: They return the stored queries that match the supplied document.

Expert Zone

Percolate queries internally rewrite stored queries into a normalized form for efficient matching, which affects how complex queries behave.

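The field that holds stored queries must be mapped with the 'percolator' type; it cannot double as a normal text field under the same name. Any document field that a stored query refers to must also be mapped in the same index.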

Wrapping a percolate query in a bool query, with filters on metadata stored alongside each query, lets you build richer matching rules and can improve performance by reducing the number of candidate queries.
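One way to combine a percolate query with a filter is sketched below. It assumes each stored query document also carries a hypothetical `owner` keyword field next to the `query` field; the helper name and the field are illustrative and do not come from any particular schema.

```python
import json


def build_filtered_percolate(document: dict, owner: str) -> dict:
    """Percolate `document`, but only against stored queries owned by `owner`."""
    return {
        "query": {
            "bool": {
                # The reverse-search part: match stored queries against the document.
                "must": [{"percolate": {"field": "query", "document": document}}],
                # Hypothetical metadata filter that narrows the candidate queries.
                "filter": [{"term": {"owner": owner}}],
            }
        }
    }


filtered_body = build_filtered_percolate({"title": "error in payment service"}, "team-payments")
print(json.dumps(filtered_body, indent=2))
```

Because the filter does not contribute to scoring, it restricts which stored queries are considered without changing how the remaining ones are ranked.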

When NOT to use

Percolate queries are a poor fit when the matching logic cannot be expressed as a query evaluated against a single document, for example logic that depends on aggregations across many documents, on parent/child relationships, or on heavy scripting. In such cases, consider using external processing or custom application logic. Also, if you have very few queries or documents, normal search might be simpler and more efficient.

Production Patterns

In production, percolate queries are used for real-time alerting systems, such as monitoring logs for error patterns, matching user profiles to notifications, or filtering incoming data streams. They are often combined with message queues and event-driven architectures to trigger actions when matches occur.
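A minimal sketch of the queue-driven pattern described above follows. The in-process `queue.Queue` stands in for a real message queue, and `fake_percolate` stands in for a percolate search against Elasticsearch; both, along with `run_alert_worker` and the `error-alert` id, are hypothetical names used only for illustration.

```python
import queue
from typing import Callable, List


def run_alert_worker(
    incoming: "queue.Queue[dict]",
    percolate: Callable[[dict], List[str]],
    notify: Callable[[str, dict], None],
) -> int:
    """Drain the queue, percolate each document, notify once per matching query."""
    sent = 0
    while True:
        try:
            document = incoming.get_nowait()
        except queue.Empty:
            return sent
        for query_id in percolate(document):
            notify(query_id, document)
            sent += 1


def fake_percolate(document: dict) -> List[str]:
    # Stand-in for a real percolate search; applies one hard-coded rule.
    return ["error-alert"] if "error" in document.get("message", "").lower() else []


events: "queue.Queue[dict]" = queue.Queue()
events.put({"message": "Disk error on node 2"})
events.put({"message": "Heartbeat ok"})

notifications = []
sent_count = run_alert_worker(
    events,
    fake_percolate,
    lambda query_id, doc: notifications.append((query_id, doc["message"])),
)
print(sent_count, notifications)
```

Passing the percolate and notify steps in as callables keeps the worker independent of any specific client library or notification channel.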

Connections

Event-driven Architecture

Percolate queries enable real-time matching which triggers events in event-driven systems.

Understanding percolate queries helps design systems that react instantly to new data by firing events based on matching criteria.

Pattern Matching in Functional Programming

Both involve checking data against a set of patterns or rules to find matches.

Knowing how pattern matching works in programming clarifies how percolate queries match documents against stored query patterns.

Inverted Indexing in Information Retrieval

Percolate queries rely on an inverted index built from the terms of stored queries, similar to how search engines index the terms of documents.

Understanding inverted indexes explains how Elasticsearch efficiently finds matching queries for a document.

Common Pitfalls

#1 Mapping the field that holds stored queries as a normal 'text' field instead of the 'percolator' type.

Wrong approach: { "mappings": { "properties": { "query": { "type": "text" } } } }

Correct approach: { "mappings": { "properties": { "query": { "type": "percolator" } } } }

Root cause: Confusing normal text fields with the special 'percolator' field type needed to store queries.
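A slightly fuller mapping than the one-field version is sketched below, on the assumption that stored queries will refer to the `title` and `status` fields used elsewhere in this section. The field types chosen for them (`text` and `keyword`) and the helper name are illustrative guesses, not requirements.

```python
import json


def build_percolator_index_body() -> dict:
    """Index body with a percolator field plus the fields stored queries refer to."""
    return {
        "mappings": {
            "properties": {
                # Holds the stored queries themselves.
                "query": {"type": "percolator"},
                # Fields referenced by stored queries (assumed for this sketch).
                "title": {"type": "text"},
                "status": {"type": "keyword"},
            }
        }
    }


index_body = build_percolator_index_body()
print(json.dumps(index_body, indent=2))
```

Mapping the referenced fields up front matters because a stored query is parsed when it is indexed, and that parsing depends on the mappings of the fields it mentions.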

#2 Sending a query instead of a document in the percolate query request.

Wrong approach: { "query": { "percolate": { "field": "query", "document": { "query": { "match": { "title": "error" } } } } } }

Correct approach: { "query": { "percolate": { "field": "query", "document": { "title": "error" } } } }

Root cause: Misunderstanding that the percolate query expects a document to match against stored queries, not another query.
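The two request shapes are easy to mix up, so it can help to build them with separate helpers. The sketch below does that and adds a small client-side guard against this pitfall. The helper names and the guard are hypothetical conveniences, not part of any Elasticsearch client.

```python
def build_stored_query(query: dict, field: str = "query") -> dict:
    """Body of the document you index to register a query."""
    return {field: query}


def build_percolate_request(document: dict, field: str = "query") -> dict:
    """Body of the search request that percolates `document`."""
    if field in document:
        # A document carrying the percolator field is almost certainly
        # a stored query passed in by mistake.
        raise ValueError(f"'{field}' found in document; pass a plain document, not a query")
    return {"query": {"percolate": {"field": field, "document": document}}}


stored = build_stored_query({"match": {"title": "error"}})
request = build_percolate_request({"title": "error"})
print(stored)
print(request)
```

The stored query is indexed like any other document, while the percolate request is sent as a search against the same index.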

#3 Storing something that is not a query, such as an aggregation, in the percolator field.

Wrong approach: { "query": { "aggs": { "avg_price": { "avg": { "field": "price" } } } } }

Correct approach: { "query": { "match": { "status": "error" } } }

Root cause: Not knowing that the percolator field accepts only query clauses that can be evaluated against a single document. Aggregations summarize many documents and are not queries.
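A lightweight client-side check can catch this mistake before the request is sent. The sketch below is a hypothetical pre-flight helper, not a replacement for Elasticsearch's own validation: it only verifies that the stored value is a single clause and that the clause name is not one of a few well-known search request sections.

```python
from typing import Any, Dict

# Top-level sections of a search request that are not query clauses.
SEARCH_SECTIONS = {"aggs", "aggregations", "sort", "size", "from", "highlight"}


def check_stored_query(source: Dict[str, Any], field: str = "query") -> str:
    """Return the clause name if `source[field]` looks like a query, else raise."""
    query = source.get(field)
    if not isinstance(query, dict) or len(query) != 1:
        raise ValueError(f"'{field}' must hold exactly one query clause")
    (clause,) = query  # the single top-level key, e.g. "match"
    if clause in SEARCH_SECTIONS:
        raise ValueError(f"'{clause}' is a search request section, not a query clause")
    return clause


clause_name = check_stored_query({"query": {"match": {"status": "error"}}})
print(clause_name)
```

The check deliberately avoids an allow-list of query types, since that list would be an assumption about the cluster's version and plugins.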

Key Takeaways

Percolate queries reverse the usual search process by matching stored queries against new documents.

They require a special index mapping with a 'percolator' field type to store queries as documents.

You send a document to the percolate query to find which stored queries match it, enabling real-time alerting and notifications.

Performance depends on the number and complexity of stored queries, so design and optimization are important for scaling.

Understanding percolate queries unlocks powerful use cases in monitoring, recommendation, and event-driven systems.
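To make the "queries out, not documents" point concrete, the sketch below pulls the matching query ids out of a percolate search response. The `sample_response` is hand-written to resemble the usual search response shape; the index name, id, and score in it are made up, and `matching_query_ids` is a hypothetical helper.

```python
from typing import List

# Hand-written sample shaped like a search response (values are invented).
sample_response = {
    "hits": {
        "total": {"value": 1, "relation": "eq"},
        "hits": [
            {
                "_index": "alerts",
                "_id": "error-alert",
                "_score": 0.13,
                # Each hit is a stored query, not a matching document.
                "_source": {"query": {"match": {"title": "error"}}},
                # Position of the percolated document that this query matched.
                "fields": {"_percolator_document_slot": [0]},
            }
        ],
    }
}


def matching_query_ids(response: dict) -> List[str]:
    """Ids of the stored queries that matched the percolated document."""
    return [hit["_id"] for hit in response.get("hits", {}).get("hits", [])]


matched = matching_query_ids(sample_response)
print(matched)
```

An alerting system would typically look up each returned id to decide whom to notify.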

Practice

(1/5)

What is the main purpose of a percolate query in Elasticsearch?

easy

A. To find stored queries that match a new document

B. To update documents in an index

C. To delete documents based on a condition

D. To aggregate data by terms

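Solution

Step 1: Understand percolate query concept

A percolate query takes a document and checks it against the queries stored in the index.

Step 2: Compare options with concept

Options B, C, and D describe updating, deleting, and aggregating, none of which is reverse search. Only option A describes matching stored queries against a new document.

Final Answer: A. To find stored queries that match a new document

Quick Check: Percolate means a document goes in and the matching stored queries come out.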
