Overview - Index patterns for time-series

What is it?

In Elasticsearch, an index pattern for time-series data is a wildcard expression, such as logs-2023.*, that matches a group of indexes by name. Each of those indexes holds the data for one time period, such as a day or a month. Searching through the pattern lets you target only the time slices you need instead of all the data at once, which makes time-range queries faster.

Why it matters

Without index patterns for time-series, searching through large amounts of time-based data would be slow and inefficient. Imagine trying to find a single day's weather data in a huge pile of years of records without any order. Index patterns let Elasticsearch quickly narrow down to the right data, saving time and computing power. This is crucial for monitoring systems, logs, or any data that grows continuously over time.

Where it fits

Before learning index patterns, you should understand basic Elasticsearch concepts like indexes, documents, and queries. After mastering index patterns, you can explore advanced topics like index lifecycle management, rollups, and optimizing queries for time-series data.

Mental Model

Core Idea

Index patterns group many time-based indexes so you can search and analyze only the relevant time slices efficiently.

Think of it like...

Think of a library where books are organized by year on different shelves. Instead of searching the entire library, you go directly to the shelf for the year you want. Index patterns are like the labels on those shelves that help you find the right books quickly.

┌─────────────────────────────┐
│      Index Pattern          │
│  (e.g., logs-2023.*)        │
├─────────────────────────────┤
│ Index: logs-2023.01.01      │
│ Index: logs-2023.01.02      │
│ Index: logs-2023.01.03      │
│ Index: logs-2023.01.04      │
└─────────────────────────────┘

Searches using the pattern only query the matching daily indexes.
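As a rough sketch of how a pattern selects indexes, the Python snippet below mimics wildcard matching on index names. It is an illustration, not Elasticsearch code: the index names and the resolve_pattern helper are made up for this example, and Python's fnmatchcase accepts a few wildcard forms (such as ? and [...]) beyond the * used here.

```python
from fnmatch import fnmatchcase


def resolve_pattern(pattern, index_names):
    """Return the index names that match a wildcard pattern, sorted."""
    return sorted(name for name in index_names if fnmatchcase(name, pattern))


# Hypothetical index names, chosen to mirror the diagram above.
indexes = [
    "logs-2023.01.01",
    "logs-2023.01.02",
    "logs-2022.12.31",
    "metrics-2023.01.01",
]

matched = resolve_pattern("logs-2023.*", indexes)
print(matched)  # ['logs-2023.01.01', 'logs-2023.01.02']
```

The 2022 index and the metrics index are skipped purely because of their names; their contents are never inspected.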

Build-Up - 7 Steps

1

Foundation: Understanding Elasticsearch Index Basics

Concept: Learn what an index is and how data is stored in Elasticsearch.

An index in Elasticsearch is like a database table that stores documents. Each document holds data in JSON format. Indexes help organize data for fast searching. For example, a 'logs' index might store all log entries. You can create many indexes to separate data logically.

Result

You understand that an index is a container for documents and that Elasticsearch uses indexes to organize data.

Knowing what an index is helps you see why grouping indexes by time is useful for managing large datasets.

2

Foundation: What is Time-Series Data?

3

Intermediate: Why Use Multiple Indexes for Time-Series?

4

Intermediate: How Index Patterns Work in Elasticsearch

5

Intermediate: Choosing the Right Time Interval for Indexes

6

Advanced: Index Lifecycle Management with Time-Series Patterns

7

Expert: Surprises in Index Pattern Performance and Querying

Under the Hood

Elasticsearch stores data in shards within indexes. Each index is split into one or more shards, and each shard is made up of segments that hold the documents. When you search with an index pattern, Elasticsearch resolves the pattern against the index names it knows from the cluster state, queries the shards of every matching index in parallel, and merges the per-shard results before returning them. In the background, it merges small segments into larger ones to keep search fast.

Why designed this way?

Time-series data grows fast and can become huge. Storing all data in one index would slow searches and make management hard. Splitting by time and using patterns lets Elasticsearch scale horizontally and manage data lifecycle. This design balances flexibility, speed, and resource use.

┌───────────────┐
│ Index Pattern │
│  logs-2023.*  │
└──────┬────────┘
       │ Matches multiple indexes
       ▼
┌─────────────────┐   ┌─────────────────┐   ┌─────────────────┐
│ logs-2023.01.01 │   │ logs-2023.01.02 │   │ logs-2023.01.03 │
│  ┌───────────┐  │   │  ┌───────────┐  │   │  ┌───────────┐  │
│  │ Shard 1   │  │   │  │ Shard 1   │  │   │  │ Shard 1   │  │
│  └───────────┘  │   │  └───────────┘  │   │  └───────────┘  │
└─────────────────┘   └─────────────────┘   └─────────────────┘

Elasticsearch queries all shards in parallel and merges results.
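The query-then-merge flow can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not how Elasticsearch is implemented: the SHARDS data, the scores, and the search_shard and search helpers are all hypothetical, and each index is assumed to have a single shard, as in the diagram.

```python
import heapq
from concurrent.futures import ThreadPoolExecutor

# Hypothetical per-shard hits: index name -> list of (score, doc_id).
SHARDS = {
    "logs-2023.01.01": [(0.9, "a1"), (0.4, "a2")],
    "logs-2023.01.02": [(0.7, "b1"), (0.2, "b2")],
    "logs-2023.01.03": [(0.8, "c1")],
}


def search_shard(index_name, size):
    """Return the top `size` hits from one shard, best score first."""
    return heapq.nlargest(size, SHARDS[index_name])


def search(index_names, size):
    """Query every shard in parallel, then merge into a global top `size`."""
    with ThreadPoolExecutor() as pool:
        per_shard = list(pool.map(lambda name: search_shard(name, size), index_names))
    all_hits = [hit for hits in per_shard for hit in hits]
    return heapq.nlargest(size, all_hits)


top = search(sorted(SHARDS), size=3)
print(top)  # [(0.9, 'a1'), (0.8, 'c1'), (0.7, 'b1')]
```

Each shard only returns its own best hits, so the merge step handles a small number of candidates even when the indexes themselves are large.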

Myth Busters - 4 Common Misconceptions

Quick: Do you think using one big index for all time-series data is faster than many small indexes? Commit yes or no.

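Common Belief: One big index is always faster because it avoids the overhead of multiple indexes.

Reality: Not for time-series data. A single index that keeps growing slows searches and is hard to manage, while time-based indexes let a search touch only the relevant period.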

Quick: Do you think index patterns can match indexes with completely different naming schemes? Commit yes or no.

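Common Belief: Index patterns can match any index regardless of naming if they contain the right data.

Reality: Index patterns match on index names only, never on contents. An index whose name does not fit the pattern is skipped even if it holds the same kind of data.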

Quick: Do you think more indexes always mean better performance? Commit yes or no.

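Common Belief: More indexes always improve performance by narrowing search scope.

Reality: Not always. Every extra index adds to the cluster state, so very high index counts slow cluster operations. The index interval should match data volume and query needs.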

Quick: Do you think deleting documents inside an index is the same as deleting an entire index? Commit yes or no.

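Common Belief: Deleting documents inside an index frees up space immediately, like deleting the whole index.

Reality: Deleted documents are only marked as deleted at first; their space is reclaimed later, when segments are merged. Deleting an entire index removes its data right away, which is why time-based indexes make retention cheap.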

Expert Zone

1

Index patterns depend heavily on consistent naming conventions; even small deviations break pattern matching.

2

Elasticsearch's internal segment merging affects how quickly deleted data space is reclaimed, impacting storage and performance.

3

Cluster state size grows with the number of indexes, so very high index counts can slow cluster operations even if queries are fast.
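To get a feel for how quickly index counts grow, the arithmetic below estimates the number of shards a cluster has to track for one data set. The retention period, the shard settings, and the shard_count helper are assumptions picked for illustration, not recommendations.

```python
def shard_count(retention_days, indexes_per_day, primaries, replicas):
    """Total shards kept in the cluster for one time-series data set."""
    indexes = retention_days * indexes_per_day
    return indexes * primaries * (1 + replicas)


# Assumed settings: 90 days of retention, 1 primary shard and 1 replica per index.
daily = shard_count(90, indexes_per_day=1, primaries=1, replicas=1)
hourly = shard_count(90, indexes_per_day=24, primaries=1, replicas=1)
per_minute = shard_count(90, indexes_per_day=24 * 60, primaries=1, replicas=1)

print(daily, hourly, per_minute)  # 180 4320 259200
```

Under these assumptions, moving from daily to hourly indexes multiplies the shard count by 24 without adding any data.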

When NOT to use

Index patterns for time-series are not ideal when data is not time-based or when data volume is small; in such cases, a single index or other data stores like relational databases may be better.

Production Patterns

In production, teams use daily or hourly indexes for logs, combined with ILM policies to delete or archive old data. They use index patterns in Kibana for dashboards and alerts, and optimize index intervals based on query patterns and storage costs.
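The retention decision itself is simple enough to sketch in Python. This is not ILM and does not call Elasticsearch; it only shows, for daily indexes named like logs-2023.01.01, how one might work out which indexes have aged past a retention window. The expired_indexes helper, the prefix, and the 7-day window are hypothetical.

```python
from datetime import date, datetime, timedelta


def expired_indexes(index_names, today, retention_days, prefix="logs-"):
    """Return daily indexes whose date is older than the retention window."""
    cutoff = today - timedelta(days=retention_days)
    expired = []
    for name in index_names:
        if not name.startswith(prefix):
            continue  # not part of this data set
        try:
            day = datetime.strptime(name[len(prefix):], "%Y.%m.%d").date()
        except ValueError:
            continue  # name does not follow the daily convention; leave it alone
        if day < cutoff:
            expired.append(name)
    return sorted(expired)


names = ["logs-2023.01.01", "logs-2023.01.02", "logs-2023.01.10", "logs-2023-01-03"]
print(expired_indexes(names, today=date(2023, 1, 10), retention_days=7))
# ['logs-2023.01.01', 'logs-2023.01.02']
```

Because each index covers exactly one day, expiring old data means dropping whole indexes rather than deleting individual documents.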

Connections

Partitioning in Relational Databases

Both split large datasets into smaller parts based on a key (like time) to improve query speed and management.

Understanding partitioning helps grasp why Elasticsearch splits time-series data into multiple indexes.

File Organization in Operating Systems

Index patterns are like directory structures organizing files by date to find them quickly.

Knowing how an operating system organizes files into folders helps you understand how index patterns organize data.

Calendar Systems in Time Management

Both use time intervals (days, months) to segment continuous data or events for easier handling.

Recognizing time segmentation in calendars clarifies why time-series data is split by intervals.

Common Pitfalls

#1 Searching all indexes without using an index pattern.

Wrong approach:
GET /_search
{ "query": { "match_all": {} } }

Correct approach:
GET /logs-2023.01.*/_search
{ "query": { "match_all": {} } }

Root cause: A search request with no index in the path targets every index in the cluster, so the query does far more work than a time-scoped search needs.

#2 Using inconsistent index names that break the pattern.

Wrong approach: Indexes named logs-2023-01-01 and log2023.01.02 mixed together with the pattern logs-2023.*

Correct approach: Consistent naming like logs-2023.01.01 and logs-2023.01.02, which both match the pattern logs-2023.*

Root cause: Inconsistent naming prevents index patterns from matching all relevant indexes.

#3 Creating too many tiny indexes (e.g., per minute) without need.

Wrong approach: Indexes like logs-2023.01.01-12.00, logs-2023.01.01-12.01, ... for low-volume data.

Correct approach: Use daily or hourly indexes that match data volume and query needs.

Root cause: Over-splitting increases cluster overhead and slows operations.
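The first two pitfalls both come down to building index names consistently and naming only the indexes you need. As a minimal sketch, assuming daily indexes named like logs-2023.01.01, the helpers below generate the names for a date range and join them into a comma-separated search path. The function names and the prefix are hypothetical, and the snippet only builds a string; it does not send a request.

```python
from datetime import date, timedelta


def daily_indexes(start, end, prefix="logs-"):
    """List daily index names covering start..end, inclusive."""
    days = (end - start).days
    return [f"{prefix}{start + timedelta(days=i):%Y.%m.%d}" for i in range(days + 1)]


def search_path(start, end):
    """Build a search path that targets only the indexes in the date range."""
    return "/" + ",".join(daily_indexes(start, end)) + "/_search"


print(search_path(date(2023, 1, 30), date(2023, 2, 1)))
# /logs-2023.01.30,logs-2023.01.31,logs-2023.02.01/_search
```

Generating names from dates in one place keeps the convention consistent. It also handles ranges that cross a month boundary, which a single pattern like logs-2023.01.* would miss.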

Key Takeaways

Index patterns group many time-based indexes to make searching large time-series data efficient.

Splitting data into multiple indexes by time improves query speed and data management.

Consistent naming conventions are essential for index patterns to work correctly.

Index Lifecycle Management automates data retention and storage optimization for time-series indexes.

Balancing index size and count is key to maintaining Elasticsearch cluster performance.