GIN index for arrays and JSONB in PostgreSQL - Step-by-Step Execution

Concept Flow - GIN index for arrays and JSONB
Start: Create GIN index
PostgreSQL scans array/JSONB
Extract elements/keys
Store elements in GIN index
Query uses GIN index
Fast search for elements/keys
Return matching rows
This flow shows how PostgreSQL creates a GIN index on arrays or JSONB by extracting elements and storing them for fast search.
Execution Sample
PostgreSQL
CREATE TABLE products(id SERIAL PRIMARY KEY, tags TEXT[]);
CREATE INDEX idx_tags_gin ON products USING GIN(tags);

INSERT INTO products(tags) VALUES
('{red,large}'),
('{blue,small}');

SELECT * FROM products WHERE tags @> '{red}';
Create a GIN index on an array column, then query for rows whose tags contain 'red'.
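To check whether the sample query actually goes through the index, the plan can be inspected with EXPLAIN. This is a minimal sketch, assuming the products table and idx_tags_gin index from the sample above. On a two-row table the planner will usually choose a sequential scan, so the sketch discourages sequential scans for the current session only; that setting is a demonstration aid, not something to leave on in production.

```sql
-- Discourage sequential scans for this session so the index path is visible
-- even on a tiny table (demonstration only).
SET enable_seqscan = off;

-- Show the plan without running the query.
EXPLAIN
SELECT * FROM products WHERE tags @> '{red}';

-- Restore the default planner behaviour afterwards.
RESET enable_seqscan;
```

If the index is chosen, the plan should typically show a Bitmap Heap Scan on products fed by a Bitmap Index Scan on idx_tags_gin.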
Execution Table
| Step | Action | Index State | Query Condition | Result Rows |
|------|--------|-------------|-----------------|-------------|
| 1 | Create table with array column | No index yet | - | - |
| 2 | Create GIN index on tags | Index ready but empty | - | - |
| 3 | Insert row with tags {"red", "large"} | Index stores 'red', 'large' for row 1 | - | - |
| 4 | Insert row with tags {"blue", "small"} | Index stores 'blue', 'small' for row 2 | - | - |
| 5 | Query WHERE tags @> '{"red"}' | Index used to find rows with 'red' | tags contains 'red' | Row 1 |
| 6 | Query WHERE tags @> '{"small"}' | Index used to find rows with 'small' | tags contains 'small' | Row 2 |
| 7 | Query WHERE tags @> '{"green"}' | Index used but no match | tags contains 'green' | No rows |
💡 Each query ends once the index lookup has returned the matching rows, or has established that there are none.
Variable Tracker
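| Variable | Start | After Step 3 | After Step 4 | After Step 5 | Final |
|----------|-------|--------------|--------------|--------------|-------|
| Index Content | Empty | {'red': [1], 'large': [1]} | {'red': [1], 'large': [1], 'blue': [2], 'small': [2]} | Unchanged; read to find rows with 'red' | Unchanged; read to find matching rows or none |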
Key Moments - 3 Insights
Why does the GIN index store individual elements instead of whole arrays?
Because GIN indexes each element separately, it can quickly find rows containing specific elements without scanning entire arrays, as shown in execution_table rows 3 and 4.
How does the query use the GIN index to speed up searches?
The query condition uses the index to directly find rows containing the searched element, avoiding full table scans, as seen in execution_table rows 5 and 6.
What happens if the searched element is not in any array?
The index quickly shows no matching rows, so the query returns empty results without scanning the table, as in execution_table row 7.
Visual Quiz - 3 Questions
Test your understanding
Look at the execution_table, what elements does the index store after inserting the second row?
A. 'red', 'large', 'blue', 'small'
B. 'red', 'large' only
C. 'blue', 'small' only
D. Empty index
💡 Hint
Check the 'Index Content' variable after Step 4 in variable_tracker.
At which step does the query find rows containing the 'red' tag?
A. Step 3
B. Step 5
C. Step 6
D. Step 7
💡 Hint
Look at the 'Query Condition' and 'Result Rows' columns in execution_table.
If we query for tags containing 'green', what will the result be according to the execution_table?
A. Row 1
B. Row 2
C. No rows
D. All rows
💡 Hint
See the last row in execution_table where no match is found.
Concept Snapshot
GIN index stores each element of arrays or JSONB separately.
It allows fast searches for elements inside these complex types.
Create with: CREATE INDEX idx ON table USING GIN(column);
Queries use operators like @> to find elements.
Index avoids full scans by quickly locating matching rows.
Full Transcript
This visual execution trace shows how PostgreSQL creates and uses a GIN index for arrays and JSONB data. First, a table with an array column is created. Then a GIN index is built on that column. When rows are inserted, the index stores each element separately with references to the row. Queries using conditions like @> search the index for matching elements and return only the relevant rows. If no matching element is found, the index lookup returns no results without scanning the whole table. This makes searching inside arrays or JSONB much faster and more efficient.
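The trace above uses an array column; the same pattern applies to JSONB. The following is a minimal sketch, assuming a hypothetical events table with a payload column (neither name appears in the original sample). It uses the default operator class, which supports both containment (@>) and key existence (?).

```sql
-- Hypothetical table for illustration only.
CREATE TABLE events (id SERIAL PRIMARY KEY, payload JSONB);
CREATE INDEX idx_events_payload_gin ON events USING GIN (payload);

INSERT INTO events (payload) VALUES
  ('{"type": "click", "tags": ["red", "large"]}'),
  ('{"type": "view",  "tags": ["blue", "small"]}');

-- Containment: rows whose payload includes this key/value pair (row 1).
SELECT id FROM events WHERE payload @> '{"type": "click"}';

-- Containment inside a nested array (row 2).
SELECT id FROM events WHERE payload @> '{"tags": ["blue"]}';

-- Key existence at the top level (both rows).
SELECT id FROM events WHERE payload ? 'type';
```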

Practice

1. What is the main purpose of a GIN index in PostgreSQL when used with arrays or JSONB columns?
easy
A. To speed up searches for specific elements inside arrays or JSONB data
B. To compress the data stored in arrays or JSONB columns
C. To automatically update array or JSONB data when rows change
D. To enforce uniqueness on array or JSONB columns

Solution

  1. Step 1: Understand GIN index purpose

    GIN indexes are designed to speed up searches inside complex data types like arrays and JSONB by indexing their elements.
  2. Step 2: Compare options

    Options B, C, and D describe compression, automatic updates, and uniqueness enforcement, which are not the main roles of GIN indexes.
  3. Final Answer:

    To speed up searches for specific elements inside arrays or JSONB data -> Option A
  4. Quick Check:

    GIN index purpose = speed up element search [OK]
Hint: GIN indexes speed up element searches inside arrays/JSONB [OK]
Common Mistakes:
  • Confusing GIN with data compression
  • Thinking GIN enforces uniqueness
  • Assuming GIN auto-updates data
2. Which of the following creates a GIN index with the default operator class (jsonb_ops) on a JSONB column named data in a table items?
easy
A. CREATE INDEX idx_data ON items USING HASH (data);
B. CREATE INDEX idx_data ON items USING GIN (data);
C. CREATE INDEX idx_data ON items USING GIN (data jsonb_path_ops);
D. CREATE INDEX idx_data ON items USING BTREE (data);

Solution

  1. Step 1: Identify correct index type for JSONB

    GIN indexes are created with USING GIN and applied directly to the JSONB column. When no operator class is named, PostgreSQL uses the default, jsonb_ops.
  2. Step 2: Compare options

    Options A and D build HASH and BTREE indexes, not GIN. CREATE INDEX idx_data ON items USING GIN (data jsonb_path_ops); is valid syntax, but it selects the non-default jsonb_path_ops operator class, which supports @> but not the key-existence operators (?, ?|, ?&). CREATE INDEX idx_data ON items USING GIN (data); is the statement that uses the default operator class.
  3. Final Answer:

    CREATE INDEX idx_data ON items USING GIN (data); -> Option B
  4. Quick Check:

    Default GIN index syntax = CREATE INDEX idx_data ON items USING GIN (data); [OK]
Hint: Use 'USING GIN (column)' to create GIN index on JSONB [OK]
Common Mistakes:
  • Using BTREE or HASH instead of GIN
  • Assuming an operator class such as jsonb_path_ops must always be named
  • Missing USING keyword
3. Given the table products with a JSONB column tags and a GIN index on tags, what will the following query return?
SELECT id FROM products WHERE tags @> '["organic"]';
medium
A. All product ids where the tags array contains the element 'organic'
B. All product ids where the tags array is exactly '["organic"]'
C. All product ids where the tags array contains any element
D. Syntax error due to incorrect JSONB operator

Solution

  1. Step 1: Understand the JSONB containment operator @>

    The operator @> checks if the left JSONB contains the right JSONB. Here, it checks if tags contains the element 'organic'.
  2. Step 2: Analyze the query result

    The query returns the ids of all products whose tags array includes 'organic' anywhere. The array does not have to equal ["organic"] exactly, and merely having some element is not enough.
  3. Final Answer:

    All product ids where the tags array contains the element 'organic' -> Option A
  4. Quick Check:

    tags @> '["organic"]' means contains 'organic' [OK]
Hint: Use @> to check if JSONB contains specific element [OK]
Common Mistakes:
  • Thinking @> means exact match
  • Confusing @> with existence of any element
  • Assuming syntax error with @>
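The containment semantics can be checked directly on literal values, without a table. This is a small sketch with made-up tag values; the column aliases are illustrative only.

```sql
SELECT
  -- Left array includes 'organic' among other elements: true
  '["organic", "fresh"]'::jsonb @> '["organic"]'::jsonb          AS contains_organic,
  -- Left array lacks 'fresh', so it does not contain the right side: false
  '["organic"]'::jsonb          @> '["organic", "fresh"]'::jsonb AS contains_both,
  -- Left array has an element, but not the one asked for: false
  '["fresh"]'::jsonb            @> '["organic"]'::jsonb          AS other_element_only;
```

The first column shows that @> is containment rather than exact match, and the third shows that having any element at all is not sufficient.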
4. You created a GIN index on a JSONB column info but your queries using info @> '{"key": "value"}' are still slow. What is the most likely cause?
medium
A. GIN indexes do not support the @> operator
B. The queries are missing the WHERE clause
C. The GIN index was created without the jsonb_path_ops operator class
D. The JSONB column contains NULL values

Solution

  1. Step 1: Understand GIN index operator classes

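    GIN indexes on JSONB can use the default operator class (jsonb_ops) or jsonb_path_ops. Both support @>, but jsonb_path_ops is specialized for containment queries: it does not support the key-existence operators (?, ?|, ?&), and in exchange its index is usually smaller and faster to search.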
  2. Step 2: Identify cause of slow queries

    If the index was created without jsonb_path_ops, it still supports @>, but the default index is typically larger and containment searches on it can be slower. Of the options given, this is the only plausible cause.
  3. Final Answer:

    The GIN index was created without the jsonb_path_ops operator class -> Option C
  4. Quick Check:

    jsonb_path_ops = smaller, usually faster index for @> queries [OK]
Hint: Use jsonb_path_ops for faster @> queries on JSONB [OK]
Common Mistakes:
  • Assuming GIN doesn't support @>
  • Ignoring operator class choice
  • Blaming NULL values for index slowness
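One way to investigate a case like this is to look at the query plan first and then compare operator classes. The sketch below assumes a hypothetical table name, documents, since the question names only the info column; the index name is also made up.

```sql
-- See whether the planner uses the existing index and where time is spent.
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM documents WHERE info @> '{"key": "value"}';

-- Build a second index with the jsonb_path_ops operator class.
-- It supports @> but not the key-existence operators (?, ?|, ?&).
CREATE INDEX idx_documents_info_path
  ON documents USING GIN (info jsonb_path_ops);

-- Compare the on-disk size of every index on the table.
SELECT indexrelname,
       pg_size_pretty(pg_relation_size(indexrelid)) AS index_size
FROM pg_stat_user_indexes
WHERE relname = 'documents';
```

Re-running the EXPLAIN statement after creating the second index shows which index the planner picks and whether execution time changes for the actual data.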
5. You want to create a GIN index on a table orders with a column items that stores an array of integers. Which statement correctly creates the index and optimizes queries checking if an integer is present in the array?
hard
A. CREATE INDEX idx_items_gin ON orders USING GIN (items gin_int_ops);
B. CREATE INDEX idx_items_gin ON orders USING GIN (items gin__int_ops);
C. CREATE INDEX idx_items_gin ON orders USING GIN (items gin__intarray_ops);
D. CREATE INDEX idx_items_gin ON orders USING GIN (items);

Solution

  1. Step 1: Identify correct GIN index syntax for integer arrays

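    For integer arrays, the default GIN operator class supports containment (@>), contained-by (<@), overlap (&&), and equality without naming an operator class. To use the index, write the membership test as containment, e.g. items @> ARRAY[42]; the form 42 = ANY(items) does not use a GIN index.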
  2. Step 2: Validate options

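    Options A and C name operator classes that do not exist in PostgreSQL (gin_int_ops, gin__intarray_ops). Option B's gin__int_ops is not part of core PostgreSQL; it is available only after installing the intarray extension, so the statement fails on a default installation.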
  3. Final Answer:

    CREATE INDEX idx_items_gin ON orders USING GIN (items); -> Option D
  4. Quick Check:

    Default GIN index on array column = CREATE INDEX idx_items_gin ON orders USING GIN (items); [OK]
Hint: Use default GIN index on array column without extra ops [OK]
Common Mistakes:
  • Using non-existent operator classes
  • Adding unnecessary syntax after column name
  • Confusing GIN with other index types
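The difference between query forms can be sketched as follows. The orders table and items column come from the question; the sample rows are made up for illustration.

```sql
CREATE TABLE orders (id SERIAL PRIMARY KEY, items INTEGER[]);
CREATE INDEX idx_items_gin ON orders USING GIN (items);

INSERT INTO orders (items) VALUES ('{1,2,3}'), ('{42,7}');

-- Containment form: eligible for the GIN index.
SELECT id FROM orders WHERE items @> ARRAY[42];

-- Overlap form: rows sharing at least one element with the given array,
-- also eligible for the GIN index.
SELECT id FROM orders WHERE items && ARRAY[7, 99];

-- Returns the same rows as the containment query, but this form
-- is not matched to a GIN index by the planner.
SELECT id FROM orders WHERE 42 = ANY (items);
```

Running EXPLAIN on each statement against a realistically sized table is the practical way to confirm which forms use idx_items_gin.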