Introduction
Searching large databases quickly can be slow when many records must be checked. Bitmap indexes solve this by using simple yes/no maps to speed up queries, especially when data has few unique values.
Imagine a classroom where each student has a checklist for different activities like reading, sports, or music. For each activity, a sheet shows which students participate by marking a check or leaving it blank. To find students who do both reading and sports, you look at both sheets and find students checked on both.
┌───────────────┐ │ Column Values │ ├───────────────┤ │ Value A │ │ 1010010 │ ├───────────────┤ │ Value B │ │ 0101101 │ └───────────────┘ Bitmaps show which records have each value. Queries combine these bitmaps with AND/OR operations.
import sqlite3 # Create in-memory database conn = sqlite3.connect(':memory:') cur = conn.cursor() # Create a table with a low-cardinality column cur.execute('CREATE TABLE employees (id INTEGER, gender TEXT)') # Insert sample data cur.executemany('INSERT INTO employees VALUES (?, ?)', [ (1, 'M'), (2, 'F'), (3, 'M'), (4, 'F'), (5, 'M') ]) # Simulate bitmap index creation for gender # Bitmap for 'M': 1 if gender is 'M', else 0 cur.execute("SELECT id, CASE WHEN gender = 'M' THEN 1 ELSE 0 END AS bitmap_M FROM employees ORDER BY id") bitmap_M = cur.fetchall() # Bitmap for 'F': 1 if gender is 'F', else 0 cur.execute("SELECT id, CASE WHEN gender = 'F' THEN 1 ELSE 0 END AS bitmap_F FROM employees ORDER BY id") bitmap_F = cur.fetchall() print('Bitmap for M:', bitmap_M) print('Bitmap for F:', bitmap_F)