Introduction
Finding specific data in a large database can be very slow if the system has to look through every record one by one. This problem makes searching inefficient and frustrating when dealing with big amounts of information.
Jump into concepts and practice - no test required
Imagine looking for a book in a huge library without a catalog; you would have to check every shelf. But with a catalog that tells you exactly where the book is, you can go straight to the right shelf and find it quickly.
┌─────────────────────────────┐ │ Database Table │ │ ┌─────┐ ┌─────┐ ┌─────┐ │ │ │Row 1│ │Row 2│ │Row 3│ ... │ │ └─────┘ └─────┘ └─────┘ │ └─────────────┬───────────────┘ │ Full Table Scan ↓ ┌─────────────────────────────┐ │ Index │ │ ┌─────┐ ┌─────┐ ┌─────┐ │ │ │Key 1│ │Key 2│ │Key 3│ ... │ │ └─┬───┘ └─┬───┘ └─┬───┘ │ │ │ │ │ │ │ ↓ ↓ ↓ │ │ Row 1 Row 2 Row 3 │ └─────────────────────────────┘
import sqlite3 conn = sqlite3.connect(':memory:') cur = conn.cursor() cur.execute('CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT)') # Insert 10000 rows for i in range(1, 10001): cur.execute('INSERT INTO people (name) VALUES (?)', (f'Person{i}',)) conn.commit() import time # Search without index start = time.time() cur.execute("SELECT * FROM people WHERE name = 'Person9999'") print(cur.fetchone()) print('Search without index:', time.time() - start, 'seconds') # Create index on name cur.execute('CREATE INDEX idx_name ON people(name)') conn.commit() # Search with index start = time.time() cur.execute("SELECT * FROM people WHERE name = 'Person9999'") print(cur.fetchone()) print('Search with index:', time.time() - start, 'seconds') conn.close()
employee_id in SQL?CREATE INDEX, followed by the index name, then ON and the table and column.username column. What will likely happen when you run SELECT * FROM users WHERE username = 'alice';?username helps the database find the row with 'alice' quickly without scanning the entire table.email. You create an index on email. However, queries are still slow. What could be a reason?