How to Read SQL in Pandas: Simple Guide with Examples
You can read SQL queries into pandas using the pandas.read_sql() function, which takes a SQL query string and a database connection. This lets you load data from databases directly into a DataFrame for easy analysis.
Syntax
The basic syntax to read SQL data into pandas is:
pandas.read_sql(sql, con)
Where:
- sql is a SQL query string or table name.
- con is a database connection object.
python
import pandas as pd
from sqlalchemy import create_engine

# Create a database connection
engine = create_engine('sqlite:///example.db')

# Read SQL query into DataFrame
df = pd.read_sql('SELECT * FROM table_name', con=engine)
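When part of the query comes from a variable, read_sql also accepts a params argument, so the value does not have to be pasted into the SQL string. The sketch below is only an illustration: the users table, its rows, and the min_age variable are assumptions made for the example, and the ? placeholder is the style used by Python's sqlite3 driver (other drivers may expect a different one).

```python
import sqlite3

import pandas as pd

# In-memory SQLite database with a hypothetical users table (illustration only)
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, name TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?, ?)",
    [(1, "Alice", 30), (2, "Bob", 25), (3, "Charlie", 35)],
)
conn.commit()

# The value is passed separately through params instead of string formatting
min_age = 28
df_params = pd.read_sql(
    "SELECT * FROM users WHERE age > ? ORDER BY id",
    conn,
    params=(min_age,),
)
print(df_params)

conn.close()
```

Passing values this way lets the database driver handle quoting, which is generally safer than building the query with f-strings.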
Example
This example shows how to create a simple SQLite database, insert data, and read it into a pandas DataFrame using read_sql.
python
import sqlite3

import pandas as pd

# Create SQLite in-memory database
conn = sqlite3.connect(':memory:')
cursor = conn.cursor()

# Create table and insert data (SQL string literals use single quotes)
cursor.execute('CREATE TABLE users (id INTEGER, name TEXT, age INTEGER)')
cursor.execute("INSERT INTO users VALUES (1, 'Alice', 30)")
cursor.execute("INSERT INTO users VALUES (2, 'Bob', 25)")
cursor.execute("INSERT INTO users VALUES (3, 'Charlie', 35)")
conn.commit()

# Read SQL query into pandas DataFrame
query = 'SELECT * FROM users WHERE age > 28'
df = pd.read_sql(query, conn)
print(df)
Output
   id     name  age
0   1    Alice   30
1   3  Charlie   35
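The output above keeps pandas' default integer index (0, 1) next to the id column. If you would rather use a column from the query as the index, read_sql takes an index_col argument. This is a minimal sketch that rebuilds the same users table so it can run on its own; the variable name df_indexed is just for illustration.

```python
import sqlite3

import pandas as pd

# Rebuild the example users table in an in-memory SQLite database
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, name TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?, ?)",
    [(1, "Alice", 30), (2, "Bob", 25), (3, "Charlie", 35)],
)
conn.commit()

# Use the id column as the DataFrame index instead of 0, 1, 2, ...
df_indexed = pd.read_sql(
    "SELECT * FROM users WHERE age > 28 ORDER BY id",
    conn,
    index_col="id",
)
print(df_indexed)

# Rows can now be looked up by their id value
print(df_indexed.loc[3, "name"])

conn.close()
```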
Common Pitfalls
Common mistakes when reading SQL in pandas include:
- Not providing a valid database connection object.
- Passing a table name where a SQL query is expected. A bare table name only works with a SQLAlchemy connection; in that case read_sql_table is the clearer choice.
- Forgetting to install or import the required database driver or SQLAlchemy.
Always ensure your connection is open and valid before calling read_sql.
python
import pandas as pd
from sqlalchemy import create_engine

# Wrong: calling read_sql without a connection
try:
    df = pd.read_sql('users')  # Missing connection
except Exception as e:
    print(f'Error: {e}')

# Right: pass connection and query
# (assumes example.db already contains the users table)
engine = create_engine('sqlite:///example.db')
df = pd.read_sql('SELECT * FROM users', con=engine)
print(df.head())
Output
Error: read_sql() missing 1 required positional argument: 'con'
   id     name  age
0   1    Alice   30
1   2      Bob   25
2   3  Charlie   35
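The table-name pitfall is easy to hit with a plain sqlite3 connection, because pandas then runs the string as SQL and a bare table name is not a valid statement. The sketch below is only an illustration under that assumption (a DBAPI sqlite3 connection rather than a SQLAlchemy engine); the exact error text depends on your pandas and SQLite versions, so it is printed rather than shown here.

```python
import sqlite3

import pandas as pd

# In-memory SQLite database with a one-row users table (illustration only)
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, name TEXT, age INTEGER)")
conn.execute("INSERT INTO users VALUES (1, 'Alice', 30)")
conn.commit()

# Wrong: with a sqlite3 connection, 'users' is executed as a SQL statement
try:
    pd.read_sql("users", conn)
    table_name_worked = True
except Exception as exc:
    table_name_worked = False
    print(f"Error: {exc}")

# Right: write the table name inside a full SELECT query
df_users = pd.read_sql("SELECT * FROM users", conn)
print(df_users)

conn.close()
```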
Quick Reference
| Function or parameter | Description |
|---|---|
| pandas.read_sql(sql, con) | Read SQL query or table into DataFrame |
| pandas.read_sql_query(sql, con) | Read SQL query into DataFrame (query only) |
| pandas.read_sql_table(table_name, con) | Read SQL table into DataFrame (table only) |
| con | Database connection object (e.g., SQLAlchemy engine or DBAPI connection) |
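To see how the first two rows of the table relate, you can run the same query through read_sql and read_sql_query and compare the results. This is a small sketch using an in-memory SQLite database with an assumed users table; read_sql_table is left out because it needs a SQLAlchemy connection.

```python
import sqlite3

import pandas as pd

# In-memory SQLite database with a hypothetical users table
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, name TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?, ?)",
    [(1, "Alice", 30), (2, "Bob", 25), (3, "Charlie", 35)],
)
conn.commit()

query = "SELECT id, name FROM users ORDER BY id"

# For a query string, both functions return the same DataFrame
df_generic = pd.read_sql(query, conn)
df_query = pd.read_sql_query(query, conn)
same_result = df_generic.equals(df_query)
print(same_result)

conn.close()
```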
Key Takeaways
- Use pandas.read_sql() with a SQL query string and a valid database connection to load data into a DataFrame.
- You can connect to many databases using SQLAlchemy engines or DBAPI connections.
- Always ensure your database connection is open and the SQL syntax is correct.
- For reading entire tables, consider pandas.read_sql_table() for simplicity; it requires a SQLAlchemy connection.
- Common errors include missing connection objects or incorrect SQL syntax.
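One way to keep the connection open for exactly as long as the query needs it is to wrap it in contextlib.closing. The sketch below is an illustration, not the only way to do it: load_users is a hypothetical helper, and the temporary example.db file exists only so the example can run on its own. Note that using a sqlite3 connection directly in a with statement commits or rolls back a transaction but does not close the connection, which is why closing is used here.

```python
import sqlite3
import tempfile
from contextlib import closing
from pathlib import Path

import pandas as pd


def load_users(db_path):
    """Hypothetical helper: open a connection, run the query, always close."""
    with closing(sqlite3.connect(db_path)) as conn:
        return pd.read_sql("SELECT * FROM users ORDER BY id", conn)


with tempfile.TemporaryDirectory() as tmp_dir:
    db_path = Path(tmp_dir) / "example.db"

    # Set up a small users table in a temporary database file
    with closing(sqlite3.connect(db_path)) as setup_conn:
        setup_conn.execute("CREATE TABLE users (id INTEGER, name TEXT, age INTEGER)")
        setup_conn.execute("INSERT INTO users VALUES (1, 'Alice', 30)")
        setup_conn.commit()

    # The DataFrame stays usable after the connection has been closed
    df_loaded = load_users(db_path)

print(df_loaded)
```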