Using pandas with SQL databases


Introduction

Pandas can load the results of SQL queries into DataFrames and write DataFrames back to database tables, so you can analyze database data in Python without exporting files by hand. It is useful in situations like these:

You want to analyze data stored in a SQL database using Python.

You need to run SQL queries and then process the results with pandas.

You want to save your pandas DataFrame back into a SQL database.

You are combining SQL data with other data sources in Python.

You want to quickly explore database tables without leaving Python.
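As a rough illustration of combining SQL data with another source, the sketch below joins a database table with a DataFrame built in Python. The users table, the orders DataFrame, and their column names are invented for this example; in practice the second source might be a CSV file or an API response.

```python
import pandas as pd
from sqlalchemy import create_engine

engine = create_engine("sqlite:///:memory:")

# Hypothetical table standing in for data that already lives in the database
pd.DataFrame(
    {"user_id": [1, 2, 3], "name": ["Ana", "Ben", "Cy"]}
).to_sql("users", engine, index=False)

# Hypothetical second source, e.g. rows parsed from a CSV file or an API
orders = pd.DataFrame({"user_id": [1, 1, 3], "amount": [20.0, 5.0, 12.5]})

# Load the SQL side into pandas
users = pd.read_sql("SELECT user_id, name FROM users ORDER BY user_id", engine)

# Total the orders per user, then left-join so users without orders are kept
totals = orders.groupby("user_id", as_index=False)["amount"].sum()
combined = users.merge(totals, on="user_id", how="left").fillna({"amount": 0.0})
print(combined)
```

A left join is used here so that users with no matching orders still appear, with their total filled in as 0.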

Syntax

Pandas

import pandas as pd
from sqlalchemy import create_engine

# Create a database connection
engine = create_engine('sqlite:///mydatabase.db')

# Read SQL query or table into a DataFrame
df = pd.read_sql('SELECT * FROM tablename', engine)

# Write DataFrame back to SQL table
df.to_sql('new_table', engine, if_exists='replace', index=False)

Use create_engine from SQLAlchemy to connect to your database.

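pd.read_sql() accepts either a SQL query or a table name. With a SQLAlchemy engine or connection, a query is handled like pd.read_sql_query() and a table name like pd.read_sql_table().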
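The two ways of calling pd.read_sql() can be compared side by side. This is a minimal sketch on an in-memory SQLite database; the places table and its columns are made up for the illustration.

```python
import pandas as pd
from sqlalchemy import create_engine

engine = create_engine("sqlite:///:memory:")

# Hypothetical table created only so there is something to read
pd.DataFrame(
    {"id": [1, 2, 3], "city": ["Oslo", "Lima", "Pune"]}
).to_sql("places", engine, index=False)

# Passing a table name loads every column and row of that table
whole_table = pd.read_sql("places", engine)

# Passing a query loads only what the query selects
subset = pd.read_sql("SELECT city FROM places WHERE id > 1 ORDER BY id", engine)

print(whole_table)
print(subset)
```

Loading by table name relies on a SQLAlchemy engine or connection; a query string is the more portable form.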

Examples

This loads all rows from the 'users' table in the SQLite database 'example.db' and prints the first 5 rows.

Pandas

import pandas as pd
from sqlalchemy import create_engine

engine = create_engine('sqlite:///example.db')
df = pd.read_sql('SELECT * FROM users', engine)
print(df.head())

This saves the DataFrame df into a new table called 'backup_users' in the database, replacing it if it exists.

Pandas

df.to_sql('backup_users', engine, if_exists='replace', index=False)

This runs a SQL query to get names and ages of users older than 30, then prints the result.

Pandas

query = 'SELECT name, age FROM users WHERE age > 30'
df = pd.read_sql(query, engine)
print(df)
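When the threshold comes from a variable or user input, it is generally safer to pass it as a bound parameter than to format it into the SQL string. The sketch below assumes a small users table created on the spot; the :min_age placeholder name and the sample rows are illustrative.

```python
import pandas as pd
from sqlalchemy import create_engine, text

engine = create_engine("sqlite:///:memory:")

# Hypothetical sample rows so the query has something to filter
pd.DataFrame(
    {"name": ["Dana", "Eli", "Fay"], "age": [25, 34, 41]}
).to_sql("users", engine, index=False)

min_age = 30

# :min_age is a named placeholder; the value is supplied separately via params
query = text("SELECT name, age FROM users WHERE age > :min_age ORDER BY age")
older = pd.read_sql(query, engine, params={"min_age": min_age})
print(older)
```

Binding the value lets the database driver handle quoting, which avoids SQL injection through the parameter.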

Sample Program

This program creates a small employees table in an in-memory SQLite database, loads it into pandas, raises salaries for IT staff by 10%, writes the result back, and reads the table again to confirm the change.

Pandas

import pandas as pd
from sqlalchemy import create_engine, text

# Create an in-memory SQLite database
engine = create_engine('sqlite:///:memory:')

# Create a sample table and insert data.
# engine.begin() commits on success; SQLAlchemy 2.x requires raw SQL
# strings to be wrapped in text() before they are executed.
with engine.begin() as conn:
    conn.execute(text("""
    CREATE TABLE employees (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        name TEXT,
        department TEXT,
        salary INTEGER
    )
    """))
    conn.execute(text("""
    INSERT INTO employees (name, department, salary) VALUES
    ('Alice', 'HR', 70000),
    ('Bob', 'IT', 80000),
    ('Charlie', 'Finance', 75000)
    """))

# Read the table into a pandas DataFrame
df = pd.read_sql('SELECT * FROM employees', engine)

# Show the DataFrame
print(df)

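# Increase salary by 10% for the IT department, keeping the column as integers
is_it = df['department'] == 'IT'
df.loc[is_it, 'salary'] = (df.loc[is_it, 'salary'] * 1.10).round().astype(int)

# Save updated data back to SQL.
# Note: if_exists='replace' drops and recreates the table, so the original
# PRIMARY KEY AUTOINCREMENT definition is not preserved.
df.to_sql('employees', engine, if_exists='replace', index=False)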

# Read again to confirm changes
df_updated = pd.read_sql('SELECT * FROM employees', engine)
print(df_updated)
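Writing the whole DataFrame back with if_exists='replace' recreates the table from the DataFrame, so the original column definitions are lost. One possible alternative, sketched here under the assumption that the change can be expressed in SQL, is to run an UPDATE statement in the database and use pandas only to read the result. The table mirrors the one in the program above, trimmed to two rows.

```python
import pandas as pd
from sqlalchemy import create_engine, inspect, text

engine = create_engine("sqlite:///:memory:")

with engine.begin() as conn:
    conn.execute(text(
        "CREATE TABLE employees ("
        "id INTEGER PRIMARY KEY, name TEXT, department TEXT, salary INTEGER)"
    ))
    conn.execute(text(
        "INSERT INTO employees (name, department, salary) VALUES "
        "('Alice', 'HR', 70000), ('Bob', 'IT', 80000)"
    ))
    # Apply the 10% raise in SQL; integer arithmetic keeps salary an INTEGER
    conn.execute(
        text("UPDATE employees SET salary = salary * 11 / 10 WHERE department = :dept"),
        {"dept": "IT"},
    )

df = pd.read_sql("SELECT * FROM employees ORDER BY id", engine)
print(df)

# The table was never dropped, so its primary key is still in place
pk = inspect(engine).get_pk_constraint("employees")["constrained_columns"]
print(pk)
```

Which approach fits better depends on the change: an UPDATE suits simple rules, while a round trip through pandas suits transformations that are awkward to write in SQL.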


Important Notes

Make sure to install SQLAlchemy with pip install sqlalchemy to use create_engine.

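When writing DataFrames back to SQL, use if_exists='replace' to drop and recreate the table or if_exists='append' to add rows to it. The default, if_exists='fail', raises a ValueError if the table already exists.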
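The three if_exists settings can be checked quickly against an in-memory database. This is a small sketch; the readings table and its values are placeholders chosen for the demonstration.

```python
import pandas as pd
from sqlalchemy import create_engine

engine = create_engine("sqlite:///:memory:")
batch = pd.DataFrame({"id": [1, 2], "value": [10, 20]})


def count_rows():
    """Return the current number of rows in the readings table."""
    result = pd.read_sql("SELECT COUNT(*) AS n FROM readings", engine)
    return int(result["n"].iloc[0])


# First write creates the table
batch.to_sql("readings", engine, index=False)

# 'append' adds the same two rows again
batch.to_sql("readings", engine, if_exists="append", index=False)
rows_after_append = count_rows()

# 'replace' drops the table and writes only the new rows
batch.to_sql("readings", engine, if_exists="replace", index=False)
rows_after_replace = count_rows()

# The default ('fail') refuses to touch an existing table
try:
    batch.to_sql("readings", engine, index=False)
    failed = False
except ValueError:
    failed = True

print(rows_after_append, rows_after_replace, failed)
```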

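SQLite is great for testing, but for multi-user or production workloads use a server database such as PostgreSQL or MySQL. The pandas calls stay the same; only the connection URL passed to create_engine changes, and the matching database driver package must be installed.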

Summary

Pandas can read from and write to SQL databases easily using read_sql and to_sql.

Use SQLAlchemy's create_engine to connect pandas to your database.

This lets you filter and aggregate data in SQL, then reshape and analyze the results with pandas in Python.