Why exploratory inspection guides analysis in Data Analysis Python - Performance Analysis
Exploratory inspection means looking closely at data before deep analysis.
We want to know how this step affects the time it takes to analyze data.
Analyze the time complexity of the following code snippet.
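import pandas as pd
def inspect_data(df):
    # Preview the first five rows; this does not scan the whole frame.
    print(df.head())
    # Summary statistics; this reads every row of the columns it describes.
    print(df.describe())
    # One full pass over the rows for each column.
    for col in df.columns:
        print(f"Unique values in {col}:", df[col].nunique())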
This code prints the first rows and the summary statistics, then counts the unique values in each column.
Identify the loops, recursion, or array traversals that repeat.
- Primary operation: Loop over each column and count its unique values; each count reads every row of that column.
- How many times: Once per column in the data frame.
As the number of columns grows, the unique count runs more times.
| Input Size (columns) | Approx. Operations |
|---|---|
| 10 | 10 unique counts |
| 100 | 100 unique counts |
| 1000 | 1000 unique counts |
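Pattern observation: For a fixed number of rows, the work grows linearly with the number of columns. Each unique count also reads every row, so the number of rows matters as well.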
Time Complexity: O(c * n)
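This means the time grows with the number of columns (c) times the number of rows (n), because counting unique values reads all n rows of each of the c columns. The estimate assumes each value is processed in roughly constant time, as with hash-based counting. df.head() adds very little, and df.describe() also reads the full columns it summarizes, so it does not reduce the overall growth.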
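To make the O(c * n) estimate concrete, here is a minimal pure-Python sketch that counts how many cells a per-column unique count has to visit. It is only a model: the helper names `count_unique_per_column` and `make_columns` are hypothetical, and pandas does this work in optimized compiled code rather than in a Python loop. The assumption is that the number of values examined scales the same way.

```python
def count_unique_per_column(columns):
    """Count unique values per column and tally how many cells were visited.

    `columns` maps a column name to a list of values (a stand-in for a data frame).
    """
    unique_counts = {}
    cells_visited = 0
    for name, values in columns.items():      # runs c times
        seen = set()
        for value in values:                  # runs n times per column
            seen.add(value)
            cells_visited += 1
        unique_counts[name] = len(seen)
    return unique_counts, cells_visited


def make_columns(c, n):
    """Build c synthetic columns of n integers each (illustrative data only)."""
    return {f"col{j}": [(i * (j + 1)) % 7 for i in range(n)] for j in range(c)}


for c, n in [(10, 100), (100, 100), (100, 1000)]:
    _, visited = count_unique_per_column(make_columns(c, n))
    print(f"c={c}, n={n}: {visited} cells visited")
```

Multiplying either the columns or the rows by ten multiplies the visited cells by ten, which is the c * n growth described above.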
[X] Wrong: "Exploratory inspection is always fast and does not affect analysis time."
[OK] Correct: Counting unique values or computing summary statistics takes time proportional to the data size, so inspection can be costly on big data.
Understanding how initial data checks scale helps you plan analysis steps wisely and shows you think about efficiency.
"What if we only inspected a random sample of rows instead of the full data? How would the time complexity change?"
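One way to explore this question is to run the unique counts on a fixed-size random sample. The sketch below is an assumption-laden illustration, not a recommended recipe: `inspect_sample`, `k`, and `seed` are hypothetical names, and the example data frame is made up. With k sampled rows, the counting step reads about c * k cells instead of c * n, although drawing the sample has its own cost that may still depend on n.

```python
import pandas as pd


def inspect_sample(df, k=1000, seed=0):
    """Count unique values per column on at most k randomly chosen rows."""
    k = min(k, len(df))                       # cannot sample more rows than exist
    sample = df.sample(n=k, random_state=seed)
    return {col: sample[col].nunique() for col in sample.columns}


# Illustrative data: one column of distinct values, one with three categories.
df = pd.DataFrame({"a": range(10_000), "b": [i % 3 for i in range(10_000)]})
counts = inspect_sample(df, k=500)
print(counts)
```

Unique counts taken from a sample are lower bounds on the counts for the full data, so this trades accuracy for speed.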