
Exploratory Data Analysis (EDA) template in Data Analysis Python - Time & Space Complexity

Time Complexity: Exploratory Data Analysis (EDA) template
O(c * n)
Understanding Time Complexity

We want to understand how the time needed to explore data grows as the data size increases.

How does the work change when we have more rows or columns to analyze?

Scenario Under Consideration

Analyze the time complexity of the following EDA template code.

import pandas as pd

def eda_template(df):
    print(df.head())        # first rows: quick look at the data
    print(df.describe())    # summary statistics for numeric columns
    df.info()               # prints its own summary and returns None,
                            # so it should not be wrapped in print()
    # Count distinct values in every column; each call scans all rows
    for col in df.columns:
        print(f"Unique values in {col}:", df[col].nunique())

This code prints basic summaries and counts unique values for each column in the data.
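The per-column cost can be sketched in plain Python, assuming the unique count behaves like a set-based single pass over the data (the `count_unique` helper below is hypothetical, for illustration only, not the pandas implementation):

```python
# Minimal sketch of a set-based unique count: one pass over all n rows,
# with roughly O(1) set insertion per row, so each column costs O(n).
def count_unique(values):
    seen = set()
    for v in values:        # visit every row once
        seen.add(v)
    return len(seen)

print(count_unique([1, 2, 2, 3, 3, 3]))  # -> 3
```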

Identify Repeating Operations

Identify the loops, recursion, or array traversals that repeat.

  • Primary operation: Loop over all columns to count unique values.
  • How many times: Once per column (c iterations), with each nunique call scanning all n rows of that column.

How Execution Grows With Input

The time grows mainly with the number of columns because we count unique values for each.

Input Size (columns)    Approx. Operations
10                      10 unique counts
100                     100 unique counts
1000                    1000 unique counts

Pattern observation: The work increases linearly with the number of columns.
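The pattern above can be expressed as a tiny cost model, assuming each column costs one full scan of its rows (`eda_work` is a hypothetical helper for illustration, not part of pandas):

```python
# Hypothetical cost model: c columns, each scanned once over n rows.
def eda_work(c, n):
    return c * n  # total element visits across all nunique calls

for c in (10, 100, 1000):
    print(f"{c} columns x 1000 rows -> {eda_work(c, 1000)} element visits")
```

Holding the rows fixed at 1000, the visit count grows linearly with the column count, matching the table.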

Final Time Complexity

Time Complexity: O(c * n)

This means the time grows with both the number of columns (c) and the number of rows (n), because counting unique values scans every row of each column: roughly c * n element visits in total.

Common Mistake

[X] Wrong: "The time only depends on the number of columns, not rows."

[OK] Correct: Counting unique values requires looking at every row in each column, so rows affect time too.
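One way to see that rows matter is to instrument the scan and count row visits. This is a sketch under the same set-based assumption as before, not how pandas actually implements `nunique`:

```python
# Instrumented unique count: track how many rows each call visits.
def count_unique_with_visits(values):
    seen, visits = set(), 0
    for v in values:
        visits += 1         # one visit per row
        seen.add(v)
    return len(seen), visits

_, v1 = count_unique_with_visits(range(1_000))
_, v2 = count_unique_with_visits(range(2_000))
print(v1, v2)  # doubling the rows doubles the visits
```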

Interview Connect

Understanding how data size affects EDA steps helps you explain your approach clearly and shows you think about efficiency in real projects.

Self-Check

"What if we added a nested loop to compare every pair of columns? How would the time complexity change?"