What is the output of the df.info() call on the following DataFrame?
import pandas as pd

df = pd.DataFrame({
    'A': [1, 2, 3, None],
    'B': ['x', 'y', None, 'z'],
    'C': [1.1, 2.2, 3.3, 4.4],
    'D': pd.to_datetime(['2023-01-01', None, '2023-01-03', '2023-01-04'])
})
df.info()
Remember that a null in an integer column promotes the column to float64, and df.info() reports only the non-null count for each column.
Column 'A' has one null, so it is stored as float64 with 3 non-null values. 'B' is object dtype with 3 non-null values because of its one null. 'C' is float64 with 4 non-null values (no nulls). 'D' is datetime64[ns] with one null (shown as NaT), so 3 non-null values.
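A quick way to verify these counts and dtypes without parsing the printed report: `df.count()` returns the same non-null counts that `df.info()` displays. A minimal sketch, rebuilding the DataFrame from the question:

```python
import pandas as pd

df = pd.DataFrame({
    'A': [1, 2, 3, None],
    'B': ['x', 'y', None, 'z'],
    'C': [1.1, 2.2, 3.3, 4.4],
    'D': pd.to_datetime(['2023-01-01', None, '2023-01-03', '2023-01-04'])
})

# df.count() excludes nulls, matching the Non-Null Count column of df.info()
counts = df.count()
print(counts['A'], df['A'].dtype)  # 3 float64  (int column promoted by the null)
print(counts['B'], df['B'].dtype)  # 3 object
print(counts['C'], df['C'].dtype)  # 4 float64
print(counts['D'], df['D'].dtype)  # 3 datetime64[ns]
```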
Given the DataFrame below, how many non-null values does column 'X' have according to df.info()?
import pandas as pd

df = pd.DataFrame({
    'X': [10, None, 30, None, 50],
    'Y': ['a', 'b', 'c', 'd', 'e']
})
df.info()
Count how many values in 'X' are not null.
Column 'X' contains 10, null, 30, null, 50. Two of the five entries are null, so df.info() reports 3 non-null values (and the column is promoted to float64).
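The count can be confirmed directly, assuming the DataFrame from the question:

```python
import pandas as pd

df = pd.DataFrame({
    'X': [10, None, 30, None, 50],
    'Y': ['a', 'b', 'c', 'd', 'e']
})

# Series.count() excludes nulls; Series.isna().sum() counts them
print(df['X'].count())      # 3
print(df['X'].isna().sum()) # 2
```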
Which statement about the output of df.info() is incorrect?
import pandas as pd

df = pd.DataFrame({
    'A': [1, 2, None],
    'B': ['x', None, 'z']
})
df.info()
Check how many non-null values are in column 'B'.
Column 'B' has one null, so its non-null count is 2, not 3. Any statement claiming 3 non-null values for 'B' is incorrect.
Why does df.info() show memory usage even when columns have null values?
Think about how pandas stores data internally.
Pandas represents nulls with in-band sentinel values (NaN for floats, NaT for datetimes) rather than omitting them, so memory is still allocated for every row in the column regardless of how many entries are null.
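This is easy to demonstrate: two float64 columns of the same length report identical memory usage whether or not they contain nulls, because a NaN occupies the same 8 bytes as any other float64 value. A small sketch with hypothetical column names:

```python
import pandas as pd

# Two float64 columns of equal length: one complete, one mostly null.
df = pd.DataFrame({
    'full':   [1.0, 2.0, 3.0, 4.0],
    'sparse': [1.0, None, None, 4.0],
})

# Each float64 slot is 8 bytes whether it holds NaN or a real value,
# so both columns report the same usage: 4 rows * 8 bytes = 32.
usage = df.memory_usage(index=False)
print(usage['full'], usage['sparse'])  # 32 32
```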
You have a DataFrame df with many columns. You want to quickly find which columns have missing values using df.info(). Which approach below correctly identifies columns with missing data?
Compare non-null counts to total rows.
If a column's non-null count is less than total rows, it has missing values.
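This check can be automated instead of read off the printed report: comparing `df.count()` against `len(df)` flags every column with missing data. A minimal sketch with a hypothetical DataFrame:

```python
import pandas as pd

# Hypothetical example mixing complete and incomplete columns.
df = pd.DataFrame({
    'A': [1, 2, None],
    'B': ['x', 'y', 'z'],
    'C': [1.0, None, None],
})

# A non-null count below len(df) means the column has missing values;
# this is the programmatic equivalent of scanning df.info()'s output.
missing_cols = df.columns[df.count() < len(df)].tolist()
print(missing_cols)  # ['A', 'C']
```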