What is the output of the df.info() call on the following DataFrame?
import pandas as pd

df = pd.DataFrame({
    'A': [1, 2, 3, None],
    'B': ['x', 'y', None, 'z'],
    'C': [1.1, 2.2, 3.3, 4.4],
    'D': pd.to_datetime(['2023-01-01', None, '2023-01-03', '2023-01-04'])
})
df.info()
Remember that a null in an integer column promotes the column to float64, and df.info() reports only the non-null count for each column.
Column 'A' has one null, so it is stored as float64 with 3 non-null values. 'B' is object dtype with 3 non-null values because of its one null. 'C' is float64 with 4 non-null values (no nulls). 'D' is datetime64[ns] with one null (shown as NaT), so 3 non-null values.
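A quick way to verify these counts and dtypes without parsing the printed report: `df.count()` returns the same non-null counts that `df.info()` displays. A minimal sketch, rebuilding the DataFrame from the question:

```python
import pandas as pd

df = pd.DataFrame({
    'A': [1, 2, 3, None],
    'B': ['x', 'y', None, 'z'],
    'C': [1.1, 2.2, 3.3, 4.4],
    'D': pd.to_datetime(['2023-01-01', None, '2023-01-03', '2023-01-04'])
})

# df.count() excludes nulls, matching the Non-Null Count column of df.info()
counts = df.count()
print(counts['A'], df['A'].dtype)  # 3 float64  (int column promoted by the null)
print(counts['B'], df['B'].dtype)  # 3 object
print(counts['C'], df['C'].dtype)  # 4 float64
print(counts['D'], df['D'].dtype)  # 3 datetime64[ns]
```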
Given the DataFrame below, how many non-null values does column 'X' have according to df.info()?
import pandas as pd

df = pd.DataFrame({
    'X': [10, None, 30, None, 50],
    'Y': ['a', 'b', 'c', 'd', 'e']
})
df.info()
Count how many values in 'X' are not null.
Column 'X' contains 10, null, 30, null, 50. Two of the five entries are null, so df.info() reports 3 non-null values (and the column is promoted to float64).
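The count can be confirmed directly, assuming the DataFrame from the question:

```python
import pandas as pd

df = pd.DataFrame({
    'X': [10, None, 30, None, 50],
    'Y': ['a', 'b', 'c', 'd', 'e']
})

# Series.count() excludes nulls; Series.isna().sum() counts them
print(df['X'].count())      # 3
print(df['X'].isna().sum()) # 2
```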
Which statement about the output of df.info() is incorrect?
import pandas as pd

df = pd.DataFrame({
    'A': [1, 2, None],
    'B': ['x', None, 'z']
})
df.info()
Check how many non-null values are in column 'B'.
Column 'B' has one null, so its non-null count is 2, not 3. Any statement claiming 3 non-null values for 'B' is incorrect.
Why does df.info() show memory usage even when columns have null values?
Think about how pandas stores data internally.
Pandas represents nulls with in-band sentinel values (NaN for floats, NaT for datetimes) rather than omitting them, so memory is still allocated for every row in the column regardless of how many entries are null.
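This is easy to demonstrate: two float64 columns of the same length report identical memory usage whether or not they contain nulls, because a NaN occupies the same 8 bytes as any other float64 value. A small sketch with hypothetical column names:

```python
import pandas as pd

# Two float64 columns of equal length: one complete, one mostly null.
df = pd.DataFrame({
    'full':   [1.0, 2.0, 3.0, 4.0],
    'sparse': [1.0, None, None, 4.0],
})

# Each float64 slot is 8 bytes whether it holds NaN or a real value,
# so both columns report the same usage: 4 rows * 8 bytes = 32.
usage = df.memory_usage(index=False)
print(usage['full'], usage['sparse'])  # 32 32
```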
You have a DataFrame df with many columns. You want to quickly find which columns have missing values using df.info(). Which approach below correctly identifies columns with missing data?
Compare non-null counts to total rows.
If a column's non-null count is less than total rows, it has missing values.
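This check can be automated instead of read off the printed report: comparing `df.count()` against `len(df)` flags every column with missing data. A minimal sketch with a hypothetical DataFrame:

```python
import pandas as pd

# Hypothetical example mixing complete and incomplete columns.
df = pd.DataFrame({
    'A': [1, 2, None],
    'B': ['x', 'y', 'z'],
    'C': [1.0, None, None],
})

# A non-null count below len(df) means the column has missing values;
# this is the programmatic equivalent of scanning df.info()'s output.
missing_cols = df.columns[df.count() < len(df)].tolist()
print(missing_cols)  # ['A', 'C']
```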