Recall & Review
beginner
What is the purpose of data validation checks in pandas?
Data validation checks help ensure that the data is clean, accurate, and meets expected rules before analysis. They catch errors or inconsistencies early.
beginner
How can you check for missing values in a pandas DataFrame?
Use the isnull() or isna() methods to find missing values. For example, df.isnull().sum() shows the count of missing values per column.
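As a minimal sketch of the missing-value check described above, the snippet below builds a small DataFrame and counts its gaps. The sample data and the column names (name, age) are made up for illustration and are not from any real dataset.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; column names are illustrative only.
df = pd.DataFrame({
    "name": ["Ana", "Ben", None, "Dee"],
    "age": [34, np.nan, 29, np.nan],
})

# Count of missing values in each column.
missing_per_column = df.isnull().sum()

# isna() is an alias of isnull(), so both produce the same boolean frame.
same_result = df.isna().equals(df.isnull())

# Rows that contain at least one missing value.
rows_with_missing = df[df.isnull().any(axis=1)]

print(missing_per_column)
print(rows_with_missing)
```

Filtering with any(axis=1) is one way to inspect the affected rows rather than only counting them.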
intermediate
What pandas method helps to check if all values in a column meet a condition?
The all() method can be used after a condition. For example, (df['age'] > 0).all() checks if all ages are positive.
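The condition check above can be sketched as follows. The data is hypothetical, and listing the offending rows is an extra step added here for illustration. Note that a comparison against a missing value (NaN) evaluates to False, so a column with gaps would also fail this check.

```python
import pandas as pd

# Hypothetical sample data with one invalid age (0).
df = pd.DataFrame({"age": [25, 41, 0, 33]})

# True only if every value satisfies the condition.
all_positive = bool((df["age"] > 0).all())

# Rows that break the rule, useful for follow-up inspection.
invalid_rows = df[~(df["age"] > 0)]

print(all_positive)
print(invalid_rows)
```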
beginner
How do you find duplicate rows in a pandas DataFrame?
Use df.duplicated() to get a boolean Series marking duplicates. To see duplicates, use df[df.duplicated()].
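A short sketch of the duplicate check, using made-up data. By default duplicated() uses keep="first", so the first occurrence of a repeated row is not flagged; passing keep=False flags every copy.

```python
import pandas as pd

# Hypothetical sample data; the second and third rows are identical.
df = pd.DataFrame({
    "id": [1, 2, 2, 3],
    "city": ["Oslo", "Lima", "Lima", "Pune"],
})

# Boolean Series: True marks a row that repeats an earlier row.
is_duplicate = df.duplicated()

# Only the repeated rows (first occurrences are excluded by default).
duplicate_rows = df[is_duplicate]

# Every copy of a duplicated row, including the first occurrence.
all_copies = df[df.duplicated(keep=False)]

# Number of rows flagged as duplicates.
duplicate_count = int(is_duplicate.sum())

print(duplicate_rows)
print(duplicate_count)
```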
intermediate
Explain how to validate data types of columns in pandas.
Use df.dtypes to see data types of each column. You can check if a column has the expected type, e.g., df['price'].dtype == 'float64'.
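The type check above might look like this in practice. The data and column names are hypothetical. The helpers in pandas.api.types are shown as a looser alternative to an exact dtype comparison, for cases where any integer or numeric type is acceptable.

```python
import pandas as pd

# Hypothetical sample data; column names are illustrative only.
df = pd.DataFrame({
    "price": [9.99, 14.5, 3.0],
    "quantity": [1, 4, 2],
})

# Series mapping each column name to its dtype.
print(df.dtypes)

# Exact comparison against an expected dtype.
price_is_float64 = df["price"].dtype == "float64"

# Looser checks that accept any integer or any numeric dtype.
quantity_is_integer = pd.api.types.is_integer_dtype(df["quantity"])
price_is_numeric = pd.api.types.is_numeric_dtype(df["price"])
```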
Which pandas method helps identify missing values in a DataFrame?
The isnull() method returns a DataFrame of booleans indicating missing values.
How do you check if all values in a column 'age' are greater than zero?
Using (df['age'] > 0).all() returns True only if every value in 'age' is greater than zero.
What does df.duplicated() return?
df.duplicated() returns a boolean Series where True marks duplicate rows.
Which method shows the data types of each column in a DataFrame?
df.dtypes returns a Series with the data type of each column.
To count missing values per column, which code is correct?
df.isnull().sum() counts missing values in each column.
Describe how you would perform basic data validation checks on a new dataset using pandas.
Think about common data issues like missing data, duplicates, and wrong types.
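One possible way to combine these checks is sketched below. The function basic_validation_report and its expected_dtypes parameter are hypothetical names introduced here for illustration; they are not part of pandas. The sample data is also made up.

```python
import numpy as np
import pandas as pd

def basic_validation_report(df, expected_dtypes):
    """Hypothetical helper: gather simple validation results in a dict."""
    # Columns whose actual dtype differs from the expected one.
    dtype_mismatches = {
        col: str(df[col].dtype)
        for col, expected in expected_dtypes.items()
        if col in df.columns and df[col].dtype != expected
    }
    return {
        "missing_per_column": df.isnull().sum().to_dict(),
        "duplicate_rows": int(df.duplicated().sum()),
        "dtype_mismatches": dtype_mismatches,
    }

# Hypothetical data: one missing age, one repeated row, prices stored as text.
df = pd.DataFrame({
    "age": [34.0, np.nan, 29.0, 29.0],
    "price": ["9.99", "14.50", "3.00", "3.00"],
})

report = basic_validation_report(df, {"age": "float64", "price": "float64"})
print(report)
```

Returning a plain dict keeps the sketch simple; a real project might raise errors or log warnings instead.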
Explain why data validation checks are important before analyzing data.
Consider what happens if you analyze bad data.