Dropping missing values (dropna) in Data Analysis Python - Cheat Sheet & Quick Revision

Recall & Review
beginner
What does the dropna() function do in data analysis?
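The pandas dropna() method removes rows or columns that contain missing values (NaN or None) from a DataFrame, helping to clean the data for analysis. It returns a new object, so the original is unchanged unless you reassign the result.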
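A minimal sketch of the default behaviour, assuming pandas and NumPy are installed. The DataFrame and its column names are hypothetical sample data, not part of the original card.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the column names are illustrative only.
df = pd.DataFrame({
    "name": ["Ann", "Ben", "Cara"],
    "age": [28, np.nan, 35],
    "city": ["Oslo", "Lima", None],
})

# With no arguments, dropna() drops every row that has at least one missing value.
cleaned = df.dropna()
print(cleaned)  # only row 0 ("Ann") has no missing values

# dropna() returns a new DataFrame; df itself still has all three rows.
print(len(df), len(cleaned))
```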
beginner
How can you use dropna() to remove columns with missing values instead of rows?
Pass axis=1 (or axis='columns'): df.dropna(axis=1) removes every column that contains at least one missing value and leaves the rows untouched.
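As a rough illustration, the sketch below drops columns instead of rows. The data is made up for the example.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: only the "score" column contains a missing value.
df = pd.DataFrame({
    "id": [1, 2, 3],
    "score": [90.0, np.nan, 75.0],
    "grade": ["A", "B", "C"],
})

# axis=1 switches dropna() from rows to columns.
no_missing_cols = df.dropna(axis=1)
print(no_missing_cols.columns.tolist())  # "score" is dropped, all 3 rows remain

# axis="columns" is an equivalent, more readable spelling.
same_result = df.dropna(axis="columns")
```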
intermediate
What does the parameter how='all' do in dropna()?
The parameter how='all' tells dropna() to drop only those rows or columns where all values are missing.
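A small sketch contrasting how='all' with the default how='any', using hypothetical data in which one row is entirely missing and another is only partly missing.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data:
# row 0 is complete, row 1 is partly missing, row 2 is entirely missing.
df = pd.DataFrame({
    "a": [1.0, np.nan, np.nan],
    "b": [2.0, 5.0, np.nan],
})

# how="all" drops a row only when every value in it is missing.
drop_if_all_missing = df.dropna(how="all")
print(drop_if_all_missing.index.tolist())  # rows 0 and 1 are kept

# how="any" (the default) drops a row as soon as one value is missing.
drop_if_any_missing = df.dropna(how="any")
print(drop_if_any_missing.index.tolist())  # only row 0 is kept
```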
intermediate
What is the effect of dropna(thresh=2) on a DataFrame?
By default (axis=0) it keeps only rows with at least 2 non-missing values and drops rows with fewer. With axis=1, the same threshold is applied to columns instead.
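The sketch below applies thresh=2 first to rows and then to columns. The DataFrame is hypothetical sample data chosen so that each row and column has a different count of valid entries.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data.
# Non-missing counts per row:    row 0 -> 3, row 1 -> 2, row 2 -> 1
# Non-missing counts per column: a -> 1, b -> 2, c -> 3
df = pd.DataFrame({
    "a": [1.0, np.nan, np.nan],
    "b": [2.0, 5.0, np.nan],
    "c": [3.0, 6.0, 9.0],
})

# Keep rows that have at least 2 non-missing values.
rows_kept = df.dropna(thresh=2)
print(rows_kept.index.tolist())  # rows 0 and 1; row 2 has only one valid value

# With axis=1 the same threshold is applied to columns.
cols_kept = df.dropna(axis=1, thresh=2)
print(cols_kept.columns.tolist())  # "b" and "c"; "a" has only one valid value
```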
beginner
Why is it important to drop missing values before analysis?
Missing values can cause errors or misleading results in calculations and models. Dropping them is one simple way to avoid this, at the cost of discarding some data.
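One way a missing value can distort a calculation, sketched with made-up numbers: a plain NumPy mean propagates NaN, while the same data with the missing entry dropped gives a usable result.

```python
import numpy as np
import pandas as pd

# Hypothetical measurements with one missing entry.
values = np.array([10.0, np.nan, 30.0])

# A plain NumPy mean propagates the NaN, so the result is NaN as well.
raw_mean = np.mean(values)
print(raw_mean)

# Dropping the missing entry first gives the mean of the two valid values.
clean_mean = pd.Series(values).dropna().mean()
print(clean_mean)
```

Note that pandas reductions such as Series.mean() skip NaN by default, so the risk depends on which library function performs the calculation.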
What does df.dropna() do by default?
A. Fills missing values with zero
B. Drops columns with any missing values
C. Drops rows with any missing values
D. Keeps all rows and columns
How do you drop columns with missing values using dropna()?
A. df.dropna(thresh=1)
B. df.dropna(axis=0)
C. df.dropna(how='all')
D. df.dropna(axis=1)
What does dropna(how='all') do?
A. Drops rows or columns where all values are missing
B. Drops rows or columns with any missing value
C. Drops no rows or columns
D. Fills missing values with the mean
What does the thresh parameter control in dropna()?
A. Minimum number of non-missing values required to keep the row/column
B. Maximum number of missing values allowed
C. Whether to drop rows or columns
D. The value to replace missing data
Why might you choose to drop missing values instead of filling them?
A. Because missing values are always errors
B. To avoid introducing bias from incorrect guesses
C. Because filling is always slower
D. To increase dataset size
Explain how to use dropna() to clean a dataset by removing rows or columns with missing values.
Hint: think about the axis and how parameters of dropna().
Describe why handling missing data is important before doing data analysis.
Hint: consider the impact of missing values on results.