
Basic DataFrame info (shape, dtypes, describe) in Data Analysis Python - Step-by-Step Execution

Concept Flow - Basic DataFrame info (shape, dtypes, describe)
Create DataFrame → Check shape → Check dtypes → Run describe() → Review summary statistics
Start with a DataFrame, then check its size, data types, and summary statistics step-by-step.
Execution Sample
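import pandas as pd

# Build a small DataFrame: two numeric columns and one text column
data = {'Age': [25, 30, 22], 'Name': ['Ann', 'Bob', 'Cal'], 'Score': [88.5, 92.0, 85.0]}
df = pd.DataFrame(data)

print(df.shape)       # (rows, columns) tuple
print(df.dtypes)      # data type of each column
print(df.describe())  # summary statistics for the numeric columns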
This code creates a DataFrame and shows its shape, data types, and summary statistics.
Execution Table
Step | Action | Output Type | Output Value
--- | --- | --- | ---
1 | Create DataFrame from dictionary | DataFrame | {'Age': [25, 30, 22], 'Name': ['Ann', 'Bob', 'Cal'], 'Score': [88.5, 92.0, 85.0]}
2 | Check shape | tuple | (3, 3)
3 | Check dtypes | Series | Age int64, Name object, Score float64 (dtype: object)
4 | Run describe() | DataFrame | Summary statistics for Age and Score (listed below)

Step 4 output:

                 Age      Score
    count   3.000000   3.000000
    mean   25.666667  88.500000
    std     4.041452   3.500000
    min    22.000000  85.000000
    25%    23.500000  86.750000
    50%    25.000000  88.500000
    75%    27.500000  90.250000
    max    30.000000  92.000000
💡 All DataFrame info methods executed successfully.
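The numbers that describe() reports can be reproduced one at a time, which helps when a value looks surprising. The sketch below reuses the lesson's data and recomputes the Age column by hand; the names `summary`, `age`, and `manual` are illustrative, not part of the lesson.

```python
import pandas as pd

# Same data as the execution sample.
df = pd.DataFrame({
    'Age': [25, 30, 22],
    'Name': ['Ann', 'Bob', 'Cal'],
    'Score': [88.5, 92.0, 85.0],
})

summary = df.describe()
age = df['Age']

# describe() reports the sample standard deviation (ddof=1),
# which is also the default of Series.std().
# Quartiles use linear interpolation between the sorted values.
manual = {
    'count': age.count(),
    'mean': age.mean(),
    'std': age.std(),
    'min': age.min(),
    '25%': age.quantile(0.25),
    '50%': age.quantile(0.50),
    '75%': age.quantile(0.75),
    'max': age.max(),
}

# Print each hand-computed value next to the describe() value.
for stat, value in manual.items():
    print(f"{stat:>5}: {float(value):.6f}  (describe: {summary.loc[stat, 'Age']:.6f})")
```

Each printed pair should agree. The same check works for Score by swapping the column name.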
Variable Tracker
VariableStartAfter Step 1After Step 2After Step 3After Step 4
dfNoneDataFrame with 3 rows and 3 columnsSameSameSame
shapeNoneNone(3, 3)(3, 3)(3, 3)
dtypesNoneNoneNoneAge int64, Name object, Score float64Age int64, Name object, Score float64
describeNoneNoneNoneNoneSummary stats DataFrame for numeric columns
Key Moments - 3 Insights
Why does describe() only show statistics for numeric columns?
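When a DataFrame mixes numeric and non-numeric columns, describe() summarizes only the numeric ones by default, so the text column 'Name' is excluded. Step 4 of the execution table shows statistics for 'Age' and 'Score' only. Passing include='all' adds the non-numeric columns to the summary.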
What does the shape (3, 3) mean exactly?
df.shape is a (rows, columns) tuple with the row count first, so (3, 3) means 3 rows and 3 columns, as shown in step 2 of the execution table.
Why is 'Name' dtype object and not string?
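pandas has traditionally stored text columns with the 'object' dtype, a general-purpose type that can hold any Python object, including strings. Newer pandas versions may report a dedicated string dtype instead. Step 3 of the execution table shows the 'object' result.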
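The three points above can be sketched together on the lesson's data. This is a minimal sketch: `select_dtypes` and `describe` are standard pandas calls, while the variable names are illustrative. Splitting columns by kind, rather than checking for a specific dtype name, is an assumption made here so that the example does not depend on how a given pandas version labels text columns.

```python
import pandas as pd

df = pd.DataFrame({
    'Age': [25, 30, 22],
    'Name': ['Ann', 'Bob', 'Cal'],
    'Score': [88.5, 92.0, 85.0],
})

# shape is a plain (rows, columns) tuple, so it can be unpacked.
n_rows, n_cols = df.shape
print(n_rows, n_cols)  # 3 3

# Split the columns into numeric and non-numeric groups.
numeric_cols = df.select_dtypes(include='number').columns.tolist()
text_cols = df.select_dtypes(exclude='number').columns.tolist()
print(numeric_cols)  # ['Age', 'Score']
print(text_cols)     # ['Name']

# With no numeric columns present, describe() summarizes the text
# column instead, reporting count, unique, top, and freq.
text_summary = df[text_cols].describe()
print(text_summary)
```

For text columns the summary reports how many values there are, how many are distinct, and the most frequent value with its frequency, rather than a mean or standard deviation.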
Visual Quiz - 3 Questions
Test your understanding
Look at step 2 of the execution table. What does the shape (3, 3) represent?
A. 3 rows and 3 columns
B. 3 columns and 3 rows
C. 3 rows and 2 columns
D. 3 columns and 2 rows
💡 Hint
Check the 'Output Value' column for step 2 of the execution table.
At which step do we see the data types of each column?
A. Step 1
B. Step 2
C. Step 3
D. Step 4
💡 Hint
Look for 'Check dtypes' in the 'Action' column of the execution table.
If the DataFrame had a new column with text data, how would the describe() output change?
A. It would include summary stats for the new text column
B. It would ignore the new text column by default
C. It would cause an error
D. It would convert text to numbers automatically
💡 Hint
Refer to the Key Moments answer about how describe() treats text columns.
Concept Snapshot
Basic DataFrame info:
- df.shape returns the row and column counts as a (rows, columns) tuple
- df.dtypes shows the data type of each column
- df.describe() summarizes numeric columns by default
Use these to quickly understand your data size and types.
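As a rough illustration of using all three together on unfamiliar data, the sketch below wraps them in one function. `quick_overview` is a hypothetical helper, not a pandas function, and the `inventory` data is made up for the example.

```python
import pandas as pd


def quick_overview(frame: pd.DataFrame) -> dict:
    """Collect shape, dtypes, and describe() output (hypothetical helper)."""
    return {
        'shape': frame.shape,          # (rows, columns) tuple
        'dtypes': frame.dtypes,        # Series: column name -> dtype
        'summary': frame.describe(),   # numeric columns only, by default
    }


# Made-up sample data, used only to exercise the helper.
inventory = pd.DataFrame({
    'item': ['pen', 'book', 'lamp', 'desk'],
    'price': [1.5, 12.0, 30.0, 120.0],
    'stock': [100, 40, 15, 3],
})

overview = quick_overview(inventory)
print(overview['shape'])
print(overview['dtypes'])
print(overview['summary'])
```

Returning the three results in a dictionary keeps them available for later checks instead of only printing them.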
Full Transcript
We start by creating a DataFrame from a dictionary with three columns: Age, Name, and Score. Then we check the shape using df.shape, which tells us the DataFrame has 3 rows and 3 columns. Next, df.dtypes shows the data type of each column: Age is integer, Name is object (text), and Score is float. Finally, df.describe() gives summary statistics such as count, mean, and min for the numeric columns only, excluding the text column. This step-by-step walkthrough helps us understand the size, types, and basic statistics of our data.