
Changing data types (astype) in Data Analysis Python

Introduction

Data often arrives in the wrong type, such as numbers stored as text. Converting columns to the right type lets you analyze them correctly. Typical situations:

When numbers are read as text but you want to do math with them.
When dates are stored as strings and you want to sort, filter, or do arithmetic on them as dates.
When you want to save memory by storing numbers in a smaller type (for example, int32 instead of int64).
When you want to convert repeated text labels into the pandas category type.
When you want to prepare data for machine learning models that need specific types.
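
To make the first case concrete, here is a minimal sketch with made-up values (the name `ages_text` is only for illustration). While the numbers are stored as text, `+` joins the strings instead of adding them; after `astype(int)` the same operation does arithmetic.

```python
import pandas as pd

# Hypothetical values: numbers that were read in as text
ages_text = pd.Series(['25', '30', '22'])

# Adding text columns concatenates the strings
text_result = ages_text + ages_text

# After converting to integers, + performs real addition
ages_num = ages_text.astype(int)
num_result = ages_num + ages_num

print(text_result.tolist())
print(num_result.tolist())
```
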
Syntax
Data Analysis Python
new_data = old_data.astype(new_type)

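astype() is a method on pandas Series and DataFrames. It returns a new object and leaves the original unchanged, which is why the result is assigned back.

You can pass a type name as a string ('int', 'float', 'str', 'category'), a Python type such as int, or a NumPy dtype such as np.int32.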
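
On a DataFrame, astype() also accepts a dictionary that maps column names to target types, so several columns can be converted in one call. The sketch below uses a made-up DataFrame; the column names and the variable `converted` are assumptions for illustration.

```python
import pandas as pd

# Hypothetical DataFrame where both columns start out as text
df = pd.DataFrame({'age': ['25', '30'], 'price': ['10.5', '20.3']})

# Map each column name to its target type; unlisted columns are left alone
converted = df.astype({'age': 'int64', 'price': 'float64'})

print(converted.dtypes)
# The original is unchanged because astype() returns a new object
print(df.dtypes)
```
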

Examples
Convert the 'age' column to integers.
Data Analysis Python
df['age'] = df['age'].astype('int')
Convert the 'price' column to floating point numbers.
Data Analysis Python
df['price'] = df['price'].astype('float')
Convert the 'date' column to datetime type.
Data Analysis Python
df['date'] = pd.to_datetime(df['date'])
Convert the 'category' column to a category type to save memory and improve performance.
Data Analysis Python
df['category'] = df['category'].astype('category')
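
As a rough illustration of what the date conversion enables, the sketch below parses a few made-up ISO date strings and then uses them as dates. The names `dates_text`, `years`, and `days_between` are hypothetical.

```python
import pandas as pd

# Hypothetical date strings in ISO format (year-month-day)
dates_text = pd.Series(['2024-01-15', '2024-02-20', '2025-03-05'])

# Parse the strings into datetime values
dates = pd.to_datetime(dates_text)

# The .dt accessor is available once the column holds datetime-like values
years = dates.dt.year

# Subtracting two dates gives a time difference
days_between = (dates.iloc[1] - dates.iloc[0]).days

print(years.tolist())
print(days_between)
```
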
Sample Program

This code shows how to convert columns from strings to proper types like int, float, and category. It prints the data types before and after the change.

Data Analysis Python
import pandas as pd

# Create a sample DataFrame where every column is stored as text
data = {'age': ['25', '30', '22'], 'price': ['10.5', '20.3', '15.0'], 'category': ['A', 'B', 'A']}
df = pd.DataFrame(data)

print('Before changing types:')
print(df.dtypes)

# Change 'age' to int
s_age = df['age'].astype(int)
# Change 'price' to float
s_price = df['price'].astype(float)
# Change 'category' to category type
s_cat = df['category'].astype('category')

# Assign back to DataFrame
df['age'] = s_age
df['price'] = s_price
df['category'] = s_cat

print('\nAfter changing types:')
print(df.dtypes)
Important Notes

If a value cannot be converted (for example, 'abc' to int), astype() raises a ValueError.
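
The failure can be seen in a small sketch with a made-up column. It also shows one possible alternative, `pd.to_numeric(..., errors='coerce')`, which replaces unparseable values with NaN instead of raising; whether that is appropriate depends on your data.

```python
import pandas as pd

# Hypothetical column with one value that is not a number
raw = pd.Series(['25', 'abc', '22'])

try:
    raw.astype(int)
    failed = False
except ValueError as err:
    # astype() raises when it meets a value it cannot convert
    failed = True
    print('Conversion failed:', err)

# Alternative: turn unparseable values into NaN instead of raising
coerced = pd.to_numeric(raw, errors='coerce')
print(coerced.tolist())
```

Note that the coerced column is float rather than int, because NaN cannot be stored in a plain integer column.
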

For dates, use pd.to_datetime() to parse the strings. Its result is already a datetime column, so a follow-up astype() is usually unnecessary.

Converting to 'category' can save memory when you have repeated text values.
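
A quick way to check the saving on your own data is to compare `memory_usage(deep=True)` before and after. The sketch below uses a made-up column of three labels repeated many times; the exact byte counts depend on the pandas version, so none are shown here.

```python
import pandas as pd

# Hypothetical column: three labels repeated many times
labels = pd.Series(['A', 'B', 'C'] * 10_000)
as_category = labels.astype('category')

# deep=True includes the memory used by the strings themselves
text_bytes = labels.memory_usage(deep=True)
category_bytes = as_category.memory_usage(deep=True)

print(text_bytes, category_bytes)
```
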

Summary

Use astype() to change data types of pandas columns easily.

Correct data types help with accurate calculations and faster processing.

Check for missing, malformed, or out-of-range values before converting, since they can make astype() fail or produce wrong results.
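
One way to check before converting, sketched here for the memory-saving case: confirm that every value fits in the smaller integer type before calling astype(). The column `ages` and the choice of int8 are assumptions for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical column of small whole numbers, stored as int64
ages = pd.Series([25, 30, 22], dtype='int64')

# Check that every value fits in int8 (-128 to 127) before converting
limits = np.iinfo(np.int8)
fits = ages.between(limits.min, limits.max).all()

if fits:
    ages_small = ages.astype('int8')
else:
    ages_small = ages  # keep the original type when values are out of range

print(ages_small.dtype)
```
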