Column expressions and functions
📖 Scenario: You work as a data analyst for a retail company. You have sales data with product names and their prices. You want to create a new column that shows the price after applying a 10% discount.
🎯 Goal: Create a Spark DataFrame with product names and prices, define a discount rate, apply a column expression to calculate discounted prices, and display the final DataFrame.
📋 What You'll Learn
Create a Spark DataFrame named products_df with columns product and price using the exact data provided.
Create a variable named discount_rate and set it to 0.10.
Use Spark column expressions and functions to add a new column discounted_price to products_df that applies the discount.
Show the resulting DataFrame using show().
💡 Why This Matters
🌍 Real World
Retail companies often need to adjust prices dynamically, such as applying discounts or taxes, and analyze the updated prices.
💼 Career
Data analysts and data engineers use Spark column expressions to transform and analyze large datasets efficiently.