Spot instances for cost savings
📖 Scenario: You work for a cloud services company that wants to analyze the cost savings from using spot instances instead of on-demand instances. Spot instances are cheaper but can be interrupted. You have data about instance types, their on-demand prices, and their spot prices. Your task is to calculate the percentage cost savings for each instance type when using spot instances.
🎯 Goal: Build a Spark DataFrame with instance pricing data, add a configuration variable for the minimum savings threshold, keep only the instance types whose savings meet or exceed this threshold, and display the results.
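The scenario does not spell out the savings formula, so the sketch below assumes the usual definition: the price difference divided by the on-demand price, times 100. The function name `savings_percent` and the sample prices are hypothetical and chosen only for illustration.

```python
def savings_percent(on_demand_price: float, spot_price: float) -> float:
    """Return the percentage saved by paying the spot price instead of on-demand.

    Assumed formula: (on_demand - spot) / on_demand * 100.
    """
    if on_demand_price <= 0:
        raise ValueError("on_demand_price must be positive")
    return (on_demand_price - spot_price) / on_demand_price * 100


# Made-up prices in USD per hour, not real cloud pricing.
print(savings_percent(1.00, 0.25))
```

A result of 0 means the spot price equals the on-demand price, and a negative result would mean the spot price is higher.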
📋 What You'll Learn
Create a Spark DataFrame with instance types and their on-demand and spot prices.
Add a configuration variable for minimum savings percentage.
Calculate the percentage savings for each instance type using spot instances.
Filter the DataFrame to only include instance types with savings greater than or equal to the threshold.
Display the filtered DataFrame.
💡 Why This Matters
🌍 Real World
Cloud engineers and data analysts use this kind of analysis to optimize cloud costs by choosing cheaper spot instances when possible.
💼 Career
Understanding how to manipulate Spark DataFrames and perform cost analysis is valuable for roles in cloud cost management, data engineering, and data science.