Hadoop vs Spark Comparison
📖 Scenario: You work as a data analyst in a company that uses big data tools. Your manager wants you to compare two popular big data frameworks: Hadoop and Spark. You will create a small dataset with their features and performance metrics, then filter and display the best option based on speed.
🎯 Goal: Build a Python program that stores Hadoop and Spark data in a dictionary, sets a speed threshold, filters frameworks faster than the threshold, and prints the filtered results.
📋 What You'll Learn
1. Create a dictionary called frameworks with keys 'Hadoop' and 'Spark', where each value is a dictionary containing 'speed' and 'ease_of_use' ratings.
2. Create a variable called speed_threshold with a numeric value.
3. Use a dictionary comprehension to create a new dictionary fast_frameworks containing only the frameworks whose speed is greater than speed_threshold.
4. Print the fast_frameworks dictionary.
💡 Why This Matters
🌍 Real World
Companies often compare big data tools like Hadoop and Spark to choose the best one for their needs based on speed and usability.
💼 Career
Data analysts and engineers must understand how to organize and filter data to make informed decisions about technology choices.
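The four steps above can be sketched as follows. The specific speed and ease_of_use ratings and the threshold value are illustrative assumptions, not real benchmark numbers:

```python
# Step 1: store each framework's metrics in a nested dictionary.
# The ratings below are placeholder values for the exercise.
frameworks = {
    'Hadoop': {'speed': 50, 'ease_of_use': 60},
    'Spark': {'speed': 90, 'ease_of_use': 80},
}

# Step 2: set the minimum speed a framework must exceed.
speed_threshold = 70

# Step 3: dictionary comprehension keeping only frameworks
# whose speed is greater than the threshold.
fast_frameworks = {
    name: info
    for name, info in frameworks.items()
    if info['speed'] > speed_threshold
}

# Step 4: print the filtered result.
print(fast_frameworks)  # → {'Spark': {'speed': 90, 'ease_of_use': 80}}
```

With these sample values, only Spark clears the threshold, so the comprehension drops Hadoop from the result.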