Practice - 5 Tasks
Answer the questions below
1. Fill in the blank
Easy: Complete the code to launch an EMR cluster using the AWS CLI.
aws emr create-cluster --name MyCluster --release-label emr-6.3.0 --applications Name=Hadoop [1]
Common Mistakes
Using --instance-type instead of --instance-count
Forgetting to specify the number of instances
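Answer: --instance-count followed by the number of instances.
The --instance-count option specifies the number of EC2 instances in the EMR cluster. It works together with --instance-type: one instance becomes the master node and the rest become core nodes.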
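As a rough illustration of the completed command, the sketch below prints it rather than running it. The count of 3, the m5.xlarge instance type, and --use-default-roles are assumptions added so the line is closer to something the AWS CLI would accept; they are not part of the exercise.

```shell
# Dry-run helper: prints the completed command instead of launching a cluster.
# Assumed values (not from the exercise): m5.xlarge and --use-default-roles.
emr_create_cmd() {
  count="$1"   # total EC2 instances: 1 master, the rest core nodes
  printf '%s\n' "aws emr create-cluster --name MyCluster --release-label emr-6.3.0 --applications Name=Hadoop --instance-type m5.xlarge --instance-count $count --use-default-roles"
}

emr_create_cmd 3
```

Running the printed line for real would also require configured AWS credentials and a default region.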
2. Fill in the blank
Medium: Complete the code to submit a Hadoop job on Google Dataproc.
gcloud dataproc jobs submit hadoop --cluster my-cluster --region us-central1 --jars myjob.jar [1]
Note: gcloud accepts either --jar or --class as the job's entry point, not both, so the JAR is supplied through --jars here.
Common Mistakes
Using --num-workers which is for cluster size, not job submission
Omitting the main class option
Answer: --class followed by the job's main class name.
The --class option specifies the main class of the Hadoop job to run.
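A completed submission might look like the sketch below, which only prints the command. The class name com.example.MyJob is hypothetical. Because gcloud documents --class and --jar as alternative entry points, this sketch puts the JAR on the classpath with --jars instead.

```shell
# Dry-run helper: prints the job submission command.
# com.example.MyJob is a hypothetical class name; substitute the real one.
dataproc_submit_cmd() {
  main_class="$1"   # fully qualified name of the class containing main()
  printf '%s\n' "gcloud dataproc jobs submit hadoop --cluster my-cluster --region us-central1 --jars myjob.jar --class $main_class"
}

dataproc_submit_cmd com.example.MyJob
```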
3. Fill in the blank
Hard: Fix the error in the HDInsight script to create a Hadoop cluster with 4 worker nodes.
az hdinsight create --name mycluster --resource-group mygroup --type Hadoop --location eastus [1]
Common Mistakes
Using --workernode-size, which sets the VM size of each worker node, not the count
Forgetting to specify worker node count
Answer: --workernode-count 4.
The --workernode-count option sets the number of worker nodes in the HDInsight cluster.
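The sketch below shows one way the finished command could be assembled, with a small guard on the worker count. It prints the command only. Note that the Azure CLI spells the flag --workernode-count. A real deployment would also need credential and storage options, which are omitted here as outside the exercise.

```shell
# Dry-run helper: prints the HDInsight create command.
# Rejects an empty or non-numeric worker count before building the line.
hdinsight_create_cmd() {
  workers="$1"
  case "$workers" in
    ''|*[!0-9]*) echo "worker count must be a whole number" >&2; return 1 ;;
  esac
  printf '%s\n' "az hdinsight create --name mycluster --resource-group mygroup --type Hadoop --location eastus --workernode-count $workers"
}

hdinsight_create_cmd 4
```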
4. Fill in the blank
Hard: Fill both blanks to configure a Dataproc cluster with 3 master nodes and 5 worker nodes.
gcloud dataproc clusters create my-cluster --region us-central1 --num-masters=[1] --num-workers=[2]
Common Mistakes
Swapping master and worker counts
Using incorrect numbers for nodes
Answer: [1] is 3 and [2] is 5.
The --num-masters option sets the number of master nodes (1 for a standard cluster, 3 for high availability), and --num-workers sets the number of worker nodes.
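To make the master/worker distinction concrete, here is a minimal sketch that builds the command and refuses master counts other than 1 or 3, the two values Dataproc accepts for --num-masters. The function name is made up for illustration, and nothing is actually created.

```shell
# Dry-run helper: prints the cluster creation command.
# Dataproc allows 1 master (standard) or 3 masters (high availability).
dataproc_cluster_cmd() {
  masters="$1"
  workers="$2"
  case "$masters" in
    1|3) ;;
    *) echo "num-masters must be 1 or 3" >&2; return 1 ;;
  esac
  printf '%s\n' "gcloud dataproc clusters create my-cluster --region us-central1 --num-masters=$masters --num-workers=$workers"
}

dataproc_cluster_cmd 3 5
```

Swapping the two arguments (5 masters, 3 workers) fails the guard, which mirrors the first common mistake listed for this task.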
5. Fill in the blank
Hard: Fill all three blanks to create an EMR cluster with Hadoop, specify the instance type, and enable debugging.
aws emr create-cluster --name TestCluster --release-label emr-6.3.0 --applications Name=[1] --instance-type [2] --enable-debugging [3]
Common Mistakes
Using Spark instead of Hadoop for the application
Omitting the log URI for debugging
Using the wrong instance type
Answer: [1] is Hadoop, [2] is m5.xlarge, and [3] is --log-uri followed by an S3 location.
Specify Hadoop as the application and m5.xlarge as the instance type, and provide a log URI, which --enable-debugging requires.
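The filled-in command could be sketched as follows. The bucket path s3://my-bucket/logs/ is a hypothetical placeholder, and the helper only prints the command after checking that the log URI looks like an S3 location.

```shell
# Dry-run helper: prints the EMR command with debugging enabled.
# s3://my-bucket/logs/ used below is a hypothetical log location.
emr_debug_cmd() {
  log_uri="$1"
  case "$log_uri" in
    s3://*) ;;
    *) echo "log URI should be an S3 location" >&2; return 1 ;;
  esac
  printf '%s\n' "aws emr create-cluster --name TestCluster --release-label emr-6.3.0 --applications Name=Hadoop --instance-type m5.xlarge --enable-debugging --log-uri $log_uri"
}

emr_debug_cmd s3://my-bucket/logs/
```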