What is the most important factor when deciding the size of a Hadoop cluster?
Think about what directly impacts storage and processing needs.
The total volume of data determines how much storage and processing power is needed, which directly affects cluster size.
You have 10 TB of raw data. Hadoop uses replication factor 3 by default. What is the total storage needed in the cluster?
Remember Hadoop stores multiple copies of data for fault tolerance.
With replication factor 3, each data block is stored 3 times, so total storage is 10 TB * 3 = 30 TB.
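The replication arithmetic above can be sketched in a few lines of Python (the helper name `total_storage_tb` is illustrative, not part of any Hadoop API):

```python
def total_storage_tb(raw_data_tb, replication_factor=3):
    """Total cluster storage required when every block is stored
    replication_factor times (HDFS default is 3)."""
    return raw_data_tb * replication_factor

print(total_storage_tb(10))  # 30
```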
Given a Hadoop job requires 200 CPU cores and each node has 16 cores, how many nodes are needed?
Consider only whole nodes.
Divide total cores needed by cores per node and round up.
200 cores / 16 cores per node = 12.5 nodes, so 13 nodes are needed to meet the requirement.
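A minimal sketch of the round-up division, using Python's `math.ceil` (the function name `nodes_needed` is a hypothetical helper for this exercise):

```python
import math

def nodes_needed(total_cores, cores_per_node):
    """Round up, since only whole nodes can be provisioned."""
    return math.ceil(total_cores / cores_per_node)

print(nodes_needed(200, 16))  # 13
```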
You have a Hadoop cluster with 5 nodes. Each node has 4 TB storage. The replication factor is 3. How much usable storage is available in the cluster?
Calculate total raw storage then divide by replication factor.
Total raw storage is 5 nodes * 4 TB = 20 TB. Usable storage = 20 TB / 3 ≈ 6.67 TB.
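The usable-storage calculation can be expressed as a small sketch (the helper name `usable_storage_tb` is an assumption for illustration):

```python
def usable_storage_tb(nodes, tb_per_node, replication_factor=3):
    """Usable capacity = total raw capacity divided by the replication factor."""
    return nodes * tb_per_node / replication_factor

print(round(usable_storage_tb(5, 4), 2))  # 6.67
```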
You manage a Hadoop cluster running both batch and real-time jobs. Batch jobs need high storage capacity, while real-time jobs need low latency and high CPU. Which cluster sizing strategy best balances these needs?
Think about workload isolation and resource specialization.
Separating the workloads onto node groups optimized for each (storage-dense nodes for batch jobs, high-CPU low-latency nodes for real-time jobs) improves performance and resource utilization.