
Input splits and data locality in Hadoop - Interactive Code Practice

Practice - 5 Tasks
Answer the questions below
Task 1: fill in the blank (easy)

Complete the code to create input splits for a file in Hadoop.

Java (Hadoop)
FileInputFormat.addInputPath(job, new Path([1]));
Options:
A. job
B. "inputfile.txt"
C. "/user/data/input"
D. new Path()
Common Mistakes
Passing the job object instead of a path string.
Not using quotes around the path string.
Task 2: fill in the blank (medium)

Complete the code to get the number of input splits from the job context.

Java (Hadoop)
List<InputSplit> splits = inputFormat.getSplits([1]);
Options:
A. job
B. context
C. configuration
D. inputFormat
Common Mistakes
Passing the inputFormat object itself.
Passing an unrelated variable like context.
Task 3: fill in the blank (hard)

Fix the error in the code to ensure data locality is considered when processing splits.

Java (Hadoop)
for (InputSplit split : splits) {
    String[] locations = split.getLocations();
    if (locations.length > [1]) {
        System.out.println("Data is local to node: " + locations[0]);
    }
}
Options:
A. 1
B. 0
C. -1
D. locations.length
Common Mistakes
Checking if length > 1 which skips cases with exactly one location.
Using negative numbers in the condition.
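The locality check above can be sketched with plain Java, using hard-coded String arrays in place of the values InputSplit.getLocations() would return (the node names are invented for illustration):

```java
// Illustrative sketch only: String[][] stands in for per-split results
// of InputSplit.getLocations(); node names are invented.
public class LocalityDemo {
    public static void main(String[] args) {
        String[][] splitLocations = {
            { "node1", "node2" },  // split with two known replica locations
            {}                     // split with no known locations
        };
        for (String[] locations : splitLocations) {
            // Compare against 0, not 1: a split with exactly one
            // known location is still local to that node.
            if (locations.length > 0) {
                System.out.println("Data is local to node: " + locations[0]);
            } else {
                System.out.println("No locality information for this split");
            }
        }
    }
}
```

The strict `> 0` comparison is the point of the task: `> 1` would silently skip splits that have exactly one replica location.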
Task 4: fill in the blank (hard)

Fill both blanks to create a map of split locations and their corresponding split sizes.

Java (Hadoop)
Map<String, Long> splitSizes = new HashMap<>();
for (InputSplit split : splits) {
    String[] locations = split.getLocations();
    long size = split.[1]();
    for (String loc : locations) {
        splitSizes.put(loc, splitSizes.getOrDefault(loc, [2]) + size);
    }
}
Options:
A. getLength
B. 0L
C. getSize
D. null
Common Mistakes
Using getSize(), which is not an InputSplit method; the correct method is getLength().
Using null as the default value, which throws a NullPointerException when the sum is unboxed on the first accumulation.
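The aggregation pattern above can be sketched in plain Java, with hard-coded arrays standing in for InputSplit.getLocations() and InputSplit.getLength() (node names and sizes are invented):

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only: the arrays stand in for per-split results
// of InputSplit.getLocations() and getLength(); values are invented.
public class SplitSizeAggregation {
    public static void main(String[] args) {
        String[][] locations = { { "node1", "node2" }, { "node1" } };
        long[] sizes = { 100L, 50L };

        Map<String, Long> splitSizes = new HashMap<>();
        for (int i = 0; i < locations.length; i++) {
            for (String loc : locations[i]) {
                // 0L as the default avoids a NullPointerException when
                // the missing value would otherwise be unboxed as null.
                splitSizes.put(loc, splitSizes.getOrDefault(loc, 0L) + sizes[i]);
            }
        }
        System.out.println("node1=" + splitSizes.get("node1")); // node1=150
        System.out.println("node2=" + splitSizes.get("node2")); // node2=100
    }
}
```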
Task 5: fill in the blank (hard)

Fill all three blanks to filter input splits larger than 128MB and print their first location.

Java (Hadoop)
for (InputSplit split : splits) {
    if (split.[1]() > [2]) {
        String[] locs = split.[3]();
        if (locs.length > 0) {
            System.out.println("Large split at: " + locs[0]);
        }
    }
}
Options:
A. getLength
B. 134217728
C. getLocations
D. getSize
Common Mistakes
Using getSize(), which is not a valid InputSplit method; getLength() returns the split size in bytes.
Using the wrong byte value for 128 MB (128 × 1024 × 1024 = 134217728).
Confusing getLocations() with other methods.
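The 128 MB threshold works out to 128 × 1024 × 1024 = 134217728 bytes. A minimal sketch of the filter, with invented split lengths standing in for InputSplit.getLength():

```java
// Illustrative sketch only: hard-coded lengths stand in for
// per-split results of InputSplit.getLength(); values are invented.
public class LargeSplitFilter {
    public static void main(String[] args) {
        long threshold = 128L * 1024 * 1024; // 128 MB = 134217728 bytes
        long[] splitLengths = { 200_000_000L, 64_000_000L, 134_217_728L };
        for (long length : splitLengths) {
            // Strictly greater than the threshold, matching the quiz
            // condition, so a split of exactly 128 MB is not reported.
            if (length > threshold) {
                System.out.println("Large split of " + length + " bytes");
            }
        }
    }
}
```

Only the 200,000,000-byte split passes the filter here: 64,000,000 is below the threshold, and 134,217,728 equals it exactly.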