Complete the code to create input splits for a file in Hadoop.
FileInputFormat.addInputPath(job, new Path([1]));
The addInputPath method requires a Path object pointing to the input directory or file, so the blank is filled by the path string "/user/data/input".
Complete the code to get the number of input splits from the job context.
List<InputSplit> splits = inputFormat.getSplits([1]);
The getSplits method takes a JobContext (or a Job, which implements JobContext) in order to generate the splits, so job is the correct argument.
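To see what getSplits produces conceptually, here is a minimal plain-Java sketch with no Hadoop dependency. The SplitMath class and its numSplits helper are hypothetical illustrations, not Hadoop API; they only model the idea that a file is carved into split-size chunks plus a final partial chunk.

```java
// Hypothetical sketch (not Hadoop's actual implementation): FileInputFormat
// produces roughly one split per split-size chunk of the file, with a final
// partial split covering the remainder.
public class SplitMath {
    // Number of splits for a file of fileLen bytes at the given split size.
    static long numSplits(long fileLen, long splitSize) {
        if (fileLen == 0) return 0;
        return (fileLen + splitSize - 1) / splitSize; // ceiling division
    }

    public static void main(String[] args) {
        long mb128 = 128L * 1024 * 1024;
        // A 300 MB file yields 3 splits: 128 MB + 128 MB + 44 MB.
        System.out.println(numSplits(300L * 1024 * 1024, mb128));
    }
}
```

In real Hadoop code the split size also depends on block size and the configured minimum/maximum split sizes; this sketch assumes a single fixed split size for clarity.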
Fix the error in the code to ensure data locality is considered when processing splits.
for (InputSplit split : splits) { String[] locations = split.getLocations(); if (locations.length > [1]) { System.out.println("Data is local to node: " + locations[0]); } }
The condition should check if there is at least one location, so the length must be greater than 0.
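The corrected check can be exercised without a cluster. In this self-contained sketch, plain String arrays stand in for the host lists that InputSplit.getLocations() would return; the LocalityCheck class and isLocal helper are illustrative names, not Hadoop API.

```java
// Minimal sketch of the locality check, using String arrays to stand in
// for the host lists returned by InputSplit.getLocations().
public class LocalityCheck {
    // True when the split reports at least one host holding its data.
    static boolean isLocal(String[] locations) {
        return locations.length > 0;
    }

    public static void main(String[] args) {
        String[][] splits = { {"node1", "node2"}, {} };
        for (String[] locs : splits) {
            if (isLocal(locs)) {
                // Safe to read locs[0] only because isLocal guaranteed length > 0.
                System.out.println("Data is local to node: " + locs[0]);
            }
        }
    }
}
```

Checking length > 0 before indexing locations[0] is exactly what prevents an ArrayIndexOutOfBoundsException for splits with no reported hosts.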
Fill both blanks to create a map of split locations and their corresponding split sizes.
Map<String, Long> splitSizes = new HashMap<>(); for (InputSplit split : splits) { String[] locations = split.getLocations(); long size = split.[1](); for (String loc : locations) { splitSizes.put(loc, splitSizes.getOrDefault(loc, [2]) + size); } }
Common wrong answers: getSize(), which does not exist on InputSplit, and null as the default value, which causes a NullPointerException when the size is added to it. The correct method for the size of an InputSplit is getLength(), and the default value for a location not yet in the map should be 0L (zero as a long).
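The getOrDefault accumulation pattern can be demonstrated without Hadoop. In this self-contained sketch, hard-coded location arrays and sizes stand in for real InputSplits; the SplitSizeMap class and aggregate helper are illustrative names, not Hadoop API.

```java
import java.util.HashMap;
import java.util.Map;

// Self-contained sketch of the per-host size accumulation, with
// (locations[], size) pairs standing in for real InputSplits.
public class SplitSizeMap {
    static Map<String, Long> aggregate(String[][] locations, long[] sizes) {
        Map<String, Long> splitSizes = new HashMap<>();
        for (int i = 0; i < sizes.length; i++) {
            for (String loc : locations[i]) {
                // getOrDefault(loc, 0L) avoids a NullPointerException the
                // first time a host is seen.
                splitSizes.put(loc, splitSizes.getOrDefault(loc, 0L) + sizes[i]);
            }
        }
        return splitSizes;
    }

    public static void main(String[] args) {
        String[][] locs = { {"node1", "node2"}, {"node1"} };
        long[] sizes = { 100L, 50L };
        // node1 is credited with both splits (150), node2 with one (100).
        System.out.println(aggregate(locs, sizes));
    }
}
```

Had the default been null, the very first splitSizes.getOrDefault(loc, null) + size would throw a NullPointerException during unboxing, which is why 0L is required.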
Fill all three blanks to filter input splits larger than 128MB and print their first location.
for (InputSplit split : splits) { if (split.[1]() > [2]) { String[] locs = split.[3](); if (locs.length > 0) { System.out.println("Large split at: " + locs[0]); } } }
Common wrong answers: getSize(), which is not a valid InputSplit method, and mixing up which blank takes getLocations(). getLength() returns the size of the split in bytes; 128 MB equals 134,217,728 bytes (128L * 1024 * 1024); and getLocations() returns the array of nodes where the split's data is located.
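The size threshold and filter can be verified in plain Java. This self-contained sketch uses a tiny stand-in class instead of Hadoop's InputSplit; LargeSplitFilter, Split, and isLarge are illustrative names, with length and locations mirroring what getLength() and getLocations() would return.

```java
// Sketch of the filled-in filter, with a stand-in Split class instead of
// Hadoop's InputSplit.
public class LargeSplitFilter {
    // 128 MB expressed in bytes: 134,217,728.
    static final long THRESHOLD = 128L * 1024 * 1024;

    static class Split {
        final long length;          // stand-in for getLength()
        final String[] locations;   // stand-in for getLocations()
        Split(long length, String[] locations) {
            this.length = length;
            this.locations = locations;
        }
    }

    // Strictly greater than 128 MB, matching the quiz condition.
    static boolean isLarge(long lengthBytes) {
        return lengthBytes > THRESHOLD;
    }

    public static void main(String[] args) {
        Split[] splits = {
            new Split(200L * 1024 * 1024, new String[]{"nodeA"}),
            new Split(64L * 1024 * 1024, new String[]{"nodeB"}),
        };
        for (Split split : splits) {
            if (isLarge(split.length) && split.locations.length > 0) {
                System.out.println("Large split at: " + split.locations[0]);
            }
        }
    }
}
```

Note the long literal 128L: writing 128 * 1024 * 1024 in int arithmetic happens to fit here, but using long throughout avoids overflow for larger thresholds.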