Application Question 8 of 15 (Hard)
Hadoop - Performance Tuning
You have a large dataset that needs to be compressed and stored in HDFS. You want fast read access and moderate compression. Which codec should you choose and why?
A. No compression, to avoid any overhead
B. Gzip, because it provides the best compression ratio regardless of speed
C. LZO, because it is the slowest but compresses best
D. Snappy, because it offers fast decompression and reasonable compression
Step-by-Step Solution:
  1. Define the requirements: fast read access means fast decompression, combined with a moderate compression ratio.
  2. Match a codec to the requirements: Snappy is optimized for very fast decompression at a moderate compression ratio, making it well suited to workloads that read data frequently.
  3. Final Answer: Snappy, because it offers fast decompression and reasonable compression -> Option D
  4. Quick Check: fast read + moderate compression = Snappy ✓
Quick Trick: when fast reads matter most, reach for the Snappy codec ✓
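The speed-versus-ratio tradeoff behind this question can be demonstrated directly. The sketch below uses Python's standard-library codecs as illustrative stand-ins (zlib approximates Gzip/Deflate; bz2 and lzma are heavier, slower codecs). Snappy itself is not in the standard library, but the pattern is the same: stronger compression generally costs more decompression time.

```python
import bz2
import lzma
import time
import zlib

# Synthetic, highly repetitive sample data (hypothetical stand-in for a
# real HDFS dataset) so compression ratios are visible.
data = b"hadoop hdfs snappy codec benchmark " * 20000

for name, mod in [("zlib", zlib), ("bz2", bz2), ("lzma", lzma)]:
    compressed = mod.compress(data)

    # Time only decompression, since the question is about read speed.
    start = time.perf_counter()
    restored = mod.decompress(compressed)
    elapsed_ms = (time.perf_counter() - start) * 1000

    assert restored == data  # round-trip must be lossless
    ratio = len(data) / len(compressed)
    print(f"{name}: ratio {ratio:.1f}x, decompress {elapsed_ms:.2f} ms")
```

Running this typically shows lzma achieving the best ratio but the slowest decompression, which is exactly why a read-heavy HDFS workload favors a fast codec like Snappy over a ratio-optimized one like Gzip.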
Common Mistakes:
  • Choosing Gzip for its compression ratio while ignoring its slower decompression
  • Assuming LZO is the slowest codec; it is actually a fast codec, comparable in spirit to Snappy
  • Skipping compression entirely, which forfeits storage savings and I/O benefits
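To apply the answer in practice, Snappy can be enabled for MapReduce output via standard Hadoop configuration properties. This is a minimal sketch of a `mapred-site.xml` fragment; exact property names can vary across Hadoop versions, so check the documentation for your release.

```xml
<!-- Compress final job output with Snappy -->
<property>
  <name>mapreduce.output.fileoutputformat.compress</name>
  <value>true</value>
</property>
<property>
  <name>mapreduce.output.fileoutputformat.compress.codec</name>
  <value>org.apache.hadoop.io.compress.SnappyCodec</value>
</property>

<!-- Also compress intermediate map output to cut shuffle I/O -->
<property>
  <name>mapreduce.map.output.compress</name>
  <value>true</value>
</property>
<property>
  <name>mapreduce.map.output.compress.codec</name>
  <value>org.apache.hadoop.io.compress.SnappyCodec</value>
</property>
```

Note that Snappy is not splittable on its own, so for very large files it is commonly paired with a container format such as SequenceFile, Avro, ORC, or Parquet, which handle splitting at the block level.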