Challenge - 5 Problems
❓ Predict Output
Difficulty: intermediate
What is the output of this document parsing code?
Consider the following code that loads and parses a JSON document string. What will be the value of parsed_data after running this code?

import json

json_string = '{"name": "Alice", "age": 30, "skills": ["Python", "ML"]}'
parsed_data = json.loads(json_string)
💡 Hint
Remember that json.loads converts a JSON object string into a Python dictionary, and that Python displays strings with single quotes when the dictionary is printed.
Explanation
The json.loads function converts the JSON string into a Python dictionary: {'name': 'Alice', 'age': 30, 'skills': ['Python', 'ML']}. The single quotes are only how Python displays strings when the dictionary is printed; they are not part of the data.
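To make the explanation concrete, here is a minimal sketch that runs the code from the question and prints the result in two ways. The calls to print and json.dumps are additions for illustration and are not part of the original question.

```python
import json

json_string = '{"name": "Alice", "age": 30, "skills": ["Python", "ML"]}'
parsed_data = json.loads(json_string)

# The JSON object becomes a dict; the JSON array becomes a list.
print(type(parsed_data).__name__)  # dict

# Python's own display of the dict uses single quotes for strings.
print(parsed_data)  # {'name': 'Alice', 'age': 30, 'skills': ['Python', 'ML']}

# Serializing back to JSON restores the double quotes.
print(json.dumps(parsed_data))  # {"name": "Alice", "age": 30, "skills": ["Python", "ML"]}
```

The quoting style differs only in how the value is displayed; parsed_data["name"] is the plain string Alice in both cases.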
🧠 Conceptual
Difficulty: intermediate
Which format is best for loading structured text documents for ML?
You want to load a large collection of structured text documents for machine learning. Which document format is most suitable for easy parsing and extracting fields?
💡 Hint
Think about formats that clearly separate data fields for easy extraction.
Explanation
CSV files organize data in rows and columns, making it easy to parse and extract fields for ML tasks.
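As an illustration of how rows and columns map to extractable fields, the sketch below reads a small CSV with the standard csv module. The column names (doc_id, title, label) and the sample rows are hypothetical, and the text is read from an in-memory string rather than a real file.

```python
import csv
import io

# Hypothetical sample data standing in for a CSV file on disk.
csv_text = (
    "doc_id,title,label\n"
    "1,Invoice March,finance\n"
    "2,Lab report,science\n"
)

# DictReader uses the header row as keys, so each row is a field lookup.
rows = list(csv.DictReader(io.StringIO(csv_text)))

titles = [row["title"] for row in rows]
labels = [row["label"] for row in rows]

print(titles)  # ['Invoice March', 'Lab report']
print(labels)  # ['finance', 'science']
```

Note that csv returns every value as a string, so numeric columns need an explicit conversion before they are used as features.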
❓ Metrics
Difficulty: advanced
How to measure parsing success rate on a document dataset?
You have a dataset of 1000 documents to parse. Your parser successfully extracts data from 920 documents without errors. What is the parsing success rate?
💡 Hint
Success rate = (number of successful parses / total documents) * 100
Explanation
Parsing success rate is the percentage of documents parsed without errors: (920/1000)*100 = 92%.
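The formula can be sketched in a few lines. The helper names success_rate and count_successes are hypothetical, and json.loads is used only as a stand-in parser on a tiny made-up sample; the quiz does not say which parser or document format is involved.

```python
import json


def success_rate(successes: int, total: int) -> float:
    """Return the percentage of documents parsed without errors."""
    if total == 0:
        raise ValueError("total must be greater than zero")
    return 100 * successes / total


def count_successes(documents, parse) -> int:
    """Count documents for which parse() does not raise ValueError."""
    ok = 0
    for doc in documents:
        try:
            parse(doc)
            ok += 1
        except ValueError:  # json.JSONDecodeError is a ValueError subclass
            pass
    return ok


# The numbers from the question: 920 successes out of 1000 documents.
print(success_rate(920, 1000))  # 92.0

# Hypothetical sample: two valid JSON strings, two invalid ones.
sample = ['{"a": 1}', "[1, 2]", "{broken", ""]
print(success_rate(count_successes(sample, json.loads), len(sample)))  # 50.0
```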
🔧 Debug
Difficulty: advanced
Why does this XML parsing code raise an error?
This code tries to parse an XML document but raises an error. What is the cause?

import xml.etree.ElementTree as ET

xml_string = '<root><item>Value</item></root>'
root = ET.parse(xml_string)
💡 Hint
Check the expected input type for ET.parse.
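Explanation
ET.parse expects a filename or file object, not a string containing XML data. Given the XML text, it tries to open a file with that name and fails with an OSError (typically FileNotFoundError). To parse a string, use ET.fromstring instead.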
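A minimal sketch of the failure and the fix, using the same xml_string as the question. The parse_failed flag and the item_text variable are added here for illustration only.

```python
import xml.etree.ElementTree as ET

xml_string = '<root><item>Value</item></root>'

# ET.parse treats a str argument as a path and tries to open it as a file.
try:
    ET.parse(xml_string)
    parse_failed = False
except OSError:
    parse_failed = True

# ET.fromstring parses XML text directly and returns the root Element.
root = ET.fromstring(xml_string)
item_text = root.find("item").text

print(parse_failed)         # True
print(root.tag, item_text)  # root Value
```

Note that ET.fromstring returns an Element, whereas ET.parse returns an ElementTree whose root is obtained with getroot().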
❓ Model Choice
Difficulty: expert
Which model is best for extracting structured data from scanned document images?
You have scanned images of invoices and want to extract structured fields like date, total, and vendor name. Which AI model type is best suited for this task?
💡 Hint
Think about models that convert images to text and then extract information.
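Explanation
OCR (optical character recognition) converts the scanned images to text, and NER (named entity recognition) extracts structured fields such as date, total, and vendor name from that text, making this combination well suited to scanned document data extraction.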
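The second stage of that pipeline can be sketched without any model. The example below assumes the OCR step has already produced plain text, and it uses regular expressions as a simplified stand-in for a trained NER model. The sample invoice text, the field layout, and the function name extract_invoice_fields are all hypothetical.

```python
import re

# Hypothetical OCR output for one scanned invoice.
ocr_text = """ACME Supplies Ltd.
Invoice Date: 2024-03-15
Total: $1,234.50
"""


def extract_invoice_fields(text: str) -> dict:
    """Pull vendor, date, and total out of OCR text with simple patterns."""
    date = re.search(r"Date:\s*(\d{4}-\d{2}-\d{2})", text)
    total = re.search(r"Total:\s*\$?([\d,]+\.\d{2})", text)
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    return {
        # Assumption: the vendor name is the first non-empty line.
        "vendor": lines[0] if lines else None,
        "date": date.group(1) if date else None,
        "total": float(total.group(1).replace(",", "")) if total else None,
    }


fields = extract_invoice_fields(ocr_text)
print(fields)  # {'vendor': 'ACME Supplies Ltd.', 'date': '2024-03-15', 'total': 1234.5}
```

Fixed patterns like these break as soon as the invoice layout changes, which is the reason a learned NER model is preferred for real invoices.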