Experiment - Document loading and parsing
Problem: You want to load text documents and extract useful information for a machine learning model. Currently, the code reads documents but does not handle different formats well or clean the text properly.
Current Metrics: Parsing success rate: 70%, Text cleanliness score: 60%
Issue: The document loader drops parts of the text and lets unwanted characters through, which produces noisy data for the model.
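The two symptoms above (missing text and unwanted characters) often come from decoding and cleanup, so a minimal sketch of those two steps may help. The function names `decode_bytes` and `clean_text`, the list of fallback encodings, and the specific cleanup rules are assumptions made for illustration; they are not taken from the experiment's existing loader.

```python
import re
import unicodedata


def decode_bytes(raw: bytes, encodings=("utf-8-sig", "cp1252")) -> str:
    """Decode raw bytes, trying each candidate encoding in order.

    The encoding list is an assumption; adjust it to the documents at hand.
    If every candidate fails, undecodable bytes are replaced rather than
    dropping the whole document.
    """
    for enc in encodings:
        try:
            return raw.decode(enc)
        except UnicodeDecodeError:
            continue
    return raw.decode("utf-8", errors="replace")


def clean_text(text: str) -> str:
    """Remove common noise while keeping the readable content."""
    # Fold compatibility characters (ligatures, non-breaking spaces) into
    # their plain equivalents.
    text = unicodedata.normalize("NFKC", text)
    # Drop control and invisible format characters, keeping newlines and tabs.
    text = "".join(
        ch for ch in text
        if ch in "\n\t" or unicodedata.category(ch) not in ("Cc", "Cf")
    )
    # Collapse runs of spaces/tabs and limit blank lines to one.
    text = re.sub(r"[ \t]+", " ", text)
    text = re.sub(r"\n{3,}", "\n\n", text)
    return text.strip()


if __name__ == "__main__":
    sample = b"Hello\x00   world\xe2\x80\x8b\n\n\n\nEnd  "
    print(repr(clean_text(decode_bytes(sample))))
```

Decoding with a fallback targets the parsing success rate, while normalization and character filtering target the cleanliness score; both are worth measuring separately after the change.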