MLOps | devops | ~20 mins

Tracking datasets with DVC in MLOps - Practice Problems & Coding Challenges

Challenge - 5 Problems
💻 Command Output
intermediate
DVC add command output
You run the command dvc add data/raw_images in your project folder. What is the expected output?
dvc add data/raw_images
A. Warning: data/raw_images is empty, nothing added.
B. Adding data/raw_images to DVC.
   Computing checksum...
   Adding 'data/raw_images.dvc' file.
C. Error: data/raw_images directory not found.
D. dvc: command not found
💡 Hint
Think about what happens when you add a directory with files to DVC.
🧠 Conceptual
intermediate
Purpose of .dvc files
What is the main purpose of the .dvc files created when you track datasets with DVC?
A. They are temporary cache files created during data processing.
B. They store the actual dataset files inside the project folder.
C. They are configuration files for DVC remote storage credentials.
D. They contain metadata and checksums to track dataset versions without storing data directly.
💡 Hint
Think about how DVC tracks large files without putting them in Git.
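
The idea behind this hint can be sketched in Python: hash the data, then keep only a small metadata record that Git can version. This is a simplified illustration and not DVC's actual implementation. The helper name make_pointer and the sample file are hypothetical, and the field names (md5, size, path) only loosely mirror what a .dvc file records.

```python
import hashlib
import json
import os
import tempfile


def make_pointer(data_path):
    """Return a small metadata record describing the file at data_path.

    Hypothetical helper: it imitates the kind of information a .dvc file
    holds (checksum, size, path) without copying the data itself.
    """
    digest = hashlib.md5()
    with open(data_path, "rb") as fh:
        # Read in 1 MiB chunks so large files do not need to fit in memory.
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return {
        "md5": digest.hexdigest(),
        "size": os.path.getsize(data_path),
        "path": os.path.basename(data_path),
    }


# Hypothetical sample data standing in for a large dataset file.
workdir = tempfile.mkdtemp()
sample = os.path.join(workdir, "images.bin")
with open(sample, "wb") as fh:
    fh.write(b"hello")

pointer = make_pointer(sample)
print(json.dumps(pointer, indent=2))
```

The record stays a few lines long no matter how large the data file is, which is why a pointer like this is practical to commit to Git while the data lives elsewhere.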
🔀 Workflow
advanced
Correct sequence to track and push dataset with DVC
Arrange the steps in the correct order to track a new dataset folder data/images and push it to remote storage using DVC.
A. 1, 2, 3, 4
B. 2, 1, 3, 4
C. 1, 3, 2, 4
D. 3, 2, 1, 4
💡 Hint
Think about adding data, then staging files for Git, committing, then pushing data to remote.
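
The order described in this hint can be written out as a small Python sketch that builds the command sequence and, by default, only prints it. The helper names track_and_push_commands and run_all and the commit message are hypothetical. The sketch assumes a Git repository with DVC initialized and a default remote already configured, and that dvc add writes data/images.dvc plus an entry in data/.gitignore.

```python
import posixpath
import shlex
import subprocess


def track_and_push_commands(dataset):
    """Build the commands for tracking `dataset` with DVC and pushing it.

    Hypothetical helper; repo-relative paths use forward slashes.
    """
    parent = posixpath.dirname(dataset)
    return [
        # 1. Track the data: creates <dataset>.dvc and a .gitignore entry.
        ["dvc", "add", dataset],
        # 2. Stage the small metadata files for Git, not the data itself.
        ["git", "add", dataset + ".dvc", posixpath.join(parent, ".gitignore")],
        # 3. Record the dataset version in Git history.
        ["git", "commit", "-m", "Track " + dataset + " with DVC"],
        # 4. Upload the data to the configured DVC remote.
        ["dvc", "push"],
    ]


def run_all(commands, execute=False):
    """Print each command; run it only when execute=True."""
    for argv in commands:
        print("$ " + shlex.join(argv))
        if execute:
            subprocess.run(argv, check=True)


commands = track_and_push_commands("data/images")
run_all(commands)  # dry run: prints the four commands without executing them
```

Passing execute=True to run_all would run the commands for real, stopping at the first failure because of check=True.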
Troubleshoot
advanced
DVC push error diagnosis
You run dvc push but get the error: ERROR: failed to upload data/images: Access denied to remote storage. What is the most likely cause?
A. The .dvc file for data/images is corrupted.
B. The dataset folder data/images does not exist locally.
C. The remote storage credentials are missing or incorrect.
D. Git repository is not initialized.
💡 Hint
Think about what 'Access denied' means when pushing data.
Best Practice
expert
Best practice for dataset versioning with DVC
Which practice ensures reliable dataset versioning and collaboration when using DVC in a team?
A. Track datasets with DVC, commit .dvc files to Git, and push data to a shared remote storage.
B. Commit only the dataset files directly to Git to keep everything in one place.
C. Store datasets only on local machines and share via email to avoid remote storage issues.
D. Use DVC without Git to simplify version control.
💡 Hint
Think about how to keep data versions consistent and shareable in a team.