Practice

1. What does a Directed Acyclic Graph (DAG) represent in an MLOps pipeline?

easy

A. Tasks and their dependencies without any cycles

B. A loop of tasks that repeat indefinitely

C. Random tasks executed in parallel without order

D. Only the final output of a pipeline

Solution

Step 1: Understand DAG structure
A DAG is a graph whose edges have a direction and never form a cycle: following the edges from any node can never lead back to that node. Each edge represents a dependency.
Step 2: Relate DAG to pipeline tasks
In MLOps, tasks are nodes and dependencies are edges, ensuring tasks run in order without loops.
Final Answer:
Tasks and their dependencies without any cycles -> Option A
Quick Check:
DAG = tasks + dependencies without loops [OK]

Hint: DAG means no loops, just tasks linked in order [OK]

Common Mistakes:

Thinking DAG allows loops
Confusing DAG with random task order
Assuming DAG only shows final output
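
The "no cycles" property can be checked directly. The sketch below is a minimal illustration using Python's standard graphlib module rather than any MLOps tool; the task names (ingest, train, evaluate) are made up for the example.

```python
from graphlib import TopologicalSorter, CycleError

# Each key is a task; its value is the set of tasks it depends on.
# The task names are illustrative, not taken from a real pipeline.
pipeline = {
    "ingest": set(),
    "train": {"ingest"},
    "evaluate": {"train"},
}

# A valid DAG yields an order in which every task follows its dependencies.
order = list(TopologicalSorter(pipeline).static_order())
print(order)

# Making "ingest" depend on "evaluate" closes a loop, so the graph
# is no longer a DAG and no valid execution order exists.
cyclic = {**pipeline, "ingest": {"evaluate"}}
try:
    list(TopologicalSorter(cyclic).static_order())
    has_cycle = False
except CycleError:
    has_cycle = True
print(has_cycle)
```

Orchestrators reject cyclic graphs for the same reason: without a valid ordering there is no task that can safely run first.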

2. Which of the following is the correct syntax to define a simple DAG in Apache Airflow?

easy

A. dag = DAG('my_dag', interval='daily')

B. dag = DAG('my_dag' schedule='daily')

C. dag = DAG('my_dag', schedule='everyday')

D. dag = DAG('my_dag', schedule_interval='@daily')

Solution

Step 1: Check Airflow DAG syntax
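The DAG constructor takes the DAG ID as its first argument, followed by keyword arguments separated by commas. The schedule is set with schedule_interval (Airflow 2.4 and later also accept schedule and deprecate schedule_interval), and its value must be a cron expression, a preset such as '@daily', or a timedelta.
Step 2: Validate options
Option D uses schedule_interval with the valid preset '@daily'. Option A uses interval, which is not a DAG parameter. Option B is missing the comma between arguments, which is a syntax error. Option C passes 'everyday', which is neither a preset nor a cron expression.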
Final Answer:
dag = DAG('my_dag', schedule_interval='@daily') -> Option D
Quick Check:
Correct DAG syntax uses schedule_interval [OK]

Hint: Use schedule_interval='@daily' for daily DAGs [OK]

Common Mistakes:

Using a parameter name that does not exist, such as 'interval'
Passing a value that is not a preset or cron expression, such as 'daily' or 'everyday'
Missing commas between parameters
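
To see why '@daily' is accepted while 'daily' and 'everyday' are not, it helps to look at what a preset stands for. The sketch below lists a subset of Airflow's cron presets with their cron equivalents. The resolve_schedule helper is hypothetical, written only for this illustration, and it ignores the raw cron strings and timedelta values that Airflow also accepts.

```python
# A subset of the cron presets Airflow understands, with their cron equivalents.
CRON_PRESETS = {
    "@hourly": "0 * * * *",
    "@daily": "0 0 * * *",
    "@weekly": "0 0 * * 0",
    "@monthly": "0 0 1 * *",
    "@yearly": "0 0 1 1 *",
}


def resolve_schedule(value: str) -> str:
    """Hypothetical helper: return the cron expression for a preset, or raise."""
    if value in CRON_PRESETS:
        return CRON_PRESETS[value]
    raise ValueError(f"{value!r} is not a known preset")


print(resolve_schedule("@daily"))

# Values without the leading '@' are not presets.
rejected = []
for candidate in ("daily", "everyday"):
    try:
        resolve_schedule(candidate)
    except ValueError:
        rejected.append(candidate)
print(rejected)
```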

3. Given this Airflow DAG snippet, what is the order of task execution?

task1 = DummyOperator(task_id='task1', dag=dag)
task2 = DummyOperator(task_id='task2', dag=dag)
task3 = DummyOperator(task_id='task3', dag=dag)
task1 >> task2 >> task3

medium

A. task3, then task2, then task1

B. task1, then task2, then task3

C. task2, then task1, then task3

D. All tasks run in parallel

Solution

Step 1: Analyze task dependencies
The '>>' operator makes the task on its right downstream of the task on its left, so task1 runs before task2, and task2 before task3.
Step 2: Determine execution sequence
Tasks run in sequence: task1 first, then task2, then task3.
Final Answer:
task1, then task2, then task3 -> Option B
Quick Check:
task1 >> task2 >> task3 means sequential order [OK]

Hint: >> means run left task before right task [OK]

Common Mistakes:

Assuming tasks run in reverse order
Thinking tasks run in parallel
Ignoring the '>>' operator meaning
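
The chaining behaviour of '>>' can be imitated with ordinary Python operator overloading. The sketch below is a simplified stand-in, not Airflow's implementation: the Task class and the execution_order function are hypothetical names that only show how returning the right-hand operand lets the expression chain from left to right.

```python
class Task:
    """Minimal stand-in for an operator; not Airflow's implementation."""

    def __init__(self, task_id):
        self.task_id = task_id
        self.downstream = []  # tasks that must run after this one

    def __rshift__(self, other):
        # `a >> b` records b as downstream of a and returns b,
        # so `a >> b >> c` chains left to right.
        self.downstream.append(other)
        return other


def execution_order(first):
    """Walk a linear chain from its first task and collect the task ids."""
    order, current = [], first
    while current is not None:
        order.append(current.task_id)
        current = current.downstream[0] if current.downstream else None
    return order


task1, task2, task3 = Task("task1"), Task("task2"), Task("task3")
task1 >> task2 >> task3
print(execution_order(task1))
```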

4. You wrote this DAG code but get an error: TypeError: 'DAG' object is not iterable. What is the likely cause?

with DAG('example_dag', schedule_interval='@daily') as dag:
    task1 = DummyOperator(task_id='task1')
    task2 = DummyOperator(task_id='task2')
    task1 >> task2

for task in dag:
    print(task.task_id)

medium

A. DAG object is not iterable, so 'for task in dag' causes error

B. DummyOperator requires a 'dag' parameter outside the context

C. Missing import for DummyOperator

D. schedule_interval '@daily' is invalid

Solution

Step 1: Identify error cause
The error says 'DAG' object is not iterable, likely from trying to loop over dag object.
Step 2: Understand DAG iterability
Airflow's DAG class does not support direct iteration, so 'for task in dag' raises this TypeError. The tasks are available as a list through dag.tasks, which can be looped over instead.
Final Answer:
DAG object is not iterable, so 'for task in dag' causes error -> Option A
Quick Check:
DAG is not iterable; use dag.tasks list instead [OK]

Hint: DAG is not iterable; use dag.tasks to loop [OK]

Common Mistakes:

Trying to loop directly over DAG object
Assuming DummyOperator needs dag param outside context
Misreading error as import issue
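
The same error can be reproduced without Airflow. In the sketch below, Pipeline is a hypothetical class that, like the DAG in the question, defines no __iter__ method but exposes its tasks through a tasks attribute.

```python
class Pipeline:
    """Hypothetical container that defines no __iter__ method."""

    def __init__(self, task_ids):
        self.tasks = list(task_ids)  # the attribute that can be iterated


pipeline = Pipeline(["task1", "task2"])

# Looping over the object itself fails, as in the question.
try:
    for task in pipeline:
        print(task)
except TypeError as exc:
    error_message = str(exc)
print(error_message)

# Looping over the list attribute works.
seen = [task_id for task_id in pipeline.tasks]
print(seen)
```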

5. You want to create a pipeline where task A runs first, then tasks B and C run in parallel, and finally task D runs after both B and C finish. Which DAG structure correctly represents this?

hard

A. [A, B] >> C >> D

B. A >> B >> C >> D

C. A >> [B, C] >> D

D. A >> D >> [B, C]

Solution

Step 1: Understand task order requirements
Task A runs first, then B and C run at the same time, then D runs after both finish.
Step 2: Translate to DAG syntax
In Airflow, 'A >> [B, C] >> D' makes B and C both downstream of A, and D downstream of both. B and C do not depend on each other, so they can run in parallel.
Final Answer:
A >> [B, C] >> D -> Option C
Quick Check:
Parallel tasks in list brackets between sequential tasks [OK]

Hint: Use brackets [] for parallel tasks in DAG [OK]

Common Mistakes:

Placing tasks in wrong order
Not using brackets for parallel tasks
Assuming linear order for all tasks
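
The fan-out and fan-in shape can be traced with Python's standard graphlib module. The sketch below is an illustration rather than Airflow code: it encodes the same dependencies as A >> [B, C] >> D and groups the tasks into batches that become ready together.

```python
from graphlib import TopologicalSorter

# Map each task to the set of tasks it depends on: A >> [B, C] >> D.
dependencies = {
    "A": set(),
    "B": {"A"},
    "C": {"A"},
    "D": {"B", "C"},
}

sorter = TopologicalSorter(dependencies)
sorter.prepare()

batches = []
while sorter.is_active():
    ready = sorted(sorter.get_ready())  # tasks whose dependencies are all done
    batches.append(ready)
    sorter.done(*ready)  # mark the whole batch as finished

print(batches)
```

B and C land in the same batch because neither depends on the other, and D appears only after both are marked done.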
