
How to Use ZenML: Simple Steps to Build ML Pipelines

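To use ZenML, first install it with pip install zenml, then define each step as a Python function decorated with @step. Finally, assemble the steps inside a function decorated with @pipeline and run the pipeline by calling that function, for example hello_pipeline(). This applies to ZenML releases where step and pipeline are imported directly from zenml, as in the examples here.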
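Before writing any steps, it can help to confirm that the install succeeded. The following is a minimal sketch that uses only the Python standard library, so it runs whether or not ZenML is present; installed_version is a hypothetical helper name, not part of ZenML.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Hypothetical helper usage: report whether the zenml package can be found.
zenml_version = installed_version("zenml")
if zenml_version is None:
    print("ZenML is not installed; run: pip install zenml")
else:
    print(f"ZenML {zenml_version} is installed")
```

If the first message appears, run pip install zenml in the same Python environment and try again.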

📐

Syntax

ZenML uses Python decorators to define steps and pipelines. Each @step marks a function as a pipeline step, and @pipeline marks a function that wires steps into a workflow. You run the pipeline by calling the decorated pipeline function.

@step: Decorates a function to become a pipeline step.
@pipeline: Decorates a function that connects steps.
hello_pipeline(): Calling the decorated pipeline function executes the pipeline.

python

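from zenml import step, pipeline

@step
def say_hello() -> str:
    # Return the greeting so ZenML stores it as an artifact.
    return "Hello, ZenML!"

@step
def print_greeting(greeting: str) -> None:
    # Inside a step the input is the real value, so it can be printed.
    print(greeting)

@pipeline
def hello_pipeline():
    # Step calls here only wire the pipeline together; they return
    # artifact placeholders, not the actual string.
    greeting = say_hello()
    print_greeting(greeting)

if __name__ == "__main__":
    # Calling the decorated function runs the pipeline.
    hello_pipeline()

Output (ZenML's own log lines omitted)

Hello, ZenML!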

💻

Example

This example shows a simple ZenML pipeline with two steps: one generates data, and the other processes it. It demonstrates how to define steps, create a pipeline, and run it.

python

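from zenml import step, pipeline

@step
def generate_numbers() -> list[int]:
    return [1, 2, 3, 4, 5]

@step
def sum_numbers(numbers: list[int]) -> int:
    total = sum(numbers)
    # Print inside the step, where `numbers` is the real list.
    print(f"Sum is: {total}")
    return total

@pipeline
def sum_pipeline():
    # Passing one step's output to the next defines the execution order.
    numbers = generate_numbers()
    sum_numbers(numbers)

if __name__ == "__main__":
    # Calling the decorated function runs the pipeline.
    sum_pipeline()

Output (ZenML's own log lines omitted)

Sum is: 15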
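Because each step body is ordinary Python, its logic can be checked before it is wrapped in @step. The sketch below assumes the same two functions as the example, deliberately left undecorated so that it runs without ZenML; check_step_logic is a hypothetical helper name, not a ZenML API.

```python
# Plain-Python version of the example's step logic (no ZenML import needed).
def generate_numbers() -> list[int]:
    return [1, 2, 3, 4, 5]


def sum_numbers(numbers: list[int]) -> int:
    return sum(numbers)


def check_step_logic() -> int:
    # Chain the functions by hand, in the same order the pipeline chains the steps.
    numbers = generate_numbers()
    total = sum_numbers(numbers)
    assert total == 15, f"unexpected total: {total}"
    return total


if __name__ == "__main__":
    print(f"Sum is: {check_step_logic()}")
```

Once the logic behaves as expected, adding the @step decorators and the @pipeline function turns it into the pipeline shown in the example.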

⚠️

Common Pitfalls

Common mistakes when using ZenML include:

Not decorating functions with @step, so ZenML does not treat them as pipeline steps.
Forgetting to decorate the pipeline function with @pipeline.
Only printing inside steps instead of returning values: data that is not returned cannot be passed on to later steps.
Printing step results in the pipeline function body, where step calls return artifact placeholders rather than the actual data.
Defining the pipeline function but never calling it, so nothing runs.

Always return data from steps. To inspect a value, print it inside a step that receives it as an input.

python

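from zenml import step, pipeline

# Wrong: missing @step decorator
# def generate_data():
#     return [1, 2, 3]

@step
def generate_data() -> list[int]:
    return [1, 2, 3]

@step
def show_data(data: list[int]) -> None:
    # Inside a step the input is the real list, so printing works here.
    print(data)

# Wrong: missing @pipeline decorator
# def my_pipeline():
#     data = generate_data()
#     show_data(data)

@pipeline
def my_pipeline():
    data = generate_data()
    show_data(data)

if __name__ == "__main__":
    # Calling the decorated function runs the pipeline.
    my_pipeline()

Output (ZenML's own log lines omitted)

[1, 2, 3]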

📊

Quick Reference

Concept	Description	Example
@step	Marks a function as a pipeline step	@step\ndef step_func(): pass
@pipeline	Defines a pipeline connecting steps	@pipeline\ndef my_pipeline(): pass
Calling the pipeline	Runs the defined pipeline	my_pipeline()
Return values	Steps should return data, not only print it	return data
Step inputs	Steps can accept inputs from other steps	def step_func(input_data):
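ZenML reads a step's type annotations to decide how to handle its inputs and outputs, so annotating every parameter and the return value is a sensible habit. The following plain-Python sketch flags missing annotations before a function is decorated; unannotated_names and typed_step_func are hypothetical names introduced for illustration, and nothing here calls ZenML itself.

```python
import inspect


def unannotated_names(func) -> list[str]:
    """Return parameter names (plus "return") that lack a type annotation."""
    signature = inspect.signature(func)
    missing = [
        name
        for name, param in signature.parameters.items()
        if param.annotation is inspect.Parameter.empty
    ]
    if signature.return_annotation is inspect.Signature.empty:
        missing.append("return")
    return missing


def step_func(input_data):  # the untyped form from the table
    return input_data


def typed_step_func(input_data: list[int]) -> list[int]:  # annotated variant
    return input_data


print(unannotated_names(step_func))        # ['input_data', 'return']
print(unannotated_names(typed_step_func))  # []
```

An empty list means the function is fully annotated and ready to be decorated with @step.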

✅

Key Takeaways

Install ZenML and use @step to define pipeline steps.

Group steps into a workflow with a function decorated with @pipeline.

Always return data from steps instead of only printing inside them.

Decorators @step and @pipeline are required for ZenML to recognize functions.

Run your ML workflow by calling the pipeline function, for example my_pipeline().