How to Use ZenML: Simple Steps to Build ML Pipelines
To use ZenML, first install it with pip install zenml, then define steps as Python functions decorated with @step. Finally, assemble these steps into a pipeline with @pipeline and run it by calling the decorated pipeline function. Older ZenML releases instead ran a pipeline instance with pipeline.run().
Syntax
ZenML uses Python decorators to define steps and pipelines. Each @step marks a function as a pipeline step, and @pipeline groups these steps into a workflow. In releases that expose step and pipeline from the top-level zenml package, you run the pipeline by calling the decorated pipeline function; older releases called .run() on a pipeline instance.
- @step: Decorates a function to become a pipeline step.
- @pipeline: Decorates a function that connects steps.
- Calling the pipeline function, for example my_pipeline(): Executes the pipeline.
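```python
from zenml import pipeline, step


@step
def say_hello() -> str:
    greeting = "Hello, ZenML!"
    print(greeting)  # Printed while the step executes
    return greeting  # Returned so ZenML can track it as the step output


@pipeline
def hello_pipeline():
    # Step calls here only wire the pipeline together; they yield
    # references to outputs, not the actual values.
    say_hello()


if __name__ == "__main__":
    # Calling the decorated function runs the pipeline
    # (older releases used .run() on a pipeline instance).
    hello_pipeline()
```
Output (ZenML's own log lines omitted)
Hello, ZenML!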
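To make the decorator idea concrete, here is a plain-Python sketch of how a decorator such as @step can mark a function as a unit of work. It does not import ZenML and is not how ZenML is implemented; the names toy_step and STEP_REGISTRY are hypothetical and exist only for this illustration.

```python
from typing import Callable, Dict

# Hypothetical registry for this sketch only; ZenML keeps its own internals.
STEP_REGISTRY: Dict[str, Callable] = {}


def toy_step(func: Callable) -> Callable:
    """Record the function under its name and return it unchanged."""
    STEP_REGISTRY[func.__name__] = func
    return func


@toy_step
def say_hello() -> str:
    return "Hello, ZenML!"


def undecorated() -> str:
    # No decorator, so the registry never learns about this function.
    return "never registered"


print(sorted(STEP_REGISTRY))  # only the decorated function is listed
```

The undecorated function still works as ordinary Python, but a framework that relies on the decorator has no record of it, which is why a missing decorator goes unnoticed until the pipeline is assembled.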
Example
This example shows a simple ZenML pipeline with two steps: one generates data, and the other processes it. It demonstrates how to define steps, create a pipeline, and run it.
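```python
from zenml import pipeline, step


@step
def generate_numbers() -> list[int]:
    return [1, 2, 3, 4, 5]


@step
def sum_numbers(numbers: list[int]) -> int:
    total = sum(numbers)
    print(f"Sum is: {total}")  # Report the result while the step runs
    return total


@pipeline
def sum_pipeline():
    # Passing one step's output to another defines the execution order.
    numbers = generate_numbers()
    sum_numbers(numbers)


if __name__ == "__main__":
    sum_pipeline()  # Calling the decorated function runs the pipeline
```
Output (ZenML's own log lines omitted)
Sum is: 15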
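Because each step is an ordinary Python function before it is decorated, the data flow in this example can be checked without ZenML. The sketch below mirrors the two functions with the decorators left off; it illustrates the logic only and says nothing about how ZenML executes the steps or stores their outputs.

```python
def generate_numbers() -> list[int]:
    # Same body as the step above, without the @step decorator.
    return [1, 2, 3, 4, 5]


def sum_numbers(numbers: list[int]) -> int:
    # Consumes the list produced by generate_numbers.
    return sum(numbers)


numbers = generate_numbers()
total = sum_numbers(numbers)
print(f"Sum is: {total}")
```

Checking the logic this way first keeps plain bugs in the functions separate from problems in the pipeline setup.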
Common Pitfalls
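Common mistakes when using ZenML include:
- Not decorating functions with @step, so they won't be recognized as pipeline steps.
- Forgetting to decorate the pipeline function with @pipeline.
- Printing a result inside a step without returning it, so ZenML has nothing to track or pass to the next step.
- Defining a pipeline without ever invoking it. Current releases run a pipeline when the decorated function is called; older releases required .run() on a pipeline instance.
Always return data from steps. Inside the @pipeline function, step calls yield references to outputs rather than the values themselves, so print actual values inside a step or after loading them from a finished run.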
```python
from zenml import pipeline, step

# Wrong: missing @step decorator, so this is only a plain function
# def generate_data():
#     return [1, 2, 3]


@step
def generate_data() -> list[int]:
    data = [1, 2, 3]
    print(data)  # Printed while the step executes
    return data  # Returned so downstream steps can use it


# Wrong: missing @pipeline decorator
# def my_pipeline():
#     generate_data()


@pipeline
def my_pipeline():
    generate_data()


if __name__ == "__main__":
    my_pipeline()  # Calling the decorated function runs the pipeline
```
Output (ZenML's own log lines omitted)
[1, 2, 3]
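The advice to return data from steps follows from ordinary Python behavior: a function that only prints returns None, so nothing can be handed to a later function. Here is a minimal plain-Python sketch, independent of ZenML, with hypothetical function names chosen for illustration.

```python
def generate_data_print_only() -> None:
    # The list is shown on screen and then discarded.
    print([1, 2, 3])


def generate_data() -> list[int]:
    # The list is handed back to the caller.
    return [1, 2, 3]


def double_all(values: list[int]) -> list[int]:
    return [2 * v for v in values]


lost = generate_data_print_only()
kept = generate_data()

print(lost)              # None: nothing to pass downstream
print(double_all(kept))  # [2, 4, 6]
```

The same reasoning applies to steps: a downstream step can only consume what an upstream step returns.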
Quick Reference
| Concept | Description | Example |
|---|---|---|
| @step | Marks a function as a pipeline step | @step\ndef step_func(): pass |
| @pipeline | Defines a pipeline connecting steps | @pipeline\ndef my_pipeline(): pass |
| Running a pipeline | Call the decorated pipeline function (older releases: .run() on a pipeline instance) | my_pipeline() |
| Return values | Steps should return data, not only print it | return data |
| Step inputs | Steps can accept inputs from other steps | def step_func(input_data): |
Key Takeaways
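- Install ZenML with pip install zenml and use @step to define pipeline steps.
- Group steps with @pipeline and run the pipeline by calling the decorated function (older releases used pipeline.run()).
- Always return data from steps; a value that is only printed cannot be passed to other steps.
- Decorators @step and @pipeline are required for ZenML to recognize functions.
- A pipeline definition does nothing until it is invoked, so remember to call it to execute your ML workflow.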