Model Pipeline - State graphs and transitions
This pipeline shows how an agent uses a state graph to decide actions. The agent moves between states based on transitions, learning which paths lead to success.
Loss
1.0 |**********
0.8 |********
0.6 |******
0.4 |****
0.2 |**
0.0 |*
1 2 3 4 5 Epochs
| Epoch | Loss ↓ | Accuracy ↑ | Observation |
|---|---|---|---|
| 1 | 0.9 | 0.2 | Agent starts with random transitions, low accuracy |
| 2 | 0.7 | 0.4 | Agent learns better transitions, accuracy improves |
| 3 | 0.5 | 0.6 | Agent refines policy, loss decreases steadily |
| 4 | 0.3 | 0.8 | Agent approaches optimal transitions |
| 5 | 0.15 | 0.95 | Agent achieves high accuracy, low loss |
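The source does not say which learning method produced the numbers in this table. As one illustration of an agent "learning which paths lead to success", the sketch below applies a tabular value update (Q-value iteration) to a small hypothetical graph. The extra `c` actions, the goal state `S3`, the reward of 1 for reaching it, and the discount `GAMMA = 0.9` are all assumptions made for the example.

```python
# Hypothetical graph: S3 is the goal; action 'c' sends the agent back to S1.
GRAPH = {
    "S1": {"a": "S2", "c": "S1"},
    "S2": {"b": "S3", "c": "S1"},
}
GOAL = "S3"
GAMMA = 0.9  # assumed discount factor


def learn_policy(epochs: int = 5) -> dict:
    # q[state][action] estimates how useful that action is in that state.
    q = {s: {a: 0.0 for a in acts} for s, acts in GRAPH.items()}
    for _ in range(epochs):
        for state, acts in GRAPH.items():
            for action, nxt in acts.items():
                if nxt == GOAL:
                    q[state][action] = 1.0  # assumed reward for reaching the goal
                else:
                    q[state][action] = GAMMA * max(q[nxt].values())
    # Greedy policy: pick the best-valued action in each state.
    return {s: max(acts, key=acts.get) for s, acts in q.items()}


print(learn_policy())  # {'S1': 'a', 'S2': 'b'}
```

After a couple of sweeps the values settle, and the greedy policy follows the path `S1 --a--> S2 --b--> S3` rather than looping back through `c`.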
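A transition is written as `State1 --action--> State2`: an arrow from the source state to the target state, labelled with the action that triggers it. `S1 --a--> S2` matches this standard form.

The transitions `S1 --a--> S2` and `S2 --b--> S3` can be stored as a nested dictionary and followed with a loop:

```python
# Map each state to {action: next_state}.
transitions = {'S1': {'a': 'S2'}, 'S2': {'b': 'S3'}}
current_state = 'S1'
actions = ['a', 'c']
for act in actions:
    # Look up the next state for the current (state, action) pair.
    current_state = transitions[current_state][act]
```

The first action moves the agent from `S1` to `S2`. The second action, `'c'`, is not defined for `S2`, so the lookup raises a `KeyError`.

A complete cycle through the three states is `S1 --a--> S2`, `S2 --b--> S3`, and `S3 --c--> S1`.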
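A direct lookup such as `transitions[current_state][act]` fails when an action is not defined for the current state. The following is a minimal sketch of a more tolerant walk over the three-state cycle, assuming the agent should simply stop at the first undefined action. The names `TRANSITIONS` and `run` are hypothetical and chosen only for this example.

```python
from typing import Dict, List

# The three transitions named above, forming a cycle.
TRANSITIONS: Dict[str, Dict[str, str]] = {
    "S1": {"a": "S2"},
    "S2": {"b": "S3"},
    "S3": {"c": "S1"},
}


def run(start: str, actions: List[str]) -> List[str]:
    """Follow actions from start and return the list of visited states."""
    path = [start]
    state = start
    for act in actions:
        nxt = TRANSITIONS.get(state, {}).get(act)
        if nxt is None:
            break  # undefined (state, action) pair: stop instead of raising
        state = nxt
        path.append(state)
    return path


print(run("S1", ["a", "b", "c"]))  # ['S1', 'S2', 'S3', 'S1']
print(run("S1", ["a", "c"]))       # ['S1', 'S2']
```

Stopping quietly is only one possible design. Raising a descriptive error may be preferable when an undefined action signals a bug in the caller.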