# Chapter 8.2 Dyna: Integrated Planning, Acting, and Learning

To demonstrate the flexibility of ReinforcementLearning.jl, a `DynaAgent` is included out of the box; in this notebook we'll explore its performance on the maze examples from the book.
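Before diving in, it helps to recall what the core Dyna-Q loop does. The following is a minimal tabular sketch in plain Julia under illustrative names — not the actual `DynaAgent` implementation:

```julia
# Minimal tabular Dyna-Q sketch (illustrative; not the DynaAgent API).
# After each real transition: one direct Q-learning update, record the
# transition in a table model, then replay n simulated updates from it.
function dyna_q!(Q, model, s, a, r, s′; α=0.1, γ=0.95, n=10)
    # Direct RL: one-step Q-learning on the real transition
    Q[s, a] += α * (r + γ * maximum(Q[s′, :]) - Q[s, a])
    # Model learning: a deterministic table of observed transitions
    model[(s, a)] = (r, s′)
    # Planning: n updates on transitions sampled from the model
    for _ in 1:n
        (ps, pa), (pr, ps′) = rand(collect(model))
        Q[ps, pa] += α * (pr + γ * maximum(Q[ps′, :]) - Q[ps, pa])
    end
    return Q
end
```

The planning loop is what lets Dyna squeeze more value out of each real interaction; the figures below vary `n` to show exactly that.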


## The Maze Environment

In this chapter, the authors introduce a specific maze environment. Let's define it by implementing the interfaces from ReinforcementLearning.jl.
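As a mental model for the dynamics behind the `MazeEnv` defined below (which implements the RLBase interface), here is a plain-struct sketch with illustrative names — a 6×9 grid gives the 54 states you'll see in the environment summary, with 4 movement actions:

```julia
# Plain-Julia sketch of the maze dynamics (illustrative; the notebook's
# MazeEnv implements the RLBase interface instead). A 6×9 grid gives the
# 54 states; the 4 actions are up, down, left and right.
mutable struct Maze
    walls::Set{CartesianIndex{2}}
    position::CartesianIndex{2}
    goal::CartesianIndex{2}
end

const MOVES = (CartesianIndex(-1, 0), CartesianIndex(1, 0),
               CartesianIndex(0, -1), CartesianIndex(0, 1))

function act!(m::Maze, action::Int; nrows=6, ncols=9)
    p = m.position + MOVES[action]
    # bumping into a wall or the boundary leaves the position unchanged
    if 1 <= p[1] <= nrows && 1 <= p[2] <= ncols && p ∉ m.walls
        m.position = p
    end
    # reward 1 only on reaching the goal; the episode then terminates
    return m.position == m.goal ? 1.0 : 0.0, m.position == m.goal
end
```

A linear state index for a Q-table can then be recovered with `LinearIndices((6, 9))[m.position]`.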

# MazeEnv

## Traits

| Trait Type        |                                            Value |
|:----------------- | ------------------------------------------------:|
| NumAgentStyle     |          ReinforcementLearningBase.SingleAgent() |
| DynamicStyle      |           ReinforcementLearningBase.Sequential() |
| InformationStyle  | ReinforcementLearningBase.ImperfectInformation() |
| ChanceStyle       |           ReinforcementLearningBase.Stochastic() |
| RewardStyle       |           ReinforcementLearningBase.StepReward() |
| UtilityStyle      |           ReinforcementLearningBase.GeneralSum() |
| ActionStyle       |     ReinforcementLearningBase.MinimalActionSet() |
| StateStyle        |     ReinforcementLearningBase.Observation{Any}() |
| DefaultStateStyle |     ReinforcementLearningBase.Observation{Any}() |

## Is Environment Terminated?

No

## State Space

Base.OneTo(54)

## Action Space

Base.OneTo(4)

## Current State


3


## Figure 8.2

plan_step (generic function with 1 method)

## Figure 8.4

cumulative_dyna_reward (generic function with 1 method)
walls (generic function with 1 method)
change_walls (generic function with 1 method)
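Figure 8.4 uses the book's blocking maze: after a fixed number of steps the wall layout changes and the agent must adapt. The experimental protocol can be sketched generically, with the agent step and the wall switch passed in as closures (all names illustrative):

```julia
# Sketch of the Figure 8.4 protocol (illustrative names): run for nsteps,
# swap the wall layout at `switch_at`, and track cumulative reward so the
# slope before/after the switch shows how quickly the agent re-plans.
function cumulative_reward_curve(act!, switch_walls!, nsteps, switch_at)
    history = Float64[]
    total = 0.0
    for t in 1:nsteps
        t == switch_at && switch_walls!()   # the environment changes mid-run
        total += act!()                     # one agent step; returns the reward
        push!(history, total)
    end
    return history
end
```

A flat stretch in the resulting curve marks the period after the switch during which the agent's model is stale.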

## Figure 8.5

new_walls (generic function with 1 method)
new_change_walls (generic function with 1 method)
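In the book, Figure 8.5 is the shortcut maze, where plain Dyna-Q fails to notice a newly opened shortcut; the Dyna-Q+ variant fixes this by adding an exploration bonus of κ·√τ to the modeled reward during planning, where τ is the number of steps since the state–action pair was last tried for real. A sketch of that planning update, under illustrative names:

```julia
# Dyna-Q+ planning update sketch (illustrative names): the bonus κ·sqrt(τ)
# makes long-untried state–action pairs look promising, so the agent keeps
# probing for changes such as the opened shortcut in Figure 8.5.
function plus_planning_update!(Q, model, last_tried, t; α=0.1, γ=0.95, κ=1e-3)
    (s, a), (r, s′) = rand(collect(model))
    τ = t - last_tried[(s, a)]          # steps since (s, a) last tried for real
    Q[s, a] += α * (r + κ * sqrt(τ) + γ * maximum(Q[s′, :]) - Q[s, a])
    return Q
end
```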

## Example 8.4

run_once (generic function with 2 methods)
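Example 8.4 in the book uses prioritized sweeping: instead of replaying transitions uniformly during planning, state–action pairs are popped in order of Bellman-error magnitude, and predecessors of each updated state are pushed back whenever their own error exceeds a threshold θ. A self-contained sketch, with a `Dict` standing in for a real priority queue and all names illustrative:

```julia
# Prioritized sweeping sketch (illustrative; a Dict stands in for a
# priority queue). Pop the (s, a) pair with the largest Bellman error,
# update it, then push predecessors whose own error now exceeds θ.
function sweep!(Q, model, predecessors, pqueue; α=0.5, γ=0.95, θ=1e-4, n=10)
    for _ in 1:n
        isempty(pqueue) && break
        s, a = argmax(pqueue)                # key with the highest priority
        delete!(pqueue, (s, a))
        r, s′ = model[(s, a)]
        Q[s, a] += α * (r + γ * maximum(Q[s′, :]) - Q[s, a])
        # predecessors of s may now have a large error themselves
        for (ps, pa) in get(predecessors, s, Tuple{Int,Int}[])
            pr, _ = model[(ps, pa)]
            p = abs(pr + γ * maximum(Q[s, :]) - Q[ps, pa])
            p > θ && (pqueue[(ps, pa)] = p)
        end
    end
    return Q
end
```

Working backwards from states whose values just changed is what makes prioritized sweeping far more sample-efficient than uniform replay on large mazes.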