# Chapter 10 Mountain Car


The `MountainCarEnv` is already provided in ReinforcementLearning.jl, so we can use it directly here. Note that by default this environment terminates after a maximum of 200 steps, while the example in the book has no such restriction.
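If the step limit gets in the way, it can be raised when constructing the environment. A minimal sketch, assuming the constructor exposes the limit as a `max_steps` keyword:

```julia
using ReinforcementLearning

# Assumption: `max_steps` is the keyword controlling the episode step limit.
# Raising it approximates the unrestricted setting used in the book.
env = MountainCarEnv(; max_steps = 10_000)
```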

Evaluating `env` prints a summary of its traits, state space, and action space:
# MountainCarEnv

## Traits

| Trait Type        |                                            Value |
|:----------------- | ------------------------------------------------:|
| NumAgentStyle     |          ReinforcementLearningBase.SingleAgent() |
| DynamicStyle      |           ReinforcementLearningBase.Sequential() |
| InformationStyle  | ReinforcementLearningBase.ImperfectInformation() |
| ChanceStyle       |           ReinforcementLearningBase.Stochastic() |
| RewardStyle       |           ReinforcementLearningBase.StepReward() |
| UtilityStyle      |           ReinforcementLearningBase.GeneralSum() |
| ActionStyle       |     ReinforcementLearningBase.MinimalActionSet() |
| StateStyle        |     ReinforcementLearningBase.Observation{Any}() |
| DefaultStateStyle |     ReinforcementLearningBase.Observation{Any}() |

## Is Environment Terminated?

No

## State Space

`ReinforcementLearningBase.Space{Array{IntervalSets.Interval{:closed,:closed,Float64},1}}(IntervalSets.Interval{:closed,:closed,Float64}[-1.2..0.6, -0.07..0.07])`

## Action Space

`Base.OneTo(3)`

## Current State

```
[-0.5834095103051787, 0.0]
```

First, let's define a `Tiling` structure to encode the state.
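A minimal sketch of such a tile coding (not necessarily the exact code used here): a `Tiling` holds one range per state dimension, and `encode` maps a state to the linear index of the tile it falls into.

```julia
# Sketch of tile coding for the two-dimensional Mountain Car state (position, velocity).
struct Tiling{N,Tr<:AbstractRange}
    ranges::NTuple{N,Tr}
    inds::LinearIndices{N,NTuple{N,Base.OneTo{Int}}}
end

Tiling(ranges::AbstractRange...) =
    Tiling(ranges, LinearIndices(Tuple(length(r) - 1 for r in ranges)))

Base.length(t::Tiling) = length(t.inds)

# Bin index of a scalar `x` within `range`.
encode(range::AbstractRange, x) = floor(Int, (x - first(range)) / step(range)) + 1

# Linear tile index of a state `xs`, e.g. a `[position, velocity]` pair.
encode(t::Tiling, xs) = t.inds[encode.(t.ranges, xs)...]
```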

encode (generic function with 2 methods)

The remaining parts are straightforward: we initialize the agent and the environment, then roll out the experiments (sketched below):
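The following is a hedged sketch of that idea: tile-coded features over position and velocity, a linear action-value function, and episodic semi-gradient Sarsa. The names `create_env_agent`, `features`, and `run_episode!`, the tiling layout, and the step size are illustrative assumptions rather than the notebook's exact implementation; the generic environment calls (`reset!`, `state`, `reward`, `is_terminated`, stepping via `env(action)`) follow the ReinforcementLearning.jl interface.

```julia
using ReinforcementLearning

# Sketch: 8 offset tilings of 8×8 tiles over position ∈ [-1.2, 0.6], velocity ∈ [-0.07, 0.07].
function create_env_agent(; ntilings = 8, ntiles = 8, α = 0.5 / 8)
    env = MountainCarEnv(; max_steps = 10_000)            # assumption: `max_steps` keyword
    tilings = [
        Tiling(
            range(-1.2 - (i - 1) * 1.8 / ntiles / ntilings, step = 1.8 / ntiles, length = ntiles + 2),
            range(-0.07 - (i - 1) * 0.14 / ntiles / ntilings, step = 0.14 / ntiles, length = ntiles + 2),
        ) for i in 1:ntilings
    ]
    w = zeros(sum(length, tilings), length(action_space(env)))  # one weight column per action
    env, tilings, w, α
end

# Active tile indices (one per tiling) into the stacked weight matrix.
function features(tilings, s)
    idx, offset = Int[], 0
    for t in tilings
        push!(idx, offset + encode(t, s))
        offset += length(t)
    end
    idx
end

q(w, ϕ, a) = sum(@view w[ϕ, a])

# One episode of episodic semi-gradient Sarsa; returns the number of steps taken.
function run_episode!(env, tilings, w, α; γ = 1.0)
    reset!(env)
    ϕ = features(tilings, state(env))
    a = argmax([q(w, ϕ, b) for b in action_space(env)])   # greedy; zero init is optimistic here
    steps = 0
    while !is_terminated(env)
        env(a)                                            # apply the action
        r = reward(env)
        ϕ′ = features(tilings, state(env))
        a′ = argmax([q(w, ϕ′, b) for b in action_space(env)])
        target = is_terminated(env) ? r : r + γ * q(w, ϕ′, a′)
        w[ϕ, a] .+= α * (target - q(w, ϕ, a))             # semi-gradient Sarsa update
        ϕ, a = ϕ′, a′
        steps += 1
    end
    steps
end
```

Training then amounts to repeating `run_episode!`, e.g. `env, tilings, w, α = create_env_agent(); steps = [run_episode!(env, tilings, w, α) for _ in 1:500]`; the per-episode step counts should drop quickly, in line with the behaviour shown in the book.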

create_env_agent (generic function with 3 methods)
```
X = -1.2:0.046153846153846156:0.6
Y = -0.07:0.0035897435897435897:0.07
```
show_approximation (generic function with 1 method)
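The `X` and `Y` grids above span the position and velocity ranges; `show_approximation` presumably renders the learned cost-to-go surface over them, as in Figure 10.1 of the book. A sketch built on the hypothetical `features` and `q` helpers from the previous block, assuming Plots.jl is available:

```julia
using Plots

# Sketch: plot the cost-to-go estimate -max_a q̂(s, a) over the state grid.
function show_approximation(tilings, w, X, Y)
    cost_to_go(x, y) = -maximum(q(w, features(tilings, (x, y)), b) for b in 1:size(w, 2))
    surface(X, Y, cost_to_go; xlabel = "Position", ylabel = "Velocity")
end
```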
```
n = 10
```