Again, we'll describe the car rental problem with a distributional model.
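
As a rough illustration of what "distributional" means here, the sketch below enumerates every `(reward, next_state, probability)` outcome for one state-action pair. It is not the notebook's own code: the constants follow the usual textbook statement of the car rental problem (20 cars per location, up to 5 moved overnight, Poisson requests and returns), and `transitions`, `poisson_pmf`, and `TRUNCATE` are hypothetical names introduced for this example. Poisson outcomes above `TRUNCATE` are ignored, so the probabilities sum to slightly less than 1.

```julia
# Assumed problem constants (textbook values, not taken from this notebook).
const MAX_CARS = 20            # capacity of each location
const MAX_MOVE = 5             # cars that may be moved overnight
const RENT_REWARD = 10.0       # reward per car rented
const MOVE_COST = 2.0          # cost per car moved
const LAMBDA_REQUEST = (3, 4)  # expected rental requests at locations 1 and 2
const LAMBDA_RETURN = (3, 2)   # expected returns at locations 1 and 2
const TRUNCATE = 10            # hypothetical cutoff for Poisson outcomes

# Probability of observing k events under a Poisson distribution with mean lambda.
poisson_pmf(k, lambda) = exp(-lambda) * lambda^k / factorial(k)

# Enumerate (reward, next_state, probability) for state s = (n1, n2) and
# action a (cars moved from location 1 to 2; negative moves the other way).
# The caller is assumed to pass a feasible action (-n2 <= a <= n1).
function transitions(s, a)
    n1, n2 = s
    # Cars available in the morning after the overnight move, capped at capacity.
    m1 = min(n1 - a, MAX_CARS)
    m2 = min(n2 + a, MAX_CARS)
    outcomes = Dict{Tuple{Float64,Tuple{Int,Int}},Float64}()
    for req1 in 0:TRUNCATE, req2 in 0:TRUNCATE, ret1 in 0:TRUNCATE, ret2 in 0:TRUNCATE
        p = poisson_pmf(req1, LAMBDA_REQUEST[1]) * poisson_pmf(req2, LAMBDA_REQUEST[2]) *
            poisson_pmf(ret1, LAMBDA_RETURN[1]) * poisson_pmf(ret2, LAMBDA_RETURN[2])
        # Only cars that are actually on the lot can be rented.
        rented1 = min(req1, m1)
        rented2 = min(req2, m2)
        r = RENT_REWARD * (rented1 + rented2) - MOVE_COST * abs(a)
        next_state = (min(m1 - rented1 + ret1, MAX_CARS),
                      min(m2 - rented2 + ret2, MAX_CARS))
        # Merge outcomes that lead to the same (reward, next_state) pair.
        key = (r, next_state)
        outcomes[key] = get(outcomes, key, 0.0) + p
    end
    return [(r, ns, p) for ((r, ns), p) in outcomes]
end
```

With these assumed constants the action range `-MAX_MOVE:MAX_MOVE` contains 11 actions. Calling `transitions((10, 10), 0)` returns the full outcome list for a single state-action pair, which is the quantity a dynamic-programming backup sums over.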

p
TabularPolicy
├─ table => Dict
└─ n_action => 11