2 files changed, +3 -3 lines changed
@@ -299,7 +299,7 @@ An example walkthrough of a Sarsa update with $\alpha = 0.3$ and $\gamma = 0.9$
The previous description had assumed that RL operators were selected in both
decision cycles $t$ and $t+1$. If the operator selected in $t+1$ is not an RL operator,
- then $Q(s_{t+1},a_{t+1})$ would not be defined, and an update for the RL operator
+ then $Q(s_{t+1}, a_{t+1})$ would not be defined, and an update for the RL operator
selected at time $t$ will be undefined. We will call a sequence of one or more
decision cycles in which RL operators are not selected between two decision
cycles in which RL operators are selected a gap. Conceptually, it is desirable
@@ -332,7 +332,7 @@ RL operator.
Gap propagation can be disabled by setting the **temporal-extension** parameter
of the [`rl` command](../reference/cli/cmd_rl.md) to off. When gap propagation
- is disabled, the RL rules preceding a gap are updated using $Q(s_{t+1},a_{t+1})
+ is disabled, the RL rules preceding a gap are updated using $Q(s_{t+1}, a_{t+1})
= 0$. The rl setting of the [`watch`](../reference/cli/cmd_trace.md) command is
useful in identifying gaps.
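
Note on the hunk above: the changed line only adjusts spacing inside the formula, but the behavior it describes may be easier to follow with numbers. The sketch below is an illustration and not Soar's implementation. It assumes the standard one-step Sarsa update and uses the $\alpha = 0.3$ and $\gamma = 0.9$ from the walkthrough named in the first hunk header. The function name `sarsa_update`, the reward, and the Q-values are all hypothetical.

```python
def sarsa_update(q_current, reward, q_next, alpha=0.3, gamma=0.9):
    """Return the new Q(s_t, a_t) after one standard Sarsa step.

    Assumed update rule (textbook Sarsa, not taken from Soar's source):
        Q(s_t, a_t) += alpha * (reward + gamma * Q(s_t+1, a_t+1) - Q(s_t, a_t))
    """
    td_error = reward + gamma * q_next - q_current
    return q_current + alpha * td_error


# Hypothetical values chosen only for illustration.
q_t = 2.0      # current Q(s_t, a_t)
reward = 1.0   # reward observed after acting at time t

# Case 1: an RL operator is also selected at t+1, so Q(s_t+1, a_t+1) is defined.
with_next = sarsa_update(q_t, reward, q_next=3.0)

# Case 2: a gap follows and gap propagation is disabled, so the text says
# the update uses Q(s_t+1, a_t+1) = 0.
gap_disabled = sarsa_update(q_t, reward, q_next=0.0)

print(f"next RL operator present: {with_next:.2f}")
print(f"gap, propagation off:     {gap_disabled:.2f}")
```

With these made-up numbers, substituting zero for the next Q-value pulls the estimate down instead of up, which is why disabling gap propagation can change what the RL rules before a gap learn.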
@@ -221,7 +221,7 @@ the *Agents* directory.
### RL Rules

- Rules that are recognized as updateable by the RL mechanism must abide
+ Rules that are recognized as updatable by the RL mechanism must abide
by a specific syntax:

``` Soar