This repository was archived by the owner on Jul 7, 2023. It is now read-only.

Commit d435ee8

Błażej O committed
rl/README extended.
1 parent 73e540d commit d435ee8

1 file changed: +44 -11 lines changed

tensor2tensor/rl/README.md

Lines changed: 44 additions & 11 deletions
@@ -7,14 +7,47 @@ for now and under heavy development.
 
 Currently the only supported algorithm is Proximal Policy Optimization - PPO.
 
-## Sample usage - training in the Pendulum-v0 environment.
-
-```python rl/t2t_rl_trainer.py --problems=Pendulum-v0 --hparams_set continuous_action_base [--output_dir dir_location]```
-
-## Sample usage - training in the PongNoFrameskip-v0 environment.
-
-```python tensor2tensor/rl/t2t_rl_trainer.py --problem stacked_pong --hparams_set atari_base --hparams num_agents=5 [--output_dir dir_location]```
-
-## Sample usage - generation of trajectories data
-
-```python tensor2tensor/bin/t2t-datagen --data_dir=~/t2t_data --tmp_dir=~/t2t_data/tmp --problem=gym_pong_trajectories_from_policy --model_path [model]```
+# Sample usages
+
+## Training agent in the Pendulum-v0 environment.
+
+```
+python rl/t2t_rl_trainer.py \
+--problems=Pendulum-v0 \
+--hparams_set continuous_action_base \
+[--output_dir dir_location]
+```
+
+## Training agent in the PongNoFrameskip-v0 environment.
+
+```
+python tensor2tensor/rl/t2t_rl_trainer.py \
+--problem stacked_pong \
+--hparams_set atari_base \
+--hparams num_agents=5 \
+[--output_dir dir_location]
+```
+
+## Generation of trajectories data
+
+```
+python tensor2tensor/bin/t2t-datagen \
+--data_dir=~/t2t_data \
+--tmp_dir=~/t2t_data/tmp \
+--problem=gym_pong_trajectories_from_policy \
+--model_path [model]
+```
+
+## Training model for frames generation based on randomly played games
+
+```
+python tensor2tensor/bin/t2t-trainer \
+--generate_data \
+--data_dir=~/t2t_data \
+--output_dir=~/t2t_data/output \
+--problems=gym_pong_random5k \
+--model=basic_conv_gen \
+--hparams_set=basic_conv_small \
+--train_steps=1000 \
+--eval_steps=10
+```
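
Note on the bracket notation used in the sample commands: `[--output_dir dir_location]` appears to mark an optional argument, not literal shell syntax, so the brackets should be dropped when the flag is actually passed. A minimal sketch under that assumption follows. `build_train_cmd` is a hypothetical helper, not part of tensor2tensor; it only prints the Pendulum-v0 command from the README and does not run the trainer.

```shell
#!/bin/sh
# Hypothetical helper (not part of tensor2tensor): print the Pendulum-v0
# training command from the README, appending --output_dir only when a
# directory is passed as the first argument.
build_train_cmd() {
  out_dir="${1:-}"
  cmd="python rl/t2t_rl_trainer.py --problems=Pendulum-v0 --hparams_set continuous_action_base"
  if [ -n "$out_dir" ]; then
    # The README's brackets are read here as "optional"; they are not emitted.
    cmd="$cmd --output_dir $out_dir"
  fi
  printf '%s\n' "$cmd"
}

# Dry run: show both forms of the command without executing them.
build_train_cmd
build_train_cmd /tmp/t2t_rl
```

Printing the command instead of executing it keeps the sketch independent of a tensor2tensor installation; the output can be inspected before it is run by hand.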
