
Commit dbdc190

v0.0.2 release
1 parent 1e568be commit dbdc190

File tree

23 files changed: +47 −34 lines changed


README.md

Lines changed: 26 additions & 8 deletions
@@ -4,34 +4,36 @@
 <br>
 <p>
 
-[![pypi](https://img.shields.io/badge/pypi%20package-v0.0.1-blue)](https://pypi.org/project/genrl/)
+[![pypi](https://img.shields.io/badge/pypi%20package-v0.0.2-blue)](https://pypi.org/project/genrl/)
 [![GitHub license](https://img.shields.io/github/license/SforAiDl/genrl)](https://github.com/SforAiDl/genrl/blob/master/LICENSE)
 [![Build Status](https://travis-ci.com/SforAiDl/genrl.svg?branch=master)](https://travis-ci.com/SforAiDl/genrl)
 [![Total alerts](https://img.shields.io/lgtm/alerts/g/SforAiDl/genrl.svg?logo=lgtm&logoWidth=18)](https://lgtm.com/projects/g/SforAiDl/genrl/alerts/)
 [![Language grade: Python](https://img.shields.io/lgtm/grade/python/g/SforAiDl/genrl.svg?logo=lgtm&logoWidth=18)](https://lgtm.com/projects/g/SforAiDl/genrl/context:python)
 [![codecov](https://codecov.io/gh/SforAiDl/genrl/branch/master/graph/badge.svg)](https://codecov.io/gh/SforAiDl/genrl)
 [![Documentation Status](https://readthedocs.org/projects/genrl/badge/?version=latest)](https://genrl.readthedocs.io/en/latest/?badge=latest)
 [![Maintainability](https://api.codeclimate.com/v1/badges/c3f6e7d31c078528e0e1/maintainability)](https://codeclimate.com/github/SforAiDl/genrl/maintainability)
 ![Lint, Test, Code Coverage](https://github.com/SforAiDl/genrl/workflows/Lint,%20Test,%20Code%20Coverage/badge.svg)
+[![Slack - Chat](https://img.shields.io/badge/Slack-Chat-blueviolet)](https://join.slack.com/t/genrlworkspace/shared_invite/zt-gwlgnymd-Pw3TYC~0XDLy6VQDml22zg)
 
 ---
 
 [![](https://sourcerer.io/fame/Sharad24/Sharad24/genrl/images/0)](https://sourcerer.io/fame/Sharad24/Sharad24/genrl/links/0)[![](https://sourcerer.io/fame/Sharad24/Sharad24/genrl/images/1)](https://sourcerer.io/fame/Sharad24/Sharad24/genrl/links/1)[![](https://sourcerer.io/fame/Sharad24/Sharad24/genrl/images/2)](https://sourcerer.io/fame/Sharad24/Sharad24/genrl/links/2)[![](https://sourcerer.io/fame/Sharad24/Sharad24/genrl/images/3)](https://sourcerer.io/fame/Sharad24/Sharad24/genrl/links/3)[![](https://sourcerer.io/fame/Sharad24/Sharad24/genrl/images/4)](https://sourcerer.io/fame/Sharad24/Sharad24/genrl/links/4)[![](https://sourcerer.io/fame/Sharad24/Sharad24/genrl/images/5)](https://sourcerer.io/fame/Sharad24/Sharad24/genrl/links/5)[![](https://sourcerer.io/fame/Sharad24/Sharad24/genrl/images/6)](https://sourcerer.io/fame/Sharad24/Sharad24/genrl/links/6)[![](https://sourcerer.io/fame/Sharad24/Sharad24/genrl/images/7)](https://sourcerer.io/fame/Sharad24/Sharad24/genrl/links/7)
 
 ---
 
-**GenRL is a PyTorch reinforcement learning library centered around reproducible and generalizable algorithm implementations.**
+**GenRL is a PyTorch reinforcement learning library centered around reproducible, generalizable algorithm implementations and improving accessibility in Reinforcement Learning.**
 
 Reinforcement learning research is moving faster than ever before. In order to keep up with the growing trend and ensure that RL research remains reproducible, GenRL aims to aid faster paper reproduction and benchmarking by providing the following main features:
 
 - **PyTorch-first**: Modular, Extensible and Idiomatic Python
+- **Tutorials and Examples**: 20+ tutorials, from basic RL to SOTA deep RL algorithms (with explanations)!
 - **Unified Trainer and Logging class**: code reusability and high-level UI
 - **Ready-made algorithm implementations**: ready-made implementations of popular RL algorithms.
 - **Faster Benchmarking**: automated hyperparameter tuning, environment implementations etc.
 
 By integrating these features into GenRL, we aim to eventually support **any new algorithm implementation in less than 100 lines**.
 
-**If you're interested in contributing, feel free to go through the issues and open PRs for code, docs, tests etc. In case of any questions, please check out the [Contributing Guidelines](https://github.com/SforAiDl/genrl/wiki/Contributing-Guidelines)**
+**If you're interested in contributing, feel free to go through the issues and open PRs for code, docs, tests etc. In case of any questions, please check out the [Contributing Guidelines](CONTRIBUTING.md)**
 
 
 ## Installation

@@ -55,10 +57,9 @@ To train a Soft Actor-Critic model from scratch on the `Pendulum-v0` gym environment
 ```python
 import gym
 
-from genrl import SAC, QLearning
-from genrl.classical.common import Trainer
-from genrl.deep.common import OffPolicyTrainer
+from genrl.agents import SAC, QLearning
 from genrl.environments import VectorEnv
+from genrl.trainers import ClassicalTrainer, OffPolicyTrainer
 
 env = VectorEnv("Pendulum-v0")
 agent = SAC('mlp', env)

@@ -69,13 +70,30 @@ trainer.train()
 To train a Tabular Dyna-Q model from scratch on the `FrozenLake-v0` gym environment and plot rewards:
 ```python
 
+
 env = gym.make("FrozenLake-v0")
 agent = QLearning(env)
-trainer = Trainer(agent, env, mode="dyna", model="tabular", n_episodes=10000)
+trainer = ClassicalTrainer(agent, env, mode="dyna", model="tabular", n_episodes=10000)
 episode_rewards = trainer.train()
 trainer.plot(episode_rewards)
 ```
 
+## Tutorials
+- [Multi Armed Bandits](https://genrl.readthedocs.io/en/latest/usage/tutorials/bandit/bandit_overview.html)
+- [Upper Confidence Bound](https://genrl.readthedocs.io/en/latest/usage/tutorials/bandit/ucb.html)
+- [Thompson Sampling](https://genrl.readthedocs.io/en/latest/usage/tutorials/bandit/thompson_sampling.html)
+- [Bayesian](https://genrl.readthedocs.io/en/latest/usage/tutorials/bandit/bayesian.html)
+- [Softmax Action Selection](https://genrl.readthedocs.io/en/latest/usage/tutorials/bandit/gradients.html)
+- [Contextual Bandits](https://genrl.readthedocs.io/en/latest/usage/tutorials/bandit/contextual_overview.html)
+- [Linear Posterior Inference](https://genrl.readthedocs.io/en/latest/usage/tutorials/bandit/linpos.html)
+- [Variational Inference](https://genrl.readthedocs.io/en/latest/usage/tutorials/bandit/variational.html)
+- [Bootstrap](https://genrl.readthedocs.io/en/latest/usage/tutorials/bandit/bootstrap.html)
+- [Parameter Noise Sampling](https://genrl.readthedocs.io/en/latest/usage/tutorials/bandit/noise.html)
+- [Deep Reinforcement Learning Background](https://genrl.readthedocs.io/en/latest/usage/tutorials/Deep/Background.html)
+- [Vanilla Policy Gradients](https://genrl.readthedocs.io/en/latest/usage/tutorials/Deep/VPG.html)
+- [Advantage Actor Critic](https://genrl.readthedocs.io/en/latest/usage/tutorials/Deep/A2C.html)
+- [Proximal Policy Optimization](https://genrl.readthedocs.io/en/latest/usage/tutorials/Deep/PPO.html)
+
 ## Algorithms
 
 ### Deep RL
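A Markdown link puts its text first and its URL second, `[text](url)`; in a link-heavy list like the tutorials section this is easy to get backwards (the Bootstrap entry in the original commit had the two parts swapped). Below is a minimal, stdlib-only sketch that detects and repairs that slip; the `fix_reversed_link` helper is illustrative and not part of GenRL:

```python
import re

# Matches a markdown link whose *text* slot holds a URL, i.e. [url](text).
REVERSED_LINK = re.compile(r"\[(https?://[^\]\s]+)\]\(([^()]+)\)")

def fix_reversed_link(line: str) -> str:
    """Swap [url](text) back into the correct [text](url) order."""
    return REVERSED_LINK.sub(lambda m: f"[{m.group(2)}]({m.group(1)})", line)

fixed = fix_reversed_link(
    "- [https://genrl.readthedocs.io/en/latest/usage/tutorials/bandit/bootstrap.html](Bootstrap)"
)
# fixed == "- [Bootstrap](https://genrl.readthedocs.io/en/latest/usage/tutorials/bandit/bootstrap.html)"
```

Note that a correctly ordered link is left untouched: the pattern only fires when the text slot itself holds a URL.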

docs/source/api/agents/genrl.agents.classical.sarsa.rst

Lines changed: 0 additions & 1 deletion
@@ -8,4 +8,3 @@ genrl.agents.classical.sarsa.sarsa module
    :members:
    :undoc-members:
    :show-inheritance:
-

docs/source/api/agents/genrl.agents.deep.ppo1.rst

Lines changed: 0 additions & 1 deletion

@@ -8,4 +8,3 @@ genrl.agents.deep.ppo1.ppo1 module
    :members:
    :undoc-members:
    :show-inheritance:
-

docs/source/api/agents/genrl.agents.deep.sac.rst

Lines changed: 0 additions & 1 deletion

@@ -9,4 +9,3 @@ genrl.agents.deep.sac.sac module
    :members:
    :undoc-members:
    :show-inheritance:
-

docs/source/api/agents/genrl.agents.deep.td3.rst

Lines changed: 0 additions & 1 deletion

@@ -8,4 +8,3 @@ genrl.agents.deep.td3.td3 module
    :members:
    :undoc-members:
    :show-inheritance:
-

docs/source/api/agents/genrl.agents.deep.vpg.rst

Lines changed: 0 additions & 1 deletion

@@ -9,4 +9,3 @@ genrl.agents.deep.vpg.vpg module
    :members:
    :undoc-members:
    :show-inheritance:
-

docs/source/usage/tutorials/Classical/Q_Learning.rst

Lines changed: 0 additions & 1 deletion

@@ -80,4 +80,3 @@ Great so far so good! Now moving towards the training process it is just calling
 
 
 That's it! You have successfully trained a Q-Learning agent. You can now go ahead and play with your own environments using GenRL!
-

docs/source/usage/tutorials/Classical/Sarsa.rst

Lines changed: 1 addition & 1 deletion

@@ -67,4 +67,4 @@ Great so far so good! Now moving towards the training process it is just calling
    trainer.train()
    trainer.evaluate()
 
-That's it! You have successfully trained a SARSA agent. You can now go ahead and play with your own environments using GenRL!
+That's it! You have successfully trained a SARSA agent. You can now go ahead and play with your own environments using GenRL!

genrl/__init__.py

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
+version = "0.0.2"
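The new module-level `version` string makes the release checkable from code. A small sketch of how a dotted version string orders correctly as an int tuple; `parse_version` is a hypothetical helper for illustration (real projects typically reach for `packaging.version.Version` instead):

```python
def parse_version(v: str) -> tuple:
    """Split a dotted version string like "0.0.2" into an int tuple for ordering."""
    return tuple(int(part) for part in v.split("."))

version = "0.0.2"  # the value this commit writes into genrl/__init__.py

assert parse_version(version) > parse_version("0.0.1")   # this release supersedes v0.0.1
assert parse_version("0.0.10") > parse_version("0.0.2")  # tuple order avoids string-comparison bugs
```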

genrl/agents/bandits/contextual/common/base_model.py

Lines changed: 2 additions & 2 deletions
@@ -2,8 +2,8 @@
 from typing import Dict
 
 import torch
-import torch.nn as nn
-import torch.nn.functional as F
+from torch import nn as nn
+from torch.nn import functional as F
 
 from genrl.agents.bandits.contextual.common.transition import TransitionDB
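The import rewrite in `base_model.py` is purely stylistic: `import torch.nn as nn` and `from torch import nn as nn` bind the same module object to the same local name (the `from … import … as …` spelling is a common import-sorter convention). This equivalence is sketched below with a stdlib module rather than torch, since torch may not be installed in every environment:

```python
# Both spellings bind the very same module object; only the syntax differs.
import os.path as via_import     # analogous to the '-' lines in the hunk
from os import path as via_from  # analogous to the '+' lines in the hunk

assert via_import is via_from  # identical module object either way
```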