@dchichkov

Does it make sense?

@danijar
Contributor

danijar commented Jan 12, 2019

Yes, this seems reasonable. Did you train an agent like this to see if it affects performance?

@dchichkov
Author

In my environment I've seen _penalty go to exactly zero, and as a result the "increase penalty" logic no longer increases it. I haven't done enough runs to tell whether this affects performance.

It may well be that it doesn't, in which case a sensible change would be to stop spending time computing the KL term once _penalty is zero!
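
A minimal sketch of the stuck-at-zero behaviour, assuming the penalty is adapted multiplicatively. The function name `update_penalty`, the 1.3 / 0.7 thresholds, and the factor 1.5 are hypothetical and chosen for illustration, not taken from the repository:

```python
def update_penalty(penalty, kl, kl_target, factor=1.5):
    """Hypothetical multiplicative update for an adaptive KL penalty."""
    if kl > 1.3 * kl_target:   # KL too large: try to increase the penalty
        return penalty * factor
    if kl < 0.7 * kl_target:   # KL too small: decrease the penalty
        return penalty / factor
    return penalty


# The KL is far above the target in both runs, so every step hits the
# "increase penalty" branch.
stuck = 0.0
recovered = 1e-3
for _ in range(10):
    stuck = update_penalty(stuck, kl=1.0, kl_target=0.01)
    recovered = update_penalty(recovered, kl=1.0, kl_target=0.01)

print(stuck)      # stays at 0.0, since 0.0 * factor == 0.0
print(recovered)  # grows, because it started from a positive value
```

Under these assumptions zero is a fixed point of the update: multiplying by any factor leaves the penalty at zero, so the KL term stays switched off no matter how large the KL becomes.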
