Modify Policy Evaluation Solution.ipynb according to David Silver's slides. #166

QikeLi · 2018-07-05T01:08:19Z

The solution provided for Policy Evaluation does not agree with the equation on page 8 of Dr. David Silver's slides for lecture 3.

amobiny · 2018-11-30T00:43:48Z

What you are saying is correct, but Denny is implementing a more general case.
In David Silver's slides, there is an assumption that taking an action a in state s gives a reward R regardless of which state transition follows. Denny's implementation takes into account that an action could yield different rewards depending on which state the environment puts you in. Since this environment is deterministic, both implementations give the same answer.
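To make the comparison concrete, here is a minimal sketch of the two backups side by side. The three-state MDP, the uniform policy, and the function names are hypothetical and only for illustration; the transition table follows the gym-style layout `P[s][a] = [(prob, next_state, reward, done), ...]`. The sketch also assumes that the slide's reward term for (s, a) can be read as the expected reward over next states. Because every (s, a) pair here has exactly one transition with probability 1, the two forms coincide on each sweep.

```python
import numpy as np

# Hypothetical deterministic MDP: P[s][a] = [(prob, next_state, reward, done)]
P = {
    0: {0: [(1.0, 0, -1.0, False)], 1: [(1.0, 1, -1.0, False)]},
    1: {0: [(1.0, 0, -1.0, False)], 1: [(1.0, 2, -1.0, True)]},
    2: {0: [(1.0, 2, 0.0, True)], 1: [(1.0, 2, 0.0, True)]},  # absorbing terminal state
}
n_states, n_actions = 3, 2
policy = np.full((n_states, n_actions), 0.5)  # uniform random policy (assumption)
gamma = 1.0


def backup_per_transition(V):
    """Reward attached to each (s, a, s') transition, inside the sum over s'."""
    new_V = np.zeros(n_states)
    for s in range(n_states):
        for a, pi_sa in enumerate(policy[s]):
            for prob, s_next, reward, _done in P[s][a]:
                new_V[s] += pi_sa * prob * (reward + gamma * V[s_next])
    return new_V


def backup_expected_reward(V):
    """Reward attached to (s, a) only, outside the sum over s'."""
    new_V = np.zeros(n_states)
    for s in range(n_states):
        for a, pi_sa in enumerate(policy[s]):
            # Assumption: the (s, a) reward is the expected reward over next states.
            r_sa = sum(prob * reward for prob, _, reward, _ in P[s][a])
            next_value = sum(prob * V[s_next] for prob, s_next, _, _ in P[s][a])
            new_V[s] += pi_sa * (r_sa + gamma * next_value)
    return new_V


V = np.zeros(n_states)
for _ in range(50):
    V_transition = backup_per_transition(V)
    V_expected = backup_expected_reward(V)
    # The two forms agree on every sweep for this MDP.
    assert np.allclose(V_transition, V_expected)
    V = V_transition
```

The per-transition form needs no separate reward table, which is convenient when the environment only exposes `(prob, next_state, reward, done)` tuples.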

Modify Policy Evaluation Solution according to David Silver's slides.

c21dec4
