For those who struggle with tf.nn.softmax_cross_entropy_with_logits_v2 in Cartpole REINFORCE Monte Carlo Policy Gradients #85

@gekator


Guys, if you are struggling with
neg_log_prob = tf.nn.softmax_cross_entropy_with_logits_v2(logits = fc3, labels = actions)
in Cartpole REINFORCE Monte Carlo Policy Gradients:
I spent some time working out what is happening there.
You can replace that line with the explicit version below:

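# Turn the logits into action probabilities
y_hat_softmax = tf.nn.softmax(fc3)
# Keep only the log-probability of the action taken (assuming actions is one-hot)
y_cross = actions * tf.log(y_hat_softmax)
# Sum over the action axis and negate: the negative log-probability per step
neg_log_prob = - tf.reduce_sum(y_cross, 1)
# Weight by the discounted episode rewards and average over the batch
loss = tf.reduce_mean(neg_log_prob * discounted_episode_rewards_)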

Also change the actions placeholder to float32 so that it can be multiplied by the log-probabilities:
actions = tf.placeholder(tf.float32, [None, action_size], name="actions")
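
To see what the rewritten loss computes without running TensorFlow, here is a minimal sketch in plain Python. It assumes the actions are one-hot vectors. The helper names (log_softmax, neg_log_prob, reinforce_loss) and the sample numbers are made up for illustration and are not part of the original code.

```python
import math

def log_softmax(logits):
    """Numerically stable log(softmax(logits)) for one row of logits."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_sum for x in logits]

def neg_log_prob(logits, one_hot_action):
    """-sum(action * log(softmax(logits))): with a one-hot action this is
    the negative log-probability of the action that was taken."""
    return -sum(a * lp for a, lp in zip(one_hot_action, log_softmax(logits)))

def reinforce_loss(batch_logits, batch_actions, discounted_rewards):
    """Mean over the batch of neg_log_prob * discounted reward."""
    terms = [
        neg_log_prob(l, a) * r
        for l, a, r in zip(batch_logits, batch_actions, discounted_rewards)
    ]
    return sum(terms) / len(terms)

# Hypothetical two-step episode with two possible actions.
logits = [[2.0, 0.5], [0.1, 1.2]]    # stands in for fc3
actions = [[1.0, 0.0], [0.0, 1.0]]   # one-hot actions taken
returns = [1.5, -0.5]                # stands in for discounted_episode_rewards_

loss = reinforce_loss(logits, actions, returns)
print(loss)
```

Note that the sketch goes through a log-softmax rather than log(softmax(...)), which avoids log(0) when a probability underflows. In TensorFlow 1.x the same effect should be available via tf.nn.log_softmax(fc3) in place of tf.log(tf.nn.softmax(fc3)).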
