Reinforcement Learning basic sample not working #112

@domswit

The basic reinforcement learning example in the documentation linked below does not seem to work: the output is always 0.
https://cs.stanford.edu/people/karpathy/convnetjs/docs.html

Proof:
I tried changing this line:
var reward = action === 0 ? 1.0 : 0.0;
into:
var reward = action === 1 ? 1.0 : 0.0;

After this change I still got the same result: 0.

Code Example:
/START CODE/
var brain = new deepqlearn.Brain(3, 2); // 3 inputs, 2 possible outputs (0,1)
var state = [Math.random(), Math.random(), Math.random()];
for(var k=0;k<10000;k++) {
  var action = brain.forward(state); // returns index of chosen action
  var reward = action === 0 ? 1.0 : 0.0;
  brain.backward([reward]); // <-- learning magic happens here
  state[Math.floor(Math.random()*3)] += Math.random()*2-0.5;
}
brain.epsilon_test_time = 0.0; // don't make any more random choices
brain.learning = false;
// get an optimal action from the learned policy
var action = brain.forward(state);
/END CODE/
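
To make the check above repeatable, here is a minimal sketch of the same experiment as a small harness that trains once per reward rule and compares the final actions. It does not use ConvNetJS: makeStubBrain, makeRandom and learnedAction are hypothetical names, and the stub is a trivial value-averaging learner that only mimics the forward()/backward() call pattern (it ignores the state). It shows what I expected to see, namely that flipping the rewarded action flips the learned action.

```javascript
// Hypothetical stand-in agent, NOT ConvNetJS. It only mimics the
// forward()/backward() call pattern so this harness runs on its own.
function makeStubBrain(numActions, random) {
  var q = new Array(numActions).fill(0); // running value estimate per action
  var last = 0;                          // action chosen by the latest forward()
  return {
    epsilon: 0.1,   // exploration rate while learning
    learning: true,
    forward: function () {
      if (this.learning && random() < this.epsilon) {
        last = Math.floor(random() * numActions);   // explore
      } else {
        last = q.indexOf(Math.max.apply(null, q));  // exploit (ties pick lowest index)
      }
      return last;
    },
    backward: function (reward) {
      // Move the estimate for the last action toward the observed reward.
      if (this.learning) { q[last] += 0.1 * (reward - q[last]); }
    }
  };
}

// Small seeded generator so repeated runs give the same answer.
function makeRandom(seed) {
  var s = seed >>> 0;
  return function () {
    s = (s * 1664525 + 1013904223) >>> 0;
    return s / 4294967296;
  };
}

// Train with reward 1.0 for `rewardedAction`, then ask for the greedy action.
function learnedAction(rewardedAction) {
  var brain = makeStubBrain(2, makeRandom(42));
  for (var k = 0; k < 10000; k++) {
    var action = brain.forward();
    brain.backward(action === rewardedAction ? 1.0 : 0.0);
  }
  brain.learning = false; // no more random choices
  return brain.forward();
}

console.log(learnedAction(0), learnedAction(1));
```

With the stub, the two calls return different actions. With the sample from the docs, both reward rules gave me 0, which is the behavior this issue is about.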
