
Conversation

avdmitry

Hello, Andrej @karpathy

I think this should be (1 - this.drop_prob). Consider drop_prob = 0.1: we drop 10% of the units, so afterwards the weights should be scaled only slightly, by 0.9.

WBR,
Dmitriy
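
For context, here is a minimal sketch of the scaling convention being discussed: zero units with probability drop_prob during training, then scale by the keep probability (1 - drop_prob) at test time. This is an illustrative Python/NumPy sketch, not the convnetjs code from the PR.

```python
import numpy as np

def dropout_forward(x, drop_prob, training):
    # Standard dropout: zero each unit independently with probability
    # drop_prob during training; scale by (1 - drop_prob) at test time
    # so expected activations match between the two modes.
    if training:
        mask = np.random.rand(*x.shape) >= drop_prob
        return x * mask
    # e.g. drop_prob = 0.1 -> scale by 0.9 at test time
    return x * (1.0 - drop_prob)
```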


jruales commented Jan 24, 2016

`E[weight] = drop_prob * 0 + (1 - drop_prob) * w[i] = (1 - drop_prob) * w[i]`, so I agree with @avdmitry
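
That expectation is easy to check numerically (a quick illustrative sketch; the values here are chosen arbitrarily and are not from the PR):

```python
import numpy as np

drop_prob = 0.1
w = 2.0           # an arbitrary weight value
n = 1_000_000     # Monte Carlo samples

# Each sample keeps w with probability (1 - drop_prob), otherwise it is zero.
kept = (np.random.rand(n) >= drop_prob) * w
print(kept.mean())          # ~1.8
print((1 - drop_prob) * w)  # 1.8, i.e. (1 - drop_prob) * w
```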


jruales commented Jan 24, 2016

@avdmitry's solution agrees with Figure 2 on the third page of the dropout paper: http://www.cs.toronto.edu/~rsalakhu/papers/srivastava14a.pdf

The Keras library takes a different (but essentially equivalent) approach: it divides by the retain probability during training, so that at test time the dropout layer is just the identity function:
https://github.com/fchollet/keras/blob/master/keras/layers/core.py#L621
https://github.com/fchollet/keras/blob/master/keras/backend/theano_backend.py#L555
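
A sketch of that inverted-dropout convention (illustrative Python/NumPy, not the actual Keras/Theano code linked above): dividing by the keep probability during training makes the test-time layer an identity.

```python
import numpy as np

def inverted_dropout_forward(x, drop_prob, training):
    # Inverted dropout: scale during training so that no correction
    # is needed at test time.
    keep_prob = 1.0 - drop_prob
    if training:
        mask = np.random.rand(*x.shape) >= drop_prob
        return x * mask / keep_prob
    # Test time: the layer is just the identity.
    return x
```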

radioman pushed a commit to radioman/ConvNetSharp that referenced this pull request Nov 24, 2016