[FLAVA] Change some initialization orders and corresponding tests #105
Conversation
[ghstack-poisoned]
Codecov Report
Coverage Diff

|          | gh/ankitade/4/base (base) | #105   |
| -------- | ------------------------- | ------ |
| Coverage | ?                         | 93.00% |
| Files    | ?                         | 47     |
| Lines    | ?                         | 2758   |
| Branches | ?                         | 0      |
| Hits     | ?                         | 2565   |
| Misses   | ?                         | 193    |
| Partials | ?                         | 0      |

Continue to review the full report at Codecov.
@ankitade has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
… tests"

- Currently the projections are part of the contrastive loss, which means we need to use "flava for pretraining" for zero shot. This is weird, since zero shot should only involve the core model (not the pretraining model).
- The next PR in this stack tried to fix that, but it broke the tests because it changed the initialization order of several components.
- So this splits that PR into two, to make sure the logic changes are not actually breaking anything:
  1. This PR, which simply changes the initialization order of the codebook and contrastive loss and updates the test assert values.
  2. The next PR, which makes the projections part of the FLAVA model and doesn't touch the tests.

Test plan: pytest

Differential Revision: [D37466221](https://our.internmc.facebook.com/intern/diff/D37466221)
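The need to update the test assert values follows from how seeded random initialization works: components constructed earlier consume earlier draws from the shared RNG stream, so swapping their construction order shifts every subsequent parameter value even when the model logic is unchanged. A minimal sketch of that effect, using Python's `random` as a stand-in for a deep-learning RNG (`Codebook` and `ContrastiveLoss` here are hypothetical toy classes, not the actual FLAVA implementations):

```python
import random


class Codebook:
    def __init__(self):
        # Toy stand-in: a real codebook would draw far more values.
        self.weight = [random.random() for _ in range(3)]


class ContrastiveLoss:
    def __init__(self):
        self.logit_scale = random.random()


def build(order: str):
    """Construct both components under a fixed seed, in the given order."""
    random.seed(0)
    if order == "codebook_first":
        cb = Codebook()
        cl = ContrastiveLoss()
    else:
        cl = ContrastiveLoss()
        cb = Codebook()
    return cb.weight, cl.logit_scale


# Same seed, different construction order: the draws are assigned to
# different parameters, so every stored reference value changes.
print(build("codebook_first") == build("loss_first"))  # False
```

This is why reordering the codebook and contrastive-loss initialization can change test reference values without any behavioral bug.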
Test plan
pytest
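The pytest-based test plan implies tests that compare seeded outputs against stored reference values. A minimal sketch of that pattern (hypothetical names, not the actual FLAVA tests), showing why such references must be regenerated whenever the order of seeded initializations changes:

```python
import random


def build_param(seed: int = 0) -> float:
    # Hypothetical stand-in for constructing a seeded model parameter.
    random.seed(seed)
    return round(random.random(), 6)


# Stored reference value. It must be regenerated whenever the order of
# seeded initializations changes, even if the model logic is identical.
EXPECTED = build_param()


def test_build_is_reproducible():
    assert build_param() == EXPECTED
```

Under this pattern, a pure reordering of initializations shows up as assert-value churn in the tests, which is exactly what this PR isolates from the logic change in the next PR.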
Stack from ghstack (oldest at bottom):
Differential Revision: D37466221