
Conversation

ankitade
Contributor

@ankitade commented Jun 21, 2022

  • Currently the projections are part of the contrastive loss, which means we need to use "flava for pretraining" for zero-shot. This is awkward since zero-shot should only involve the core model (not the pretraining model).
  • The next PR in this stack tried to fix this, but it broke the tests because it changed the initialization order of several components.
  • So that PR is split into two, to make sure the logic changes themselves are not breaking anything:
    1. This PR, which only changes the initialization order of the codebook and the contrastive loss and updates the test assert values (the sketch below shows why reordering alone changes the expected values)
    2. The next PR, which makes the projections part of the flava model and doesn't touch the tests
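
Why do the asserted values change when only the construction order changes? Under a fixed seed, each submodule consumes random numbers as it is built, so reordering construction shifts which values every module receives. The sketch below illustrates this with hypothetical toy modules (`ToyCodebook` and `ToyProjection` are stand-ins, not the actual FLAVA classes):

```python
# Minimal sketch: with a fixed seed, the weights each module ends up with
# depend on the order in which modules are constructed, because every
# construction consumes values from the same global RNG stream.
import torch
from torch import nn


class ToyCodebook(nn.Module):  # hypothetical stand-in for the codebook
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(8, 4)  # draws random values at init


class ToyProjection(nn.Module):  # hypothetical stand-in for the contrastive-loss projection
    def __init__(self):
        super().__init__()
        self.proj = nn.Linear(4, 4)  # also draws random values at init


def build(codebook_first: bool):
    torch.manual_seed(0)
    if codebook_first:
        codebook, projection = ToyCodebook(), ToyProjection()
    else:
        projection, codebook = ToyProjection(), ToyCodebook()
    return codebook, projection


cb_a, proj_a = build(codebook_first=True)
cb_b, proj_b = build(codebook_first=False)

# Same seed, same modules, different construction order -> different weights,
# so expected values hard-coded in the tests have to be regenerated.
print(torch.allclose(cb_a.embed.weight, cb_b.embed.weight))    # False
print(torch.allclose(proj_a.proj.weight, proj_b.proj.weight))  # False
```

No forward logic changes in this PR; only the golden values asserted in the tests move.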

Test plan
pytest

Stack from ghstack (oldest at bottom):

Differential Revision: D37466221

@facebook-github-bot added the CLA Signed label Jun 21, 2022
ankitade added a commit that referenced this pull request Jun 21, 2022
@codecov-commenter

codecov-commenter commented Jun 21, 2022

Codecov Report

❗ No coverage uploaded for pull request base (gh/ankitade/4/base@7cdddc5).
The diff coverage is n/a.

@@                  Coverage Diff                  @@
##             gh/ankitade/4/base     #105   +/-   ##
=====================================================
  Coverage                      ?   93.00%           
=====================================================
  Files                         ?       47           
  Lines                         ?     2758           
  Branches                      ?        0           
=====================================================
  Hits                          ?     2565           
  Misses                        ?      193           
  Partials                      ?        0           

Continue to review full report at Codecov.

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 7cdddc5...750525f. Read the comment docs.

ankitade added a commit that referenced this pull request Jun 21, 2022
ankitade added a commit that referenced this pull request Jun 22, 2022
ankitade added a commit that referenced this pull request Jun 22, 2022
ankitade added a commit that referenced this pull request Jun 26, 2022
ankitade added a commit that referenced this pull request Jun 26, 2022
@ankitade changed the title from "[FLAVA]Change ordering on contrastive loss initialization" to "[FLAVA]Change some initialization orders and corresponding tests" Jun 27, 2022
@ankitade marked this pull request as ready for review June 27, 2022 19:47
@ankitade
Contributor Author

@ankitade has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

ankitade added a commit that referenced this pull request Jun 28, 2022
ankitade added 2 commits June 28, 2022 06:36
@ankitade
Contributor Author

@ankitade has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@ankitade
Contributor Author

@ankitade has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

5 similar comments

@ankitade
Contributor Author

@ankitade has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@facebook-github-bot deleted the gh/ankitade/4/head branch July 29, 2022 14:17