Hello, first of all, congratulations on this work, keep it up!
Reading the Relational Proxies paper, I noticed a reasonable argument (page 6, paragraph 1, last sentence in the arXiv version) which states: "Unlike the usual vision transformer, we omit the usage of positional embedding".
This completely makes sense, and I agree with it. But if I understood correctly, it seems to contradict src.networks.ast.AST, in which a learnable positional encoding is used. Is it still valid on the grounds that no inductive bias is imposed by this encoding? Doesn't it change the final embedding if we shuffle the crops? Both approaches may well be valid, but I want to double-check that I'm tackling the problem properly.
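To make my concern concrete, here is a minimal toy sketch (my own code, not the repo's; the single encoder layer and mean pooling are stand-ins I chose for illustration) showing that, with a learnable positional embedding, the pooled representation depends on the crop order, whereas omitting it keeps it order-invariant:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-in for the crop encoder: one transformer encoder
# layer over a sequence of crop features.
dim, n_crops = 64, 8
encoder = nn.TransformerEncoderLayer(d_model=dim, nhead=4, batch_first=True).eval()
pos_emb = nn.Parameter(torch.randn(1, n_crops, dim))  # learnable positional encoding

crops = torch.randn(1, n_crops, dim)
perm = torch.randperm(n_crops)

with torch.no_grad():
    # Without positional embeddings, self-attention is permutation-equivariant,
    # so mean pooling over the crop outputs is unchanged by shuffling the crops.
    out_a = encoder(crops).mean(dim=1)
    out_b = encoder(crops[:, perm]).mean(dim=1)
    print(torch.allclose(out_a, out_b, atol=1e-5))  # True

    # With a learnable positional embedding added before the encoder,
    # shuffling the crops changes the pooled output.
    out_c = encoder(crops + pos_emb).mean(dim=1)
    out_d = encoder(crops[:, perm] + pos_emb).mean(dim=1)
    print(torch.allclose(out_c, out_d, atol=1e-5))  # False (in general)
```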
(edit) At line 29 of src.models.relational_proxies.py, the proxy criterion is defined after the optimizer. The official documentation of the loss specifies that its parameters (the proxies) should be optimized along with the other parameters. I suspect this means the code currently optimizes with fixed proxies; did I miss something?
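For reference, assuming the proxy criterion is something like ProxyAnchorLoss from pytorch-metric-learning (an assumption on my part; that library does register the proxies as learnable parameters), the pattern I had in mind would be to construct the criterion before building the optimizer, or to give its parameters their own optimizer, roughly like this (a sketch, not the repo's code):

```python
import torch
from pytorch_metric_learning import losses

# Hypothetical model and hyperparameters, only to illustrate the ordering issue.
model = torch.nn.Linear(512, 128)
proxy_criterion = losses.ProxyAnchorLoss(num_classes=200, embedding_size=128)

# Option A: one optimizer over both the model parameters and the proxies.
optimizer = torch.optim.AdamW(
    list(model.parameters()) + list(proxy_criterion.parameters()), lr=1e-4
)

# Option B: a separate optimizer dedicated to the loss parameters,
# stepped alongside the main one during training.
loss_optimizer = torch.optim.SGD(proxy_criterion.parameters(), lr=1e-2)
```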
Thanks!