FlyVec-10 nearest neighbor words in the hash code space #1

SizhaoXu opened this issue Sep 2, 2022 · 3 comments

Comments

SizhaoXu commented Sep 2, 2022

I'm also running the FlyVec evaluation experiment, but I don't know how to get the 10 nearest-neighbor words in the hash code space. After reading your code, I think the top_k_similar_words function in your experiment does not compute the 10 nearest-neighbor words described in the paper. Do you have any ideas about this experiment?

Flowshu (Owner) commented Sep 2, 2022

Hi.
Are you referring to the sim function?
That should calculate the similarity between two embeddings. Since the sparse embeddings are binary, we can just compare two embeddings across all dimensions with equality (==) and sum them up.
This should be equivalent to calculating a proper distance (e.g. L2) since the (squared) differences between 0 and 1 can only be 0 or 1. I think this is also how it is described in the paper in section 3.1.
Does that make sense or did I misunderstand your question?
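To illustrate the equivalence described above, here is a minimal sketch (not the repository's actual code; the vocabulary, dimensions, and function names are hypothetical) of ranking nearest-neighbor words by counting matching dimensions between binary hash codes:

```python
import numpy as np

# Hypothetical toy data: binary hash codes for a small vocabulary.
# In the real experiment these would be the FlyVec sparse embeddings.
rng = np.random.default_rng(0)
vocab = ["apple", "banana", "cherry", "date", "elderberry"]
codes = rng.integers(0, 2, size=(len(vocab), 16))  # (vocab_size, hash_dim)

def sim(a, b):
    """Similarity of two binary codes: number of matching dimensions.
    For 0/1 vectors this ranks pairs identically to negative L2 distance,
    since each squared difference is either 0 or 1."""
    return int(np.sum(a == b))

def top_k_similar_words(query_idx, k=3):
    """Return the k nearest-neighbor words of vocab[query_idx] under sim()."""
    sims = [sim(codes[query_idx], codes[i]) for i in range(len(vocab))]
    order = np.argsort(sims)[::-1]                # most similar first
    order = [i for i in order if i != query_idx]  # drop the query word itself
    return [vocab[i] for i in order[:k]]

print(top_k_similar_words(0))
```

With k=10 and the full vocabulary, the same ranking procedure would yield the 10 nearest-neighbor words in the hash code space.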

SizhaoXu (Author) commented Sep 4, 2022


Thank you, I do the same for the sim function. My question is about Figure 4 in the paper; I don't know how to reproduce it. Do you know how to use FlyVec to get context-dependent word embeddings?

Flowshu (Owner) commented Sep 5, 2022

The functions get_sparse_embeddings and get_dense_embeddings provided in the library only produce static embeddings.
I am not sure about the context-dependent case.
The authors of the original paper can probably help you here, and I saw you already reached out to them in their repo.
