I'm also running the FlyVec evaluation experiment, but I don't know how to get the 10 nearest neighbor words in the hash code space. After reading your code, I think the top_k_similar_words in your experiment are not the 10 nearest neighbor words from the paper. Do you have any ideas about this experiment?
Hi.
Are you referring to the sim function?
That should calculate the similarity between two embeddings. Since the sparse embeddings are binary, we can just compare two embeddings across all dimensions with equality (==) and sum them up.
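Ranking by this count should be equivalent to ranking by a proper distance (e.g. L2): for binary vectors each squared difference can only be 0 or 1, so the squared L2 distance is just the number of mismatching dimensions, i.e. the dimensionality minus the match count. I think this is also how it is described in the paper in section 3.1.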
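To make the comparison concrete, here is a minimal sketch using NumPy. The names match_count, squared_l2 and top_k_neighbors, as well as the toy codes, are made up for illustration; they are not the functions from this repository or from the FlyVec library, and the sketch assumes the embeddings are already available as 0/1 arrays.

```python
import numpy as np


def match_count(a, b):
    """Number of dimensions on which two binary codes agree."""
    return int(np.sum(a == b))


def squared_l2(a, b):
    """Squared Euclidean distance; for 0/1 vectors this counts the mismatches."""
    diff = a.astype(np.int64) - b.astype(np.int64)
    return int(np.sum(diff * diff))


def top_k_neighbors(query, codes, k):
    """Indices of the k rows of `codes` that agree with `query` most often."""
    scores = np.sum(codes == query, axis=1)      # one match count per row
    order = np.argsort(-scores, kind="stable")   # highest count first
    return order[:k].tolist()


# Toy binary codes (hypothetical data, one row per word).
codes = np.array([
    [1, 1, 0, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [0, 0, 0, 0, 1, 1],
    [1, 1, 0, 0, 0, 1],
])
query = np.array([1, 1, 0, 0, 0, 0])

# Matches and mismatches always add up to the number of dimensions,
# so sorting by match count and sorting by L2 distance give the same order.
assert match_count(query, codes[1]) + squared_l2(query, codes[1]) == codes.shape[1]

print(top_k_neighbors(query, codes, 3))
```

If the query word is itself a row of the code matrix, it will come back as its own nearest neighbor, so it may need to be dropped from the result.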
Does that make sense or did I misunderstand your question?
Thank you, I do the same for the sim function. My question is about Figure 4 in the paper: I don't know how to reproduce it. Do you know how to use FlyVec to get the context-dependent word embeddings?
The functions get_sparse_embeddings and get_dense_embeddings provided in the library only produce static embeddings.
I am not sure about the context-dependent case.
The authors of the original paper can probably help you here, and I saw that you have already reached out to them in their repo.