Validation Score #5
Comments
The link you provided is the latest version of the paper, but this repo corresponds to the v3 version (https://arxiv.org/pdf/1705.02798v3.pdf), which was published on 5 Sep 2017. If I have free time, I will check what has been improved. Of course, you are welcome to help find the differences. Thanks!
Running the training right now on 2 GPUs. At epoch 21: EM 72.93 and F1 81.69. Looks like I might have already gotten better than the published results. Happy to help if you point me in some direction.
Hi Sean,

The following are the changes in the new paper compared with V3. The embedding layers are the same as before, as the new paper barely mentions them.

1.) Reattention Mechanism - In addition to Iterative Alignment and Self Alignment, the paper proposes an alignment memory layer to address the problem that each alignment is not directly aware of previous alignments. The intuition is that two words should be correlated if their attentions over the same texts overlap heavily, and less related otherwise. Suppose we have access to previous attentions; we can then compute their dot product to obtain a “similarity of attention”. This is the biggest addition to the model versus V3 and would be the main coding effort.

2.) Dynamic-critical Reinforcement Learning - This is essentially a combination of the previous Memory-based Answer Pointer and Reinforcement Learning, with changes. The changes are few and should be easy to implement.

Let me know what you think about the above.

Thanks in advance,
Muni
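To make the "similarity of attention" idea in point 1 concrete, here is a minimal NumPy sketch. It only illustrates the intuition described above (the dot product of two words' previous attention distributions over the same texts) and is not the paper's exact formulation or this repo's code. The function names, the array shapes, and the `gamma` weight are all assumptions made for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def attention_similarity(prev_scores_a, prev_scores_b):
    """Dot product of previous attention distributions.

    prev_scores_a: (n, k) raw attention scores of n words over k positions.
    prev_scores_b: (m, k) raw attention scores of m words over the same k positions.
    Returns an (n, m) matrix; entry (i, j) is large when word i and word j
    attended to the same positions in the previous round.
    """
    attn_a = softmax(prev_scores_a, axis=-1)
    attn_b = softmax(prev_scores_b, axis=-1)
    return attn_a @ attn_b.T


def reattend(current_scores, prev_scores_a, prev_scores_b, gamma=1.0):
    """Add the attention-similarity term to the current raw scores.

    gamma is a hypothetical weight on the memory term (an assumption here).
    """
    return current_scores + gamma * attention_similarity(prev_scores_a, prev_scores_b)


# Words 0 and 1 of set A attend to different positions; the single word of
# set B attends to the same position as word 0 of set A.
prev_a = np.array([[10.0, 0.0, 0.0],
                   [0.0, 10.0, 0.0]])
prev_b = np.array([[10.0, 0.0, 0.0]])
sim = attention_similarity(prev_a, prev_b)
```

In this toy input, `sim[0, 0]` comes out much larger than `sim[1, 0]`, because the first pair of words has overlapping previous attention and the second pair does not.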
@muni2773 have you made any progress on EM or F1 compared to the V3 version?
Hi, you mention in your README that the original paper results were
But the paper reports much higher results?
https://arxiv.org/pdf/1705.02798.pdf
Could you shed some light on why this is different?