Hi, I am trying to understand how you combined the hard negative loss `L_s` with the in-batch random negative loss `L_r`. In the paper, `L_r` is scaled by an `alpha` hyperparameter, but there is no mention of the value of `alpha` you used in the experiments.
Following `star/train.py`, I found the `RobertaDot_InBatch` model, whose `forward` function calls the `inbatch_train` method.
At the end of the `inbatch_train` method (line 182), I found

```python
return ((first_loss + second_loss) / (first_num + second_num),)
```

which is different from the combined loss proposed in the paper (Eq. 13). Am I missing something?
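For context, here is a minimal sketch of the two combinations I am comparing. `first_loss`, `second_loss`, `first_num`, and `second_num` are the variables from `inbatch_train`; the `alpha`-weighted form is only my reading of Eq. 13 and is an assumption, since `alpha` does not appear anywhere in the code:

```python
# What line 182 returns: one pooled average over all negative pairs,
# hard and in-batch alike, with no explicit alpha weighting.
def combined_loss_repo(first_loss, second_loss, first_num, second_num):
    return (first_loss + second_loss) / (first_num + second_num)

# My reading of Eq. 13 (an assumption, not the confirmed implementation):
# average each term separately, then scale the in-batch random negative
# loss L_r by alpha.
def combined_loss_paper(first_loss, second_loss, first_num, second_num, alpha=1.0):
    hard_loss = first_loss / first_num      # hard negative loss L_s
    rand_loss = second_loss / second_num    # in-batch random negative loss L_r
    return hard_loss + alpha * rand_loss
```

If I read it correctly, the pooled average implicitly weights `L_r` by `second_num / (first_num + second_num)` rather than by a fixed `alpha`, which is what prompted the question.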
Also, for each query in the batch, did you consider all possible in-batch random negatives, or just one?
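To make this second question concrete, here is a sketch of what I mean by using all in-batch random negatives: every other passage in the batch serves as a negative for each query. This is the generic formulation, not necessarily your implementation; `q_emb` and `p_emb` are hypothetical names:

```python
import torch
import torch.nn.functional as F

def all_inbatch_negatives_loss(q_emb: torch.Tensor, p_emb: torch.Tensor) -> torch.Tensor:
    """q_emb: (B, d) query embeddings; p_emb: (B, d) embeddings of each
    query's positive passage. Off-diagonal entries of the score matrix
    act as the in-batch random negatives."""
    scores = q_emb @ p_emb.T                                    # (B, B) similarity matrix
    labels = torch.arange(q_emb.size(0), device=q_emb.device)   # diagonal = positives
    return F.cross_entropy(scores, labels)
```

By contrast, "just one" would mean sampling a single off-diagonal passage per query instead of using the full (B, B) matrix.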
Thanks in advance!