DPR finetuning
#5645
-
Hi @rnyak! Hard negatives are added to the batch alongside the positive passages, and they are additionally used as in-batch negatives for the other queries in the batch. If we didn't add the hard-negative passages, the training task would be a lot easier for the model, as the in-batch negatives would only consist of random passages that are probably not related to the query at all. By adding hard-negative passages, we make sure to include passages that are related to the query (for example, by being about the same topic) but are not relevant for answering it.
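To make that batch composition concrete, here is a minimal PyTorch-style sketch of the DPR training objective with in-batch negatives plus hard negatives. The tensor names and sizes are assumptions for illustration only; this is not Haystack's actual implementation.

```python
import torch
import torch.nn.functional as F

# Illustrative sizes (assumed): 4 queries per batch, 2 hard negatives per query,
# 768-dim embeddings. In practice the embeddings come from the query/passage encoders.
batch_size, n_hard, dim = 4, 2, 768
query_emb = torch.randn(batch_size, dim)              # one embedding per query
pos_emb = torch.randn(batch_size, dim)                # one positive passage per query
hard_neg_emb = torch.randn(batch_size * n_hard, dim)  # hard negatives for all queries

# Every passage in the batch is scored against every query, so query i sees:
#   - its own positive passage (the correct one),
#   - the positives of the other queries (random in-batch negatives),
#   - all hard negatives in the batch (related to some query, but not relevant).
all_passages = torch.cat([pos_emb, hard_neg_emb], dim=0)  # (B + B*n_hard, D)
scores = query_emb @ all_passages.T                       # (B, B + B*n_hard)

# The correct passage for query i sits at column i, so training reduces to
# cross-entropy over the score matrix.
labels = torch.arange(batch_size)
loss = F.cross_entropy(scores, labels)
```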
-
Hello, I was going over this tutorial, but one thing is not clear to me. The DPR format asks for hard negatives, and the default number is 30. But then there is a statement in the tutorial about in-batch negative sampling. So my question is: if there is in-batch negative sampling, why are hard negatives required in the DPR dataset format? Does that mean that if one provides hard negatives, those will be used during fine-tuning instead of generating negatives via in-batch sampling? Thanks.
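For reference, a single training example in the DPR data format looks roughly like the sketch below. The field names follow the original DPR repository and Haystack's DPR tutorial; the texts themselves are invented for illustration.

```python
# A rough sketch of one training example in the DPR JSON format (assumed schema).
sample = {
    "question": "Who wrote Pride and Prejudice?",
    "answers": ["Jane Austen"],
    "positive_ctxs": [
        {"title": "Jane Austen",
         "text": "Jane Austen was an English novelist best known for Pride and Prejudice ..."}
    ],
    "negative_ctxs": [],  # optional random negatives; in-batch negatives typically cover this
    "hard_negative_ctxs": [
        # On-topic (another 19th-century novel) but does not answer the question.
        {"title": "Jane Eyre",
         "text": "Jane Eyre is an 1847 novel by Charlotte Brontë ..."}
    ],
}
```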