How to Evaluate Retriever on own test_data? #4034
Replies: 5 comments 7 replies
-
Hi @VikasRathod314! If you already have an annotated dataset, evaluating the retriever should be no problem: have a look at our Evaluation Documentation page and at our Evaluation tutorial.
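For reference, a minimal sketch of the kind of setup the tutorial describes, assuming a Haystack v1 installation with an ElasticsearchDocumentStore that already holds your documents and annotated labels (index names and the embedding model are placeholders):

```python
from haystack.document_stores import ElasticsearchDocumentStore
from haystack.nodes import EmbeddingRetriever
from haystack.pipelines import DocumentSearchPipeline

# Assumed: documents and gold labels were already written to these indices.
document_store = ElasticsearchDocumentStore(index="documents", label_index="labels")
retriever = EmbeddingRetriever(
    document_store=document_store,
    embedding_model="sentence-transformers/multi-qa-mpnet-base-dot-v1",
)
document_store.update_embeddings(retriever)

pipeline = DocumentSearchPipeline(retriever=retriever)

# Pull the annotated labels and evaluate the retriever on them.
eval_labels = document_store.get_all_labels_aggregated(
    drop_negative_labels=True, drop_no_answers=False
)
eval_result = pipeline.eval(labels=eval_labels, params={"Retriever": {"top_k": 10}})

metrics = eval_result.calculate_metrics()
print(metrics["Retriever"])  # recall, MRR, MAP, etc.
```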
-
@bogdankostic, how can I improve the accuracy of the retriever? I have trained and tested the retriever, but I am getting low accuracy; I increased the dataset size, but still see no improvement. Please suggest.
-
Good morning @mayankjobanputra, the test data is about 650 rows, generated with PseudoLabelGenerator. I have also checked that data and removed unwanted questions along with their respective pos_doc and neg_doc entries. Please suggest how I can make the retriever better.
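Side note: if that cleanup is currently done by hand, a small filter over the generated entries can make it reproducible. A rough sketch, assuming each generated entry is a dict with `question`, `pos_doc`, and `neg_doc` keys as mentioned above (the filtering rules are only examples):

```python
def clean_pseudo_labels(entries, min_question_words=4):
    """Drop generated examples with very short or duplicated questions."""
    seen_questions = set()
    cleaned = []
    for entry in entries:
        question = entry["question"].strip()
        # Skip questions that are too short to be meaningful.
        if len(question.split()) < min_question_words:
            continue
        # Skip duplicate questions so one document does not dominate the set.
        if question.lower() in seen_questions:
            continue
        seen_questions.add(question.lower())
        cleaned.append(entry)
    return cleaned
```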
-
@bogdankostic this is strange. In multiple places I find "Accuracy@k" reported as the main metric for DPR, for example https://arxiv.org/pdf/2004.04906.pdf, page 5.
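For what it's worth, the "Accuracy@k" (top-k retrieval accuracy) reported in the DPR paper can also be computed by hand from retrieval results. A minimal sketch, assuming you have the retrieved document ids per query and the gold document ids per query:

```python
def top_k_accuracy(retrieved_ids_per_query, gold_ids_per_query, k=20):
    """Fraction of queries for which at least one gold document
    appears among the top-k retrieved documents (DPR's Accuracy@k)."""
    hits = 0
    for retrieved_ids, gold_ids in zip(retrieved_ids_per_query, gold_ids_per_query):
        if any(doc_id in gold_ids for doc_id in retrieved_ids[:k]):
            hits += 1
    return hits / len(gold_ids_per_query)

# Example: accuracy@2 over three queries with one gold document each.
retrieved = [["d1", "d7", "d3"], ["d4", "d2", "d9"], ["d8", "d5", "d6"]]
gold = [{"d7"}, {"d2"}, {"d1"}]
print(top_k_accuracy(retrieved, gold, k=2))  # 2 of 3 queries hit -> 0.666...
```

With a single gold document per query, this should correspond to the recall figure Haystack reports for the retriever node.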
-
Hey @bogdankostic, is it possible to evaluate throughput and indexing time using eval()? I know I get the logger information, but is there any way to get measurements of indexing time, throughput, etc.? Thank you :)
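As far as I know, eval() itself reports quality metrics rather than speed, but indexing time and query throughput can be measured with plain timing around the relevant calls. A rough sketch, assuming `document_store`, `retriever`, `documents`, and a list of `queries` already exist (all names are placeholders):

```python
import time

# Indexing time: how long it takes to write and embed the documents.
start = time.perf_counter()
document_store.write_documents(documents)
document_store.update_embeddings(retriever)
indexing_seconds = time.perf_counter() - start
print(f"Indexing took {indexing_seconds:.1f} s "
      f"({len(documents) / indexing_seconds:.1f} docs/s)")

# Query throughput: how many queries the retriever answers per second.
start = time.perf_counter()
for query in queries:
    retriever.retrieve(query=query, top_k=10)
query_seconds = time.perf_counter() - start
print(f"{len(queries) / query_seconds:.2f} queries/s "
      f"({1000 * query_seconds / len(queries):.0f} ms/query)")
```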
-
I trained the retriever on our dataset by generating labels with PseudoLabelGenerator and passing them to the retriever for training.
I also found the eval_beir method for evaluating models, but it uses a different data format, so it isn't easy to evaluate our trained model with it. Is there any other way to evaluate retrievers on document search?
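One option, if the BEIR data format is the obstacle: build Haystack Label objects directly from your own (query, relevant document) pairs and evaluate a DocumentSearchPipeline with them. A minimal sketch, assuming Haystack v1, a trained `retriever`, and a list `qrels` of (query text, relevant Document from the document store) pairs; the variable names and the `is_correct_answer=False` convention for document-only labels are assumptions here:

```python
from haystack.schema import Label, MultiLabel
from haystack.pipelines import DocumentSearchPipeline

# Assumed: `qrels` pairs each query with a Document that is already
# in the document store, so document ids match at evaluation time.
labels = [
    MultiLabel(labels=[
        Label(
            query=query,
            document=relevant_doc,
            is_correct_document=True,
            is_correct_answer=False,  # document search only, no answer spans
            origin="gold-label",
            answer=None,
        )
    ])
    for query, relevant_doc in qrels
]

pipeline = DocumentSearchPipeline(retriever=retriever)
eval_result = pipeline.eval(labels=labels, params={"Retriever": {"top_k": 10}})
print(eval_result.calculate_metrics()["Retriever"])  # recall, MRR, MAP, ...
```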