MultiHopEmbeddingRetriever Poor Performance on Hotpot QA Dataset #5867

ss2342 · 2023-09-24T17:08:33Z

I have been using Haystack to build an extractive QA engine for my project, and I wanted to go a step further and see whether I could extend it to multi-hop questions. I came across the MultiHopEmbeddingRetriever and have been trying it on the HotpotQA dataset to see how it performs.

Setup

I first loaded the dev set from the HotpotQA website. I took the contexts, which are stored as lists, and joined them into strings, which I then split into 400-word chunks using Haystack's PreProcessor. After that, I loaded the data into a FAISSDocumentStore. I then initialized the MultiHopEmbeddingRetriever and a FARMReader to set up my ExtractiveQAPipeline:

from haystack.nodes import MultihopEmbeddingRetriever, FARMReader
from haystack.pipelines import ExtractiveQAPipeline

# document_store is the FAISSDocumentStore populated in the step described above.
EMBEDDING_MODEL = "sentence-transformers/multi-qa-mpnet-base-dot-v1"
mhop_retriever = MultihopEmbeddingRetriever(
    embedding_model=EMBEDDING_MODEL,
    document_store=document_store,
    model_format="sentence_transformers",
    num_iterations=20,
    use_gpu=True,
)
reader = FARMReader(model_name_or_path="deepset/roberta-large-squad2", use_gpu=True)

# Wire the multi-hop retriever and the reader into the extractive QA pipeline.
pipe = ExtractiveQAPipeline(reader=reader, retriever=mhop_retriever)
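
For reference, the context-joining step described in Setup could look roughly like the sketch below. It is not the exact code used here: the helper name `flatten_context` is hypothetical, and it assumes one record per context title, shaped as a `content`/`meta` dictionary so the title stays available after chunking.

```python
from typing import Dict, List


def flatten_context(example: Dict) -> List[Dict]:
    """Turn one HotpotQA example into one plain-text record per context title.

    Hypothetical helper: the record layout ("content" plus "meta") is an
    assumption chosen so that the title survives later chunking.
    """
    records = []
    for title, sentences in example["context"]:
        # In the dataset, sentences after the first already begin with a space,
        # so a plain join preserves the original spacing.
        text = "".join(sentences).strip()
        records.append(
            {"content": text, "meta": {"title": title, "question_id": example["_id"]}}
        )
    return records
```

Keeping the title in `meta` makes it possible to check later which HotpotQA paragraphs a retrieved chunk came from.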

Methodology

Before running my pipeline against all of the questions in the HotpotQA dataset, I tried it on just a few questions to see what the output looked like. However, I could not get the right answer for any of the questions I tried. I experimented with various combinations of num_iterations, pooling_strategy, top_k_retriever, and top_k_reader, but none of them ever yielded the correct answer.

Example

This is one of the examples from the HotpotQA dataset that I tried.

{'_id': '5a8c7595554299585d9e36b6',
 'answer': 'Chief of Protocol',
 'question': 'What government position was held by the woman who portrayed Corliss Archer in the film Kiss and Tell?',
 'supporting_facts': [['Kiss and Tell (1945 film)', 0],
  ['Shirley Temple', 0],
  ['Shirley Temple', 1]],
 'context': [['A Kiss for Corliss',
   ['A Kiss for Corliss is a 1949 American comedy film directed by Richard Wallace and written by Howard Dimsdale.',
    ' It stars Shirley Temple in her final starring role as well as her final film appearance.',
    ' It is a sequel to the 1945 film "Kiss and Tell".',
    ' "A Kiss for Corliss" was retitled "Almost a Bride" before release and this title appears in the title sequence.',
    ' The film was released on November 25, 1949, by United Artists.']],
  ['Lord High Treasurer',
   ['The post of Lord High Treasurer or Lord Treasurer was an English government position and has been a British government position since the Acts of Union of 1707.',
    ' A holder of the post would be the third-highest-ranked Great Officer of State, below the Lord High Steward and the Lord High Chancellor.']],
  ['Meet Corliss Archer (TV series)',
   ['Meet Corliss Archer is an American television sitcom that aired on CBS (July 13, 1951 - August 10, 1951) and in syndication via the Ziv Company from April to December 1954.',
    ' The program was an adaptation of the radio series of the same name, which was based on a series of short stories by F. Hugh Herbert.']],
  ['Village accountant',
   ['The Village Accountant (variously known as "Patwari", "Talati", "Patel", "Karnam", "Adhikari", "Shanbogaru","Patnaik" etc.) is an administrative government position found in rural parts of the Indian sub-continent.',
    ' The office and the officeholder are called the "patwari" in Telangana, Bengal, North India and in Pakistan while in Sindh it is called "tapedar".',
    ' The position is known as the "karnam" in Andhra Pradesh, "patnaik" in Orissa or "adhikari" in Tamil Nadu, while it is commonly known as the "talati" in Karnataka, Gujarat and Maharashtra.',
    ' The position was known as the "kulkarni" in Northern Karnataka and Maharashtra.',
    ' The position was known as the "shanbogaru" in South Karnataka.']],
  ['Joseph Kalite',
   ['Joseph Kalite (died 24 January 2014) was a Central African politician.',
    ' As a government minister he either held the housing or health portfolio.',
    ' Kalite, a Muslim, was reported to be killed by anti-balaka outside the Central Mosque in the capital Bangui during the Central African Republic conflict.',
    ' He was killed with machetes on the day in Bangui after interim president Catherine Samba-Panza took power.',
    ' At the time of the attack Kalite held no government position, nor did he under the Séléka rule.',
    ' He was reported to have supported the rule of Séléka leader Michel Djotodia.']],
  ['Charles Craft',
   ['Charles Craft (May 9, 1902 – September 19, 1968) was an English-born American film and television editor.',
    ' Born in the county of Hampshire in England on May 9, 1902, Craft would enter the film industry in Hollywood in 1927.',
    ' The first film he edited was the Universal Pictures silent film, "Painting the Town".',
    ' Over the next 25 years, Craft would edit 90 feature-length films.',
    ' In the early 1950s he would switch his focus to the small screen, his first show being "Racket Squad", from 1951–53, for which he was the main editor, editing 93 of the 98 episodes.',
    ' He would work on several other series during the 1950s, including "Meet Corliss Archer" (1954), "Science Fiction Theatre" (1955–56), and "Highway Patrol" (1955–57).',
    ' In the late 1950s and early 1960s he was one of the main editors on "Sea Hunt", starring Lloyd Bridges, editing over half of the episodes.',
    ' His final film work would be editing "Flipper\'s New Adventure" (1964, the sequel to 1963\'s "Flipper".',
    ' When the film was made into a television series, Craft would begin the editing duties on that show, editing the first 28 episodes before he retired in 1966.',
    ' Craft died on September 19, 1968 in Los Angeles, California.']],
  ['Meet Corliss Archer',
   ["Meet Corliss Archer, a program from radio's Golden Age, ran from January 7, 1943 to September 30, 1956.",
    ' Although it was CBS\'s answer to NBC\'s popular "A Date with Judy", it was also broadcast by NBC in 1948 as a summer replacement for "The Bob Hope Show".',
    ' From October 3, 1952 to June 26, 1953, it aired on ABC, finally returning to CBS.',
    " Despite the program's long run, fewer than 24 episodes are known to exist."]],
  ['Janet Waldo',
   ['Janet Marie Waldo (February 4, 1920 – June 12, 2016) was an American radio and voice actress.',
    ' She is best known in animation for voicing Judy Jetson, Nancy in "Shazzan", Penelope Pitstop, and Josie in "Josie and the Pussycats", and on radio as the title character in "Meet Corliss Archer".']],
  ['Kiss and Tell (1945 film)',
   ['Kiss and Tell is a 1945 American comedy film starring then 17-year-old Shirley Temple as Corliss Archer.',
    ' In the film, two teenage girls cause their respective parents much concern when they start to become interested in boys.',
    " The parents' bickering about which girl is the worse influence causes more problems than it solves."]],
  ['Secretary of State for Constitutional Affairs',
   ['The office of Secretary of State for Constitutional Affairs was a British Government position, created in 2003.',
    " Certain functions of the Lord Chancellor which related to the Lord Chancellor's Department were transferred to the Secretary of State.",
    ' At a later date further functions were also transferred to the Secretary of State for Constitutional Affairs from the First Secretary of State, a position within the government held by the Deputy Prime Minister.']]],
 'type': 'bridge',
 'level': 'hard'}

These are the results from the ExtractiveQAPipeline (num_iterations=20, pooling_strategy="reduce_mean", top_k_retriever=10, top_k_reader=5):

('Query: What government position was held by the woman who portrayed Corliss '
 'Archer in the film Kiss and Tell?')
'Answers:'
[   {   'answer': 'the first female member of the DGA',
        'context': ', George Cukor ("A Bill of Divorcement"), Dorothy Arzner '
                   '(the first female member of the DGA, on "Christopher '
                   'Strong"), Anthony Mann ("Strangers in th'},
    {   'answer': "executive director of the New York State Governor's Office "
                  'for Motion Picture and Television Development',
        'context': 'man (born 1950) is the executive director of the New York '
                   "State Governor's Office for Motion Picture and Television "
                   'Development and the deputy commiss'},
    {   'answer': 'Shirley Temple',
        'context': 's and Tell is a 1945 American comedy film starring then '
                   '17-year-old Shirley Temple as Corliss Archer. In the film, '
                   'two teenage girls cause their respe'},
    {   'answer': 'Shirley Temple',
        'context': 'irected by Richard Wallace and written by Howard Dimsdale. '
                   'It stars Shirley Temple in her final starring role as well '
                   'as her final film appearance. It'},
    {   'answer': 'Woman of the Year',
        'context': 'creenwriter who shared an Academy Award with Ring Lardner '
                   'Jr. in 1942 for writing the Katharine Hepburn-Spencer '
                   'Tracy film comedy "Woman of the Year".'}]
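
One way to tell whether a miss like this comes from the retriever or from the reader is to compare the retrieved chunks against the example's `supporting_facts` and `answer`. The sketch below is only an illustration: `supporting_titles` and `retrieval_report` are hypothetical helpers, and they assume each retrieved chunk is available as a dictionary with `content` text and a `meta["title"]` field that was stored at indexing time.

```python
from typing import Dict, Iterable, List


def supporting_titles(example: Dict) -> List[str]:
    """Unique titles of the gold supporting paragraphs, in order of first appearance."""
    seen: List[str] = []
    for title, _sentence_idx in example["supporting_facts"]:
        if title not in seen:
            seen.append(title)
    return seen


def retrieval_report(example: Dict, retrieved: Iterable[Dict]) -> Dict:
    """Compare retrieved chunks with the gold titles and the gold answer string."""
    retrieved = list(retrieved)
    # Assumption: each chunk carries the title of the paragraph it came from.
    found_titles = {doc["meta"].get("title") for doc in retrieved}
    answer = example["answer"].lower()
    return {
        "missing_titles": [t for t in supporting_titles(example) if t not in found_titles],
        "answer_in_retrieved_text": any(
            answer in doc["content"].lower() for doc in retrieved
        ),
    }
```

If `answer_in_retrieved_text` is false, the reader never saw the answer and the problem lies in retrieval or indexing. In the example as pasted above, none of the ten context entries is titled 'Shirley Temple' or contains the string 'Chief of Protocol', so it may also be worth confirming that the relevant paragraph is in the document store at all.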

I am wondering whether I am doing something incorrectly on my end. I would like to use the MultiHopEmbeddingRetriever in my application, since multi-hop questions like this are asked frequently.

Limitations

I am limited to open-source models, so I cannot use OpenAI agents.

Additional Info

After writing the chunked documents into my FAISS document store, I have 66k documents in it, all from the HotpotQA dataset.

ZanSara · 2023-09-25T11:47:19Z

Hey @ss2342, MultiHopEmbeddingRetriever was an external contribution by @deutschmn: maybe he can help here?

In case this doesn't work and you can still consider using a generative model instead, PromptNode supports a variety of open-source models as well, so you're not limited to OpenAI for this. Have a look: https://docs.haystack.deepset.ai/docs/agent

ss2342 Sep 25, 2023
Author

Hi @ZanSara, thank you for the reply! I have tried following the Multi-Hop Questions with Agents tutorial, but I still run into issues with that.

Approach

I use almost exactly the same code as the tutorial, but swap out the OpenAI model for the default flan-t5-base.

from haystack.agents import Agent
from haystack.nodes import PromptNode

prompt_node = PromptNode(stop_words=["Observation:"])
agent = Agent(prompt_node=prompt_node)

However, I got incorrect answers to the tutorial question ("What year was the 1st president of the USA born?"). Would you recommend any particular models that are better suited to question answering and that I can use with the PromptNode?
