CMU-CS-22-149
Computer Science Department
School of Computer Science, Carnegie Mellon University




The Use of Explicit Question Decompositions
for Multihop Question Answering Retrieval

Neha Nishikant

M.S. Thesis

December 2022

CMU-CS-22-149.pdf


Keywords: Natural language processing, multihop question answering, information retrieval, dense retrieval

Open-domain question answering is a problem in NLP where a system learns to answer questions based on a large corpus of documents. Commonly, a retriever model first retrieves relevant documents, and a reader model then extracts the correct answer. We specifically seek to improve retrieval for "multihop questions", which are questions that can be decomposed into multiple subquestions, making them more complex and realistic. We explore whether using gold-standard annotated explicit question decompositions in the retriever model can improve results. We define the models MEX and MEX-oracle and perform experiments on a multihop QA dataset. Our results show that explicit decompositions are useful for sparse retrieval models, but not for dense retrieval.

36 pages

Thesis Committee:
Eric Nyberg (Chair)
Graham Neubig

Srinivasan Seshan, Head, Computer Science Department
Martial Hebert, Dean, School of Computer Science

