CMU-CS-22-149
Computer Science Department
School of Computer Science, Carnegie Mellon University

The Use of Explicit Question Decompositions
Neha Nishikant
M.S. Thesis
December 2022
Open-domain question answering is an NLP problem in which a system learns to answer questions based on a large corpus of documents. Commonly, a retriever model first retrieves relevant documents, and a reader model then extracts the correct answer. We specifically seek to improve retrieval for "multihop questions", which are questions that can be decomposed into multiple subquestions, making them more complex and realistic. We explore whether using gold-standard annotated explicit question decompositions in the retriever model can improve results. We define a model, MEX, along with MEX-oracle, and perform experiments on a multihop QA dataset. Our results show that explicit decompositions are useful for sparse retrieval models, but not for dense retrieval.
36 pages
Thesis Committee:
Srinivasan Seshan, Head, Computer Science Department