CMU-CS-25-132 Computer Science Department School of Computer Science, Carnegie Mellon University
Decomposing Complexity: An LLM-Based Approac
Zhijie Xu
M.S. Thesis
August 2025
Task decomposition in software engineering enables the division of complex engineering tasks into manageable components, facilitating modularization and collaborative development. However, supporting newcomer onboarding in open source projects remains challenging, because complex issues often assume substantial domain knowledge, which prevents newcomers from making meaningful contributions. While maintainers understand the value of providing entry-level tasks, manually creating approachable entry points competes with other development demands. In this work, we investigate task decomposition as a foundation for human augmentation. We create and analyze a dataset of decomposition patterns across ten Apache projects, which reveals that experienced developers naturally break down complex tasks following three distinct patterns. Building on these insights, we integrate a decomposition component into SWE-agent to validate that structured task breakdown creates genuine problem-solving value. Our system achieves a 24% performance improvement over the non-decomposed baseline on the SWE-bench Verified dataset. While the evaluation focused on AI agents rather than human contributors, this technical validation provides necessary evidence that decomposition creates structural value. This research reframes newcomer onboarding from "finding newcomer-oriented tasks" to "creating navigable pathways into meaningful work", establishing the foundation for validating decomposition benefits through human studies with real newcomers in live open source projects.

46 pages
Thesis Committee:
Srinivasan Seshan, Head, Computer Science Department