CMU-ISR-21-103
Institute for Software Research
School of Computer Science, Carnegie Mellon University




External Factors in Sustainability of Open Source Software

Marat Valiev

February 2021

Ph.D. Thesis
Societal Computing

CMU-ISR-21-103.pdf


Keywords: Collaborative Software Development, Empirical Software Engineering, Open Source

Modern software development relies heavily on Open Source. It saves time and money but, like any other non-commercial software, it comes on an as-is basis. If not properly maintained, or if abandoned by its community, Open Source Software (OSS) might impose extra costs on downstream projects or introduce bugs into them. While software developers are well aware of these risks, existing techniques for mitigating them assume sustainability to be an intrinsic property of OSS, largely ignoring external factors. With plenty of examples of even high-profile projects failing because of bugs in upstream dependencies, funding issues, lost competition, or the departure of key developers, it is important to understand the effect of these factors on OSS sustainability.

Using a combination of quantitative and qualitative methods, this dissertation explores the effects of external factors on OSS sustainability and the mechanisms behind them, and proposes tools to make certain risk factors more visible. The findings indicate that multiple external factors, such as reused libraries, dependent projects, and organizational involvement, play a significant role in the sustainability of OSS projects. Projects serving end users and end programmers are particularly at risk of being overwhelmed by an excessive number of requests, especially questions. We found, however, that simple heuristics can provide additional insight into the structure of effort demand in OSS. For example, since established users of software mostly report bugs and new adopters mostly ask questions, we can estimate a project's lifecycle stage and user base structure using existing issue classification. Finally, this work shows that in many cases simple tools, such as autoencoder-based embeddings, can be used to detect less visible sustainability factors, such as competition and surrounding communities.

88 pages

Thesis Committee:
James Herbsleb (Co-Chair)
Bogdan Vasilescu (Co-Chair)
Audris Mockus (University of Tennessee)
Vladimir Filkov (University of California, Davis)

James D. Herbsleb, Director, Institute for Software Research
Martial Hebert, Dean, School of Computer Science

