Institute for Software Research
School of Computer Science, Carnegie Mellon University


Modeling Coordination and Productivity in
Open-Source GitHub Projects

Samridhi Shree Choudhary, Christopher Bogart,
Carolyn Penstein Rosé, James D. Herbsleb

June 2018


Keywords: Bursty Streams, Socio-Technical Congruence, Productivity, Activity Phases, Hidden Markov Model, Open-Source, GitHub

In open-source software development, coordination between globally separated developers is often structured in ways not immediately visible to them, such as implicit groupings of people working on similar code and related issues. This opacity persists despite the availability and accessibility of a large quantity of low-level project activity data on platforms like GitHub. This paper uses this low-level data to construct meaningful indicators that could offer improved transparency into how coordination is conducted in open source. Prior work has shown the value of Socio-Technical Congruence for evaluating the quality of coordination in commercial software systems. However, little work has successfully translated this analysis to the domain of open source, primarily because open-source projects partition work, assign tasks, and measure success in less formal and less consistent ways. We present a technique for distinguishing the active phases of coordination and define a measure of productivity for these projects. We then perform a quantitative analysis of the influence of congruence on productivity in these phases. The analysis demonstrates that the associations between our measure of productivity and the measures of congruence and other control variables are subtle but consistent with prior work in commercial software development. We also discuss some applications of our work.

23 pages
