CMU-S3D-26-102
Software and Societal Systems Department
School of Computer Science, Carnegie Mellon University



Facilitating Collaboration in Building Machine Learning Products

Nadia Nahar

April 2026

Ph.D. Thesis
Software Engineering



Keywords: Software Engineering for Machine Learning, Machine Learning, Large Language Models, Collaboration, Responsible AI

In this dissertation, I investigate the collaboration challenges between software engineers and data scientists in building machine learning (ML) products, and identify and propose interventions to facilitate their collaboration by bridging the identified knowledge boundaries.

Despite significant advances in ML algorithms and model development, integrating ML models into operational products remains difficult, and collaboration issues are frequently cited as a major obstacle. I identify challenges in integrating ML into software products by synthesizing the collective knowledge of the field through a qualitative meta-summary of the academic literature, and then examine collaboration challenges, an aspect largely underexplored in prior work, through a qualitative interview study with industry practitioners. I further investigate the challenges of integrating large language models into software products. I then demonstrate principles for addressing these collaboration challenges through three studies: (a) a mixed-method study examining emerging practices for evaluating large language model-generated content in software applications, (b) a set of interventions (i.e., policy guidance, role-playing chatbots, and educational support) to guide the development of explainable AI and improve transparency and interpretability of ML models in products, and (c) an intervention designed to engage practitioners in responsible AI practices, fostering ethical awareness and compliance. These interventions are designed to address the syntactic, semantic, and pragmatic knowledge boundaries that hinder effective teamwork in ML product development.

By systematically identifying and addressing collaboration challenges among practitioners, this dissertation aims to support the successful development and deployment of ML products in real-world settings.

245 pages

Thesis Committee:
Christian Kästner (Chair)
James D. Herbsleb
Claire Le Goues
Kenneth Holstein
Samir Passi (Microsoft)

Nicolas Christin, Head, Software and Societal Systems Department
Martial Hebert, Dean, School of Computer Science

