Abstract
Open-source software has become ubiquitous, yet its sustainability is in question: without a constant supply of contributor effort, open-source projects are at risk. Although prior work has extensively studied the motivations of open-source contributors in general, relatively little is known about how people choose which project to contribute to, beyond personal interest. This question is especially relevant in transparent social coding environments like GitHub, where visible cues on personal profile and repository pages, called signals, are known to affect impression formation and decision making. In this paper, we report on a mixed-methods empirical study of the signals that influence contributors' decisions to join a GitHub project. We first interviewed 15 GitHub contributors about their project evaluation processes and identified the important signals they used, including the structure of the README and the amount of recent activity. We then quantitatively tested the impact of each signal using data from 9,977 GitHub projects. We reveal that many important pieces of information lack easily observable signals, and that some signals may be both attractive and unattractive. Our findings have direct implications for open-source maintainers and for the design of social coding environments, e.g., features that could be added to provide a better project search experience.
Details
Published
Nov 07, 2019
Vol/Issue
3(CSCW)
Pages
1-29
Cite This Article
Huilian Sophie Qiu, Yucen Lily Li, Susmita Padala, et al. (2019). The Signals that Potential Contributors Look for When Choosing Open-source Projects. Proceedings of the ACM on Human-Computer Interaction, 3(CSCW), 1-29. https://doi.org/10.1145/3359224