
Cross_Community_Study

This repository complements our recent work, the “Open Source Oriented Cross-community Survey”. It collects the papers covered by the survey along with related resources such as datasets.

Search String

Specific search strings employed for each database:

| Search Database | Search Strings | Number of Papers |
| --- | --- | --- |
| ACM | (Title: (("open source" OR "oss")) OR Abstract: (("open source" OR "oss"))) AND (Title: (("across communities" OR "across platforms" OR "across networks" OR "across systems" OR "cross community" OR "cross platform" OR "cross network" OR "cross system" OR "multi community" OR "multi platform" OR "multi network" OR "multi system" OR "multiple communities" OR "multiple platforms" OR "multiple networks" OR "multiple systems")) OR Abstract: (("across communities" OR "across platforms" OR "across networks" OR "across systems" OR "cross community" OR "cross platform" OR "cross network" OR "cross system" OR "multi community" OR "multi platform" OR "multi network" OR "multi system" OR "multiple communities" OR "multiple platforms" OR "multiple networks" OR "multiple systems"))) | 111 |
| Springer | ("open source" OR "oss") AND ("across communities" OR "across platforms" OR "across networks" OR "across systems" OR "cross community" OR "cross platform" OR "cross network" OR "cross system" OR "multi community" OR "multi platform" OR "multi network" OR "multi system" OR "multiple communities" OR "multiple platforms" OR "multiple networks" OR "multiple systems") | 1209 |
| IEEE Xplore | (("Document Title":"open source" OR "Document Title":"oss") OR ("Abstract":"open source" OR "Abstract":"oss")) AND ("Document Title":"across communities" OR "Document Title":"across platforms" OR "Document Title":"across networks" OR "Document Title":"across systems" OR "Document Title":"cross community" OR "Document Title":"cross platform" OR "Document Title":"cross network" OR "Document Title":"cross system" OR "Document Title":"multi community" OR "Document Title":"multi platform" OR "Document Title":"multi network" OR "Document Title":"multi system" OR "Document Title":"multiple communities" OR "Document Title":"multiple platforms" OR "Document Title":"multiple networks" OR "Document Title":"multiple systems" OR "Abstract":"across communities" OR "Abstract":"across platforms" OR "Abstract":"across networks" OR "Abstract":"across systems" OR "Abstract":"cross community" OR "Abstract":"cross platform" OR "Abstract":"cross network" OR "Abstract":"cross system" OR "Abstract":"multi community" OR "Abstract":"multi platform" OR "Abstract":"multi network" OR "Abstract":"multi system" OR "Abstract":"multiple communities" OR "Abstract":"multiple platforms" OR "Abstract":"multiple networks" OR "Abstract":"multiple systems") | 222 |
| Scopus | (TITLE-ABS-KEY ((("open source" OR "oss") AND ("across communities" OR "across platforms" OR "across networks" OR "across systems" OR "cross community" OR "cross platform" OR "cross network" OR "cross system" OR "multi community" OR "multi platform" OR "multi network" OR "multi system" OR "multiple communities" OR "multiple platforms" OR "multiple networks" OR "multiple systems")))) AND PUBYEAR > 2012 AND PUBYEAR < 2025 AND (LIMIT-TO (DOCTYPE, "cp") OR LIMIT-TO (DOCTYPE, "ar") OR LIMIT-TO (DOCTYPE, "re") OR LIMIT-TO (DOCTYPE, "cr")) AND (LIMIT-TO (SUBJAREA, "COMP")) AND (LIMIT-TO (LANGUAGE, "English")) | 714 |
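
All four search strings combine the same two keyword groups: two open source terms and sixteen cross-community terms. The following is a minimal sketch of how the shared core (the form used for Springer) could be regenerated from those groups. The names `OSS_TERMS`, `SCOPE_TERMS`, `or_group`, and `build_query` are hypothetical helpers introduced here for illustration and are not part of the repository; the database-specific field wrappers (ACM, IEEE Xplore, Scopus) are not reproduced.

```python
# Keyword groups taken from the search strings listed above.
OSS_TERMS = ["open source", "oss"]

PLURAL_NOUNS = ["communities", "platforms", "networks", "systems"]
SINGULAR_NOUNS = ["community", "platform", "network", "system"]

# "across" and "multiple" take plural nouns; "cross" and "multi" take singular ones.
SCOPE_TERMS = [
    f"{prefix} {noun}"
    for prefix, nouns in [
        ("across", PLURAL_NOUNS),
        ("cross", SINGULAR_NOUNS),
        ("multi", SINGULAR_NOUNS),
        ("multiple", PLURAL_NOUNS),
    ]
    for noun in nouns
]


def or_group(terms):
    """Quote each phrase and join the phrases with OR inside parentheses."""
    return "(" + " OR ".join(f'"{term}"' for term in terms) + ")"


def build_query():
    """Return the generic boolean query: OSS terms AND cross-community terms."""
    return f"{or_group(OSS_TERMS)} AND {or_group(SCOPE_TERMS)}"


if __name__ == "__main__":
    print(build_query())
```

Keeping the terms in lists makes it easier to apply the same vocabulary consistently when a query has to be adapted to another database's syntax.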

Papers

Related studies by searching phase:

Initial Search:

  • Chenxi Song, Tao Wang, Gang Yin, Xunhui Zhang, and Cheng Yang. A novel open source software ecosystem: From a graphic point of view and its application. In SEKE, pages 71–74, 2016.
  • Xiaotao Song, Jiafei Yan, Yuexin Huang, Hailong Sun, and Hongyu Zhang. A collaboration-aware approach to profiling developer expertise with cross-community data. In 2022 IEEE 22nd International Conference on Software Quality, Reliability and Security (QRS), pages 344–355, 2022.
  • Hao Huang, Yao Lu, and Xinjun Mao. Gathering github oss requirements from q&a community: An empirical study. In 2020 25th International Conference on Engineering of Complex Computer Systems (ICECCS), 2020.
  • Hongbo Fang, Bogdan Vasilescu, and James Herbsleb. Understanding information diffusion about open-source projects on twitter, hackernews, and reddit. In 2023 IEEE/ACM 16th International Conference on Cooperative and Human Aspects of Software Engineering (CHASE), pages 56–67. IEEE, 2023.

Snowball-1:

  • Hongbo Fang, Hemank Lamba, James Herbsleb, and Bogdan Vasilescu. “this is damn slick!”: Estimating the impact of tweets on open source project popularity and new contributors. In ICSE ’22: Proceedings of the 44th International Conference on Software Engineering, 2022.
  • W. Huang, W. Mo, B. Shen, Y. Yang, and N. Li. Cpdscorer: Modeling and evaluating developer programming ability across software communities. In SEKE, pages 87–92, 2016.
  • Ali Sajedi Badashian, Abram Hindle, and Eleni Stroulia. Crowdsourced bug triaging: Leveraging q&a platforms for bug assignment. In Fundamental Approaches to Software Engineering: 19th International Conference, FASE 2016, Held as Part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2016, Eindhoven, The Netherlands, April 2–8, 2016, Proceedings 19, pages 231–248. Springer, 2016.
  • Yunxiang Xiong, Zhangyuan Meng, Beijun Shen, and Wei Yin. Developer identity linkage and behavior mining across github and stackoverflow. International Journal of Software Engineering and Knowledge Engineering, 27(09n10):1409–1425, 2017.
  • Xunhui Zhang, Tao Wang, Gang Yin, Cheng Yang, Yue Yu, and Huaimin Wang. Devrec: A developer recommendation system for open source repositories. In International Conference on Software Reuse, pages 3–11. Springer, 2017.
  • Roy Ka-Wei Lee and David Lo. Github and stack overflow: Analyzing developer interests across multiple social collaborative platforms. In Social Informatics: 9th International Conference, SocInfo 2017, Oxford, UK, September 13-15, 2017, Proceedings, Part II 9, pages 245–256. Springer, 2017.
  • Hudson Silva Borges and Marco Tulio Valente. How do developers promote open source projects?
  • Giuseppe Silvestri, Jie Yang, Alessandro Bozzon, Andrea Tagarelli, et al. Linking accounts across social networks: the case of stackoverflow, github and twitter. In KDWeb, pages 41–52, 2015.
  • Hongbo Fang, Daniel Klug, Hemank Lamba, James Herbsleb, and Bogdan Vasilescu. Need for tweet: How open source developers talk about their github work on twitter. In MSR ’20: Proceedings of the 17th International Conference on Mining Software Repositories, 2020.
  • Jiafei Yan, Hailong Sun, Xu Wang, Xudong Liu, and Xiaotao Song. Profiling developer expertise across software communities with heterogeneous information network analysis. In Proceedings of the 10th Asia-Pacific Symposium on Internetware, Internetware ’18, New York, NY, USA, 2018. Association for Computing Machinery.
  • Yao Wan, Liang Chen, Guandong Xu, Zhou Zhao, Jie Tang, and Jian Wu. Scsminer: mining social coding sites for software developer recommendation with relevance propagation. World Wide Web, 21:1523–1543, 2018.
  • Leif Singer, Fernando Figueira Filho, and Margaret-Anne Storey. Software engineering at the speed of light: how developers stay current using twitter. In Proceedings of the 36th International Conference on Software Engineering, pages 211–221, 2014.
  • Bogdan Vasilescu, Vladimir Filkov, and Alexander Serebrenik. Stackoverflow and github: Associations between software development and crowdsourced knowledge. In 2013 International Conference on Social Computing, pages 188–195. IEEE, 2013.
  • Wenkai Mo, Beijun Shen, Yuting Chen, and Jiangang Zhu. Tbil: A tagging-based approach to identity linkage across software communities. In 2015 Asia-Pacific Software Engineering Conference (APSEC), pages 56–63. IEEE, 2015.

Snowball-2:

  • Takahiro Komamizu, Yasuhiro Hayase, Toshiyuki Amagasa, and Hiroyuki Kitagawa. Exploring identical users on github and stack overflow. In SEKE, pages 584–589, 2017.
  • Saraj Singh Manes and Olga Baysal. How often and what stackoverflow posts do developers reference in their github projects? In 2019 IEEE/ACM 16th International Conference on Mining Software Repositories (MSR), pages 235–239. IEEE, 2019.
  • Yujie Fan, Yiming Zhang, Shifu Hou, Lingwei Chen, Yanfang Ye, Chuan Shi, Liang Zhao, and Shouhuai Xu. idev: Enhancing social coding security by cross-platform user identification between github and stack overflow. In 28th International Joint Conference on Artificial Intelligence (IJCAI), 2019.
  • Ali Sajedi Badashian, Afsaneh Esteki, Ameneh Gholipour, Abram Hindle, and Eleni Stroulia. Involvement, contribution and influence in github and stack overflow. In CASCON, pages 19–33, 2014.
  • Di Yang, Pedro Martins, Vaibhav Saini, and Cristina Lopes. Stack overflow in github: any snippets there? In 2017 IEEE/ACM 14th International Conference on Mining Software Repositories (MSR), pages 280–290. IEEE, 2017.
  • Sri Lakshmi Vadlamani and Olga Baysal. Studying software developer expertise and contributions in stack overflow and github. In 2020 IEEE International Conference on Software Maintenance and Evolution (ICSME), pages 312–323. IEEE, 2020.
  • Saraj Singh Manes and Olga Baysal. Studying the change histories of stack overflow and github snippets. In 2021 IEEE/ACM 18th International Conference on Mining Software Repositories (MSR), pages 283–294, 2021.
  • Jungil Kim and Eunjoo Lee. Understanding the working habits of gh-so users on github commit activity and stack overflow post activity. International Journal of Software Engineering and Knowledge Engineering, 31(10):1399–1419, 2021.
  • Sebastian Baltes and Stephan Diehl. Usage and attribution of stack overflow code snippets in github projects. Empirical Software Engineering, 24(3):1259–1295, 2019.
  • Roy Ka-Wei Lee and David Lo. Wisdom in sum of parts: Multi-platform activity prediction in social collaborative sites. In Proceedings of the 10th ACM Conference on Web Science, pages 77–86, 2018.

Snowball-3:

  • Rahul Venkataramani, Atul Gupta, Allahbaksh Asadullah, Basavaraju Muddu, and Vasudev Bhat. Discovery of technical expertise from open source code repositories. In WWW ’13 Companion: Proceedings of the 22nd International Conference on World Wide Web, 2013.
  • Yuan Huang, Furen Xu, Haojie Zhou, Xiangping Chen, Xiaocong Zhou, and Tong Wang. Towards exploring the code reuse from stack overflow during software development. In Proceedings of the 30th IEEE/ACM International Conference on Program Comprehension, pages 548–559, 2022.

Additional papers:

  • Xiangping Chen, Furen Xu, Yuan Huang, Xiaocong Zhou, and Zibin Zheng. An empirical study of code reuse between github and stack overflow during software development. Journal of Systems and Software, 210:111964, 2024.
  • Syful Islam, Yusuf Sulistyo Nugroho, Chy Md Shahrear, Nuhash Wahed, Dedi Gunawan, Endang Wahyu Pamungkas, Mohammed Humayun Kabir, Yogiek Indra Kurniawan, and Md Kamal Uddin. An empirical study of software ecosystem related tweets by npm maintainers. PeerJ Computer Science, 10:e1669, 2024.
  • Aref Talebzadeh Bardsiri and Abbas Rasoolzadegan. Evaluating developers’ expertise in serverless functions by mining activities from multiple platforms. Computer and Knowledge Engineering, 2024.
  • Hanzhi Jiang, Lin Shi, Meiru Che, Yuxia Zhang, and Qing Wang. Bringing open source communication and development together: A cross-platform study on gitter and github. IEEE Transactions on Software Engineering, 2024.

Analysis of user characteristics.

  • Aref Talebzadeh Bardsiri and Abbas Rasoolzadegan. Evaluating developers’ expertise in serverless functions by mining activities from multiple platforms. Computer and Knowledge Engineering, 2024.
  • Xiaotao Song, Jiafei Yan, Yuexin Huang, Hailong Sun, and Hongyu Zhang. A collaboration-aware approach to profiling developer expertise with cross-community data. In 2022 IEEE 22nd International Conference on Software Quality, Reliability and Security (QRS), pages 344–355, 2022.
  • Jungil Kim and Eunjoo Lee. Understanding the working habits of gh-so users on github commit activity and stack overflow post activity. International Journal of Software Engineering and Knowledge Engineering, 31(10):1399–1419, 2021.
  • Hongbo Fang, Daniel Klug, Hemank Lamba, James Herbsleb, and Bogdan Vasilescu. Need for tweet: How open source developers talk about their github work on twitter. In MSR ’20: Proceedings of the 17th International Conference on Mining Software Repositories, 2020.
  • Sri Lakshmi Vadlamani and Olga Baysal. Studying software developer expertise and contributions in stack overflow and github. In 2020 IEEE International Conference on Software Maintenance and Evolution (ICSME), pages 312–323. IEEE, 2020.
  • Roy Ka-Wei Lee and David Lo. Wisdom in sum of parts: Multi-platform activity prediction in social collaborative sites. In Proceedings of the 10th ACM Conference on Web Science, pages 77–86, 2018.
  • Jiafei Yan, Hailong Sun, Xu Wang, Xudong Liu, and Xiaotao Song. Profiling developer expertise across software communities with heterogeneous information network analysis. In Proceedings of the 10th Asia-Pacific Symposium on Internetware, Internetware ’18, New York, NY, USA, 2018. Association for Computing Machinery.
  • Yao Wan, Liang Chen, Guandong Xu, Zhou Zhao, Jie Tang, and Jian Wu. Scsminer: mining social coding sites for software developer recommendation with relevance propagation. World Wide Web, 21:1523–1543, 2018.
  • Roy Ka-Wei Lee and David Lo. Github and stack overflow: Analyzing developer interests across multiple social collaborative platforms. In Social Informatics: 9th International Conference, SocInfo 2017, Oxford, UK, September 13-15, 2017, Proceedings, Part II 9, pages 245–256. Springer, 2017.
  • Yunxiang Xiong, Zhangyuan Meng, Beijun Shen, and Wei Yin. Developer identity linkage and behavior mining across github and stackoverflow. International Journal of Software Engineering and Knowledge Engineering, 27(09n10):1409–1425, 2017.
  • Xunhui Zhang, Tao Wang, Gang Yin, Cheng Yang, Yue Yu, and Huaimin Wang. Devrec: A developer recommendation system for open source repositories. In 16th International Conference on Software Reuse, ICSR 2017, 2017.
  • Ali Sajedi Badashian, Abram Hindle, and Eleni Stroulia. Crowdsourced bug triaging: Leveraging q&a platforms for bug assignment. In Fundamental Approaches to Software Engineering: 19th International Conference, FASE 2016, Held as Part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2016, Eindhoven, The Netherlands, April 2–8, 2016, Proceedings 19, pages 231–248. Springer, 2016.
  • Leif Singer, Fernando Figueira Filho, and Margaret-Anne Storey. Software engineering at the speed of light: how developers stay current using twitter. In Proceedings of the 36th International Conference on Software Engineering, pages 211–221, 2014.
  • Ali Sajedi Badashian, Afsaneh Esteki, Ameneh Gholipour, Abram Hindle, and Eleni Stroulia. Involvement, contribution and influence in github and stack overflow. In CASCON, pages 19–33, 2014.
  • Bogdan Vasilescu, Vladimir Filkov, and Alexander Serebrenik. Stackoverflow and github: Associations between software development and crowdsourced knowledge. In 2013 International Conference on Social Computing, pages 188–195. IEEE, 2013.

Analysis of code reuse.

  • Xiangping Chen, Furen Xu, Yuan Huang, Xiaocong Zhou, and Zibin Zheng. An empirical study of code reuse between github and stack overflow during software development. Journal of Systems and Software, 210:111964, 2024.
  • Yuan Huang, Furen Xu, Haojie Zhou, Xiangping Chen, Xiaocong Zhou, and Tong Wang. Towards exploring the code reuse from stack overflow during software development. In Proceedings of the 30th IEEE/ACM International Conference on Program Comprehension, pages 548–559, 2022.
  • Saraj Singh Manes and Olga Baysal. Studying the change histories of stack overflow and github snippets. In 2021 IEEE/ACM 18th International Conference on Mining Software Repositories (MSR), pages 283–294, 2021.
  • Saraj Singh Manes and Olga Baysal. How often and what stackoverflow posts do developers reference in their github projects? In 2019 IEEE/ACM 16th International Conference on Mining Software Repositories (MSR), pages 235–239. IEEE, 2019.
  • Sebastian Baltes and Stephan Diehl. Usage and attribution of stack overflow code snippets in github projects. Empirical Software Engineering, 24(3):1259–1295, 2019.
  • Di Yang, Pedro Martins, Vaibhav Saini, and Cristina Lopes. Stack overflow in github: any snippets there? In 2017 IEEE/ACM 14th International Conference on Mining Software Repositories (MSR), pages 280–290. IEEE, 2017.

Analysis of community interactions.

  • Hanzhi Jiang, Lin Shi, Meiru Che, Yuxia Zhang, and Qing Wang. Bringing open source communication and development together: A cross-platform study on gitter and github. IEEE Transactions on Software Engineering, 2024.
  • Syful Islam, Yusuf Sulistyo Nugroho, Chy Md Shahrear, Nuhash Wahed, Dedi Gunawan, Endang Wahyu Pamungkas, Mohammed Humayun Kabir, Yogiek Indra Kurniawan, and Md Kamal Uddin. An empirical study of software ecosystem related tweets by npm maintainers. PeerJ Computer Science, 10:e1669, 2024.
  • Hongbo Fang, Bogdan Vasilescu, and James Herbsleb. Understanding information diffusion about open-source projects on twitter, hackernews, and reddit. In 2023 IEEE/ACM 16th International Conference on Cooperative and Human Aspects of Software Engineering (CHASE), pages 56–67. IEEE, 2023.
  • Hongbo Fang, Hemank Lamba, James Herbsleb, and Bogdan Vasilescu. “this is damn slick!”: Estimating the impact of tweets on open source project popularity and new contributors. In ICSE ’22: Proceedings of the 44th International Conference on Software Engineering, 2022.
  • Hao Huang, Yao Lu, and Xinjun Mao. Gathering github oss requirements from q&a community: An empirical study. In 2020 25th International Conference on Engineering of Complex Computer Systems (ICECCS), 2020.
  • Chenxi Song, Tao Wang, Gang Yin, Xunhui Zhang, and Cheng Yang. A novel open source software ecosystem: From a graphic point of view and its application. In SEKE, pages 71–74, 2016.
  • Bogdan Vasilescu, Alexander Serebrenik, Prem Devanbu, and Vladimir Filkov. How social q&a sites are changing knowledge sharing in open source software communities. In Proceedings of the 17th ACM conference on Computer supported cooperative work & social computing, pages 342–354, 2014.
  • Hudson Silva Borges and Marco Tulio Valente. How do developers promote open source projects?

Designed experiments

Dataset

| Dataset Name/Related Study | Dataset Description | Access Link |
| --- | --- | --- |
| GHTorrent | Gathered more than 900GB of raw data and 10GB of metadata, encompassing millions of events, commits, and other entities. | GitHub Mirror |
| SOTorrent | 38.4 million Stack Overflow posts, extracted 11 million URLs, and identified 5.81 million post links in 430,521 GitHub projects. | Zenodo - SOTorrent (Note: Multiple dataset versions for different periods are available on Zenodo.) |
| Stack Overflow Data Dump | Archived Stack Overflow content, including posts, polls, tags, badges, etc. (updated every quarter). | Internet Archive - Stack Exchange |
| GH Archive | Provided GitHub activities such as coding, documentation, and other contributions (updated every hour). | GH Archive |
| Fang-tweet-impact | 15,975 original tweets, 28,569 retweets, 2,370 GitHub projects (2018-11-01 to 2019-04-30). | Zenodo DOI: 10.5281/zenodo.6321448 |
| Fang-2020-Need-for-Tweet | 70,427 GitHub-Twitter user pairs. | Zenodo DOI: 10.5281/zenodo.3711629 |
| Manes-code-snippets | 22,900 projects, 33,765 SO references mapped to 4,634 SO posts, 73,322 commits. | GitHub - GHCodeSnippetHistory |
| badashian2014involvement | 255,375 GitHub-Stack Overflow user pairs (2008-09-01 to 2013-08-31). | University of Alberta - Merged Dataset |
| Zhang-devrec | 136 popular projects, 99 unpopular projects (2014-09-14 to 2016). | Trustie - Statistics |
| chen2024empirical | 793 Java projects with a total of 342,148 modified code snippets and 1,355,617 Stack Overflow posts. | Code Reuse Analysis |
| islam2024empirical | 14,330 GitHub-Twitter npm maintainers and 39,425 tweets. | Zenodo Record |

  • Fang-tweet-impact: Hongbo Fang, Hemank Lamba, James Herbsleb, and Bogdan Vasilescu. “this is damn slick!”: Estimating the impact of tweets on open source project popularity and new contributors. In ICSE ’22: Proceedings of the 44th International Conference on Software Engineering, 2022.
  • Fang-2020-Need-for-Tweet: Hongbo Fang, Daniel Klug, Hemank Lamba, James Herbsleb, and Bogdan Vasilescu. Need for tweet: How open source developers talk about their github work on twitter. In MSR ’20: Proceedings of the 17th International Conference on Mining Software Repositories, 2020.
  • Manes-code-snippets: Saraj Singh Manes and Olga Baysal. Studying the change histories of stack overflow and github snippets. In 2021 IEEE/ACM 18th International Conference on Mining Software Repositories (MSR), pages 283–294, 2021.
  • badashian2014involvement: Ali Sajedi Badashian, Afsaneh Esteki, Ameneh Gholipour, Abram Hindle, and Eleni Stroulia. Involvement, contribution and influence in github and stack overflow. In CASCON, pages 19–33, 2014.
  • Zhang-devrec: Xunhui Zhang, Tao Wang, Gang Yin, Cheng Yang, Yue Yu, and Huaimin Wang. Devrec: A developer recommendation system for open source repositories. In International Conference on Software Reuse, pages 3–11. Springer, 2017.
  • chen2024empirical: Xiangping Chen, Furen Xu, Yuan Huang, Xiaocong Zhou, and Zibin Zheng. An empirical study of code reuse between github and stack overflow during software development. Journal of Systems and Software, 210:111964, 2024.
  • islam2024empirical: Syful Islam, Yusuf Sulistyo Nugroho, Chy Md Shahrear, Nuhash Wahed, Dedi Gunawan, Endang Wahyu Pamungkas, Mohammed Humayun Kabir, Yogiek Indra Kurniawan, and Md Kamal Uddin. An empirical study of software ecosystem related tweets by npm maintainers. PeerJ Computer Science, 10:e1669, 2024.
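
Two of the datasets above (Fang-tweet-impact and Fang-2020-Need-for-Tweet) are identified only by a Zenodo DOI. As a minimal sketch, a DOI can be turned into a resolvable link by prefixing it with the standard `https://doi.org/` resolver. The names `ZENODO_DOIS` and `doi_to_url` are hypothetical helpers introduced for illustration; the sketch only builds the URLs and does not check that the records are still online.

```python
# DOIs copied from the dataset table; the mapping name is illustrative.
ZENODO_DOIS = {
    "Fang-tweet-impact": "10.5281/zenodo.6321448",
    "Fang-2020-Need-for-Tweet": "10.5281/zenodo.3711629",
}


def doi_to_url(doi: str) -> str:
    """Return a resolver URL for a DOI, tolerating a leading 'DOI:' label."""
    doi = doi.strip()
    if doi.lower().startswith("doi:"):
        # Drop the label as it is written in the table ("DOI: 10.5281/...").
        doi = doi[4:].strip()
    return f"https://doi.org/{doi}"


if __name__ == "__main__":
    for name, doi in ZENODO_DOIS.items():
        print(f"{name}: {doi_to_url(doi)}")
```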

Cites
