This repository contains the code and data for the paper ‘Quantifying Community Evolution in Developer Social Networks’.
Requirements
You need Python 3.9, R, and RStudio to run the code.
Important files
- The entry point of the Python code for community detection and index calculation is main_calculate_indexes.py
- Data are located in data/
- After main_calculate_indexes.py is executed, the results are recorded in an auto-generated subfolder of the project's result/ folder
- The R script RScript/productivity_analysis.Rmd is used to analyze the correlation between community evolution indices and team productivity
How to obtain the results
Concurrent Validity: Community evolution pattern detection
- Set the community-matching threshold used by the existing approach in global_settings.py, e.g., EVOLUTION_PATTERN_THRESHOLD = 0.3
- Run main_calculate_indexes.py
- Find the results in concurrent_validity.txt in the auto-generated subfolder under result/
Discriminant Validity: Spearman’s Correlation Coefficients between pairs of indices
- Run main_calculate_indexes.py (optional if you’ve already run the code for concurrent validity)
- Find the results in discrimant_validity.txt in the auto-generated subfolder under result/
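For readers who want to check a coefficient by hand, the following is a minimal sketch of a pairwise Spearman computation using only the standard library. It is not the repository's implementation; the function names and the idea of passing indices as a dictionary of columns are assumptions made for illustration.

```python
from itertools import combinations
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def pearson(xs, ys):
    """Pearson correlation; undefined (division by zero) for a constant column."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def spearman(xs, ys):
    """Spearman's rho is the Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(xs), average_ranks(ys))


def pairwise_spearman(columns):
    """Map each pair of column names (hypothetical index names) to its rho."""
    return {
        (a, b): spearman(columns[a], columns[b])
        for a, b in combinations(columns, 2)
    }
```

Calling pairwise_spearman with one list of values per index returns one coefficient per unordered pair, which can be compared against the entries in discrimant_validity.txt.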
Regression Analysis: Correlation with team productivity
- Run main_calculate_indexes.py (optional if you’ve already run the code for concurrent validity)
- Find the path of the file index_productivity.csv in the auto-generated subfolder under result/
  - For example: ../result/2022-03-13T10-07-10Z_interval_7_days_x_12/index_productivity.csv
- Modify line 22 of RScript/productivity_analysis.Rmd to specify the path of the data file.
  - For example: table_data<-read.table("../result/2022-03-13T10-07-10Z_interval_7_days_x_12/index_productivity.csv", head=T, sep=',', stringsAsFactors = FALSE)
- Run RScript/productivity_analysis.Rmd in RStudio to get the results.
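Locating the newest index_productivity.csv by hand can be tedious after several runs. The sketch below shows one way to find it; the helper name latest_result_file is hypothetical, and it assumes that every auto-generated subfolder name begins with a timestamp in the format of the example above, so that sorting by name also sorts by time.

```python
from pathlib import Path


def latest_result_file(result_dir, filename="index_productivity.csv"):
    """Return the path of `filename` in the most recent result subfolder, or None."""
    result_dir = Path(result_dir)
    if not result_dir.is_dir():
        return None
    candidates = [
        sub / filename
        for sub in result_dir.iterdir()
        if sub.is_dir() and (sub / filename).is_file()
    ]
    # Assumption: subfolder names start with a timestamp such as
    # 2022-03-13T10-07-10Z, so lexicographic order matches chronological order.
    return max(candidates, key=lambda p: p.parent.name, default=None)


if __name__ == "__main__":
    path = latest_result_file("result")
    # Print a forward-slash path that can be pasted into the read.table call.
    print(path.as_posix() if path else "no result found")
```

Run it from the project root; because the R script lives in RScript/, prefix the printed path with ../ before pasting it into productivity_analysis.Rmd.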
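Deployment scripts
For each main_xxx.py (so far only main_calculate_indexes.py has been implemented), two deployment scripts need to be added:
- ./deploy_x_main_xxx.sh: copy ./deploy_0_main_calculate_indexes.sh, rename the copy, and change its TASK_SCRIPT path to the path of the server_scripts/run_xxx.sh file described below
- server_scripts/run_xxx.sh: copy server_scripts/run_main_calculate_indexes.sh, rename the copy, and change the .py file name inside it
After adding these scripts, open the project directory in a local Linux shell (on Windows, installing WSL2 is recommended) and run the following commands:
dos2unix ./deploy_x_main_xxx.sh # only needs to be run once
./deploy_x_main_xxx.sh
The script prompts for a password three times. It cleans the local code, uploads it to the server 172.27.135.32 as user wangliang, and runs it automatically. The following messages can be ignored:
bash: cannot set terminal process group (1469755): Inappropriate ioctl for device
bash: no job control in this shell
The uploaded code is stored on the server in ~/workspace/deploy/oss_community_evolution_indexes/, which keeps a copy of each upload, distinguished by timestamp.
The results are saved on the server in ~/workspace/data/oss_community_evolution_indexes/result/, also distinguished by timestamp.
After logging in to the server via ssh, use the htop command to check whether the program has started (that is, whether a python3 process owned by user wangliang is running), and run tail output.log in ~/workspace/deploy/oss_community_evolution_indexes/ to view the program's output.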