Instructions for use are given in the header of each of the six scripts.

Script "goMine.py":
- extracts metadata from the GitHub API;
- takes a list of project references as input (a CSV file in which each line lists the GitHub repository references affiliated with one project);
- produces for each project a JSON file with references to all branches;
- produces for each project a JSON file with all commits of all branches.

Script "goCreateGraphs.py":
- takes as input the JSON files produced by goMine.py, which contain all commit information for each project;
- creates for each project the following graphs in GraphML:
  * a commit graph (as seen in Insights/Network in GitHub),
  * contributor graphs (where each node is a contributor and each edge means that two contributors edited the same file), filtered per file type,
  * graphs of all committed file changes (one subgraph per file), filtered per file type.

Script "analysisActivityVolume.py":
- computes indicators related to activity volume (number of file changes over time and per project);
- takes as input the graphs of file changes produced by goCreateGraphs.py.

Script "analysisActivityDistribution.py":
- computes indicators related to activity distribution;
- takes as input the contributor graphs produced by goCreateGraphs.py;
- produces a CSV file with the computed indicators for all considered projects.

Script "clustering.py":
- applies k-means clustering to the topological indicators computed on the contributor graphs;
- takes as input the list of topological indicators produced by analysisActivityDistribution.py.

Script "timeStop.py" is a small utility that adds timestamps to traces.
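
To illustrate the contributor graphs described for goCreateGraphs.py (one node per contributor, one edge when two contributors edited the same file, filtered per file type), here is a minimal sketch. It is not the code of goCreateGraphs.py: the function name build_contributor_graph, the "author" and "files" keys, and the sample data are assumptions made for illustration, and the JSON layout actually written by goMine.py may differ. The sketch only builds the edge list in memory and does not write GraphML.

```python
from collections import defaultdict
from itertools import combinations


def build_contributor_graph(commits, extension=None):
    """Return {(contributor_a, contributor_b): number_of_shared_files}.

    `commits` is a hypothetical list of dicts with "author" and "files"
    keys; the real JSON produced by goMine.py may be structured differently.
    If `extension` is given, only files with that suffix are considered.
    """
    # Collect the set of contributors who touched each file.
    editors_by_file = defaultdict(set)
    for commit in commits:
        for path in commit["files"]:
            if extension is None or path.endswith(extension):
                editors_by_file[path].add(commit["author"])

    # Link every pair of contributors who edited the same file.
    edges = defaultdict(int)
    for editors in editors_by_file.values():
        for a, b in combinations(sorted(editors), 2):
            edges[(a, b)] += 1
    return dict(edges)


# Made-up sample data, for illustration only.
sample_commits = [
    {"author": "alice", "files": ["main.py", "README.md"]},
    {"author": "bob", "files": ["main.py"]},
    {"author": "carol", "files": ["README.md", "utils.py"]},
    {"author": "bob", "files": ["utils.py"]},
]

all_files_graph = build_contributor_graph(sample_commits)
python_only_graph = build_contributor_graph(sample_commits, extension=".py")
print(all_files_graph)
print(python_only_graph)
```

In this sketch the edge weight counts the files shared by two contributors; whether the actual graphs carry such a weight is not stated above, so treat it as one possible design choice.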