• What is the Gist? Understanding the Use of Public Gists on GitHub

    Weiliang Wang, Germán Poo-Caamaño, Evan Wilde, Daniel German

    MSR 2015, 10.1109/MSR.2015.36

    Abstract

    GitHub is a popular source code hosting site which serves as a collaborative coding platform. The many features of GitHub have greatly facilitated developers’ collaboration, communication, and coordination. Gists are one feature of GitHub, which defines them as “a simple way to share snippets and pastes with others.” This three-part study explores how users are using Gists. The first part is a quantitative analysis of Gist metadata and contents. The second part investigates the information contained in a Gist: we sampled 750k users and their Gists (totalling 762k Gists), then manually categorized the contents of 398 of them. The third part of the study investigates what users say Gists are for by reading the contents of web pages and Twitter feeds. The results indicate that Gists are used by a small portion of GitHub users, and those who use them typically have only a few. We found that Gists are usually small and composed of a single file. However, Gists serve a wide variety of uses, from saving snippets of code to creating reusable components for web pages.

  • Merge-tree: Visualizing the Integration of Commits into Linux

    Evan Wilde, Daniel German

    VISSOFT 2016, 10.1109/VISSOFT.2016.18

    Abstract

    With an average of more than 900 top-level merges into the Linux kernel per release, many containing hundreds of commits and some containing thousands, maintenance of older versions of the kernel becomes nearly impossible. Various commercial products, such as the Android platform, run older versions of the kernel. Due to security, performance, and changing hardware needs, maintainers must understand what changes (commits) have been added to the current version of the kernel since the last time they inspected it in order to make the necessary patches. Current tools provide information about repositories through the directed acyclic graph (DAG) of the repository, which is helpful for smaller projects. However, with the scale and number of branches in the kernel, the DAG becomes overwhelming very quickly. Furthermore, the DAG contains every ancestor of every commit, while maintainers are more interested in how and when a commit arrives in the official Linux repository. In this paper, we propose the merge-tree, a simplified transformation of the DAG of the Linux git repository that shows the way in which commits are merged into the master branch of Linux. Using the merge-tree, we build Linvis, a tool that is designed to allow users to explore how commits are merged into the Linux kernel.

  • Merge-tree: Visualizing the Integration of Commits into Linux

    Evan Wilde, Daniel German

    Journal of Software: Evolution and Process, Volume 30, Issue 2 (Special Issue on Software Visualization), 10.1002/smr.1936

    Abstract

    With an average of more than 900 merges into the Linux kernel per release, many containing hundreds of commits and some containing thousands, maintenance of older versions of the kernel becomes nearly impossible. Various commercial products, such as the Android platform, run older versions of the kernel; due to security, performance, and changing hardware needs, maintainers must understand what changes (commits) have been added to the current version of the kernel since the last time they inspected it in order to make the necessary patches. Current tools provide information about repositories through the directed acyclic graph (DAG) of the repository, which is helpful for smaller projects. However, with the scale and number of branches in the kernel, the DAG becomes overwhelming very quickly. Furthermore, the DAG contains every ancestor of every commit, while maintainers are more interested in how and when a commit arrives in the official Linux repository. This paper makes three contributions: a conversion from the DAG to the Merge-Tree, an implementation of a tool built on the Merge-Tree model, and a user study to evaluate and validate the implementation and model.

  • Merge-tree: Visualizing the Integration of Commits into Linux

    Evan Wilde

    Master's Thesis, 2018

    Abstract

    Version control systems are an asset to software development, enabling developers to keep snapshots of the code as they work. Stored in the version control system is the entire history of the software project, rich in information about who is contributing to the project, when contributions are made, and to what part of the project they are being made. Presented in the right way, this information can be invaluable in helping software developers continue the development of the project, and in helping maintainers understand how the changes to the current version can be applied to older versions of projects. Maintainers are unable to effectively use the information stored within a software repository to assist with the maintenance of older versions of that software in highly collaborative projects. The Linux kernel repository is an example of such a project. This thesis focuses on improving visualizations of the Linux kernel repository, developing new visualizations that help answer questions about how commits are integrated into the project. Older versions of the kernel are used in a variety of systems where it is impractical to update to the current version of the kernel. Some of these applications include the controllers for spacecraft, the core of mobile phones, the operating system driving internet routers, and the firmware of Internet of Things (IoT) devices. As vulnerabilities are discovered in the kernel, they are patched in the current version. To ensure that older versions are also protected against the vulnerabilities, the patches applied to the current version of the kernel must be applied back to the older version. To do this, maintainers must be able to understand how the patch that fixed the vulnerability was integrated into the kernel so that they may apply it to the old version as well.
    This thesis makes four contributions: (1) a new tree-based model, the Merge-Tree, that abstracts the commits in the repository, (2) three visualizations that use this model, (3) a tool called Linvis that uses these visualizations, and (4) a user study that evaluates whether the tool is effective in helping users answer questions related to how commits are integrated into the Linux repository. The first contribution includes the new tree-based model, the algorithm that constructs the trees from the repository, and the evaluation of the results of the algorithm. The second contribution demonstrates some of the potential visualizations of the repository that are made possible by the model, and how these visualizations can be used depending on the structure of the tree. The third contribution is an application that applies the visualizations to the Linux kernel repository. The tool was able to help the participants of the study understand how commits were integrated into the Linux kernel repository. Additionally, the participants were able to summarize information about merges, including who made the most contributions and which files were altered the most, more quickly and accurately than with Gitk and the command-line tools.