Gail Murphy: Professor at Department of Computer Science, UBC Faculty of Science

Prospective Graduate Students / Postdocs

This faculty member is currently not looking for graduate students or Postdoctoral Fellows. Please do not contact the faculty member with any such requests.

Professor

Faculty of Science

Research Classification

Programming languages and software engineering

Research Interests

Software Development

knowledge worker productivity

software design

software engineering

software evolution

Affiliations to Research Centres, Institutes & Clusters

CAIDA: UBC ICICS Centre for Artificial Intelligence Decision-making and Action

Institute for Computing, Information and Cognitive Systems (ICICS)

Graduate Student Supervision

Doctoral Student Supervision

Dissertations completed in 2010 or later are listed below. Please note that there is a 6-12 month delay to add the latest dissertations.

Leveraging developer discussions to improve design accessibility (2022)

Since the inception of software engineering, the design of a software system has been recognized as one of its most important attributes. A software system’s design determines many of its properties, such as maintainability and performance. One might expect that there is a common and well-established understanding about what software design is and is not. Such an understanding is not evident in the literature, where design has been described in many ways such as large-scale architecture and low-level design patterns, to name just a few. At the same time, an understanding of design is also needed to maintain system properties as changes to the system are made. When developers lose track of the overall design, the system may not conform to its intended properties. Unfortunately, many systems do not have up-to-date design documentation and approaches to recover design often focus on how a system works by extracting structural and behavioural information rather than why it was designed to work like that. In this thesis, we propose an automated approach to extract design information from written discussions between developers. The aim is to make this information accessible to developers, helping them understand the design of a system and make better design decisions. First, we present an interview study we conducted to understand what researchers and practitioners consider as software design. These interviews revealed five recurring topics that can help inform what software design truly represents. We then introduce a classifier able to locate paragraphs in discussions, which we call design points, that pertain to design. Results show that this classifier is able to locate design information with high accuracy even in systems that it was not trained on. We describe a study conducted with software developers that shows that newcomers to a project, when provided with design points relevant to a programming task, are able to interpret and use the design information to consider additional design alternatives. We finally discuss an early exploration into the use of semantic frames to identify useful design points.

Supporting a developer's discovery of task-relevant information (2022)

The information that a developer seeks to aid in the completion of a task typically exists across different kinds of software artifacts that include substantial natural language text. For instance, artifacts vary from conversational discussions about bug reports to tutorial descriptions of features in a library. In the artifacts that a developer consults, only some portions of the text will be useful to a developer's task, and locating such portions can be time-consuming because the artifacts can include substantial text, organized in different ways, to peruse. For example, it might be easier to locate information in tutorial artifacts with structured headings, whereas artifacts consisting of developer conversations might need to be read in detail. Given the limited time developers have to spend on any task, researchers have attempted to aid developers by proposing a range of techniques to automate the identification of relevant text. However, this prior work is generally constrained to one or only a few types of artifacts. Artifact-specific approaches are difficult to deploy, and supporting the multitude of constantly evolving artifact types is challenging, if not impractical. In this dissertation, we propose a set of generalizable techniques to aid developers in locating a portion of text that might be useful for a task. These techniques are based on semantic patterns that arise from the empirical analysis of the text relevant to a task in multiple kinds of artifacts, leading us to propose techniques that incorporate the semantics of words and sentences to identify text likely relevant to a developer's task automatically. We evaluate the proposed techniques by assessing the extent to which they identify text that developers deem relevant in different kinds of artifacts associated with Android development tasks. We then investigate how a tool that embeds the most promising semantic-based technique might assist developers while they perform a task. Results show that semantic-based techniques perform equivalently well across multiple artifact types and that a tool that automates the provision of task-relevant text assists developers in effectively completing a software development task.

Summarizing Software Artifacts (2013)

To answer an information need while performing a software task, a software developer sometimes has to interact with a lot of software artifacts. This interaction may involve reading through large amounts of information and many details of artifacts to find relevant information. In this dissertation, we propose the use of automatically generated natural language summaries of software artifacts to help a software developer more efficiently interact with software artifacts while trying to answer an information need. We investigated summarization of bug reports as an example of natural language software artifacts, summarization of crosscutting code concerns as an example of structured software artifacts and multi-document summarization of project documents related to a code change as an example of multi-document summarization of software artifacts. We developed summarization techniques for all the above cases. For bug reports, we used an extractive approach based on an existing supervised summarization system for conversational data. For crosscutting code concerns, we developed an abstractive summarization approach. For multi-document summarization of project documents, we developed an extractive supervised summarization approach. To establish the effectiveness of generated summaries in assisting software developers, the summaries were extrinsically evaluated by conducting user studies. Summaries of bug reports were evaluated in the context of bug report duplicate detection tasks. Summaries of crosscutting code concerns were evaluated in the context of software code change tasks. Multi-document summaries of project documents were evaluated by investigating whether project experts find summaries to contain information describing the reason behind the corresponding code changes. The results show that reasonably accurate natural language summaries can be automatically produced for different types of software artifacts and that the generated summaries are effective in helping developers address their information needs.

Developer-Centric Models: Easing Access to Relevant Information in a Software Development Environment (2011)

During the development of a software system, large amounts of new information, such as source code, work items and documentation, are produced continuously. As a developer works, one of his major activities is to consult portions of this information pertinent to his work to answer the questions he has about the system and its development. Current development environments are centered around models of the artifacts used in development, rather than of the people who perform the work, making it difficult and sometimes infeasible for the developer to satisfy his information needs. We introduce two developer-centric models, the degree-of-knowledge (DOK) model and the information fragments model, which support developers in accessing the small portions of information needed to answer the questions they have. The degree-of-knowledge model computes automatically, for each source code element in the development environment, a real value that represents a developer's knowledge of that element based on a developer's authorship and interaction data. We present evidence that shows that both authorship and interaction information are important in characterizing a developer's knowledge of code. We report on the usage of our model in case studies on expert finding, knowledge transfer and identifying changes of interest. We show that our model improves upon an existing expertise finding approach and can accurately identify changes of which a developer should likely be aware. Finally, we discuss the robustness of the model across multiple development sites and teams. The information fragments model automates the composition of different kinds of information and allows developers to easily choose how to display the composed information. We show that the model supports answering 78 questions that involve the integration of information siloed by existing programming environments. We identified these questions from interviews with developers. We also describe how 18 professional developers were able to use a prototype tool based on our model to successfully and quickly answer 94% of eight of the 78 questions posed in a case study. The separation of composition and presentation supported by the model allowed the developers to answer the questions according to their personal preferences.

Master's Student Supervision

Theses completed in 2010 or later are listed below. Please note that there is a 6-12 month delay to add the latest theses.

Automatically associating resources with tasks based on a software developer’s activity (2023)

Developing and maintaining software is a complex process that consists of many different tasks and activities. Despite substantial research into how software developers work, there are few techniques to help track which resources, in particular which parts of source code, a developer needs to complete a task. In this thesis, we explore whether it is possible to associate automatically the resources a software developer works on as part of a task with the appropriate task assigned to a developer based on semantic similarity between the resource content and the description of a task. We explore a design space involving three similarity techniques: Term Frequency - Inverse Document Frequency (TF-IDF), Bidirectional Encoder Representations from Transformers (BERT), and word2vec. The design space also covers three ways of segmenting the work a developer performs: time intervals, a set number of interactions a developer undertakes, and a sliding time window. To explore this design space, we undertook three case studies on three developers from the same open source project, focusing on the effectiveness, measured in precision, with which different similarity and segmentation techniques are able to associate resources with tasks. Despite variation by developer, we found that TF-IDF combined with segmenting a developer’s activity by time results in the highest precision score, but with low recall. We found that BERT, combined with segmenting a developer’s activity by a set number of interactions, results in the best balance between precision and recall. Future research should explore how to personalize the right combination of similarity with a segmentation approach for a developer to best associate resources with a task being worked on by the developer.
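
The idea of matching a resource to a task by TF-IDF similarity can be illustrated with a minimal sketch. This is not the thesis's implementation: the tokenizer, the smoothed IDF formula, and the function names (`tokenize`, `tfidf_vectors`, `cosine`, `associate`) are assumptions chosen for illustration, and the thesis's segmentation of developer activity is not modelled here.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a text and split it into alphanumeric tokens (assumed tokenizer)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_vectors(documents):
    """Build one sparse TF-IDF vector (term -> weight) per document."""
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1
        vectors.append({
            # Smoothed IDF (an assumption) keeps every weight positive.
            term: (count / total) * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, count in counts.items()
        })
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def associate(resource_text, task_descriptions):
    """Return the most similar task id and the similarity score of every task."""
    task_ids = list(task_descriptions)
    vectors = tfidf_vectors([resource_text] + [task_descriptions[t] for t in task_ids])
    resource_vec, task_vecs = vectors[0], vectors[1:]
    scores = {tid: cosine(resource_vec, vec) for tid, vec in zip(task_ids, task_vecs)}
    return max(scores, key=scores.get), scores
```

Calling `associate` with the text of a resource and a dictionary of task descriptions returns the task whose description shares the most distinctive vocabulary with the resource. A segmentation strategy would decide which resources are grouped before this comparison is made.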

Investigating data-flow reachability questions (2022)

Software developers frequently ask and answer questions about code that involve data and its path through a program, such as “How was this value created?” or “How is this value modified?” These questions are instances of reachability questions, which require developers to locate points of interest within program paths. Despite how frequently developers encounter reachability questions, existing tools place the burden on the developer to translate the question of interest into a low-level analysis, or to examine analysis results out of context of the reachability question, or both. In this thesis, we introduce the ReachHover tool to investigate whether direct user interface support for asking and answering reachability questions makes it easier for developers to answer these questions accurately. We focused ReachHover’s support on data-flow reachability questions after conducting a formative study of 72 practicing developers about the type and frequency of reachability questions they encounter in their work. We evaluated ReachHover through a controlled user study with 20 practicing developers, finding that participants who used ReachHover answered data-flow reachability questions involving multiple files more correctly than those who used standard tooling, and that those developers better maintained context while determining their answers.

Reducing the cognitive and temporal costs of software history exploration (2022)

Software developers traverse several commits and issues from issue tracking systems when exploring software revision history to answer questions about the rationale behind presently written code. Existing tools impose a cognitive burden on developers as developers must sift through many commits and must transition between commit information and issue tracking information presented separately. More effective support for software revision history exploration would reduce the cognitive burden for developers, allowing them to answer code rationale questions in the limited time available for tasks. We introduce Intelligent History, which uses commit history highlighting to reduce the search space of commits in a revision history, recommending to a developer which commits in a history might merit further investigation. Additionally, Intelligent History minimizes the distance between a developer's IDE and issues from an issue tracking system by directly integrating issue information in the same context. To evaluate Intelligent History, we conducted a controlled laboratory study and recruited 10 software developers. We asked two sets of questions related to the intent behind source code, requiring the participants to explore commit histories for two Java classes from the open-source Apache Kafka project. We also conducted a semi-structured interview on the participants' experiences with examining issues and commits and their use of Intelligent History in the experiment. The results from our analysis show: (1) participants who used Intelligent History examined fewer commits than participants who did not use Intelligent History while producing correct answers to questions posed; (2) direct integration of issue information in the IDE with Intelligent History supports developers who employ a commit history exploration approach we describe as linear or cyclic and backtracking to view slightly more issues than if they did not have direct issue integration; and (3) our heuristics were accurate to the extent that there was overlap between at least half of the commits that participants examined without Intelligent History and the commits that Intelligent History highlighted.

Automatic identification and description of software developers tasks (2020)

A software developer works on many tasks per day, frequently switching back and forth between their tasks. This constant churn of tasks makes it difficult for a developer to know the specifics of what tasks they worked on, and when they worked on them. Consequently, activities such as task resumption, planning, retrospection, and reporting become complicated. To help a developer determine which tasks they worked on and when these tasks were performed, we introduce two novel approaches. First, an approach that captures the contents of a developer’s active window at regular intervals to create vector and visual representations of the work in a particular time interval. Second, an approach that automatically detects the times at which developers switch tasks, as well as coarse-grained information about the type of the task. To evaluate the first approach, we created a data set with multiple developers working on the same set of six information seeking tasks. To evaluate the second approach, we conducted two field studies, collecting data from a total of 25 professional developers. Our analyses show that our approaches enable: 1) segments of a developer’s work to be automatically associated with a task from a known set of tasks with average accuracy of 70.6%, 2) a visual representation of a segment of work performed such that a developer can recognize the task with average accuracy of 67.9%, 3) the boundaries of a developer’s task to be detected with an accuracy as high as 84%, and 4) the coarse-grained type of a task that a developer works on to be detected with 61% accuracy.

Investigating completeness and consistency of links between issues and commits (2017)

Software developers use commits to track source code changes made to a project, and to allow multiple developers to make changes simultaneously. To ensure that the commits can be traced to the issues that describe the work to be performed, developers typically add the identifier of the issue to the commit message to link commits to issues. However, developers are not infallible and not all desirable links are captured manually. To help find and improve links that have been manually specified, several techniques have been created. Although many software engineering tools, like defect predictors, depend on the links between commits and issues, there is currently no way to assess the quality of existing links. To provide a means of assessing the quality of links, I propose two quality attributes: completeness and consistency. Completeness measures whether all appropriate commits link to an issue, and consistency measures whether commits are linked to the most specific issue. I applied these quality attributes to assess a number of existing link techniques and found that existing techniques to link commits to issues lack both completeness and consistency in the links that they created. To enable researchers to better assess their techniques, I built a dataset that improves the link data for two open source projects. In addition, I provide an analysis of information in issue repositories in the form of relationships between issues that might help improve existing link augmentation techniques.

Investigating software developers' understanding of open source software licensing (2017)

Software provided under open source licenses is widely used, from forming high-profile stand-alone applications (e.g., Mozilla Firefox) to being embedded in commercial offerings (e.g., network routers). Despite the high frequency of use of open source licenses, there has been little work about whether software developers understand the open source licenses they use. To help understand whether or not developers understand the open source licenses they use, I conducted a survey that posed development scenarios involving three popular open source licenses (GNU GPL 3.0, GNU LGPL 3.0 and MPL 2.0) both alone and in combination. The 375 respondents to the survey, who were largely developers, gave answers consistent with those of a legal expert's opinion in 62% of 42 cases. Although developers clearly understood cases involving one license, they struggled when multiple licenses were involved. To understand the context in which licensing issues arise in practice, I analyzed real-world questions posed by developers on online question-and-answer communities. The analysis of these questions indicates that licensing issues can constrain software evolution and technical decisions can have an impact on future licensing issues. Finally, I interviewed software developers in industry to understand how developers reason about and handle license incompatibility in practice. The developers I interviewed are cautious of restrictive licenses. To identify potential licensing issues, these developers rely on licensing guidelines provided by their organization and sometimes use specialized tools to automatically detect licensing issues in their projects. When faced with a situation in which a component that suits their needs is not compatible, developers tend to look for alternative components made available by open source communities. They sometimes leverage the technical architecture of their projects to enable the use of components under restrictive licenses and might rewrite the required functionality if necessary. An analysis of the results indicates a need for tool support to help guide developers in understanding the structure of the code and the technical details of a project while taking into account the exact requirements imposed by the licenses involved.

Optimizing Modern Code Review through Recommendation Algorithms (2016)

Software developers have many tools at their disposal that use a variety of sophisticated technology, such as static analysis and model checking, to help find defects before software is released. Despite the availability of such tools, software development still relies largely on human inspection of code to find defects. Many software development projects use code reviews as a means to ensure this human inspection occurs before a commit is merged into the system. Known as modern code review, this approach is based on tools, such as Gerrit, that help developers track commits for which review is needed and that help perform reviews asynchronously. As part of this approach, developers are often presented with a list of open code reviews requiring attention. Existing code review tools simply order this list of open reviews based on the last update time of the review; it is left to a developer to find a suitable review on which to work from a long list of reviews. In this thesis, we present an investigation of four algorithms that recommend an ordering of the list of open reviews based on properties of the reviews. We use a simulation study over a dataset of six projects from the Eclipse Foundation to show that an algorithm that orders reviews from the fewest to the most lines of code modified in the changes to be reviewed outperforms the other algorithms. This algorithm shows promise for eliminating stagnation of reviews and optimizing the average duration reviews are open.
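
The two orderings contrasted in this abstract can be sketched as simple sort keys. This is an illustrative sketch only: the `OpenReview` record, its field names, the newest-first direction of the baseline, and the tie-breaking rule are assumptions, not details taken from the thesis or from Gerrit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OpenReview:
    """Hypothetical record for an open code review."""
    review_id: str
    lines_modified: int  # total lines of code changed in the review
    last_updated: int    # e.g. a Unix timestamp of the last update


def order_by_last_update(reviews):
    """Baseline: most recently updated review first (direction is an assumption)."""
    return sorted(reviews, key=lambda r: r.last_updated, reverse=True)


def order_by_smallest_change(reviews):
    """Order reviews from the fewest to the most lines of code modified.

    Ties are broken by the older update time first, which is an assumption.
    """
    return sorted(reviews, key=lambda r: (r.lines_modified, r.last_updated))
```

With this ordering, small changes rise to the top of a reviewer's list regardless of when they were last touched.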

An Exploratory Study of Socio-Technical Congruence in an Ecosystem of Software Developers (2015)

Software is not built in isolation but builds on other software. When one project relies on software produced by another project, we say there is a technical dependence between the projects. The socio-technical congruence literature suggests that when there is a technical dependence there may need to be a social dependence. We investigate the alignment between social interactions and technical dependence in a software ecosystem. We performed an exploratory study of 250 Java projects on GitHub that use Maven for build dependences. We create a social interaction graph based on developers’ interactions on issues and pull requests. We compare the social interaction graph with a technical dependence graph representing library dependences between the projects in the ecosystem, to get an overview of the congruence, or lack thereof, between social interactions and technical dependences. We found that in 23.6% of the cases in which there is a technical dependence between projects there is also evidence of social interaction between project members. We found that in 8.67% of the cases in which there is a social interaction between project members, there is a technical dependence between projects. To better understand the situations in which there is congruence between the social and technical graphs, we examine pairs of projects that meet this criterion. We identify three categories of these project pairs and provide a quantitative and qualitative comparison of project pairs from each category. We found that for 45 (32%) of project pairs, no social interaction had taken place before the introduction of technical dependence and interactions after the introduction of the dependence are often about upgrading the library being depended upon. For 49 (35%) of project pairs, 75% of the interaction takes place after the introduction of the technical dependence. For the remaining 45 (32%) of project pairs, less than 75% of the interaction takes place after the introduction of the technical dependence. In the latter two cases, although there is interaction before the technical dependence is introduced, it is not always about the dependence.
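
The two congruence percentages reported above are ratios over the overlap of the two graphs. A minimal sketch of that computation follows, assuming both graphs are reduced to undirected pairs of projects. The function name `congruence` and the undirected simplification are assumptions for illustration; the thesis may model direction and developer-level interactions differently.

```python
def congruence(technical_edges, social_edges):
    """Compare a technical dependence graph with a social interaction graph.

    Both inputs are iterables of (project, project) pairs, treated here as
    undirected (an assumption). Returns a tuple of:
      - the fraction of technical dependences that also show social interaction
      - the fraction of social interactions that also have a technical dependence
    """
    technical = {frozenset(edge) for edge in technical_edges}
    social = {frozenset(edge) for edge in social_edges}
    both = technical & social
    technical_with_social = len(both) / len(technical) if technical else 0.0
    social_with_technical = len(both) / len(social) if social else 0.0
    return technical_with_social, social_with_technical
```

The two ratios generally differ because they share a numerator but are normalized by different graph sizes, which is why the abstract reports two separate percentages.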

Do Developers Respond to Code Stability Warnings? (2015)

Ideally, developers would always release code without bugs. Given the impossibility of achieving this ideal, there has been growing interest in ways to alert a developer earlier in the development process to code that may be more bug prone. A recent study found Google developers were unsure of how to act on file-level bug prediction information provided during code reviews as developers were confused about how files were flagged and how potential problems indicated by flagged files could be addressed. We hypothesize that developers may find simpler information provided earlier than code reviews easier to act upon. We introduce a plugin we built called ChangeMarkup that indicates code age and commit sizes via per-line markers in the editor. To understand if this approach has value, we performed a field study with five industry participants working on JavaScript code; our rationale was that warnings might be of more use to developers working in a dynamic language. We found that participants were interested in whether code is recent but do not care precisely how recent and that participants are generally unwilling to change their work habits in response to code stability warnings, regardless of the indicated risk level of performing an edit. Reasons for this reluctance were limited choice as to which edits must be performed and how, a reliance on resilient company release procedures such as dark launching, and confidence in their own work. Based on participant feedback, we propose future adaptations of ChangeMarkup such as an adaptive plugin that anticipates developer activities and presents information accordingly, and a version further simplified to mark only the most recently committed code.

What to Learn Next: Recommending Commands in a Feature-Rich Environment (2015)

Despite an abundance of commands to make tasks easier to perform, the users of feature-rich applications, such as development environments and AutoCAD applications, use only a fraction of the commands available due to a lack of awareness of the existence of many commands. Earlier work has shown that command recommendation can improve the usage of a range of commands available within such applications. In this thesis, we address the command recommendation problem, in which, given the command usage history of a set of users, the objective is to predict a command that is likely useful for the user to learn. We investigate two approaches to address the problem. The first approach is built upon the hypothesis that users of feature-rich applications who have similar features tend to use the same commands, and also, a specific user tends to use commands with similar features. Building on this hypothesis, we describe a supervised learning framework that exploits features from a user-command network to predict new links among users and commands. The second approach is built upon three hypotheses. First, we hypothesize that in feature-rich applications there exist co-occurrence patterns between commands. Second, we hypothesize that users of feature-rich applications have prevalent discovery patterns. Finally, we hypothesize that users need different recommendations based on the time elapsed between their last activity and the time of recommendation. To generate recommendations, we obtain co-occurrence and discovery patterns from the command usage history of a large set of users of the same feature-rich application. Subsequently, for each user, we produce recommendations based on the user's command usage history, co-occurrence and discovery patterns, and time elapsed since the last command usage. We refer to the algorithm we developed according to this approach as CoDis. Empirical experiments on data submitted by users of an integrated development environment (Eclipse) demonstrate that CoDis achieves significant performance improvements over link prediction, standard algorithms used for command recommendation, and matrix factorization techniques that are known to perform well in other domains. Compared to ADAGRAD, the best performing baseline, it achieves an improvement of 10.22% in recall, for a top-N recommendation task (N=20).

Explanations for Command Recommendations - An Experimental Study (2014)

Recently, the evaluation of recommender systems has gone beyond evaluating just the algorithm. In addition to the accuracy of algorithms, user-centric approaches evaluate a system’s effectiveness in presenting recommendations, explaining recommendations and gaining users’ confidence in the system. Existing research focuses on explaining recommendations that are related to the user’s current task. However, explaining recommendations can prove useful even when recommendations are not directly related to the user’s current task. Recommending development environment commands to software developers is an example of recommendations that are not related to the user’s current task, which is primarily focussed on programming rather than inspecting recommendations. In this dissertation, we study three different kinds of explanations for IDE commands recommended to software developers. These explanations are inspired by common approaches in the literature of the domain. We describe a lab-based experimental study in which 24 participants performed programming tasks on an open source project. Our results suggest that explanations affect users’ trust of recommendations, and that explanations reporting the system’s confidence in a recommendation affect their trust more. The explanation with the system’s confidence rating of the recommendations resulted in more recommendations being investigated. However, explanations did not affect the uptake of the commands. Our qualitative results suggest that recommendations, when not the user’s primary focus, should be in the context of the user’s task to be accepted more readily.

Reverb: dynamic bookmarks for developers (2012)

The web is an increasingly important source of development-related resources, such as code examples, tutorials, and API documentation. Yet existing integrated development environments do little to assist the developer in finding and utilizing these resources. In this work, we explore how to provide useful web page recommendations to developers by focusing on the problem of refinding previously-visited web pages. We present the results of a formative study, in which we measured how often developers return to code-related web pages, and the methods they use to find those pages. Considering only revisits which occurred at least 15 minutes after the previous visit, and are therefore unlikely to be a consequence of browsing search results, we found a code-related recurrence rate of 13.7%. Only 7.4% of these code-related revisits were initiated through a bookmark of some kind, indicating the majority involved some manual effort to refind. To assist developers with code-related revisits, we developed Reverb, a tool which displays a list of dynamic bookmarks that pertain to the code visible in the editor. Reverb’s bookmarks are generated by building queries from the classes and methods referenced in the local code context and running these queries against a full-text index of the developer’s browsing history, as collected from popular browsers used. We describe Reverb’s implementation and present results from a study in which developers used Reverb while working on their own coding tasks. Our results suggest that local code context can help in making useful recommendations.

Supporting software history exploration (2011)

Software developers often confront questions such as "Why was the code implemented this way?" To answer such questions, developers make use of information in a software system's bug and source repositories. In this thesis, we consider two user interfaces for helping a developer to explore information from such repositories. One user interface, from Holmes and Begel's Deep Intellisense tool, exposes historical information across several integrated views, favouring exploration from a single code element to all of that element's historical information. The second user interface, in a tool called Rationalizer that we introduce in this thesis, integrates historical information into the source code editor, favouring exploration from a particular code line to its immediate history. We introduce a model to express how software repository information is connected and use this model to compare the two interfaces. Through a laboratory study, we found that our model can help to predict which interface is helpful for two particular kinds of historical questions. We also found deficiencies in the interfaces that hindered users in the exploration of historical information. These results can help inform tool developers who are presenting historical information from software repositories, whether that information is retrieved directly from the repository or derived through software history mining.
