Ali Mesbah

Professor

Graduate Student Supervision

Doctoral Student Supervision

Dissertations completed in 2010 or later are listed below. Please note that there is a 6-12 month delay to add the latest dissertations.

UI driven dynamic analysis and testing of web applications (2023)

Modern web applications have evolved into highly complex, interactive software capable of replicating the functionality of traditional software. Web application development itself has evolved to offer developers a variety of development frameworks and programming languages. This, however, leads to one of the core challenges in web application testing: the heterogeneous nature of web applications, which makes software testing techniques that rely on analyzing source code ineffective. Instead, end-to-end UI testing is the preferred mode of ensuring that web applications do not contain faults or bugs. However, the web UI testing ecosystem has failed to evolve at a similar pace to web development and instead relies on human effort in practice, making web application testing a costly endeavour. The goal of this thesis is to enable automatic testing of web applications, reducing reliance on human effort. First, we identified web page equivalence to be a core challenge faced by existing automatic UI test generation techniques, performed an empirical evaluation of ten existing state comparison techniques, and identified the characteristics of modern web apps that render these techniques ineffective in generating optimal UI test suites. Thereafter, we designed a novel state comparison and test generation technique that treats a web page as a set of individual functionalities that can be represented in a hierarchy of fragments, helping us test modern web applications effectively. Next, we designed the first universally applicable mutation analysis framework for web applications, regardless of the back-end and front-end technologies they are built upon. It is capable of assessing the quality of a given UI test suite without requiring the source code of the web application. Finally, we tackle the challenge of enabling API testing universally for any web application. Our API testing framework exercises the UI to carve API test suites and generates an API specification that would enable API testing for any given web application. We implement our techniques in open-source tools and evaluate them through a set of controlled experiments. The results show that our techniques succeeded in accomplishing the set research goals.

Automated visual analysis of non-functional web app properties (2022)

Non-functional software properties capture qualitative and generic aspects about software. Such aspects are often high level and more semantic compared to the more precise and quantitative functional properties or requirements, and therefore have been more difficult to analyze and automate. A scarcely explored, and potentially useful, alternative paradigm is the adoption of what might be referred to as a visual analysis approach to software engineering, which involves extracting or analyzing visual information pertaining to the software, with the objective of addressing software engineering problems. The goal of the work presented in this dissertation is to improve non-functional web UI properties using automated visual analysis. We focus on particular problems of testability, accessibility, and maintainability because they have not been amenable to automation so far. First, we improve testability by converting the inherently non-testable web canvas elements into testable ones. The automated technique is based on visually analyzing the structure and properties of the canvas contents, then augmenting them into the canvas element to make it testable. Then, we propose an approach to test semantic accessibility. It is based on visually analyzing various regions of the page and then inferring any associated semantic roles, after which the UI markup is examined to assert the presence of the roles. Next, we introduce an automated technique for addressing the common problem of inaccessible web form labeling. It is based on constructing visual cues from the form, then solving for the optimal labeling associations, which are finally augmented into the inaccessible web forms to make them accessible. Finally, we present a UI component generation technique to improve maintainability. The technique first detects visual patterns in the UI, then combines subsets of these patterns into a shared template, which is finally formulated as a UI component. Our evaluations show that the proposed techniques are able to carry out the inferences, analyses, and tests in an accurate and effective manner.

Understanding motifs of program behaviour and change (2018)

Program comprehension is crucial in software engineering; it is a necessary step for performing many tasks. However, the implicit and intricate relations between program entities hinder comprehension of program behaviour and change. It is a particularly difficult endeavour to understand dynamic and modern programming languages such as JavaScript, which has grown to be among the most popular languages. Comprehending such applications is challenging due to the temporal and implicit relations of asynchronous, DOM-related and event-driven entities spread over the client and server sides. The goal of the work presented in this dissertation is to facilitate program comprehension through the following techniques. First, we propose a generic technique for capturing low-level event-based interactions in a web application and mapping those to a higher-level behavioural model. This model is then transformed into an interactive visualization, representing episodes of execution through different semantic levels of granularity. Then, we present a DOM-sensitive hybrid change impact analysis technique for JavaScript through a combination of static and dynamic analysis. Our approach incorporates a novel ranking algorithm for indicating the importance of each entity in the impact set. Next, we introduce a method for capturing a behavioural model of full-stack JavaScript applications’ execution. The model is temporal and context-sensitive to accommodate asynchronous events, as well as the scheduling and execution of lifelines of callbacks. We present a visualization of the model to facilitate program comprehension for developers. Finally, we propose an approach for facilitating comprehension by creating an abstract model of software behaviour. The model encompasses hierarchies of recurring and application-specific motifs. The motifs are abstract patterns extracted from traces through our novel technique, inspired by bioinformatics algorithms. The motifs provide an overview of the behaviour at a high level, while encapsulating semantically related sequences in execution. We design a visualization that allows developers to observe and interact with inferred motifs. We implement our techniques in open-source tools and evaluate them through a set of controlled experiments. The results show that our techniques significantly improve developers’ performance in comprehending the behaviour and impact of change in software systems.

Directed test generation and analysis for web applications (2017)

The advent of web technologies has led to the proliferation of modern web applications with enhanced user interaction and client-side execution. JavaScript (the most widely used programming language) is extensively used to build responsive modern web applications. The event-driven and dynamic nature of JavaScript, and its interaction with the Document Object Model (DOM), make it challenging to understand and test effectively. The ultimate goal of this thesis is to improve the quality of web applications through automated testing and maintenance. The work presented in this dissertation has focused on advancing the state of the art in testing and maintaining web applications by proposing a new set of techniques and tools. We proposed (1) a feedback-directed exploration technique and a tool to cover a subset of the state-space of a given web application; the exploration is guided towards achieving higher functionality, navigational, and page structural coverage while reducing the test model size, (2) a technique and a tool to generate UI tests using existing tests; it mines the existing test suite to infer a model of the covered DOM states and event-based transitions including input values and assertions; it then expands the inferred model by exploring alternative paths and generates assertions for the new states; finally it generates a new test suite from the extended model, (3) the first empirical study on JavaScript tests to characterize their prevalence and quality metrics, and to find out root causes for the uncovered (missed) parts of the code under test, (4) a DOM-based JavaScript test fixture generation technique and a tool, which is based on dynamic symbolic execution; it guides the execution through different branches of a function by producing expected DOM instances, and (5) a technique and a tool to detect JavaScript code smells using static and dynamic analysis. We evaluated the presented techniques by conducting various empirical studies and comparisons. The evaluation results point to the effectiveness of the proposed techniques in terms of fault detection capability and code coverage for test generation, and in terms of accuracy for code smell detection.

Mobile App Development: Challenges and Opportunities for Automated Support (2016)

Mobile app development is a relatively new phenomenon that is increasing rapidly due to the ubiquity and popularity of smartphones among end-users. As with any new domain, mobile app development has its own set of new challenges. The work presented in this dissertation has focused on improving the state-of-the-art by understanding the current practices and challenges in mobile app development as well as proposing a new set of techniques and tools based on the identified challenges. To understand the current practices, real challenges and issues in mobile development, we first conducted an explorative field study, in which we interviewed 12 senior mobile developers from nine different companies, followed by a semi-structured survey, with 188 respondents from the mobile development community. Next, we mined and quantitatively and qualitatively analyzed 32K non-reproducible bug reports in one industrial and five open-source bug repositories. Then, we performed a large-scale comparative study of 80K iOS and Android app-pairs and 1.7M reviews by mining the Google Play and Apple app stores. Based on the identified challenges, we first proposed a reverse engineering technique that automatically analyzes a given iOS mobile app and generates a state model of the app. Finally, we proposed an automated technique for detecting inconsistencies in the same mobile app implemented for iOS and Android platforms. To measure the effectiveness of the proposed techniques, we evaluated our methods using various industrial and open-source mobile apps. The evaluation results point to the effectiveness of the proposed model generation and mapping techniques in terms of accuracy and inconsistency detection capability.

On the detection, localization and repair of client-side JavaScript faults (2016)

With web application usage becoming ubiquitous, there is greater demand for making such applications more reliable. This is especially true as more users rely on web applications to conduct day-to-day tasks, and more companies rely on these applications to drive their business. Since the advent of Web 2.0, developers often implement much of the web application’s functionality at the client-side, using client-side JavaScript. Unfortunately, despite repeated complaints from developers about confusing aspects of the JavaScript language, little work has been done analyzing the language’s reliability characteristics. With this problem in mind, we conducted an empirical study of real-world JavaScript bugs, with the goal of understanding their root cause and impact. We found that most of these bugs are DOM-related, which means they occur as a result of the JavaScript code’s interaction with the Document Object Model (DOM). Having gained a thorough understanding of JavaScript bugs, we designed techniques for automatically detecting, localizing and repairing these bugs. Our localization and repair techniques are implemented as the AutoFLox and Vejovis tools, respectively, and they target bugs that are DOM-related. In addition, our detection techniques – Aurebesh and Holocron – attempt to find inconsistencies that occur in web applications written using JavaScript Model-View-Controller (MVC) frameworks. Based on our experimental evaluations, we found that these tools are highly accurate, and are capable of finding and fixing bugs in real-world web applications.

Effective test generation and adequacy assessment for JavaScript-based web applications (2015)

Modern Web applications rely heavily on JavaScript and client-side run-time manipulation of the DOM (Document Object Model) tree. One way to provide assurance about the correctness of such highly evolving and dynamic applications is through testing. However, JavaScript is loosely typed, dynamic, and notoriously challenging to analyze and test. The work presented in this dissertation has focused on advancing the state of the art in testing JavaScript-based web applications by proposing a new set of techniques and tools. We proposed (1) a new automated technique for JavaScript regression testing, which is based on inferring invariant assertions, (2) the first JavaScript mutation testing tool, capable of guiding the mutation generation towards behaviour-affecting mutants in error-prone portions of the code, (3) an automatic technique to generate test cases for JavaScript functions and events; mutation analysis is used to generate test oracles, capable of detecting regression JavaScript and DOM-level faults, and (4) a technique that utilizes existing DOM-dependent assertions as well as useful execution information inferred from a DOM-based test suite to automatically generate assertions for unit-level testing of JavaScript functions. To measure the effectiveness of the proposed approaches, we evaluated each method presented in this thesis by conducting various empirical studies and comparisons with existing testing techniques. The evaluation results point to the effectiveness of the proposed test generation and test assessment techniques in terms of accuracy and fault detection capability.

Master's Student Supervision

Theses completed in 2010 or later are listed below. Please note that there is a 6-12 month delay to add the latest theses.

The role of dual slicing in automatic program repair (2023)

Contextual information plays a vital role for software developers when understanding and fixing a bug. Context can also be important in deep learning-based program repair to provide extra information about the bug and its fix. Existing techniques, however, treat context in an arbitrary manner, by extracting code in close proximity to the buggy statement within the enclosing file, class, or method, without any analysis to find actual relations with the bug. To reduce noise, they use a predefined maximum limit on the number of tokens to be used as context. We present a program slicing-based approach in which, instead of arbitrarily including code as context, we analyze statements that have a control or data dependency on the buggy statement. We propose a novel concept called dual slicing, which leverages the context of both buggy and fixed versions of the code to capture relevant repair ingredients. We present our technique and tool called KATANA, the first to apply slicing-based context for a program repair task. The results show that KATANA effectively preserves sufficient information for a model to choose contextual information while reducing noise. We compare against four recent state-of-the-art context-aware program repair techniques. Our results show that KATANA fixes between 1.5 and 3.7 times more bugs than existing techniques.

The impact of code representation on deep learning-based program repair (2022)

Training deep learning models on source code has gained significant traction recently. Since such models reason about vectors of numbers, source code needs to be converted to a code representation, which is then transformed into vectors. Numerous approaches have been proposed to represent source code, from sequences of tokens to abstract syntax trees. However, there is no systematic study to understand the effect of code representation on learning performance. Through a controlled experiment, we examine the impact of various code representations on model accuracy and usefulness in learning-based program repair. We train 21 different models, including 14 different homogeneous code representations, four mixed representations for the buggy and fixed code, and three different embeddings. We also conduct a user study to qualitatively evaluate the usefulness of inferred fixes in different code representations. Our results highlight the importance of code representation and its impact on learning and usefulness. Our findings indicate that (1) while code abstractions help the learning process, they can adversely impact the usefulness of inferred fixes from a developer’s point of view; this emphasizes the need to look at the generated patches from the end user’s perspective, which is often neglected in the literature, (2) mixed representations can outperform homogeneous code representations, and (3) bug type can affect the effectiveness of different code representations; although current techniques use a single code representation for all bug types, there is no single best code representation applicable to all bug types. This calls for approaching the repair task with the practitioner in mind, which is often overlooked in the literature. We emphasize that the effectiveness of code representation varies significantly depending on the bug type.

Bugs and development challenges in IoT systems (2021)

IoT systems are being rapidly adopted in various domains, from embedded systems to smart homes. Despite their growing adoption and popularity, there has been no thorough study to understand IoT development challenges from the practitioners’ point of view. We provide the first systematic study of bugs and challenges that IoT developers face in practice, through a large-scale empirical investigation. We collected 5,565 bug reports from 91 representative IoT project repositories and categorized a random sample of 323 based on the observed failures, root causes, and the locations of the faulty components. In addition, we conducted nine interviews with IoT experts to uncover more details about IoT bugs and to gain insight into IoT developers’ challenges. Lastly, we surveyed 194 IoT developers to validate our findings and gain further insights. We propose the first bug taxonomy for IoT systems based on our results. We highlight frequent bug categories and their root causes, correlations between them, and common pitfalls and challenges that IoT developers face. We recommend future directions for IoT areas that require research and development attention.

A Study of the Influence of Assertions and Mutants on Test Suite Effectiveness (2017)

Test suite effectiveness is measured by assessing the portion of faults that can be detected by tests. To precisely measure a test suite’s effectiveness, one needs to pay attention to both the tests and the set of faults used. Code coverage is a popular test adequacy criterion in practice. Code coverage, however, remains controversial as there is a lack of coherent empirical evidence for its relation with test suite effectiveness. More recently, test suite size has been shown to be highly correlated with effectiveness. However, previous studies treat test methods as the smallest unit of interest, and ignore potential factors influencing the correlation between test suite size and test suite effectiveness. We propose to go beyond test suite size, by investigating test assertions inside test methods. First, we empirically evaluate the relationship between a test suite’s effectiveness and the (1) number of assertions, (2) assertion coverage, and (3) different types of assertions. We compose 6,700 test suites in total, using 24,000 assertions of five real-world Java projects. We find that the number of assertions in a test suite strongly correlates with its effectiveness, and this factor positively influences the relationship between test suite size and effectiveness. Our results also indicate that assertion coverage is strongly correlated with effectiveness. Second, instead of only focusing on the testing side, we propose to investigate test suite effectiveness also by considering fault types (the ways faults are generated) and faults in different types of statements. Measuring a test suite’s effectiveness can be influenced by using faults with different characteristics. Assessing test suite effectiveness without paying attention to the distribution of faults is not precise. Our results indicate that fault type and the statement type where the fault is located can significantly influence a test suite’s effectiveness.

Mining and characterizing cross-platform apps (2017)

Smartphones and the applications (apps) that run on them have grown tremendously over the past few years. To capitalize on this growth and attract more users, developing the same app for different platforms has become a common industry practice. However, each mobile platform has its own development language, application programming interfaces (APIs), software development kits (SDKs) and online stores for distributing apps to users. To understand the characteristics of and differences in how users perceive the same app implemented for and distributed through different platforms, we present a large-scale comparative study of cross-platform apps. We mine the characteristics of 80,000 app-pairs (160K apps in total) from a corpus of 2.4 million apps collected from the Apple and Google Play app stores. We quantitatively compare their app-store attributes, such as stars, versions, and prices. We measure the aggregated user-perceived ratings and find many differences across the platforms. Further, we employ machine learning to classify 1.7 million textual user reviews obtained from 2,000 of the mined app-pairs. We analyze discrepancies and root causes of user complaints to understand cross-platform development challenges that impact cross-platform user-perceived ratings. We also follow up with the developers to understand the reasons behind identified differences. Further, we take a closer look at a special category of cross-platform apps, which are built using Cross Platform Tools (CPTs). CPTs allow developers to use a common code-base to simultaneously create apps for multiple platforms. Apps created using these CPTs are called hybrid apps. We mine 15,512 hybrid apps, measure their aggregated user-perceived ratings, and compare them to native apps of the same category.

A study of bugs in test code and a test model for analyzing tests (2016)

Testing has become a widespread practice among practitioners. Test cases are written to verify that production code functions as expected and are modified alongside the production code. Over time the quality of the test code can degrade. The test code might contain bugs, or it can accumulate redundant test cases or very similar ones with many redundant parts. The work presented in this dissertation has focused on addressing these issues by characterizing bugs in test code, and proposing a test model to analyze test cases and support test reorganization. To characterize the prevalence and root causes of bugs in the test code, we mine the bug repositories and version control systems of 448 Apache Software Foundation projects. Our results show that around half of all the projects had bugs in their test code; the majority of test bugs are false alarms, i.e., the test fails while the production code is correct, while a minority of these bugs result in silent horrors, i.e., the test passes while the production code is incorrect; missing and incorrect assertions are the dominant root cause of silent horror bugs; semantic, flaky, and environment-related bugs are the dominant root cause categories of false alarms. We present a test model for analyzing tests and performing test reorganization tasks in test code. Redundancies increase the maintenance overhead of the test suite and increase the test execution time without increasing the test suite coverage and effectiveness. We propose a technique that uses our test model to reorganize test cases in a way that reduces the redundancy in the test suite. We implement our approach in a tool and evaluate it on four open-source software systems. Our empirical evaluation shows that our approach can reduce the number of redundant test cases by up to 85% and the test execution time by up to 2.5% while preserving the test suite’s behaviour.

Characterizing and Refactoring Asynchronous Javascript Callbacks (2016)

Modern web applications make extensive use of JavaScript, which is now estimated to be one of the most widely used languages in the world. Callbacks are a popular language feature in JavaScript. However, they are also a source of comprehension and maintainability issues. We studied several features of callback usage across a large number of JavaScript applications and found that over 43% of all callback-accepting function call sites are anonymous, the majority of callbacks are nested, and more than half of all callbacks are invoked asynchronously. Promises have been introduced as an alternative to callbacks for composing complex asynchronous execution flow and as a robust mechanism for error checking in JavaScript. We use our observations of callback usage to build a developer tool that refactors asynchronous callbacks into Promises. We show that our technique and tool are broadly applicable to a wide range of JavaScript applications.

Code Smells in Cascading Style Sheets: An Empirical Study and a Predictive Model (2015)

Cascading Style Sheets (CSS) is widely used in today's web applications to separate presentation semantics from HTML content. Despite the simple syntax of CSS, the language has some characteristics, such as inheritance, cascading and specificity, which make authoring and maintaining CSS a challenging task. In this thesis, we describe a set of 26 CSS smells and errors, collected from various development resources, and propose an automated technique to detect them. Additionally, we conduct a large empirical study on 500 websites, 5060 CSS files in total, which consist of more than 10 million lines of CSS code, to investigate which smells and errors are more prevalent and to what extent they occur in CSS code of today's web applications. Finally, we propose a model based on the findings of our empirical study that is capable of predicting the total number of CSS code smells in any given website, which can be used by developers as a CSS code quality guidance. A study of unused CSS code on 187 websites and its results are also described in this thesis.

Hidden-Web Induced by Client-Side Scripting: An Empirical Study (2014)

Client-side JavaScript is increasingly used for enhancing web application functionality, interactivity, and responsiveness. Through the execution of JavaScript code in browsers, the DOM tree representing a webpage at runtime can be incrementally updated without requiring a URL change. This dynamically updated content is hidden from general search engines. We present the first empirical study on measuring and characterizing the hidden-web induced as a result of client-side JavaScript execution. Our study reveals that this type of hidden-web content is prevalent in online web applications today: of the 500 websites we analyzed, 95% contain client-side hidden-web content. On those websites that contain client-side hidden-web content, (1) on average, 62% of the web states are hidden, (2) per hidden state, there is an average of 19 kilobytes of hidden data, of which 0.6 kilobytes contain textual content, (3) the DIV element is the most common clickable element used (61%) to initiate this type of hidden-web state transition, and (4) on average 25 minutes is required to dynamically crawl 50 DOM states. Further, our study indicates that there is a correlation between DOM tree size and hidden-web content, but no correlation exists between the amount of JavaScript code and client-side hidden-web.

Performance improvements in crawling modern web applications (2014)

Today, a considerable portion of our society relies on Web applications to perform numerous tasks in everyday life; for example, transferring money over wire or purchasing flight tickets. To ensure that such pervasive Web applications perform robustly, various tools have been introduced in the software engineering research community and the industry. Web application crawlers are an instance of such tools used in testing and analysis of Web applications. Software testing, and in particular testing Web applications, plays an important role in ensuring the quality and reliability of software systems. In this thesis, we aim at optimizing the crawling of modern Web applications in terms of memory and time performance. Modern Web applications are event driven and have dynamic states, in contrast to classic Web applications. Aiming at improving the crawling process of modern Web applications, we focus on state transition management and scalability of the crawling process. To improve the time performance of the state transition management mechanism, we propose three alternative techniques revised incrementally. In addition, aiming at increasing the state coverage, i.e. increasing the number of states crawled in a Web application, we propose an alternative solution, reducing the memory consumption, for storage and retrieval of dynamic states in Web applications. Moreover, a memory analysis is performed by using memory profiling tools to investigate the areas of memory performance optimization. The enhancements proposed are able to improve the time performance of the state transition management by 253.34%. That is, the time consumption of the default state transition management is 3.53 times the proposed solution time, which in turn means time consumption is reduced by 71.69%. Moreover, the scalability of the crawling process is improved by 88.16%. That is, the proposed solution covers a considerably greater number of states in crawling Web applications. Finally, we identified the bottlenecks of scalability so as to be addressed in future work.
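
The three time-performance figures quoted in this abstract describe the same speedup in different ways. As a minimal sketch, assuming "improvement" is measured relative to the proposed solution's time (an interpretation, not something the abstract states explicitly; the variable names are illustrative), the relationship can be checked as follows:

```python
# Illustrative check of how the reported figures relate.
# Assumption: improvement = (default_time - proposed_time) / proposed_time.
improvement_pct = 253.34  # reported time-performance improvement

# Under that assumption, the default takes (1 + 2.5334) times as long.
ratio = 1 + improvement_pct / 100

# Fraction of the default time saved by the proposed solution.
reduction = 1 - 1 / ratio

print(f"default / proposed time ratio: {ratio:.2f}")
print(f"time reduction: {reduction:.1%}")
```

Under this reading the ratio comes out at about 3.53 and the reduction at roughly 71.7%, in line with the reported 3.53 times and 71.69% up to rounding.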

Understanding Web Application Test Assertion Failures (2014)

Developers often write test cases that assert the behaviour of a web application from an end-user’s perspective. However, when such test cases fail, it is difficult to relate the assertion failure to the faulty line of code. The challenges mainly stem from the existing disconnect between front-end test cases that assert the DOM and the application’s underlying JavaScript code. We propose an automated technique to help developers localize the fault related to a test failure. Through a combination of selective code instrumentation and dynamic backward slicing, our technique bridges the gap between test cases and program code. Through an interactive visualization, our approach, implemented in a tool called Camellia, allows developers to easily understand the dynamic behaviour of their application and its relation to the test cases. The results of our controlled experiment show that Camellia improves the fault localization accuracy of developers by a factor of two. Moreover, the implemented approach incurs a low performance overhead.
