Karthik Pattabiraman

Associate Professor

Relevant Degree Programs


Graduate Student Supervision

Doctoral Student Supervision (Jan 2008 - Mar 2019)
Understanding motifs of program behaviour and change (2018)

Program comprehension is crucial in software engineering; a necessary step for performing many tasks. However, the implicit and intricate relations between program entities hinder comprehension of program behaviour and change. It is particularly a difficult endeavour to understand dynamic and modern programming languages such as JavaScript, which has grown to be among the most popular languages. Comprehending such applications is challenging due to the temporal and implicit relations of asynchronous, DOM-related and event-driven entities spread over the client and server sides.The goal of the work presented in this dissertation is to facilitate program comprehension through the following techniques. First, we propose a generic technique for capturing low-level event-based interactions in a web application and mapping those to a higher-level behavioural model. This model is then transformed into an interactive visualization, representing episodes of execution through different semantic levels of granularity. Then, we present a DOM-sensitive hybrid change impact analysis technique for JavaScript through a combination of static and dynamic analysis. Our approach incorporates a novel ranking algorithm for indicating the importance of each entity in the impact set. Next, we introduce a method for capturing a behavioural model of full-stack JavaScript applications’ execution. The model is temporal and context-sensitive to accommodate asynchronous events, as well as the scheduling and execution of lifelines of callbacks. We present a visualization of the model to facilitate program comprehension for developers. Finally, we propose an approach for facilitating comprehension by creating an abstract model of software behaviour. The model encompasses hierarchies of recurring and application-specific motifs. The motifs are abstract patterns extracted from traces through our novel technique, inspired by bioinformatics algorithms. The motifs provide an overview of the behaviour at a high level, while encapsulating semantically related sequences in execution. We design a visualization that allows developers to observe and interact with inferred motifs.We implement our techniques in open-source tools and evaluate them through a set of controlled experiments. The results show that our techniques significantly improve developers’ performance in comprehending the behaviour and impact of change in software systems.

View record

Security analysis and intrusion detection for embedded systems (2017)

No abstract available.

On the detection, localization and repair of client-side JavaScript faults (2016)

With web application usage becoming ubiquitous, there is greater demand for making such applications more reliable. This is especially true as more users rely on web applications to conduct day-to-day tasks, and more companies rely on these applications to drive their business. Since the advent of Web 2.0, developers often implement much of the web application’s functionality at the client-side, using client-side JavaScript. Unfortunately, despite repeated complaints from developers about confusing aspects of the JavaScript language, little work has been done analyzing the language’s reliability characteristics. With this problem in mind, we conducted an empirical study of real-world JavaScript bugs, with the goal of understanding their root cause and impact. We found that most of these bugs are DOM-related, which means they occur as a result of the JavaScript code’s interaction with the Document Object Model (DOM). Having gained a thorough understanding of JavaScript bugs, we designed techniques for automatically detecting, localizing and repairing these bugs. Our localization and repair techniques are implemented as the AutoFLox and Vejovis tools, respectively, and they target bugs that are DOM-related. In addition, our detection techniques – Aurebesh and Holocron – attempt to find inconsistencies that occur in web applications written using JavaScript Model-View-Controller (MVC) frameworks. Based on our experimental evaluations, we found that these tools are highly accurate, and are capable of finding and fixing bugs in real-world web applications.

View record

Tolerating intermittent hardware errors : characterization, diagnosis and recovery (2013)

Over three decades of continuous scaling in CMOS technology has led to tremendous improvements in processor performance. At the same time, the scaling has led to an increase in the frequency of hardware errors due to high process variations, extreme operating conditions and manufacturing defects. Recent studies have found that 40% of the processor failures in real-world machines are due to intermittent hardware errors. Intermittent hardware errors are non-deterministic bursts of errors that occur in the same physical location. Intermittent errors have characteristics that are different from transient and permanent errors, which makes it challenging to devise efficient fault tolerance techniques for them.In this dissertation, we characterize the impact of intermittent hardware faults on programs using fault injection experiments at the micro-architecture level. We find that intermittent errors are likely to generate software visible effects when they occur. Based on our characterization results, we build intermittent error tolerance techniques with focus on error diagnosis and recovery. We first evaluate the impact of different intermittent error recovery scenarios on a processor's performance and availability. We then propose DIEBA (Diagnose Intermittent hardware Errors in microprocessors by Backtracing Application), a software-based technique to diagnose the fault-prone functional units in a processor.

View record

Master's Student Supervision (2010-2017)
Configurable detection of SDC-causing errors in programs (2015)

Silent Data Corruption (SDC) is a serious reliability issue in many domains, including embedded systems. However, current protection techniques are brittle, and do not allow programmers to trade off performance for SDC coverage. Further, many of them require tens of thousands of fault injection experiments, which are highly time-intensive. In this paper, we propose two empirical models, namely SDCTune and SDCAuto, to predict the SDC proneness of a program’s data. Both models are based on static and dynamic features of the program alone, and do not require fault injections to be performed. We then develop an algorithm using both models to selectively protect the most SDC-prone data in the program subject to a given performance overhead bound. Our results show that both models are accurate at predicting the SDC rate of an application. And in terms of efficiency of detection (i.e., ratio of SDC coverage provided to performance overhead), our technique outperforms full duplication by a factor of 0.78x to 1.65x with SDCTune model, and 0.62x to 0.96x with SDCAuto model.

View record

Finding resilience-friendly compiler optimizations using meta-heuristic search techniques (2015)

With the projected increase in hardware error rates in the future, software needs to be resilient to hardware faults. An important factor affecting a program's error resilience is the set of optimizations used when compiling it. Compiler optimizations typically optimize for performance or space, and rarely for error resilience. However, prior work has found that applying optimizations injudiciously can lower the program's error resilience as they often eliminate redundancy in the program. In this work, we propose automated techniques to find the set of compiler optimizations that can boost performance without degrading its overall resilience. Due to the large size of the search space, we use search heuristic algorithms to efficiently explore the space and find an optimal sequence of optimizations for a given program. We find that the resulting optimization sequences have significantly higher error resilience than the standard optimization levels (i.e., O1, O2, O3), while attaining comparable performance improvements with the optimizations levels. We also find that the resulting sequences reduce the overall vulnerability of the applications compared to the standard optimization levels.

View record

Mining Stack Overflow for questions asked by web developers (2015)

No abstract available.

Evaluating the Error Resilience of GPGPU Applications (2014)

No abstract available.

Failure analysis and prediction in compute clouds (2014)

Most cloud computing clusters are built from unreliable, commercial off-the-shelf components compared with supercomputer clusters. The high failure rates in their hardware and software components result in frequent node and application failures. Therefore, it is important to understand their failures to design a reliable cloud system. This thesis presents a characterization study of cloud application failures, and proposes a method to predict application failures in order to save resources.We first analyze a workload trace from a production cloud cluster and characterize the observed failures. The goal of our work is to improve the understanding of failures in compute clouds. We present the statistical properties of job and task failures, and attempt to correlate them with key scheduling constraints, node operations, and attributes of users in the cloud. We observe that there are many opportunities to enhance the reliability of the applications running in the cloud, and further nd that resource usage patterns of the jobs can be leveraged by failure prediction techniques.Next, we propose a prediction method based on recurrent neural networks to identify the failures. It takes the resource usage measurements or performance data, and generate features to categorize the applications into different classes. We then evaluate the method on the cloud workload trace. Our results show that the model is able to predict application failures. Moreover, we explore early classification to identify failures, and find that the prediction algorithm provides the cloud system enough time to take proactive actions much earlier than the termination of applications to avoid resource wastage.

View record

Integrated hardware-software diagnosis of intermittent faults (2014)

Intermittent hardware faults are hard to diagnose as they occur non-deterministically. Hardware-only diagnosis techniques incur significant power and area overheads. On the other hand, software-onlydiagnosis techniques have low power and area overheads, but have limited visibility into many micro-architecturalstructures and hence cannot diagnose faults in them. To overcome these limitations, we propose a hardware-softwareintegrated framework for diagnosing intermittent faults. The hardware part of our framework, called SCRIBEcontinuously records the resource usage information of every instruction in the processor, and exposes it tothe software layer. SCRIBE has 0.95% on-chip area overhead, incurs a performance overhead of 12% and power overhead of 9%, on average.The software part of our framework is called SIED and uses backtracking from the program's crash dump to find the faulty micro-architectural resource. Our technique has an average accuracy of 84% in diagnosing the faulty resource, which in turn enables fine-grained deconfiguration with less than 2% performance loss after deconfiguration.

View record

Error detection for soft computing applications (2013)

Hardware errors are on the rise with reducing chip sizes, and power constraints have necessitated the involvement of software in hardware error detection. At the same time, emerging workloads in the form of soft computing applications, (e.g., multimedia applications) can tolerate most hardware errors as long as the erroneous outputs do not deviate significantly from error-free outcomes. We term outcomes that deviate significantly from the error-free outcomes as Egregious Data Corruptions (EDCs). In this thesis, we propose a technique to place detectors for selectively detecting EDC causing errors in an application. Our technique identifies program locations for placing high coverage detectors for EDCs using static analysis and runtime profiling. We evaluate our technique on six benchmarks to measure the EDC coverage under given performance overhead bounds. Our technique achieves an average EDC coverage of 82%, under performance overheads of 10%, while detecting only 10% of the Non-EDC and benign faults. We also explore the performance-resilience tradeoff space, by studying the effect of compiler optimizations on the error resilience of soft computing applications, both with and without our technique.

View record

Characterizing the JavaScript errors that occur in production web applications : an empirical study (2012)

Client-side JavaScript is being widely used in popular web applications to improve functionality, increase responsiveness, and decrease load times. However, it is challenging to build reliable applications using JavaScript. This work presents an empirical characterization of the error messages printed by JavaScript code in web applications, and attempts to understand their root causes.We find that JavaScript errors occur in production web applications, and that the errors fall into a small number of categories. In addition, we find that certain types of web applications are more prone to JavaScript errors than others. We further find that both non-deterministic and deterministic errors occur in the applications, and that the speed of testing plays an important role in exposing errors. Finally, we study the correlations among the static and dynamic properties of the application and the frequency of errors in it in order to understand the root causes of the errors.

View record

Hardware error detection in multicore parallel programs (2012)

The scaling of Silicon devices has exacerbated the unreliability of modern computer systems, and power constraints have necessitated the involvement of software in hardware error detection. Simultaneously, the multi-core revolution has impelled software to become parallel. Therefore, there is a compelling need to protect parallel programs from hardware errors.Parallel programs’ tasks have significant similarity in control data due to the use of high-level programming models. In this thesis, we propose BlockWatch to leverage the similarity in parallel program’s control data for detecting hardware errors. BlockWatch statically extracts the similarity among different threads of a parallel program and checks the similarity at runtime. We evaluate BlockWatch on eight SPLASH-2 benchmarks to measure its performance overhead and error detection coverage. We find that BlockWatch incurs an average overhead of 15% across all programs, and provides an average SDC coverage of 97% for faults in the control data.

View record

News Releases

This list shows a selection of news releases by UBC Media Relations over the last 5 years.

Current Students & Alumni


If this is your researcher profile you can log in to the Faculty & Staff portal to update your details and provide recruitment preferences.