Sathish Gopalakrishnan

Associate Professor

Relevant Degree Programs


Graduate Student Supervision

Doctoral Student Supervision (Jan 2008 - Nov 2019)
High assurance real-time systems : scheduling, budgeting, and safety (2019)

In this dissertation we investigate resource allocation and task scheduling problems pertaining to real-time systems that mandate high levels of assurance. The required assurance guarantees are those of safety and quality of service, atop the basic guarantee required by any (classical) real-time system, namely, timely delivery of responses. We explore three major ideas within the landscape of high assurance real-time systems. First, we study the problem of scheduling Mixed-Criticality real-time tasks, with probabilistic guarantees, on a single processor. The major contribution is that we incorporate software failure rates into the scheduling problem, and we synthesize feasible scheduling policies under which the input task set will satisfy certain failure rate requirements. We model the task scheduling problem as a path-Constrained Markov Decision Process (CMDP). Second, realizing the exorbitant cost of developing safety-critical software, we develop cost-effective methods for achieving both safety integrity and timing predictability through the notion of run-time monitors, coupled with isochronous execution. This model gives rise to interesting multiversion task scheduling problems on multiple processors. We show the intractability of non-preemptive variants of the scheduling problems, and we derive optimal solutions to preemptive variants in polynomial-time. Finally, we consider a general resource budgeting problem where recurrent tasks should maintain certain quality of service levels over an indefinite operational horizon, and where both the task execution demands and the available budgets are noisy and fluctuate randomly. We derive large deviation bounds that are sufficient conditions that would allow the system designer to properly dimension system resources and allocate tasks sufficient budgets to sustain the desired quality of service levels. We then apply the bounds to a case study involving safety-critical systems that employ run-time monitors and isochronous execution, and we show how to synthesize minimum monitor worst-case execution times sufficient to achieve the desired quality of service and meet hard deadlines.

View record

Delay, reliability, and trust in information dissemination in vehicular networks (2016)

The cost of road accidents globally is more than $500 B every year. Intelligent Transportation Systems (ITS) is a promising solution and Vehicular Networks is an ideal candidate for providing a communication platform for ITS applications. Safety-critical applications form the main motivation for intelligent transportation systems. Studying the major concerns in such applications, i.e., delay and reliability, through mathematical analysis is extremely beneficial because it enables us to design optimized schemes. Such analysis is, however, challenging due to the dynamics of a vehicular network. In this research, we have three main contributions. First, we present a mathematical model to study delay and reliability of emergency message dissemination in vehicular networks. This model includes three modules to address effects of channel, contention, and partitioning on delivery delay. This is the first delay model to the best of our knowledge that does all of these, and it gives network architects insight into the limitations of the network and helps them tune parameters such as transmission power and slot time duration. Simulation studies indicate that our model does capture the delay characteristics of vehicular networks for both highway and urban scenarios.An interesting observation from the analytical model confirms the fact that using the vehicle density on the road is a good metric for setting the right forwarding probability in vehicles. We exploit this conclusion and as our second contribution, we propose a completely distributed forwarding strategy, called Middle Is Next or MIN. Extensive simulations affirm the effectiveness of MIN in terms of delay and single-hop reliability in comparison with other well-known routing methods. Trusted communication in vehicular networks is of crucial importance without which all efforts for minimizing the delay or maximizing the reliability could be voided. As our third contribution, we propose FACT: Framework for Application-oriented Context-aware Trust-based communication in vehicular networks. FACT assigns a trust value to each road segment and one to each neighbourhood, instead of each car. Thus, it scales up easily and is completely distributed. Experimental results demonstrate that FACT outperforms other well-known routing protocols since it routes the messages via trusted paths.

View record

Resource allocation and scheduling in wireless mesh networks (2013)

The unreliability of wireless mesh networks creates challenge in designing high performance wireless networks in terms of network throughput, end-to-end delay, and fairness provisioning. In this thesis, the goal is to improve the network performance in terms of these metrics. We explore several techniques such as multipath routing, channel coding, network coding, and interference alignment. We consider resource allocation both in terms of average data rate provisioning and scheduling policies in a time slot basis.First, we propose data rate and channel code rate allocation algorithms for networks with multiple paths to maximize the network throughput while all users can fairly exploit the network resources. We study the effect of adaptive and non-adaptive channel coding schemes. We also consider the end-to-end delay that each network flow experiences for data transmission. For that purpose, we formulate the problem of decreasing the end-to-end delay for network flows while improving the network throughput. Simulation results show that we can decrease the delay at the cost of a slight decrease in network throughput. We also formulate a data rate allocation problem in networks with network coding. Simulation results show that considering link reliabilities in the network coding design dramatically increases the network performance.Data rate allocation algorithms provide the average data rates at which the source must transmit data. They do not determine scheduling on a time slot basis. To address that, we consider transmission scheduling in wireless networks. We also compare the suggested algorithm with a centralized optimal data rate allocation algorithm to verify that our algorithm follows the optimal solution. Through simulations, we show that fairness provisioning leads to higher network performance. We show that the suggested algorithm outperforms the current algorithms in the literature in terms of both network throughput and fairness provisioning.Finally, we consider transmission scheduling in wireless multi-input multi-output (MIMO) systems. We formulate the problem of joint scheduling, interference alignment, and admission control in those networks and use Lyapunov stability theory to solve it. We also develop a heuristic approach to solve the problem in a semi-distributed manner.

View record

Tolerating intermittent hardware errors : characterization, diagnosis and recovery (2013)

Over three decades of continuous scaling in CMOS technology has led to tremendous improvements in processor performance. At the same time, the scaling has led to an increase in the frequency of hardware errors due to high process variations, extreme operating conditions and manufacturing defects. Recent studies have found that 40% of the processor failures in real-world machines are due to intermittent hardware errors. Intermittent hardware errors are non-deterministic bursts of errors that occur in the same physical location. Intermittent errors have characteristics that are different from transient and permanent errors, which makes it challenging to devise efficient fault tolerance techniques for them.In this dissertation, we characterize the impact of intermittent hardware faults on programs using fault injection experiments at the micro-architecture level. We find that intermittent errors are likely to generate software visible effects when they occur. Based on our characterization results, we build intermittent error tolerance techniques with focus on error diagnosis and recovery. We first evaluate the impact of different intermittent error recovery scenarios on a processor's performance and availability. We then propose DIEBA (Diagnose Intermittent hardware Errors in microprocessors by Backtracing Application), a software-based technique to diagnose the fault-prone functional units in a processor.

View record

Master's Student Supervision (2010 - 2018)
Partially-observed graph abstractions for authorship identification and process interference prediction (2017)

Pairwise interactions between objects can be modeled as a graph, which is a set of nodes and edges that each connect a pair of nodes. We consider the problem of predicting whether edges exist between nodes based on other pairs of nodes that we have observed. From a partially-observed graph we extract several neighbourhood and path features, then evaluate several machine learning algorithms for predicting whether edges will exist between unobserved nodes. K-Nearest Neighbours was found to be the best classifier. We propose the novel use of path on a weighted graph as a feature used for prediction. We apply this abstraction to predicting collaboration between authors in an on-line publication database. The unweighted graph contains an edge if two authors collaborated and the weighted graph encodes the number of collaborations. Prediction error rates were less than 3% under ten-fold cross-validation. We also apply this abstraction to predicting whether processes running on the same hardware will compete for resources. The unweighted graph contains an edge if a process executed in more time than when running with another than by itself.The weighted graph had an edge weight that was the increase in execution time. Prediction error rates were less than 14% under ten-fold cross-validation.

View record

StrongHold : a secure data platform for smart homes (2013)

Security is an important concern in day to day life, far augmented when related to sensitive data, such as private information. As such, it is paramount to protect environments and devices wherein the greater amount of information is private, such as in a home, and its encompassing devices; especially so, as the level of privacy desired in a home is much greater than other private mediums. The issue is further exacerbated in a smart home scenario, where applications are ubiquitous and are constantly communicating with the outside world: this impending technological innovation results in previously private data suddenly becoming accessible through non-physical means, allowing potential breaches in privacy. The accessibility to previously unreachable means brings forth new threats that must be tackled to ensure proper confidentiality is kept. Furthermore, proper accessibility of the physical devices must be engaged, due to their newfound non-physically accessibly nature. We survey the security threats existent in smart home environments, along with possible solutions to mitigate the threats. We use methods ranging from human computer interaction, storage optimization, and behavioral learning to better ensure proper functionality. We integrate the solutions in a cohesive form to be applied to any smart home environment that wishes to best keep high confidentiality, availability and integrity. We test the system to ensure the secure data messaging platform provides sufficient throughput for high definition media storage in real time, as potentially necessary in a smart homeWe supplement our study by creating a system capable of functioning on top of the smart server, which unobtrusively automates the daily task of recipe selection, via meal advocation through past meal selection; we do so unobtrusively in an attempt to prevent deviation from conventional home environments. We present a method which possesses higher granularity than the present meal recommendation technologies, and yet, requires less interaction than the web equivalents. Finally, we interface the algorithm to a real smart home server, evaluate the application on real users, and assess their reaction to the technology.

View record

Response-time analysis and overload management in real-time systems (2012)

We provide two approaches to handling overload in shared resources offering realtimeguarantees. We first provide a technique (based on mathematical optimization)for identifying the possible causes for an overload situation by computing theworst-case demand of the system, depending upon the amount of requests serviced.Worst-case analysis of response time has a pseudo-polynomial time complexity,and when there is no knowledge about the workload, the complexity further increases.We provide polynomial-time heuristics to reduce the computation timeof the algorithm. Further, we evaluate it against other techniques using stochasticanalysis to stress on the accuracy and ease of estimation of the result. The schedulingpolicy based on the approach is useful to detect an overload in the resource andto allow us to make responsible decisions on it. Secondly, we present a schedulingpolicy (obtained through stochastic approximation) to handle overload in real-timesystems. Competitive analysis of online algorithms has commonly been applied tounderstand the behavior of real-time systems during overload conditions. Whilecompetitive analysis provides insight into the behavior of certain algorithms, it ishard to make inferences about the performance of those algorithms in practice.Similar on-line scheduling approaches tend to function differently in practice dueto factors. Further, most work on handling overload in real-time systems doesnot consider using information regarding the distribution of arrival rates of jobsand execution times to make scheduling decisions. With some information aboutthe workload, we aim to improve the revenue earned by the service provider, ina scenario when each successful job completion results in revenue accrual. Weprove that the policy we outline does lead to increased revenue when compared toa class of scheduling policies that make static resource allocations to different service classes. We also use empirical evidence to underscore the fact that this policyperforms better than a variety of other scheduling policies. The ideas presentedcan be applied to several soft real-time systems, specifically systems with multipleservice classes.

View record

A study of data partitioning and prefetching for hybrid storage systems (2011)

Storage system performance has received much attention since the early days of computing systems because of the mismatch between computing power and I/O access times. The invention of new technologies have increased storage system performance, but due to the cost-performance trade off no one type of storage media is capable to meet both performance and capacity requirements. This motivated us to study the impact of data management techniques such as data partitioning and correlated prefetching on I/O performance when two different non-volatile storage media are integrated into a computing system. First, we consider partitioning data blocks between two devices, where one device is significantly faster than the other. We assume that significantly faster performance also implies a significantly smaller capacity. Clearly not all data can be stored or cached in the faster device. Second, to improve performance of the slower device, we investigate if correlation-directed prefetching (CDP) may offer significant benefits. Although CDP has been studied previously, we look into some special aspects of it. We analyze how different block correlation analysis heuristics affect the performance of CDP.We developed a simulator to study the effect of the different techniques when using devices with differing characteristics. Our results show that data partitioning can significantly improve storage system performance. For a hard disk and solid-state drive based system, we achieved 2--92% improvement for different traces. We also show that data partitioning based on application long-range block access patterns performs significantly better than caching temporal locality of references.To evaluate the hybrid system in real world settings, we present a case study, a prototype data block manager for Linux-based systems that permits data to be partitioned across an SSD and an HDD. This partitioning is transparent to the file system and the block manager can also trigger data prefetches when there is high correlation between data block accesses.

View record

Approximation algorithms for task allocation with QoS and energy considerations (2011)

We consider general problems of allocating tasks to processors where eachtask is associated with a set of service classes. A service class is a tuple:the first element represents the resource utilization and the second elementthe quality of service associated with that resource utilization. Before allocatinga task to a processor, we need to assign it to a service class. Weconsider some elementary problems that arise from this setting. What is theminimum number of processors needed if we also need to attain a minimumaggregate QoS level? Given a fixed set of processors, what is the maximumattainable aggregate QoS? Such questions apply to the partitioned schedulingof real-time tasks on multiprocessors and to other resource allocationproblems. The basic questions are NP-Hard, and we present polynomialtime approximation algorithms for certain special, but practical, cases. Wemake interesting use of maximum flows in a bipartite graph while developingthe polynomial time approximation schemes. We then integrate energy expenditureto the model above and address the problem of partitioning a setof independent, periodic, real-time tasks over a fixed set of heterogeneousprocessors while minimizing the energy consumption of the computing platformsubject to a guaranteed quality of service requirement. This problemis NP-Hard and we present a fully polynomial time approximation schemefor this problem. The main contribution of our work is in tackling the problemin a completely discrete, and possibly arbitrarily structured, setting. Inother words, each processor has a discrete set of speed choices. Each taskhas a computation time that is dependent on the processor that is chosento execute the task and on the speed at which that processor is operated. Further, the energy consumption of the system is dependent on the decisionsregarding task allocation and speed settings.

View record


If this is your researcher profile you can log in to the Faculty & Staff portal to update your details and provide recruitment preferences.


Learn about our faculties, research, and more than 300 programs in our 2021 Graduate Viewbook!