Alan Hu

Professor

Research Interests

formal verification
formal methods
model checking
software analysis
post-silicon validation
security

Relevant Degree Programs

 

Recruitment

Complete these steps before you reach out to a faculty member!

Check requirements
  • Familiarize yourself with program requirements. You want to learn as much as possible from the information available to you before you reach out to a faculty member. Be sure to visit the graduate degree program listing and program-specific websites.
  • Check whether the program requires you to seek commitment from a supervisor prior to submitting an application. For some programs this is an essential step while others match successful applicants with faculty members within the first year of study. This is either indicated in the program profile under "Requirements" or on the program website.
Focus your search
  • Identify specific faculty members who are conducting research in your area of interest.
  • Establish that your research interests align with the faculty member’s research interests.
    • Read up on the faculty members in the program and the research being conducted in the department.
    • Familiarize yourself with their work: read their recent publications and the theses/dissertations they have supervised. Be certain that their research is indeed what you are hoping to study.
Make a good impression
  • Compose an error-free and grammatically correct email addressed to your specifically targeted faculty member, and remember to use their correct titles.
    • Do not send non-specific, mass emails to everyone in the department hoping for a match.
    • Address the faculty members by name. Your contact should be genuine rather than generic.
  • Include a brief outline of your academic background, why you are interested in working with the faculty member, and what experience you could bring to the department. The supervision enquiry form guides you with targeted questions. Be sure to craft compelling answers to these questions.
  • Highlight your achievements and why you are a top student. Faculty members receive dozens of requests from prospective students, and you may have less than 30 seconds to pique someone's interest.
  • Demonstrate that you are familiar with their research:
    • Convey the specific ways you are a good fit for the program.
    • Convey the specific ways the program/lab/faculty member is a good fit for the research you are interested in/already conducting.
  • Be enthusiastic, but don’t overdo it.
Attend an information session

G+PS regularly offers virtual sessions that focus on admission requirements and procedures, as well as tips on how to improve your application.

 

Master's students
Doctoral students
Postdoctoral Fellows
Any time / year round

Graduate Student Supervision

Doctoral Student Supervision (Jan 2008 - May 2019)
SAT modulo monotonic theories (2017)

Satisfiability Modulo Theories (SMT) solvers are a class of efficient constraint solvers which form integral parts of many algorithms. Over the years, dozens of different Satisfiability Modulo Theories solvers have been developed, supporting dozens of different logics. However, there are still many important applications for which specialized SMT solvers have not yet been developed.

We develop a framework for easily building efficient SMT solvers for previously unsupported logics. Our techniques apply to a wide class of logics which we call monotonic theories, which include many important elements of graph theory and automata theory. Using this SAT Modulo Monotonic Theories framework, we created a new SMT solver, MonoSAT. We demonstrate that MonoSAT improves the state of the art across a wide body of applications, ranging from circuit layout and data center management to protocol synthesis - and even to video game design.
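To make the notion of a monotonic theory concrete: a predicate such as graph reachability is monotonic in the set of enabled edges, since enabling more edges can only turn the predicate from false to true, never the reverse. The Python sketch below is purely illustrative (it is not MonoSAT's actual API) and demonstrates that property on a tiny graph.

```python
# Illustrative sketch (not MonoSAT's API): graph reachability is a
# "monotonic" predicate -- enabling more edges can switch it from
# False to True, but never from True to False.  SMMT solvers exploit
# exactly this property to prune their search.

def reachable(edges, src, dst):
    """Plain graph search over the currently enabled edge set."""
    seen, stack = {src}, [src]
    while stack:
        u = stack.pop()
        for a, b in edges:
            if a == u and b not in seen:
                seen.add(b)
                stack.append(b)
    return dst in seen

small = {(0, 1)}              # only edge 0 -> 1 enabled
large = small | {(1, 2)}      # enable a superset of edges

assert not reachable(small, 0, 2)  # node 2 not reachable yet
assert reachable(large, 0, 2)      # enabling more edges can only help
```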


BackSpace: formal analysis for post-silicon debug (2012)

IC technology continues to closely follow Moore's Law, while the ability to verify designs lags behind. The International Technology Roadmap for Semiconductors (ITRS) predicts production of chips using 16nm technology as early as 2015, but the verification gap, i.e., advancements in verification technology not keeping up with advancements in design technology, also seems to be increasing at a fast pace. A recent study shows a drop of more than 10 percentage points in the rate of first-silicon success from 2002 through 2009. By 2007, more than 2/3 of chips had to be respun due to bugs. The increasing verification gap is to blame. Unfortunately, because more bugs are slipping into the fabricated chip, post-silicon debug is the only way to catch them.

Post-silicon debug is the problem of determining what's wrong when the fabricated chip of a new design behaves incorrectly. The focus of post-silicon debug is design errors, whereas traditional VLSI test focuses on random manufacturing defects on each fabricated chip. Post-silicon debug currently consumes more than half of the total verification schedule on typical large designs, and the problem is growing worse.

The general problem of post-silicon debug is broad and multi-faceted, spurring a diverse variety of research. In this thesis, I focus on one of the most fundamental tasks: getting an execution trace of on-chip signals for many cycles leading up to an observed bug or crash. Until such a trace is obtained, further debugging is essentially impossible, as there is no way to know what happened on the chip. However, the ever-increasing chip complexity, compounded with new features that add non-determinism, makes computing accurate traces extremely difficult. Thus, to address this issue, I present a novel post-silicon debug framework, which I call BackSpace. From theory to practice, I have methodically developed this framework, showing that BackSpace effectively computes accurate traces leading up to a crash state, has low cost (zero additional hardware overhead), and handles non-determinism. To support my claims, I demonstrated BackSpace with several industrial designs using simulation models, hardware prototypes, and actual silicon.
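The central loop described above, reconstructing the cycles that led up to a crash, can be pictured as repeated pre-image computation: start from the observed crash state and keep finding a predecessor state. The toy Python sketch below illustrates only that loop on a made-up explicit-state design; the actual BackSpace framework uses SAT-based pre-image computation over the RTL together with on-chip state signatures to resolve ambiguity.

```python
# Toy illustration of the "BackSpace" idea: repeatedly compute a
# predecessor of the current state until a trace of the desired
# length has been reconstructed.  The design here is just a tiny
# explicit 3-bit counter, invented for illustration.

def next_state(s):
    """3-bit counter; the 'crash' is observed at state 0."""
    return (s + 1) % 8

STATES = range(8)

def predecessors(t):
    return [s for s in STATES if next_state(s) == t]

def backspace(crash_state, depth):
    trace = [crash_state]
    while len(trace) <= depth:
        preds = predecessors(trace[0])
        if not preds:
            break                   # no predecessor: state unreachable
        trace.insert(0, preds[0])   # real tool resolves ambiguity
    return trace                    # ... with on-chip signatures

print(backspace(crash_state=0, depth=4))   # -> [4, 5, 6, 7, 0]
```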


Modular verification of shared-memory concurrent system software (2011)

Software is large, complex, and error-prone. According to the US National Institute of Standards and Technology, software bugs cost the US economy an estimated $60 billion each year. The trend in hardware design of switching to multi-core architectures makes software development even more complex. Cutting software development costs and ensuring higher reliability of software is of global interest and a grand challenge. This is especially true of the system software that is the foundation beneath all general-purpose application programs.

The verification of system software poses particular challenges: system software is typically written in a low-level programming language with dynamic memory allocation and pointer manipulation, and it is also highly concurrent, with shared-memory communication being the main concurrent programming paradigm. Available verification tools usually perform poorly when dealing with the aforementioned challenges. This thesis addresses these problems by enabling precise and scalable verification of low-level, shared-memory, concurrent programs. The main contributions concern the interrelated concepts of memory, modularity, and concurrency.

First, because programs use huge amounts of memory, the memory is usually modeled very imprecisely in order to scale to big programs. This imprecise modeling renders most tools almost useless in the memory-intensive parts of code. This thesis describes a scalable, yet precise, memory model that offers on-demand precision only when necessary.

Second, modularity is the key to scalability, but it often comes with a price --- a user must manually provide module specifications, making the verification process more tedious. This thesis proposes a lightweight technique for automatically inferring an important family of specifications to make the verification process more automatic.

Third, the number of program behaviors explodes in the presence of concurrency, greatly increasing the complexity of the verification task. This explosion is especially severe for shared-memory concurrency. The thesis presents a static context-bounded analysis that combines a number of techniques to successfully solve this problem.

We have implemented the above contributions in the verification tools developed as part of this thesis. We have applied the tools to real-life system software, and we are already finding critical, previously undiscovered bugs.
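A rough intuition for the context-bounded analysis mentioned in the third contribution: rather than exploring every interleaving of the threads, only schedules with at most a fixed number of context switches are considered. The Python sketch below is illustrative only, with made-up step counts, and shows how sharply the bound cuts down the number of schedules of two threads; a real tool explores program states symbolically rather than enumerating schedules like this.

```python
# Illustrative sketch of context bounding: enumerate interleavings of
# two threads, keeping only those with at most `bound` context switches.
from itertools import product

def schedules(steps_a, steps_b, bound):
    """Yield schedules (strings over 'A'/'B') with <= bound switches."""
    for order in product("AB", repeat=steps_a + steps_b):
        if order.count("A") != steps_a:
            continue
        switches = sum(1 for x, y in zip(order, order[1:]) if x != y)
        if switches <= bound:
            yield "".join(order)

unrestricted = list(schedules(4, 4, bound=8))   # all 70 interleavings
bounded      = list(schedules(4, 4, bound=2))   # far fewer to explore
print(len(unrestricted), len(bounded))
```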


Exploiting structure for scalable software verification (2008)

Software bugs are expensive. Recent estimates by the US National Institute of Standards and Technology claim that the cost of software bugs to the US economy alone is approximately 60 billion USD annually. As society becomes increasingly software-dependent, bugs also reduce our productivity and threaten our safety and security. Decreasing these direct and indirect costs represents a significant research challenge as well as an opportunity for businesses.

Automatic software bug-finding and verification tools have the potential to revolutionize the software engineering industry by improving reliability and decreasing development costs. Since software analysis is in general undecidable, automatic tools have to use various abstractions to make the analysis computationally tractable. Abstraction is a double-edged sword: coarse abstractions, in general, yield easier verification, but also less precise results.

This thesis focuses on exploiting the structure of software to abstract away irrelevant behavior. Programmers tend to organize code into objects and functions, which effectively represent natural abstraction boundaries. Humans use such structural abstractions to simplify their mental models of software and to construct informal explanations of why a piece of code should work. A natural question to ask is: how can automatic bug-finding tools exploit the same natural abstractions? This thesis offers possible answers.

More specifically, I present three novel ways to exploit structure at three different steps of the software analysis process. First, I show how symbolic execution can preserve the data-flow dependencies of the original code while constructing compact symbolic representations of programs. Second, I propose structural abstraction, which exploits the structure preserved by the symbolic execution. Structural abstraction solves a long-standing open problem --- scalable interprocedural path- and context-sensitive program analysis. Finally, I present an automatic tuning approach that exploits the fine-grained structural properties of software (namely, data- and control-dependency) for faster property checking. This novel approach resulted in a 500-fold speedup over the best previous techniques. Automatic tuning not only redefined the limits of automatic software analysis tools, but has also already found its way into other domains (like model checking), demonstrating the generality and applicability of this idea.
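A back-of-the-envelope illustration of why function-level structure pays off for path-sensitive analysis: inlining a callee at every call site multiplies path counts, whereas summarizing the callee once keeps the counts roughly additive. The numbers in this Python sketch are invented purely for illustration and do not come from the thesis.

```python
# Back-of-the-envelope illustration: path counts with full inlining
# versus analyzing a callee once and reusing a summary at each call
# site.  All numbers are made up.
callee_branches = 5          # callee alone has 2**5 = 32 paths
caller_branches = 3
call_sites      = 4

inlined_paths = (2 ** caller_branches) * (2 ** callee_branches) ** call_sites
modular_paths = 2 ** caller_branches + 2 ** callee_branches  # summarize once

print(f"full inlining : {inlined_paths:,} paths")
print(f"with summaries: {modular_paths:,} paths")
```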


Master's Student Supervision (2010 - 2018)
Precisely quantifying software information flow (2016)

A common attack point in a program is the input exposed to the user. The adversary crafts a malicious input that alters some internal state of the program, in order to acquire sensitive data, or gain control of the program's execution. One can say that the input exerts a degree of influence over specific program outputs. Although a low degree of influence does not guarantee the program's resistance to attacks, previous work has argued that a greater degree of influence tends to provide an adversary with an easier avenue of attack, indicating a potential security vulnerability.

Quantitative information flow is a framework that has been used to detect a class of security flaws in programs, by measuring an attacker's influence. Programs may be considered as communication channels between program inputs and outputs, and information-theoretic definitions of information leakage may be used in order to measure the degree of influence which a program's inputs can have over its outputs, if the inputs are allowed to vary. Unfortunately, the precise information flow measured by this definition is difficult to compute, and prior work has sacrificed precision, scalability, and/or automation.

In this thesis, I show how to compute this information flow (specifically, channel capacity) in a highly precise and automatic manner, and scale to much larger programs than previously possible. I present a tool, nsqflow, that is built on recent advances in symbolic execution and SAT solving. I use this tool to discover two previously-unknown buffer overflows. Experimentally, I demonstrate that this approach can scale to over 10K lines of real C code, including code that is typically difficult for program analysis tools to analyze, such as code using pointers.
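As a concrete, deliberately tiny example of the channel-capacity measure described above: for a deterministic program, the capacity from input to output is log2 of the number of distinct outputs the input can induce. The sketch below computes it by brute-force enumeration over a toy program invented for illustration; nsqflow computes the same quantity symbolically so it can scale far beyond what enumeration allows.

```python
# Brute-force illustration of channel capacity for a deterministic
# program: log2 of the number of distinct outputs the attacker can
# induce by varying the input.
from math import log2

def program(x):
    """Toy program: only the low 3 bits of the input influence output."""
    return (x & 0b111) * 2

outputs = {program(x) for x in range(256)}   # try every 8-bit input
print(f"{len(outputs)} distinct outputs -> "
      f"{log2(len(outputs)):.1f} bits of influence")
```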


Conflict-driven symbolic execution: how to learn to get better (2014)

Due to software complexity, manual and automatic testing are not enough to guarantee the correct behavior of software. One alternative that addresses this limitation is known as symbolic execution. Symbolic execution is a formal verification method that simulates software execution using symbolic values instead of concrete ones. The execution starts with all input variables unconstrained, and assignments that use any input variable are encoded as logical expressions. Whenever a branch is reached, the symbolic execution engine checks which values the branch condition can assume. If more than one valid evaluation is possible, the execution forks, and a new process is created for each possibility.

In cases where the program execution is finite, symbolic execution is complete, and potentially executes every reachable program path. However, the number of paths is exponential in the number of branches in the program, and this approach suffers from a problem known as path explosion.

This thesis presents a novel algorithm that can dynamically reduce the number of paths explored during symbolic execution in order to prove a given set of properties. The algorithm is capable of learning from conflicts detected while symbolically executing a path. I have named this algorithm Conflict-Driven Symbolic Execution (CDSE), since it was inspired by the conflict-driven clause learning (CDCL) insights introduced by modern Boolean satisfiability solvers. The proposed algorithm takes advantage of two features responsible for the success of CDCL solvers: conflict analysis and non-chronological backtracking. In a nutshell, CDSE prunes the search space every time a certain branch is proven infeasible by learning the reason why there is a conflict.

In order to assess the proposed algorithm, this thesis presents a proof-of-concept CDSE tool named Kite, and compares its performance to the state-of-the-art symbolic execution tool Klee. The results are encouraging, and present practical evidence that conflict-driven symbolic execution can perform better than regular symbolic execution.
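The CDCL analogy can be made concrete with a toy search: when a partial path is found infeasible, remember the small set of branch decisions responsible (the "conflict") and skip every future path that contains the same combination. The Python sketch below is a deliberately simplified illustration of that pruning idea, not Kite's implementation; the "solver" is hard-coded.

```python
# Toy sketch of conflict-driven pruning during path exploration.
# When a partial path is infeasible, remember the (small) set of
# decisions responsible and skip every future path containing it.
from itertools import product

learned = []          # sets of decisions known to be jointly infeasible

def infeasible_core(path):
    """Pretend constraint solver: decisions b1=True and b3=False conflict."""
    d = dict(path)
    if d.get("b1") is True and d.get("b3") is False:
        return {("b1", True), ("b3", False)}
    return None

explored = 0
for bits in product([True, False], repeat=4):
    path = list(zip(["b1", "b2", "b3", "b4"], bits))
    if any(core <= set(path) for core in learned):
        continue                      # pruned by a learned conflict
    explored += 1
    core = infeasible_core(path)
    if core and core not in learned:
        learned.append(core)          # learn the reason for the conflict

print(f"explored {explored} of 16 candidate paths")
```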


Post-silicon code coverage for functional verification of systems-on-chip (2012)

Post-silicon validation requires effective techniques to better evaluate the functional correctness of modern systems-on-chip. Coverage is the standard measure for validation effectiveness and is extensively used pre-silicon. However, there is little data evaluating the coverage of post-silicon validation efforts on industrial-scale designs. This thesis addresses this knowledge gap. We employ code coverage, which is one of the most frequently used coverage techniques in simulation, and apply it post-silicon.

To show our coverage methodology in practice, we employ an industrial-size open-source SoC that is based on the SPARC architecture and is synthesizable to FPGA. We instrument code coverage in a number of IP cores and boot Linux as our experiment to evaluate coverage --- booting an OS is a typical industrial post-silicon test. We also compare coverage between pre-silicon directed tests and the post-silicon Linux boot. Our results show that in some blocks, the pre-silicon and post-silicon tests can achieve markedly different coverage figures --- in one block we measured over a 50 percentage point coverage difference between the pre- and post-silicon results, which signifies the importance of post-silicon coverage. Moreover, we calculate the area overhead imposed by the additional coverage circuitry on-chip. We apply state-of-the-art software analysis techniques to reduce the excessively large overhead yet preserve data accuracy. The results in this thesis provide valuable data to guide future research in post-silicon coverage.
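The pre- versus post-silicon comparison described above boils down to set operations over the statements each run exercises. A minimal sketch, with statement identifiers and numbers made up purely for illustration:

```python
# Sketch of comparing code coverage between a pre-silicon directed
# test and a post-silicon run (e.g. an OS boot).  All identifiers
# and numbers are invented for illustration.
pre_silicon  = {"mmu.v:12", "mmu.v:13", "cache.v:40", "cache.v:41"}
post_silicon = {"mmu.v:12", "cache.v:40", "cache.v:41", "cache.v:77"}

universe = pre_silicon | post_silicon
for name, covered in [("pre-silicon", pre_silicon),
                      ("post-silicon", post_silicon)]:
    print(f"{name:13s}: {100 * len(covered) / len(universe):.0f}% "
          f"of exercised statements")
print("only post-silicon:", sorted(post_silicon - pre_silicon))
```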


 
 
