Wyeth Wasserman

Professor

Research Interests

Creation of computational methods for the analysis of genome sequences (bioinformatics)
Study of cis-regulatory elements controlling gene transcription
Applied analyses of genome sequences (genomics)
Indigenous genomics


Research Options

I am available and interested in collaborations (e.g. clusters, grants).
I am interested in and conduct interdisciplinary research.
I am interested in working with undergraduate students on research projects.
 
 

Research Methodology

bioinformatics

Recruitment

Postdoctoral Fellows
Any time / year round

Gene Regulation

Amongst the most important challenges of this era of life science research is understanding the regulation of gene expression, a process that allows an incredible diversity of cells to be produced from the same genome sequence. During development and across physiological conditions, a set of proteins called transcription factors (TFs) interact with the genome to control the activity of genes. The roughly 1,500 TFs encoded in the human genome cooperate in different combinations and interact with other regulatory processes. The lab studies gene regulation along three lines. First, the lab creates novel algorithms and software to predict interactions between TFs and DNA. Second, the lab collaborates on the analysis of emerging types of data to identify active regulatory regions (e.g. enhancer or promoter regions in the genome) in specific biological processes, such as the transition from stem cells into differentiated cells. Third, the lab designs compact DNA sequences, based on regulatory regions in the human genome, to direct gene expression from virus-based gene therapy vectors.

Genome Analysis

Genome sequencing has accelerated health research, particularly disease genetics. The lab has been developing computational methods and tools that allow researchers and clinicians to identify the functional consequences of genetic variation within the human genome, in both the protein-coding and the non-coding space. The latter effort is fueled by the gene regulation bioinformatics research in the lab.

Working with patients and clinicians, both locally through BC Children’s Hospital and through international collaborations, we use genomics analyses to enable the diagnosis, and in some cases treatment, of previously undiagnosed cases. As DNA sequencing technology has revolutionized the diagnosis and management of rare genetic disorders, the Wasserman lab has embarked on an endeavour to make the technology available to currently underrepresented populations, namely the Indigenous populations of Canada. Learn more about the Silent Genomes Project.

Join Us!

We are always looking for curious individuals with a talent for computing, genomics and gene regulation. Feel free to contact us to explore matching interests.

Postdoctoral fellows

SILENT GENOMES PROJECT

For the amazing Silent Genomes Project, we need a post-doc with an interest in equitable access to genome medicine. The project creates resources in partnership with Canada's Indigenous communities that positively impact clinical genetics and empower choice.

GENE REGULATION

We are developing new approaches based on deep learning. Ideally, candidates will have experience with machine learning methods, but candidates with experience across the life sciences who have demonstrated a strong commitment to developing programming skills are encouraged to apply.

Graduate students

The lab is not presently seeking graduate students. We do review applications and would consider exceptional candidates at any time. However, we do not currently anticipate taking on new students until 2023. When we do take on students, most pursue their training within the UBC Bioinformatics Graduate Program.

Undergraduate students

We periodically welcome UBC Work-Learn students, co-op students from across Canada, and UBC or SFU students conducting undergraduate thesis studies.

Other positions

No other positions are currently posted. 

Notice for Potential Applicants

Our team is constantly changing. The students and post-docs in the group have historically done well, with alumni working in both industry and academia. We take pride in teamwork and maintaining a positive research environment. Opportunities are always available for exceptional students and post-docs. Computer programming skills are essential: we work in a Linux environment and develop our own software (primarily in Python).

I am interested in supervising students to conduct interdisciplinary research.

Complete these steps before you reach out to a faculty member!

Check requirements
  • Familiarize yourself with program requirements. You want to learn as much as possible from the information available to you before you reach out to a faculty member. Be sure to visit the graduate degree program listing and program-specific websites.
  • Check whether the program requires you to seek commitment from a supervisor prior to submitting an application. For some programs this is an essential step while others match successful applicants with faculty members within the first year of study. This is either indicated in the program profile under "Admission Information & Requirements" - "Prepare Application" - "Supervision" or on the program website.
Focus your search
  • Identify specific faculty members who are conducting research in your specific area of interest.
  • Establish that your research interests align with the faculty member’s research interests.
    • Read up on the faculty members in the program and the research being conducted in the department.
    • Familiarize yourself with their work, read their recent publications and past theses/dissertations that they supervised. Be certain that their research is indeed what you are hoping to study.
Make a good impression
  • Compose an error-free and grammatically correct email addressed to your specifically targeted faculty member, and remember to use their correct titles.
    • Do not send non-specific, mass emails to everyone in the department hoping for a match.
    • Address the faculty members by name. Your contact should be genuine rather than generic.
  • Include a brief outline of your academic background, why you are interested in working with the faculty member, and what experience you could bring to the department. The supervision enquiry form guides you with targeted questions. Be sure to craft compelling answers to these questions.
  • Highlight your achievements and why you are a top student. Faculty members receive dozens of requests from prospective students and you may have less than 30 seconds to pique someone’s interest.
  • Demonstrate that you are familiar with their research:
    • Convey the specific ways you are a good fit for the program.
    • Convey the specific ways the program/lab/faculty member is a good fit for the research you are interested in/already conducting.
  • Be enthusiastic, but don’t overdo it.
Attend an information session

G+PS regularly provides virtual sessions that focus on admission requirements and procedures, and offer tips on how to improve your application.

 

Graduate Student Supervision

Doctoral Student Supervision (Jan 2008 - Nov 2020)
Expanding the utility of whole genome sequencing in the diagnosis of rare genetic disorders (2020)

The emergence of whole genome sequencing (WGS) has revolutionized the diagnosis of rare genetic disorders, advancing the capacity to identify the “causal” gene responsible for disease phenotypes. In a single assay, many classes of genomic variants can be detected, from small single nucleotide changes to large insertions, deletions and duplications. While WGS has enabled a significant increase in the diagnostic rate compared to previous assays, at least 50% of cases remain unsolved. The lack of a diagnosis is the result of limitations both in variant calling and in variant interpretation. As the field of genomic medicine continues to advance, the emergence of novel bioinformatic approaches to variant calling and interpretation heralds promise for the future of undiagnosed cases. In the applied setting, innovation is driven by anecdotes of complex diagnoses, which in turn lead to the development of novel tools and approaches. This is a key theme within this thesis work, where in-depth analysis of a single undiagnosed case leads to an appreciation for a challenging class of variants, short tandem repeats, which in turn leads to the development of novel software for detecting these variants in WGS data. Following the anecdote and novel tool development came an appreciation for the role of simulation, both in enabling the development and in the uptake of bioinformatic innovation for diagnostic analysis pipelines. This appreciation led to the development of a rare disease scenario simulator, which can simulate complex variants in multiple inheritance patterns to emulate challenging cases. Lastly, appreciating the limitations of the linear reference genome, I develop a framework for detecting the presence of user-specified sequences within unmapped read sets. This flexible framework can reproduce microarray-like coverage profiles, and genotype SNPs to identify ancestry and sex, which can inform the choice of personalized reference genomes in emergent analysis pipelines.
Together, the novel short tandem repeat discovery, bioinformatic innovation, and increased capacity to simulate rare disease cases, expand the utility of whole genome sequencing in the diagnosis of rare genetic diseases.


Development of human-computer interactive approaches for rare disease genomics (2019)

Clinical genome sequencing is becoming a tool for standard clinical practice. Many studies have presented sequencing as effective for both diagnosing and informing the management of genetic diseases. However, the task of finding the causal variant(s) of a rare genetic disease within an individual is often difficult due to the large number of identified variants and lack of direct evidence of causality. Current computational solutions harness existing genetic knowledge in order to infer the pathogenicity of the variant(s), as well as filter those unlikely to be pathogenic. Such methods can bring focus to a compact set (less than hundreds) of variants. However, they are not sufficient to interpret causality of variants for patient phenotypes; interpretation involves expert examination and synthesis of complex evidence, clinical knowledge, and experience. To accelerate interpretation and avoid diagnostic delay, computational methods are emerging for automated prioritization that capture, translate, and exploit clinical knowledge. While automation provides efficiency, it does not replace the expert-driven interpretation process. Moreover, knowledge and experience of human experts can be challenging to fully encode computationally.
This thesis, therefore, explores an alternative space between expert-driven and computer-driven solutions, where human expertise is deeply embedded within computer-assisted analytic and diagnostic processes via facilitated human-computer interactions. First, clinical experts and their work environment were observed via collaborations in an interdisciplinary exome analysis project as well as in a clinical resource development project. From these observations, we identified two elements of human-computer interaction: characteristic cognitive processes underlying the diagnostic process and information visualization. Exploiting these findings, we designed and evaluated an interactive variant interpretation strategy that augments cognitive processes of clinical experts. We found that this strategy could expedite variant interpretation. We then qualitatively assessed current information visualization practices during clinical exome and genome analyses. Based on the findings of this assessment, we formulated design requirements that can enhance visual interpretation of complex genetic evidence. In summary, this research highlights the synergistic utility of human-computer interaction in clinical exome and genome analyses for rare genetic diagnoses. Furthermore, it exemplifies the importance of empowering the skills of human experts in digital medicine.


Revealing the impact of sequence variants on transcription factor binding and gene expression (2017)

Transcription factors (TFs) can bind to specific regulatory regions to control the expression of target genes. Disruption of TF binding is regarded as one of the key mechanisms by which regulatory variants could act to cause disease. However, predicting the functional impact of variants on TF binding remains a major challenge for the field, standing as a key obstacle to achieving the potential of clinical genome analysis. This thesis confronts this challenge from a bioinformatics perspective and addresses two unresolved problems. The first problem is the determination of which genetic variants alter TF binding. Only a small number of allele-specific binding (ASB) events, in which TFs preferentially bind to one of two alleles at heterozygous sites in the genome, have been determined. To study the impact of variants on TF binding, access to a large, gold standard collection of ASB events could facilitate the development of new predictive methods. In Chapter 2, we implemented a pipeline to identify ASB events from ChIP-seq data and applied it to produce one of the largest ASB datasets. We found that ASB events were associated with allelic alterations of TF motifs, chromatin accessibility and histone modifications. Using the available features, classifiers were trained to predict the impact of variants on TF binding. To improve ASB calling, Chapter 3 evaluated five statistical methods, ultimately supporting a method that pooled ChIP-seq replicates and utilized a binomial distribution to model allelic read counts.
The second problem is to determine how altered TF binding events impact the expression of target genes. In Chapter 4, we implemented regression-based models to predict gene expression changes based on altered TF binding events across 358 individuals. The models showed predictive capacity for 19.2% of genes, and the key TF binding events in the model provided mechanistic insights as to how these regulatory variants alter gene expression.
In summary, this thesis both generated the largest, high-quality collection of ASB events, and developed algorithms to predict variant impact on TF binding and gene expression. The presented work advances the capacity of the field to interpret regulatory variants and will facilitate future clinical genome analysis.
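
The abstract above describes pooling ChIP-seq replicates and modelling allelic read counts with a binomial distribution. The following is only a minimal sketch of that general idea, not the thesis pipeline: the function names, the two-sided test, the expected reference-allele fraction of 0.5, and the example read counts are all assumptions made for illustration.

```python
from math import exp, lgamma, log


def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes in n trials (computed in log space)."""
    log_coeff = lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
    return exp(log_coeff + k * log(p) + (n - k) * log(1.0 - p))


def two_sided_binomial_pvalue(ref_reads: int, alt_reads: int, p_ref: float = 0.5) -> float:
    """Two-sided exact binomial test of allelic imbalance at one heterozygous site.

    Sums the probability of every outcome that is no more likely than the
    observed reference-allele count. p_ref is the assumed null fraction of
    reads carrying the reference allele and must lie strictly between 0 and 1.
    """
    n = ref_reads + alt_reads
    if n == 0:
        return 1.0  # no reads, no evidence of imbalance
    probs = [binomial_pmf(k, n, p_ref) for k in range(n + 1)]
    observed = probs[ref_reads]
    # Small relative tolerance so that ties are not lost to rounding.
    total = sum(q for q in probs if q <= observed * (1.0 + 1e-7))
    return min(1.0, total)


def pooled_asb_pvalue(replicate_counts, p_ref: float = 0.5) -> float:
    """Pool (ref, alt) read counts across replicates, then test once."""
    ref_total = sum(ref for ref, _ in replicate_counts)
    alt_total = sum(alt for _, alt in replicate_counts)
    return two_sided_binomial_pvalue(ref_total, alt_total, p_ref)


if __name__ == "__main__":
    # Hypothetical read counts for one heterozygous site in three replicates.
    counts = [(18, 6), (22, 9), (15, 4)]
    print(f"pooled p-value: {pooled_asb_pvalue(counts):.3g}")
```

In practice a site-level p-value like this would be one of many, so a multiple-testing correction would normally follow. Reference mapping bias is one reason the null fraction may differ from 0.5.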


Development and Evaluation of Software for Applied Clinical Genomics (2016)

High-throughput next-generation DNA sequencing has evolved rapidly over the past 20 years. The Human Genome Project published its first draft of the human genome in 2000 at an enormous cost of 3 billion dollars, and was an international collaborative effort that spanned more than a decade. Subsequent technological innovations have decreased that cost by six orders of magnitude down to a thousand dollars, while throughput has increased by over 100 times to a current delivery of gigabases of data per run. In bioinformatics, significant efforts to capitalize on the new capacities have produced software for the identification of deviations from the reference sequence, including single nucleotide variants, short insertions/deletions, and more complex chromosomal characteristics such as copy number variations and translocations. Clinically, hospitals are starting to incorporate sequencing technology as part of exploratory projects to discover underlying causes of diseases with suspected genetic etiology, and to provide personalized clinical decision support based on patients’ genetic predispositions. As with any new large-scale data, a need has emerged for mechanisms to translate knowledge from computationally oriented informatics specialists to the clinically oriented users who interact with it. In the genomics field, the complexity of the data, combined with the gap in perspectives and skills between computational biologists and clinicians, presents an unsolved grand challenge for bioinformaticians to translate patient genomic information to facilitate clinical decision-making. This doctoral thesis focuses on a comparative design analysis of clinical decision support systems and prototypes interacting with patient genomes under various sectors of healthcare to ultimately improve the treatment and well-being of patients.
Through a combination of usability methodologies across multiple distinct clinical user groups, the thesis highlights recurring domain-specific challenges and introduces ways to overcome the roadblocks for translation of next-generation sequencing from research laboratory to a multidisciplinary hospital environment. To improve the interpretation efficiency of patient genomes and informed by the design analysis findings, a novel computational approach to prioritize exome variants based on automated appraisal of patient phenotypes is introduced. Finally, the thesis research incorporates applied genome analysis via clinical collaborations to inform interface design and enable mastery of genome analysis.


Improving the Detection of Transcription Factor Binding Regions (2015)

The identification of non-coding regulatory elements in the genome has been the focus of much experimental and computational effort. However, both experimental data, such as ChIP-seq, and computational methods of transcription factor (TF) binding predictions suffer from a degree of non-specificity. ChIP-seq experiments report regions that don’t contain the expected canonical motif for the ChIPped TF, which may arise from indirect binding or a non-TF-specific mechanism. Computational predictions based on sequence-level information alone are plagued by false positives. This thesis explores computational approaches to improve both the interpretation of large-scale TF binding data, and the detection of TF binding regions.
In Chapters 2 and 3 we observe that experimentally defined regulatory regions of the human genome are a mixture of sub-groups reflecting distinct properties. On average a third of a ChIP-seq dataset does not contain the targeted TF’s motif, and within this subset up to 45% of the ChIP-seq peaks are unexpectedly enriched for a small class of non-targeted TFs’ motifs. Many of these regions are not specific to a TF but are ChIPped by multiple diverse TFs across multiple cell types. These recurring regions tend to be the lower scoring peaks of a dataset, are less likely to reproduce between experimental replicates, and tend to associate with cohesin and polycomb protein occupied positions in the genome. The regulatory regions with a greater specificity for a TF do not share these properties. Based on these observations we suggest a TF ‘loading-zone’ model to account for the presence of the aforementioned recurrent regions in ChIP-seq data. In Chapter 4 we further explore the regulatory region subgroups with a biophysical simulator of TF occupancy (tfOS). Within tfOS we have incorporated TF-DNA interaction energies, TF search mechanics, cooperative TF interactions, and sequence accessibility data into the model. Simulations with tfOS across sequences reveal distinct features associated with recurrent and non-recurrent regions described in Chapter 3. The research presented has improved our understanding and interpretation of large-scale TF binding data and advanced our understanding of TF regulatory regions, leading to improved annotation and interpretation of the human genome. Supplementary video material is available at: http://hdl.handle.net/2429/51447


Inferring Novel Relationships through Over-Representation Analysis of Medical Subjects in Biomedical Bibliographies (2012)

MEDLINE®/PubMed® is a richly annotated resource of over 21 million article citations, currently growing at a rate of over 600,000 citations annually. One grand challenge of bioinformatics is analysing the extensive literature for a biomedical entity such as a gene or disease. This thesis explores using over-representation to extract pertinent biomedical annotation from the research articles for an entity. The quantitative profiles generated are compared to predict novel associations between entities.
Medical Subject Heading Over-representation Profiles (MeSHOPs) are constructed from the primary literature of an entity of interest. Medical subject annotations for each article are extracted. Statistical tests evaluate the significance of each term’s frequency across the set of articles, compared against an appropriate background set. The resulting MeSHOP is composed of each term and corresponding enrichment p-value. MeSHOPs can be computed for any entity with an associated bibliography of PubMed articles. We evaluate the predictive performance of quantitatively comparing MeSHOPs to discover novel associations between gene and disease entities, achieving up to 16% improvement in accuracy compared to gene or disease baseline features (measured as increased Receiver Operating Characteristic Area Under the Curve). The level of literature annotation was seen to strongly bias the predictive performance for future gene-disease associations. We observe similar results in a parallel analysis of associations between drugs and disease.
Efficiently identifying authors with similar research interests is a challenge in science. During the peer review process, authors seek scientists with similar expertise. MeSHOPs are generated for individual authors, identifying their research foci. Extending the methods to allow comparison across large sets of entities, overlapping research interests between researchers were identified. The predictive performance was evaluated for capacity to identify authors working in the same research domains. Biomedical annotation analysis of primary literature provides insight into the areas of research focus, and is demonstrated to link entities through similarities in their MeSHOPs. We quantitatively confirm the trend where well-studied genes, diseases and drugs are more likely to be the focus of further research. MeSHOP analysis demonstrates that knowledge in the annotated primary literature can be efficiently mined, and the untapped knowledge therein can be discovered computationally.
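
The MeSHOP construction described above (term counts for an entity's articles, tested against a background set, yielding one enrichment p-value per term) can be illustrated with a small sketch. The abstract does not name the statistical test, so the one-sided hypergeometric test used here is an assumption, as are the function names and the toy data. The sketch also assumes that the entity's articles are a subset of the background articles.

```python
from collections import Counter
from math import comb


def enrichment_pvalue(k: int, n: int, K: int, N: int) -> float:
    """One-sided hypergeometric p-value P(X >= k).

    N: articles in the background, K: background articles carrying the term,
    n: articles for the entity, k: entity articles carrying the term.
    """
    upper = min(n, K)
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / comb(N, n)


def build_meshop(entity_articles, background_articles):
    """Map each term seen in the entity's articles to an enrichment p-value.

    Each article is represented as a collection of subject-heading terms.
    """
    N = len(background_articles)
    n = len(entity_articles)
    background_counts = Counter(t for art in background_articles for t in set(art))
    entity_counts = Counter(t for art in entity_articles for t in set(art))
    return {
        term: enrichment_pvalue(k, n, background_counts[term], N)
        for term, k in entity_counts.items()
    }


if __name__ == "__main__":
    # Toy data: ten background articles, two of which belong to the entity.
    background = [{"A", "B"}] * 3 + [{"B"}] * 7
    entity = background[:2]
    for term, p in sorted(build_meshop(entity, background).items()):
        print(term, round(p, 4))
```

Exact integer arithmetic keeps the sketch dependency-free. For a literature-scale background, a library implementation of the hypergeometric distribution would likely be the more practical choice.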


Evolutionary conserved regulatory programs (2011)

No abstract available.

Computational Prediction of Regulatory Element Combinations and Transcription Factor Cooperativity (2010)

No abstract available.

Master's Student Supervision (2010 - 2018)
Bioinformatics design of cis-regulatory elements controlling human gene expression (2017)

Gene therapy has the potential to not only treat, but cure individuals suffering from inherited diseases. Advances in understanding the human genome and the discovery of causal genes underlying diseases have heightened the need to solve the gene therapy challenge. Viral vectors are often used as a delivery tool for therapeutics, but their safety and efficacy are still being studied. To contribute to this goal, we have created 49 small viral promoters by bioinformatically annotating cis-regulatory regions from which a subset are concatenated with the goal of driving cell-specific expression of a reporter gene. We have tested a subset of these in mice in vivo. Regulatory region analysis can take a trained designer multiple weeks. To resolve this issue, we have created a semi-automated approach to regulatory region identification, named OnTarget. The OnTarget database accumulates thousands of cell and tissue-specific experiments in order to identify regions informative of regulatory properties. OnTarget is able to identify regulatory regions consistent with those identified by designers. In this capacity, we expect OnTarget to lead to better and faster identification of cis-regulatory regions for the design of promoters targeting specific sets of cells.


Text based methods for variant prioritization (2017)

Despite improvements in sequencing technologies, DNA sequence variant interpretation for rare genetic diseases remains challenging. In a typical workflow for the Treatable Intellectual Disability Endeavor in B.C. (TIDE BC), a geneticist examines variant calls to establish a set of candidate variants that explain a patient's phenotype. Even with a sophisticated computational pipeline for variant prioritization, the geneticist may need to consider hundreds of variants. This typically involves literature searches on individual variants to determine how well they explain the reported phenotype, which is a time-consuming process. In this work, text-analysis-based variant prioritization methods are developed and assessed for the capacity to distinguish causal variants within exome analysis results for a reference set of individuals with metabolic disorders.

