Relevant Thesis-Based Degree Programs
Affiliations to Research Centres, Institutes & Clusters
Amongst the most important challenges of this era of life science research is understanding the regulation of gene expression, a process that allows an incredible diversity of cells to be produced from the same genome sequence. During development and across physiological conditions, a set of proteins, called Transcription Factors (TFs), interact with the genome to control the activity of genes. The roughly 1,500 TFs in the human genome cooperate in different combinations and interact with other regulatory processes. The lab studies gene regulation along multiple lines. First, the lab creates novel algorithms and software to predict interactions between TFs and DNA. Second, the lab collaborates on the analysis of emerging types of data, to identify active regulatory regions (e.g. enhancer or promoter regions in the genome) in specific biological processes, such as the transition from stem cells into differentiated cells. Third, the lab designs compact DNA sequences, based on regulatory regions in the human genome, to direct gene expression from virus-based gene therapy vectors.
Genome Sequencing has accelerated health research, particularly disease genetics. The lab has been developing computational methods and tools to allow researchers and clinicians to identify functional consequences of genetic variations within the human genome, both in the protein coding and in the non-coding space. The latter effort is fueled by the gene regulation bioinformatics research in the lab.
We engage with patients and clinicians, both locally through BC Children’s Hospital and through international collaborations, and our genomics analyses enable the diagnosis, and in some cases treatment, of previously undiagnosed cases. As DNA sequencing technology has revolutionized the diagnosis and management of rare genetic disorders, the Wasserman lab has embarked on an endeavour to make the technology available to currently underrepresented populations, namely the Indigenous populations of Canada. Learn more about the Silent Genomes Project.
We are always looking for curious individuals with a talent in computing, genomics and gene regulation. Feel free to contact us to explore matching interests.
SILENT GENOMES PROJECT
For the Silent Genomes Project, we are seeking a post-doc with an interest in equitable access to genome medicine. The project creates resources in partnership with Canada's Indigenous communities that positively impact clinical genetics and empower choice.
We are developing new approaches based on Deep Learning. Ideally candidates will have experience with machine learning methods, but candidates with experience across the life sciences who have demonstrated a strong commitment to developing programming skills are encouraged to apply.
The lab is not presently seeking graduate students. We do review applications and would consider exceptional candidates at any time. However, we do not currently anticipate taking on new students until 2023. When we do take on students, most pursue their training within the UBC Bioinformatics Graduate Program.
We periodically welcome UBC Work-Learn students, coop students from across Canada, and UBC or SFU students conducting undergraduate thesis studies.
No other positions are currently posted.
Notice for Potential Applicants
Our team is constantly changing. The students and post-docs in the group have historically done well, with alumni working in both industry and academia. We take pride in teamwork and maintaining a positive research environment. Opportunities are always available for exceptional students and post-docs. Computer programming skills are essential; we work in a Linux environment and develop our own software (primarily in Python).
Complete these steps before you reach out to a faculty member!
- Familiarize yourself with program requirements. You want to learn as much as possible from the information available to you before you reach out to a faculty member. Be sure to visit the graduate degree program listing and program-specific websites.
- Check whether the program requires you to seek commitment from a supervisor prior to submitting an application. For some programs this is an essential step while others match successful applicants with faculty members within the first year of study. This is either indicated in the program profile under "Admission Information & Requirements" - "Prepare Application" - "Supervision" or on the program website.
- Identify specific faculty members who are conducting research in your specific area of interest.
- Establish that your research interests align with the faculty member’s research interests.
- Read up on the faculty members in the program and the research being conducted in the department.
- Familiarize yourself with their work, read their recent publications and past theses/dissertations that they supervised. Be certain that their research is indeed what you are hoping to study.
- Compose an error-free and grammatically correct email addressed to your specifically targeted faculty member, and remember to use their correct titles.
- Do not send non-specific, mass emails to everyone in the department hoping for a match.
- Address the faculty members by name. Your contact should be genuine rather than generic.
- Include a brief outline of your academic background, why you are interested in working with the faculty member, and what experience you could bring to the department. The supervision enquiry form guides you with targeted questions. Be sure to craft compelling answers to these questions.
- Highlight your achievements and why you are a top student. Faculty members receive dozens of requests from prospective students and you may have less than 30 seconds to pique someone’s interest.
- Demonstrate that you are familiar with their research:
- Convey the specific ways you are a good fit for the program.
- Convey the specific ways the program/lab/faculty member is a good fit for the research you are interested in/already conducting.
- Be enthusiastic, but don’t overdo it.
G+PS regularly provides virtual sessions that focus on admission requirements and procedures and offer tips on how to improve your application.
ADVICE AND INSIGHTS FROM UBC FACULTY ON REACHING OUT TO SUPERVISORS
These videos contain some general advice from faculty across UBC on finding and reaching out to a potential thesis supervisor.
Graduate Student Supervision
Doctoral Student Supervision
Dissertations completed in 2010 or later are listed below. Please note that there is a 6-12 month delay to add the latest dissertations.
The regulation of gene expression is a core challenge in understanding how diverse types of cells can be produced from the same DNA instructions. Insights about this complex machinery advance not only science but applications in therapy and pharmacology. One example is the differentiation of stem cells for the purpose of regenerative medicine to treat patients with diabetes. In my second chapter, I address the problem of optimizing the differentiation protocol towards definitive endoderm, the precursor of insulin-producing pancreatic beta cells, by replacing the expensive growth factor with cheap molecule alternatives. I introduce a multiple-step pipeline based on small molecule transcriptome response profiles. The discovered chemicals emphasize the importance of key transcription factors in the process, such as HIF and MYC. The study of transcription factors is of high importance, and will further promote our knowledge about differentiation. Motivated by this thought, I explore the current trends of studying transcription factors in the gene regulation context. With large-scale data generation efforts by public consortia such as ENCODE, deep learning methods have become pervasive. A large training dataset is fundamental to the success of these methods; however, the amount of TF-related data is often small. To tackle this issue, in my third chapter, I perform an in-depth assessment of transfer learning for TF binding prediction and provide biologically motivated guidelines for efficient training of deep models when the data is limited. An additional challenge for deep models beyond data sufficiency is interpretability. In the fourth chapter, I systematically categorize and summarize interpretation approaches, exploring their underlying assumptions, strengths, and weaknesses. Inspired by transparent deep learning architectures, I present ExplaiNN, a new transparent model for genomics tasks.
I explore its efficiency and usability on a variety of problems in the fifth chapter of this thesis. Finally, in the last chapter, I apply ExplaiNN to ATAC-seq datasets of mouse and human immune systems to study differences in cis-regulatory logic. Transparency of the new method allowed me to discover a reproducible set of sequence motifs that either individually or combinatorially are responsible for the bulk of the predictions, and tend to have species-specific occurrence patterns. Supplementary materials available at: http://hdl.handle.net/2429/82693
The human reference genome provides a framework against which the analysis and interpretation of an individual’s genome can be performed. Over the past twenty years the cost of genome sequencing has dropped from a prohibitive amount of hundreds of millions of dollars, to just a few thousand dollars. This has brought genome sequencing in line with the cost of other diagnostic medical tests, leading to a rapid uptake in both clinical and research settings. As a consequence of this global spread, deficiencies and population-specific inequities have emerged from the use of a framework that relies upon a single linear reference sequence. Partial, ad-hoc solutions, such as the introduction of alternative sequences for sections of the genome, have provided a stopgap but fail to fully represent the wealth of information now known about the level of variation that exists within and between populations. This thesis presents an alternative perspective on how we can take advantage of new computational methods to enhance the reference genome in the era of widespread sequencing and big data. An argument is given to motivate the revaluation of the role of the reference genome, and calls for a non-indexed, mutable reference framework with the crucial indexing methods to be shifted from the linear reference to a raw read set. A patented, edge-labelled, cyclic, graph-based model, the GNOmics Graph Model, is introduced as a flexible framework against which read alignment and variant calling can be performed. The value of indexing raw reads is explored through a published tool, FlexTyper, which allows a read set to be screened for informative markers. While there is still an ongoing global discussion as to how best to improve the reference genome, this thesis provides a thought-provoking reconceptualisation of applied human genome analysis.
High-grade serous ovarian cancer (HGSC) is the most common and lethal histotype of epithelial ovarian cancer. Often presenting as multi-site disease, HGSC exhibits extensive malignant clonal diversity with widespread but non-random patterns of disease dissemination. The proclivity of HGSC toward clonally heterogeneous disease is thought to underlie the prevalence of treatment-resistant disease. Yet, the factors that influence the spatial distribution of cancer clones in HGSC remain largely uncharacterized. Hypothesizing that distinct peritoneal niches formed by microenvironmental cell types shape the observed patterns of clonal dynamics in HGSC, the primary aim of this thesis was to understand how microenvironmental factors influence malignant cell evolutionary dynamics. To establish the experimental substrate for this thesis, I led the construction of a cohort of 148 tumour samples from 41 HGSC cases (Chapter 2). In addition to coordinating clinical case identification, I oversaw and learned how to create patient-derived xenograft models and conduct single cell experiments from patient tumours. Leveraging this resource, I explored whether local immune microenvironment factors shape tumor progression properties at the interface of tumor-infiltrating lymphocytes and cancer cells (Chapter 3). Through multi-region study with whole-genome sequencing, immunohistochemistry, image analysis, gene expression profiling, and T- and B-cell receptor sequencing, I identified three immunologic subtypes across samples associated with patterns of malignant clonal diversity. These findings were consistent with immunological pruning of tumor clones. Finally, in order to explore the non-lymphocytic components of the tumour microenvironment, I developed an automated approach to cell type identification from single cell RNA-seq data that eliminates the manual work involved in traditional workflows reliant on post-hoc expert annotation (Chapter 4).
I demonstrated how this method outperforms state-of-the-art workflows for cell type identification and applied the method to profile the HGSC microenvironment. Collectively, this work highlights multiple interfaces of evolutionary interplay between malignant and non-malignant cells in the HGSC microenvironment, identifying novel mechanisms by which tumour cells escape from immune recognition. These results will inform the interpretation of results from immunotherapy clinical trials and set the stage for comprehensive microenvironment profiling in large HGSC cohorts and other cancers. Supplementary materials available at: http://hdl.handle.net/2429/70673
Eye misalignment, or strabismus, has a frequency of up to 4% in a population, and is known to have both environmental and genetic causes. Genes associated with syndromic forms of strabismus (i.e. strabismus concurrent with multiple phenotypes) have emerged, but genes contributing to isolated strabismus remain to be discovered. Only one isolated strabismus locus, STBMS1 on chromosome 7, has been confirmed in more than one family, but the inheritance model of the locus is inconsistent between studied families and no specific causal variant has been reported. The large set of syndromes with strabismus suggests that within the visual system multiple perturbations of an underlying genetic network(s) can have the common output of disrupted eye alignment. Thus, I used a bioinformatic-driven approach to analyze curated genes associated with strabismus to provide insight into the biological mechanisms underlying strabismus, highlighting a link to the Ras-MAPK pathway. During the process, I noticed strabismus presenting within a large number of intellectual disability disorders. Therefore, I studied the co-occurrence of strabismus and other common phenotypes in a series of patients with intellectual disability, which confirmed a significant correlation between eye alignment and intellectual disability. Finally, I resumed efforts from my prior studies to identify the genetic cause in a seven-generation family with isolated strabismus inherited in an autosomal dominant manner. The likely causal gene disruption, altering a likely cis-regulatory region of the FOXG1 gene, was identified through the incorporation of linkage analysis, next generation sequencing, and in-depth bioinformatic analyses. This thesis identifies potential roles for genes participating in the Ras-MAPK pathway, emphasizes the role of the central nervous system, and reveals FOXG1 as a causal gene candidate for isolated strabismus. Supplementary material available at: http://hdl.handle.net/2429/70670
The emergence of whole genome sequencing (WGS) has revolutionized the diagnosis of rare genetic disorders, advancing the capacity to identify the “causal” gene responsible for disease phenotypes. In a single assay, many classes of genomic variants can be detected from small single nucleotide changes to large insertions, deletions and duplications. While WGS has enabled a significant increase in the diagnostic rate compared to previous assays, at least 50% of cases remain unsolved. The lack of a diagnosis is the result of both limitations in variant calling, and in variant interpretation. As the field of genomic medicine continues to advance, the emergence of novel bioinformatic approaches to variant calling and interpretation herald promise for the future of undiagnosed cases. In the applied setting, innovation is driven by anecdotes of complex diagnoses, which in turn lead to the development of novel tools and approaches. This is a key theme within this thesis work, where in-depth analysis of a single undiagnosed case leads to an appreciation for a challenging class of variants–short tandem repeats–which in turn leads to the development of novel software for detecting these variants in WGS data. Following the anecdote and novel tool development came an appreciation for the role of simulation, both in enabling the development and in the uptake of bioinformatic innovation for diagnostic analysis pipelines. This appreciation led to the development of a rare disease scenario simulator, which can simulate complex variants in multiple inheritance patterns to emulate challenging cases. Lastly, appreciating the limitations of the linear reference genome, I develop a framework for detecting the presence of user-specified sequences within unmapped read sets. This flexible framework can reproduce microarray-like coverage profiles, and genotype SNPs to identify ancestry and sex which can inform the choice of personalized reference genomes in emergent analysis pipelines. 
Together, the novel short tandem repeat discovery, bioinformatic innovation, and increased capacity to simulate rare disease cases, expand the utility of whole genome sequencing in the diagnosis of rare genetic diseases.
Clinical genome sequencing is becoming a tool for standard clinical practice. Many studies have presented sequencing as effective for both diagnosing and informing the management of genetic diseases. However, the task of finding the causal variant(s) of a rare genetic disease within an individual is often difficult due to the large number of identified variants and lack of direct evidence of causality. Current computational solutions harness existing genetic knowledge in order to infer the pathogenicity of the variant(s), as well as filter those unlikely to be pathogenic. Such methods can bring focus to a compact set (less than hundreds) of variants. However, they are not sufficient to interpret causality of variants for patient phenotypes; interpretation involves expert examination and synthesis of complex evidence, clinical knowledge, and experience. To accelerate interpretation and avoid diagnostic delay, computational methods are emerging for automated prioritization that capture, translate, and exploit clinical knowledge. While automation provides efficiency, it does not replace the expert-driven interpretation process. Moreover, knowledge and experience of human experts can be challenging to fully encode computationally. This thesis, therefore, explores an alternative space between expert-driven and computer-driven solutions, where human expertise is deeply embedded within computer-assisted analytic and diagnostic processes via facilitated human-computer interactions. First, clinical experts and their work environment were observed via collaborations in an interdisciplinary exome analysis project as well as in a clinical resource development project. From these observations, we identified two elements of human-computer interaction: characteristic cognitive processes underlying the diagnostic process and information visualization.
Exploiting these findings, we designed and evaluated an interactive variant interpretation strategy that augments cognitive processes of clinical experts. We found that this strategy could expedite variant interpretation. We then qualitatively assessed current information visualization practices during clinical exome and genome analyses. Based on the findings of this assessment, we formulated design requirements that can enhance visual interpretation of complex genetic evidence. In summary, this research highlights the synergistic utility of human-computer interaction in clinical exome and genome analyses for rare genetic diagnoses. Furthermore, it exemplifies the importance of empowering the skills of human experts in digital medicine.
Transcription factors (TFs) can bind to specific regulatory regions to control the expression of target genes. Disruption of TF binding is regarded as one of the key mechanisms by which regulatory variants could act to cause disease. However, predicting the functional impact of variants on TF binding remains a major challenge for the field, standing as a key obstacle to achieving the potential of clinical genome analysis. This thesis confronts this challenge from a bioinformatics perspective and addresses two unresolved problems. The first problem is the determination of which genetic variants alter TF binding. Only a small number of allele-specific binding (ASB) events, in which TFs preferentially bind to one of two alleles at heterozygous sites in the genome, have been determined. To study the impact of variants on TF binding, access to a large, gold standard collection of ASB events could facilitate the development of new predictive methods. In Chapter 2, we implemented a pipeline to identify ASB events from ChIP-seq data and applied it to produce one of the largest ASB datasets. We found that ASB events were associated with allelic alterations of TF motifs, chromatin accessibility and histone modifications. Using the available features, classifiers were trained to predict the impact of variants on TF binding. To improve ASB calling, Chapter 3 evaluated five statistical methods, ultimately supporting a method that pooled ChIP-seq replicates and utilized a binomial distribution to model allelic read counts. The second problem is to determine how altered TF binding events impact the expression of target genes. In Chapter 4, we implemented regression-based models to predict gene expression changes based on altered TF binding events across 358 individuals.
The models showed predictive capacity for 19.2% of genes, and the key TF binding events in the model provided mechanistic insights as to how these regulatory variants alter gene expression. In summary, this thesis both generated the largest, high-quality collection of ASB events, and developed algorithms to predict variant impact on TF binding and gene expression. The presented work advances the capacity of the field to interpret regulatory variants and will facilitate future clinical genome analysis.
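The pooled-replicate binomial model mentioned in the abstract above can be illustrated with a minimal sketch. This is not the thesis pipeline: the function names, the null reference-allele fraction of 0.5, and the two-sided "as or less likely" p-value definition are assumptions chosen here for illustration.

```python
from math import exp, lgamma, log

def binomial_two_sided_p(ref_reads: int, alt_reads: int, p_null: float = 0.5) -> float:
    """Exact two-sided binomial p-value for allelic imbalance at one heterozygous site.

    p_null is the assumed reference-allele fraction under no allele-specific binding
    (must lie strictly between 0 and 1).
    """
    n = ref_reads + alt_reads
    if n == 0:
        return 1.0

    def pmf(k: int) -> float:
        # Log-space binomial probability, to stay stable at high read depth.
        log_choose = lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
        return exp(log_choose + k * log(p_null) + (n - k) * log(1 - p_null))

    observed = pmf(ref_reads)
    # Sum every outcome that is at least as unlikely as the observed one.
    total = sum(p for p in map(pmf, range(n + 1)) if p <= observed * (1 + 1e-7))
    return min(1.0, total)

def pooled_asb_p(replicates: list[tuple[int, int]], p_null: float = 0.5) -> float:
    """Pool (ref, alt) read counts across replicates, then test the pooled counts."""
    ref = sum(r for r, _ in replicates)
    alt = sum(a for _, a in replicates)
    return binomial_two_sided_p(ref, alt, p_null)
```

For example, `pooled_asb_p([(6, 0), (4, 0)])` tests the pooled counts of 10 reference reads against 0 alternate reads. In practice, p-values across many sites would also need multiple-testing correction, which this sketch omits.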
Regulation of gene expression spans different levels of complexity: from genomic sequence, transcription factor binding and epigenetics, to three-dimensional chromatin interactions. Data from different individuals such as genetic variations presents an extra dimension to consider. Abnormal activities at any level may lead to disease phenotypes, motivating deeper exploration of gene regulation. New high-throughput sequencing techniques have empowered genome-wide studies of the regulatory mechanisms within cells. This thesis uses computational approaches to examine gene regulation with high-throughput data in order to address biological hypotheses traversing from short local sequence features to megabase-sized topologically associating domains (TADs). The hypotheses addressed in the thesis have two central themes: 1) the elucidation of local and domain regulation of gene expression, and 2) the application of such knowledge to identify functional phenotypic variants. We developed a computational approach to identify functional variants associated with cancer, and demonstrated how annotating regulatory sequences and linking these regions to target genes can strengthen genome interpretation. The concurrent and intertwined nature of local and domain regulation of gene expression develops as the thesis unfolds. In a study of genes that escape from X-chromosome inactivation, we found the YY1 transcription factor to be a key regulator that is potentially associated with long distance chromatin looping mechanisms. Similarly, when studying the spread of inactivation to the autosomes in translocated cells, we detected local features associated with inactivation status, and at the domain level, we observed the spreading to be in accordance with TADs.
Lastly, when considering TADs as transcriptional units, the identification of cell type-selectively co-expressed and co-localized TADs highlighted an organized and dynamic chromatin architecture across multiple cell types. In summary, this thesis provides insights into the mechanisms involved in gene expression across multiple scales (from local sequences to chromatin domains) using computational analyses on publicly available datasets. The presented methods and results have potential applications to interpret genetic variations and further our understanding in diseases and phenotypes. The findings may contribute to an era of preventative and regenerative medicine to come.
High-throughput next-generation DNA sequencing has evolved rapidly over the past 20 years. The Human Genome Project published its first draft of the human genome in 2000 at an enormous cost of 3 billion dollars, and was an international collaborative effort that spanned more than a decade. Subsequent technological innovations have decreased that cost by six orders of magnitude down to a thousand dollars, while throughput has increased by over 100 times to a current delivery of gigabase of data per run. In bioinformatics, significant efforts to capitalize on the new capacities have produced software for the identification of deviations from the reference sequence, including single nucleotide variants, short insertions/deletions, and more complex chromosomal characteristics such as copy number variations and translocations. Clinically, hospitals are starting to incorporate sequencing technology as part of exploratory projects to discover underlying causes of diseases with suspected genetic etiology, and to provide personalized clinical decision support based on patients’ genetic predispositions. As with any new large-scale data, a need has emerged for mechanisms to translate knowledge from computationally oriented informatics specialists to the clinically oriented users who interact with it. In the genomics field, the complexity of the data, combined with the gap in perspectives and skills between computational biologists and clinicians, present an unsolved grand challenge for bioinformaticians to translate patient genomic information to facilitate clinical decision-making. This doctoral thesis focuses on a comparative design analysis of clinical decision support systems and prototypes interacting with patient genomes under various sectors of healthcare to ultimately improve the treatment and well-being of patients. 
Through a combination of usability methodologies across multiple distinct clinical user groups, the thesis highlights reoccurring domain-specific challenges and introduces ways to overcome the roadblocks for translation of next-generation sequencing from research laboratory to a multidisciplinary hospital environment. To improve the interpretation efficiency of patient genomes and informed by the design analysis findings, a novel computational approach to prioritize exome variants based on automated appraisal of patient phenotypes is introduced. Finally, the thesis research incorporates applied genome analysis via clinical collaborations to inform interface design and enable mastery of genome analysis.
The identification of non-coding regulatory elements in the genome has been the focus of much experimental and computational effort. However, both experimental data, such as ChIP-seq, and computational methods of transcription factor (TF) binding predictions suffer from a degree of non-specificity. ChIP-seq experiments report regions that don’t contain the expected canonical motif for the ChIPped TF, which may arise from indirect binding or a non-TF-specific mechanism. Computational predictions based on sequence-level information alone are plagued by false positives. This thesis explores computational approaches to improve both the interpretation of large-scale TF binding data, and the detection of TF binding regions. In Chapters 2 and 3 we observe that experimentally defined regulatory regions of the human genome are a mixture of sub-groups reflecting distinct properties. On average a third of a ChIP-seq dataset does not contain the targeted TF’s motif, and within this subset up to 45% of the ChIP-seq peaks are unexpectedly enriched for a small class of non-targeted TFs’ motifs. Many of these regions are not specific to a TF but are ChIPped by multiple diverse TFs across multiple cell types. These recurring regions tend to be the lower scoring peaks of a dataset, are less likely to reproduce between experimental replicates, and tend to associate with cohesin and polycomb protein occupied positions in the genome. The regulatory regions with a greater specificity for a TF do not share these properties. Based on these observations we suggest a TF ‘loading-zone’ model to account for the presence of the aforementioned recurrent regions in ChIP-seq data. In Chapter 4 we further explore the regulatory region subgroups with a biophysical simulator of TF occupancy (tfOS). Within tfOS we have incorporated TF-DNA interaction energies, TF search mechanics, cooperative TF interactions, and sequence accessibility data into the model.
Simulations with tfOS across sequences reveal distinct features associated with recurrent and non-recurrent regions described in Chapter 3. The research presented has improved our understanding and interpretation of large-scale TF binding data and advanced our understanding of TF regulatory regions, leading to improved annotation and interpretation of the human genome. Supplementary video material is available at: http://hdl.handle.net/2429/51447
MEDLINE®/PubMed® is a richly annotated resource of over 21 million article citations, growing at a modern rate of over 600,000 citations annually. One grand challenge of bioinformatics is analysing the extensive literature for a biomedical entity such as a gene or disease. This thesis explores using over-representation to extract pertinent biomedical annotation from the research articles for an entity. The quantitative profiles generated are compared to predict novel associations between entities. Medical Subject Heading Over-representation Profiles (MeSHOPs) are constructed from the primary literature of an entity of interest. Medical subject annotations for each article are extracted. Statistical tests evaluate the significance of each term’s frequency across the set of articles, compared against an appropriate background set. The resulting MeSHOP is composed of each term and corresponding enrichment p-value. MeSHOPs can be computed for any entity with an associated bibliography of PubMed articles. We evaluate the predictive performance of quantitatively comparing MeSHOPs to discover novel associations between gene and disease entities, achieving up to 16% improvement in accuracy compared to gene or disease baseline features (measured as increased Receiver Operating Characteristic Area Under the Curve). Strong literature annotation level bias on the predictive performance for future gene-disease association was seen. We observe similar results in a parallel analysis of associations between drugs and disease. Efficiently identifying authors with similar research interests is a challenge in science. During the peer review process, authors seek scientists with similar expertise. MeSHOPs are generated for individual authors, identifying their research foci. Extending the methods to allow comparison across large sets of entities, overlapping research interests between researchers were identified.
The predictive performance was evaluated for capacity to identify authors working in the same research domains. Biomedical annotation analysis of primary literature provides insight into the areas of research focus, and is demonstrated to link entities through similarities in their MeSHOPs. We quantitatively confirm the trend where well-studied genes, diseases and drugs are more likely to be the focus of further research. MeSHOP analysis demonstrates that knowledge in the annotated primary literature can be efficiently mined, and the untapped knowledge therein can be discovered computationally.
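The profile construction described in this abstract can be illustrated with a minimal sketch. It assumes a one-sided hypergeometric test as the "statistical test", and it assumes the entity's articles are a subset of the background set; the abstract does not name the test used, and the function names `hypergeom_sf` and `meshop` are hypothetical.

```python
from collections import Counter
from math import comb


def hypergeom_sf(k, N, K, n):
    """P(X >= k) when n articles are drawn from N, of which K carry the term."""
    total = comb(N, n)
    upper = min(K, n)
    return sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1)) / total


def meshop(entity_articles, background_articles):
    """Return {term: p-value} for every term annotated on the entity's articles.

    Both arguments are lists of sets of terms, one set per article.
    Assumption: the entity's articles are also part of the background.
    """
    N = len(background_articles)
    n = len(entity_articles)
    background_counts = Counter(t for article in background_articles for t in article)
    entity_counts = Counter(t for article in entity_articles for t in article)
    return {
        term: hypergeom_sf(k, N, background_counts[term], n)
        for term, k in entity_counts.items()
    }


# Toy example: 3 entity articles inside a background of 10 articles.
entity = [{"Strabismus", "Humans"} for _ in range(3)]
background = entity + [{"Humans"} for _ in range(7)]
profile = meshop(entity, background)
```

In this toy example a term that appears on every background article gets a p-value of 1, while a term confined to the entity's articles gets a small p-value.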
No abstract available.
No abstract available.
Master's Student Supervision
Theses completed in 2010 or later are listed below. Please note that there is a 6-12 month delay to add the latest theses.
X chromosome inactivation (XCI) is the process in which one of the two X chromosomes in females (XX) is randomly silenced so that X-linked gene expression is dosage-compensated relative to males (XY). XCI is known to be incomplete, resulting in a subset of genes on the inactive X (Xi) being expressed. Genes along the X chromosome are classified into three categories based upon their expression from the Xi: subject, escape and variable. Escape genes, corresponding to 15% of X-linked genes, are generally expressed from the Xi at a substantially lower level compared to the active X (Xa). The mechanisms controlling escape from the heterochromatic state of the Xi have been a long-standing question. While there is evidence that supports a role for intrinsic DNA elements in escape from XCI, they have not yet been identified. With increasing amounts of data for transcription factor (TF) binding sites, the objective of this thesis was to identify the regulatory TFs facilitating the ability of genes to escape XCI via a bioinformatics approach using empirical data. ChIP-seq peaks of 155 TFs from the ReMap database were assessed for enrichment at regulatory regions of 55 escape genes. 19 TFs were identified via enrichment analysis in the transcription start site regions of escape genes. Co-binding of pairs of enriched TFs was characterized by gene set similarity between the target genes. Of the 155 TFs examined, ZFP36, an RNA binding protein that alters RNA stability, showed importance in both enrichment analysis and co-binding analysis. An initial exploration of methods to compare the structures of Xi and Xa was undertaken using ChIA-PET data, providing insights into the limitations of current data resources and opportunities for future studies to inform the topographical organization of escape genes within Xi. The results of this thesis refined our knowledge of cis-regulatory elements and trans-acting factors potentially involved in escape from XCI.
This list of enriched TFs may be useful in future analyses, experimental or computational, to further determine their sufficiency and necessity in escape from XCI.
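As an illustration of the co-binding step, the sketch below scores pairs of TFs by the similarity of their target gene sets. The Jaccard index is an assumption made here; the abstract says only "gene set similarity". The TF and gene labels are placeholders, not data from the thesis.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard index of two gene sets: |A ∩ B| / |A ∪ B| (0.0 if both are empty)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cobinding_scores(targets):
    """Score every unordered TF pair by the similarity of their target gene sets.

    `targets` maps a TF name to the set of genes whose regulatory regions it binds.
    """
    return {
        (tf_a, tf_b): jaccard(targets[tf_a], targets[tf_b])
        for tf_a, tf_b in combinations(sorted(targets), 2)
    }


# Placeholder labels only.
targets = {
    "TF_A": {"gene1", "gene2", "gene3"},
    "TF_B": {"gene2", "gene3", "gene4"},
    "TF_C": {"gene5"},
}
scores = cobinding_scores(targets)
```

Pairs with a high score bind largely the same genes and are candidates for closer co-binding analysis.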
With single cell sequencing advances, research has increasingly focused on understanding cell-specific gene regulation mechanisms. However, single cell sequencing data are often noisy, and the amount of sequence obtained from rare cell types is small. Simulation can be a powerful approach to aid understanding when data are limited, both because the process used to generate such data can provide mechanistic insights into cell-specific regulation and because the data produced can augment analysis methods development. We constructed and optimized a stand-alone cell-conditional GAN (ccGAN) to simulate cell-specific ATAC-seq data. We trained our model on published single cell ATAC-seq (scATAC-seq) data that had been produced with different protocols on embryonic mouse forebrain and adult mouse brain. The ccGAN-generated sequence was correlated in both Transcription Factor (TF) binding motif composition and positional distribution with the experimental scATAC-seq data. The ccGAN simulator was able to learn important cell-specific signals amidst noise. The ccGAN architecture holds broad potential for single cell regulatory data simulation beyond ATAC-seq, such as for ChIP-seq or epigenetic properties.
Linking cooperatively functioning cis-regulatory elements (CREs), specifically enhancers and promoters, is a challenging task. Current strategies include correlation of expression of RNA transcribed from the CREs, experimentally measured chromatin interactions (Promoter Capture Hi-C) or machine-learning-based computational predictions. However, all three approaches require the availability of experimental data, which is sparse for most cells and tissues. We propose a new similarity metric to link enhancers to their target promoters based on transcription factor (TF)-binding “signatures”. TF-binding signatures are binary string representations (e.g. 0011001...), where each position indicates binding (“1”) or not (“0”) of a TF to a CRE. We apply a cosine similarity metric to enhancer-promoter pairs linked in published studies involving CRISPRi-FlowFISH, co-expression (FANTOM), or experimental tiling-deletion (CREST-seq). We find a significant difference between TF signature similarities of linked promoter-enhancer pairs compared to unlinked pairs. Furthermore, we observe that TF-binding similarity scores are CRR specific. Based on the results, new directions are proposed that may allow further improvement towards a reliable mapping of interacting CREs across the genome.
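The signature comparison described in this abstract can be sketched as follows. This is a minimal illustration of cosine similarity on binary strings and not the thesis implementation. The function name `cosine_similarity` and the example signatures are hypothetical, and returning 0.0 for an all-zero signature is a convention chosen here.

```python
from math import sqrt


def cosine_similarity(sig_a, sig_b):
    """Cosine similarity of two equal-length binary signatures such as "0011001".

    Position i is "1" if TF i binds the element and "0" otherwise.
    """
    if len(sig_a) != len(sig_b):
        raise ValueError("signatures must cover the same set of TFs")
    a = [int(ch) for ch in sig_a]
    b = [int(ch) for ch in sig_b]
    dot = sum(x * y for x, y in zip(a, b))  # TFs bound to both elements
    norm_sq = sum(a) * sum(b)               # for 0/1 vectors, sum of squares = count of 1s
    return dot / sqrt(norm_sq) if norm_sq else 0.0


# Hypothetical enhancer and promoter signatures over seven TFs.
enhancer = "0011001"
promoter = "0011000"
score = cosine_similarity(enhancer, promoter)
```

A score near 1 means the two elements are bound by nearly the same set of TFs, and a score of 0 means they share no bound TF.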
Gene therapy has the potential to not only treat, but cure individuals suffering from inherited diseases. Advances in understanding the human genome and the discovery of causal genes underlying diseases have heightened the need to solve the gene therapy challenge. Viral vectors are often used as a delivery tool for therapeutics, but their safety and efficacy are still being studied. To contribute to this goal, we have created 49 small viral promoters by bioinformatically annotating cis-regulatory regions, a subset of which are concatenated with the goal of driving cell-specific expression of a reporter gene. We have tested a subset of these in mice in vivo. Regulatory region analysis can take a trained designer multiple weeks. To resolve this issue, we have created a semi-automated approach to regulatory region identification, named OnTarget. The OnTarget database accumulates thousands of cell and tissue-specific experiments in order to identify regions informative of regulatory properties. OnTarget is able to identify regulatory regions consistent with those identified by designers. In this capacity, we expect OnTarget to lead to better and faster identification of cis-regulatory regions for the design of promoters targeting specific sets of cells.
Despite improvements in sequencing technologies, DNA sequence variant interpretation for rare genetic diseases remains challenging. In a typical workflow for the Treatable Intellectual Disability Endeavor in B.C. (TIDE BC), a geneticist examines variant calls to establish a set of candidate variants that explain a patient's phenotype. Even with a sophisticated computation pipeline for variant prioritization, they may need to consider hundreds of variants. This typically involves literature searches on individual variants to determine how well they explain the reported phenotype, which is a time-consuming process. In this work, text-analysis-based variant prioritization methods are developed and assessed for the capacity to distinguish causal variants within exome analysis results for a reference set of individuals with metabolic disorders.
Eye misalignment, called strabismus, occurs in up to 5% of individuals. While misalignment is frequently observed in rare complex syndromes, the majority of strabismus cases are non-syndromic. Over the past decade, genes and pathways associated with syndromic forms of strabismus have emerged, but the genes contributing to non-syndromic strabismus remain elusive. Non-syndromic strabismus is highly heterogeneous, and different loci have been inferred from previous genetics studies. Only a single strabismus locus, STBMS1, on chromosome 7 has been confirmed in more than one family, but the inheritance patterns reported for this locus conflict, and no specific variant has been proposed. Here, I analyzed a large non-consanguineous family with multiple individuals affected by strabismus across seven generations. The hypothesis is that a single variant is responsible for the non-syndromic strabismus in this particular family displaying dominant patterns of inheritance. Whole exome sequencing (WES) was performed to uncover large blocks of variations within protein-coding regions of the genome shared by two affected distant relatives. In parallel, chromosome regions segregating with the strabismus phenotype in the family were identified using linkage analysis on 12 individuals. Linkage analysis identified one specific risk locus of high confidence. Based on the lack of protein-coding alterations in the locus, whole genome sequencing (WGS) was performed to find additional shared candidate causal variants. Combining the available information, a 10 Mb region on chromosome 14 was identified with high confidence as associated with strabismus, within which a set of potential regulatory sequence alterations have been highlighted for further study. This study represents the first identified locus for autosomal dominant, non-syndromic strabismus.
The project utilizes next-generation sequencing (NGS), linkage analysis, and bioinformatic analyses to prioritize and select both coding and non-coding variants, demonstrating the effectiveness of combining NGS and classical genetic approaches. The research findings improve our understanding of strabismus genetics and define multiple paths for future research, family-specific genetic testing for early diagnosis, and consequent preventive therapy.
- ExplaiNN: interpretable and transparent neural networks for genomics (2022)
- RevUP: an online scoring system for regulatory variants implicated in rare diseases (2022)
Bioinformatics, 38 (9), 2664--2666
- Biologically relevant transfer learning improves transcription factor binding prediction (2021)
- Demonstrating the utility of flexible sequence queries against indexed short reads with FlexTyper (2021)
PLOS Computational Biology, 17 (3), e1008815
- GeneBreaker - Variant simulation to improve the diagnosis of Mendelian rare genetic diseases (2020)
- metPropagate: network-guided propagation of metabolomic information for prioritization of metabolic disease genes (2020)
npj Genomic Medicine, 5 (1)
- metPropagate: network-guided propagation of metabolomic information for prioritization of neurometabolic disease genes (2020)
- Bi-allelic GOT2 Mutations Cause a Treatable Malate-Aspartate Shuttle-Related Encephalopathy. (2019)
American journal of human genetics,
- Curation and bioinformatic analysis of strabismus genes supports functional heterogeneity and proposes candidate genes with connections to RASopathies. (2019)
- Development and user evaluation of a rare disease gene prioritization workflow based on cognitive ergonomics. (2019)
Journal of the American Medical Informatics Association : JAMIA,
- Evidence of transcription at polyT short tandem repeats (2019)
- Gene expression models based on transcription factor binding events confer insight into functional cis-regulatory variants. (2019)
Bioinformatics (Oxford, England),
- Glutaminase Deficiency Caused by Short Tandem Repeat Expansion in GLS. (2019)
The New England journal of medicine,
- Identification of novel cerebellar developmental transcriptional regulators with motif activity analysis. (2019)
- Introduction to Genomic Analysis Workshop: A catalyst for engaging life-science researchers in high throughput analysis (2019)
- JASPAR 2020: update of the open-access database of transcription factor binding profiles. (2019)
Nucleic acids research,
- PLPHP deficiency: clinical, genetic, biochemical, and mechanistic insights. (2019)
Brain : a journal of neurology,
- Strabismus in Children With Intellectual Disability: Part of a Broader Motor Control Phenotype? (2019)
- TFEA.ChIP: A tool kit for transcription factor binding site enrichment analysis capitalizing on ChIP-seq datasets. (2019)
Bioinformatics (Oxford, England),
- Twenty-Seven Tamoxifen-Inducible iCre-Driver Mouse Strains for Eye and Brain, Including Seventeen Carrying a New Inducible-First Constitutive-Ready Allele. (2019)
- Atypical cerebral palsy: genomics analysis enables precision medicine. (2018)
Genetics in medicine : official journal of the American College of Medical Genetics,
- Bone health and SATB2-associated syndrome. (2018)
- c-Myc is a novel Leishmania virulence factor by proxy that targets the host miRNA system and is essential for survival in human macrophages. (2018)
The Journal of biological chemistry,
- Gain-of-function KCNJ6 Mutation in a Severe Hyperkinetic Movement Disorder Phenotype. (2018)
- Genome sequencing reveals a novel genetic mechanism underlying dihydropyrimidine dehydrogenase deficiency: A novel missense variant c.1700G>A and a large intragenic inversion in DPYD spanning intron 8 to intron 12. (2018)
- Genome-wide prediction of cis-regulatory regions using supervised deep learning methods. (2018)
- Human Enhancers Harboring Specific Sequence Composition, Activity, and Genome Organization Are Linked to the Immune Response. (2018)
- Improvement of Self-Injury With Dopamine and Serotonin Replacement Therapy in a Patient With a Hemizygous PAK3 Mutation: A New Therapeutic Strategy for Neuropsychiatric Features of an Intellectual Disability Syndrome. (2018)
Journal of child neurology,
- Integration of genomics and metabolomics for prioritization of rare disease variants: a 2018 literature review. (2018)
Journal of inherited metabolic disease,
- Interfaces of Malignant and Immunologic Clonal Dynamics in Ovarian Cancer. (2018)
- JASPAR 2018: update of the open-access database of transcription factor binding profiles and its web framework. (2018)
Nucleic acids research,
- Knowledge base and mini-expert platform for the diagnosis of inborn errors of metabolism. (2018)
Genetics in medicine : official journal of the American College of Medical Genetics,
- MANTA2, update of the Mongo database for the analysis of transcription factor binding site alterations. (2018)
- New MiniPromoter Ple345 (NEFL) Drives Strong and Specific Expression in Retinal Ganglion Cells of Mouse and Primate Retina. (2018)
Human gene therapy,
- Sialic acid catabolism by N-acetylneuraminate pyruvate lyase is essential for muscle function. (2018)
- Text-based phenotypic profiles incorporating biochemical phenotypes of inborn errors of metabolism improve phenomics-based diagnosis (2018)
Journal of Inherited Metabolic Disease, 41 (3), 555--562
- The genotypic and phenotypic spectrum of MTO1 deficiency. (2018)
Molecular genetics and metabolism,
- The role of the clinician in the multi-omics era: are you ready? (2018)
Journal of inherited metabolic disease,
- The SIN3A histone deacetylase complex is required for a complete transcriptional response to hypoxia. (2018)
Nucleic acids research,
- A case of splenomegaly in CBL syndrome. (2017)
European journal of medical genetics,
- A de novo mosaic mutation in SPAST with two novel alternative alleles and chromosomal copy number variant in a boy with spastic paraplegia and autism spectrum disorder (2017)
European Journal of Medical Genetics,
- A girl with developmental delay, ataxia, cranial nerve palsies, severe respiratory problems in infancy-Expanding NDST1 syndrome. (2017)
American journal of medical genetics. Part A,
- Assessment of the ExAC data set for the presence of individuals with pathogenic genotypes implicated in severe Mendelian pediatric disorders. (2017)
Genetics in medicine : official journal of the American College of Medical Genetics,
- Correction to: FLAGS, frequently mutated genes in public exomes. (2017)
BMC medical genomics,
- Corrigendum: NANS-mediated synthesis of sialic acid is required for brain and skeletal development. (2017)
- CuboCube: Student creation of a cancer genetics e-textbook using open-access software for social learning. (2017)
- Identification of a large intronic transposal insertion in SLC17A5 causing sialic acid storage disease. (2017)
Orphanet journal of rare diseases,
- Impact of next-generation sequencing on diagnosis and management of neurometabolic disorders: current advances and future perspectives. (2017)
Expert review of molecular diagnostics,
- Optic atrophy, cataracts, lipodystrophy/lipoatrophy, and peripheral neuropathy caused by a de novo OPA3 mutation. (2017)
- CAGEd-oPOSSUM: motif enrichment analysis from CAGE-derived TSSs. (2016)
- Cytosolic phosphoenolpyruvate carboxykinase deficiency presenting with acute liver failure following gastroenteritis. (2016)
- Deep Feature Selection: Theory and Application to Identify Enhancers and Promoters. (2016)
- DeepCAGE transcriptomics identify HOXD10 as transcription factor regulating lymphatic endothelial responses to VEGF-C. (2016)
- DNA Methylation Profiling in Human Huntington's Disease Brain. (2016)
- DNA Shape Features Improve Transcription Factor Binding Site Predictions In Vivo. (2016)
- Evaluating the impact of single nucleotide variants on transcription factor binding. (2016)
- Exome Sequencing and the Management of Neurometabolic Disorders. (2016)
- Further Validation of the SIGMAR1 c.151+1G>T Mutation as Cause of Distal Hereditary Motor Neuropathy. (2016)
Child neurology open,
- Identification of non-coding genetic variants in samples from hypoxemic respiratory disease patients that affect the transcriptional response to hypoxia. (2016)
- JASPAR 2016: a major expansion and update of the open-access database of transcription factor binding profiles. (2016)
- Mitochondrial Complex III Deficiency with Ketoacidosis and Hyperglycemia Mimicking Neonatal Diabetes. (2016)
- NANS-mediated synthesis of sialic acid is required for brain and skeletal development. (2016)
- PAX6 MiniPromoters drive restricted expression from rAAV in the adult mouse retina. (2016)
- rAAV-compatible MiniPromoters for restricted expression in the brain and eye. (2016)
- Secondary neurotransmitter deficiencies in epilepsy caused by voltage-gated sodium channelopathies: A potential treatment target? (2016)
- YY1 binding association with sex-biased transcription revealed through X-linked transcript levels and allelic binding analyses. (2016)
- A SNP in the HTT promoter alters NF-κB binding and is a bidirectional genetic modifier of Huntington disease. (2015)
- Cis-regulatory somatic mutations and gene-expression alteration in B-cell lymphomas. (2015)
- Cisplatin Nephrotoxicity and Longitudinal Growth in Children With Solid Tumors: A Retrospective Cohort Study. (2015)
- Combined serial analysis of gene expression and transcription factor binding site prediction identifies novel-candidate-target genes of Nr2e1 in neocortex development. (2015)
- De novo dominant variants affecting the motor domain of KIF1A are a cause of PEHO syndrome. (2015)
- DeepCAGE Transcriptomics Reveal an Important Role of the Transcription Factor MAFB in the Lymphatic Endothelium. (2015)
- Defects in fatty acid amide hydrolase 2 in a male with neurologic and psychiatric symptoms. (2015)
- Discovery of molecular markers to discriminate corneal endothelial cells in the human body. (2015)
- Dynamic software design for clinical exome and genome analyses: insights from bioinformaticians, clinical geneticists, and genetic counselors. (2015)
- Expansion of the QARS deficiency phenotype with report of a family with isolated supratentorial brain abnormalities. (2015)
- Genetic variants in SLC22A17 and SLC22A7 are associated with anthracycline-induced cardiotoxicity in children. (2015)
- GeneYenta: a phenotype-based rare disease case matching tool based on online dating algorithms for the acceleration of exome interpretation. (2015)
- Identification of altered cis-regulatory elements in human disease. (2015)
- RMND1 deficiency associated with neonatal lactic acidosis, infantile onset renal failure, deafness, and multiorgan involvement. (2015)
- The genotypic and phenotypic spectrum of PIGA deficiency. (2015)
- The identification of cis-regulatory elements: A review from a machine learning perspective. (2015)
- The statistical geometry of transcriptome divergence in cell-type evolution and cancer. (2015)
- A promoter-level mammalian expression atlas. (2014)
- AIMP1 deficiency presents as a cortical neurodegenerative disease with infantile onset. (2014)
- An atlas of active enhancers across human cell types and tissues. (2014)
- CCL2 enhances pluripotency of human induced pluripotent stem cells by activating hypoxia related genes. (2014)
- Ceruloplasmin is a novel adipokine which is overexpressed in adipose tissue of obese subjects and in obesity-associated cancer cells. (2014)
- Differential roles of epigenetic changes and Foxp3 expression in regulatory T cell-specific transcriptional regulation. (2014)
- DNAJC13 mutations in Parkinson disease. (2014)
- Exome sequencing identifies mutations in KIF14 as a novel cause of an autosomal recessive lethal fetal ciliopathy phenotype. (2014)
- Exome sequencing pilot study in children with carbamazepine-induced serious skin reactions (2014)
Clinical and Translational Allergy, 4 (Suppl), P119
- FLAGS, frequently mutated genes in public exomes. (2014)
- Higher frequency of genetic variants conferring increased risk for ADRs for commonly used drugs treating cancer, AIDS and tuberculosis in persons of African descent. (2014)
- Improving analysis of transcription factor binding sites within ChIP-Seq data based on topological motif enrichment (2014)
BMC Genomics, 15 (1), 472
- JASPAR 2014: an extensively expanded and updated open-access database of transcription factor binding profiles. (2014)
- Mitochondrial Carbonic Anhydrase VA Deficiency Resulting from CA5A Alterations Presents with Hyperammonemia in Early Childhood (2014)
The American Journal of Human Genetics, 94 (3), 453--461
- Non-targeted transcription factors motifs are a systemic component of ChIP-seq datasets (2014)
Genome Biology, 15 (7), 412
- On the identification of potential regulatory variants within genome wide association candidate SNP sets (2014)
BMC Medical Genomics, 7 (1), 34
- Single point mutation in Rabenosyn-5 in a female with intractable seizures and evidence of defective endocytotic trafficking (2014)
Orphanet Journal of Rare Diseases, 9 (1), 141
- Strabismus genetics across a spectrum of eye misalignment disorders. (2014)
- Targeted CNS delivery using human MiniPromoters and demonstrated compatibility with adeno-associated viral vectors (2014)
Mol Ther Methods Clin Dev, 1, 5
- TFBSshape: a motif database for DNA shape features of transcription factor binding sites. (2014)
- Usability study of clinical exome analysis software: Top lessons learned and recommendations (2014)
Journal of Biomedical Informatics, 51, 129--136
- Compensating for literature annotation bias when predicting novel drug-disease relationships through Medical Subject Heading Over-representation Profile (MeSHOP) similarity. (2013)
- Non-coding-regulatory regions of human brain genes delineated by bacterial artificial chromosome knock-in mice. (2013)
- Portal for Families Overcoming Neurodevelopmental Disorders (PFOND): Implementation of a Software Framework for Facilitated Community Website Creation by Nontechnical Volunteers. (2013)
- The next generation of transcription factor binding site prediction. (2013)
- Utilizing social media to study information-seeking and ethical issues in gene therapy. (2013)
- Validation of variants in SLC28A3 and UGT1A6 as genetic markers predictive of anthracycline-induced cardiotoxicity in children. (2013)
- Inferring novel gene-disease associations using Medical Subject Heading Over-representation Profiles. (2012)
- oPOSSUM-3: advanced analysis of regulatory motif over-representation across genes or ChIP-Seq datasets. (2012)
- Quantitative biomedical annotation using medical subject heading over-representation profiles (MeSHOPs). (2012)
- Retina restored and brain abnormalities ameliorated by single-copy knock-in of human NR2E1 in null mice. (2012)
- The clonal and mutational evolution spectrum of primary triple-negative breast cancers. (2012)
- The transcription factor encyclopedia. (2012)
- Identification of cis-regulatory sequence variations in individual genome sequences. (2011)
- MIR@NT@N: a framework integrating transcription factors, microRNAs and their targets to identify sub-network motifs in a meta-regulation network model. (2011)
- The NeuroDevNet Neuroinformatics Core. (2011)
- Towards resolving the transcription factor network controlling myelin gene expression. (2011)
- Validation of skeletal muscle cis-regulatory module predictions reveals nucleotide composition bias in functional enhancers. (2011)
- VPS35 Mutations in Parkinson Disease. (2011)
- A regulatory toolbox of MiniPromoters to drive selective expression in the brain. (2010)
- Global mapping of binding sites for Nrf2 identifies novel targets in cell survival response through ChIP-Seq profiling and network analysis. (2010)
- JASPAR 2010: the greatly expanded open-access database of transcription factor binding profiles. (2010)
- Laboratory Animal Management Assistant (LAMA): a LIMS for active research colonies. (2010)
- The Canadian Pharmacogenomics Network for Drug Safety: a model for safety pharmacology. (2010)
- Genetic variants in TPMT and COMT are associated with hearing loss in children receiving cisplatin chemotherapy. (2009)
- TFCat: the curated catalog of mouse and human transcription factors. (2009)
- The PAZAR database of gene regulatory information coupled to the ORCA toolkit for the study of regulatory sequences. (2009)
- Transcriptional repression of microRNA genes by PML-RARA increases expression of key cancer proteins in acute promyelocytic leukemia. (2009)
- Discovery and expansion of gene modules by seeking isolated groups in a random graph process. (2008)
- Dynamics of the yeast transcriptome during wine fermentation reveals a novel fermentation stress response. (2008)
- Gene characterization index: assessing the depth of gene annotation. (2008)
- Identification of a set of genes showing regionally enriched expression in the mouse brain. (2008)
- In silico detection of sequence variations modifying transcriptional regulation. (2008)
- Mechanisms underlying p53 regulation of PIK3CA transcription in ovarian surface epithelium and in ovarian cancer. (2008)
- ORegAnno: an open-access community-driven resource for regulatory annotation. (2008)
- oPOSSUM: integrated tools for analysis of regulatory motif over-representation. (2007)
- PAZAR: a framework for collection and dissemination of cis-regulatory sequence annotation. (2007)
- A new generation of JASPAR, the open-access repository for transcription factor binding site profiles. (2006)
- NovelFam3000--uncharacterized human protein domains conserved across model organisms. (2006)
- SAGE2Splice: unmapped SAGE tags reveal novel splice junctions. (2006)
- Complete functional rescue of the ABCA1-/- mouse by human BAC transgenesis. (2005)
- Identification of functional SNPs in the 5-prime flanking sequences of human genes. (2005)
- oPOSSUM: identification of over-represented transcription factor binding sites in co-expressed genes. (2005)
- Prediction of nuclear hormone receptor response elements. (2005)
- The Gene Set Builder: collation, curation, and distribution of sets of genes. (2005)
- Ulysses - an application for the projection of molecular interactions across species. (2005)
- Applied bioinformatics for the identification of regulatory elements. (2004)
- Arrays of ultraconserved non-coding regions span the loci of key developmental genes in vertebrate genomes. (2004)
- ConSite: web-based prediction of regulatory elements using cross-species comparison. (2004)
- Constrained binding site diversity within families of transcription factors enhances pattern discovery bioinformatics. (2004)
- Decoding human regulatory circuits. (2004)
- Exploring the foundation of genomics: a northern blot reference set for the comparative analysis of transcript profiling technologies. (2004)
- JASPAR: an open-access database for eukaryotic transcription factor binding profiles. (2004)
- MSCAN: identification of functional clusters of transcription factor binding sites. (2004)
- Regulog analysis: detection of conserved regulatory networks across bacteria: application to Staphylococcus aureus. (2004)
- Transcriptional promoters (2004)
Encyclopedia of Genetics, Genomics, Proteomics and Bioinformatics,
- GeneLynx mouse: integrated portal to the mouse genome. (2003)
- Identification of conserved regulatory elements by comparative genome analysis. (2003)
- Identification of functional clusters of transcription factor binding motifs in genome sequences: the MSCAN algorithm. (2003)
- In silico identification of metazoan transcriptional regulatory regions. (2003)
- Integrated analysis of yeast regulatory sequences for biologically linked clusters of genes. (2003)
- Understanding the language of gene regulation. (2003)
- NotI flanking sequences: a tool for gene discovery and verification of the human genome. (2002)
- TFBS: Computational framework for transcription factor binding site analysis. (2002)
- A predictive model for regulatory sequences directing liver-specific transcription. (2001)
- GeneLynx: a gene-centric portal to the human genome. (2001)
- Initial isolation and analysis of the human Kv1.7 (KCNA7) gene, a member of the voltage-gated potassium channel gene family. (2001)
- Phylogenetic Footprinting (2001)
Encyclopedia of Life Sciences,
- Polymorphic electrophile response elements in the mouse glutathione S-transferase GSTa1 gene that confer increased induction. (2001)
- Discovery and modeling of transcriptional regulatory regions. (2000)
- Human-mouse genome comparisons to locate regulatory sites. (2000)
- Structure and function of adenosine receptors and their genes. (2000)
- The murine Bin1 gene functions early in myogenesis and defines a new region of synteny between mouse chromosome 18 and human chromosome 2. (1999)
- Identification of regulatory regions which confer muscle-specific gene expression. (1998)
- Molecular cloning of a novel mouse gene with predominant muscle and neural expression. (1998)
- Organization of the ABCR gene: analysis of promoter and splice junction sequences. (1998)
- CBP/cycA, a CCAAT-binding protein necessary for adhesion-dependent cyclin A transcription, consists of NF-Y and a novel Mr 115,000 subunit. (1997)
- Comprehensive analysis of proteins which interact with the antioxidant responsive element: correlation of ARE-BP-1 with the chemoprotective induction response. (1997)
- Functional antioxidant responsive elements. (1997)
- Analysis of R59022 actions in Xenopus laevis oocytes. (1996)