Prospective Graduate Students / Postdocs
This faculty member is currently not actively recruiting graduate students or Postdoctoral Fellows, but might consider co-supervision together with another faculty member.
Dissertations completed in 2010 or later are listed below. Please note that there is a 6-12 month delay to add the latest dissertations.
Helicases are a highly conserved family of motor proteins responsible for interacting with and unwinding canonical and non-canonical DNA and RNA structures. The RecQ class of helicases, known to suppress illegitimate recombination, are implicated in aging and cancer with four of the five human RecQ helicases directly linked to genome instability syndromes characterized in some cases by strong cancer predisposition or premature aging. While no human disease has been associated with the RECQL5 helicase, loss of this gene in cells is known to result in elevated double strand breaks (DSBs) and sister chromatid exchange events (SCEs), a phenotype of genome instability similar to what is observed in RecQ helicase-linked diseases of strong cancer predisposition. Until recently, studying SCEs has been limited to cytogenetic assays that map at megabase resolution. I used single cell template strand sequencing (Strand-seq) to map SCEs as changes in template strand orientation before and after loss of RECQL5 at kilobase resolution. I generated over 20 single and double knockout models for RECQL5 as well as BLM, WRN and RECQL1 helicases using CRISPR-Cas9 in the human haploid cell line, KBM7, and mapped SCEs to the genome using custom bioinformatic approaches to improve resolution and accuracy of SCE detection. I performed enrichment analysis to show SCEs are frequently occurring near actively transcribed genes with guanine quadruplexes (G4s) and common fragile sites further supporting the role of these helicase genes in suppressing inappropriate recombination at specific genomic elements. I also developed novel bioinformatic approaches to generate genotype-specific call sets for copy number alterations (CNAs), inversions, and translocations. Uncovering the role of DNA helicases in DNA repair and replication pathways is critical for understanding their significance in cancer and aging. 
Strand-seq offers a unique method to study helicases by mapping the location of SCEs arising in their absence.
View record
Studies of genome heterogeneity and plasticity aim to resolve how genomic features underlie phenotypes and disease susceptibilities. Identifying genomic features that differ between individuals and cells can help uncover the functional variants that drive specific biological outcomes. For this, single cell studies are paramount, as characterizing the contribution of rare but functional cellular subpopulations is important for disease prognosis, management and progression. Until now, these studies have been challenged by our inability to map structural variants accurately and comprehensively. To overcome this, I employed the template strand sequencing method, Strand-seq, to preserve the organization and structure of individual homologues and visualize structural rearrangements in single cells. Using Strand-seq, I monitored homologue states in human genomes to quantify the degree of somatic rearrangements, and distinguished these from recurrent structural variants, such as inherited inversions. In so doing, I created an innovative tool to rapidly discover, map, and genotype structural polymorphisms with unprecedented resolution. Next, to facilitate systematic analyses of Strand-seq data, I developed novel bioinformatic software that locates putative genomic rearrangements in single cells and identifies recurrent rearrangements across multiple cells. This provides an essential instrument for unbiased and non-targeted structural variant discovery in a high-throughput approach, helping to scale Strand-seq for population-based studies. Applying these tools, I explored the distribution and frequency of structural variation in a heterogeneous cell population to discover and genotype over 100 inversions in the human genome. I found significant structural heterogeneity resides in definable polymorphic domains and within complex and repetitive regions of our genome. 
Finally, I extended my strategy to comprehensively map the complete set of inversions in an individual’s genome and define their unique invertome. Comparing two invertomes, I found sets of inversions can be combined to make predictions about ancestry and health of an individual, and I characterized the architectural features of inversion breakpoints with base-pair resolution. Taken together, I describe a powerful new framework to study structural rearrangements and genomic heterogeneity in single cell samples, whether from individuals for population studies, or tissues for biomarker discovery.
View record
G-quadruplex nucleic acids are a group of nucleic acids formed from the non-Watson-Crick base pairing of guanine nucleic acids. They can readily form at physiological pH and physiological temperatures within sufficiently long stretches of guanine-rich oligonucleotides. Although the existence of the G-quartet (the fundamental unit of a G-quadruplex) in a Petri dish has been recognized since the early 60’s, the existence of G-quadruplex nucleic acids in mammalian cells remains unclear. Yet while unequivocal evidence of G-quadruplex nucleic acids in live cells is still lacking, interest in these potentially important biological structures continues to intensify. G-quadruplex nucleic acids have been suggested to play key roles in essential human molecular pathways including telomere biology, transcriptional regulation and disease development. One of the major obstacles in G-quadruplex nucleic acid research is a lack of tools for the in vivo detection of these structures. In our work, we have harnessed hybridoma technology to produce the first monoclonal antibodies to these unique nucleic acid structures. To our knowledge, these are the first hybridomas secreting monoclonal antibodies obtained through the immunization of mice with purified and validated G-quadruplex structures. Monoclonal antibodies have been approved for use in diagnostic tests and for therapeutic treatments in both cancer and autoimmune diseases, and continue to be very effective laboratory research tools. Using monoclonal antibodies to different G-quadruplex nucleic acids we have explored the existence of G-quadruplex nucleic acids in mammalian cells. One of our antibodies, termed 1H6, forms discrete nuclear foci in human and murine cells and strong nuclear staining in most cells of human tissues. Based on the specificity of the antibodies for defined G-quadruplex structures in vitro, these foci could represent the detection of G-quadruplex nucleic acid structures in mammalian cells. 
If so, the work presented here provides the first direct evidence for the existence of G-quadruplex nucleic acid structures in human cells.
View record
Proper segregation of replicated chromosomes is essential for cell division in all organisms. Linear eukaryotic chromosomes contain specialized protective structures at the chromosome ends, called telomeres, which are essential for maintaining genome stability. Telomere associations have been observed during key cellular processes including mitosis, meiosis and carcinogenesis. These telomere associations need to be resolved prior to cell division to avoid loss of telomere function. TRF1, a core component of the telomere protein complex shelterin, has been implicated as a mediator of telomere associations. To determine the effect of TRF1 protein levels on telomere associations, we used live-cell fluorescence microscopy to visualize telomeres and chromosome dynamics in cells expressing defined levels of TRF1. Elevated levels of TRF1 induced anaphase bridges containing thin “thread-like” stretches of TRF1 foci connecting segregating chromosomes. We also observed telomere aggregates, mitotic bypass, and TRF1 bridges persisting into the following cell cycle. To examine the role of TRF1 in these telomere associations, we generated a TRF1 protein which can be inducibly cleaved by TEV protease. Telomere aggregates appeared to resolve upon cleavage of TRF1 proteins, suggesting that telomere associations result primarily from protein interactions mediated by TRF1. The essential helicase RTEL1 was observed at the extremities of persistent TRF1 bridges, possibly indicating a function for RTEL1 in the resolution of TRF1-induced telomere associations. Taken together, our results demonstrate that precise regulation of TRF1 levels is essential for telomere resolution and mitotic segregation.
View record
Traditional cytogenetic approaches allow analysis of the chromosomal composition (karyotype) of mitotic cells fixed on slides by microscopy. The combination of karyotyping and Fluorescence In Situ Hybridization (FISH) enables the detection of specific target sequences on individual chromosomes. The disadvantages are that traditional cytogenetic approaches are labor-intensive and time-consuming, and that chromosome-specific information from only a few dozen cells has poor statistical power. An alternative is flow karyotyping, a method to analyze chromosomes in suspension by flow cytometry. For flow karyotyping, the DNA composition of specific chromosomes in suspension is measured based on the DNA-specific dyes Hoechst 33258 (HO) and Chromomycin A3 (CA3). My thesis work has focused on the development of a new method to analyze and sort chromosomes using FISH with labeled peptide nucleic acid (PNA) probes on chromosomes in suspension. I found that, following FISH, flow karyotyping can be used to detect and quantify repetitive DNA sequences within individual chromosomes. Using chromosome flow FISH (CFF), chromosomes isolated from cells of various species were hybridized to PNA probes and analyzed by flow cytometry. CFF was used to detect a variety of repeats: interstitial telomeric sequences in Chinese Hamster chromosomes, major satellite in mouse chromosomes and D18Z1 alpha satellite repeats in human chromosomes. Quantitative measurements of repeat length by CFF were validated by comparison with measurements obtained using Q-FISH. We found that parental homologs of human chromosome 18 with different D18Z1 satellite repeat array size could be purified using CFF and Fluorescence Activated Cell Sorting (FACS). Illumina short read sequencing of libraries built from these purified chromosomes enabled us to determine, with a high resolution, the allelic phasing of each homolog over the entire chromosome 18. Finally, CFF was modified to study sister chromatids separately. 
Using a cell model with inducible separation of sister chromatids, flow karyograms were generated. Using chromosome orientation FISH (CO-FISH) in suspension, we could identify sister chromatids according to the presence of DNA template strands. We anticipate that this approach will allow the purification of sister chromatids to study epigenetic differences between sister chromatids defined on the basis of DNA template strands.
View record
Theses completed in 2010 or later are listed below. Please note that there is a 6-12 month delay to add the latest theses.
Eukaryotic chromosome ends are protected by special DNA structures known as telomeres. In mammals the DNA of telomeres consists of TTAGGG repeats which, in cooperation with specialized proteins, “cap” the ends of chromosomes to protect the chromosomes from end-to-end fusion and erosion. Thus, telomeres are important to maintain chromosome stability and play a vital role in preserving the information in our genome. A key factor of telomeric function is the length of the telomeres. Short and dysfunctional telomeres with less than a few dozen repeats are associated with genomic instability and tumorigenesis. Furthermore, loss of telomere function is implicated in numerous diseases like bone marrow failure, hematological malignancies and other cancers. There is a substantial body of evidence indicating that the average length of telomeres can provide prognostic information in human diseases. However, limitations in the currently available technologies for detecting and measuring the length of telomeres have hampered progress in translating telomere length assays into clinical practice. Additionally, many questions about the relation between telomere length and telomere function remain to be answered. Consequently, novel approaches to study single telomeres are of significant interest. In this study, I investigated two cutting-edge technologies to assess properties of telomeres. I used quantitative fluorescence in situ hybridization (Q-FISH) to identify the length of telomeres based on fluorescence values. Using Q-FISH, I was able to generate DNA measurements using plasmids with different size telomeric inserts that served as a reference for length quantifications taken on different platforms. Also, I initiated explorations of a novel high-throughput method to study physical properties of single telomeres and, potentially, measure the length of telomere repeats. 
Convex Lens-induced Confinement (CLiC) is a technique developed to image individual biological molecules and study their dynamics. Using the CLiC platform, I sought to define the length of plasmid DNA based on diffusion coefficient values. My work has set the stage for others to explore the CLiC platform to study properties of telomeres including biological properties as well as their length. The latter can possibly be used as a prognostic tool in bone marrow failure, hematological malignancies and other disorders.
View record
Structural variants (SVs) contribute greater diversity at the nucleotide level between two human genomes than any other form of genetic variation and are three-fold more likely to correlate in genome-wide association studies (GWAS) than single nucleotide variants (SNVs). Using short-read, high-throughput sequencing technologies to uncover such variation has proven to be troublesome and the methods to detect SVs depend on indirect inferences. However, while larger (>5kb) copy number variations (CNVs) could be characterized using read-depth-based algorithms, this approach often fails for smaller and balanced events. Another fundamental problem for detection of SVs from short-read sequencing is inherent to the predominant data type: a typical SV detection algorithm that is effective in unique sequences often fails within complex genomic regions, which have been shown to be highly enriched for SVs. In addition, most SV discovery methods do not indicate the haplotype of origin for a given SV and require parental sequencing for this information. For a more complete description and interpretation of human genomic information in relation to phenotypes such as cancer predisposition and response to therapies, it will, therefore, be necessary to arrange sequence data into parental haplotypes and ascertain polymorphic inversions with respect to such haplotypes. All this can be achieved using Strand-seq. Strand-seq complements other sequencing approaches by providing crucial information about the genetic make-up of individuals that cannot be obtained in any other way. To make Strand-seq available for human studies worldwide is an immense challenge. Library construction, as well as data analysis, needs to be further developed, integrated and made user-friendly to allow accurate and rapid interpretation of results. 
Here we present a custom bioinformatics pipeline for analyzing Strand-seq data that streamlines the workflow of raw sequence read alignment, putative variant calling, variant call refinement and haplotype assembly by integrating currently available Strand-seq specific tools. In addition, relevant metric data are compiled and visualized, ensuring and reinforcing the potential of Strand-seq as a robust sequencing method for uncovering clinically significant SVs and the assembly of WGH without additional parental genomic data.
View record
Template strand sequencing (Strand-seq) is a single cell sequencing approach which maintains 5’ -> 3’ directionality of sequence reads. I hypothesized that the directional information preserved can be used to map complex translocation events. Translocations often disrupt gene expression by reshuffling regulatory elements or by formation of novel fusion transcripts. Yet, detection is often difficult, confounded by complexities of the Structural Variations (SVs). I chose a cell line derived from a patient with pediatric Acute Lymphoblastic Leukemia (iALL) with a known complex karyotype. My aim was to explore Strand-seq’s ability in identifying breakpoints, linking translocation partners and resolving the configuration of SV, comparing low coverage Strand-seq data against high coverage Whole Genome Sequence (WGS) data as reference. The iALL cells selected for my study harbor complex translocations involving 4 chromosomes with 5 breakpoint positions that were previously validated by Fluorescent In Situ Hybridization (FISH). BreakpointR, a novel pipeline for Strand-seq analysis, was able to identify 18 breakpoints, 5 of which were isolated for further analysis. These 5 breakpoints were identified with a resolution of 5-60kb, overlapping with the genomic positions of breakpoints identified by WGS analysis, validating the accuracy of BreakpointR. Despite the lower sequencing coverage of Strand-seq, 18 breakpoints were detected against WGS’s 119 breakpoints. Next, I developed a workflow to link translocation partners involved in the breakpoints, successfully linking 4 of 5 translocation partners; the final fragment remained unresolved due to lack of reads within the genomic interval, a limitation of low sequence coverage. 
By comparison, WGS successfully linked 4 of 5 translocation partners, its limitations of mapping across repetitive regions resulting in a different unresolved fragment. Post-translocation-partner matching and single cell resolution from Strand-seq allowed us to further interrogate expected breakpoints for each single cell. Strand-seq analysis identified an inversion of the 100kb fragment in chromosome 11, validated with Sanger sequencing, representing an additional layer of complexity not identified by the other approaches. I conclude that the application of Strand-seq should be further explored in the areas of SV mapping as it has been proven useful for complementing the inherent difficulties of complex SV mapping across repetitive regions.
View record
Hutchinson-Gilford Progeria Syndrome (HGPS) is a premature aging disorder caused by mutations in the gene LMNA, which encodes the nuclear matrix protein, Lamin A. Lamin A is found predominantly at the nuclear periphery but also throughout the nucleus in a ‘nucleoplasmic veil’. The majority of HGPS patients have a single nucleotide mutation (1824 C→T) which results in the activation of a cryptic donor splice site causing a 150 nucleotide deletion in the mRNA and consequently a 50 amino acid in-frame deletion in the protein. The mutation results in aberrant processing and nuclear localization of the Lamin A protein. HGPS cells are characterized by misshapen nuclei, chromatin disorganization, accumulation of mutant Lamin A, short telomeres, DNA damage recruitment defect and early senescence. To measure the telomere length of individual chromosomes, Quantitative Fluorescence in-situ Hybridization was used. The average telomere length in HGPS fibroblasts was greatly decreased compared to controls as well as highly variable. In contrast, the telomere length in hematopoietic cells which do not express LMNA was within the normal range for three out of four HGPS patient samples. These results suggest that mutant Lamin A decreases telomere length via a direct effect and that expression of mutant LMNA is necessary for telomere loss in HGPS. Three different aspects of telomere biology were investigated: localization, mobility and attachment to the matrix. Telomeres were more localized to the nuclear periphery in HGPS fibroblasts than in wild type fibroblasts and were abnormally localized with regard to euchromatin/heterochromatin. To examine mobility, fluorescently tagged proteins were constructed to examine interactions between wild type and mutant Lamin A and telomeres during live cell imaging. Long telomeres in cells with the mutant protein did not move the same distance as those in wild type cells. 
Mutant Lamin A did not bind DNA with the same affinity as the wild type Lamin A did. These investigations show that telomeres and telomere dynamics are altered in HGPS cells. This is likely contributing to aspects of the pathology of the disease and would need to be taken into consideration in any therapeutic approach.
View record
Partner appointment