Relevant Degree Programs
Graduate Student Supervision
Doctoral Student Supervision (Jan 2008 - May 2019)
No abstract available.
Non-coding RNAs (ncRNAs) consist of microRNAs, lincRNAs (long intergenic non-coding RNAs), rRNAs, tRNAs and the RNAs from other types of genes that do not have the potential to be protein-coding. Non-coding RNAs play various roles in cellular processes. Gene duplication is a major force in gene evolution and the evolution of duplicated protein-coding genes has been studied extensively. Whether the same evolutionary principles hold true for ncRNAs, especially lincRNAs, is still poorly understood, particularly in plants. I characterized the effects of the change in microRNA binding sites on the divergence of multiple types of duplicated genes in Arabidopsis thaliana and Brassica rapa (Chapter 2). I found that the vast majority of duplicated genes showed divergence in their microRNA binding sites that could be associated with their expression and functional divergence. To better understand the evolutionary dynamics of lincRNAs in plants, I analyzed the sequence evolution of lincRNAs from five species (Arabidopsis thaliana, Oryza sativa ssp. japonica, Zea mays, Medicago truncatula and Solanum lycopersicum) across 55 plant genomes (Chapter 3). My analyses revealed that lincRNAs show more rapid sequence divergence compared with protein-coding genes and microRNAs. I also analyzed the expression conservation of lincRNAs between closely related species and showed rapid expression evolution of lincRNAs. I also identified a considerable number of conserved regions in the sequence of lincRNAs that are under stronger selection constraints than surrounding regions. To investigate the role of gene duplication in the evolution of plant lincRNAs, I identified duplicated lincRNAs in several plant species (Chapter 4). I compared the expression patterns between duplicated lincRNAs using RNA-seq data from multiple tissue types and developmental stages, revealing extensive expression divergence of lincRNAs. Finally, I studied the effects of polyploidy and abiotic stress on the expression of lincRNAs in diploid and polyploid Brassica species (Chapter 5). My results showed extensive divergence of the expression of lincRNAs after polyploidy and in response to different stresses. This thesis provides new insights into lincRNA evolution and fates of lincRNAs after duplication in flowering plants.
Gene and genome duplications have made major contributions to the genomes of eukaryotes. Alternative splicing modulates gene expression and alters protein function. First, I examined alternative splicing patterns in the allopolyploid Brassica napus, revealing that the genome-wide trends of alternative splicing in duplicated genes of an evolutionarily new allotetraploid plant are very similar overall to those found in Arabidopsis thaliana. Within Brassica napus, I showed that the alternative splicing patterns of the reunited homeologs are not well conserved, highlighting that alternative splicing is a rapidly evolving aspect of gene expression. Second, using Arabidopsis thaliana, I investigated the divergence of alternative splicing between paralogs, revealing about 30% qualitative conservation of alternative splicing events. I determined that qualitatively conserved events most often are not quantitatively conserved, indicating either incomplete divergence or specialization. I examined the duplicate gene pair of CCA1/LHY in detail, showing a case of subfunctionalization of alternative splicing after gene duplication that has implications for the cold response pathway of A. thaliana. By analyzing a transcriptome data set from nonsense-mediated decay mutants, I showed that alternative-splicing-mediated nonsense-mediated decay has significantly diverged between pairs of both whole-genome and tandem duplicates. Third, I investigated the immediate effects of allopolyploidization on gene expression and alternative splicing using three resynthesized Brassica napus lines. Many of the effects of allopolyploidization are repeatable; however, some changes to gene expression and alternative splicing are unique to an instance of polyploidy. In all three polyploids surveyed, intron retention events that changed their frequency did so in an overwhelmingly negative fashion (i.e. the levels of alternatively spliced transcripts went down) and the majority of these changes were parallel between polyploids. Other classes of alternative splicing events showed a far more balanced set of changes in response to polyploidy. Natural B. napus showed significantly more increases in intron retention frequency vs. the parental species than any of the resynthesized lines. I assert that many of the changes in levels of alternatively spliced transcripts can be attributed to the stochastic nature of polyploidization.
Duplicated genes are considered raw materials for evolutionary innovations. They are common in eukaryotic genomes, particularly in plants due to the high incidence of whole genome duplication. Thus, understanding the factors that contribute to the retention of duplicated genes is a fundamental topic in evolutionary biology. I tackle this topic by examining how reciprocal expression (RE) among different organ and tissue types, as well as protein subcellular relocalization (PSR), contributes to the retention of duplicated genes. From analyses of microarray data across 83 different organ/cell types and developmental stages in Arabidopsis thaliana, I determined that more than 30% of duplicate pairs showed RE patterns (chapter 2). Reconstruction of their ancestral expression patterns showed that more RE cases resulted from gain of a new expression pattern (neofunctionalization) than from partitioning of ancestral expression patterns (subfunctionalization), with pollen being a common location for expression gain (chapter 2). During the analysis of RE, I found a dramatic example of neofunctionalization for a pair of protein kinase genes, SSP and BSK1, in the Brassicaceae (chapter 3). BSK1 and SSP have opposite expression patterns in pollen compared with all other parts of the plant. I determined that BSK1 retains the ancestral expression pattern and function and that the ancestral function of SSP was lost by deletions in the kinase domain. I revealed that SSP changed its function from a component of the brassinosteroid signaling pathway to being a paternal regulator of embryogenesis. I also found that two reciprocally expressed duplicated gene pairs, a peroxidase gene pair and a CDPK gene pair, in Brassicaceae showed PSR and evidence for neofunctionalization (chapter 2). To better understand how PSR can contribute to the retention of duplicated genes, I focused on a particular example for a pair of the chloroplast-origin ribosomal protein S13 (rps13) genes in rosids (chapter 4). One encodes chloroplast-imported RPS13 (nucp rps13), while the other encodes mitochondria-imported RPS13 (numit rps13). I provided evidence that numit rps13 genes have experienced adaptive and convergent evolution. My thesis provides important insights into the evolutionary importance of RE and PSR on the retention of duplicated genes in plants.
Master's Student Supervision (2010 - 2018)
Plant genomes have large numbers of duplicated genes. After duplication one duplicate can acquire a new function or expression pattern, referred to as neofunctionalization. Some duplicated genes are imprinted, where only one allele is expressed depending on its parental origin. I hypothesized that duplicated imprinted genes frequently show an accelerated rate of amino acid sequence evolution and have a new expression pattern compared with their paralogs, which together are suggestive of neofunctionalization. I first studied four imprinted genes in Arabidopsis, FIS2, MPC, FWA, and HDG3, that have flower and/or seed specific expression. I found that they all have considerably accelerated rates of sequence evolution compared to their paralogs. To determine the ancestral expression pattern I assayed expression patterns in outgroup species, the results of which strongly suggested that the imprinted genes have acquired a novel organ-specific expression pattern restricted to flowers and/or seeds. Using data from recent large-scale identification studies of imprinted genes, I detected by phylogenetic tree analyses 133 imprinted genes that arose from gene duplication events in Brassicaceae. Analyses of 48 alpha whole genome duplicated gene pairs indicated that many imprinted genes show an accelerated rate of amino acid changes compared to their paralogs. Analyses of microarray data indicated that many imprinted genes have expression patterns restricted to flowers and/or seeds, compared with their broadly-expressed paralogs. Both the accelerated rate of sequence evolution and the new expression pattern in the imprinted genes suggest that after evolutionarily recent duplication events, imprinted genes frequently underwent neofunctionalization. In particular, neofunctionalization of the FIS2 gene has led to a change in the mechanism of regulating seed development in Brassicaceae. Multiple lines of evidence, when considered together, are highly suggestive of many origins of imprinting in Brassicaceae. This study reveals that genetic imprinting can arise over short evolutionary time periods and that gene duplication serves as an important factor generating imprinted genes.
Long intergenic non-coding RNA (lincRNA) genes are a poorly studied class of transcripts, particularly in plants. Because of the low levels of expression, high tissue specificity, and rapid rate of evolution of lincRNA transcripts, the discovery and functional annotation of these molecules is a significant challenge. Here, I report the annotation of 201 new lincRNA transcripts in Arabidopsis thaliana discovered using the results of a single RNA-seq experiment of a normalized library. Using these sequences, along with the 6,480 lincRNA genes annotated by Liu et al. (2012), I performed a pairwise sequence alignment experiment with the genomes of 22 plant species in order to discover highly conserved sequences within lincRNA loci. Of the 6,681 lincRNA sequences examined, 3,374 have highly conserved sequences supported by multiple genomic alignments to other species. Six of these show evidence of an ongoing reduced rate of sequence evolution when single-nucleotide variant data from the recent evolutionary history of Arabidopsis thaliana are examined. The rate of retention of these conserved regions within the Brassicaceae suggests a much higher rate of sequence turnover in lincRNA genes compared with protein-coding genes. Structural variant data from 80 different A. thaliana ecotypes suggest that lincRNA genes suffer deletions of the entire locus from the genome with appreciable frequency: 570 of the lincRNA loci examined are entirely missing from at least one A. thaliana strain. These results suggest an intriguing mixture of rapid sequence evolution with short, highly-conserved islands in lincRNA genes.
Gene duplication has supplied the raw material for novel gene functions and evolutionary innovations in plants. Duplicated genes can have different fates over time such as neofunctionalization and subfunctionalization. Sublocalization, which is a type of subfunctionalization based on protein subcellular relocalization, happens when the products of the duplicate genes are each directed to only one of two subcellular locations that were previously targeted by the single ancestral gene. The goals of the first part of my project were to study changes in protein subcellular localization (relocalization) after gene duplication by finding cases of sublocalization and further characterizing them from an evolutionary perspective. I found that sublocalization is a relatively uncommon phenomenon in plants as only two out of the seven gene families that I analyzed demonstrated cases of sublocalization. I identified and analyzed multiple cases of sublocalization of the APX and PP5 genes by doing RT-PCR experiments and then performing phylogenetic analyses and sequence rate analyses to further characterize the genes from an evolutionary perspective. Regulatory neofunctionalization involves changes in expression patterns of a gene after duplication. The goals for the second part of my thesis were to study expression patterns of duplicated genes in Arabidopsis thaliana and to analyze the selective forces acting on the genes of interest. I focused on eight pairs of duplicates that showed one copy broadly expressed and the other copy having expression only in certain organ types. By analyzing the expression patterns of the orthologs in outgroup species and selective forces acting on the sequences, I obtained evidence for potential neofunctionalization for a few cases. The results from my thesis provide new insights into the frequency and process of sublocalization of duplicated genes, as well as new examples of neofunctionalization of duplicated genes.
Gene expression divergence between populations has been linked to adaptive morphological evolution and is thought to be a factor in the invasive success of certain weedy plants. Understanding the genetic basis of these regulatory changes can identify genes that have been under selection during adaptation to a new environment or new species interactions. A high-throughput sequencing approach was used to study the regulatory basis (cis and/or trans) of gene expression differences between native and invasive populations of Cirsium arvense (Canada thistle) by exploring patterns of differential gene expression and sequence variation. Parent and hybrid allele-specific expression ratios were compared to infer the relative effects of cis- and trans-regulatory change. Genes differentially regulated in cis are considered candidate genes involved in adaptation or weediness because there is evidence for selection acting primarily on cis-regulatory variation. Illumina sequencing of cDNA libraries derived from parents and hybrid pools resulted in a total of 82,713,256 paired-end (2x100 bp) reads and 83.4% of these were mapped to a reference C. arvense transcriptome of 88,374 unigene sequences. Expression analysis and variant (SNP and indel) calling were performed to score the nature of regulatory divergence for the first 900 contigs, representing ~1% of the total dataset. Of the 40 high-confidence cases, 7 showed cis-effects, 6 showed trans-effects, 9 had varying degrees of both cis and trans, and 18 showed non-intermediate hybrid effects. A set of contigs that had high similarity to 63 known or confirmed stress-related genes, previously identified in studies of sunflower and Canada thistle, was also assayed for allelic imbalance. Of these, 2 cases showed a cis-effect, 2 showed both cis- and trans-effects, and 2 revealed hybrid effects. Contig 23614, an auxin-response transcription factor, was differentially regulated due to cis-effects and has been previously confirmed as a drought-stress gene in both sunflower and C. arvense. This research identifies changes in gene expression that are driven by differential selective pressures in native and invasive populations. It also advances our understanding of the nature of genetic changes that drive gene expression evolution.
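The comparison of parental and hybrid allele-specific expression ratios described in the abstract above can be illustrated with a minimal sketch. It assumes the commonly used decomposition in which the hybrid allelic log ratio estimates the cis effect and the difference between the parental and hybrid log ratios estimates the trans effect. The function name, the read counts, and the fixed log2 threshold are hypothetical; the abstract does not state which statistical tests or cutoffs were used, and the "non-intermediate hybrid effects" category is not modelled here.

```python
import math


def classify_regulatory_divergence(parent_a, parent_b, hybrid_a, hybrid_b,
                                   threshold=1.0):
    """Classify one gene as cis, trans, cis+trans, or conserved.

    parent_a, parent_b: expression of the gene in the two parental populations.
    hybrid_a, hybrid_b: allele-specific expression of the two alleles in the hybrid.
    threshold: hypothetical cutoff on the absolute log2 ratio; a real analysis
    would use a statistical test on read counts instead of a fixed cutoff.
    """
    # Total expression divergence between the parents.
    parental = math.log2(parent_a / parent_b)
    # In the hybrid both alleles share the same trans environment,
    # so any remaining allelic imbalance is attributed to cis change.
    cis = math.log2(hybrid_a / hybrid_b)
    # Whatever parental divergence is not explained by cis is attributed to trans.
    trans = parental - cis

    has_cis = abs(cis) >= threshold
    has_trans = abs(trans) >= threshold
    if has_cis and has_trans:
        return "cis+trans"
    if has_cis:
        return "cis"
    if has_trans:
        return "trans"
    return "conserved"


if __name__ == "__main__":
    # Hypothetical counts: imbalance persists in the hybrid, so it is cis.
    print(classify_regulatory_divergence(400, 100, 400, 100))
    # Hypothetical counts: imbalance disappears in the hybrid, so it is trans.
    print(classify_regulatory_divergence(400, 100, 100, 100))
```

Under this scheme, a gene whose parental difference is fully retained between alleles in the hybrid is scored as cis, which is the class the abstract treats as the source of candidate adaptation genes.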