Relevant Thesis-Based Degree Programs
Affiliations to Research Centres, Institutes & Clusters
We have ongoing opportunities for highly motivated individuals to work at the interface of microbial ecology, biological engineering and bioinformatics across a range of basic and applied science projects.
We are looking for self-motivated cooperators with analytic and problem-solving skills and a strong desire to help create an equitable and sustainable future for humanity interacting with the Earth system.
Complete these steps before you reach out to a faculty member!
- Familiarize yourself with program requirements. You want to learn as much as possible from the information available to you before you reach out to a faculty member. Be sure to visit the graduate degree program listing and program-specific websites.
- Check whether the program requires you to seek commitment from a supervisor prior to submitting an application. For some programs this is an essential step while others match successful applicants with faculty members within the first year of study. This is either indicated in the program profile under "Admission Information & Requirements" - "Prepare Application" - "Supervision" or on the program website.
- Identify specific faculty members who are conducting research in your specific area of interest.
- Establish that your research interests align with the faculty member’s research interests.
- Read up on the faculty members in the program and the research being conducted in the department.
- Familiarize yourself with their work, read their recent publications and past theses/dissertations that they supervised. Be certain that their research is indeed what you are hoping to study.
- Compose an error-free and grammatically correct email addressed to your specifically targeted faculty member, and remember to use their correct titles.
- Do not send non-specific, mass emails to everyone in the department hoping for a match.
- Address the faculty members by name. Your contact should be genuine rather than generic.
- Include a brief outline of your academic background, why you are interested in working with the faculty member, and what experience you could bring to the department. The supervision enquiry form guides you with targeted questions. Be sure to craft compelling answers to these questions.
- Highlight your achievements and why you are a top student. Faculty members receive dozens of requests from prospective students and you may have less than 30 seconds to pique someone’s interest.
- Demonstrate that you are familiar with their research:
- Convey the specific ways you are a good fit for the program.
- Convey the specific ways the program/lab/faculty member is a good fit for the research you are interested in/already conducting.
- Be enthusiastic, but don’t overdo it.
G+PS regularly provides virtual sessions that focus on admission requirements and procedures and tips on how to improve your application.
ADVICE AND INSIGHTS FROM UBC FACULTY ON REACHING OUT TO SUPERVISORS
These videos contain some general advice from faculty across UBC on finding and reaching out to a potential thesis supervisor.
Graduate Student Supervision
Doctoral Student Supervision
Dissertations completed in 2010 or later are listed below. Please note that there is a 6-12 month delay to add the latest dissertations.
We live in a world dominated by microorganisms that encode a vast yet still largely unexplored diversity of metabolic functions. Using function-driven approaches it is possible to mine this diversity for both basic science insights and industrial applications. In this thesis, a functional metagenomic screening paradigm using large-insert environmental DNA (fosmid) library pool selection was developed in an Escherichia coli EPI300 background, enabling users to design and build pool selection experiments, sequence pooled clones, assemble and annotate resulting sequences and begin to characterize activities of interest encoded within pools. Tests were conducted on a collection of nine fosmid libraries to select for conversion or tolerance traits enabling growth in the presence of inhibitory concentrations of vanillin to identify biological parts useful in design of industrial chassis for lignin valorization. Relevant fitness traits related to monoaromatic conversion as well as tolerance to monoaromatic compounds, including stress responses, transporters and cell surface modification functions, were recovered. Based on these results, a larger selection experiment was conducted using seven lignin-derived monoaromatic compounds. Comparative analysis of enrichment pools revealed that the same clones were recovered on multiple aromatic compounds, consistent with the presence of genes encoding non-specific tolerance traits. Efforts to grow fosmid pools on lignin-derived monoaromatics as sole carbon sources were unsuccessful, pointing to workflow modifications including adoption of new screening hosts, overexpression of transporters and regulatory elements, or the insertion of genes in E. coli EPI300 encoding upstream or downstream conversion steps within multi-component catabolic pathways.
Finally, the screening paradigm was extended to search for hydrocarbon tolerance phenotypes using toluene in three fosmid pools sourced from anaerobic hydrocarbon degrading enrichment cultures, resulting in recovery of fitness traits including transport and cell surface modification. Pool growth was unstable, prompting a time course experiment to better constrain pool diversity and dynamics. Fosmid pool sequencing recovered numerous genes encoding tolerance traits that varied between initial and later exposure phases, consistent with clone succession. Looking ahead, integration of high-throughput functional metagenomics with strain engineering promises to promote functional characterization of these biological parts and will ultimately inform the development of microbial cell factories for sustainable biotechnology innovation.
Microorganisms within the domains Bacteria and Archaea are the most ancient and abundant forms of life in Earth’s biosphere. Their profound diversity manifests resilient communities that extend into virtually every conceivable niche. Over the past decade, high-throughput DNA sequencing of whole communities has transformed our perception of these branches of life, illuminating uncultivated lineages and conceptually linking microorganisms at various levels of biological organization to global-scale biogeochemical cycles. A core set of genes mediating transformations of energy and matter that drive a wide range of ecosystem services and functions evolved early in life’s history. Yet, identifying these functional and phylogenetic anchor genes amidst a veritable haystack of sequence information remains a challenging endeavour. In addition to common limitations including inefficient and overly permissive homology search methods, contemporary annotation tools fail to account for the paltry representation of extant diversity in natural and engineered environments, leading to biased interpretations. To this end, I have developed the Tree-based Sensitive and Accurate Phylogenetic Profiler (TreeSAPP), a Python package for gene-centric analysis of microbial communities. TreeSAPP creates, updates, and leverages structured data objects called reference packages for homology search, phylogenetic placement, and taxonomic assignment of protein sequences. In comparison to related tools, TreeSAPP exhibits better classification performance and a broader suite of functions relevant to microbial ecology. I showcase recent improvements to the classification pipeline and introduce a supervised phylogeny partitioning algorithm useful in defining operational protein clusters compatible with common diversity estimation metrics.
Example workflows are provided to construct accurate and inclusive reference packages in a principled manner, and quantify marker gene sequences in environmental datasets. Finally, I demonstrate the capabilities of TreeSAPP in a census of methane-cycling and alkane-transforming archaea, revealing expanded ecosystem ranges and support for numerous novel lineages that encode methyl-coenzyme M reductase. Resulting reference packages and data products provide a framework for developing molecular tools with which to probe and enrich for microbial agents driving selected environmental transformations and inform gene-centric modeling efforts to predict microbial community responses to environmental perturbation.
Metabolic pathway prediction within and between cells from genomic sequence information is an integral problem in biology linking genotype to phenotype. This is a prerequisite to both understanding fundamental life processes and ultimately engineering these processes for specific biotechnological applications. A pathway prediction problem exists because we have limited knowledge of the reactions and pathways operating in cells, even in model organisms like Escherichia coli where the majority of protein functions are determined. Consequently, over the past decades several computational tools were developed to automate the reconstruction of pathways given enzymes obtained from genomes. Unfortunately, with the ever-increasing content and diversity of publicly available genomics and metagenomics datasets, those algorithms face increasingly prominent and complex problems. These include an inability to systemically resolve meta-level noise, neglect of pathway interactions, failure to consider vagueness associated with enzymes, and inadequate scaling to heterogeneous genomic datasets. In an attempt to resolve these problems, this thesis examines multiple pathway prediction models given a list of enzymes based on multi-label learning approaches. Specifically, it first introduces mlLGPR, which encodes manually designed enzyme and pathway properties to reconstruct pathways. Then, it proposes triUMPF, a more advanced model that jointly characterizes interactions among pathways and enzymes with community detection from enzyme and pathway networks to improve the precision of predictions. This requires pathway2vec, a novel representation learning model, to automatically generate features aiding triUMPF’s prediction process. Next, the thesis presents leADS, which subselects more impacted examples from a dataset to increase the pathway sensitivity performance.
This model may rely on reMap, a novel relabeling algorithm, that incorporates the bag concept, which is composed of correlated pathways, to articulate missing pathways from data. Finally, all these models are integrated into a unified framework, mltS, to achieve the desired balance between sensitivity and precision outputs while assigning a confidence score to each model. The applicability of these models to recover pathways at the individual, population, and community levels of organization was examined against the traditional inference algorithms using benchmark datasets, where all the proposed models demonstrated accurate predictions and outperformed the previous approaches.
Microbial communities play an integral role in the biogeochemical cycling of carbon, nitrogen and sulfur throughout the biosphere. These communities interact, forming metabolic networks that change and adapt in response to availability of electron donors and acceptors. Oxygen minimum zones (OMZs) are regions of the ocean where oxygen (O₂) is naturally depleted. In OMZs microbial communities use alternative terminal electron acceptors such as nitrate, sulfate and carbon dioxide, resulting in fixed nitrogen loss and production of greenhouse gases including methane (CH₄). In this thesis, I explored microbial community structure, dynamics and metabolic interactions as they relate to CH₄ cycling in Saanich Inlet, a seasonally anoxic fjord on the coast of British Columbia, Canada that serves as a model ecosystem for studying microbial processes in OMZs. Leveraging decadal time series observations in Saanich Inlet, I developed a geochemical dataset consisting of nutrient and gas measurements, coupled with multiomic (DNA, RNA and protein) sequence information to chart microbial community structure and dynamics along defined redox gradients. I conducted methods optimization comparing in situ and on-ship sampling paradigms and used correlation analysis to infer putative microbial interaction networks in relation to water column CH₄ oxidation. Methanotrophic bacteria in Saanich Inlet were identified associated with three uncultivated Gammaproteobacteria clades termed OPU1, OPU3 and symbiont-related that partitioned in the water column during periods of prolonged stratification. Water column distribution of the OPU3 clade was found to correlate with nitrite (NO₂-). Based on these results, I conducted incubations with labelled CH₄ and NO₂- to test this correlation and constrain potential metabolic interactions between methanotrophs and other one-carbon utilizing microorganisms under low O₂ conditions.
Using multi-omic information derived from these incubations I confirmed the role of OPU3 in coupling CH₄ oxidation to NO₂- reduction and uncovered potential metabolic interactions between OPU3 and other co-occurring microorganisms including Methylophilales, Planctomycetes and Bacteroidetes. Evidence for a communal function in CH₄ oxidation expands the role of OPU3 in the global carbon budget and provides a conceptual foundation for the development of numerical models to predict CH₄ flux from OMZs as they expand throughout the global ocean.
Microbial communities mediate biogeochemical processes of Carbon (C), Nitrogen (N) and Sulfur (S) cycling in the ocean on global scales. Oxygen (O₂) availability is a key driver in these processes and shapes microbial community structure and metabolisms. As O₂ decreases, microbes utilize alternative terminal electron acceptors, nitrate (NO₃–), nitrite, sulfate and carbon dioxide, depleting biologically available nitrogen and producing greenhouse gases nitrous oxide (N₂O) and methane (CH₄). Marine oxygen minimum zones (OMZs) are areas of O₂-depletion (O₂
Plant biomass offers a sustainable source for energy and materials and an alternative to fossil fuels. However, the industrial scale production or biorefining of fermentable sugars from plant biomass is currently limited by the lack of cost effective and efficient biocatalysts. Microbes, the earth's master chemists - employing biocatalytic solutions to harvest energy, and transform this energy into useful molecules - offer a potential solution to this problem. However, a majority of microbes remain uncultured, limiting our access to the genetic potential encoded within their genomes. This has spurred the development of culture independent methods, termed metagenomics. In this thesis I harnessed high-throughput functional metagenomic screening to discover biomass deconstructing biocatalysts from uncultured microbial communities. Towards this goal, twenty-two clone libraries containing DNA sourced from diverse microbial communities inhabiting terrestrial and aquatic ecosystems were screened with 4-methylumbelliferyl cellobioside to detect glycoside hydrolase activity. This revealed 178 active clones containing glycoside hydrolases, often in gene clusters. This set of active clones was consolidated and further characterized through sequencing and rapid, plate-based, biochemical assays. Additionally, libraries sourced from beaver fecal and gut microbiomes were screened with four fluorogenic probes (6-chloro-4-methylumbelliferyl derivatives of cellobiose, xylobiose, xylose and mannose) for glycoside hydrolase activity. This revealed a total of 247 active fosmid-harbouring clones, that encoded many polysaccharide-degrading genes and gene cassettes. Specific candidate genes from the fecal library were sub-cloned, and the resulting purified enzymes were shown to be involved in synergistic degradation of arabinoxylan oligomers. 
The clone libraries that were generated through functional metagenomic screening were then employed to reveal the promiscuity of glycoside hydrolases towards unnatural azido- and aminoglycosides. Promiscuous enzymes identified from metagenomic and synthetic clone libraries were then used as a starting point for the generation of new glycosynthases capable of incorporating modified glucosides and galactosides. The resulting set of eight new glycosynthases is capable of synthesizing di- and trisaccharides, glycolipids and inhibitors such as 2,4-dinitrophenyl 4'-amino-2,4'-dideoxy-2-fluoro-cellobioside. Taken together, this work has exploited the power of functional metagenomics to reveal new modes of biocatalysis and develop new synthetic tools.
Limitations on the cultivation of a majority of naturally occurring microbes have spurred the rise of culture-independent methods for the investigation of environmental microbial communities, a field known as metagenomics. This thesis addresses both functional and informatic approaches to metagenomics with the aim of improving our knowledge of carbohydrate degradation. A high throughput functional metagenomic screen was developed and applied to over 350,000 fosmid clones to search for glycoside hydrolases (GHs) in metagenomic libraries. Screening yielded 798 fosmid clones capable of hydrolyzing a model sugar compound, and the genes responsible were subcloned and biochemically characterized for pH and temperature stability, and substrate specificity. The combination of functional and in silico methods developed was used in a longitudinal study of the beaver (Castor canadensis) digestive tract, in order to gain insight into the sequential degradation of biomass. A linear model was used to identify enrichment of endo-acting versus exo-acting GH families at five locations throughout the digestive tract. The discovery of high numbers of GH43 family genes on functionally identified fosmids resulted in their combination with all other known GH43 genes in order to create subfamily classifications that provide finer resolution of enzyme activities. This classification system resulted in an improved ability to assign functional characteristics to enzymes identified through informatic studies. Of the 37 subfamilies created, only 22 contained a characterized enzyme. Fosmids identified earlier in this work harboured genes from four uncharacterized GH43 subfamilies, and future characterization efforts will further our understanding of the GH43 family. Altogether, the developed methods provide a framework for future studies of biomass degradation and improve the power of both functional and in silico metagenomics.
Microorganisms are the stewards and creators of Earth's ecosystems, driving planetary nutrient and energy cycles. As such, the interactions and metabolic processes of microbial communities have emerged as a fundamental area of scientific research. Through the use of multi-omic (e.g., metagenomics, metatranscriptomics, proteomics) sequence information, it is possible to reconstruct the compositional, regulatory, and distributed metabolic processes connecting microbial community members. This dissertation develops an interpretative framework for the joint analysis of compositional network patterns and metabolic pathway reconstruction using metagenomic and metatranscriptomic data from soils collected from lodgepole pine forests 13 years post-harvesting and soil organic matter removal, and from adjacent undisturbed lodgepole pine forests and Interior Douglas-fir forests, at two Long Term Soil Productivity (LTSP) sites located near Kamloops and Williams Lake, B.C., Canada. Further, this dissertation serves to improve the accuracy with which environmental sequence data are analyzed by leveraging the work of statisticians to overcome known biases in canonical approaches to data normalization. Finally, this work provides a systematic approach to the functional annotation of unassembled data, and extends an existing diversity index to accept data types common in studies involving uncultivated microbial annotation. Together the data indicated spatiotemporal variation in, and forest harvesting impact on, metabolic interactions and genomic potential for plant biomass degradation and carbon cycling. However, redundant metabolic capacity combined with genetic variation within the microbial community ensures natural and anthropogenically-induced environmental change had disparate effects across community members, thereby moderating the consequences of localized extinctions or niche space reduction, and guarding against the loss of metabolic functions within the soil ecosystem.
Indeed, environmental change can result in the reshuffling of trophic relationships and information exchange (e.g., H+, metabolites, and horizontal gene transfer), allowing novel interactions between organisms to form and increasing the community's ability to tolerate disturbance. This work represents an important step in understanding how environmental changes impact microbial communities and ecosystem function within the soil milieu. Ultimately, data and findings from this dissertation can be integrated with future analyses of biogeochemical parameter information and thermodynamic principles to enable time variable forecasts of microbial adaptive response to environmental change.
Microorganisms are the most abundant and diverse forms of life on Earth. Interconnected microbial communities drive matter and energy transformations integral to ecosystem functions and services through distributed metabolic networks innovated over 3.5 billion years of evolution. To effectively harness this metabolic potential it is necessary to chart uncultivated microbial community structure and function. Cultivation-independent studies indicate that over half of the microbial diversity on Earth belongs to uncultivated candidate divisions, also known as microbial dark matter (MDM). Here, I illuminate MDM structure and function in meromictic Sakinaw Lake on the Sunshine Coast of British Columbia, Canada. Sakinaw Lake water column conditions quantified over 8 field campaigns were intimately associated with unprecedented abundance and diversity of MDM. Using network analysis and single cell genomics, co-occurrence patterns between MDM including OP9/JS-1, OP8 and methanogenic Archaea were linked with potential to perform syntrophic acetate oxidation, an important process in anaerobic digestion of organic matter in natural and engineered ecosystems. Single-cell and metagenome analysis revealed previously unrecognized nitrate reduction potential in candidate division OP3 and uncovered a novel archaeal lineage. In addition, numerous Fe-S oxidoreductases associated with MDM in Sakinaw Lake indicate the potential to couple sulfur oxidation to iron reduction. Taken together, my work establishes Sakinaw Lake as a natural laboratory in which to explore MDM structure and function, shines a spotlight on known and novel interactions and metabolic capabilities among these most enigmatic microorganisms, and points to potential biotechnological innovations based on cooperative interactions between MDM populations.
The lack of cultivated reference strains for the majority of naturally occurring microorganisms has led to the development of plurality sequencing methods and the field of metagenomics, offering a glimpse into the genomes of this so-called 'microbial dark matter' (MDM). An explosion of sequencing initiatives has followed, attempting to capture and extract biological meaning from MDM across a wide range of ecosystems from deep-sea vents and polar seas to waste-water bioreactors and human beings. Current analytic approaches focus on taxonomic structure and metabolic potential through a combination of phylogenetic anchor screening of the small subunit ribosomal RNA gene (SSU or 16S rRNA) and general sequence searches using homology-based inference. Though much has been learned about microbial diversity and metabolic potential within natural and engineered ecosystems using these approaches, they are insufficient to resolve the ecological relationships that couple nutrient and energy flow between community members - ultimately translating into ecosystem functions and services. This shortcoming arises from a combination of data-intensive challenges presented by environmental sequence information that span processing, integration, and interpretation steps, and a general lack of robust statistical and analytical methods to directly address these problems. This dissertation addresses some of these shortcomings through the development of a modular analytical pipeline, MetaPathways, allowing for the large-scale and systematic processing and integration of many forms of environmental sequence information. MetaPathways is built to scale, comparing hundreds of metagenomic samples through the efficient use of data structures, grid compute models, and interactive data query.
Moreover, it attempts to bring functional analysis back to the metabolic map through the creation of environmental pathway/genome databases (ePGDBs), adopting the Pathway Tools software for metabolic pathway prediction on the MetaCyc encyclopedia of genes and genomes. ePGDBs and the pathway-centric approach are validated to provide known and novel insights into community structure and function. Finally, novel taxonomic and metabolic methods supporting the pathway-centric model are derived and demonstrated, and enhance Pathway Tools as a framework for engineering microbial communities and consortia.
Evolution of multicellular eukaryotes is intimately associated with microbial interactions resulting in diversification and niche expansion. This long history of co-evolution is evident in metabolic interdependence, and reliance of animal (i.e. metazoan) ecosystems on their microbiota for healthy development and function. Specific recognition between interacting partners is essential for establishing and successfully maintaining interspecies associations, and involves host immunity and symbiont-encoded factors. Sponges represent the most deeply branching animal phylum with the potential to shed new light on the evolution of innate immunity and host-microbe interactions within the metazoa. Marine sponges harbour diverse microbial communities that contribute to higher order ecosystem functions including primary production and nutrient cycling. However, molecular mechanisms mediating symbiont recognition and host immune signalling in sponge symbioses are unknown. This knowledge gap stems from the fact that most sponge-associated microbes remain uncultivated and no sponge host/symbiont culture systems exist. In this thesis, I used cultivation-independent approaches including environmental genomics, transcriptomics, and proteomics in combination with homology modeling and community composition profiling to identify molecular determinants of sponge symbiosis in the sponge Dragmacidon mexicanum. Community composition profiling indicated that D. mexicanum is a high microbial abundance sponge harbouring a specific microbial community dominated by the Thaumarchaeote Cenarchaeum symbiosum. Comparative genomics and gene expression profiling identified potential symbiont-encoded proteins including serine protease inhibitors (serpins) with the potential to mediate host-microbe interactions that were not found in closely related free-living Thaumarchaeota, consistent with C. symbiosum’s adaptation to a symbiotic lifestyle.
Biochemical assays were subsequently used to characterize serpin activity and infer function. Immunity determinants previously unreported in sponges were identified, enabling near-complete reconstruction of innate immune signalling pathways and partial adaptive immunity pathways. Thus, this work expands the known complexity of sponge immune signalling and suggests a more ancient origin of certain pathways than previously recognized. The composition of sponge innate immunity may reflect the complex nature of sponge-associated microbiota, which likely acquired adaptive features to thrive in the host milieu. Taken together, this thesis provides novel insights into the evolution of host-microbial recognition, archaeal adaptations to a symbiotic lifestyle, and molecular interactions between archaea and eukaryotic cells.
Oxygen minimum zones (OMZs) are intrinsic water column features that arise when the respiratory oxygen (O2) demand during microbial remineralization of organic matter exceeds O2 supply rates in poorly ventilated regions of the ocean. Microbial processes play a key role in mediating biogeochemical cycling of nutrients and radiatively active trace gases in OMZs. Specific roles of individual microbial groups and the ecological interactions among groups that drive OMZ biogeochemistry on a global scale, however, remain poorly constrained. This dissertation focuses on describing microbial community structure in the world’s largest and least studied OMZ, located in the Northeast subarctic Pacific Ocean (NESAP), with a specific emphasis on characterizing the ecology of Marine Group A (MGA), an uncultivated candidate phylum of bacteria found to be prevalent in this region. To begin, I performed a survey of microbial community structure in the NESAP at two time points and over a range of depths based on traditional ecological analyses. I applied techniques derived from network theory to identify co-occurrence patterns among microbial groups within the NESAP and determined that MGA bacteria most frequently co-occurred with other MGA bacteria, suggesting that intra-phylum interactions may play a role in governing microbial processes in this region. Through analysis of small subunit ribosomal RNA (SSU rRNA) gene sequences affiliated with MGA, I identified 8 novel subgroups and established the phylogeny and population structure of both novel and previously detected MGA subgroups. Finally, I provided first insights into the metabolic capacity of this little-known candidate phylum through investigations of metagenomic data obtained from NESAP waters. Analysis of large-insert genomic DNA fragments derived from MGA revealed protein-coding genes associated with adaptation to oxygen deficiency and sulfur-based energy metabolism.
These observations may implicate MGA bacteria in the cryptic sulfur cycle, recently discovered to play a central role in biogeochemical cycling within OMZs. This work describes the first survey of microbial community structure in the NESAP OMZ and the first application of co-occurrence networks to study the ecology of deep ocean microbial communities, in addition to the first analysis of the diversity, population structure, and metabolic capacity of the enigmatic bacterial lineage MGA.
Master's Student Supervision
Theses completed in 2010 or later are listed below. Please note that there is a 6-12 month delay to add the latest theses.
Global plastic production has increased exponentially since the 1950s, ushering in an era of cheap synthetic polymers that have revolutionized human manufacturing and promoted socioeconomic development. However, the benefits of this plastics revolution are contrasted with human and environmental health problems resulting from plastic disposal. Not only has plastic waste contributed to burgeoning accumulation in landfills and the ocean, but additives and leachates from plastic polymers have been linked to negative health effects including endocrine disruption and cancer. Less well understood is the potential impact of microplastic particles and fibers (MPs) between 0.3 mm and 5 mm on human and environmental health. MPs frequently enter coastal marine waters through wastewater treatment where they are rapidly colonized by microbes. The drivers of community assembly on MPs remain unconstrained, with implications for carbon cycling, antimicrobial resistance and mobilization of additives and leachates through marine food webs. Here, I investigated microbial community assembly on marine plastic fibers and the impact of MP pollution on planktonic microbial community composition. A time-resolved mesocosm experiment using different concentrations of MP fibers indicated that increased fiber concentrations did not have a significant effect on planktonic microbial community composition or chemical concentrations but did induce finer-scale taxonomic changes. A related in-situ textile degradation experiment indicated rapid microbial colonization coalescing into relatively stable community structures after one month. Plastic-attached communities varied significantly by polymer type but not by chemical additive or color. These results demonstrate that marine microbial community assembly varies by plastic type, but MP pollution may not significantly affect the surrounding planktonic microbial community.
The full abstract for this thesis is available in the body of the thesis, and will be available when the embargo expires.
The manipulation of gene activity in cyanobacteria offers the possibility of producing energy and materials directly from sunlight, water and carbon dioxide, contributing to food production, innovative bioproducts and reliable bioenergy solutions that reduce human carbon emissions. However, current applications of cyanobacteria for these purposes remain in early stages of development. A limited capacity to study and control genetic information in many strains is a barrier to both optimizing metabolic flux through central metabolic pathways and programming these pathways for production of user-defined products at industrial scales. Additionally, even in strains that have successfully been engineered for production of various commodity compounds, such as Synechocystis and Synechococcus, limitations in scalability and knowledge of carbon processing pathways continue to stymie industrial applications. This thesis is focused on genome sequencing of an industrial cyanobacterial strain called AB48, a strain optimized to grow as a biofilm using modular photobioreactors developed by AlgaBloom International Ltd, a local biotechnology company. These biofilm-based photobioreactors employ proprietary substrate-based growth paradigms that permit minimal water and energy consumption, while maximizing bioreactor productivity. The AB48 genome was initially sequenced directly from photobioreactor biomass on the PacBio platform at the DOE Joint Genome Institute. In the process of assembling and annotating the genome, additional metagenome-assembled genomes (MAGs) associated with co-occurring microorganisms and mobile genetic elements (MGEs) including plasmids and phage were identified. An AB48 strain was later isolated and sequenced on the Illumina HiSeq and Oxford Nanopore MinION platforms.
After hybrid assembly, a complete closed reference genome and plasmid sequence were resolved, enabling formal classification of AB48 as a new species within the genus Phormidium called Phormidium yuhuli AB48. Encouraged by these results, efforts were made to reassemble and analyze raw sequencing data from other cyanobacterial genome sequencing projects with an eye toward identifying co-occurring microorganisms and MGEs. Typically, sequence information associated with co-cultured bacteria is not reported in published cyanobacterial genome reports despite the potential for uncovering known or novel interactions. The resulting MAGs for cyanobacteria, co-occurring microorganisms and MGEs provide a lineage-resolved resource of biological parts and putative metabolic interactions for sustainable bioproduction and biotechnology innovation.
As biofuel production increases, so too has the likelihood of accidental spills into the environment with important implications for human health and ecosystem functions. Such impacts can be evaluated at the microbial level as microorganisms are fundamental units of metabolism integral to the conversion of hydrocarbon substrates in the environment. Using the small subunit ribosomal (SSU or 16S) rRNA gene, we evaluate microbial community responses to ethanol and methanol blended biofuel contamination in terrestrial environments, through laboratory and field experiments, by assessing both community abundance and potential activity. For ethanol-based biofuel contamination, we observe differential patterns of enrichment in microbial taxa with traits relating to hydrocarbon degradation and fermentation across both field and laboratory experiments as well as changes in the abundance and potential activity of canonical methanogens. Observed differences highlight the role of the physical environment and the availability of organic matter in shaping microbial response patterns. Similar results were obtained for the methanol laboratory experiments with respect to hydrocarbon-degrading and fermentative taxa. However, a stronger methanogen response was observed, suggesting that methanol blended fuels are more readily converted to methane under low organic loading conditions. Together, this work provides an initial assessment of the impact blended biofuels have on microbial community structure by identifying microbial taxa most responsive to contamination, with implications for the development of remediation and risk assessment strategies.
Bacteria and Archaea represent the invisible majority of living things on Earth with an estimated numerical abundance exceeding 10^30 cells. This estimate surpasses the number of grains of sand on Earth and stars in the known universe. Interdependent microbial communities drive fluxes of matter and energy underlying biogeochemical processes, and provide essential ecosystem functions and services that help create the operating conditions for life. Despite their abundance and functional imperative, the vast majority of microorganisms remain uncultivated in laboratory settings, and therefore remain extremely difficult to study. Recent advances in high-throughput sequencing are opening a multi-omic (DNA and RNA) window to the structure and function of microbial communities, providing new insights into coupled biogeochemical cycling and the metabolic problem-solving power of otherwise uncultivated microbial dark matter (MDM). These technological advances have created bottlenecks with respect to information processing, and innovative bioinformatics solutions are required to analyze immense biological data sets. This is particularly apparent when dealing with metagenome assembly, population genome binning, and network analysis. This work investigates combined use of single-cell amplified genomes (SAGs) and metagenomes to more precisely construct population genome bins and evaluates the use of covariance matrix regularization methods to identify putative metabolic interdependencies at the population and community levels of organization. Applying dimensional reduction with principal components and a Gaussian mixture model to k-mer statistics from SAGs and metagenomes is shown to bin more precisely, and has been implemented as a novel pipeline, SAG Extrapolator (SAGEX). Also, correlation networks derived from small subunit ribosomal RNA gene sequences are shown to be more precisely inferred through regularization with factor analysis models applied via Gaussian copula.
SAGEX and regularized correlation are applied toward 368 SAGs and 91 metagenomes, postulating populations’ metabolic capabilities via binning, and constraining interpretations via correlation. The application describes coupled biogeochemical cycling in low-oxygen waters. Use of SAGEX leverages SAGs’ deep taxonomic descriptions and metagenomes’ breadth, produces precise population genome bins, and enables metabolic reconstruction and analysis of population dynamics over time. Regularizing correlation networks overcomes a known analytic bottleneck based in precision limitations.
Emerging lines of evidence indicate that microbes form distributed networks of metabolite exchange based in part on public goods. These networks have the potential to drive the evolution of microbial lineages, and contribute to essential functions and services in natural and engineered ecosystems. However, experimental systems in which to evolve and perturb public good dynamics remain poorly constrained. Here, a functional metagenomic screening approach was used to recover abundant biosynthetic gene clusters with the potential to mediate microbial interactions in the environment. Specifically, 29 gene clusters involved in the production of riboflavin sourced from diverse microbial donor genotypes were recovered by functional screening from two fosmid libraries constructed from methanogenic communities enriched on hydrocarbons. Active clones were sequenced and riboflavin-encoding gene cassettes were verified using cluster subcloning. Focusing on observed relationships between mobile genetic elements, metabolite secretion patterns and gene frequency distributions, a role for riboflavin as a public good in hydrocarbon-enriched environments was posited. Overall, this work suggests that secreted riboflavin may have versatile and unrecognized roles in microbial hydrocarbon transformation with potential to modulate microbial community dynamics in hydrocarbon resource environments.
Cultivation-independent microbial ecology research relies on high-throughput sequencing technologies and analytical methods to resolve the infinite diversity of microbial life on Earth. Microorganisms live in communities driven by genetic and metabolic processes as well as symbiotic relationships. Interconnected communities of microorganisms provide essential functions in natural and human-engineered ecosystems. Modelling the community as an interconnected system can give insight into the community's functional characteristics related to the biogeochemical processes it performs. Network science resolves associations between elements of structure to notions of function in a system and has been successfully applied to the study of microbial communities and other complex biological systems. Microbial co-occurrence networks are inferred from community composition data to resolve structural patterns related to ecological properties such as community resilience to disturbance and keystone species. However, the interpretation of global and local network properties from an ecological standpoint remains difficult due to the complexity of these systems, creating a need for quantitative analytical methods and visualization techniques for co-occurrence networks. This thesis tackles the visualization and analytical challenges of modelling microbial community structure from a network science approach. First, Hive Panel Explorer, an interactive visualization tool, is developed to permit data-driven exploration of topological and data association patterns in complex systems. The effectiveness of Hive Panel Explorer is validated by resolving known and novel patterns in a model biological network, the C. elegans connectome. Second, network structural robustness analysis methods are applied to study microbial communities from timber-harvested forest soils from a North American long-term soil productivity study.
Analyzing these geographically dispersed soils reveals biogeographic patterns of diversity and enables the discovery of conserved organizing principles shaping microbial community structure. The capacity of robustness analysis to identify key microbial community members as well as model shifts in community structure due to environmental change is demonstrated. Finally, this work provides insight into the relationship between microbes and their ecosystem, and characterizing this relationship can help us understand the organization of microbial communities, survey microbial diversity and harness its potential.
Biological methane (CH₄) production, or methanogenesis, plays a crucial role in the global carbon cycle. Biologically generated CH₄ can be emitted into the atmosphere, where it acts as a greenhouse gas twenty-five times more potent than carbon dioxide (CO₂), stored as “methane ice” (clathrates or hydrates) in marine sediments along continental margins, or oxidized under aerobic or anaerobic conditions by microbial agents, effectively limiting atmospheric flux. Methanogenesis is orchestrated by a group of obligate anaerobic archaea within the phylum Euryarchaeota, known as the methanogens, that produce CH₄ as the end product of their energy metabolism. Three methanogenic pathways have been described: the hydrogenotrophic, methylotrophic and aceticlastic pathways. Although differing in their use of electron donors, all three pathways converge on a terminal step catalyzed by the heterohexameric methyl-coenzyme M reductase (MCR). Over the last decade, cultivation-independent studies have identified anaerobic methane-oxidizing archaea (ANME-1, 2 and 3) related to methanogens that appear to run one or more canonical methanogenic pathways in reverse, including the terminal step catalyzed by MCR. The three genes encoding MCR subunits, mcrA, mcrB and mcrG, possess phylogenetic resolution similar to that of the small subunit ribosomal RNA gene, making them useful functional markers for detection and differentiation of methanogenic and methane-oxidation pathways in natural and human-engineered ecosystems. Here I introduce an automated and culture-independent method for monitoring the taxonomic structure and genomic potential of methane-cycling environments that leverages and improves upon an existing software package called MLTreeMap. MLTreeMap is a user-extensible software framework that automates maximum likelihood analysis to recover phylogenetic or functional marker genes from environmental sequence data.
I first describe the taxonomic structure and pathway representation of methane-cycling environments on a global scale based on the identification of mcrA alleles. I then chart both the metabolic potential and gene expression of marine sediments supporting the anaerobic oxidation of CH₄ using a series of reference trees representing near-complete methane-cycling pathways.
- Expanding the phylogenetic distribution of cytochrome b-containing methanogenic archaea sheds light on the evolution of methanogenesis (2022)
The ISME Journal
- Fractional factorial experimental design for optimizing volatile fatty acids from anaerobic fermentation of municipal sludge: Microbial community and activity investigation (2022)
- An integrated, modular approach to data science education in microbiology (2021)
PLOS Computational Biology
- Bacteroidetal cold-active and promiscuous esterases play a significant role in global polyethylene terephthalate (PET) degradation (2021)
- Ecology and molecular targets of hypermutation in the global microbiome (2021)
- Ecology of inorganic sulfur auxiliary metabolism in widespread bacteriophages (2021)
Nature Communications, 12 (1)
- Insights into the controls on metabolite distributions along a latitudinal transect of the western Atlantic Ocean (2021)
- Mercury methylation by metabolically versatile and cosmopolitan marine bacteria (2021)
The ISME Journal, 15 (6), 1810-1825
- Metabolic Pathway Prediction using Non-negative Matrix Factorization with Improved Precision (2021)
- Potential virus-mediated nitrogen cycling in oxygen-depleted oceanic waters (2021)
The ISME Journal, 15 (4), 981-998
- Prokaryotic responses to a warm temperature anomaly in northeast subarctic Pacific waters (2021)
- The abundance of mRNA transcripts of bacteroidetal polyethylene terephthalate (PET) esterase genes may indicate a role in marine plastic degradation (2021)
- An integrated, modular approach to data science education in the life sciences (2020)
- leADS: improved metabolic pathway inference based on active dataset subsampling (2020)
- Leveraging Heterogeneous Network Embedding for Metabolic Pathway Prediction (2020)
- Metabolic pathway inference using multi-label classification with rich pathway features (2020)
PLOS Computational Biology
- Metabolic pathway prediction using non-negative matrix factorization with improved precision (2020)
- Metabolite composition of sinking particles differs from surface suspended particles across a latitudinal transect in the South Atlantic (2020)
Limnology and Oceanography, 65 (1), 111-127
- Relabeling metabolic pathway data with groups to improve prediction outcomes (2020)
- An enzymatic pathway in the human gut microbiome that converts A to universal O type blood (2019)
Nature Microbiology, 4 (9), 1475-1485
- Seasonal and ecohydrological regulation of active microbial populations involved in DOC, CO2, and CH4 fluxes in temperate rainforest soil (2019)
The ISME Journal, 13 (4), 950-963
- Metagenomics reveals functional synergy and novel polysaccharide utilization loci in the Castor canadensis fecal microbiome (2018)
The ISME Journal, 12 (11), 2757-2769
- Diverse Marinimicrobia bacteria may mediate coupled biogeochemical cycles along eco-thermodynamic gradients (2017)
Nature Communications, 8 (1)
- Major role of nitrite-oxidizing bacteria in dark ocean carbon fixation (2017)
Science, 358 (6366), 1046-1051