Steven J Jones


Relevant Degree Programs



Master's students
Doctoral students
Postdoctoral Fellows
Any time / year round

Bioinformatics Cancer Genomics

Complete these steps before you reach out to a faculty member!

Check requirements
  • Familiarize yourself with program requirements. You want to learn as much as possible from the information available to you before you reach out to a faculty member. Be sure to visit the graduate degree program listing and program-specific websites.
  • Check whether the program requires you to seek commitment from a supervisor prior to submitting an application. For some programs this is an essential step while others match successful applicants with faculty members within the first year of study. This is either indicated in the program profile under "Admission Information & Requirements" - "Prepare Application" - "Supervision" or on the program website.
Focus your search
  • Identify specific faculty members who are conducting research in your specific area of interest.
  • Establish that your research interests align with the faculty member’s research interests.
    • Read up on the faculty members in the program and the research being conducted in the department.
    • Familiarize yourself with their work, read their recent publications and past theses/dissertations that they supervised. Be certain that their research is indeed what you are hoping to study.
Make a good impression
  • Compose an error-free and grammatically correct email addressed to your specifically targeted faculty member, and remember to use their correct titles.
    • Do not send non-specific, mass emails to everyone in the department hoping for a match.
    • Address the faculty members by name. Your contact should be genuine rather than generic.
  • Include a brief outline of your academic background, why you are interested in working with the faculty member, and what experience you could bring to the department. The supervision enquiry form guides you with targeted questions. Be sure to craft compelling answers to these questions.
  • Highlight your achievements and why you are a top student. Faculty members receive dozens of requests from prospective students and you may have less than 30 seconds to pique someone’s interest.
  • Demonstrate that you are familiar with their research:
    • Convey the specific ways you are a good fit for the program.
    • Convey the specific ways the program/lab/faculty member is a good fit for the research you are interested in/already conducting.
  • Be enthusiastic, but don’t overdo it.
Attend an information session

G+PS regularly provides virtual sessions that focus on admission requirements and procedures and tips on how to improve your application.


Graduate Student Supervision

Doctoral Student Supervision (Jan 2008 - Nov 2020)
The clinical actionability and evolution of mutational processes in metastatic cancer (2020)

Cancers are characterized by somatic mutation arising from the interplay of mutagen exposure and deficient DNA repair. Whole genome sequencing of tumours reveals characteristic patterns of mutation, known as mutation signatures, which often correspond with specific processes such as cigarette smoke exposure or the loss of a DNA repair pathway. Quantifying DNA repair deficiency can have clinical implications. Cancer chemotherapies which induce DNA damage are known to be more effective against cancers with deficient DNA repair. However, it is not yet known whether mutation signatures can serve as reliable predictive biomarkers for response to these treatments. Furthermore, the current understanding of mutation signatures stems largely from studies of primary, untreated tumours, whereas metastasis underpins as much as 90% of cancer-related mortality. This thesis aims to (1) describe the association between mutation signatures and clinical response to DNA damaging chemotherapy, (2) enable accurate personalized assessment of mutation signatures and their evolution over time, and (3) characterize the evolution of mutational processes in metastatic cancers. To assess clinical actionability, we quantified signatures of single nucleotide variants, structural variants, copy number variants, and small deletions in 93 metastatic breast cancers, 33 of which received platinum-based chemotherapy. We found that patients with signatures of homologous recombination deficiency had improved responses and prolonged treatment durations on platinum-based chemotherapy. Next, we formulated a Bayesian model called SignIT, which improves the accuracy of individualized mutation signature analysis and infers signature evolution over tumour subpopulations. We demonstrated SignIT’s superior accuracy on both simulated data and somatic mutations from The Cancer Genome Atlas, and validated temporal dissection using whole genomes from 24 multiply-sequenced cancers. 
We highlighted a potential clinical application of mutation in a BRCA1-mutated pancreatic adenocarcinoma with low Homologous Recombination Deficiency (HRD) signature but exceptional response to platinum-containing chemotherapy. Finally, we deciphered mutation signatures from nearly 500 metastatic cancer whole genomes, revealing evolution of mutational processes associated with late metastasis and exposure to cytotoxic chemotherapy. Taken together, our findings demonstrate the complex interplay of factors shaping the metastatic cancer genome. We highlight both clinical opportunities of studying genomic instability and the additional insights available from understanding their temporal evolution.

Utility of machine learning approaches for cancer diagnosis and analysis from RNA sequencing (2020)

The highest number of cancer-associated deaths are attributable to metastasis. These include rare cancer types that lack established treatment guidelines, or cancers that become resistant to established lines of therapy. Precision oncology projects aim to develop treatment options for these patients by obtaining a detailed molecular view of the cancer. Scientists use sequencing data like whole-genome sequencing and RNA-sequencing to understand the biology of the cancer. A significant challenge in this process is diagnosing the cancer type of the sample since the observed measurements are best understood with this context. Routine histopathology relies on tissue morphology and can fail to provide a determinative diagnosis when the cancer metastasizes, presents biology attributable to multiple different cancer types, or presents as a rare cancer type. Molecular data has revealed differences in the genetic makeup of cancers that appear morphologically similar, motivating the use of molecular diagnostics. Nevertheless, no existing tools utilize the output from these sequencing modalities in its entirety (that is, without feature selection). There is also limited work evaluating the utility of pan-cancer molecular diagnostics in a precision oncology trial. In this work we review an ongoing precision oncology trial and identify the impact of sequencing-based approaches on cancer diagnosis. We develop SCOPE, a machine-learning method that uses RNA-Seq profiles of tumours for automated cancer diagnosis. We show that this method, which uses over 17,688 gene measurements as input, has better classification accuracy than when using statistically prioritized marker genes, can deconvolve cancer-types with mixed histology, and has high performance in metastatic cancers and cancers of unknown origin. In precision oncology, manual analysis of the tumour's genomic profile is used to understand tumour biology and driver pathways. 
We find that by assessing the classifier's dependence on gene subsets, we can automatically calculate the importance of various biological programs in individual tumours. Pathways prioritized through this tool - called PIE - show a high overlap with manual integrative analysis performed by expert bioinformaticians to identify clinically important genomic changes. Lastly, we demonstrate that PIE facilitates cohort-wide cancer analysis and discovery of novel sub-groups in advanced cancers.

Building and inferring knowledge bases using biomedical text mining (2019)

Biomedical researchers have the overwhelming task of keeping abreast of the latest research. This is especially true in the field of personalized cancer medicine where knowledge from different areas such as clinical trials, preclinical studies, and basic science research needs to be combined. We propose that automated text mining methods should become a commonplace tool for researchers to help them locate relevant research, assimilate it quickly and collate for hypothesis generation. To move towards this goal, we focus on extracting relations from published abstracts and full-text papers. We first explore the use of co-occurrences in sentences and develop a method for inferring new co-occurrences that can be used for hypothesis generation. We next explore more advanced relation extraction methods by developing a supervised learning method, VERSE, which won part of the BioNLP 2016 Shared Task. Our classical method outperforms a deep learning method showing its applicability to text mining problems with limited training data. We develop it further into the Kindred Python package which integrates with other biomedical text mining resources and is easily applied to other biomedical problems. Finally, we examine the applicability of these methods in personalized cancer research. The specific role of genes in different cancer types as drivers, oncogenes, and tumor suppressors is essential information when interpreting an individual cancer genome. We built CancerMine, a high-quality knowledgebase, using the Kindred classifier and annotations from a team of annotators. This allows for quantifiable comparisons of different cancer types based on the importance of different genes. The clinical relevance of cancer mutations is generally locked in the raw text of literature and was the focus of the CIViCmine project. As a collaboration with the Clinical Interpretation of Variants in Cancer (CIViC) project team, we built methods to prioritise relevant papers for curation. 
Through this work, we have focussed on different ways to extract structured knowledge from individual sentences in biomedical publications. The methods, guidelines, and results developed will aid biomedical text mining research and the personalized cancer treatment community.

Genomic Analysis of Head and Neck Endocrine Glands (2015)

Discovering biomarkers and molecular drivers of head and neck endocrine tumors was the inspiration for this thesis. Here, I describe the molecular evaluation of tumors of the thyroid and parathyroid endocrine glands for the purpose of identifying somatic driver alterations in these cancers. While molecular interplay of the germline genomic background of an individual and the somatic genome that emerges throughout the lifetime plays significant roles in increasing the susceptibility to cancer and in driving the malignant phenotype, the major known contributors to cancer remain the acquired somatic mutations. Analysis of a sporadic and recurring parathyroid carcinoma, with incidence of 1 per million population, revealed mutations in mTOR, MLL2, CDKN2C and PIK3CA and comparison of patient-matched primary and recurrent malignant tumors uncovered loss of PIK3CA activating mutation during the evolution of the tumor. Loss of the short arm of chromosome 1 along with somatic missense and truncating mutations in CDKN2C and THRAP3 provided new evidence for the potential role of these as tumor suppressors. Hürthle cell thyroid carcinoma accounts for a small proportion of all thyroid cancers; however, this malignancy often presents at an advanced stage and poses unique challenges. Genomic analysis revealed large regions of copy number variation encompassing nearly the entire genomes accompanied also by near haploidization. Moreover, I identified loss-of-function mutations of the tumor suppressor gene MEN1 in 4% of patients. Repeated alterations of the epigenetic machinery in anaplastic thyroid carcinoma, one of the most fatal of all adult solid malignancies, and novel gene fusions including MKRN1-BRAF, FGFR2-OGDH and SS18-SLC5A11 are reported here. 
The transcriptomic analysis suggested known drug targets such as FGFRs, VEGFRs, KIT and RET to have low expressions in this cancer; however, through integrative data analysis, I identified the mTOR signaling pathway as a potential therapeutic target for anaplastic thyroid cancer. Molecular analysis of papillary thyroid carcinoma and benign thyroid nodules revealed very low mutation rates in these tumors with CYP1B1, PTPRE, CTSH and RUNX1 emerging as promising diagnostic markers. The key somatic mutations identified in these studies can serve as novel diagnostic markers as well as therapeutic targets.

Algorithms and applications of next-generational DNA sequencing (2012)

Next Generation Sequencing (NGS) technologies enable Deoxyribonucleic Acid (DNA) or Ribonucleic Acid (RNA) sequencing to be done at volumes and speeds several orders of magnitude faster than Sanger (dideoxy termination) based methods and have enabled the development of novel experiment types that would not have been practical before the advent of the NGS-based machines. The dramatically increased throughput of these new protocols requires significant changes to the algorithms used to process and analyze the results. In this thesis, I present novel algorithms used for Chromatin Immunoprecipitation and Sequencing (ChIP-Seq) as well as the structures required and challenges faced for working with Single Nucleotide Variations (SNVs) across a large collection of samples, and finally, I present the results obtained when performing an NGS based analysis of eight mammary ductal carcinoma cell lines and four matched normal cell lines.

Bioinformatic approaches to drug repositioning (2012)

Repositioning existing drugs for new therapeutic uses is an efficient approach to drug discovery. However, most successful repositioning cases to date have been serendipitous; the goal of my thesis was to use computational methods to rationally discover drug repositioning candidates. I first virtually screened (VS) 4621 drugs against 252 drug targets with molecular docking. This method emphasized removing potential false positives using stringent criteria from known interaction docking, consensus scores, and rank information. Published literature indicated experimental evidence for 31 top predicted interactions, supporting the approach. The chemotherapeutic nilotinib was validated as a potent MAPK14 inhibitor in vitro (IC50 40 nM), suggesting a potential use in inflammatory diseases. I then applied this method to the cancer target EGFR, predicting the anti-HIV drug tenofovir disoproxil fumarate (TDF) as a novel inhibitor. In vitro, TDF inhibited the proliferation and EGFR-signaling of an EGFR-overexpressing cell line, but did not inhibit EGFR in direct kinase binding assays. This study highlighted limitations of computational and experimental methodologies that should be considered when interpreting or designing other studies. We then screened 1,120 off-patent drugs against the triple-negative breast cancer (TNBC) target p90RSK using both VS and high-throughput (HTS) methods. VS predicted a set of compounds 26-times enriched for known RSK inhibitors and 11 times enriched for HTS hits, underscoring its efficiency. In secondary screens, the chemotherapeutic ellipticine and the bioflavonoids luteolin and apigenin inhibited RSK activity (IC50 0.50-4.77 μM), blocked RSK signaling, and inhibited TNBC cell proliferation. These drugs thus have potential to be repositioned to TNBC. Finally, we rationally repositioned renal cell carcinoma drugs for a patient with a rare tongue adenocarcinoma. 
Whole genome and transcriptome sequencing of the patient’s tumor and normal cells detected sequence, copy number, and expression aberrations, and analysis suggested that the tumor was driven by the RET oncogene. Treatment with RET-inhibiting drugs stabilized the disease for eight months, after which the disease progressed. We also sequenced the post-treatment tumor and found changes consistent with acquired therapeutic resistance. Overall, this thesis details two novel high-throughput approaches for drug repositioning: virtual screening of drugs and targets and personalized medicine via sequencing.

De Novo Detection of Regulatory Elements in the Nematode Caenorhabditis elegans (2009)

No abstract available.

Master's Student Supervision (2010 - 2018)
Characterization of the human thyroid epigenome (2017)

The thyroid gland, necessary for normal human growth and development, is essential for the regulation of metabolism. Its function – to produce and secrete appropriate levels of thyroid hormone – is simple; however accurate assessment of thyroid abnormality is challenging and a fundamental understanding of the normal thyroid is therefore needed. One way to characterize the normal functioning of the thyroid gland is to study the epigenome and resulting transcriptome within its constituent cells. In this study, we compare the consistency of chromatin state annotations across the epigenomes from the grossly uninvolved tumour-adjacent thyroid tissue of four human individuals using ChIP-seq and RNA-seq. We profile four activating (H3K4me1, H3K4me3, H3K27ac, H3K36me3) and two repressing (H3K9me3, H3K27me3) histone modifications, identify chromatin states using a hidden Markov model, produce a novel metric for model selection, and establish epigenomic maps of 19 chromatin states. We found that epigenetic features characterizing promoters and transcription elongation tend to be more consistent across epigenomes and that epigenetically active genes consistent across all epigenomes tend to have higher expression than those that are not marked as epigenetically active in all samples. We also identified a set of 18 genes epigenetically active and consistently expressed in the thyroid that are likely relevant to thyroid function. Altogether, we believe the epigenomes presented in this work represent a useful resource to gain a deeper understanding of the underlying molecular biology of thyroid function and provide contextual information of thyroid and human epigenomic data for comparison and integration into future studies.

Latent Semantic Analysis for retrieving related biomedical articles (2017)

Retrieving relevant scientific papers in a scalable way is increasingly important, as more and more studies are published. PubMed’s relevant article recommendation is based on MeSH assignments by indexers, which requires significant human resources and can become a limitation in making papers searchable. Many recommendation systems use singular value decomposition (SVD) to pre-compute related products. In this study, we look at using latent semantic analysis (LSA), an application of SVD to determine relationships in a set of documents and terms, to find related biomedical papers. We focused on determining the best parameters for SVD in retrieving relevant biomedical articles given a paper of interest. Using PubMed's recommendations as guidance, we found that using cosine distance to measure document similarity leads to better results than using Euclidean distance. We re-evaluated other parameters, including the weighting scheme and the number of singular values and using a larger abstract corpus. Finally, we asked people to compare the relevant abstracts retrieved with our method against those retrieved by PubMed. Our method retrieved sensible articles that were chosen over PubMed's relevant papers one-third of the time. We looked into the abstracts retrieved by either method and discuss possible areas for experimentation and improvement.

Bioinformatics approach to investigate genetic differences underlying breast tumours with specific outcomes of adoptive T-cell therapy using a mouse model (2010)

The immune system plays a critical role in cancer prevention and development. The stimulation of natural immune reaction in a cancer patient by adoptive T-cell therapy has shown success in treating metastatic melanomas and renal cell carcinomas. However, the use of adoptive T-cell therapy remains limited due to unpredictable outcomes and low response rates. In particular, adoptive T-cell therapy for breast cancer has not been realized, despite the presence of immunogenic antigens such as over-expressed HER2, present in 20-40% of breast tumours. Using a unique transgenic mouse model, the global profiles of gene expression, miRNA abundance and single nucleotide variants (SNVs) were investigated to identify the molecular difference of murine mammary tumours with isogenic background, which exhibited complete regression (CR), partial regression (PR) or progressive disease (PD) outcome of adoptive T-cell therapy. The bioinformatics analyses were further carried out to identify uniquely activated pathways, prognostic gene expression signatures, the effect of post-transcriptional gene regulation and mutated genes unique to tumours with specific outcome. The largest differences in gene expression, miRNA and SNV profiles were repeatedly observed between the regressing (CR, PR) and non-regressing (PD) tumours, supporting the attribution of molecular differences to the immunotherapy outcome. In particular, the gene expression signatures derived from genes in immune-related pathways were experimentally validated to be strong prognostic markers for predicting the CR outcome. Comparison with the human breast cancer subtypes further revealed similarities of the non-regressing tumours with the basal subtype, and the regressing tumours with the HER2 subtype. The difference in miRNA profiles between CR and PR tumours suggested potential translational activities unique to PR, which was nearly identical to CR at the transcriptome level. 
The findings from this study show that tumour-derived factors that either promote or suppress the immune system are responsible for the varying outcome of immunotherapy, and that the molecular characteristics can be further applied for the development of clinical prognostic tools, cancer vaccines and drug targets to enhance the efficacy of adoptive T-cell therapy.


