Ife Adebara
Doctor of Philosophy in Linguistics (PhD)
Research Topic
Inclusive by Design: Natural Language Technology for Africa
Deep Learning. Deep learning of natural language. Natural Language Processing. Computational Linguistics. Natural Language Inference. Machine Translation. Misinformation. Detection of Negative and Abusive Content Online. Applications of deep learning in health and well-being.
Dissertations completed in 2010 or later are listed below. Please note that there is a 6-12 month delay to add the latest dissertations.
Deep learning-based models often achieve state-of-the-art performance in a wide range of natural language processing (NLP) tasks, which include open-ended tasks (e.g., story generation, brainstorming, and chat) and closed-ended tasks (e.g., summarization, question answering, and rewriting). To further enhance quality, there is a growing interest in scaling the model size and the amount of data used for training. These research efforts often overlook the impact of footprint metrics, such as high latency, high memory usage, and high energy consumption, on these deep learning models. A high footprint makes these models significantly inefficient for deployment on servers and devices such as tablets, handhelds, and wearables. Methods for improving model efficiency often come at the cost of degrading model quality.

In this dissertation, we address the central question: how can we push the envelope in improving the efficiency-quality tradeoff of deep learning models for on-device NLP tasks? To this end, we propose methods that take on-device efficiency constraints (e.g., ≤ 16 MB memory or ≤ 200 ms latency) to inform the design of the model architecture. We propose methods for the manual design of architectures for the auto-completion task (generating continuations for user-written prompts) that enjoy a better memory-accuracy tradeoff than existing auto-completion models (Chapter 2). Additionally, we introduce methods that can directly take efficiency constraints to automatically search for efficient sparsely activated architectures for machine translation tasks (Chapter 3) and efficient pretrained (task-agnostic) language modeling architectures (Chapter 4). Finally, in Chapter 5, we explore a novel use case of employing large language models to speed up architecture search, while maintaining the efficiency and quality of state-of-the-art neural architecture search algorithms.
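The constraint-driven idea described above can be illustrated with a minimal sketch: enumerate candidate model configurations, estimate their memory footprint and latency, and keep only those that satisfy the budgets before any quality evaluation. This is not the dissertation's actual search code; the model family, constraint values, and measurement method are illustrative assumptions.

```python
# Minimal sketch: filter candidate architectures by assumed on-device budgets
# (<= 16 MB parameters, <= 200 ms latency) before ranking survivors by quality.
import itertools
import time

import torch
import torch.nn as nn

MAX_MEMORY_MB = 16.0    # assumed memory budget
MAX_LATENCY_MS = 200.0  # assumed latency budget


def build_candidate(d_model, n_layers, n_heads, vocab_size=8000):
    """Tiny Transformer encoder LM used as a stand-in candidate architecture."""
    layer = nn.TransformerEncoderLayer(
        d_model, n_heads, dim_feedforward=2 * d_model, batch_first=True
    )
    return nn.Sequential(
        nn.Embedding(vocab_size, d_model),
        nn.TransformerEncoder(layer, n_layers),
        nn.Linear(d_model, vocab_size),
    )


def memory_mb(model):
    # fp32 parameter footprint only; activations and quantization are ignored here
    return sum(p.numel() for p in model.parameters()) * 4 / 1e6


def latency_ms(model, seq_len=32, trials=5):
    x = torch.randint(0, 8000, (1, seq_len))
    model.eval()
    with torch.no_grad():
        model(x)  # warm-up pass
        start = time.perf_counter()
        for _ in range(trials):
            model(x)
    return (time.perf_counter() - start) / trials * 1000


feasible = []
for d_model, n_layers, n_heads in itertools.product([128, 256], [2, 4, 6], [2, 4]):
    model = build_candidate(d_model, n_layers, n_heads)
    mem, lat = memory_mb(model), latency_ms(model)
    if mem <= MAX_MEMORY_MB and lat <= MAX_LATENCY_MS:
        feasible.append({"d_model": d_model, "layers": n_layers, "heads": n_heads,
                         "MB": round(mem, 1), "ms": round(lat, 1)})

# Feasible candidates would then be ranked by task quality (e.g., dev perplexity).
print(feasible)
```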
This dissertation centers on Natural Language Processing (NLP) for African languages, endeavoring to unravel the progress, challenges, and future prospects within this linguistic context. The research encompasses language identification, Natural Language Understanding (NLU), and Natural Language Generation (NLG), and culminates in a comprehensive case study on machine translation.

The first chapter introduces the problem statement, articulates the motivation for addressing the issue, and presents the innovative solutions developed throughout this research. Chapter two discusses intricate details of African languages, offering insights into their genealogical classification, the linguistic landscape, and the challenges of multilingual NLP. Building upon this foundation, the third chapter advocates for an Afrocentric approach to technology development, emphasizing the significance of aligning technology with the cultural values and linguistic diversity of African communities. It addresses challenges such as data scarcity and representation bias, spotlighting community-driven initiatives aimed at advancing NLP in the region.

The fourth chapter unveils AfroLID, a neural language identification tool designed for 517 African languages and language varieties, establishing itself as the new state-of-the-art solution for African language identification.

Chapter five introduces SERENGETI, a massively multilingual language model tailored to support 517 African languages and language varieties. Evaluation on AfroNLU, an extensive benchmark for African NLP, showcases SERENGETI's superior performance, thereby paving the way for transformative research and development across a diverse linguistic landscape.

The sixth chapter addresses NLG challenges in African languages, presenting Cheetah, a language model designed for 517 African languages. Comprehensive evaluations underscore Cheetah's capacity to generate contextually relevant text across various African languages.

The seventh chapter presents a case study on machine translation, focusing on Bare Nouns (BNs) translation from Yorùbá to English. This study illuminates the challenges posed by information asymmetry in machine translation and provides insights into the linguistic capabilities of Statistical Machine Translation (SMT) and Neural Machine Translation (NMT) systems. Emphasizing the importance of fine-grained linguistic considerations, the study encourages further research in addressing translation challenges faced by languages with BNs, analytic languages, and low-resource languages.

In chapter eight, I conclude and discuss possible directions for future work.
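For a sense of how a neural language-identification tool such as AfroLID is typically consumed downstream, the sketch below runs a text-classification pipeline over short utterances. The model identifier is a placeholder, not a verified release; consult the AfroLID documentation for the actual distribution and API.

```python
# Hedged illustration of applying a neural language-ID model to African-language text.
# The checkpoint path below is hypothetical and used only for illustration.
from transformers import pipeline

lid = pipeline("text-classification", model="path/to/african-lid-checkpoint")

samples = [
    "Mo fẹ́ràn èdè Yorùbá",          # Yorùbá
    "Ninapenda lugha ya Kiswahili",   # Swahili
]
for text in samples:
    prediction = lid(text)[0]
    print(text, "->", prediction["label"], round(prediction["score"], 3))
```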
Theses completed in 2010 or later are listed below. Please note that there is a 6-12 month delay to add the latest theses.
Contrastive learning (CL) has recently emerged as an effective technique in natural language processing, especially in the important area of language modeling. In this work, we offer novel methods for deploying CL in both the pretraining and finetuning of language models. First, we present PACT (Pretraining with Adversarial Contrastive Learning for Text Classification), a novel self-supervised framework for text classification. Instead of contrasting against in-batch negatives, a popular approach in the literature, PACT mines negatives closer to the anchor representation. PACT operates by endowing the standard pretraining mechanisms of BERT with adversarial contrastive learning objectives, allowing for effective joint optimization of token- and sentence-level pretraining of the BERT model. Our experiments on 13 diverse datasets, including token-level, single-sentence, and sentence-pair text classification tasks, show that PACT achieves consistent improvements over state-of-the-art (SOTA) baselines. We further show that PACT regularizes both token-level and sentence-level embedding spaces into more uniform representations, thereby alleviating the undesirable anisotropy of language model embeddings.

Subsequently, in the context of finetuning, we apply CL to cross-platform abusive language detection. The prevalence of abusive language across online platforms has been a major concern and raises the need for automated cross-platform abusive language detection. However, prior work focuses on concatenating data from multiple platforms, inherently adopting the Empirical Risk Minimization (ERM) approach. In our work, we address this challenge from a domain generalization perspective. We design SCL-Fish, a supervised contrastive learning-integrated meta-learning algorithm to detect abusive language on unseen platforms. Our experimental analysis shows that SCL-Fish outperforms ERM and existing state-of-the-art models. We also show that SCL-Fish is data-efficient and, when finetuned for the abusive language detection task, achieves performance comparable to large-scale pretrained models.
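As background for the supervised contrastive component mentioned above, the sketch below implements a generic supervised contrastive objective in the style of Khosla et al.: same-label examples are pulled together and others pushed apart in a normalized embedding space. It is not the thesis' exact PACT or SCL-Fish implementation; the temperature and toy data are assumptions.

```python
# Minimal supervised contrastive loss sketch (not the thesis' exact objective).
import torch
import torch.nn.functional as F


def supervised_contrastive_loss(embeddings, labels, temperature=0.1):
    """embeddings: (batch, dim) sentence representations; labels: (batch,) ints."""
    z = F.normalize(embeddings, dim=1)                     # unit-norm representations
    sim = z @ z.t() / temperature                          # pairwise similarities
    self_mask = torch.eye(len(z), dtype=torch.bool, device=z.device)
    sim = sim.masked_fill(self_mask, float("-inf"))        # exclude self-pairs
    # positives: pairs sharing a label (self excluded)
    pos_mask = (labels.unsqueeze(0) == labels.unsqueeze(1)) & ~self_mask
    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)
    pos_log_prob = log_prob.masked_fill(~pos_mask, 0.0)    # keep only positive terms
    pos_counts = pos_mask.sum(dim=1).clamp(min=1)
    loss = -pos_log_prob.sum(dim=1) / pos_counts           # mean log-likelihood of positives
    return loss[pos_mask.any(dim=1)].mean()                # anchors with >= 1 positive


# Toy usage: 6 random sentence embeddings, 2 classes.
emb = torch.randn(6, 128, requires_grad=True)
lbl = torch.tensor([0, 0, 1, 1, 0, 1])
print(supervised_contrastive_loss(emb, lbl))
```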
Arabic dialect identification (ADI) is an important component of the Arabic speech processing pipeline, and in particular of dialectal Arabic automatic speech recognition (ASR) models. In this work, we present an overview of corpora and methods applicable to both ADI and dialectal Arabic ASR, and then benchmark two approaches to using pre-trained speech representation models for ADI. Namely, we first employ direct fine-tuning, and then use fixed representations extracted from pre-trained models as an intermediate step in the ADI process. We train and evaluate our models on the fine-grained ADI-17 Arabic dialect corpus (92% F1 for our fine-tuned HuBERT model), and further probe generalization by evaluating the trained models on the coarse-grained ADI-5 corpus (80% F1 for fine-tuned HuBERT).
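The "fixed representations" route above can be sketched roughly as follows: embed each utterance with a pretrained HuBERT encoder, mean-pool over time, and train a lightweight classifier on the pooled vectors. The checkpoint, pooling, classifier, and placeholder data are illustrative assumptions, not the thesis' exact ADI-17 setup.

```python
# Rough sketch: fixed HuBERT utterance embeddings + a simple dialect classifier.
import torch
from transformers import Wav2Vec2FeatureExtractor, HubertModel
from sklearn.linear_model import LogisticRegression

extractor = Wav2Vec2FeatureExtractor.from_pretrained("facebook/hubert-base-ls960")
encoder = HubertModel.from_pretrained("facebook/hubert-base-ls960").eval()


def embed(waveform_16khz):
    """waveform_16khz: 1-D float array of a 16 kHz mono utterance."""
    inputs = extractor(waveform_16khz, sampling_rate=16_000, return_tensors="pt")
    with torch.no_grad():
        hidden = encoder(**inputs).last_hidden_state   # (1, frames, 768)
    return hidden.mean(dim=1).squeeze(0).numpy()        # fixed utterance vector


# Placeholder data: real use would load ADI-17 utterances with dialect labels.
train_waves = [torch.randn(16_000).numpy() for _ in range(8)]
train_labels = ["EGY", "EGY", "LEV", "LEV", "GLF", "GLF", "MSA", "MSA"]

X = [embed(w) for w in train_waves]
clf = LogisticRegression(max_iter=1000).fit(X, train_labels)
print(clf.predict([embed(torch.randn(16_000).numpy())]))
```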
Natural language processing (NLP) has pervasive applications in everyday life and has recently witnessed rapid progress. Incorporating latent variables into NLP systems can allow for explicit representations of certain types of information. In neural machine translation systems, for example, latent variables have the potential to enhance semantic representations, which could help improve general translation quality. Previous work has focused on variational inference with diagonal-covariance Gaussian distributions, which we hypothesize cannot sufficiently encode latent factors of language that may exhibit multi-modal distributional behavior. Normalizing flows enable more flexible posterior distribution estimates by introducing a change of variables with invertible functions; they have previously been used successfully in computer vision to enable more flexible posterior distributions over image data. In this work, we investigate the impact of normalizing flows in autoregressive neural machine translation systems. We do so in the context of two currently successful approaches: attention mechanisms and language models. Our results suggest that normalizing flows can improve translation quality in some scenarios, but require certain modelling assumptions to achieve such improvements.
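The change-of-variables idea behind normalizing flows can be shown with a small planar-flow sketch (Rezende & Mohamed, 2015): an invertible transformation is applied to samples from a diagonal Gaussian, and the log-determinant of its Jacobian corrects the density. This is a generic illustration, independent of the specific NMT architectures studied in the thesis; dimensions and initialization are assumptions, and the invertibility constraint on the parameters is not enforced.

```python
# Planar normalizing flow sketch: z' = z + u * tanh(w·z + b) with tractable log-det-Jacobian.
import torch
import torch.nn as nn


class PlanarFlow(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.u = nn.Parameter(torch.randn(dim) * 0.1)
        self.w = nn.Parameter(torch.randn(dim) * 0.1)
        self.b = nn.Parameter(torch.zeros(1))

    def forward(self, z):
        lin = z @ self.w + self.b                          # (batch,)
        f_z = z + self.u * torch.tanh(lin).unsqueeze(-1)   # transformed samples
        # |det df/dz| = |1 + u·psi| with psi = tanh'(lin) * w
        psi = (1 - torch.tanh(lin) ** 2).unsqueeze(-1) * self.w
        log_det = torch.log(torch.abs(1 + psi @ self.u) + 1e-8)
        return f_z, log_det


# Transform samples from a diagonal Gaussian "posterior" and track the density correction.
base = torch.distributions.Normal(torch.zeros(4), torch.ones(4))
z0 = base.sample((16,))                                    # (16, 4)
flow = PlanarFlow(dim=4)
z1, log_det = flow(z0)
log_q_z1 = base.log_prob(z0).sum(-1) - log_det             # log density of transformed samples
print(z1.shape, log_q_z1.shape)
```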