Muhammad Abdul-Mageed

Associate Professor

Research Classification

Research Interests

Deep Learning
Natural Language Processing
Machine Learning
Computational Linguistics
Social Media Mining
Arabic

Relevant Thesis-Based Degree Programs

 
 

Research Methodology

Deep Learning
Natural Language Processing
Machine Learning
Social Media
Arabic

Recruitment

Master's students
Doctoral students
Postdoctoral Fellows
Any time / year round

Deep Learning. Deep learning of natural language. Natural Language Processing. Computational Linguistics. Natural Language Inference. Machine Translation. Misinformation. Detection of Negative and Abusive Content Online. Applications of deep learning in health and well-being.

I support public scholarship, e.g. through the Public Scholars Initiative, and am available to supervise students and Postdocs interested in collaborating with external partners as part of their research.
I support experiential learning experiences, such as internships and work placements, for my graduate students and Postdocs.
I am open to hosting Visiting International Research Students (non-degree, up to 12 months).

Complete these steps before you reach out to a faculty member!

Check requirements
  • Familiarize yourself with program requirements. You want to learn as much as possible from the information available to you before you reach out to a faculty member. Be sure to visit the graduate degree program listing and program-specific websites.
  • Check whether the program requires you to seek commitment from a supervisor prior to submitting an application. For some programs this is an essential step while others match successful applicants with faculty members within the first year of study. This is either indicated in the program profile under "Admission Information & Requirements" - "Prepare Application" - "Supervision" or on the program website.
Focus your search
  • Identify specific faculty members who are conducting research in your specific area of interest.
  • Establish that your research interests align with the faculty member’s research interests.
    • Read up on the faculty members in the program and the research being conducted in the department.
    • Familiarize yourself with their work, read their recent publications and past theses/dissertations that they supervised. Be certain that their research is indeed what you are hoping to study.
Make a good impression
  • Compose an error-free and grammatically correct email addressed to your specifically targeted faculty member, and remember to use their correct titles.
    • Do not send non-specific, mass emails to everyone in the department hoping for a match.
    • Address the faculty members by name. Your contact should be genuine rather than generic.
  • Include a brief outline of your academic background, why you are interested in working with the faculty member, and what experience you could bring to the department. The supervision enquiry form guides you with targeted questions; be sure to craft compelling answers to them.
  • Highlight your achievements and why you are a top student. Faculty members receive dozens of requests from prospective students and you may have less than 30 seconds to pique someone’s interest.
  • Demonstrate that you are familiar with their research:
    • Convey the specific ways you are a good fit for the program.
    • Convey the specific ways the program/lab/faculty member is a good fit for the research you are interested in/already conducting.
  • Be enthusiastic, but don’t overdo it.
Attend an information session

G+PS regularly provides virtual sessions that focus on admission requirements and procedures, as well as tips on how to improve your application.

 

Advice and Insights from UBC Faculty on Reaching Out to Supervisors

These videos contain some general advice from faculty across UBC on finding and reaching out to a potential thesis supervisor.

Graduate Student Supervision

Doctoral Student Supervision

Dissertations completed in 2010 or later are listed below. Please note that there is a 6-12 month delay to add the latest dissertations.

Methods for design of efficient on-device natural language processing architectures (2024)

Deep learning based models often achieve state-of-the-art performance in a wide range of natural language processing (NLP) tasks, which include open-ended tasks (e.g., story generation, brainstorming, and chat) and closed-ended tasks (e.g., summarization, question answering, and rewriting). To further enhance quality, there is a growing interest in scaling the model size and the amount of data used for training. These research efforts often overlook the impact of footprint metrics, such as high latency, high memory usage, and high energy consumption, on these deep learning models. A high footprint makes these models significantly inefficient for deployment on servers and devices such as tablets, handhelds, and wearables. Methods for improving model efficiency often come at the cost of degrading model quality.

In this dissertation, we address the central question: how can we push the envelope in improving the efficiency-quality tradeoff of deep learning models for on-device NLP tasks? To this end, we propose methods that take on-device efficiency constraints (e.g., ≤ 16 MB memory or ≤ 200 ms latency) to inform the design of the model architecture. We propose methods for the manual design of architectures for the auto-completion task (generate continuations for user-written prompts) that enjoy a better memory-accuracy tradeoff than existing auto-completion models (Chapter 2). Additionally, we introduce methods that can directly take efficiency constraints to automatically search for efficient sparsely activated architectures for machine translation tasks (Chapter 3) and efficient pretrained (task-agnostic) language modeling architectures (Chapter 4). Finally, in Chapter 5, we explore a novel use case of employing large language models to speed up architecture search, while maintaining the efficiency and quality of state-of-the-art neural architecture search algorithms.
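To make the constraint-driven design described above concrete, here is a minimal sketch, under illustrative assumptions, of how a candidate architecture's parameter memory might be estimated and screened against an on-device budget such as the ≤ 16 MB constraint mentioned in the abstract. The Transformer configuration, the fp16 (2 bytes per parameter) storage assumption, and the helper names are hypothetical and do not reproduce the dissertation's actual search procedure.

```python
import torch.nn as nn

def footprint_mb(model: nn.Module, bytes_per_param: int = 2) -> float:
    """Estimate parameter memory in MB, assuming fp16 storage (2 bytes per parameter)."""
    n_params = sum(p.numel() for p in model.parameters())
    return n_params * bytes_per_param / (1024 ** 2)

def fits_budget(model: nn.Module, budget_mb: float = 16.0) -> bool:
    """Screen a candidate architecture against an on-device memory budget."""
    return footprint_mb(model) <= budget_mb

# Hypothetical candidate: a small Transformer encoder whose width/depth would be search knobs.
candidate = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=256, nhead=4, dim_feedforward=512, batch_first=True),
    num_layers=4,
)
print(f"{footprint_mb(candidate):.1f} MB; fits 16 MB budget: {fits_budget(candidate)}")
```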


Towards Afrocentric natural language processing (2024)

This dissertation centers on Natural Language Processing (NLP) for African languages, endeavoring to unravel the progress, challenges, and future prospects within this linguistic context. The research encompasses language identification, Natural Language Understanding (NLU), and Natural Language Generation (NLG), and culminates in a comprehensive case study on machine translation.

The first chapter introduces the problem statement, articulates the motivation for addressing the issue, and presents the innovative solutions developed throughout this research. Chapter two discusses intricate details of African languages, offering insights into their genealogical classification, the linguistic landscape, and the challenges of multilingual NLP. Building upon this foundation, the third chapter advocates for an Afrocentric approach to technology development, emphasizing the significance of aligning technology with the cultural values and linguistic diversity of African communities. It addresses challenges such as data scarcity and representation bias, spotlighting community-driven initiatives aimed at advancing NLP in the region.

The fourth chapter unveils AfroLID, a neural language identification tool designed for 517 African languages and language varieties, establishing itself as the new state of the art for African language identification. Chapter five introduces SERENGETI, a massively multilingual language model tailored to support 517 African languages and language varieties. Evaluation on AfroNLU, an extensive benchmark for African NLP, showcases SERENGETI's superior performance, thereby paving the way for transformative research and development across a diverse linguistic landscape. The sixth chapter addresses NLG challenges in African languages, presenting Cheetah, a language model designed for 517 African languages. Comprehensive evaluations underscore Cheetah's capacity to generate contextually relevant text across various African languages.

The seventh chapter presents a case study on machine translation, focusing on the translation of Bare Nouns (BNs) from Yorùbá to English. This study illuminates the challenges posed by information asymmetry in machine translation and provides insights into the linguistic capabilities of Statistical Machine Translation (SMT) and Neural Machine Translation (NMT) systems. Emphasizing the importance of fine-grained linguistic considerations, the study encourages further research on the translation challenges faced by languages with BNs, analytic languages, and low-resource languages.

In chapter eight, I conclude and discuss possible directions for future work.


Master's Student Supervision

Theses completed in 2010 or later are listed below. Please note that there is a 6-12 month delay to add the latest theses.

Improving language models with novel contrastive learning objectives (2024)

Contrastive learning (CL) has recently emerged as an effective technique in natural language processing, especially in the important area of language modeling. In this work, we offer novel methods for deploying CL in both pretraining and finetuning of language models.

First, we present PACT (Pretraining with Adversarial Contrastive Learning for Text Classification), a novel self-supervised framework for text classification. Instead of contrasting against in-batch negatives, a popular approach in the literature, PACT mines negatives closer to the anchor representation. PACT operates by endowing the standard pretraining mechanisms of BERT with adversarial contrastive learning objectives, allowing for effective joint optimization of token- and sentence-level pretraining of the BERT model. Our experiments on 13 diverse datasets, including token-level, single-sentence, and sentence-pair text classification tasks, show that PACT achieves consistent improvements over SOTA baselines. We further show that PACT regularizes both token-level and sentence-level embedding spaces into more uniform representations, thereby alleviating the undesirable anisotropy of language model embeddings.

Subsequently, in the context of finetuning, we apply CL to cross-platform abusive language detection. The prevalence of abusive language on different online platforms has been a major concern that raises the need for automated cross-platform abusive language detection. However, prior work focuses on concatenating data from multiple platforms, inherently adopting the Empirical Risk Minimization (ERM) approach. In our work, we address this challenge from the perspective of a domain generalization objective. We design SCL-Fish, a meta-learning algorithm integrated with supervised contrastive learning, to detect abusive language on unseen platforms. Our experimental analysis shows that SCL-Fish achieves better performance than ERM and existing state-of-the-art models. We also show that SCL-Fish is data-efficient and achieves performance comparable to large-scale pretrained models upon finetuning for the abusive language detection task.
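As background for the contrastive objectives discussed above, the following is a minimal sketch of a standard in-batch contrastive (InfoNCE-style) loss, i.e., the popular baseline the abstract contrasts PACT against, rather than PACT's adversarial negative mining itself. The temperature value, embedding dimensionality, and the assumption that two views of each sentence are available are illustrative.

```python
import torch
import torch.nn.functional as F

def info_nce_loss(z1: torch.Tensor, z2: torch.Tensor, temperature: float = 0.05) -> torch.Tensor:
    """In-batch contrastive loss: z1[i] and z2[i] embed two views of the same sentence;
    every other sentence in the batch acts as a negative."""
    z1, z2 = F.normalize(z1, dim=-1), F.normalize(z2, dim=-1)
    logits = z1 @ z2.t() / temperature                   # (batch, batch) cosine similarities
    targets = torch.arange(z1.size(0), device=z1.device)
    return F.cross_entropy(logits, targets)              # positives sit on the diagonal

# Usage with dummy sentence embeddings (e.g., pooled encoder outputs of two dropout views).
z1, z2 = torch.randn(32, 768), torch.randn(32, 768)
print(info_nce_loss(z1, z2).item())
```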


Representation learning for Arabic dialect identification (2022)

Arabic dialect identification (ADI) is an important aspect of the Arabic speech processing pipeline, and in particular of dialectal Arabic automatic speech recognition (ASR) models. In this work, we present an overview of corpora and methods applicable to both ADI and dialectal Arabic ASR, and then benchmark two approaches to using pre-trained speech representation models for ADI: first, direct fine-tuning, and second, using fixed representations extracted from pre-trained models as an intermediate step in the ADI process. We train and evaluate our models on the fine-grained ADI-17 Arabic dialect corpus (92% F1 for our fine-tuned HuBERT model), and further probe generalization by evaluating our trained models on the coarse-grained ADI-5 corpus (80% F1 for fine-tuned HuBERT).
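As an illustration of the fixed-representation approach described above, the sketch below extracts frozen HuBERT features with the Hugging Face transformers library and feeds a mean-pooled utterance embedding to a linear dialect classifier. The checkpoint name, mean pooling, and linear head are assumptions for illustration; the thesis's exact setup may differ.

```python
import torch
import torch.nn as nn
from transformers import AutoFeatureExtractor, HubertModel

# Hypothetical checkpoint; the thesis abstract does not name its exact pre-trained weights.
ckpt = "facebook/hubert-base-ls960"
feature_extractor = AutoFeatureExtractor.from_pretrained(ckpt)
hubert = HubertModel.from_pretrained(ckpt).eval()

@torch.no_grad()
def utterance_embedding(waveform, sampling_rate=16_000):
    """Mean-pool frozen HuBERT hidden states into a single fixed utterance vector."""
    inputs = feature_extractor(waveform, sampling_rate=sampling_rate, return_tensors="pt")
    hidden = hubert(**inputs).last_hidden_state   # (1, frames, 768)
    return hidden.mean(dim=1)                     # (1, 768)

# A simple linear dialect classifier trained on top of the fixed representations
# (17 classes for ADI-17 in this illustrative setup).
classifier = nn.Linear(768, 17)
logits = classifier(utterance_embedding(torch.randn(16_000).numpy()))
```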


Investigating the impact of normalizing flows on latent variable machine translation (2020)

Natural language processing (NLP) has pervasive applications in everyday life and has recently witnessed rapid progress. Incorporating latent variables in NLP systems can allow for explicit representations of certain types of information. In neural machine translation systems, for example, latent variables have the potential of enhancing semantic representations, which could help improve general translation quality. Previous work has focused on using variational inference with diagonal-covariance Gaussian distributions, which we hypothesize cannot sufficiently encode latent factors of language that may exhibit multi-modal distributional behavior. Normalizing flows enable more flexible posterior distribution estimates by introducing a change of variables with invertible functions; they have previously been used successfully in computer vision to enable more flexible posterior distributions over image data. In this work, we investigate the impact of normalizing flows in autoregressive neural machine translation systems. We do so in the context of two currently successful approaches: attention mechanisms and language models. Our results suggest that normalizing flows can improve translation quality in some scenarios, and require certain modelling assumptions to achieve such improvements.
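To illustrate the change-of-variables idea referenced above, here is a minimal planar-flow sketch (one classic family of normalizing flows): each step applies an invertible transform to a reparameterized Gaussian sample and accumulates the log-determinant term that enters the variational bound. The dimensionality, number of flow steps, and initialization are illustrative and do not reproduce the thesis's translation models.

```python
import torch
import torch.nn as nn

class PlanarFlow(nn.Module):
    """One planar step f(z) = z + u * tanh(w·z + b): an invertible transform (for suitable
    u and w) with a cheap log|det Jacobian|, used to reshape a diagonal-Gaussian posterior."""
    def __init__(self, dim: int):
        super().__init__()
        self.u = nn.Parameter(torch.randn(dim) * 0.01)
        self.w = nn.Parameter(torch.randn(dim) * 0.01)
        self.b = nn.Parameter(torch.zeros(1))

    def forward(self, z):                                  # z: (batch, dim)
        a = torch.tanh(z @ self.w + self.b)                # (batch,)
        f_z = z + a.unsqueeze(-1) * self.u                 # change of variables
        psi = (1 - a ** 2).unsqueeze(-1) * self.w          # derivative of the tanh term
        log_det = torch.log(torch.abs(1 + psi @ self.u) + 1e-8)
        return f_z, log_det                                # log_det enters the variational bound

# Stack a few flow steps on top of a reparameterized Gaussian sample z ~ q(z|x).
flows = nn.ModuleList(PlanarFlow(dim=64) for _ in range(4))
z, total_log_det = torch.randn(8, 64), torch.zeros(8)
for flow in flows:
    z, log_det = flow(z)
    total_log_det = total_log_det + log_det
```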


Current Students & Alumni

This is a small sample of students and/or alumni that have been supervised by this researcher. It is not meant as a comprehensive list.
 
 


 
 
