Giuseppe Carenini

Professor

Graduate Student Supervision

Doctoral Student Supervision

Dissertations completed in 2010 or later are listed below. Please note that there is a 6-12 month delay to add the latest dissertations.

Versatile neural approaches to more accurate and robust topic segmentation (2024)

Topic segmentation, as a fundamental NLP task, has been proposed and systematically studied since the 1980s and received increased attention in recent years due to the surge in big data. It aims to unveil the coarse-grained semantic structure of long unstructured documents by automatically dividing them into shorter, topically coherent segments. The coarse-grained structure provided by topic segmentation has been proven to not only enhance human reading efficiency but also play a vital role in other natural language understanding tasks, such as text summarization, question answering, and dialogue modeling. Before the neural era, early computational models for topic segmentation typically adhered to unsupervised paradigms with lexical cohesion directly derived from the input, yet their performance was notably limited. With the evolution of deep learning and enhanced computational capabilities, neural models have delivered significant progress in performance. Nevertheless, inadequate coherence modeling, in terms of both explicitness and reliability in these neural approaches, prevents them from emerging as more accurate and robust solutions for topic segmentation. Additionally, the growing prevalence of multi-modal data content across social media platforms has heightened the need for topic segmentation to traverse beyond mere text, extending into videos. Motivated by the challenges and needs mentioned above, in this thesis, we direct our efforts towards enhancing neural topic segmentation for two types of documents: text and video. To overcome the inadequate coherence modeling (explicitness and reliability) in neural topic segmenters for text, we propose a series of methods that either more explicitly model coherence patterns or leverage coherence signals encoded in related auxiliary tasks, notably discourse parsing and language modeling.
For video content, we explore extending neural topic segmenters, originally designed for text, to a multi-modal setting that is also robust to the often-encountered drastic variance in video length. A comprehensive set of experimental results indicates that our methods not only effectively enhance the overall performance of neural segmenters for text and video in intra-domain scenarios, but also broaden their applicability to data in other domains.

View record

Exploration on the synergy between discourse and neural summarizers (2023)

Automatic Text Summarization is the challenging NLP task of summarizing some source input text - a single document, a cluster of multiple related documents, or conversations - into a shorter text, covering the main points succinctly. Since the 1950s, researchers have been exploring with some success both extractive solutions - simply picking salient sentences from the source, and abstractive ones - directly generating the summary word by word. Before the neural revolution, summarizers relied on explicit linguistic properties of the input, including lexical, syntactic and discourse information, but their performance was rather poor. With the development of powerful machine learning and the availability of large-scale datasets, neural models have delivered considerable gains in performance regarding both automatic evaluation metrics and human evaluation. Nevertheless, they still suffer from difficulties in understanding the long input document(s), and the generated summaries often contain factual inconsistencies and hallucinations, namely content that does not exist in, or even contradicts, the source document. One promising solution that we explore in this thesis is to inject linguistic knowledge into the neural models as guidance. In particular, we focus on discourse, which reflects how the text is aggregated as a coherent document with specific central focus and fluent topical transport. While in the past decades, discourse has been shown to be helpful in non-neural summarizers, here we investigate the synergy between discourse and neural summarizers, with a special focus on the discourse structure (both explicit structures and RST structures) and the entities of the document(s). Specifically, we propose a series of methods to inject discourse information into neural summarizers regarding both the document understanding and summary generation steps, covering both extractive and abstractive methods.
A large set of experimental results indicates that our methods effectively improve not only the summarizers' overall performance, but also their efficiency, generality, and factualness. Conversely, by evaluating the discourse trees induced by neural summarizers on several human-annotated discourse tree datasets, we show that such summarizers do capture discourse structural information implicitly.

View record

Better document-level natural language understanding through data-driven applications of discourse theories (2022)

A discourse constitutes a locally and globally coherent text in which words, clauses and sentences are not solely a sequence of independent statements, but follow a hidden structure, encoding the author's underlying communicative goal(s). As such, the meaning of a discourse as a whole goes beyond the meaning of its individual parts, guided by the latent semantic and pragmatic relationships holding between parts of the document. Clearly falling into the area of Natural Language Understanding (NLU), discourse analysis augments textual inputs with structured representations following linguistic formalisms and frameworks. Annotating documents following these elaborate formalisms has led to the computationally inspired research area of discourse parsing, aiming to generate robust and general discourse annotations for arbitrary documents through automated approaches. With computational discourse parsers having great success at inferring valuable structures and supporting prominent real-world tasks such as sentiment analysis, text classification, and summarization, discourse parsing has been established as a valuable source of structured information. However, a significant limitation preventing the broader application of discourse-inspired approaches, especially in the context of modern deep-learning models, is the lack of available gold-standard data, caused by the tedious and expensive human annotation process. To overcome the prevalent data sparsity issue in the areas of discourse analysis and discourse parsing, it is imperative to find new methods to generate large-scale and high-quality discourse annotations, not relying on the restrictive human annotation process. Along these lines, we present a set of novel computational approaches to (partially) overcome the data sparsity issue by proposing distantly and self-supervised methods to automatically generate large-scale, high-quality discourse annotations in a data-driven manner. 
In this thesis, we provide detailed insights into our technical contributions and diverse evaluations. Specifically, we show the competitive and complementary nature of our discourse inference approaches to human-annotated discourse information, partially outperforming gold-standard discourse structures on the important task of "inter-domain" discourse parsing. We further elaborate on our generated discourse annotations in regard to their ability to support linguistic theories and downstream tasks, finding that they have direct applications in linguistics and Natural Language Processing (NLP).

View record

Visual text analytics for online conversations (2017)

With the proliferation of Web-based social media, asynchronous conversations have become very common for supporting online communication and collaboration. Yet the increasing volume and complexity of conversational data often make it very difficult to get insights about the discussions. This dissertation posits that by integrating natural language processing and information visualization techniques in a synergistic way, we can better support the user's task of exploring and analyzing conversations. Unlike most previous systems, which do not consider the specific characteristics of online conversations, we applied design study methodologies from the visualization literature to uncover the data and task abstractions that guided the development of a novel set of visual text analytics systems. The first such system is ConVis, which supports users in exploring an asynchronous conversation, such as a blog. ConVis offers a visual overview of a conversation by presenting topics, authors, and the thread structure of a conversation, as well as various interaction techniques such as brushing and linked highlighting. Broadening from a single conversation to a collection of conversations, MultiConVis combines a novel hierarchical topic modeling with multi-scale exploration techniques. A series of user studies revealed significant improvements in user performance and subjective measures when these two systems were compared to traditional blog interfaces. Based on the lessons learned from these studies, this dissertation introduced an interactive topic modeling framework specifically for asynchronous conversations. The resulting systems empower the user in revising the underlying topic models through an intuitive set of interactive features when the current models are noisy and/or insufficient to support their information seeking tasks.
Two summative studies suggested that these systems outperformed their counterparts that do not support interactive topic modeling along several subjective and objective measures. Finally, to demonstrate the generality and applicability of our approach, we tailored our previous systems to support information seeking in community question answering forums. The prototype was evaluated through a large-scale Web-based study, which suggests that our approach can be adapted to a specific conversational genre among a diverse range of users. The dissertation concludes with a critical reflection on our approach and considerations for future research.

View record

Discourse analysis of asynchronous conversations (2014)

A well-written text is not merely a sequence of independent and isolated sentences, but instead a sequence of structured and related sentences. It addresses a particular topic, often covering multiple subtopics, and is organized in a coherent way that enables the reader to process the information. Discourse analysis seeks to uncover such underlying structures, which can support many applications including text summarization and information extraction. This thesis focuses on building novel computational models of different discourse analysis tasks in asynchronous conversations; i.e., conversations where participants communicate with each other at different times (e.g., emails, blogs). Effective processing of these conversations can be of great strategic value for both organizations and individuals. We propose novel computational models for topic segmentation and labeling, rhetorical parsing and dialog act recognition in asynchronous conversation. Our approaches rely on two related computational methodologies: graph theory and probabilistic graphical models. The topic segmentation and labeling models find the high-level discourse structure; i.e., the global topical structure of an asynchronous conversation. Our graph-based approach extends state-of-the-art methods by integrating a fine-grained conversational structure with other conversational features. On the other hand, the rhetorical parser captures the coherence structure, a finer discourse structure, by identifying coherence relations between the discourse units within each comment of the conversation. Our parser applies an optimal parsing algorithm to probabilities inferred from a discriminative graphical model which allows us to represent the structure and the label of a discourse tree constituent jointly, and to capture the sequential and hierarchical dependencies between the constituents. Finally, the dialog act model allows us to uncover the underlying dialog structure of the conversation.
We present unsupervised probabilistic graphical models that capture the sequential dependencies between the acts, and show how these models can be trained more effectively based on the fine-grained conversational structure. Together, these structures provide a deep understanding of an asynchronous conversation that can be exploited in the above-mentioned applications. For each discourse processing task, we evaluate our approach on different datasets, and show that our models consistently outperform the state-of-the-art by a wide margin. Often our results are highly correlated with human annotations.

View record

Master's Student Supervision

Theses completed in 2010 or later are listed below. Please note that there is a 6-12 month delay to add the latest theses.

Neural multimodal topic modeling : a comprehensive evaluation (2023)

Neural topic models can successfully find coherent and diverse topics in textual data. However, they are limited in dealing with multimodal datasets (e.g., images and text). This thesis presents the first systematic and comprehensive evaluation of multimodal topic modeling of documents containing both text and images. In the process, we propose three novel topic modeling solutions and two novel evaluation metrics. Moreover, we focus on one of our models and explore additional techniques to improve the quality of topics, such as incorporating external knowledge. Overall, our evaluation on an unprecedentedly rich and diverse collection of datasets indicates that all of our models generate coherent and diverse topics. Nevertheless, the extent to which one method outperforms the other depends on the metrics and dataset combinations, which suggests further exploration of combined approaches in the future.

View record

Using transformers to predict customer satisfaction for live chat dialogues : guiding applied natural language processing research in contact centres through design thinking (2023)

Contact centres are one of the most important channels by which many private and public organizations, such as retail brands, banks, airlines, and government departments, interact with millions of people every day. Customer satisfaction is the dominant quality metric that contact centres use for a wide variety of evaluative purposes such as helping inform customer experience, voice of customer reporting, agent service quality, or product and service improvements. Through a qualitative study with six contact centres and drawing on methods from the human-computer interaction field such as design thinking, interviews, and affinity diagrams, we identify several important tasks in contact centres that rely heavily on customer satisfaction scores. We form a collaborative research partnership with the contact centre at lululemon athletica inc., a large apparel retail organization, in order to gain access to a large unique corpus of customer service live chat dialogues and domain experts. Through this partnership, we collaboratively develop a solution to improve the measurement of customer satisfaction. We propose a solution that predicts customer satisfaction scores for live chats based on a supervised machine-learning approach. We treat predicting customer satisfaction scores for live chats as a single-label document classification task and utilize transformer models to achieve notable performance versus several other common document classification models. To the best of our knowledge, this is the first study to show that fine-tuned transformer models are very effective at predicting customer satisfaction for live chats, thus improving the measurement of customer satisfaction in contact centres.

View record

Effective techniques of combining information visualization with natural language processing (2022)

In this thesis, we present three research projects that lie at the intersection of visual analysis and natural language processing (NLP), covering a very broad range of user information needs and analytical tasks. On one hand, integrating NLP techniques into visual frameworks can facilitate the exploration and analysis of large-scale textual data. In this domain, we present ConVIScope, a visual framework designed for exploring patient-doctor conversations. On the other hand, visual analytics can also assist in the development of NLP models by providing valuable insights about the model's intrinsic properties and behaviours. Under this line of research, we propose T³-Vis, a system that utilizes visual analytics to assist users in the training of Transformers, a dominant architecture in NLP. Lastly, we present an in-depth case study where we used a novel pipeline for analyzing and improving extractive summarization models with guidance from our visual interface, T³-Vis.

View record

From neural discourse parsing to content structuring: towards a large-scale data-driven approach to discourse processing (2021)

In this thesis, we propose novel approaches for supervised RST-style discourse parsing, as well as methods for utilizing those discourse structures for the benefit of natural language generation. We demonstrate a significant improvement in discourse parsing accuracy on the RST-DT and Instr-DT treebanks by incorporating silver-standard supervision. Furthermore, in line with theoretical and empirical connections between the discourse parsing and coreference resolution tasks, we find evidence of improved discourse parsing accuracy on RST-DT when our proposed discourse parsing system is provided with coreference supervision from a coreference resolver trained on the OntoNotes corpus. Finally, in extending our work to natural language generation, we demonstrate that our novel content structuring system utilizing silver-standard discourse structures outperforms text-only systems on our proposed task of elementary discourse unit ordering, a significantly more difficult version of the sentence ordering task.

View record

Exploring neural models for predicting dementia from language (2019)

In this thesis we explore the effectiveness of neural models that require no task-specific features for automatic dementia prediction. The problem is about classifying Alzheimer's disease (AD) from recordings of patients undergoing the Boston Diagnostic Aphasia Examination (BDAE). First we use a multimodal neural model to fuse linguistic features and acoustic features, and investigate the performance change compared to simply concatenating these features. Then we propose a novel coherence feature generated by a neural coherence model, and evaluate the predictiveness of this new feature for dementia prediction. Finally we apply an end-to-end neural method which is free from feature engineering and achieves a state-of-the-art classification result on a widely used dementia dataset. We further interpret the predictions made by this neural model from different angles, including model visualization and statistical tests.

View record

Extractive summarization of long documents by combining global and local context (2019)

In this thesis, we propose a novel neural single-document extractive summarization model for long documents, incorporating both the global context of the whole document and the local context within the current topic. We evaluate the model on two datasets of scientific papers, PubMed and arXiv, where it outperforms previous work, both extractive and abstractive models, on ROUGE-1 and ROUGE-2 scores. We also show that, consistently with our goal, the benefits of our method become stronger as we apply it to longer documents. Besides, we also show that when the topic segment information is not explicitly provided, if we apply a pretrained topic segmentation model that splits documents into sections, our model is still competitive with state-of-the-art models.

View record

Infrequent discourse relation identification using Data Programming (2019)

Discourse parsing is an important task in natural language processing as it supports a wide range of downstream NLP tasks. While the overall performance of discourse parsing has been recently improved considerably, the performance on identifying relatively infrequent discourse relations is still rather low (∼ 20 in terms of F1 score). To resolve the gap between the performance of infrequent and frequent relations, we propose a novel method for discourse relation identification that is centered around “a paradigm for the programmatic creation of training datasets,” called Data Programming (DP). The main idea in our approach is to overcome the issue of limited labeled data for infrequent relations by leveraging unlabeled data in addition to labeled data. Our experiments show that our method improves the performance on most of the infrequent relations with minimal negative effect on frequent relations.

View record

NJM-Vis: applying and interpreting neural network joint models in natural language processing applications (2019)

Neural joint models have been shown to outperform non-joint models on several NLP and Vision tasks and constitute a thriving area of research in AI and ML. Although several researchers have worked on enhancing the interpretability of single-task neural models, in this thesis we present what is, to the best of our knowledge, the first interface to support the interpretation of results produced by joint models, focusing in particular on NLP settings. Our interface is intended to enhance interpretability of these models for both NLP practitioners and domain experts (e.g., linguists).

View record

OCTVis: ontology-based comparison of topic models (2019)

Topic modeling is a natural language processing (NLP) task that statistically identifies topics from a set of texts. Evaluating results from topic modeling is difficult in context and often requires domain experts. To facilitate evaluation of topic model results within communication between NLP researchers and domain experts, we present a visual comparison framework, OCTVis, to explore results from two topic models mapped against a domain ontology. The design of OCTVis is based on detailed and abstracted data and task models. We support high-level topic model comparison by mapping topics onto ontology concepts and incorporating topic alignment visualizations. For in-depth exploration of the dataset, display of per-document topic distributions and buddy plots allow comparison of topics, texts, and shared keywords at the document level. Case studies with medical domain experts using healthcare texts indicate that our framework enhances qualitative evaluation of topic models and provides a clearer understanding of how topic models can be improved.

View record

A Study of Methods for Learning Phylogenies of Cancer Cell Populations from Binary Single Nucleotide Variant Profiles (2015)

An accurate phylogeny of a cancer tumour has the potential to shed light on numerous phenomena, such as key oncogenetic events, relationships between clones, and evolutionary responses to treatment. Most work in cancer phylogenetics to-date relies on bulk tissue data, which can resolve only a few genotypes unambiguously. Meanwhile, single-cell technologies have considerably improved our ability to resolve intra-tumour heterogeneity. Furthermore, most cancer phylogenetic methods use classical approaches, such as Neighbor-Joining, which put all extant species on the leaves of the phylogenetic tree. But in cancer, ancestral genotypes may be present in extant populations. There is a need for scalable methods that can capture this phenomenon. We have made progress on this front by developing the Genotype Tree representation of cancer phylogenies, implementing three methods for reconstructing Genotype Trees from binary single-nucleotide variant profiles, and evaluating these methods under a variety of conditions. Additionally, we have developed a tool that simulates the evolution of cancer cell populations, allowing us to systematically vary evolutionary conditions and observe the effects on tree properties and reconstruction accuracy. Of the methods we tested, Recursive Grouping and Chow-Liu Grouping appear to be well-suited to the task of learning phylogenies over hundreds to thousands of cancer genotypes. Of the two, Recursive Grouping has the strongest and most stable overall performance, while Chow-Liu Grouping has a superior asymptotic runtime that is competitive with Neighbor-Joining.

View record

Detecting dementia from written and spoken language (2018)

This thesis makes three main contributions to existing work on the automatic detection of dementia from language. First we introduce a new set of biologically motivated spatial neglect features, and show their inclusion achieves a new state of the art in classifying Alzheimer's disease (AD) from recordings of patients undergoing the Boston Diagnostic Aphasia Examination. Second we demonstrate how a simple domain adaptation algorithm can be used to leverage AD data to improve classification of mild cognitive impairment (MCI), a condition characterized by a slight-but-noticeable decline in cognition that does not meet the criteria for dementia, and a condition for which reliable data is scarce. Third, we investigate whether dementia can be detected from written rather than spoken language, and show a range of classifiers achieve a performance far above baseline. Additionally, we create a new corpus of blog posts written by authors with and without dementia and make it publicly available for future researchers.

View record

Summarization of partial email threads: silver standards and bayesian surprise (2018)

We define and motivate the problem of summarizing partial email threads. This problem introduces the challenge of generating reference summaries for these partial threads when extractive human annotation is only available for the threads as a whole, since gold standard annotation intended to summarize a completed email thread may not always be equally applicable to each of its partial threads, particularly when the human-selected sentences are not uniformly distributed within the threads. We propose a framework for generating these reference summaries with arbitrary length in an oracular manner by exploiting existing gold standard summaries for completed email threads. We also propose and evaluate two sentence scoring functions that can be used in this "silver standard" framework, and we are making the resulting datasets publicly available. In addition, we apply a recent unsupervised method based on Bayesian Surprise that incorporates background knowledge to partial thread summarization, extend that method with conversational features, and modify the mechanism by which it handles information redundancy. Experiments with our partial thread summarizers indicate comparable or improved performance relative to a state-of-the-art unsupervised full thread summarizer baseline in most cases; and we have identified areas in which potential vulnerabilities in our methods can be avoided or accounted for. Furthermore, our results suggest that the potential benefits of background knowledge to partial thread summarization should be further investigated with larger datasets.

View record

A semi-joint neural model for sentence level discourse parsing and sentiment analysis (2017)

Discourse Parsing and Sentiment Analysis are two fundamental tasks in Natural Language Processing that have been shown to be mutually beneficial. In this work, we design and compare two Neural Based models for jointly learning both tasks. In the proposed approach, we first create a vector representation for all the segments in the input sentence. Next, we apply three different Recursive Neural Net models: one for discourse structure prediction, one for discourse relation prediction and one for sentiment analysis. Finally, we combine these Neural Nets in two different joint models: Multi-tasking and Pre-training. Our results on two standard corpora indicate that both methods result in improvements in each task but Multi-tasking has a bigger impact than Pre-training.

View record

Improve Classification on Infrequent Discourse Relations via Training Data Enrichment (2017)

Discourse parsing is a popular technique widely used in text understanding, sentiment analysis, and other NLP tasks. However, for most discourse parsers, the performance varies significantly across different discourse relations. In this thesis, we first validate the underfitting hypothesis, i.e., the less frequent a relation is in the training data, the poorer the performance on that relation. We then explore how to increase the number of positive training instances, without resorting to manually creating additional labeled data. We propose a training data enrichment framework that relies on co-training of two different discourse parsers on unlabeled documents. Importantly, we show that co-training alone is not sufficient. The framework requires a filtering step to ensure that only “good quality” unlabeled documents can be used for enrichment and re-training. We propose and evaluate two ways to perform the filtering. The first is to use an agreement score between the two parsers. The second is to use only the confidence score of the faster parser. Our empirical results show that agreement score can help to boost the performance on infrequent relations, and that the confidence score is a viable approximation of the agreement score for infrequent relations.

View record

Exploring Machine Learning Design Options in Discourse Parsing (2015)

Discourse parsing has recently attracted increasing interest among researchers since it is very helpful for text understanding, sentiment analysis and other NLP tasks. In a well-written text, authors often use discourse to better organize the text, and sentences (or clauses) tend to interact with neighboring sentences (or clauses). Each piece of text locally exhibits a finer discourse structure called rhetorical structure. A document can be organized into a discourse tree (this process is called discourse parsing), which seeks to capture the discourse structure and logically binds the sentences (or clauses) together. However, despite the usefulness of discourse parsing, and although intra-sentential discourse parsing already achieves high performance, multi-sentential discourse parsing remains a big challenge in terms of both accuracy and efficiency. In addition, machine learning techniques have proved successful in many NLP tasks, including discourse parsing. Thus, in this thesis, we try to enhance the performance (e.g., accuracy, efficiency) of discourse parsing by using machine learning techniques. To this end, we propose a novel two-step discourse parsing system, which first builds a discourse tree for a given text by applying optimal probabilistic parsing to probabilities inferred from learned conditional random fields (CRFs), then uses learned log-linear models to tag all discourse relations to the nodes in the discourse tree. We analyze different aspects of the problem (e.g., sequential vs. non-sequential model, greedy vs. optimal parsing, joint vs. separate model) and discuss their trade-offs. We also carried out extensive experiments to study the usefulness of different feature families and over-fitting.
Consequently, we find that the most effective feature sets differ across tasks: part-of-speech (POS) and context features are the most effective for intra- and multi-sentential structure prediction, respectively, while n-gram features are the most effective for both intra- and multi-sentential relation labeling. Moreover, over-fitting does occur in our experiments, so proper regularization is needed. The final results show that our system achieves state-of-the-art F-scores of 86.2, 72.2 and 59.2 in structure, nuclearity and relation, respectively. It is also more efficient than Joty's (training: 40 times faster; test: 3 times faster).

Automatic Abstractive Summarization of Meeting Conversations (2014)

Nowadays, there are various ways for people to share and exchange information. Phone calls, e-mails, and social networking applications are tools which have made it much easier for us to communicate. Despite the existence of these convenient methods for exchanging ideas, meetings are still one of the most important ways for people to collaborate, share information, discuss their plans, and make decisions for their organizations. However, meetings have drawbacks as well. Generally, they are time consuming and require the participation of all members. Taking meeting minutes for the benefit of those who miss meetings also requires considerable time and effort.

To this end, there has been increasing demand for the creation of systems to automatically summarize meetings. So far, most summarization systems have applied extractive approaches whereby summaries are simply created by extracting important phrases or sentences and concatenating them in sequence. However, considering that meeting transcripts consist of spontaneous utterances containing speech disfluencies such as repetitions and filled pauses, traditional extractive summarization approaches do not work effectively in this domain. To address these issues, we present a novel template-based abstractive meeting summarization system requiring less annotated data than that needed for previous abstractive summarization approaches. In order to generate abstract and robust templates that can guide the summarization process, our system extends a novel multi-sentence fusion algorithm and utilizes lexico-semantic information. It also leverages the relationship between human-authored summaries and their source meeting transcripts to select the best templates for generating abstractive summaries of meetings. In our experiment, we use the AMI corpus to instantiate our framework and compare it with state-of-the-art extractive and abstractive systems as well as human extractive and abstractive summaries.
Our comprehensive evaluations, based on both automatic and manual approaches, have demonstrated that our system outperforms all baseline systems and human extractive summaries in terms of both readability and informativeness. Furthermore, it has achieved a level of quality nearly equal to that of human abstracts based on a crowd-sourced manual evaluation.

Evaluating Open Relation Extraction Over Conversational Texts (2014)

This thesis studies, for the first time, the performance of Open IE systems on conversational data. Due to the lack of test datasets in this domain, a method for creating a test dataset covering a wide range of conversational data is proposed. Conversational text is more complex and challenging for relation extraction because of its cryptic content and ungrammatical colloquial language. Consequently, text simplification is used as a remedy to empower Open IE tools for relation extraction. Experimental results show that, for most datasets, text simplification helps OLLIE, a state-of-the-art tool for relation extraction, find new relations, extract more accurate relations, and assign higher confidence scores to correct relations and lower confidence scores to incorrect ones. Results also show that some conversational modalities, such as emails and blogs, are easier for the relation extraction task, while product reviews are the most difficult modality.

Blog comments classification using tree structured conditional random fields (2013)

The Internet provides a variety of ways for people to easily share, socialize, and interact with each other. One of the most popular platforms is the online blog. As a result, a vast amount of new text data, in the form of blog comments and opinions about news, events and products, is generated every day. However, not all comments have equal quality. Informative or high-quality comments have greater impact on the readers’ opinions about the original post content, such as the benefits of the product discussed in the post, or the interpretation of a political event. Therefore, developing an efficient and effective mechanism to detect the most informative comments is highly desirable. For this purpose, sites like Slashdot, where users volunteer to rate comments based on their informativeness, can be a great resource for building such an automated system using supervised machine learning techniques. Our research concerns building an automatic comment classification system leveraging these freely available valuable resources. Specifically, we discuss how informative comments in blogs can be detected using Conditional Random Fields (CRFs). Blog conversations typically have a tree-like structure in which an initial post is followed by comments, and each comment can be followed by other comments. In this work, we present our approach using Tree-structured Conditional Random Fields (TCRFs) to capture the dependencies in a tree-like conversational structure. This is in contrast with previous work [5] in which results produced by linear-chain CRF models had to be aggregated heuristically. As an additional contribution, we present a new blog corpus consisting of conversations of different genres from 6 different blog websites. We use this corpus to train and test our classifiers based on TCRFs.

A visual interface for browsing and summarizing conversations (2012)

In our daily lives, we have conversations with others in many different modalities, such as meetings, emails, chats, and blogs. Since the advent of the Web, the volume and complexity of the conversational data generated through our day-to-day communication have increased manyfold. One way to deal with this overwhelming amount of interactional information is to use automatic summarization for quick access. Although Machine Learning approaches can be used to generate automatic summaries, extractive or abstractive, they still have not reached the level of quality of human-generated summaries. We introduce here a visual interface that takes advantage of human cognition and perception abilities in conjunction with automatically extracted knowledge concepts for the conversation to analyze it and to automatically generate a summary for it. Our interface provides the user with an overview of the conversation's content and a way to quickly explore it. It helps identify informative sentences as potential components of the summary based on visual cues. Our objective is to give the user more control over choosing the topics she wants to appear in the concise resultant overview generated through interactive exploration, thus generating a focused summary. We use an ontology containing nodes for speakers, dialogue acts (DA), and a list of entities referred to in the conversation to provide entry points to the conversation. These concepts in the ontology are derived using classifiers based on generic features, making it possible to use the interface to explore any mode of conversational data. In this thesis, we have designed an interface based on the principles of Natural Language Processing, Human Computer Interaction, and Information Visualization that can be used to browse a human conversation using the mapping of sentences to those ontology concepts and can be used to generate a brief and focused summary for the conversation.
We have evaluated our interface in a formal user study and have found that our interface facilitates widely varying approaches adopted by people trying to analyze a conversation.

Domain Adaptation for Summarizing Conversations (2011)

The goal of summarization in natural language processing is to create abridged and informative versions of documents. A popular approach is supervised extractive summarization: given a training source corpus of documents with sentences labeled with their informativeness, train a model to select sentences from a target document and produce an extract. Conversational text is challenging to summarize because it is less formal, its structure depends on the modality or domain, and few annotated corpora exist. We use a labeled corpus of meeting transcripts as the source, and attempt to summarize a different target domain, threaded emails. We study two domain adaptation scenarios: a supervised scenario in which some labeled target domain data is available for training, and an unsupervised scenario with only unlabeled data in the target and labeled data available in a related but different domain. We implement several recent domain adaptation algorithms and perform a comparative study of their performance. We also compare the effectiveness of using a small set of conversation-specific features with a large set of raw lexical and syntactic features in domain adaptation. We report significant improvements of the algorithms over their baselines. Our results show that in the supervised case, given the amount of email data available and the set of features specific to conversations, training directly in-domain and ignoring the out-of-domain data is best. With only the more domain-specific lexical features, though overall performance is lower, domain adaptation can effectively leverage the lexical features to improve in both the supervised and unsupervised scenarios.

Membership Status

Member of G+PS