Zhen Wang

Professor

Relevant Degree Programs

 

Graduate Student Supervision

Doctoral Student Supervision (Jan 2008 - May 2019)
Automated analysis of vascular structures of skin lesions: segmentation, pattern recognition and computer-aided diagnosis (2018)

No abstract available.

Deep learning with limited labeled image data for health informatics (2018)

Deep learning is a data-driven technique for developing intelligent systems using a large amount of training data. Amongst the deep learning applications, this thesis focuses on problems in health informatics. Compared to general deep learning applications, health informatics problems are complex and unique, and pose problem-specific challenges. Many of these problems, however, face a common challenge: the lack of labeled data. In this thesis, we explore the following three ways to overcome this challenge in three specific image-based health informatics problems: 1) The use of image patches instead of whole images as the input for deep learning. To increase the data size, each image is partitioned into non-overlapping, mid-level patches. This approach is illustrated by addressing the food image recognition problem. Automatic food recognition could be used for nutrition analysis. We propose a novel deep framework for mid-level food image patches. Evaluations on 3 benchmark datasets demonstrate that the proposed approach achieves superior performance over baseline convolutional neural network (CNN) methods. 2) The use of prior knowledge to reduce the high dimensionality and complexity of raw data. We illustrate this idea on magnetic resonance imaging (MRI) images, for diagnosing a common mental-health disorder, Attention Deficit Hyperactivity Disorder (ADHD). MRI has been increasingly used in analyzing ADHD with machine learning algorithms. We propose a multi-channel 3D CNN-based automatic ADHD diagnosis approach using MRI scans. Evaluations on the ADHD-200 Competition dataset show that the proposed approach achieves state-of-the-art accuracy. 3) The use of synthetic data pre-training along with real data domain adaptation to increase the available labeled data during training. We illustrate this idea on 2-D/3-D image registration problems. We propose a fully automatic and real-time CNN-based 2-D/3-D image registration system.
Evaluations on Transesophageal Echocardiography (TEE) X-ray images from clinical studies demonstrate that the proposed system outperforms existing methods in accuracy and speed. We further propose a pairwise domain adaptation module (PDA module), designed to be flexible for different deep learning-based 2-D/3-D registration frameworks with improved performance. Evaluations on two clinical applications demonstrate the PDA module's advantages for 2-D/3-D medical image registration with limited data.
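
As a rough illustration of the patch idea in point 1) of the abstract above, the sketch below partitions a 2-D image into non-overlapping patches so that one labeled image yields several training inputs. The function name `split_into_patches`, the 2x2 patch size and the choice to drop incomplete border patches are assumptions made for this example; the abstract does not state how the thesis sizes or selects its mid-level patches.

```python
def split_into_patches(image, patch_h, patch_w):
    """Partition a 2-D image (a list of rows) into non-overlapping patches.

    Rows and columns that do not fill a complete patch are dropped
    (an assumption of this sketch, not something stated in the abstract).
    """
    rows = len(image)
    cols = len(image[0]) if rows else 0
    patches = []
    # Step by the patch size so that consecutive patches never overlap.
    for top in range(0, rows - patch_h + 1, patch_h):
        for left in range(0, cols - patch_w + 1, patch_w):
            patch = [row[left:left + patch_w] for row in image[top:top + patch_h]]
            patches.append(patch)
    return patches


# A toy 4x4 "image" whose pixel values are 0..15 in row-major order.
image = [[r * 4 + c for c in range(4)] for r in range(4)]
patches = split_into_patches(image, 2, 2)
```

Here a single 4x4 image produces four 2x2 patches, each of which would inherit the label of the source image when used as a training sample.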

View record

Underdetermined joint blind source separation with application to physiological data (2017)

Blind Source Separation (BSS) methods have been attracting increasing attention for their promising applications in signal processing. Despite recent progress in BSS research, challenges remain. Specifically, this dissertation focuses on developing novel Underdetermined Blind Source Separation (UBSS) methods that can deal with several specific challenges in real applications, including a limited number of observations, self/cross dependence information and source inference in the underdetermined case. First, by taking advantage of the Noise Assisted Multivariate Empirical Mode Decomposition (NAMEMD) and Multiset Canonical Correlation Analysis (MCCA), we propose a novel BSS framework and apply it to extract the heartbeat signal from noisy nano-sensor signals. Furthermore, we generalize the idea of (over)determined joint BSS to the underdetermined case. We explore the dependence information between two datasets and propose an underdetermined joint BSS method for two datasets, termed UJBSS-2. In addition, by exploiting the cross correlation between each pair of datasets, we develop a novel and effective method to jointly estimate the mixing matrices from multiple datasets, referred to as Underdetermined Joint Blind Source Separation for Multiple Datasets (UJBSS-M). In order to improve the time efficiency and relax the sparsity constraint, we recover the latent sources based on subspace representation once the mixing matrices are estimated. As an example application of noise-enhanced signal processing, the proposed UJBSS-M method can also be used to solve the single-set UBSS problem when suitable noise is added to the observations. Finally, considering the recent increasing need for biomedical signal processing in the ambulatory environment, we propose a novel UBSS method for removing electromyogram (EMG) artifacts from electroencephalography (EEG) signals.
The proposed method for recovering the underlying sources is also applicable to other artifact removal problems. Simulation results demonstrate that the proposed methods yield superior performances over conventional approaches. We also evaluate the proposed methods on real physiological data, and the proposed methods are shown to effectively and efficiently recover the underlying sources.

View record

Brain connectivity network modeling using fMRI signals (2016)

Functional magnetic resonance imaging (fMRI) is one of the most popular non-invasive neuroimaging technologies, which examines the human brain at relatively good spatial resolution in both normal and disease states. In addition to the investigation of local neural activity in isolated brain regions, brain connectivity estimated from fMRI has provided a system-level view of brain functions. Despite recent progress on brain connectivity inference, there are still several challenges. Specifically, this thesis focuses on developing novel brain connectivity modeling approaches that can deal with particular challenges of real biomedical applications, including group pattern extraction from a population, false discovery rate control, incorporation of prior knowledge and time-varying brain connectivity network modeling. First, we propose a multi-subject, exploratory brain connectivity modeling approach that allows incorporation of prior knowledge of connectivity and determination of the dominant brain connectivity patterns among a group of subjects. Furthermore, to integrate the genetic information at the population level, a framework for genetically-informed group brain connectivity modeling is developed. We then focus on estimating the time-varying brain connectivity networks. The temporal dynamics of brain connectivity assess the brain in the additional temporal dimension and provide a new perspective on the understanding of brain functions. In this thesis, we develop a sticky weighted time-varying model to investigate the time-dependent brain connectivity networks. As the brain must strike a balance between stability and flexibility, assuming that brain connectivity is purely static or purely dynamic may be unrealistic. We therefore further propose making joint inference of time-invariant connections and time-varying coupling patterns by employing a multitask learning model.
The proposed methods have been applied to real fMRI data sets, and disease-induced changes in the brain connectivity networks have been observed. The brain connectivity study is able to provide deeper insights into neurological diseases, complementing the traditional symptom-based diagnostic methods. Results reported in this thesis suggest that brain connectivity patterns may serve as potential disease biomarkers in Parkinson's Disease.

View record

Effective image registration for motion estimation in medical imaging environments (2016)

Motion estimation is a key enabler for many advanced medical imaging / image analysis applications, and hence is of significant clinical interest. In this thesis, we study image registration for motion estimation in medical imaging environments, and focus on two clinically interesting problems: 1) deformable respiratory motion estimation from dynamic magnetic resonance imaging (MRI) data, and 2) rigid-body object motion estimation (e.g., surgical devices, implants) from fluoroscopic images. Respiratory motion is a major complicating factor in many image acquisition applications and image-guided interventions. Existing respiratory motion estimation methods typically rely on motion models learned from retrospective data, and therefore are vulnerable to unseen respiratory motion patterns. To address this limitation, we propose to use a dynamic MRI acquisition protocol to monitor respiratory motion, and a scatter-to-volume registration method that can directly recover the dense motion fields from the dynamic MRI data without explicitly modeling the motion. The proposed method achieves significantly higher motion estimation accuracy than the state-of-the-art methods in addressing varying respiratory motion patterns. Object motion estimation from fluoroscopic images is an enabling technology for advanced image guidance applications for Image-Guided Therapy (IGT). Complex and time-critical clinical procedures typically require the motion estimation to be accurate, robust and real-time, which existing methods cannot achieve simultaneously. We study 2-D/3-D registration for rigid-body object motion estimation to address the above challenges, and propose two new approaches to significantly improve the robustness and computational efficiency of 2-D/3-D registration.
We first propose to use pre-generated canonical form Digitally Reconstructed Radiographs (DRRs) to accelerate the DRR generation during intensity-based 2-D/3-D registration, which boosts the computational efficiency tenfold with little degradation in registration accuracy and robustness. We further demonstrate that the widely adopted intensity-based formulation for 2-D/3-D registration is ineffective, and propose a more effective regression-based formulation, solved using a Convolutional Neural Network (CNN). The proposed regression-based approach achieves significantly higher robustness, capture range and computational efficiency than state-of-the-art intensity-based approaches.

View record

Data famine in big data era : machine learning algorithms for visual object recognition with limited training data (2014)

Big data is an increasingly attractive concept in many fields both in academia and in industry. The increasing amount of information builds an illusion that we are going to have enough data to solve all data-driven problems. Unfortunately, this is not true, especially for areas where machine learning methods are heavily employed, since sufficient high-quality training data does not necessarily come with big data, and it is difficult or sometimes impossible to collect the sufficient training samples that most computational algorithms depend on. This thesis mainly focuses on dealing with situations with limited training data in visual object recognition, by developing novel machine learning algorithms to overcome the limited training data difficulty. We investigate three issues in object recognition involving limited training data: 1. one-shot object recognition, 2. cross-domain object recognition, and 3. object recognition for images with different picture styles. For Issue 1, we propose an unsupervised feature learning algorithm by constructing a deep structure of the stacked Hierarchical Dirichlet Process (HDP) auto-encoder, in order to extract "semantic" information from unlabeled source images. For Issue 2, we propose a Domain Adaptive Input-Output Kernel Learning algorithm to reduce the domain shifts in both input and output spaces. For Issue 3, we introduce a new problem involving images with different picture styles, formulate the relationship between pixel mapping functions and gradient-based image descriptors, and propose a multiple kernel-based algorithm to learn an optimal combination of basis pixel mapping functions to improve the recognition accuracy. For all the proposed algorithms, experimental results on publicly available data sets demonstrate performance improvements over previous state-of-the-art methods.

View record

MIMO backscatter RFID systems : performance analysis, design and comparison (2014)

Backscatter RFID systems are the most popular RFID systems deployed due to low cost and low complexity. However, they pose many design challenges due to their querying-fading-signaling-fading structure, which experiences deeper fading than conventional one-way channels. Recently, by simulations and measurements, researchers found that the MIMO setting can improve the performance of backscatter RFID systems. These simulations and measurements were based on simple signaling schemes and no rigorous mathematical analysis has been provided. In this thesis, we explore querying, space-time coding (STC), and diversity combining schemes over the three ends of the backscatter RFID systems and provide generalized performance analysis and design criteria. At the tag end, we show that the identical signaling scheme, which cannot improve the bit error rate (BER) performance in conventional one-way channels, can significantly improve the BER performance of backscatter RFID. We also analytically study the performances of orthogonal STCs, with different sub-channel fading assumptions, and show that the diversity order depends only on the number of tag antennas. More interestingly, we show that the performance is more sensitive to the channel condition of the forward link than that of the backscattering link. In previous literature, the understanding of the query end is that the designs of query signals have no potential to improve the system performance. However, we show that some well-designed query signals can improve the system performance significantly. We propose a novel unitary query method in this thesis. Conventional measures of the physical layer performance cannot be obtained analytically in backscatter RFID channels when our unitary query is employed. We thus provide a new performance measure to overcome the difficulty of conventional measures, and show why the unitary query has superior performance. The multi-keyhole channel is another type of cascaded channel.
The backscatter RFID channel and the multi-keyhole channel look similar, but are essentially different, and their difference has not been clearly studied in previous literature. In the final part of this thesis, by investigating general STCs and revealing a few interesting properties of this channel in the MISO case, we show that the two channels achieve completely different diversity orders and BER performance.

View record

Multimodal biomedical signal processing for corticomuscular coupling analysis (2014)

Corticomuscular coupling analysis using multiple data sets such as electroencephalogram (EEG) and electromyogram (EMG) signals provides a useful tool for understanding human motor control systems. A popular conventional method to assess corticomuscular coupling is the pair-wise magnitude-squared coherence (MSC). However, there are certain limitations associated with MSC, including the difficulty in robustly assessing group inference, only dealing with two types of data sets simultaneously and the biologically implausible assumption of pair-wise interactions. In this thesis, we propose several novel signal processing techniques to overcome the disadvantages of current coupling analysis methods. We propose combining partial least squares (PLS) and canonical correlation analysis (CCA) to take advantage of both techniques, ensuring that the extracted components are maximally correlated across two data sets while also explaining the information within each data set well. Furthermore, we propose jointly incorporating response-relevance and statistical independence into a multi-objective optimization function, meaningfully combining the goals of independent component analysis (ICA) and PLS under the same mathematical umbrella. In addition, we extend the coupling analysis to multiple data sets by proposing a joint multimodal group analysis framework. Finally, to acquire independent components rather than merely uncorrelated ones, we improve the multimodal framework by exploiting the complementary property of multiset canonical correlation analysis (M-CCA) and joint ICA. Simulations show that our proposed methods achieve superior performance to conventional approaches. We also apply the proposed methods to concurrent EEG, EMG and behavior data collected in a Parkinson's disease (PD) study. The results reveal highly correlated temporal patterns among the multimodal signals and corresponding spatial activation patterns.
In addition to the expected motor areas, the corresponding spatial activation patterns demonstrate enhanced occipital connectivity in PD subjects, consistent with previous medical findings.

View record

Improved spread spectrum schemes for data hiding and their security analysis under known message attack (2013)

The massive production and easy use of digital media pose new challenges for protecting the intellectual property of digital media. Digital data hiding, which can be defined as the procedure of embedding information into an original media host signal, is a promising technique for digital intellectual property protection. A data hiding system generally contains two major components: the encoder for embedding the hidden information and the decoder for extracting the hidden information. This thesis focuses on spread spectrum (SS) watermarking schemes for data hiding. Watermarking techniques for data hiding can be broadly categorized into two classes: quantization index modulation (QIM) based and spread spectrum based approaches. Robustness against distortions and a simple decoder structure make SS attractive for data hiding. First, we investigate the decoding performance of the traditional SS schemes in the DCT and DFT domains. To obtain more practical decoders, we propose using suboptimal decoders which do not need side information. Secondly, since the interference effect of the host signal causes decoding performance degradation in the additive SS scheme, to remove this host effect efficiently, we propose the correlation-and-bit-aware concept for data hiding by exploiting the side information at the encoder side and propose two improved SS-based schemes, the correlation-aware SS (CASS) and the correlation-aware improved SS (CAISS) embedding schemes. Thirdly, we analyze the decoding error probability and capacity of the multiplicative spread spectrum (MSS) embedding scheme, and show that the content-based MSS still suffers from the interference effect of the host signal. We then propose an improved MSS-based scheme by efficiently removing the host interference effect. Lastly, we present the security analysis of the SS-based data hiding schemes under the Known Message Attack (KMA) scenario.
Each data hiding scheme has some secret parameters, and here the security of a data hiding scheme represents the difficulty of estimating the secret parameters. We employ the mutual information between the observations and the secret parameters as a security measure. Some practical estimators for estimating the signature code are also introduced, and their performances are reported to illustrate the security results.
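
For readers unfamiliar with additive spread spectrum embedding, the following minimal sketch shows the textbook form of the idea: one bit is hidden by adding or subtracting a scaled secret code, and a correlation decoder recovers it. The function names, the code length of 256, the Gaussian host model and the embedding strength are all assumptions chosen for illustration; this is not the CASS, CAISS or MSS scheme proposed in the thesis.

```python
import random


def embed_bit(host, code, bit, strength):
    """Additive SS embedding: add (bit 1) or subtract (bit 0) the scaled code."""
    sign = 1 if bit == 1 else -1
    return [x + sign * strength * u for x, u in zip(host, code)]


def decode_bit(received, code):
    """Correlation decoder that needs no side information about the host."""
    # The correlation equals (host . code) +/- strength * len(code); the first
    # term is the host interference mentioned in the abstract.
    correlation = sum(r * u for r, u in zip(received, code))
    return 1 if correlation >= 0 else 0


rng = random.Random(0)                                # fixed seed for repeatability
code = [rng.choice((-1, 1)) for _ in range(256)]      # secret signature code
host = [rng.gauss(0.0, 1.0) for _ in range(256)]      # stand-in host coefficients

marked_one = embed_bit(host, code, 1, strength=1.0)
marked_zero = embed_bit(host, code, 0, strength=1.0)
```

With a weaker embedding strength or a shorter code, the host term in the correlation can dominate and flip the decision, which is the degradation that host-interference-removing schemes aim to avoid.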

View record

Lasso-type sparse regression and high-dimensional Gaussian graphical models (2013)

High-dimensional datasets, where the number of measured variables is larger than the sample size, are not uncommon in modern real-world applications such as functional Magnetic Resonance Imaging (fMRI) data. Conventional statistical signal processing tools and mathematical models could fail at handling those datasets. Therefore, developing statistically valid models and computationally efficient algorithms for high-dimensional situations is of great importance in tackling practical and scientific problems. This thesis mainly focuses on the following two issues: (1) recovery of sparse regression coefficients in linear systems; (2) estimation of the high-dimensional covariance matrix and its inverse matrix, both subject to additional random noise. In the first part, we focus on the Lasso-type sparse linear regression. We propose two improved versions of the Lasso estimator when the signal-to-noise ratio is low: (i) to leverage adaptive robust loss functions; (ii) to adopt a fully Bayesian modeling framework. In solution (i), we propose a robust Lasso with a convex combined loss function and study its asymptotic behaviors. We further extend the asymptotic analysis to the Huberized Lasso, which is shown to be consistent even if the noise distribution is Cauchy. In solution (ii), we propose a fully Bayesian Lasso by unifying a discrete prior on model size and a continuous prior on regression coefficients in a single modeling framework. Since the proposed Bayesian Lasso has variable model sizes, we propose a reversible-jump MCMC algorithm to obtain its numeric estimates. In the second part, we focus on the estimation of large covariance and precision matrices. In high-dimensional situations, the sample covariance is an inconsistent estimator. To address this concern, regularized estimation is needed.
For the covariance matrix estimation, we propose a shrinkage-to-tapering estimator and show that it has attractive theoretical properties for estimating general and large covariance matrices. For the precision matrix estimation, we propose a computationally efficient algorithm that is based on the thresholding operator and Neumann series expansion. We prove that the proposed estimator is consistent in several senses under the spectral norm. Moreover, we show that the proposed estimator is minimax in a class of precision matrices that are approximately inversely closed.
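
As background for the Lasso-type estimators discussed above, the sketch below implements the scalar soft-thresholding operator, which gives the standard Lasso solution coordinate by coordinate when the design matrix is orthonormal and the objective is half the squared error plus `lam` times the L1 norm. The function name and the sample coefficients are assumptions for this example; the abstract does not specify which thresholding operator the thesis uses for precision matrix estimation, so this is not a description of that algorithm.

```python
def soft_threshold(z, lam):
    """Shrink z toward zero by lam, and set it to exactly zero if |z| <= lam."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


# Hypothetical least-squares coefficients under an orthonormal design.
ols_coefficients = [3.0, -0.4, 0.0, -2.5, 0.9]
lasso_coefficients = [soft_threshold(b, 1.0) for b in ols_coefficients]
```

Coefficients whose magnitude is below the penalty level become exactly zero, which is how the Lasso produces sparse regression estimates.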

View record

Robust digital image hashing algorithms for image identification (2013)

No abstract available.

Cortico-cortical and cortico-muscular connectivity analysis : methods and application to Parkinson's disease (2012)

The concept of brain connectivity provides a new perspective on the understanding of the mechanism underlying brain functions, complementing the traditional approach of analyzing neural activity of isolated regions. Among the existing connectivity analysis techniques, multivariate autoregressive (mAR)-based measures are of great interest for their ability to characterize both directionality and spectral property of cortical interactions. Yet, the direct estimation of mAR-based connectivity from scalp electroencephalogram (EEG) is confounded by volume conduction, statistical instability and inter-subject variability. In this thesis, we propose novel signal processing methods to enhance the existing mAR-based connectivity methods. First, we explore incorporating sparsity constraints into the mAR formulation at both subject level and group level using LASSO-based regression. We show by simulation that sparse mAR yields more stable and accurate connectivity estimates compared to the traditional, non-sparse approach. Furthermore, the group-wise sparsity simplifies the inference of group-level connectivity patterns from multi-subject data. To mitigate the effect of volume conduction, we investigate source-level connectivity and propose a state-space generalized mAR framework to jointly model the mixing effect of volume conduction and causal relationships between underlying neural sources. By jointly estimating the mixing process and mAR model parameters, the proposed technique demonstrates improved connectivity estimation performance. Finally, we expand our connectivity analysis to the cortico-muscular level by modeling the relationships between EEG and simultaneously recorded electromyography (EMG) data using a multiblock partial least squares (mbPLS) framework.
The hierarchical construction of the mbPLS framework provides a natural way to model multi-subject, multi-modal data, enabling the identification of maximally covarying common patterns from EEG and EMG across subjects. Applications of the proposed techniques to EEG and EMG data of healthy and Parkinson's disease (PD) subjects demonstrate that directional connectivity analysis is a more sensitive technique than traditional univariate spectral analysis in revealing complex effects of motor tasks and disease. Moreover, alterations in connectivity accurately predict disease severity in PD. These new analysis tools allow a better understanding of brain function and provide a basis for developing objective measures to assess progression of neurological diseases.

View record

Dynamic Bayesian networks : modeling and analysis of neural signals (2009)

Studying interactions between different brain regions or neural components is crucial in understanding neurological disorders. Dynamic Bayesian networks, a type of statistical graphical model, have been suggested as a promising tool to model neural communication systems. This thesis investigates the use of dynamic Bayesian networks for analyzing neural connectivity, with a focus on three topics: structural feature extraction, group analysis, and error control in learning network structures. Extracting interpretable features from experimental data is important for clinical diagnosis and improving experiment design. A framework is designed for discovering structural differences, such as the pattern of sub-networks, between two groups of Bayesian networks. The framework consists of three components: Bayesian network modeling, statistical structure-comparison, and structure-based classification. In a study on stroke using surface electromyography, this method detected several coordination patterns among muscles that could effectively differentiate patients from healthy people. Group analyses are widely conducted in neurological research. However, for dynamic Bayesian networks, the performances of different group-analysis methods had not been systematically investigated. To provide guidance on selecting group-analysis methods, three popular methods, i.e. the virtual-typical-subject, the common-structure and the individual-structure methods, were compared in a study on Parkinson's disease, from the aspects of their statistical goodness-of-fit to the data, and more importantly, their sensitivity in detecting the effect of medication. The three methods led to considerably different group-level results, and the individual-structure approach was more sensitive to the normalizing effect of medication. Controlling errors is a fundamental problem in applying dynamic Bayesian networks to discovering neural connectivity.
An algorithm is developed for this purpose, particularly for controlling the false discovery rate (FDR). It is proved that the algorithm is able to curb the FDR under user-specified levels (for example, conventionally 5%) at the limit of large sample size, and meanwhile recover all the true connections with probability one. Several extensions are also developed, including a heuristic modification for moderate sample sizes, an adaptation to prior knowledge, and a combination with Bayesian inference.
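
To make the notion of FDR control concrete, here is a minimal sketch of the classical Benjamini-Hochberg step-up procedure applied to a list of p-values, one per candidate connection. It is a generic stand-in, not the structure-learning algorithm developed in the thesis, and the function name and the example p-values are hypothetical.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans marking which hypotheses are rejected.

    Classical Benjamini-Hochberg step-up procedure: find the largest rank k
    such that the k-th smallest p-value is at most alpha * k / m, then reject
    the k hypotheses with the smallest p-values.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= alpha * rank / m:
            cutoff_rank = rank
    rejected = [False] * m
    for idx in order[:cutoff_rank]:
        rejected[idx] = True
    return rejected


# Hypothetical p-values from edge-wise tests of eight candidate connections.
p_values = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]
discoveries = benjamini_hochberg(p_values, alpha=0.05)
```

In this example only the two smallest p-values pass their rank-dependent thresholds, so only the first two candidate connections are declared discoveries at the 5% level.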

View record

Master's Student Supervision (2010 - 2018)
Automatic translucency detection of basal cell carcinoma (BCC) via deep learning methods (2018)

Translucency, defined as a jelly-like appearance, is a common clinical feature of basal cell carcinoma (BCC), the most common skin cancer. This feature plays an important role in diagnosing basal cell carcinoma at an early stage because translucency can be observed readily in clinical examinations with high specificity. Therefore, translucency detection is a critical component of computer-aided systems which aim at early detection of basal cell carcinoma. In this thesis, we propose two deep learning methods to automatically detect translucency. First, we develop a convolutional neural network-based framework to detect translucency of basal cell carcinoma. Furthermore, a sparse auto-encoder-based framework is proposed for translucency detection on BCC images. Since two types of skin images, dermoscopy images and clinical images, are currently the main ones used by doctors for diagnosing basal cell carcinoma, we evaluate the two proposed methods on both types. Our results show that the two proposed methods yield similar detection performance. For detecting translucency in dermoscopy images, both proposed methods achieve comparable accuracy, though the accuracy is not as good as we expected. For detecting translucency in clinical images, both methods achieve good performance. Comparing the performance on both types of images, the proposed deep learning-based methods seem more suitable for translucency detection in clinical images than in dermoscopy images.

View record

Connectivity-based parcellation of putamen region using resting state fMRI (2015)

Functional magnetic resonance imaging (fMRI) has shown great potential in studying the underlying neural systems. Functional connectivity measured by fMRI provides an efficient approach to study the interactions and relationships between different brain regions. However, functional connectivity studies require accurate definition of brain regions, which is often difficult and may not be achieved through anatomical landmarks. In this thesis, we present a novel framework for parcellation of a brain region into functional subunits based on their connectivity patterns with other reference brain regions. The proposed method takes the prior neurological information into consideration and aims at finding spatially continuous and functionally consistent sub-regions in a given brain region. The proposed framework relies on a sparse spatially regularized fused lasso regression model for feature extraction. The usual lasso model is a linear regression model commonly applied to high-dimensional data such as fMRI signals. Compared with lasso, the proposed model further considers the spatial order of each voxel and thus encourages spatially and functionally adjacent voxels to share similar regression coefficients despite possible spatial noise. To achieve accurate parcellation results, we propose a process of iteratively merging voxels (groups) and tuning the parameters adaptively. In addition, a Graph-Cut optimization algorithm is adopted for assigning the overlapped voxels into separate sub-regions. With spatial information incorporated, spatially continuous and functionally consistent subunits can be obtained, which are desired for subsequent brain connectivity analysis. The simulation results demonstrate that the proposed method could reliably yield spatially continuous and functionally consistent subunits. When applied to real resting state fMRI datasets, two consistent functional subunits could be obtained in the putamen region for all normal subjects.
Comparisons between the results of the Parkinson’s disease group and the normal group suggest that the obtained results are in accordance with our medical assumption. The extracted functional subunits themselves are of great interest in studying the influence of aging and a certain disease, and they may provide us deeper insights and serve as a biomarker in our future Parkinson’s disease study.

View record

Video-based cardiac physiological measurements using joint blind source separation approaches (2015)

Non-contact measurements of human cardiopulmonary physiological parameters based on photoplethysmography (PPG) can lead to efficient and comfortable medical assessment. It has been shown that human facial blood volume variation during the cardiac cycle can be indirectly captured by regular Red-Green-Blue (RGB) cameras. However, few attempts have been made to incorporate data from different facial sub-regions to improve remote measurement performance. In this thesis, we propose a novel framework for non-contact video-based human heart rate (HR) measurement by exploring correlations among facial sub-regions via joint blind source separation (J-BSS). In an experiment involving video data collected from 16 subjects, we compare the non-contact HR measurement results obtained from a commercial digital camera to results from a Health Canada and Food and Drug Administration (FDA) licensed contact blood volume pulse (BVP) sensor. We further test our framework on a large public database, which provides subjects' left-thumb plethysmograph signal as ground truth. Experimental results show that the proposed framework outperforms the state-of-the-art independent component analysis (ICA)-based methodologies. Driver physiological monitoring in vehicles is of great importance to provide a comfortable driving environment and prevent road accidents. Contact sensors can be placed on the driver's body to measure various physiological parameters. However, such sensors may cause discomfort or distraction. The development of non-contact techniques can provide a promising solution. In this thesis, we employ our proposed non-contact video-based HR measurement framework to monitor the driver's heart rate and perform heart rate variability analysis using a simple consumer-level webcam. Experiments in real-world road driving demonstrate that the proposed non-contact framework is promising even in the presence of unstable illumination variation and head movement.

View record

 
 
