
- IEEE Member: US $11.00
- Society Member: US $0.00
- IEEE Student Member: US $11.00
- Non-IEEE Member: US $15.00
Clustering Of Nonnegative Data And An Application To Matrix Completion
In this paper, we propose a simple algorithm to cluster nonnegative data lying in disjoint subspaces. We analyze its performance in relation to a certain measure of correlation between said subspaces. We use our clustering algorithm to develop a matrix co
SNDCNN: Self-Normalizing Deep CNNs With Scaled Exponential Linear Units For Speech Recognition
Very deep CNNs achieve state-of-the-art results in both computer vision and speech recognition, but are difficult to train. The most popular way to train very deep CNNs is to use shortcut connections (SC) together with batch normalization (BN). Inspired
Exploring Entity-Level Spatial Relationships For Image-Text Matching
Exploring the entity-level (i.e., objects in an image, words in a text) spatial relationship contributes to understanding multimedia content precisely. The ignorance of spatial information in previous works probably leads to misunderstandings of image con
Equalization Of OFDM Waveforms With Insufficient Cyclic Prefix
In this paper, a simple equalization strategy for OFDM waveforms is proposed that specifically targets the case where the cyclic prefix is insufficient to span the whole channel duration. The proposed architecture can be very efficiently implemented in th
An Improved Frame-Unit-Selection Based Voice Conversion System Without Parallel Training Data
A frame-unit-selection based voice conversion system proposed earlier by us is revisited here to enhance its performance in both speech naturalness and speaker similarity. Speaker independent, bilingual (Mandarin Chinese and American English) deep neural
Modelling Sea Clutter In SAR Images Using Laplace-Rician Distribution
This paper presents a novel statistical model for the characterisation of synthetic aperture radar (SAR) images of the sea surface. The analysis of ocean surface is widely performed using satellite imagery as it produces information for wide areas under v
Dual-Path RNN: Efficient Long Sequence Modeling For Time-Domain Single-Channel Speech Separation
Recent studies in deep learning-based speech separation have proven the superiority of time-domain approaches to conventional time-frequency-based methods. Unlike the time-frequency domain approaches, the time-domain separation systems often receive input
Faster-Than-Nyquist Signaling Via Spatiotemporal Symbol-Level Precoding For Multi-User MISO Redundant Transmissions
This paper tackles the problem of both multi-user and intersymbol interference stemming from co-channel users transmitting at a faster-than-Nyquist (FTN) rate in multi-antenna downlink transmissions. We propose a framework for redundant block-based symbol
A Differential Approach For Rain Field Tomographic Reconstruction Using Microwave Signals From LEO Satellites
A differential approach is proposed for tomographic rain field reconstruction using the estimated signal-to-noise ratio of microwave signals from low earth orbit satellites at the ground receivers, with the unknown baseline values eliminated before using
Allocation Of Computing Tasks In Distributed MEC Servers Co-Powered By Renewable Sources And The Power Grid
We consider a Multiaccess Edge Computing (MEC) network where distributed servers have energy harvesting (e.g., solar) and storage (e.g., batteries) capabilities. Energy from a connected power grid is also available, in case that harvested from ambient sou
SSTNet: Detecting Manipulated Faces Through Spatial, Steganalysis And Temporal Features
Compared to conventional object detection which focuses on high-level image content, face manipulation detection pays more attention to low-level artifacts and temporal discrepancies. However, there are few methods considering both of these two characteri
Neural Time Warping For Multiple Sequence Alignment
Multiple sequence alignment (MSA) is a traditional and still challenging task for time-series analyses. The MSA problem is intrinsically a discrete optimization and, in principle, dynamic programming is available for solving MSA. However, the computation
Smoothing Graph Signals Via Random Spanning Forests
Another facet of the elegant link between random processes on graphs and Laplacian-based numerical linear algebra is uncovered: based on random spanning forests, novel Monte-Carlo estimators for graph signal smoothing are proposed. These random forests ar
Large-Scale Weakly-Supervised Content Embeddings For Music Recommendation And Tagging
We explore content-based representation learning strategies tailored for large-scale, uncurated music collections that afford only weak supervision through unstructured natural language metadata and co-listen statistics. At the core is a hybrid training s
Large-Scale Time Series Clustering With K-ARs
Time-series clustering involves grouping homogeneous time series together based on certain similarity measures. The mixture AR model (MxAR) has already been developed for time series clustering, as well as an associated EM algorithm. However, this EM clus
BOFFIN TTS: Few-Shot Speaker Adaptation By Bayesian Optimization
We present BOFFIN TTS (Bayesian Optimization For FIne-tuning Neural Text To Speech), a novel approach for few-shot speaker adaptation. Here, the task is to fine-tune a pre-trained TTS model to mimic a new speaker using a small corpus of target utterances.
Multitask Learning With Capsule Networks For Speech-To-Intent Applications
Voice-controlled applications can be a great aid to society, especially for physically challenged people. However, this requires robustness to all kinds of variations in speech. A spoken language understanding system that learns from interaction with and d
Electro-Magnetic Side-Channel Attack Through Learned Denoising And Classification
This paper proposes an upgraded Electro-Magnetic (EM) side-channel attack that automatically reconstructs the intercepted data. A novel system is introduced, running in parallel with leakage signal interception and catching compromising data on the fly. L
Adversarial Detection Of Counterfeited Printable Graphical Codes: Towards
This paper addresses the problem of anti-counterfeiting of physical objects and investigates the possibility of detecting counterfeited printable graphical codes from a machine learning perspective. We investigate fake generation via two different
Regularized Partial Phase Synchrony Index Applied To Dynamical Functional Connectivity Estimation
We study the inference of conditional independence graph from the partial Phase Locking Value (PLV) index of multivariate time series. A typical application is the inference of temporal functional connectivity from brain data. We extend the recently propo
The Fractional Quaternion Fourier Number Transform
In this paper, we define a fractional version of the quaternion Fourier number transform (QFNT). With this purpose, we first study the eigenstructure of the QFNT; this is used to obtain the eigendecomposition of the corresponding transform matrix, from wh
AdvMS: A Multi-Source Multi-Cost Defense Against Adversarial Attacks
Designing effective defense against adversarial attacks is a crucial topic as deep neural networks have been proliferated rapidly in many security-critical domains such as malware detection and self-driving cars. Conventional defense methods, although sho
Joint Phoneme-Grapheme Model For End-To-End Speech Recognition
This paper proposes methods to improve a commonly used end-to-end speech recognition model, Listen-Attend-Spell (LAS). The methods we proposed use multi-task learning to improve generalization of the model by leveraging information from multiple labels. T
Adaptive Normalization For Forecasting Limit Order Book Data Using Convolutional Neural Networks
Deep learning models are capable of achieving state-of-the-art performance on a wide range of time series analysis tasks. However, their performance crucially depends on the employed normalization scheme, while they are usually unable to efficiently handl
Image Super-Resolution Using Residual Global Context Network
Recent studies have shown that convolutional neural networks (CNN) can effectively improve the performance of single image super-resolution (SR). However, previous methods rarely considered long-range dependencies between pixels and channel-wise interdep
A Gated Hypernet Decoder For Polar Codes
Hypernetworks were recently shown to improve the performance of message passing algorithms for decoding error correcting codes. In this work, we demonstrate how hypernetworks can be applied to decode polar codes by employing a new formalization of the pol
Shape From Bandwidth: Central Projection Case
Consider an unknown surface painted with a band-limited texture. We show that only the knowledge of the bandwidth of the texture is enough to estimate the shape of the surface from a single image taken by a camera. We model the problem as a central projec
A Siamese Content-Attentive Graph Convolutional Network For Personality Recognition Using Physiology
Affective multimedia content has long been used as stimulation to study an individual's personality using physiology. In this work, we propose a novel Siamese Content-Attentive Graph Convolutional Network (SCA-GCN) to learn a discriminative physiology rep
Deep Matrix Completion On Graphs: Application In Drug Target Interaction Prediction
This work proposes matrix completion via deep matrix factorization on graphs. The work is motivated by the success of two very recent studies on (shallow) matrix completion on graphs and deep matrix factorization (without graphs). We show that the propose
Adaptive Sequential Interpolator Using Active Learning For Efficient Emulation Of Complex Systems
Many fields of science and engineering require the use of complex and computationally expensive models to understand the involved processes in the system of interest. Nevertheless, due to the high cost involved, the required study becomes a cumbersome pro
An Improved Deep Neural Network For Modeling Speaker Characteristics At Different Temporal Scales
This paper presents an improved deep embedding learning method based on a convolutional neural network (CNN) for text-independent speaker verification. Two improvements are proposed for x-vector embedding learning: (1) a multiscale convolution (MSCNN) is
BATMAN: Bayesian Target Modelling For Active Inference
Active Inference is an emerging framework for designing intelligent agents. In an Active Inference setting, any task is formulated as a variational free energy minimisation problem on a generative probabilistic model. Goal-directed behaviour relies on a c
Focusing On Attention: Prosody Transfer And Adaptative Optimization Strategy For Multi-Speaker End-To-End Speech Synthesis
End-to-end speech synthesis can generate high-quality synthetic speech and achieve high similarity scores with low-resource adaptation data. However, the generalization of out-domain texts is still a challenging task. The limited adaptation data leads to
An Efficient Alternative To Network Pruning Through Ensemble Learning
Convolutional Neural Networks (CNNs) currently represent the best tool for classification of image content. CNNs are trained in order to develop generalized expressions in form of unique features to distinguish different classes. During this process, one
SemanticGAN: Generative Adversarial Networks For Semantic Image To Photo-Realistic Image Translation
Generative Adversarial Networks (GANs) have shown remarkable success in Semantic label map to Photo-realistic image Translation (S2PT) task. However, the results of the state-of-the-art approaches are often limited to blurriness and artifacts, and still f
Graph Regularized Tensor Train Decomposition
With the advances in data acquisition technology, tensor objects are collected in a variety of applications including multimedia, medical and hyperspectral imaging. As the dimensionality of tensor objects is usually very high, dimensionality reduction is
Determined Source Separation Using The Sparsity Of Impulse Responses
In this paper, we propose an over-determined sound source separation method considering the sparsity of impulse responses. Conventional methods, including independent low-rank matrix analysis (ILRMA), have mainly focused on design of realistic sound gener
Noise-Robust Key-Phrase Detectors For Automated Classroom Feedback
With the goal of giving teachers automated feedback about their classrooms, we investigate how to train automatic speech detectors of key phrases such as good job, thank you, please, and you're welcome. This kind of language conveys support and respect fr
Learning To Separate Sounds From Weakly Labeled Scenes
Deep learning models for monaural audio source separation are typically trained on large collections of isolated sources, which may not be available in domains such as environmental monitoring. We propose objective functions and network architectures that
Optimizing Bayesian HMM Based X-Vector Clustering For The Second DIHARD Speech Diarization Challenge
This paper presents an analysis of our diarization system winning the second DIHARD speech diarization challenge, track 1. This system is based on clustering x-vector speaker embeddings extracted every 0.25s from short segments of the input recording. In
Exocentric To Egocentric Image Generation Via Parallel Generative Adversarial Network
Cross-view image generation has been recently proposed to generate images of one view from another dramatically different view. In this paper we investigate exocentric (third-person) view to egocentric (first-person) view image generation. This is a chall
WhoseCough: In-The-Wild Cougher Verification Using Multitask Learning
Current automatic cough counting systems can determine how many coughs are present in an audio recording. However, they cannot determine who produced the cough. This limits their usefulness as most systems are deployed in locations with multiple people (i
Exact Sparse Nonnegative Least Squares
We propose a novel approach to solve exactly the sparse nonnegative least squares problem, under hard l0 sparsity constraints. This approach is based on a dedicated branch-and-bound algorithm. This simple strategy is able to compute the optimal solution e
Converting Written Language To Spoken Language With Neural Machine Translation For Language Modeling
When building a language model (LM) for spontaneous speech, the ideal situation is to have a large amount of spoken, in-domain training data. Having such abundant data, however, is not realistic. We address this problem by generating texts in spoken langu
Exploring Appropriate Acoustic And Language Modelling Choices For Continuous Dysarthric Speech Recognition
There has been much recent interest in building continuous speech recognition systems for people with severe speech impairments, e.g., dysarthria. However, the datasets that are commonly used are typically designed for tasks other than ASR development, or
A Semi-Supervised Rank Tracking Algorithm For On-Line Unmixing Of Hyperspectral Images
This paper addresses the problem of rank tracking in real time hyperspectral image unmixing methods. Based on the On-line Alternating Direction Method of Multipliers (ADMM), we propose a new hyperspectral unmixing approach that integrates prior informatio