- IEEE Member: US $11.00
- Society Member: US $0.00
- IEEE Student Member: US $11.00
- Non-IEEE Member: US $15.00
Improving Robustness Of Deep Learning Based Monaural Speech Enhancement Against Processing Artifacts
In voice telecommunication, the intelligibility and quality of speech signals can be severely degraded by background noise if the speaker at the transmitting end talks in a noisy environment. Therefore, a speech enhancement system is typically integrated
Lookahead Converges To Stationary Points Of Smooth Non-Convex Functions
The Lookahead optimizer [Zhang et al., 2019] was recently proposed and demonstrated to improve performance of stochastic first-order methods for training deep neural networks. Lookahead can be viewed as a two time-scale algorithm, where the fast dynamics
Constant-Envelope Precoding For Satellite Systems
In this paper, Constant-Envelope Precoding techniques are presented for satellite-based communication systems. In the developed transmission technique the signals of the antennas are designed to be of constant amplitude, improving the robustness of the la
Cost Aware Adversarial Learning
The problem of making the classifier design resilient to test data falsification is considered. In the literature, a few countermeasures have been proposed to defend machine learning algorithms against test data falsification, but a common assumption empl
Universal Phone Recognition With A Multilingual Allophone System
Recently, multilingual speech recognition has achieved tremendous progress by sharing parameters across languages. Multilingual acoustic models, however, generally ignore the difference between phonemes (sounds that can support lexical contrasts in a emp
Learning Partial Differential Equations From Data Using Neural Networks
We develop a framework for estimating unknown partial differential equations (PDEs) from noisy data, using a deep learning approach. Given noisy samples of a solution to an unknown PDE, our method interpolates the samples using a neural network, and extra
Local-Global Feature For Video-Based One-Shot Person Re-Identification
One-shot video-based re-identification, which uses only one labeled tracklet for each identity, is challenging since the framework usually suffers misalignment and inefficient utilizing of unlabeled data. In this paper we propose a novel local-global prog
Joint Blind Calibration And Time-Delay Estimation For Multiband Ranging
In this paper, we focus on the problem of blind joint calibration of multiband transceivers and time-delay (TD) estimation of multipath channels. We show that this problem can be formulated as a particular case of covariance matching. Although this proble
Addressing Challenges In Building Web-Scale Content Classification Systems
Understanding the semantic meaning of content on the web through the lens of a taxonomy has many practical advantages. However, when building large-scale content classification systems, practitioners are faced with unique challenges involving finding the
Low-Latency Single Channel Speech Enhancement Using U-Net Convolutional Neural Networks
Single-channel speech enhancement (SE) can be described, in its simplest terms, as learning a transformation from single-channel noisy speech to the clean speech. To do this, we propose a simple but effective U-Net convolutional neural network (CNN) based
A Generalization Of Principal Component Analysis
Conventional principal component analysis (PCA) finds a principal vector that maximizes the sum of second powers of principal components. We consider a generalized PCA that aims at maximizing the sum of an arbitrary convex function of principal components
An Improved Selective Active Noise Control Algorithm Based On Empirical Wavelet Transform
The gradual adaptation and possibility of divergence have been the two main obstacles in the efficient implementation of conventional adaptive active noise control (ANC) to a wider range of applications. Selective ANC (SANC) has been proposed to rapidly r
Learning A Representation For Cover Song Identification Using Convolutional Neural Network
Cover song identification is a challenging task in the field of Music Information Retrieval (MIR) due to complex musical variations between query tracks and cover versions. Previous works typically utilize hand-crafted features and alignment algorithms fo
Reduced-Complexity Singular Value Decomposition For Tucker Decomposition: Algorithm And Hardware
Tensors, as the multidimensional generalization of matrices, are naturally suited for representing and processing high dimensional data. To date, tensors have been widely adopted in various data-intensive applications, such as machine learning and big dat
Counting Dense Objects In Remote Sensing Images
Estimating accurate number of interested objects from a given image is a challenging yet important task. Significant efforts have been made to address this problem and achieve great progress, yet counting number of ground objects from remote sensing image
EDNFC-Net: Convolutional Neural Network With Nested Feature Concatenation For Nuclei-Instance Segmentation
Accurate nuclei identification is an important step in diagnosis of several diseases. The problem is complex due to heterogeneity in structure, color, and texture among the different categories of cells. The problem is further complicated due to overlappe
A Novel Saliency-Driven Oil Tank Detection Method For Synthetic Aperture Radar Images
Synthetic aperture radar (SAR) imaging system plays an important role in earth observation research. This leads to the significance of target detection in SAR image. In this paper, we propose a novel saliency-driven oil tank detection method (SDD) for SAR
Proximal Distance Algorithm For Nonconvex QCQP With Beamforming Applications
This paper studies nonconvex quadratically constrained quadratic program (QCQP), which is known to be NP-hard in general. In the past decades, various approximate approaches have been developed to tackle the QCQP, including semidefinite relaxation (SDR),
Audio Sound Determination Using Feature Space Attention Based Convolution Recurrent Neural Network
The classification framework has been popularly adopted to perform sound event detection. However, the existing neural network based classification based approaches treat each feature dimension equally and the varying influence of feature dimensions has n
A Return To Dereverberation In The Frequency Domain Using A Joint Learning Approach
Dereverberation is often performed in the time-frequency domain using mostly deep learning approaches. Time-frequency domain processing, however, may not be necessary when reverberation is modeled by the convolution operation. In this paper, we investigat
Adaptation Of RNN Transducer With Text-To-Speech Technology For Keyword Spotting
With the advent of recurrent neural network transducer (RNN-T) model, the performance of keyword spotting (KWS) systems has greatly improved. However, the KWS systems, employed for wake-word detection, still rely on the availability of keyword specific tr
Regression Before Classification For Temporal Action Detection
Action classification combined with location regression is a widely-utilized mechanism in existing temporal action detection methods. However, there exists an inconsistency problem between locations and categories of action instances in this mechanism. Mo
Resource Management In The Multibeam NOMA-Based Satellite Downlink
A beam-free approach to channel allocation in a multi-beam four-color satellite coverage area is taken. Non-Orthogonal Multiple Access (NOMA) and Orthogonal Multiple Access (OMA) are compared as methods to serve users non-necessarily located on the refere
IQ-STAN: Image Quality Guided Spatio-Temporal Attention Network For License Plate Recognition
License plate recognition (LPR) is one of the essential components in intelligent transportation systems. Although the image processing algorithms for LPR have been extensively studied in the past several years, the recognition performance is still not sa
A Unified Sequence-To-Sequence Front-End Model For Mandarin Text-To-Speech Synthesis
In Mandarin text-to-speech (TTS) system, the front-end text processing module significantly influences the intelligibility and naturalness of synthesized speech. Building a typical pipeline-based front-end which consists of multiple individual components
Unsupervised Key Hand Shape Discovery Of Sign Language Videos With Correspondence Sparse Autoencoders
Recognition of sign language is a difficult task which often requires tedious annotations by sign language experts. End-to-end learning attempts that bypass frame level annotations have achieved some success in limited datasets, but it has been shown that
Self-Supervised Learning For Audio-Visual Speaker Diarization
Speaker diarization, which is to find the speech segments of specific speakers, has been widely used in human-centered applications such as video conferences or human-computer interaction systems. In this paper, we propose a self-supervised audio-video sy
Balanced Binary Neural Networks With Gated Residual
Binary neural networks have attracted numerous attention in recent years. However, mainly due to the information loss stemming from the biased binarization, how to preserve the accuracy of networks still remains a critical issue. In this paper, we attempt
Robust Speaker Recognition Using Unsupervised Adversarial Invariance
In this paper, we address the problem of speaker recognition in challenging acoustic conditions using a novel method to extract robust speaker-discriminative speech representations. We adopt a recently proposed unsupervised adversarial invariance architec
Text Adaptation For Speaker Verification With Speaker-Text Factorized Embeddings
Text mismatch between pre-collected data, either training data or enrollment data, and the actual test data can significantly hurt text-dependent speaker verification (SV) system performance. Although this problem can be solved by carefully collecting dat
Multi-View Clustering Via Mixed Embedding Approximation
This paper tackles multi-view clustering via proposing a novel mixed embedding approximation (MEA) method. Formally, we aim to learn a uniform orthogonal embedding based on the orthogonal pre-embeddings of each view. At first, we hope that the uniform emb
Multilinear Generalized Singular Value Decomposition (ML-GSVD) With Application To Coordinated Beamforming In Multi-User MIMO Systems
In this paper, we propose a new Multilinear Generalized Singular Value Decomposition (ML-GSVD) which allows to jointly factorize a set of matrices with one common dimension. The ML-GSVD is an extension of the Generalized Singular Value Decomposition (GSVD
WInD: Wasserstein Inception Distance For Evaluating Generative Adversarial Network Performance
In this paper, we present Wasserstein Inception Distance (WInD), a novel metric for evaluating performance of Generative Adversarial Networks (GANs). The proposed metric extends on the rationale of the previously proposed Fréchet Inception Distance (FID),
GCI Detection From Raw Speech Using A Fully-Convolutional Network
Glottal Closure Instants (GCI) detection consists in automatically detecting temporal locations of most significant excitation of the vocal tract from the speech signal. It is used in many speech analysis and processing applications, and various algorithm
Oh, Jeez! Or Uh-Huh? A Listener-Aware Backchannel Predictor On ASR Transcriptions
This paper presents our latest investigation on modeling backchannel in conversations. Motivated by a proactive backchanneling theory, we aim at developing a system which acts as a proactive listener by inserting backchannels, such as continuers and asses
GraphTTS: Graph-To-Sequence Modelling In Neural Text-To-Speech
This paper leverages the graph-to-sequence method in neural text-to-speech (GraphTTS), which maps the graph embedding of the input sequence to spectrograms. The graphical inputs consist of node and edge representations constructed from input texts. The en
IndyLSTMs: Independently Recurrent LSTMs
We introduce Independently Recurrent Long Short-term Memory cells: IndyLSTMs. These differ from regular LSTM cells in that the recurrent weights are not modeled as a full matrix, but as a diagonal matrix, i.e. the output and state of each LSTM cell depend
Bipartite Belief Propagation Polar Decoding With Bit-Flipping
For the scenarios with high throughput requirements, the belief propagation (BP) decoding is one of the most promising decoding strategies for polar codes. By pruning the redundant variable nodes (VNs) and check nodes (CNs) in the original factor graph, t
High-Resolution Attention Network With Acoustic Segment Model For Acoustic Scene Classification
The spectral information of acoustic scenes is diverse and complex, which poses challenges for acoustic scene tasks. To improve the classification performance, a variety of convolutional neural networks (CNNs) are proposed to extract richer semantic infor
Low-Complexity Accurate mmWave Positioning For Single-Antenna Users Based On Angle-Of-Departure And Adaptive Beamforming
The problem of position estimation of a mobile user equipped with a single antenna receiver using downlink transmissions is addressed. The advantages of this setup compared to the classical MIMO and uplink scenarios are analyzed in terms of achievable the
Analysis Of Acoustic Features For Speech Sound Based Classification Of Asthmatic And Healthy Subjects
Non-speech sounds (cough, wheeze) are typically known to perform better than speech sounds for asthmatic and healthy subject classification. In this work, we use sustained phonations of speech sounds, namely, /A:/, /i:/, /u:/, /eI/, /oU/, /s/, and /z/ fro
Context And Uncertainty Modeling For Online Speaker Change Detection
Speaker change detection is often addressed as a key component in speaker diarization systems. In this work we focus on online speaker change detection as a standalone task which is required for online closed captioning of broadcast television. Contrary t
Differentially Modulated Spectrally Efficient Frequency-Division Multiplexing
This letter proposes a differentially modulated non-orthogonal spectrally efficient frequency-division multiplexing (D-SEFDM) architecture, which allows us to dispense with any pilot overhead needed for channel estimation at the receiver, while increasing
Speech Enhancement Using A Two-Stage Network For An Efficient Boosting Strategy
A novel neural network architecture, called two-stage network (TSN), with a multi-objective learning (MOL) method for an efficient boosting strategy (BS) is proposed for speech enhancement. BS is an ensemble method using multiple base predictions (MBPs) f
Real-Time Sound Event Detection On The Edge: Porting VGGish On Low-Power IoT Microcontrollers
Internet of Things (IoT) applications typically require a large number of heterogeneous devices to be distributed in the environment, which can generate large amounts of data for wireless transmission, affecting the energy requirements and lifetime of the
Learning To Transfer Multi-Speaker Emotional Prosody To A Neutral Speaker
Most recent emotional speech synthesizers have been studied with a large training data. These systems require a sufficient number of audios to be recorded with respect to different emotions for each speaker. Acquiring emotional speech is more expensive th
High-Accuracy Classification Of Attention Deficit Hyperactivity Disorder With L2,1-Norm Linear Discriminant Analysis
Attention Deficit Hyperactivity Disorder (ADHD) is a high incidence of neurobehavioral disease in school-age children. Its neurobiological classification is meaningful for clinicians. The existing ADHD classification methods suffer from two problems, i.e.