- IEEE Member: US $11.00
- Society Member: US $0.00
- IEEE Student Member: US $11.00
- Non-IEEE Member: US $15.00
Proximal Multitask Learning Over Distributed Networks With Jointly Sparse Structure
Modeling relations between local optimum parameter vectors in multitask networks has attracted much attention in recent years. This work considers a distributed optimization problem for parameter vectors with a jointly sparse structure among nodes, th
Generating Multilingual Voices Using Speaker Space Translation Based On Bilingual Speaker Data
We present progress towards bilingual Text-to-Speech which is able to transform a monolingual voice to speak a second language while preserving speaker voice quality. We demonstrate that a bilingual speaker embedding space contains a separate distribution
Neural Percussive Synthesis Parameterised By High-Level Timbral Features
We present a deep neural network-based methodology for synthesising percussive sounds with control over high-level timbral characteristics of the sounds. This approach allows for intuitive control of a synthesizer, enabling the user to shape sounds withou
Robustness Of Sparse Bayesian Learning In Correlated Environments
In this paper we explore the robustness of Sparse Bayesian Learning (SBL) in an environment with correlated sources. We provide two new perspectives to understand SBL's strategy for handling correlated sources. Using a Minimum Power Distortionless Respons
End-To-End Multi-Person Audio/Visual Automatic Speech Recognition
Traditionally, audio-visual automatic speech recognition has been studied under the assumption that the speaking face on the visual signal is the face matching the audio. However, in a more realistic setting, when multiple faces are potentially on screen
BBAND Index: A No-Reference Banding Artifact Predictor
Banding artifact, or false contouring, is a common video compression impairment that tends to appear on large flat regions in encoded videos. These staircase-shaped color bands can be very noticeable in high-definition videos. Here we study this artifact,
Gender Differences On The Perception And Production Of Utterances With Willingness And Reluctance In Chinese
This study intends to explore the effects of gender differences on the perception and production of emotional intonation with willingness and reluctance. In the perceptual study, 20 native Mandarin listeners were instructed to rate perceived degree of wil
Experiments In Creating Online Course Content For Signal Processing Education
The creation of the NPTEL platform in India has led to a vast population of engineering students getting access to quality online content for Signal Processing. These courses are globally accessible, free of cost, and also provide a means of obtaining cer
Gait Phase Segmentation Using Weighted Dynamic Time Warping And K-Nearest Neighbors Graph Embedding
Gait phase segmentation is the process of identifying the start and end of different phases within a gait cycle. It is essential to many medical applications, such as disease diagnosis or rehabilitation. This work utilizes inertial measurement units (IMUs
Towards Blind Quality Assessment Of Concert Audio Recordings Using Deep Neural Networks
Live music audio and video recordings represent a large percentage of the huge amount of User Generated Content (UGC) that is available on the internet today. Applications and services related to the management and consumption of this content may signific
On The Limit Distribution Of The Canonical Correlation Coefficients Between The Past And The Future Of A High-Dimensional White Noise
It is shown that the distribution of the estimated canonical correlation coefficients between the past and the future of a high-dimensional multivariate white noise sequence converges almost surely towards a limit distribution whose density is given in cl
Weight Sharing And Deep Learning For Spectral Data
We propose a novel method to co-train deep convolutional neural networks for data sets of differing position specific data. This is an advantage in chemometrics where individual measurements represent exact chemical compounds, e.g. for given wavelengths,
Pathloss Prediction Using Deep Learning With Applications To Cellular Optimization And Efficient D2D Link Scheduling
In this paper we propose a highly efficient and very accurate method for estimating the propagation pathloss from a point x to all points y on the 2D plane. Our method, termed RadioUNet, is a deep neural network. For applications such as user-cell site as
Audio-Based Detection Of Explicit Content In Music
We present a novel automatic system for performing explicit content detection directly on the audio signal. Our modular approach uses an audio-to-character recognition model, a keyword spotting model associated with a dictionary of carefully chosen keywor
Ordinal Learning For Emotion Recognition In Customer Service Calls
Approaches toward ordinal speech emotion recognition (SER) tasks are commonly based on categorical classification algorithms, where the rank-order emotions are arbitrarily treated as independent categories. To employ the ordinal information between em
A Multi-Phase Gammatone Filterbank For Speech Separation Via TasNet
In this work, we investigate if the learned encoder of the end-to-end convolutional time-domain audio separation network (Conv-TasNet) is the key to its recent success, or if the encoder can just as well be replaced by a deterministic hand-crafted filterb
A Hybrid Text Normalization System Using Multi-Head Self-Attention For Mandarin
In this paper, we propose a hybrid text normalization system using multi-head self-attention. The system combines the advantages of a rule-based model and a neural model for text preprocessing tasks. Previous studies in Mandarin text normalization usually
On The Frequency Domain Detection Of High Dimensional Time Series
In this paper, we address the problem of detection, in the frequency domain, of an M-dimensional time series modeled as the output of an M × K MIMO filter driven by a K-dimensional Gaussian white noise, and disturbed by an additive M-dimensional Gaussian co
Perceptual Loss Function For Neural Modelling Of Audio Systems
This work investigates alternate pre-emphasis filters used as part of the loss function during neural network training for nonlinear audio processing. In our previous work, the error-to-signal ratio loss function was used during network training, with a f
A Minimal Personalization Of Dynamic Binaural Synthesis With Mixed Structural Modeling And Scattering Delay Networks
This paper provides a small set of essential parameters for a personalized and effective real-time auralization with headphones. An image-guided procedure with two 2D images of the user's head guides the mixed structural modeling of head-related transfer
Filterbank Design For End-To-End Speech Separation
Single-channel speech separation has recently made great progress thanks to learned filterbanks as used in ConvTasNet. In parallel, parameterized filterbanks have been proposed for speaker recognition where only center frequencies and bandwidths are learn
UNet 3+: A Full-Scale Connected UNet For Medical Image Segmentation
Recently, a growing interest has been seen in deep learning-based semantic segmentation. UNet, which is one of the deep learning networks with an encoder-decoder architecture, is widely used in medical image segmentation. Combining multi-scale features is one
Accelerating Distributed Deep Learning By Adaptive Gradient Quantization
To accelerate distributed deep learning, gradient quantization is widely used to reduce the communication cost. However, the existing quantization schemes suffer from either model accuracy degradation or low compression ratio (arising from a redu
Realizability Of Planar Point Embeddings From Angle Measurements
Localization of a set of nodes is an important and thoroughly researched problem in robotics and sensor networks. This paper is concerned with the theory of localization from inner-angle measurements. We focus on the challenging case where no anchor loc
End-To-End Voice Conversion Via Cross-Modal Knowledge Distillation For Dysarthric Speech Reconstruction
Dysarthric speech reconstruction (DSR) is a challenging task due to difficulties in repairing unstable prosody and correcting imprecise articulation. Inspired by the success of sequence-to-sequence (seq2seq) based text-to-speech (TTS) synthesis and knowle
On Design Of Optimal Smart Meter Privacy Control Strategy Against Adversarial MAP Detection
We study the optimal control problem of the maximum a posteriori (MAP) state sequence detection of an adversary using smart meter data. The privacy leakage is measured using the Bayesian risk and the privacy-enhancing control is achieved in real-time usin
A FIFO Based Accelerator For Convolutional Neural Networks
In recent years, Deep Neural Networks (DNNs) have achieved state-of-the-art results in various fields like Computer Vision, Natural Language Processing and Speech Recognition. Of all the DNN architectures, Convolutional Neural Networks (CNNs) have been mo
Unsupervised Style And Content Separation By Minimizing Mutual Information For Speech Synthesis
We present a method to generate speech from input text and a style vector that is extracted from a reference speech signal in an unsupervised manner, i.e., no style annotation, such as speaker information, is required. Existing unsupervised methods, durin
On The Effect Of Reflectance On Phasor Field Non-Line-Of-Sight Imaging
Non-line-of-sight (NLOS) imaging aims to visualize an occluded scene by exploiting its indirect reflections on visible surfaces. Previous methods approach this problem by inverting the light transport on the hidden scene, but are limited to isolated, diffuse
Learning Domain Invariant Representations For Child-Adult Classification From Speech
Diagnostic procedures for ASD (autism spectrum disorder) involve semi-naturalistic interactions between the child and a clinician. Computational methods to analyze these sessions require an end-to-end speech and language processing pipeline that goes from r
Multimodal Violence Detection In Videos
Effective tools for detection of violence are in high demand, especially when dealing with video streams. Such tools have a wide range of applications, from forensics and law enforcement to parental control over the ever increasing amount of videos availa
The SWAX Benchmark: Attacking Biometric Systems With Wax Figures
A face spoofing attack occurs when an intruder attempts to impersonate someone who carries a gainful authentication clearance. It is a trending topic due to the increasing demand for biometric authentication on mobile devices, high-security areas, among o
NASIL: Neural Architecture Search With Imitation Learning
Automated machine learning (AML) refers to a class of techniques that, given a problem, can find an optimal set of model architectures, properties, and parameters. In recent years, AML has shown great success in finding neural network structures that are
An Enhanced Decoding Algorithm For Coded Compressed Sensing
Coded compressed sensing is an algorithmic framework tailored to sparse recovery in very large dimensional spaces. This framework was originally envisioned for the unsourced multiple access channel, a wireless paradigm attuned to machine-type communication
Semi-Regular Geometric Kernel Encoding & Reconstruction For Video Compression
Conventional video coding schemes employ a hybrid motion prediction / residual transform coding paradigm, which only exploits redundancy in individual pairs of video frames for compression gain. However, rigid geometric structures in 3D space---e.g., a bu
Feedback Turbo Autoencoder
Designing channel codes is one of the core research areas for modern communication systems. Canonical channel codes asymptotically achieve near-capacity performance under a large block length regime for additive white Gaussian noise channels. However, thi
Neural Oracle Search On N-Best Hypotheses
In this paper, we propose a neural search algorithm to select the most likely hypothesis using a sequence of acoustic representations and multiple hypotheses as input. The algorithm provides a sequence level score for each audio-hypothesis pair that is ob
Learn-By-Calibrating: Using Calibration As A Training Objective
Calibration error is commonly adopted for evaluating the quality of uncertainty estimators in deep neural networks. In this paper, we argue that such a metric is highly beneficial for training predictive models, even when we do not explicitly measure the
Privacy-Preserving Image Sharing Via Sparsifying Layers On Convolutional Groups
We propose a practical framework to address the problem of privacy-aware image sharing in large-scale setups. We argue that, while compactness is always desired at scale, this need is more severe when trying to furthermore protect the privacy-sensitive co
Simple Caching Schemes For Non-Homogeneous MISO Cache-Aided Communication Via Convexity
We present a novel scheme for cache-aided communication over multiple-input single-output (MISO) cellular networks. The presented scheme achieves the same number of degrees of freedom as known coded caching schemes, but at much lower complexity. The
Feedback Recurrent Autoencoder
In this work, we propose a new recurrent autoencoder architecture, termed Feedback Recurrent AutoEncoder (FRAE), for online compression of sequential data with temporal dependency. The recurrent structure of FRAE is designed to efficiently extract the red
Accelerating Linear Algebra Kernels On A Massively Parallel Reconfigurable Architecture
Much of the recent work on domain-specific architectures has focused on bridging the gap between performance/efficiency and programmability. We consider one such example architecture, Transformer, consisting of light-weight cores interconnected by caches
ARSM Gradient Estimator For Supervised Learning To Rank
We propose a new model for supervised learning to rank. In our model, the relevance labels are assumed to follow a categorical distribution whose probabilities are constructed based on a scoring function. We optimize the training objective with respect to
Dynamically Modulated Deep Metric Learning For Visual Search
This paper proposes dynamically modulated metric learning (DMML) for learning a tiered similarity space to perform visual search. Existing methods often treat training samples with different degrees of information with equal importance, which hinders i
Accurate Semidefinite Relaxation Method For 3-D Rigid Body Localization Using AOA
This paper addresses the rigid body localization problem using angle-of-arrival measurements. We formulate the problem as a constrained weighted least squares (CWLS) minimization problem with the rotation matrix and position vector as variables, which is
Joint Optimization Of Sampling Patterns And Deep Priors For Improved Parallel MRI
Multichannel imaging techniques are widely used in MRI to reduce the scan time. These schemes typically perform undersampled acquisition and utilize compressed-sensing based regularized reconstruction algorithms. Model-based deep learning (MoDL) framework
Overlap Local-SGD: An Algorithmic Approach To Hide Communication Delays In Distributed SGD
Distributed stochastic gradient descent (SGD) is essential for scaling machine learning algorithms to a large number of computing nodes. However, infrastructure variability such as high communication delay or random node slowdown greatly impedes
Frame-Level Phoneme-Invariant Speaker Embedding For Text-Independent Speaker Recognition On Extremely Short Utterances
This paper investigates a phoneme-invariant speaker embedding approach for speaker recognition on extremely short utterances. Intuitively, phonemes are nuisance information for the text-independent speaker recognition task since the contents of the speech are
Normalized Least-Mean-Square Algorithms With Minimax Concave Penalty
We propose a novel problem formulation for sparsity-aware adaptive filtering based on the nonconvex minimax concave (MC) penalty, aiming to obtain a sparse solution with small estimation bias. We present two algorithms: the first algorithm uses a single f