IEEE ICASSP 2020 Virtual Conference May 2020 | IEEETV

Thu, 16 July, 2020

IEEE Member: US $11.00
Society Member: US $0.00
IEEE Student Member: US $11.00
Non-IEEE Member: US $15.00

Exact Sparse Nonnegative Least Squares

00:14:47

0 views

We propose a novel approach to solve exactly the sparse nonnegative least squares problem, under hard l0 sparsity constraints. This approach is based on a dedicated branch-and-bound algorithm. This simple strategy is able to compute the optimal solution e

Converting Written Language To Spoken Language With Neural Machine Translation For Language Modeling

00:14:43

0 views

When building a language model (LM) for spontaneous speech, the ideal situation is to have a large amount of spoken, in-domain training data. Having such abundant data, however, is not realistic. We address this problem by generating texts in spoken langu

Exploring Appropriate Acoustic And Language Modelling Choices For Continuous Dysarthric Speech Recognition

00:14:41

0 views

There has been much recent interest in building continuous speech recognition systems for people with severe speech impairments, e.g., dysarthria. However, the datasets that are commonly used are typically designed for tasks other than ASR development, or

A Semi-Supervised Rank Tracking Algorithm For On-Line Unmixing Of Hyperspectral Images

00:15:24

0 views

This paper addresses the problem of rank tracking in real time hyperspectral image unmixing methods. Based on the On-line Alternating Direction Method of Multipliers (ADMM), we propose a new hyperspectral unmixing approach that integrates prior informatio

Compressive Adaptive Bilateral Filtering

00:12:01

0 views

We propose a fast algorithm for an adaptive variant of the classical bilateral filter, where the range kernel is allowed to vary from pixel to pixel. Several fast and accurate algorithms have been proposed for bilateral filtering, but they assume that the

Clock Synchronization Over Networks Using Sawtooth Models

00:19:30

0 views

Clock synchronization and ranging over a wireless network with low communication overhead is a challenging goal with tremendous impact. In this paper, we study the use of time-to-digital converters in wireless sensors, which provides clock synchronization

Generating Empathetic Responses By Looking Ahead The User’s Sentiment

00:14:56

0 views

An important aspect of human conversation difficult for machines is conversing with empathy, which is to understand the user's emotion and respond appropriately. Recent neural conversation models that attempted to generate empathetic responses either focu

Robust Pricing Mechanism For Resource Sustainability Under Privacy Constraint In Competitive Online Learning Multi-Agent Systems

00:14:28

0 views

We consider the problem of resource congestion control for competing online learning agents under privacy and security constraints. Based on the non-cooperative game as the model for agents' interaction and the noisy online mirror ascent as the model for

Time Reversal Based Robust Gesture Recognition Using WiFi

00:12:59

1 view

Gesture recognition using wireless sensing opened a plethora of applications in the field of human-computer interaction. However, most existing works are not robust without requiring wearables or tedious training/calibration. In this work, we propose WiGR

Improving Prosody With Linguistic And BERT Derived Features In Multi-Speaker Based Mandarin Chinese Neural TTS

00:14:50

1 view

Recent advances of neural TTS have made "human parity" synthesized speech possible when a large amount of studio-quality training data from a voice talent is available. However, with only limited, casual recordings from an ordinary speaker, human-like TTS

On Regularization Parameter For L0-Sparse Covariance Fitting Based DOA Estimation

00:14:27

0 views

In sparse DOA estimation methods, the regularization parameter is generally tuned empirically. In this paper, we provide a statistical method for estimating an admissible interval from which it must be chosen. This work is conducted in the case of an Uni

Optimal Transport Based Change Point Detection And Time Series Segment Clustering

00:15:03

0 views

Two common problems in time series analysis are the decomposition of the data stream into disjoint segments, each of which is in some sense "homogeneous" - a problem that is also referred to as Change Point Detection (CPD) - and the grouping of similar no

Using Intelligent Reflecting Surfaces For Rank Improvement In MIMO Communications

00:15:07

0 views

An intelligent reflecting surface (IRS), consisting of reconfigurable metamaterials, can be used to partially control the radio environment and thereby bring new features to wireless communications. Previous works on IRS have particularly studied the rang

Retinal Vessel Segmentation Via A Semantics And Multi-Scale Aggregation Network

00:12:38

0 views

Precise segmentation of retinal vessels is crucial for a computer-aided diagnosis system of retinal fundus images. However, this task remains challenging due to large variations in scales and poor segmentation of capillary vessels. In this paper, we propo

Constrained Spectral Clustering For Dynamic Community Detection

00:15:31

0 views

Networks are useful representations of many systems with interacting entities, such as social, biological and physical systems. Characterizing the meso-scale organization, i.e. the community structure, is an important problem in network science. Community

Sound Texture Synthesis Using RI Spectrograms

00:14:45

0 views

This article introduces a new parametric synthesis method for sound textures based on existing works in visual and sound texture synthesis. Starting from a base sound signal, an optimization process is performed until the cross-correlations between the fe

Fully Pipelined Iteration Unrolled Decoders: The Road To Tb/s Turbo Decoding

00:13:43

0 views

Turbo codes are a well-known code class used for example in the LTE mobile communications standard. They provide built-in rate flexibility and a low-complexity and fast encoding. However, the serial nature of their decoding algorithm makes high-throughput

Learning To Detect Keyword Parts And Whole By Smoothed Max Pooling

00:12:02

0 views

We propose smoothed max pooling loss and its application to keyword spotting systems. The proposed approach jointly trains an encoder (to detect keyword parts) and a decoder (to detect whole keyword) in a semi-supervised manner. The proposed new loss func

Joint Coding And Modulation In The Ultra-Short Blocklength Regime For Bernoulli-Gaussian Impulsive Noise Channels Using Autoencoders

00:14:59

0 views

This paper develops a joint coding and modulation scheme for end-to-end communication system design using an autoencoder architecture in the ultra-short blocklength regime. Unlike the classical approach of separately designing error correction codes and m

Leveraging Unpaired Text Data For Training End-To-End Speech-To-Intent Systems

00:14:00

0 views

Training an end-to-end (E2E) neural network speech-to-intent (S2I) system that directly extracts intents from speech requires large amounts of intent-labeled speech data, which is time consuming and expensive to collect. Initializing the S2I model with an

Active Learning With Unsupervised Ensembles Of Classifiers

00:10:15

0 views

The present work introduces a simple scheme for active classification of data using unsupervised ensembles of classifiers. Uncertainty sampling, with different uncertainty measures, is evaluated for data selection, while an online expectation maximization

Maximum Likelihood Multi-Speaker Direction Of Arrival Estimation Utilizing A Weighted Histogram

00:12:56

0 views

In this contribution, a novel maximum likelihood (ML) based direction of arrival (DOA) estimator for concurrent speakers in a noisy reverberant environment is presented. The DOA estimation task is formulated in the short-time Fourier transform (STFT) in t

Optimal Joint Channel Estimation And Data Detection By L1-Norm PCA For Streetscape IoT

00:15:25

0 views

We prove, for the first time in the literature of communication theory and machine learning, the equivalence of joint maximum-likelihood (ML) optimal channel estimation and data detection (JOCEDD) to the problem of finding the $L_1$-norm principal compone

Quickest Detection Of Growing Dynamic Anomalies In Networks

00:14:19

0 views

The problem of quickest growing dynamic anomaly detection in sensor networks is studied. Initially, the observations at the sensors, which are sampled sequentially by the decision maker, are generated according to a pre-change distribution. At some unknow

MT-GCN For Multi-Label Audio Tagging With Noisy Labels

00:09:10

0 views

Multi-label audio tagging is the task of predicting the types of sounds occurring in an audio clip. Recently, large-scale audio datasets such as Google's AudioSet have allowed researchers to use deep learning techniques for this task, but this comes at th

Small-Footprint Keyword Spotting On Raw Audio Data With Sinc-Convolutions

00:12:13

0 views

Keyword Spotting (KWS) enables speech-based user interaction on smart devices. Always-on and battery-powered application scenarios for smart devices put constraints on hardware resources and power consumption, while also demanding high accuracy as well as

Far-Field Location Guided Target Speech Extraction Using End-To-End Speech Recognition Objectives

00:13:06

0 views

Target speech extraction is a specific case of source separation where auxiliary information, such as the location or some pre-saved anchor speech examples of the target speaker, is used to resolve the permutation ambiguity. Traditionally such systems are o

A Single-RF Architecture For Multiuser Massive MIMO Via Reflecting Surfaces

00:14:21

0 views

In this work, we propose a new single-RF MIMO architecture which enjoys high scalability and energy-efficiency. The transmitter in this proposal consists of a single RF illuminator radiating towards a reflecting surface. Each element on the reflecting sur

Speech-To-Singing Conversion In An Encoder-Decoder Framework

00:15:05

0 views

In this paper our goal is to convert a set of spoken lines into sung ones. Unlike previous signal processing based methods, we take a learning based approach to the problem. This allows us to automatically model various aspects of this transformation, thu

Sparse Low-Redundancy Linear Array With Uniform Sum Co-Array

00:14:58

3 views

Sparse arrays can resolve vastly more scatterers than the number of sensors, in tasks such as coherent source localization. This entails significant cost reductions compared to conventional arrays with uniformly spaced elements. In this paper, we introduc

SDTCN: Similarity Driven Transmission Computing Network For Image Dehazing

00:12:31

0 views

Transmission similarity is an important feature which can greatly increase the capability of convolutional neural network (CNN) to fit transmission map. However, it is not sufficiently utilized in existing algorithms. In this paper, we propose a novel lig

One-Bit Compressed Sensing Using Generative Models

00:13:49

0 views

In this paper, we address the classical problem of one-bit compressed sensing. We present a deep learning based reconstruction algorithm that relies on a generative model. The generator which is a neural network, learns a mapping from a low dimensional sp

Two-Element Biomimetic Antenna Array Design And Performance

00:15:30

0 views

Arrays of closely-spaced antennas with mutual coupling have been considered recently with analogies to the hearing mechanism in small insects that exhibit excellent direction finding capabilities. We develop a model for a two-element array system that inc

Toward Better Speaker Embeddings: Automated Collection Of Speech Samples From Unknown Distinct Speakers

00:10:56

0 views

The accuracy of speaker verification and diarization models depends on the quality of the speaker embeddings used to separate audio samples from different speakers. With the goal of training better embedding models, we devise an automatic pipeline for l

Fully-Hierarchical Fine-Grained Prosody Modeling For Interpretable Speech Synthesis

00:15:13

0 views

This paper proposes a hierarchical, fine-grained and interpretable latent variable model for prosody based on the Tacotron 2 text-to-speech model. It achieves multi-resolution modeling of prosody by conditioning finer level representations on coarser leve

Real-Time, Universal, And Robust Adversarial Attacks Against Speaker Recognition Systems

00:14:26

0 views

As the popularity of voice user interfaces (VUIs) has exploded in recent years, speaker recognition systems have emerged as an important means of identifying a speaker in many security-required applications and services. In this paper, we propose the first real-

A Simple But Effective BERT Model For Dialog State Tracking On Resource-Limited Systems

00:12:00

0 views

In a task-oriented dialog system, the goal of dialog state tracking (DST) is to monitor the state of the conversation from the dialog history. Recently, many deep learning based methods have been proposed for the task. Despite their impressive performance

Speaker-Aware Target Speaker Enhancement By Jointly Learning With Speaker Embedding Extraction

00:14:54

1 view

Deep learning based speech separation approaches have received great interest, among which the recent speaker-aware speech enhancement methods are promising for solving difficulties such as arbitrary source permutation and unknown number of sources. In th

WAWEnets: A No-Reference Convolutional Waveform-Based Approach To Estimating Narrowband And Wideband Speech Quality

00:13:38

0 views

Building on prior work we have developed a no-reference (NR) waveform-based convolutional neural network (CNN) architecture that can accurately estimate speech quality or intelligibility of narrowband and wideband speech segments. These Wideband Audio Wav

Automatic Fluency Evaluation Of Spontaneous Speech Using Disfluency-Based Features

00:12:41

0 views

This paper describes an automatic fluency evaluation of spontaneous speech. Although we regularly observe a variety of different disfluencies in spontaneous speech, we focus on two types of phenomena, i.e., filled pauses and word fragments. This paper aim

A Time-Frequency Network With Channel Attention And Non-Local Modules For Artificial Bandwidth Extension

00:12:50

0 views

Convolutional neural networks (CNNs) have been receiving increasing attention for the artificial bandwidth extension (ABE) task recently. However, these methods use the flipped low-frequency phase to reconstruct speech signals, which may lead to the well-kn

Interpretable Self-Attention Temporal Reasoning For Driving Behavior Understanding

00:14:58

0 views

Performing driving behaviors based on causal reasoning is essential to ensure driving safety. In this work, we investigated how state-of-the-art 3D Convolutional Neural Networks (CNNs) perform on classifying driving behaviors based on causal reasoning. We

Polarizing Front Ends For Robust CNNs

00:14:49

0 views

The vulnerability of deep neural networks to small, adversarially designed perturbations can be attributed to their "excessive linearity." In this paper, we propose a bottom-up strategy for attenuating adversarial perturbations using a nonlinear front end

A Novel Rank Selection Scheme In Tensor Ring Decomposition Based On Reinforcement Learning For Deep Neural Networks

00:14:58

0 views

Tensor decomposition has been proved to be effective for solving many problems in signal processing and machine learning. Recently, tensor decomposition finds its advantage for compressing deep neural networks. In many applications of deep neural networks

Voice Based Classification Of Patients With Amyotrophic Lateral Sclerosis, Parkinson's Disease And Healthy Controls With CNN-LSTM Using Transfer Learning

00:13:38

0 views

In this paper, we consider 2-class and 3-class classification problems for classifying patients with Amyotrophic Lateral Sclerosis (ALS), Parkinson's Disease (PD), and Healthy Controls (HC) using a CNN-LSTM network. Classification performance is examined

Automatic Event Detection Of REM Sleep Without Atonia From Polysomnography Signals Using Deep Neural Networks

00:10:49

0 views

Rapid eye movement (REM) sleep behavior disorder (RBD) is a sleep disorder that features loss of atonia, or REM sleep without atonia (RSWA). RBD and RSWA are early manifestations of degenerative neurological diseases such as Parkinson's and Lewy Body Deme