IEEE ICASSP 2020 Virtual Conference May 2020

Thu, 16 July, 2020

Showing 1001 - 1050 of 1951

IEEE Member: US $11.00
Society Member: US $0.00
IEEE Student Member: US $11.00
Non-IEEE Member: US $15.00

Accurate Localization Of AUV In Motion By Explicit Solution Using Time Delays

00:12:21

0 views

Accurate localization of an autonomous underwater vehicle (AUV) is essential in many applications. The motion of an AUV during the measurement acquisition period can be significant and the localization performance can suffer considerably if it is neglecte

Algorithmic Exploration Of American English Dialects

00:10:50

0 views

In this paper, we use a novel algorithmic approach to explore dialectal variation in American English speech. Without the need for human annotations, we are able to use a corpus transcribed in text form only. Our results show that, in general, American En

Improving Deep Learning Classification Of JPEG2000 Images Over Bandlimited Networks

00:12:29

0 views

JPEG2000 (j2k) is a highly popular format for image and video compression. It plays a major role in the rapidly growing applications of cloud based image classification. Considering limited network bandwidth, we propose an end-to-end deep learning framewo

A Multi-View Approach For Mandarin Non-Native Mispronunciation Verification

00:13:05

0 views

Traditionally, the performance of non-native mispronunciation verification systems relied on effective phone-level labelling of non-native corpora. In this study, a multi-view approach is proposed to incorporate discriminative feature representations whic

K-Autoencoders Deep Clustering

00:10:59

0 views

In this study we propose a deep clustering algorithm that extends the k-means algorithm. Each cluster is represented by an autoencoder instead of a single centroid vector. Each data point is associated with the autoencoder which yields the minimal reconst

Adversarial Multi-Task Learning For Speaker Normalization In Replay Detection

00:11:24

0 views

Spoofing detection algorithms in voice biometrics are adversely affected by differences in the speech characteristics of the various target users. In this paper, we propose a novel speaker normalisation technique that employs adversarial multi-task learni

The Matched Reassigned Cross-Spectrogram For Phase Estimation

00:13:16

1 view

In this paper, the matched reassigned spectrogram is expanded into a novel matched phase reassignment (MPR) method based on the reassigned cross-spectrogram. It is shown that for two phase synchronized oscillating transient signals, the method gives perfe

Interpretable Machine Learning In Sustainable Edge Computing: A Case Study Of Short-Term Photovoltaic Power Output Prediction

00:16:03

0 views

With the Internet of Things continuously penetrating into all spheres of our daily life, the increasing use of smart devices enabled the emergence of the edge computing paradigm. To meet the needs of saving energy and reducing electricity bills for each h

Sparse Convolutional Beamforming For Wireless Ultrasound

00:12:29

1 view

Wireless ultrasound systems can make the imaging process much more efficient, affordable and accessible for users. The standard technique to create B-mode images is to rely on delay and sum (DAS) beamforming, in which the signals at each transducer elemen

Divergence-Based Adaptive Extreme Video Completion

00:12:15

0 views

Extreme image or video completion, where, for instance, we only retain 1% of pixels in random locations, allows for very cheap sampling in terms of the required pre-processing. The consequence is, however, a reconstruction that is challenging for humans a

Fast Acoustic Scattering Using Convolutional Neural Networks

00:15:59

0 views

Diffracted scattering and occlusion are important acoustic effects in interactive auralization and noise control applications, typically requiring expensive numerical simulation. We propose training a convolutional neural network to map from a convex scat

Adversarial Video Compression Guided By Soft Edge Detection

00:14:57

0 views

We propose a video compression framework using conditional Generative Adversarial Networks (GANs). We rely on two encoders: one that deploys a standard video codec and another one which generates low-level soft edge maps. For decoding, we use a standard v

Transformer VAE: A Hierarchical Model For Structure-Aware And Interpretable Music Representation Learning

00:11:57

0 views

Structure awareness and interpretability are two of the most desired properties of music generation algorithms. Structure-aware models generate more natural and coherent music with long-term dependencies, while interpretable models are more friendly for h

Geometrically Constrained Independent Vector Analysis For Directional Speech Enhancement

00:15:09

0 views

This paper addresses the multichannel directional speech enhancement problem with geometrically constrained independent vector analysis (GCIVA), where we aim to combine the high separation performance from blind source separation and the capability of dir

ESPnet-TTS: Unified, Reproducible, And Integratable Open Source End-To-End Text-To-Speech Toolkit

00:14:22

0 views

This paper introduces a new end-to-end text-to-speech (E2E-TTS) toolkit named ESPnet-TTS, which is an extension of the open-source speech processing toolkit ESPnet. The toolkit supports state-of-the-art E2E-TTS models, including Tacotron 2, Transformer TT

The Processing Of Mandarin Chinese Tonal Alternations In Contexts: An Eye-Tracking Study

00:10:58

0 views

This study investigated the perception of Mandarin tonal alternations in disyllabic words. In Mandarin, a low-dipping Tone3 is converted to a high-rising Tone2 when followed by another Tone3, known as third tone sandhi. Although previous studies showed st

Adversarial Attacks On GMM I-Vector Based Speaker Verification Systems

00:14:20

0 views

This work investigates the vulnerability of Gaussian Mixture Model (GMM) i-vector based speaker verification systems to adversarial attacks, and the transferability of adversarial samples crafted from GMM i-vector based systems to x-vector based systems.

Detecting Mismatch Between Text Script And Voice-Over Using Utterance Verification Based On Phoneme Recognition Ranking

00:15:00

0 views

The purpose of this study is to detect the mismatch between text script and voice-over. For this, we present a novel utterance verification (UV) method, which calculates the degree of correspondence between a voice-over and the phoneme sequence of a scrip

Regularized Beamformer For The Spherical Microphone Array To Cope With The White Noise Amplification

[2 videos]

00:11:39

0 views

00:12:09

0 views

Spherical microphone arrays with compact aperture and maximum directivity factor have been one of the popular research fields but are usually accompanied by the white noise amplification problem, which hinders them for practical applications. This paper p

A Neural Network Based On First Principles

00:13:25

0 views

In this paper, a neural network is derived from first principles, assuming only that each layer begins with a linear dimension-reducing transformation. The approach appeals to the principle of Maximum Entropy (Max-Ent) to find the posterior distribution o

AV(SE)²: Audio-Visual Squeeze-Excite Speech Enhancement

00:13:38

1 view

The goal of audio-visual speech enhancement (AVSE) is to supplement audio-only information with visual information, such as target speaker's lip movements, to improve the intelligibility and overall perceptual quality of noisy speech signals. We propose a

Unseen Face Presentation Attack Detection With Hypersphere Loss

00:13:01

0 views

Presentation attacks are one of the main threats to face verification systems and attract great attention from the research community. Recent methods achieve great success in intra-database tests. However, the problem is more complex in practical scenarios, as the

L-Vector: Neural Label Embedding For Domain Adaptation

00:15:24

0 views

We propose a novel neural label embedding (NLE) scheme for the domain adaptation of a deep neural network (DNN) acoustic model with unpaired data samples from source and target domains. With the NLE method, we distill the knowledge from a powerful source-doma

Attention-Guided Deraining Network Via Stage-Wise Learning

00:12:46

0 views

Due to diverse rain shapes, directions, and densities, as well as different distances to cameras, rain streaks in the air are interwoven and overlapped. However, most existing deraining methods are inherently oblivious to this phenomenon and tend to learn a sing

Joint Enhancement And Denoising Of Low Light Images Via JND Transform

00:13:20

0 views

Low light images suffer from low dynamic range and severe noise due to low signal-to-noise ratio (SNR). In this paper, we propose joint enhancement and denoising of low light images via just-noticeable-difference (JND) transform. We achieve contrast enhan

Signal Clustering With Class-Independent Segmentation

00:14:00

0 views

Radar signals have been dramatically increasing in complexity, limiting the source separation ability of traditional approaches. In this paper we propose a Deep Learning-based clustering method, which encodes concurrent signals into images, and, for the f

Model Order Selection In DOA Scenarios Via Cross-Entropy Based Machine Learning Techniques

00:15:01

0 views

In this paper, we present a machine learning approach for estimating the number of incident wavefronts in a direction of arrival scenario. In contrast to previous works, a multilayer neural network with a cross-entropy objective is trained. Furthermore, w

Instant Adaptive Learning: An Adaptive Filter Based Fast Learning Model Construction For Sensor Signal Time Series Classification On Edge Devices

00:14:04

0 views

Constructing a learning model under computational and energy constraints, particularly under a highly limited training time, is a critical and unique requirement of many practical IoT applications that use time series sensor signal analytics f

Spectrograms Fusion With Minimum Difference Masks Estimation For Monaural Speech Dereverberation

00:13:16

0 views

Spectrograms fusion is an effective method for incorporating complementary speech dereverberation systems. Previous linear spectrograms fusion by averaging multiple spectrograms shows very good performance. However, this simple method can't be applied to

Sequential Deep Unrolling With Flow Priors For Robust Video Deraining

00:12:10

0 views

Video deraining has attracted wide attention owing to the urgent demand for high-quality video in recent years. Indistinct details and nonideal deraining effects are the most common defects in existing techniques, whose cause lies in the insufficient usag

Linear Speedup In Saddle-Point Escape For Decentralized Non-Convex Optimization

00:14:31

0 views

Under appropriate cooperation protocols and parameter choices, fully decentralized solutions for stochastic optimization have been shown to match the performance of centralized solutions and result in linear speedup (in the number of agents) relative to n

Weighted Krylov-Levenberg-Marquardt Method For Canonical Polyadic Tensor Decomposition

00:14:27

0 views

Weighted canonical polyadic (CP) tensor decomposition appears in a wide range of applications. A typical situation where the weighted decomposition is needed is when some tensor elements are unknown, and the task is to fill in the missing elements under t

Streaming Automatic Speech Recognition With The Transformer Model

00:14:28

1 view

Encoder-decoder based sequence-to-sequence models have demonstrated state-of-the-art results in end-to-end automatic speech recognition (ASR). Recently, the transformer architecture, which uses self-attention to model temporal context information, has bee

Source Coding Of Audio Signals With A Generative Model

00:13:53

0 views

We consider source coding of audio signals with the help of a generative model. We use a construction where a waveform is first quantized, yielding a finite bitrate representation. The waveform is then reconstructed by random sampling from a model conditi

Lifter Training And Sub-Band Modeling For Computationally Efficient And High-Quality Voice Conversion Using Spectral Differentials

00:14:29

0 views

In this paper, we propose computationally efficient and high-quality methods for statistical voice conversion (VC) with direct waveform modification based on spectral differentials. The conventional method with a minimum-phase filter achieves high-quality

Speaker Adaptation Of A Multilingual Acoustic Model For Cross-Language Synthesis

00:12:38

0 views

Several studies have shown promising results in adapting DNN-based acoustic models as a mechanism to transfer characteristics from pre-trained models. One such example is speaker adaptation using a small amount of data, where fine-tuning has helped train

One-Shot Voice Conversion By Vector Quantization

00:13:36

0 views

In this paper, we propose a vector quantization (VQ) based one-shot voice conversion (VC) approach without any supervision on speaker label. We model the content embedding as a series of discrete codes and take the difference between quantize-before and q

Reconstruction Of FRI Signals Using Deep Neural Network Approaches

00:15:02

0 views

Finite Rate of Innovation (FRI) theory considers sampling and reconstruction of classes of non-bandlimited continuous signals that have a small number of free parameters, such as a stream of Diracs. The task of reconstructing FRI signals from discrete sam

Meta-Learning For Robust Child-Adult Classification From Speech

00:14:26

0 views

Computational modeling of naturalistic conversations in clinical applications has seen growing interest in the past decade. An important use-case involves child-adult interactions within the autism diagnosis and intervention domain. In this paper, we addr

Residual Attention Network For Wavelet Domain Super-Resolution

00:14:54

0 views

Single-image super-resolution plays an important role in computer vision. However, previous works using convolutional neural networks perform badly when reconstructing high-frequency details, resulting in over-smoothed output that lacks textural information

Supervised Graph Representation Learning For Modeling The Relationship Between Structural And Functional Brain Connectivity

00:14:33

1 view

In this paper, we propose a supervised graph representation learning method to model the relationship between brain functional connectivity (FC) and structural connectivity (SC) through a graph encoder-decoder system. The graph convolutional network (GCN)

Online Tensor Completion And Free Submodule Tracking With The t-SVD

00:15:00

0 views

We propose a new online algorithm, called TOUCAN, for the tensor completion problem of imputing missing entries of a low tubal-rank tensor using the tensor-tensor product (t-product) and tensor singular value decomposition (t-SVD) algebraic framework. We

Feature Enhancement With Deep Feature Losses For Speaker Verification

00:14:04

0 views

Speaker verification still suffers from the challenge of generalization to novel adverse environments. We leverage the recent advancements made by deep learning based speech enhancement and propose a feature-domain supervised denoising based solution.

Multi-Stage Residual Hiding For Image-Into-Audio Steganography

00:12:39

0 views

The widespread application of audio communication technologies has sped up the flow of audio data across the Internet, making it a popular carrier for covert communication. In this paper, we present a cross-modal steganography method for hiding image c

Improving Sequence-To-Sequence Speech Recognition Training With On-The-Fly Data Augmentation

00:16:13

0 views

Sequence-to-Sequence (S2S) models recently started to show state-of-the-art performance for automatic speech recognition (ASR). With these large and deep models overfitting remains the largest problem, outweighing performance improvements that can be obta

Riemannian Geometry And Cramér-Rao Bound For Blind Separation Of Gaussian Sources

00:16:41

0 views

We consider the optimal performance of blind separation of Gaussian sources. In practice, this estimation problem is solved by a two-step procedure: estimation of a set of covariance matrices from the observed data and approximate joint diagonalization of

Humbug Zooniverse: A Crowd-Sourced Acoustic Mosquito Dataset

00:11:25

0 views

Mosquitoes are the only known vector of malaria, which leads to hundreds of thousands of deaths each year. Understanding the number and location of potential mosquito vectors is of paramount importance to aid the reduction of malaria transmission cases. I

Receiver Design And AGC Optimization With Self Interference Induced Saturation

00:14:27

0 views

In-band Full Duplex (FD) is a wireless communication technology which has the potential to transmit and receive simultaneously in the same frequency band. Self-interference cancellation (SIC) is the key enabler to achieve FD operation. As the SI is severe

Spatial Attention For Far-Field Speech Recognition With Deep Beamforming Neural Networks

00:12:03

0 views

In this paper, we introduce spatial attention for refining the information in multi-direction neural beamformer for far-field automatic speech recognition. Previous approaches of neural beamformers with multiple look directions, such as the factored compl
