IEEE ICASSP 2020 Virtual Conference May 2020

Thu, 16 July, 2020

Showing 151 - 200 of 1951

IEEE Member: US $11.00
Society Member: US $0.00
IEEE Student Member: US $11.00
Non-IEEE Member: US $15.00

Purchase

A Proximal Dual Consensus Method For Linearly Coupled Multi-Agent Non-Convex Optimization

00:12:11

0 views

Motivated by large-scale signal processing and machine learning applications, this paper considers the distributed multi-agent optimization problem for a linearly constrained non-convex problem. Each of the agents owns a local cost function and local vari

Extracting Unit Embeddings Using Sequence-To-Sequence Acoustic Models For Unit Selection Speech Synthesis

00:12:15

0 views

This paper presents a method of using the intermediate representations between linguistic and acoustic features in a Tacotron model to derive the cost functions for unit selection speech synthesis. By extracting the outputs of the Tacotron encoder, each p

Non-Local Nested Residual Attention Network For Stereo Image Super-Resolution

00:12:25

0 views

CNN-based stereo image super-resolution (SR) methods now achieve remarkable performance. However, most existing methods portray the low-layer features only superficially, without considering the uneven distribution of information, which is i

Bandit Sampling For Faster Activity And Data Detection In Massive Random Access

00:14:02

0 views

This paper considers the grant-free random access scheme in IoT networks with a massive number of devices that are sporadically active. By embedding the data symbols in the signature sequences, joint device activity detection and data decoding can be ach

Similarity Learning For Cover Song Identification Using Cross-Similarity Matrices Of Multi-Level Deep Sequences

00:12:11

0 views

In recent years, several deep learning models have been proposed for cover song identification and they have been designed to learn fixed-length feature vectors for music tracks. However, the aspect of temporal progression of music, which is important for

Meta Learning For End-To-End Low-Resource Speech Recognition

00:14:22

0 views

In this paper, we propose to apply a meta-learning approach to low-resource automatic speech recognition (ASR). We formulate ASR for different languages as different tasks, and meta-learn the initialization parameters from many pretraining languages to

Multi-Task Self-Supervised Learning For Robust Speech Recognition

[2 Videos]

00:01:40

1 view

00:15:30

0 views

Despite the growing interest in unsupervised learning, extracting meaningful knowledge from unlabelled audio remains an open challenge. To take a step in this direction, we recently proposed a problem-agnostic speech encoder (PASE), that combines a convol

Audio-Visual Calibration With Polynomial Regression For 2-D Projection Using SVD-PHAT

00:13:49

0 views

This paper proposes a straightforward 2-D method to spatially calibrate the visual field of a camera with the auditory field of an array microphone by generating and overlaying an acoustic image over an optical image. Using a low-cost microphone array and

Feature Affine Projection Algorithms

00:13:56

0 views

There is a growing research interest in proposing new techniques to detect and exploit signals/systems sparsity. Recently, the idea of hidden sparsity has been proposed, and it has been shown that, in many cases, sparsity is not explicit, and some tools a

DOA Estimation In Systems With Nonlinearities For mmWave Communications

00:13:50

2 views

Accurate and efficient methods for Direction of Arrival (DOA) estimation play an important role in mmWave channel estimation methods. This estimation procedure can potentially be affected by the different RF and analog components in the communication syst

Perception-Distortion Trade-Off With Restricted Boltzmann Machines

00:10:43

0 views

In this work, we introduce a new procedure for applying Restricted Boltzmann Machines (RBMs) to missing data inference tasks, based on linearization of the effective energy function governing the distribution of observations. We compare the performance of

Evaluation Of Joint Auditory Attention Decoding And Adaptive Binaural Beamforming Approach For Hearing Devices With Attention Switching

00:14:52

1 view

Beamforming is a common technique used to improve speech intelligibility and listening comfort for hearing-aid users in a noisy environment. Traditional hearing-aid beamforming algorithms require a priori knowledge of the auditory attention of the li

Using X-Vectors To Automatically Detect Parkinson's Disease From Speech

00:12:01

0 views

The promise of new neuroprotective treatments to stop or slow the advance of Parkinson's Disease (PD) calls for new biomarkers or detection schemes that can deliver a faster diagnosis. Given that speech is affected by PD, the combination of deep neural ne

Few-Shot Acoustic Event Detection Via Meta Learning

00:11:59

0 views

We study few-shot acoustic event detection (AED) in this paper. Few-shot learning enables detection of new events with very limited labeled data and facilitates personalization of AED systems for users in real applications. Compared to other research area

Improving Voice Separation By Incorporating End-To-End Speech Recognition

00:14:58

0 views

Despite recent advances in voice separation methods, many challenges remain in realistic scenarios such as noisy recording and the limits of available data. In this work, we propose to explicitly incorporate the phonetic and linguistic nature of speech by

Distilling Attention Weights For CTC-Based ASR Systems

00:13:20

0 views

We present a novel training approach for connectionist temporal classification (CTC)-based automatic speech recognition (ASR) systems. CTC models are promising for building both a conventional acoustic model and an end-to-end (E2E) ASR model. However, CT

Estimation Of Post-Nonlinear Causal Models Using Autoencoding Structure

00:13:11

0 views

Discovering causal relations in complex systems is an important problem in many research fields. To describe such systems involving nonlinear causal relations, the post-nonlinear (PNL) causal model has been proposed. However, despite its identifiability,

Robust Low Rate Speech Coding Based On Cloned Networks And WaveNet

00:13:17

0 views

Rapid advances in machine-learning based generative modeling of speech make its use in speech coding attractive. However, the current performance of such models drops rapidly with noise contamination of the input, preventing use in practical applications.

3D Deformation Signature For Dynamic Face Recognition

00:15:34

0 views

This work proposes a novel 3D Deformation Signature (3DS) to represent a 3D deformation signal for 3D Dynamic Face Recognition. 3DS is computed given a non-linear 6D-space representation which guarantees physically plausible 3D deformations. A unique defo

An Analysis Of Speech Enhancement And Recognition Losses In Limited Resources Multi-Talker Single Channel Audio-Visual ASR

00:10:09

0 views

In this paper, we analyzed how audio-visual speech enhancement can help to perform the ASR task in a cocktail party scenario. To this end, we considered two simple end-to-end LSTM-based models that perform single-channel audio-visual speech enhancement and ph

Fully-Neural Approach To Heavy Vehicle Detection On Bridges Using A Single Strain Sensor

00:14:30

0 views

Bridge weigh-in-motion (BWIM) is a technique for detecting heavy vehicles that may cause serious damage to real bridges. BWIM is realized by analyzing the strain signals observed at places on the bridge in terms of bridge-component responses to the axle l

Energy Efficient Acceleration Of Floating Point Applications Onto CGRA

00:14:29

0 views

In this paper, we propose a novel CGRA architecture and associated compilation flow supporting both integer and floating-point computations for energy efficient acceleration of DSP applications. Experimental results show that the proposed accelerator achi

Overlapped State Hidden Semi-Markov Model For Grouped Multiple Sequences

00:14:26

0 views

Efficient analysis of multiple sequential data is becoming necessary for identifying sequential patterns of multiple objects of interest. This analysis has major practical and technical importance because finding such patterns necessitates extraction and

AnomalyDAE: Dual Autoencoder For Anomaly Detection On Attributed Networks

00:13:13

0 views

Anomaly detection on attributed networks aims at finding nodes whose patterns deviate significantly from the majority of reference nodes, which is pervasive in many applications such as network intrusion detection and social spammer detection. However, mo

Improving The Performance Of Transformer Based Low Resource Speech Recognition For Indian Languages

00:13:19

0 views

The recent success of the Transformer based sequence-to-sequence framework for various Natural Language Processing tasks has motivated its application to Automatic Speech Recognition. In this work, we explore the application of Transformers on low resourc

On The Byzantine Robustness Of Clustered Federated Learning

00:11:58

0 views

Federated Learning (FL) is currently the most widely adopted framework for collaborative training of (deep) machine learning models under privacy constraints. Despite its popularity, it has been observed that Federated Learning yields suboptimal results i

Dynamic Channel Pruning For Correlation Filter Based Object Tracking

00:12:44

0 views

Fusion of multi-channel representations has played a crucial role in the success of correlation filter (CF) based trackers. However, not all channels contain useful information for target localization in every frame. During challenging scenarios, ambiguous

Patch-Level Selection And Breadth-First Prediction Strategy For Reversible Data Hiding

00:12:10

0 views

A core task in reversible data hiding is designing an embedding method that lets the hider exploit as many smooth elements as possible while keeping the detection procedure for marked elements invertible for the receiver. This motivates us to introdu

Frequency-Dependent Directional Feedback Delay Network

00:14:24

0 views

A recent publication introduced the Directional Feedback Delay Network, a parametric artificial reverberation algorithm capable of producing direction-dependent energy decay. This method extends the capabilities of Feedback Delay Networks by using multich

AugLabel: Exploiting Word Representations To Augment Labels For Face Attribute Classification

00:13:41

0 views

Augmenting data in image space (e.g., flipping, cropping) and activation space (e.g., dropout) is widely used to regularise deep neural networks and has been successfully applied to several computer vision tasks. Unlike previous works, which are m

EnerGAN: A Generative Adversarial Network For Energy Disaggregation

00:12:38

0 views

An efficient, appliance-level approach for energy disaggregation, exploiting the benefits of Generative Adversarial Networks, is presented. The concept of adversarial training supports the creation of fine-tuned disaggregators, which produce more detailed

Wirtinger Flow Algorithms For Phase Retrieval From Binary Measurements

00:12:48

0 views

We consider the problem of Binary Phase Retrieval, wherein we attempt to recover signals from their quadratic measurements, which are further encoded as +1 or −1 depending on whether they exceed a threshold or not. Binary encoding is the extreme case of q

Optimal Design Of Energy-Efficient Cell-Free Massive MIMO: Joint Power Allocation And Load Balancing

00:11:30

0 views

A large-scale distributed antenna system that serves the users by coherent joint transmission is called Cell-free Massive MIMO (multiple input multiple output). For a given user set, only a subset of the access points (APs) is likely needed to satisfy the

Fusion Approaches For Emotion Recognition From Speech Using Acoustic And Text-Based Features

00:12:43

0 views

In this paper, we study different approaches for classifying emotions from speech using acoustic and text-based features. We propose to obtain contextualized word embeddings with BERT to represent the information contained in speech transcriptions and sho

Multi-Branch Learning For Weakly-Labeled Sound Event Detection

00:15:13

0 views

Weakly-supervised SED implies two sub-tasks: audio tagging and event boundary detection. Current methods that combine multi-task learning with SED require annotations for both of these sub-tasks. Since there are only annotations for au

H-Vectors: Utterance-Level Speaker Embedding Using A Hierarchical Attention Model

00:12:36

0 views

In this paper, a hierarchical attention network is proposed to generate utterance-level embeddings (H-vectors) for speaker identification and verification. Since different parts of an utterance may have different contributions to speaker identities, the u

How Confident Are You? Exploring The Role Of Fillers In The Automatic Prediction Of A Speaker’s Confidence

00:12:04

0 views

"Fillers", for example "um" in English, have been linked to the "Feeling of Another's Knowing (FOAK)" or the listener's perception of a speaker's expressed confidence. Yet, in Spoken Language Processing (SLP) they remain unexplored, or overlooked as noise. We

A Comparative Study Of Estimating Articulatory Movements From Phoneme Sequences And Acoustic Features

00:13:58

0 views

Unlike phoneme sequences, movements of speech articulators (lips, tongue, jaw, velum) and the resultant acoustic signal are known to encode not only the linguistic message but also carry para-linguistic information. While several works exist for estimatin

Distributed Tensor Completion Over Networks

00:14:34

0 views

The aim of this paper is to propose a novel distributed strategy for tensor completion, where (partial) data are collected over a network of agents with sparse, but connected, topology. The method hinges on the canonical polyadic decomposition, also known

Privacy-Preserving Phishing Web Page Classification Via Fully Homomorphic Encryption

00:12:51

0 views

This work introduces a fast and lightweight homomorphic-encryption pipeline that enables privacy-preserving machine learning for phishing web page recognition. The primary goals are to use visual features to train an accurate model and to implement an inf

Spatially Adaptive Intra Mode Pre-Selection For ERP 360 Video Coding

00:13:59

0 views

In this work, we propose a spatially adaptive HEVC intra mode pre-selection for equirectangular (ERP) 360 video coding. The proposed technique exploits the spatial characteristics of 360 video in the ERP projection to reduce the complexity of intra predic

BBA-Net: A Bi-Branch Attention Network For Crowd Counting

00:13:10

0 views

In the field of crowd counting, the current mainstream CNN-based regression methods simply extract the density information of pedestrians without finding the position of each person. As a result, the output of the network is often found to contain incorrect re

Incorporating Written Domain Numeric Grammars Into End-To-End Contextual Speech Recognition Systems For Improved Recognition Of Numeric Sequences

00:13:51

0 views

Accurate recognition of numeric sequences is crucial for many contextual speech recognition applications. For example, a user might create a calendar event and be prompted by a virtual assistant for the time, date, and duration of the event. We propose a

Within-Sample Variability-Invariant Loss For Robust Speaker Recognition Under Noisy Environments

00:10:39

0 views

Despite the significant improvements in speaker recognition enabled by deep neural networks, unsatisfactory performance persists under noisy environments. In this paper, we train the speaker embedding network to learn the "clean" embedding of the noisy

Low Rank Activations For Tensor-Based Convolutional Sparse Coding

00:10:13

0 views

In this article, we propose to extend the classical Convolutional Sparse Coding (CSC) model to multivariate data by introducing a new tensor CSC model that enforces sparsity and low-rank constraints on the activations. The advantages of this model are thre

Stock Movement Prediction That Integrates Heterogeneous Data Sources Using Dilated Causal Convolution Networks With Attention

00:19:05

0 views

The purpose of this research is to develop a high performing model for stock movement prediction utilizing financial indicators and news data. Until recently, the majority of prediction models have employed only the financial indicators, but they possess

Exploring Pre-Training With Alignments For RNN Transducer Based End-To-End Speech Recognition

00:11:28

0 views

Recently, the recurrent neural network transducer (RNN-T) architecture has become an emerging trend in end-to-end automatic speech recognition research due to its capability for online streaming speech recognition. However, RNN-T training

Fast Direction-Of-Arrival Estimation Of Multiple Targets Using Deep Learning And Sparse Arrays

00:16:20

1 view

In this work, we focus on improving the Direction-of-Arrival (DoA) estimation of multiple targets/sources from a small number of snapshots. Estimation via the sample covariance matrix is known to perform poorly, since the true manifold structure is not re

CRA: A Generic Compression Ratio Adapter For End-To-End Data-Driven Image Compressive Sensing Reconstruction Frameworks

00:13:38

0 views

End-to-end data-driven image compressive sensing reconstruction (EDCSR) frameworks achieve state-of-the-art reconstruction performance in terms of reconstruction speed and accuracy. However, due to their end-to-end nature, existing EDCSR frameworks can no

Slow-Time MIMO-FMCW Automotive Radar Detection With Imperfect Waveform Separation

00:16:23

0 views

This paper considers object detection in the case of imperfect waveform separation, in the context of automotive radars that employ a slow-time MIMO-FMCW signaling scheme. We develop an explicit signal model that accounts for waveform separation residuals
