IEEE ICASSP 2020 Virtual Conference May 2020

Thu, 16 July, 2020

IEEE Member: US $11.00
Society Member: US $0.00
IEEE Student Member: US $11.00
Non-IEEE Member: US $15.00

Purchase

Subjective Quality Estimation Using PESQ For Hands-Free Terminals

00:15:07

0 views

Previous reports have mentioned the possibility that subjective quality of the echo-suppressed speech signal can be estimated based on perceptual evaluation of speech quality (PESQ), but there are few experimental results. We propose third-party listening

Prediction Of Voicing And The F0 Contour From Electromagnetic Articulography Data For Articulation-To-Speech Synthesis

00:13:36

0 views

Articulation-to-speech synthesis based solely on supraglottal articulation requires some sort of intonation control. This paper examines to what extent the f0 contour of an utterance can be predicted from such supraglottal articulation data. To that end,

Bayesian Multiple Change-Point Detection With Limited Communication

00:14:54

0 views

Several modern applications involve large-scale sensor networks for statistical inference. For example, such sensor networks are of significant interest for Internet of Things applications. In this paper, we consider Bayesian multiple change-point detecti

Hand-3D-Studio: A New Multi-View System For 3D Hand Reconstruction

00:12:18

0 views

This paper proposes a new system named as Hand-3D-Studio to capture the 3D hand pose and shape information. Our system includes 15 synchronized DSLR cameras, which can acquire high quality multi-view 4K resolution color images in a circular manner. We the

Source Enumeration Via Toeplitz Matrix Completion

00:13:48

5 views

This paper addresses the problem of source enumeration by an array of sensors in the presence of noise whose spatial covariance structure is a diagonal matrix with possibly different variances, referred to as non-iid noise hereafter, when the sources are unc

Multimodal Transformer Fusion For Continuous Emotion Recognition

00:13:29

0 views

Multimodal fusion increases the performance of emotion recognition because of the complementarity of different modalities. Compared with decision level and feature level fusion, model level fusion makes better use of the advantages of deep neural networks

End-To-End Training Of Time Domain Audio Separation And Recognition

00:13:52

0 views

The rising interest in single-channel multi-speaker speech separation sparked development of End-to-End (E2E) approaches to multispeaker speech recognition. However, up until now, state-of-the-art neural network-based time domain source separation has not

Interrupted And Cascaded Permutation Invariant Training For Speech Separation

00:11:52

0 views

Permutation Invariant Training (PIT) has long been a stepping stone method for training speech separation model in handling the label ambiguity problem. With PIT selecting the minimum cost label assignments dynamically, very few studies considered the sep

DNN-Based Distributed Multichannel Mask Estimation For Speech Enhancement In Microphone Arrays

00:13:06

0 views

Multichannel processing is widely used for speech enhancement but several limitations appear when trying to deploy these solutions in the real world. Distributed sensor arrays that consider several devices with a few microphones is a viable solution which

Performance Analysis For Path Attenuation Estimation Of Microwave Signals Due To Rainfall And Beyond

00:14:03

0 views

The attenuation of microwave signals can be used for meteorological observations. For example, the received signal level (RSL) of backhaul links of cellular systems, which usually has quantization error of 0.1 dB or more for commercial systems, has been u

A Bidirectional Context Propagation Network For Urine Sediment Particle Detection In Microscopic Images

00:12:27

0 views

The microscopic urine sediment examination is a crucial part in the evaluation of renal and urinary tract diseases. Recently, there are emerging CNNs-based detectors to detect the urine sediment particles in an end-to-end manner. However, it is not very c

Hierarchical Attention Transfer Networks For Depression Assessment From Speech

00:10:39

0 views

A growing area of mental health research is the search for speech-based objective markers for conditions such as depression. However, when combined with machine learning, this search can be challenging due to a limited amount of annotated training data. I

Group-Utility Metric For Efficient Sensor Selection And Removal In LCMV Beamformers

00:16:28

0 views

In sensor arrays or sensor networks, tracking each sensor's utility helps in excluding those which do not sufficiently contribute to the task at hand, thereby reducing energy consumption or avoiding model overfitting. In a linearly-constrained minimum var

Optimal Sampling Rate And Bandwidth Of Bandlimited Signals - An Algorithmic Perspective

00:13:08

0 views

The bandwidth of a bandlimited signal is a key quantity that is relevant in numerous applications. For example, it determines the minimum sampling rate that is necessary to reconstruct a bandlimited signal from its samples. In this paper we study if it is

Learning Spatio-Temporal Representations With Temporal Squeeze Pooling

00:09:31

0 views

In this paper, we propose a new video representation learning method, named Temporal Squeeze (TS) pooling, which can extract the essential movement information from a long sequence of video frames and map it into a set of few images, named Squeezed Images

Improving Sample-Efficiency In Reinforcement Learning For Dialogue Systems By Using Trainable-Action-Mask

00:16:04

0 views

By interacting with human and learning from reward signals, reinforcement learning is an ideal way to build conversational AI. Concerning the expenses of real-users' responses, improving sample-efficiency has been the key issue when applying reinforcement

Encoding Temporal Information For Automatic Depression Recognition From Facial Analysis

00:17:48

0 views

Depression is a mental illness that may be harmful to an individual's health. Using deep learning models to recognize the facial expressions of individuals captured in videos has shown promising results for automatic depression detection. Typically, depre

Cross Lingual Transfer Learning For Zero-Resource Domain Adaptation

00:16:53

0 views

We propose a method for zero-resource domain adaptation of DNN acoustic models, for use in low-resource situations where the only in-language training data available may be poorly matched to the intended target domain. Our method uses a multi-lingual mode

Joint Scheduling And Beamforming For Delay Sensitive Traffic With Priorities And Deadlines

00:13:54

0 views

Packet scheduling in 5G networks can significantly affect the performance of beamforming techniques since the allocation of multiple users to the same time-frequency block causes interference between users. A combination of beamforming and scheduling ca

Self-Driven Graph Volterra Models For Higher-Order Link Prediction

00:14:23

0 views

Link prediction is one of the core problems in network and data science with widespread applications. While predicting pairwise nodal interactions (links) in network data has been investigated extensively, predicting higher-order interactions (higher-orde

Blind Bounded Source Separation Using Neural Networks With Local Learning Rules

00:13:00

1 view

An important problem encountered by both natural and engineered signal processing systems is blind source separation. In many instances of the problem, the sources are bounded by their nature and known to be so, even though the particular bound may not be

Robust Covariance Matrix Estimation And Portfolio Allocation: The Case Of Non-Homogeneous Assets

00:14:12

0 views

This paper presents how the most recent improvements made on covariance matrix estimation and model order selection can be applied to the portfolio optimisation problem. The particular case of the Maximum Variety Portfolio is treated but the same improvem

A Deep Learning Approach To Object Affordance Segmentation

00:12:41

0 views

Learning to understand and infer object functionalities is an important step towards robust visual intelligence. Significant research efforts have recently focused on segmenting the object parts that enable specific types of human-object interaction, the

Alignment-Length Synchronous Decoding For RNN Transducer

00:15:46

0 views

We present a beam decoding strategy for recurrent neural network transducers which has the characteristic that all competing hypotheses within the beam have the same alignment length (number of output symbols plus BLANK symbols). We contrast the proposed

Generalized Spatial Modulation For Wireless Terabits Systems Under Sub-THz Channel With RF Impairments

00:12:05

0 views

Multiple-Input Multiple-Output (MIMO) technique with Index Modulation (IM) over sub-TeraHertz (sub-THz) bands represent a promising solution to design new wireless ultra-high data rate systems. However, the system design over sub-THz bands suffers from ma

Towards Real-Time Single-Channel Singing-Voice Separation With Pruned Multi-Scaled DenseNets

00:15:19

0 views

Modern musical source separation systems based on deep neural networks reach unprecedented levels of separation quality. However, harnessing the power of these large-scale models in typical audio production environments, which frequently offer only limite

Hierarchical Federated Learning Across Heterogeneous Cellular Networks

00:12:12

0 views

We consider federated edge learning (FEEL), where mobile users (MUs) collaboratively learn a global model by sharing local updates on the model parameters rather than their datasets, with the help of a mobile base station (MBS). We optimize the resource a

End-To-End Auditory Object Recognition Via Inception Nucleus

[2 Videos ]

Machine learning approaches to auditory object recognition are traditionally based on engineered features such as those derived from the spectrum or cepstrum. More recently, end-to-end classification systems in image and auditory recognition systems have

End-To-End Auditory Object Recognition Via Inception Nucleus

00:13:15

0 views

End-To-End Auditory Object Recognition Via Inception Nucleus

00:14:50

0 views

Deep Contextualized Acoustic Representations For Semi-Supervised Speech Recognition

00:11:11

0 views

We propose a novel approach to semi-supervised automatic speech recognition (ASR). We first exploit a large amount of unlabeled audio data via representation learning, where we reconstruct an unseen temporal slice of filterbank features from past and futu

Modeling Uncertainty In Predicting Emotional Attributes From Spontaneous Speech

00:14:01

0 views

A challenging task in affective computing is to build reliable speech emotion recognition (SER) systems that can accurately predict emotional attributes from spontaneous speech. To increase the trust in these SER systems, it is important to predict not on

Modeling Behavioral Consistency In Large-Scale Wearable Recordings Of Human Bio-Behavioral Signals

00:18:01

1 view

Continuously-worn wearable sensors provide an unprecedented opportunity to unobtrusively measure rich bio-behavioral time-series recordings in natural settings such as the workplace. These time-series data can be helpful in inferring broad patterns of beh

Application Informed Motion Signal Processing For Finger Motion Tracking Using Wearable Sensors

00:12:31

0 views

Finger motion tracking has applications in user-interfaces, sports analytics, medical rehabilitation and sign language translation. This paper presents a system called FinGTrAC that shows the feasibility of fine grained finger gesture tracking using low i

High-Accuracy And Low-Latency Speech Recognition With Two-Head Contextual Layer Trajectory LSTM Model

00:14:32

2 views

While the community keeps promoting end-to-end models over conventional hybrid models, which usually are long short-term memory (LSTM) models trained with a cross entropy criterion followed by a sequence discriminative training criterion, we argue that su

A New Variational Method For Deep Supervised Semantic Image Hashing

00:11:50

0 views

We present a supervised semantic hashing method which uses a variational autoencoder to represent each database image sample as a product Bernoulli distribution. We show that the probability parameters approach extreme values during training, allowing the

From Symbols To Signals: Symbolic Variational Autoencoders

00:14:21

0 views

We introduce Symbolic Variational Autoencoders which generate images from symbols that represent semantic concepts. Unlike generic Variational Autoencoders (VAEs) or Generative Adversarial Networks (GANs), the latent distribution from the Symbolic Variati

Low-Complexity Fixed-Point Convolutional Neural Networks For Automatic Target Recognition

00:11:05

0 views

There has been growing interest in developing neural network based automatic target recognition systems for synthetic aperture radar applications. However, these networks are typically complex in terms of storage and computation which inhibits their deplo

Learning Multi-Scale Attentive Features For Series Photo Selection

00:12:06

0 views

People used to take a series of nearly identical photos about the same subject, but it is usually a tedious chore to select the reversed ones from them. Despite the remarkable progress, most existing studies on image aesthetics assessment fail to fulfill

End-To-End Generation Of Talking Faces From Noisy Speech

00:12:10

0 views

Acoustic cues are not the only component in speech communication; if the visual counterpart is present, it is shown to benefit speech comprehension. In this work, we propose an end-to-end (no pre- or post-processing) system that can generate talking faces

A Study On The Transferability Of Adversarial Attacks In Sound Event Classification

00:13:38

0 views

An adversarial attack is an algorithm that perturbs the input of a machine learning model in an intelligent way in order to change the output of the model. An important property of adversarial attacks is transferability. According to this property, it is

An Alternative Signature Design Using L1 Principal Components For Spread-Spectrum Steganography

00:10:05

0 views

As methods for detecting hidden data evolve, there exists an ever increasing need to develop new steganographic solutions. This paper introduces novel spread spectrum (SS) and improved spread spectrum (ISS) multimedia data embedding techniques using L_1 pr

Single Frequency Filter Bank Based Long-Term Average Spectra For Hypernasality Detection And Assessment In Cleft Lip And Palate Speech

00:12:02

0 views

Hypernasality is an abnormality in speech production observed in subjects with craniofacial anomalies like cleft lip and palate (CLP). Detection and assessment of hypernasality is a primary step in the clinical diagnosis of individuals with CLP. Existing

FG2Seq: Effectively Encoding Knowledge For End-To-End Task-Oriented Dialog

00:11:45

0 views

End-to-end Task-oriented spoken dialog systems typically require modeling two types of inputs, namely, the dialog history which is a sequence of utterances and the knowledge base (KB) associated with the dialog history. While modeling these inputs, curren

TensorFlow Audio Models In Essentia

00:14:54

2 views

Essentia is a reference open-source C++/Python library for audio and music analysis. In this work, we present a set of algorithms that employ TensorFlow in Essentia, allow predictions with pre-trained deep learning models, and are designed to offer flexib

Enhancing End-To-End Multi-Channel Speech Separation Via Spatial Feature Learning

00:13:18

0 views

Hand-crafted spatial features (e.g., inter-channel phase difference, IPD) play a fundamental role in recent deep learning based multi-channel speech separation (MCSS) methods. However, these manually designed spatial features are hard to incorporate into

Joint Beamforming And Reverberation Cancellation Using A Constrained Kalman Filter With Multichannel Linear Prediction

00:15:42

0 views

The performance of speech processing systems degrades significantly in far-field scenarios where the distance between the user and microphones increases, leading to low signal-to-noise and signal-to-reverberation ratios. To address this challenge, combini

Trapezoidal Segment Sequencing: A Novel Approach For Fusion Of Human-Produced Continuous Annotations

00:14:06

0 views

Generating accurate ground truth representations of human subjective experiences and judgements is essential for advancing our understanding of human-centered constructs such as emotions. Often, this requires the collection and fusion of annotations from

Classification Of Epileptic iEEG Signals By CNN And Data Augmentation

00:12:18

0 views

Epileptic focus localization in patients with epileptic seizures is essential when surgery is needed. Recent studies show that this can be done automatically using machine learning approaches. However, well-designed feature extraction methods are often co

Efficient And Scalable Neural Residual Waveform Coding With Collaborative Quantization

00:14:22

0 views

Scalability and efficiency are desired in neural speech codecs, which supports a wide range of bitrates for applications on various devices. We propose a collaborative quantization (CQ) scheme to jointly learn the codebook of LPC coefficients and the corr

Learning A Subword Inventory Jointly With End-To-End Automatic Speech Recognition

00:14:44

0 views

Recent work has demonstrated the promise of using subword units as output targets for sequence-to-sequence automatic speech recognition (ASR) models. Our work builds on the latent sequence decomposition (LSD) framework, in which the use of subword units f
