IEEE ICASSP 2020 Virtual Conference May 2020

Thu, 16 July, 2020

IEEE Member: US $11.00
Society Member: US $0.00
IEEE Student Member: US $11.00
Non-IEEE Member: US $15.00

Towards Real-Time Single-Channel Singing-Voice Separation With Pruned Multi-Scaled Densenets

00:15:19

0 views

Modern musical source separation systems based on deep neural networks reach unprecedented levels of separation quality. However, harnessing the power of these large-scale models in typical audio production environments, which frequently offer only limite

Hierarchical Federated Learning Across Heterogeneous Cellular Networks

00:12:12

0 views

We consider federated edge learning (FEEL), where mobile users (MUs) collaboratively learn a global model by sharing local updates on the model parameters rather than their datasets, with the help of a mobile base station (MBS). We optimize the resource a

End-To-End Auditory Object Recognition Via Inception Nucleus

[2 Videos]

Machine learning approaches to auditory object recognition are traditionally based on engineered features such as those derived from the spectrum or cepstrum. More recently, end-to-end classification systems in image and auditory recognition systems have

End-To-End Auditory Object Recognition Via Inception Nucleus

00:13:15

0 views

End-To-End Auditory Object Recognition Via Inception Nucleus

00:14:50

0 views

Deep Contextualized Acoustic Representations For Semi-Supervised Speech Recognition

00:11:11

0 views

We propose a novel approach to semi-supervised automatic speech recognition (ASR). We first exploit a large amount of unlabeled audio data via representation learning, where we reconstruct an unseen temporal slice of filterbank features from past and futu

Modeling Uncertainty In Predicting Emotional Attributes From Spontaneous Speech

00:14:01

0 views

A challenging task in affective computing is to build reliable speech emotion recognition (SER) systems that can accurately predict emotional attributes from spontaneous speech. To increase the trust in these SER systems, it is important to predict not on

Modeling Behavioral Consistency In Large-Scale Wearable Recordings Of Human Bio-Behavioral Signals

00:18:01

1 view

Continuously-worn wearable sensors provide an unprecedented opportunity to unobtrusively measure rich bio-behavioral time-series recordings in natural settings such as the workplace. These time-series data can be helpful in inferring broad patterns of beh

Application Informed Motion Signal Processing For Finger Motion Tracking Using Wearable Sensors

00:12:31

0 views

Finger motion tracking has applications in user-interfaces, sports analytics, medical rehabilitation and sign language translation. This paper presents a system called FinGTrAC that shows the feasibility of fine grained finger gesture tracking using low i

High-Accuracy And Low-Latency Speech Recognition With Two-Head Contextual Layer Trajectory LSTM Model

00:14:32

2 views

While the community keeps promoting end-to-end models over conventional hybrid models, which usually are long short-term memory (LSTM) models trained with a cross entropy criterion followed by a sequence discriminative training criterion, we argue that su

A New Variational Method For Deep Supervised Semantic Image Hashing

00:11:50

0 views

We present a supervised semantic hashing method which uses a variational autoencoder to represent each database image sample as a product Bernoulli distribution. We show that the probability parameters approach extreme values during training, allowing the

From Symbols To Signals: Symbolic Variational Autoencoders

00:14:21

0 views

We introduce Symbolic Variational Autoencoders which generate images from symbols that represent semantic concepts. Unlike generic Variational Autoencoders (VAEs) or Generative Adversarial Networks (GANs), the latent distribution from the Symbolic Variati

Low-Complexity Fixed-Point Convolutional Neural Networks For Automatic Target Recognition

00:11:05

0 views

There has been growing interest in developing neural network based automatic target recognition systems for synthetic aperture radar applications. However, these networks are typically complex in terms of storage and computation which inhibits their deplo

Learning Multi-Scale Attentive Features For Series Photo Selection

00:12:06

0 views

People used to take a series of nearly identical photos about the same subject, but it is usually a tedious chore to select the reversed ones from them. Despite the remarkable progress, most existing studies on image aesthetics assessment fail to fulfill

End-To-End Generation Of Talking Faces From Noisy Speech

00:12:10

0 views

Acoustic cues are not the only component in speech communication; if the visual counterpart is present, it is shown to benefit speech comprehension. In this work, we propose an end-to-end (no pre- or post-processing) system that can generate talking faces

A Study On The Transferability Of Adversarial Attacks In Sound Event Classification

00:13:38

0 views

An adversarial attack is an algorithm that perturbs the input of a machine learning model in an intelligent way in order to change the output of the model. An important property of adversarial attacks is transferability. According to this property, it is

An Alternative Signature Design Using L1 Principal Components For Spread-Spectrum Steganography

00:10:05

0 views

As methods for detecting hidden data evolve, there exists an ever-increasing need to develop new steganographic solutions. This paper introduces novel spread spectrum (SS) and improved spread spectrum (ISS) multimedia data embedding techniques using L_1 pr

Single Frequency Filter Bank Based Long-Term Average Spectra For Hypernasality Detection And Assessment In Cleft Lip And Palate Speech

00:12:02

0 views

Hypernasality is an abnormality in speech production observed in subjects with craniofacial anomalies like cleft lip and palate (CLP). Detection and assessment of hypernasality is a primary step in the clinical diagnosis of individuals with CLP. Existing

FG2Seq: Effectively Encoding Knowledge For End-To-End Task-Oriented Dialog

00:11:45

0 views

End-to-end task-oriented spoken dialog systems typically require modeling two types of inputs, namely, the dialog history, which is a sequence of utterances, and the knowledge base (KB) associated with the dialog history. While modeling these inputs, curren

TensorFlow Audio Models In Essentia

00:14:54

2 views

Essentia is a reference open-source C++/Python library for audio and music analysis. In this work, we present a set of algorithms that employ TensorFlow in Essentia, allow predictions with pre-trained deep learning models, and are designed to offer flexib

Enhancing End-To-End Multi-Channel Speech Separation Via Spatial Feature Learning

00:13:18

0 views

Hand-crafted spatial features (e.g., inter-channel phase difference, IPD) play a fundamental role in recent deep learning based multi-channel speech separation (MCSS) methods. However, these manually designed spatial features are hard to incorporate into

Joint Beamforming And Reverberation Cancellation Using A Constrained Kalman Filter With Multichannel Linear Prediction

00:15:42

0 views

The performance of speech processing systems degrades significantly in far-field scenarios where the distance between the user and microphones increases, leading to low signal-to-noise and signal-to-reverberation ratios. To address this challenge, combini

Trapezoidal Segment Sequencing: A Novel Approach For Fusion Of Human-Produced Continuous Annotations

00:14:06

0 views

Generating accurate ground truth representations of human subjective experiences and judgements is essential for advancing our understanding of human-centered constructs such as emotions. Often, this requires the collection and fusion of annotations from

Classification Of Epileptic iEEG Signals By CNN And Data Augmentation

00:12:18

0 views

Epileptic focus localization in patients with epileptic seizures is essential when surgery is needed. Recent studies show that this can be done automatically using machine learning approaches. However, well-designed feature extraction methods are often co

Efficient And Scalable Neural Residual Waveform Coding With Collaborative Quantization

00:14:22

0 views

Scalability and efficiency are desired in neural speech codecs, which support a wide range of bitrates for applications on various devices. We propose a collaborative quantization (CQ) scheme to jointly learn the codebook of LPC coefficients and the corr

Learning A Subword Inventory Jointly With End-To-End Automatic Speech Recognition

00:14:44

0 views

Recent work has demonstrated the promise of using subword units as output targets for sequence-to-sequence automatic speech recognition (ASR) models. Our work builds on the latent sequence decomposition (LSD) framework, in which the use of subword units f

Rate Assignment In 360-Degree Video Tiled Streaming Using Random Forest Regression

00:14:10

0 views

Streaming of high-resolution 360-degree video is typically done in a viewport-dependent fashion such as in the tile-based viewport-dependent profile of MPEG OMAF wherein clients continuously adapt their tile selection according to the user viewport. From

Raw Waveform Based End-To-End Deep Convolutional Network For Spatial Localization Of Multiple Acoustic Sources

00:16:05

1 view

In this paper, we present an end-to-end deep convolutional neural network operating on multi-channel raw audio data to localize multiple simultaneously active acoustic sources in space. Previously reported deep learning based approaches work well in local

Image Segmentation Based Privacy-Preserving Human Action Recognition For Anomaly Detection

00:14:10

0 views

Human Action Recognition and Anomaly Detection significantly improved automatic video analysis, assisted living, and video-based surveillance. The focus of this work is on those applications where privacy protection is required, such as surveillance and a

A Multi-Scaled Receptive Field Learning Approach For Medical Image Segmentation

00:13:27

0 views

Biomedical image segmentation has been widely studied, and lots of methods have been proposed. Among these methods, attention U-Net has achieved a promising performance. However, it has drawbacks of extracting the multi-scaled receptive field features at

Control Of Linear Dynamical Systems Using Sparse Inputs

[2 Videos]

In this work, we consider control of linear dynamical systems using sparse inputs. We provide an algorithm for determining a sequence of sparse inputs that will take the system from any given initial state to a desired final state, and stay in that state

Control Of Linear Dynamical Systems Using Sparse Inputs

00:12:25

0 views

Control Of Linear Dynamical Systems Using Sparse Inputs

00:12:25

0 views

Study Of Formant Modification For Children ASR

00:12:44

0 views

The performance of automatic speech recognition systems for children's speech is known to suffer from the large variation and mismatch in the acoustic and linguistic attributes between children's and adults' speech. One of the various identified sources o

Learning-Aided Content Placement In Caching-Enabled Fog Computing Systems Using Thompson Sampling

00:12:24

0 views

In this paper, we focus on the problem of online content placement with unknown content popularity in caching-enabled fog computing systems, i.e., how to decide and update cached content on resource-limited edge fog nodes to maximize cache hit rate and mi

Epoch Extraction From A Speech Signal Using Gammatone Wavelets In A Scattering Network

00:13:13

0 views

In speech production, epochs are glottal closure instants where significant energy is released from the lungs. Extracting an epoch accurately is important in speech synthesis, analysis, and pitch oriented studies. The time-varying characteristics of the s

Super-Resolution With Noisy Measurements: Reconciling Upper And Lower Bounds

00:14:44

1 view

This paper considers the problem of lower bounding the mean-squared-error (MSE) of unbiased super-resolution estimates. In the literature, only upper bounds on the MSE are available which scale with the so-called super-resolution factor (SRF). However, the up

A Model-Based Deep Network For MRI Reconstruction Using Approximate Message Passing Algorithm

00:12:03

0 views

We propose a novel model-based network to reconstruct the magnetic resonance (MR) image. In this network, the Approximate Message Passing (AMP) algorithm is unrolled to solve the optimization problem of compressed sensing MR imaging, and several CNN block

Achieving The Capacity Of The DNA Storage Channel

00:14:42

454 views

Significant advances in biochemical technologies, such as synthesizing and sequencing devices, have made DNA a competitive medium for archival data storage. In this paper we analyze storage systems based on these macromolecules from an information theoret

Filtering Out Time-Frequency Areas Using Gabor Multipliers

00:15:56

0 views

We address the problem of filtering out localized time-frequency components in signals. The problem is formulated as a minimization of a suitable quadratic form, that involves a data fidelity term on the short-time Fourier transform outside the support of

Cramer-Rao Bound On DOA Estimation Of Finite Bandwidth Signals Using A Moving Sensor

00:14:28

0 views

In this paper, we provide a framework for the direction of arrival (DOA) estimation using a moving sensor and evaluate performance bounds on estimation. We introduce a signal model which captures spatio-temporal incoherency in the received signal due to s

Exploiting Vocal Tract Coordination Using Dilated CNNs For Depression Detection In Naturalistic Environments

00:14:54

477 views

Depression detection from speech continues to attract significant research attention but remains a major challenge, particularly when the speech is acquired from diverse smartphones in natural environments. Analysis methods based on vocal tract coordinati

Boosted Locality Sensitive Hashing: Discriminative Binary Codes For Source Separation

00:14:59

0 views

Speech enhancement tasks have seen significant improvements with the advance of deep learning technology, but at the cost of increased computational complexity. In this study, we propose an adaptive boosting approach to learning locality sensitive hash

Shadow Removal Of Text Document Images By Estimating Local And Global Background Colors

00:12:01

0 views

This paper proposes a simple yet effective method for removing shadows from text document images. Assuming that the document mainly contains texts, our method estimates the global and local background colors using statistical analysis of the whole image a

Gaussian Processes Over Graphs

00:15:40

0 views

Kernel Regression over Graphs (KRG) was recently proposed for predicting graph signals in a supervised learning setting, where the inputs are agnostic to the graph. The KRG model predicts targets that are smooth graph signals over the given graph, given th

A Comprehensive Framework For 2D-JND Extension To 360-Deg Images

00:10:29

0 views

Masking effect is one of the most important perceptual properties that could be modeled by estimating an adaptive threshold known as the just noticeable difference (JND) referring to the maximum difference not perceived by the human visual system (HVS). I

Eigenbeam-ESPRIT For DOA-Vector Estimation

00:12:02

0 views

Several techniques exist to estimate the directions of arrival (DOAs) of sound sources captured with a spherical microphone array. The eigenbeam rotational invariance technique (EB-ESPRIT) uses recurrence relations of spherical harmonics to estimate the D

Ertis: Real-Time 3D Acoustic Sonar Imaging Using Sparse Microphone Arrays

00:09:06

1 view

In recent years, our research group has developed state of the art 3D sonar sensors which use a low-cost MEMS microphone array for real-time acoustic imaging in air. Using this sensor, various robotic applications have been developed, including obstacle a

A Novel Method For Obtaining Diffuse Field Measurements For Microphone Calibration

00:05:48

0 views

NOVELTY OF THE DEMO: Is it possible to obtain a diffused field response of a microphone array and perform calibration in under a minute? If such a method exists, is it possible to achieve an accuracy of half a dB from the expected response? The answer to
