We consider decentralized consensus optimization when workers sample data from non-identical distributions and perform variable amounts of work due to slow nodes known as stragglers. The problem of…
Speaker separation refers to the task of separating a mixture signal comprising two or more speakers. Impressive advances have been made recently in deep learning based talker-independent speaker…
We propose a simple but efficient method termed Guided Learning for weakly-labeled semi-supervised sound event detection (SED). There are two sub-targets implied in weakly-labeled SED: audio tagging…
Heading direction information is crucial to many ubiquitous computing applications. The mainstream approach has been to resort to inertial sensors, such as accelerometers, gyroscopes, and magnetometers, which…
Robust and computationally efficient anomaly detection in videos is a problem in video surveillance systems. We propose a technique to increase robustness and reduce computational complexity in a…

This video describes the types of IEEE Article Processing Charges and ordering reprints.

Recently, there has been growth in providers of speech transcription services enabling others to leverage technology they would not normally be able to use. As a result, speech-enabled solutions have…
We propose an approach for pre-training speech representations via a masked reconstruction loss. Our pre-trained encoder networks are bidirectional and can therefore be used directly in typical…
This paper presents the design and implementation of an Automatic Gain Control (AGC) embedded algorithm for photoplethysmographic (PPG) sensors. We use a number of statistical and spectral…
The interconnection of social, email, and media platforms enables adversaries to manipulate networked data and promote their malicious intents. This paper introduces graph neural network…
Unmanned aerial vehicles (UAVs) can be utilized as aerial base stations to provide communication service for remote mobile users due to their high mobility and flexible deployment. However, the line-…
This paper describes the winning systems developed by the BUT team for the four tracks of the second DIHARD speech diarization challenge. For tracks 1 and 2 the systems were mainly based on…
Recent studies have highlighted adversarial examples as ubiquitous threats to the deep neural network (DNN) based speech recognition systems. In this work, we present a U-Net based attention model, U…
In this work, we introduce a new procedure for applying Restricted Boltzmann Machines (RBMs) to missing data inference tasks, based on linearization of the effective energy function governing the…
Traditionally, audio-visual automatic speech recognition has been studied under the assumption that the speaking face on the visual signal is the face matching the audio. However, in a more realistic…
For pixel-level crowd understanding, it is time-consuming and laborious in data collection and annotation. Some domain adaptation algorithms try to liberate it by training models with synthetic data,…
This paper deals with positive semidefinite matrix factorization (PSDMF). PSDMF writes each entry of a nonnegative matrix as the inner product of two symmetric positive semidefinite matrices. PSDMF…
Reversible residual network naturally extends the linear lifting scheme with no theoretic guarantee. In this paper, we propose a reversible autoencoder (Rev-AE) with this extended non-linear lifting…
We propose equalization-based data detection algorithms for all-digital millimeter-wave (mmWave) massive multiuser multiple-input multiple-output (MU-MIMO) systems that exploit sparsity in the beamspace…
Deep convolutional neural networks (CNNs) have demonstrated superior performance on the image super-resolution (SR) problem. However, CNNs are known to be heavily over-parameterized, and suffer from…
Suicide is a major societal challenge globally, with a wide range of risk factors, from individual health, psychological and behavioral elements to socio-economic aspects. Military personnel, in…
End-to-end (E2E) models fold the acoustic, pronunciation and language models of a conventional speech recognition model into one neural network with a much smaller number of parameters than a…
The manual design of analog circuits is a tedious task of parameter tuning that requires hours of work by human experts. In this work, we make a significant step towards a fully automatic design…