- IEEE Member: US $11.00
- Society Member: US $0.00
- IEEE Student Member: US $11.00
- Non-IEEE Member: US $15.00
Processing Convolutional Neural Networks On Cache
With the advent of Big Data application domains, several Machine Learning (ML) signal-processing algorithms such as Convolutional Neural Networks (CNNs) are required to process progressively larger datasets at a great cost in terms of both compute power a
ViMo: Vital Sign Monitoring Using Commodity Millimeter Wave Radio
Accurate monitoring of human vital signs (e.g. breathing and heart rates) is crucial in detecting medical problems. In this paper, we propose ViMo, a calibration-free remote Vital sign Monitoring system that can simultaneously monitor multiple users by le
A Recursive Bayesian Solution For The Excess Over Threshold Distribution With Stochastic Parameters
In this paper, we propose a new approach for analyzing extreme values that are witnessed in financial markets. Our goal is to compute the predictive distribution of extreme events that are clustered in time and, as opposed to modeling just the maximum of
Computing Hilbert Transform And Spectral Factorization For Signal Spaces Of Smooth Functions
Although the Hilbert transform and the spectral factorization are of central importance in signal processing, both operations can generally not be calculated in closed form. Therefore, algorithmic solutions are prevalent which provide an approximation of
Content Based Singing Voice Extraction From A Musical Mixture
We present a deep learning based methodology for extracting the singing voice signal from a musical mixture based on the underlying linguistic content. Our model follows an encoder decoder architecture and takes as input the magnitude component of the spe
Line Spectral Estimation With Palindromic Kernels
Estimation of line spectra is a classical problem in signal processing and arises in many applications. The problem is to estimate the frequencies and corresponding amplitudes of a sum of (possibly complex-valued) sinusoidal components from noisy measurem
Confidence Estimation For Black Box Automatic Speech Recognition Systems Using Lattice Recurrent Neural Networks
Recently, there has been growth in providers of speech transcription services enabling others to leverage technology they would not normally be able to use. As a result, speech-enabled solutions have become commonplace. Their success critically relies on
Clutter Identification Based On Sparse Recovery And L1-Type Probabilistic Distance Measures
The cognitive radar framework has recently been proposed in radar signal processing to develop algorithms for target detection, tracking, and waveform design in the presence of nonstationary environmental (clutter) characteristics. In this framework, there a
Spoken Document Retrieval Leveraging BERT-Based Modeling And Query Reformulation
Spoken document retrieval (SDR) has long been deemed a fundamental and important step towards efficient organization of, and access to multimedia associated with spoken content. In this paper, we present a novel study of SDR leveraging the Bidirectional E
Efficient Image Super Resolution Via Channel Discriminative Deep Neural Network Pruning
Deep convolutional neural networks (CNNs) have demonstrated superior performance in the image super-resolution (SR) problem. However, CNNs are known to be heavily over-parameterized, and suffer from abundant redundancy. The growing size of CNNs may be incompatib
Attention Driven Fusion For Multi-Modal Emotion Recognition
Deep learning has emerged as a powerful alternative to hand-crafted methods for emotion recognition on combined acoustic and text modalities. Baseline systems model emotion information in text and acoustic modes independently using Deep Convolutional Neur
EMET: Embeddings From Multilingual-Encoder Transformer For Fake News Detection
In the last few years, social media networks have changed human life experience and behavior as they have broken down communication barriers, allowing ordinary people to actively produce multimedia content on a massive scale. In this way, the information di
Statistics Pooling Time Delay Neural Network Based On X-Vector For Speaker Verification
This paper aims to improve speaker embedding representation based on x-vector for extracting more detailed information for speaker verification. We propose a statistics pooling time delay neural network (TDNN), in which the TDNN structure integrates stati
Large-Scale Fading Precoding For Maximizing The Product Of SINRs
This paper considers the large-scale fading precoding design for mitigating the pilot contamination in the downlink of multi-cell massive MIMO (multiple-input multiple-output) systems. Rician fading with spatially correlated channels is considered where
ADRN: Attention-Based Deep Residual Network For Hyperspectral Image Denoising
Hyperspectral image (HSI) denoising is of crucial importance for many subsequent applications, such as HSI classification and interpretation. In this paper, we propose an attention-based deep residual network to directly learn a mapping from noisy HSI to
VaPar Synth - A Variational Parametric Model For Audio Synthesis
With the advent of data-driven statistical modeling and abundant computing power, researchers are turning increasingly to deep learning for audio synthesis. These methods try to model audio signals directly in the time or frequency domain. In the interest
Sequential Joint Detection And Estimation With An Application To Joint Symbol Decoding And Noise Power Estimation
Jointly testing multiple hypotheses and estimating a random parameter of the underlying model is investigated in a sequential setup. The optimal scheme is designed such that it minimizes the expected number of used samples while keeping the probabilities
Automatic Epileptic Seizure Onset-Offset Detection Based On CNN In Scalp EEG
We establish a deep learning-based method to automatically detect the epileptic seizure onsets and offsets in multi-channel electroencephalography (EEG) signals. A convolutional neural network (CNN) is designed to identify occurrences of seizures in EEG e
Robust Fundamental Frequency Estimation In Coloured Noise
Most parametric fundamental frequency estimators make the implicit assumption that any corrupting noise is additive, white Gaussian. Under this assumption, the maximum likelihood (ML) and the least squares estimators are the same, and statistically effici
Saliency-Based Image Contrast Enhancement With Reversible Data Hiding
Reversible data hiding (RDH) has become a hot research area in recent years due to its wide applications such as authentication. Among all the RDH methods proposed, contrast enhancement based reversible data hiding is one that was recently proposed. H
Spectrum Allocation In Wireless Networks For Crowd Labelling
The massive sensing data generated by the Internet-of-Things will provide fuel for ubiquitous artificial intelligence (AI), while a tremendous number of labels are required for AI model training via supervised learning. To tackle this challenge, a novel framework of wire
Staged Training Strategy And Multi-Activation For Audio Tagging With Noisy And Sparse Multi-Label Data
Audio tagging aims to predict whether certain acoustic events occur in the audio clips. Due to the difficulty and huge cost of obtaining manually labeled data with high confidence, researchers have begun to focus on audio tagging using a small set of manually-
Encoding And Decoding Mixed Bandlimited Signals Using Spiking Integrate-And-Fire Neurons
Conventional sampling focuses on encoding and decoding bandlimited signals by recording signal amplitudes at known time points. Alternately, sampling can be approached using biologically-inspired schemes. Among these are integrate-and-fire time encoding m
ASR Is All You Need: Cross-Modal Distillation For Lip Reading
The goal of this work is to train strong models for visual speech recognition without requiring human annotated ground truth data. We achieve this by distilling from an Automatic Speech Recognition (ASR) model that has been trained on a large-scale audio-
Learning To Rank Music Tracks Using Triplet Loss
Most music streaming services rely on automatic recommendation algorithms to exploit their large music catalogs. These algorithms aim at retrieving a ranked list of music tracks based on their similarity with a target music track. In this work, we propose
Probabilistic Filter And Smoother For Variational Inference Of Bayesian Linear Dynamical Systems
Variational inference of a Bayesian linear dynamical system is a powerful method for estimating latent variable sequences and learning sparse dynamic models in domains ranging from neuroscience to audio processing. The hardest part of the method is inferr
Deliberation Model Based Two-Pass End-To-End Speech Recognition
End-to-end (E2E) models have made rapid progress in automatic speech recognition (ASR) and perform competitively relative to conventional models. To further improve the quality, a two-pass model has been proposed to rescore streamed hypotheses using the n
Regularized Fast Multichannel Nonnegative Matrix Factorization With ILRMA-Based Prior Distribution Of Joint-Diagonalization Process
In this paper, we address a convolutive blind source separation (BSS) problem and propose a new extended framework of FastMNMF by introducing prior information for joint diagonalization of the spatial covariance matrix model. Recently, FastMNMF has been p
EPI-Neighborhood Distribution Based Light Field Depth Estimation
In this paper, a novel depth estimation algorithm tackling foreground occlusion is proposed based on the neighborhood distribution in the sheared epipolar images (EPIs). First, the EPI is sheared to perform refocusing. Next, a series of sheared EPI's neigh
Multi Image Depth From Defocus Network With Boundary Cue For Dual Aperture Camera
In this paper, we estimate depth information using two defocused images from a dual aperture camera. Recent advances in deep learning techniques have increased the accuracy of depth estimation. Besides, methods of using a defocused image in which an object
Defense Against Adversarial Attacks On Spoofing Countermeasures Of ASV
Various forefront countermeasure methods for automatic speaker verification (ASV) with considerable performance in anti-spoofing are proposed in the ASVspoof 2019 challenge. However, previous work has shown that countermeasure models are vulnerable to adv
Improving Device Directedness Classification Of Utterances With Semantic Lexical Features
User interactions with personal assistants like Alexa, Google Home and Siri are typically initiated by a wake term or wakeword. Several personal assistants feature "follow-up" modes that allow users to make additional interactions without the need of a wa
Comparison Of Glottal Closure Instants Detection Algorithms For Emotional Speech
In the production of voiced speech, epochs or glottal closure instants (GCIs) refer to the instants of significant excitation of the vocal tract. Extraction of GCIs is used as a pre-processing stage in many areas of speech technology, such as in prosody modif
An Ontology-Aware Framework For Audio Event Classification
Recent advancements in audio event classification often ignore the structure and relation between the label classes available as prior information. This structure can be defined by ontology and augmented in the classifier as a form of domain knowledge. To
A Model-Free Approach To Distributed Transmit Beamforming
This paper presents a model-free solution to distributed transmit beamforming using mobile agents. Each agent is equipped with an antenna and the agents represent the individual elements in an antenna array. The agents are tasked to coordinate their relat
Gaussian LPCNet For Multisample Speech Synthesis
The LPCNet vocoder has recently been presented to the TTS community and is now gaining increasing popularity due to its effectiveness and the high quality of the speech synthesized with it. In this work, we present a modification of LPCNet that is 1.5x faster, has tw
Source Domain Data Selection For Improved Transfer Learning Targeting Dysarthric Speech Recognition
This paper presents an improved transfer learning framework applied to robust personalised speech recognition models for speakers with dysarthria. As the baseline of transfer learning, a state-of-the-art CNN-TDNN-F ASR acoustic model trained solely on sou
Semi-Implicit Stochastic Recurrent Neural Networks
Stochastic recurrent neural networks with latent random variables of complex dependency structures have been shown to be more successful in modeling sequential data than deterministic deep models. However, the majority of existing methods have limited expressi
Decentralized Stochastic Non-Convex Optimization Over Weakly Connected Time-Varying Digraphs
In this paper, we consider decentralized stochastic non-convex optimization over a class of weakly connected digraphs. First, we quantify the convergence behaviors of the weight matrices of this type of digraphs. By leveraging the perturbed push sum proto
Time-Frequency Loss For CNN Based Speech Super-Resolution
Speech super-resolution (SR), also called speech bandwidth extension (BWE), aims to increase the sampling rate of a given lower resolution speech signal. Recent years have witnessed the successful application of deep neural networks in time or frequency d
Dynamic Resource Allocation For Wireless Edge Machine Learning With Latency And Accuracy Guarantees
In this paper, we address the problem of dynamic allocation of communication and computation resources for Edge Machine Learning (EML) exploiting Multi-Access Edge Computing (MEC). In particular, we consider an IoT scenario, where sensor devices collect d
Sampling Classes Of Non-Bandlimited Signals Using Integrate-And-Fire Devices: Average Case Analysis
We investigate the use of integrate-and-fire systems to efficiently sample classes of non-bandlimited signals such as bursts of spikes. The sampling in this case is based on storing some timing information about the signal, and no information about its am
Improving Robustness Of Deep Learning Based Monaural Speech Enhancement Against Processing Artifacts
In voice telecommunication, the intelligibility and quality of speech signals can be severely degraded by background noise if the speaker at the transmitting end talks in a noisy environment. Therefore, a speech enhancement system is typically integrated
Lookahead Converges To Stationary Points Of Smooth Non-Convex Functions
The Lookahead optimizer [Zhang et al., 2019] was recently proposed and demonstrated to improve performance of stochastic first-order methods for training deep neural networks. Lookahead can be viewed as a two time-scale algorithm, where the fast dynamics
Constant-Envelope Precoding For Satellite Systems
In this paper, Constant-Envelope Precoding techniques are presented for satellite-based communication systems. In the developed transmission technique the signals of the antennas are designed to be of constant amplitude, improving the robustness of the la
Cost Aware Adversarial Learning
The problem of making the classifier design resilient to test data falsification is considered. In the literature, a few countermeasures have been proposed to defend machine learning algorithms against test data falsification, but a common assumption empl
Universal Phone Recognition With A Multilingual Allophone System
Recently, multilingual speech recognition has achieved tremendous progress by sharing parameters across languages. Multilingual acoustic models, however, generally ignore the difference between phonemes (sounds that can support lexical contrasts in a emp
Learning Partial Differential Equations From Data Using Neural Networks
We develop a framework for estimating unknown partial differential equations (PDEs) from noisy data, using a deep learning approach. Given noisy samples of a solution to an unknown PDE, our method interpolates the samples using a neural network, and extra
Local-Global Feature For Video-Based One-Shot Person Re-Identification
One-shot video-based re-identification, which uses only one labeled tracklet for each identity, is challenging since the framework usually suffers from misalignment and inefficient utilization of unlabeled data. In this paper, we propose a novel local-global prog
Toward Better Speaker Embeddings: Automated Collection Of Speech Samples From Unknown Distinct Speakers
The accuracy of speaker verification and diarization models depends on the quality of the speaker embeddings used to separate audio samples from different speakers. With the goal of training better embedding models, we devise an automatic pipeline for l