Deep CNN networks have shown great success in various tasks for text-independent speaker recognition. In this paper, we explore two approaches for modeling long temporal contexts to improve the…
We apply the network Lasso to classify partially labeled data points which are characterized by high-dimensional feature vectors. In order to learn an accurate classifier from limited amounts of…
Targeting both high efficiency and performance, we propose AlignTTS to predict the mel-spectrum in parallel. AlignTTS is based on a Feed-Forward Transformer which generates mel-spectrum from a…
We previously proposed a joint learning scheme of Gaussian parameters (scales and centers) and coefficients for online nonlinear estimation. The instantaneous squared error cost in terms of the…
This paper studies the problem of training a semantic segmentation neural network with weak annotations, for application to aerial vegetation images from Teide National Park. It proposes a…
Network Representation Learning embeds each node in a network into a low-dimensional real-value vector which can be used for downstream tasks such as link prediction and recommendation. Many existing…
There is a growing number of large-scale cross-site database collections of resting-state functional magnetic resonance imaging (rs-fMRI) data for studying neurobehavioral diseases, such as ADHD. Although…
We consider the issue of designing closed 3D UAV trajectories that allow for an energy efficient collection of data with a UAV-aided wireless sensor network. We consider a 3D wireless channel model…
In this paper, we present a full-reference speech quality prediction model with a deep learning approach. The model determines a feature representation of the reference and the degraded signal…
Speaker diarization, the task of finding the speech segments of specific speakers, has been widely used in human-centered applications such as video conferences or human-computer interaction systems. In…
We propose equalization-based data detection algorithms for all-digital millimeter-wave (mmWave) massive multiuser multiple-input multiple-output (MU-MIMO) systems that exploit sparsity in the beamspace…
Deep convolutional neural networks (CNNs) have demonstrated superior performance on the image super-resolution (SR) problem. However, CNNs are known to be heavily over-parameterized, and suffer from…
Suicide is a major societal challenge globally, with a wide range of risk factors, from individual health, psychological and behavioral elements to socio-economic aspects. Military personnel, in…
End-to-end (E2E) models fold the acoustic, pronunciation and language models of a conventional speech recognition model into one neural network with a much smaller number of parameters than a…
The creation of the NPTEL platform in India has led to a vast population of engineering students getting access to quality online content for Signal Processing. These courses are globally accessible…
Spectral mask based beamforming has shown competitive performance on multi-channel speech enhancement in recent years. However, such methods apply mask estimation on each channel and ensemble the…
Cross-view image generation has been recently proposed to generate images of one view from another dramatically different view. In this paper we investigate exocentric (third-person) view to…
Spherical microphone arrays with compact aperture and maximum directivity factor have been one of the popular research fields but are usually accompanied by the white noise amplification problem,…
Current automatic cough counting systems can determine how many coughs are present in an audio recording. However, they cannot determine who produced the cough. This limits their usefulness as most…
With the increased applications of automatic speech recognition (ASR) in recent years, it is essential to automatically insert punctuation marks and remove disfluencies in transcripts, to improve the…
We consider an automotive radar using a sparse linear array (SLA) in the context of multi-input multi-output (MIMO) radar. The key problem in SLA is the selection of the locations of the array…
Differential received signal strength (DRSS) provides a practical means of localisation for wireless sensor networks. Closed-form location estimators based on a linearised propagation path loss model…
Adapting speaker verification (SV) systems to a new environment is a very challenging task. Current adaptation methods in SV mainly focus on the backend, i.e., adaptation is carried out after the…