- IEEE Member: US $11.00
- Society Member: US $0.00
- IEEE Student Member: US $11.00
- Non-IEEE Member: US $15.00
On Divergence Approximations For Unsupervised Training Of Deep Denoisers Based On Stein’s Unbiased Risk Estimator
Recently, there have been several works on unsupervised learning for training deep learning based denoisers without clean images. Approaches based on Stein's unbiased risk estimator (SURE) have shown promising results for training Gaussian deep denoisers.
A New Perspective For Flexible Feature Gathering In Scene Text Recognition Via Character Anchor Pooling
Irregular scene text recognition has attracted much attention from the research community, mainly due to the complexity of shapes of text in natural scenes. However, recent methods either rely on shape-sensitive modules such as bounding box regression, or
Scalable Kernel Learning Via The Discriminant Information
Kernel approximation methods create explicit, low-dimensional kernel feature maps to deal with the high computational and memory complexity of standard techniques. This work studies a supervised kernel learning methodology to optimize such mappings. We ut
Optimal Laplacian Regularization For Sparse Spectral Community Detection
Regularization of the classical Laplacian matrices was empirically shown to improve spectral clustering in sparse networks. It was observed that small regularizations are preferable, but this point was left as a heuristic argument. In this paper we formal
Quantized Tensor Robust Principal Component Analysis
High-dimensional data structures, known as tensors, are fundamental in many applications, including multispectral imaging and color video processing. Compression of such a huge amount of multidimensional data collected over time is of paramount importance,
Separable Optimization For Joint Blind Deconvolution And Demixing
Blind deconvolution and demixing is the problem of reconstructing convolved signals and kernels from the sum of their convolutions. This problem arises in many applications, such as blind MIMO. In this work, we present a separable approach to blind deconv
Unsupervised Pre-Training Of Bidirectional Speech Encoders Via Masked Reconstruction
We propose an approach for pre-training speech representations via a masked reconstruction loss. Our pre-trained encoder networks are bidirectional and can therefore be used directly in typical bidirectional speech recognition models. The pre-trained netw
Combining cGAN And MIL For Hotspot Segmentation In Bone Scintigraphy
Bone scintigraphy is widely used to diagnose bone tumors and metastases. Accurate hotspot segmentation from bone scintigraphy is of great importance for tumor metastasis diagnosis. In this paper, we propose a new framework to detect and extract hotspots in
Rev-AE: A Learned Frame Set For Image Reconstruction
Reversible residual network naturally extends the linear lifting scheme with no theoretic guarantee. In this paper, we propose a reversible autoencoder (Rev-AE) with this extended non-linear lifting scheme to improve image reconstruction. Nonlinear predic
Multitask Learning For DARPA LORELEI’s Situation Frame Extraction Task
This paper describes a novel approach of multitask learning for an end-to-end optimization technique for document classification. The application motivation comes from the need to extract "Situation Frames (SF)" from a document within the context of DARPA
Disentangled Multidimensional Metric Learning For Music Similarity
Music similarity search is useful for a variety of creative tasks such as replacing one music recording with another recording with a similar "feel", a common task in video editing. For this task, it is typically necessary to define a similarity metric to
Improving End-To-End Speech Synthesis With Local Recurrent Neural Network Enhanced Transformer
Although the Transformer-based neural end-to-end TTS model has demonstrated extreme effectiveness in capturing long-term dependencies and achieved state-of-the-art performance, it still suffers from two problems: 1) limited ability to model sequential and loc
Finite Sample Deviation And Variance Bounds For First Order Autoregressive Processes
In this paper, we study finite-sample properties of the least squares estimator in first order autoregressive processes. By leveraging a result from decoupling theory, we derive upper bounds on the probability that the estimate deviates by at least a posi
Controllable Time-Delay Transformer For Real-Time Punctuation Prediction And Disfluency Detection
With the increased applications of automatic speech recognition (ASR) in recent years, it is essential to automatically insert punctuation marks and remove disfluencies in transcripts, to improve the readability of the transcripts as well as the performan
DRSS-Based Localisation Using Weighted Instrumental Variables And Selective Power Measurement
Differential received signal strength (DRSS) provides a practical means of localisation for wireless sensor networks. Closed-form location estimators based on a linearised propagation path loss model are computationally efficient and hence suitable for wi
Multi-Task Learning For Voice Trigger Detection
We describe the design of a voice trigger detection system for smart speakers. We address two major challenges. The first is that the detectors are deployed in complex acoustic environments with external noise and loud playback by the device itself. Secon
Learning Graph Influence From Social Interactions
In social learning, agents form their opinions or beliefs about certain hypotheses by exchanging local information. This work considers the recent paradigm of weak graphs, where the network is partitioned into sending and receiving components, with the fo
Robustness Assessment Of Automatic Reinke’s Edema Diagnosis Systems
In the past few years there has been a great interest in computer aided diagnosis research. In the field of voice quality assessment, signal processing gives us tools to analyze and extract numeric characteristics describing the analyzed signal. These fea
Deep Product Quantization Module For Efficient Image Retrieval
Product Quantization (PQ) is one of the most popular Approximate Nearest Neighbor (ANN) methods for large-scale image retrieval, bringing better performance than hashing based methods. In recent years, several works extend the hard quantization to soft qu
Motion Feedback Design For Video Frame Interpolation
This paper introduces a feedback-based approach to interpolate video frames involving small and fast-moving objects. Unlike the existing feedforward-based methods that estimate optical flow and synthesize in-between frames sequentially, we introduce a mot
Global And Local Discriminative Patches Exploiting For Action Recognition
Recent human action recognition models mainly focus on exploiting human features, such as pose or skeleton features. However, because they do not exploit interactive or related scenes, most of these methods cannot achieve good enough performance. In
Approximate Inference By Kullback-Leibler Tensor Belief Propagation
Probabilistic programming provides a structured approach to signal processing algorithm design. The design task is formulated as a generative model, and the algorithm is derived through automatic inference. Efficient inference is a major challenge; e.g.,
Enhanced Safety Of Autonomous Driving By Incorporating Terrestrial Signals Of Opportunity
A receiver autonomous integrity monitoring (RAIM)-based framework for autonomous ground vehicle (AGV) navigation is developed. This framework aims to incorporate terrestrial signals of opportunity (SOPs) alongside GPS signals to provide tight horizontal
Meta Metric Learning For Highly Imbalanced Aerial Scene Classification
Class imbalance is an important factor that affects the performance of deep learning models used for remote sensing scene classification. In this paper, we propose a random fine-tuning meta metric learning model (RF-MML) to address this problem. Derived f
Real-Time Hand Gesture Recognition Using Temporal Muscle Activation Maps Of Multi-Channel sEMG Signals
Accurate and real-time hand gesture recognition is highly beneficial for improving the control of advanced hand prosthesis. Surface Electromyography (sEMG) signals obtained from the forearm are widely used in this area. In this paper, we introduce a novel
Neural Network Wiretap Code Design For Multi-Mode Fiber Optical Channels
The design of reliable and secure codes with finite block length is an important requirement for industrial machine type communications. In this work, we develop an autoencoder for the multi-mode fiber wiretap channel taking into account the error perform
Monaural Speech Enhancement Using Intra-Spectral Recurrent Layers In The Magnitude And Phase Responses
Speech enhancement has greatly benefited from deep learning. Currently, the best performing deep architectures use long short-term memory (LSTM) recurrent neural networks (RNNs) to model short and long temporal dependencies. These approaches, however, und
Frame-Level MMI As A Sequence Discriminative Training Criterion For LVCSR
In this work we present frame-level maximum mutual information (MMI) as a novel sequence discriminative training criterion for hybrid HMM-DNN acoustic models. Compared to the standard, sequence-level MMI criterion we show that frame-level MMI has increase
Self-Attentive Sentimental Sentence Embedding For Sentiment Analysis
We propose the use of a word-level sentiment bidirectional LSTM in tandem with the self-attention mechanism for sentence-level sentiment prediction. In addition to the proposed model, we also present a finance report dataset for sentence-level financial
Fast And High-Quality Singing Voice Synthesis System Based On Convolutional Neural Networks
The present paper describes singing voice synthesis based on convolutional neural networks (CNNs). Singing voice synthesis systems based on deep neural networks (DNNs) are currently being proposed and are improving the naturalness of synthesized singing v
Exploring A Zero-Order Direct HMM Based On Latent Attention For Automatic Speech Recognition
In this paper, we study a simple yet elegant latent variable attention model for automatic speech recognition (ASR) which enables an integration of attention sequence modeling into the direct hidden Markov model (HMM) concept. We use a sequence of hidden
Multi-Label Sound Event Retrieval Using A Deep Learning-Based Siamese Structure With A Pairwise Presence Matrix
Realistic recordings of soundscapes often have multiple sound events co-occurring, such as car horns, engines, and human voices. Sound event retrieval is a type of content-based search aiming at finding audio samples similar to an audio query based on the
Multi-Task Learning Via SA-FPN And EJ-Head
As a concise framework, Mask R-CNN achieves promising performance in object detection and instance segmentation. However, there is room for improvement in two aspects. One is that performing multi-task prediction needs more credible feature extraction and
Zero-Crossing Precoding With Maximum Distance To The Decision Threshold For Channels With 1-Bit Quantization And Oversampling
Low-resolution devices are promising for systems that demand low energy consumption and low complexity as required in IoT systems. In this study, we propose a novel waveform for bandlimited channels with 1-bit quantization and oversampling at the receiver
End-To-End Architectures For ASR-Free Spoken Language Understanding
Spoken Language Understanding (SLU) is the problem of extracting the meaning from speech utterances. It is typically addressed as a two-step problem, where an Automatic Speech Recognition (ASR) model is employed to convert speech into text, followed by a
Upscaling Vector Approximate Message Passing
In this paper we consider the problem of recovering a signal x of size N from noisy and compressed measurements y = A x + w of size M, where the measurement matrix A is right-orthogonally invariant (ROI). Vector Approximate Message Passing (VAMP) demonstr
Selection-Channel-Aware Reverse JPEG Compatibility For Highly Reliable Steganalysis Of JPEG Images
This paper deeply studies the principle of the recent reverse JPEG compatibility attack [1]. This analysis allows us to cast the problem of hidden data detection in DCT coefficients within hypothesis testing theory. The optimal LR test, though efficient,
A Novel Moving Sparse Array Geometry With Increased Degrees Of Freedom
In this paper, we propose a novel moving sparse array geometry named dilated arrays (DAs) by extending the dilation of nested arrays to other linear array structures. The theoretical analysis of dilation to other arrays is not straightforward since the re
Robust Transmission Over Channels With Channel Uncertainty: An Algorithmic Perspective
The availability and quality of channel state information heavily influence the performance of wireless communication systems. For perfect channel knowledge, optimal signal processing and coding schemes are well studied and often closed-form solutions ar
Effective Approximate Maximum Likelihood Estimation Of Angles Of Arrival For Non-Coherent Sub-Arrays
We consider the problem of estimating the angles of arrival (AOAs) of multiple sources from a single snapshot obtained by a set of non-coherent sub-arrays, i.e., while the antenna elements in each sub-array are coherent, each sub-array observes a differen
Mental Fatigue Prediction From Multi-Channel ECoG Signal
Early detection of mental fatigue and changes in vigilance could be used to initiate neurostimulation to treat patients suffering from brain injury and mental disorders. In this study, we analyzed electrocorticography (ECoG) signals chronically recorded f
Knowledge Distillation And Random Erasing Data Augmentation For Text-Dependent Speaker Verification
This paper explores the Knowledge Distillation (KD) approach and a data augmentation technique to improve the generalization ability and robustness of text-dependent speaker verification (SV) systems. The KD method consists of two neural networks, known a
Unsupervised Change Detection For Multimodal Remote Sensing Images Via Coupled Dictionary Learning And Sparse Coding
Archetypal scenarios for change detection generally consider two images acquired through sensors of the same modality. The resolution dissimilarity is often bypassed through a simple preprocessing, applied independently on each image to bring them to the s
Acoustic Scene Classification Using Deep Residual Networks With Late Fusion Of Separated High And Low Frequency Paths
We investigate the problem of acoustic-scene classification, using a deep residual network applied to log-mel spectrograms complemented by log-mel deltas and delta-deltas. We design the network to take into account that the temporal and frequency axes in
Detection Of Speech Events And Speaker Characteristics Through Photo-Plethysmographic Signal Neural Processing
The use of photoplethysmogram signal (PPG) for heart and sleep monitoring is commonly found nowadays in smartphones and wrist wearables. Besides common usages, it has been proposed and reported that person information can be extracted from PPG for other u
Improved End-To-End Spoken Utterance Classification With A Self-Attention Acoustic Classifier
While human language provides a natural interface for human-machine communication, there are several challenges concerning extracting the intents of a speaker when interacting with a virtual agent, especially when the speaker is in a noisy acoustic enviro
High Dynamic Range Imaging Using Deep Image Priors
Traditionally, dynamic range enhancement for images has involved a combination of contrast improvement (via gamma correction or histogram equalization) and a denoising operation to reduce the effects of photon noise. More recently, modulo-imaging methods
Extended Object Tracking Using Hierarchical Truncation Measurement Model With Automotive Radar
Motivated by real-world automotive radar measurements that are distributed around object (e.g., vehicles) edges with a certain volume, a novel hierarchical truncated Gaussian measurement model is proposed to resemble the underlying spatial distribution of
Federating Solar, Storage And Communications In The Electric Grid And Internet Of Things
A futuristic infrastructure model is envisioned with distributed modules that can produce solar energy, have a storage system and provide services of lighting, electric-vehicle charging and communications. A stochastic model is formulated for the solar po