- IEEE Member: US $11.00
- Society Member: US $0.00
- IEEE Student Member: US $11.00
- Non-IEEE Member: US $15.00
Training Deep Spiking Neural Networks For Energy-Efficient Neuromorphic Computing
Spiking Neural Networks (SNNs) encode input information temporally using sparse spiking events, which can be harnessed to achieve higher computational efficiency. However, considering the rapid strides in accuracy enabled by Analog Neural Networks (ANNs),
Speech Emotion Recognition With Local-Global Aware Deep Representation Learning
Convolutional neural networks (CNN) based deep representation learning methods for speech emotion recognition (SER) have demonstrated great success. The basic design of CNN restricts the ability to model only local information well. Capsule network (CapsN
Multichannel Active Noise Control With Spatial Derivative Constraints To Enlarge The Quiet Zone
Active noise control is an efficient approach in dealing with unwanted acoustic disturbances. However, most of the active noise control algorithms aim to control the signal of the error sensor leading to local noise attenuation only around the error micro
Speaker Diarization With Session-Level Speaker Embedding Refinement Using Graph Neural Networks
Deep speaker embedding models have been commonly used as a building block for speaker diarization systems; however, the speaker embedding model is usually trained according to a global loss defined on the training data, which could be sub-optimal for dist
Korean Singing Voice Synthesis Based On Auto-Regressive Boundary Equilibrium GAN
Singing voice synthesis is a generative task that involves not only multidimensional controls of a singer model such as phonetic modulation by lyrics and pitch control by music score but also expressive elements such as breath sounds and vibrato. Recently
K-Space Trajectory Design For Reduced MRI Scan Time
The development of compressed sensing (CS) techniques for magnetic resonance imaging (MRI) is enabling a speedup of MRI scanning. To increase the incoherence in the sampling, a random selection of points on the k-space is deployed and a continuous traject
Intelligent Student Behavior Analysis System For Real Classrooms
In this paper, we design an intelligent student behavior analysis system for recorded classrooms, which automatically detects hand-raising, standing, and sleeping behaviors of students. Detecting these behaviors is quite challenging mainly due to various
Neural Network Training With Approximate Logarithmic Computations
The high computational complexity associated with training deep neural networks limits online and real-time training on edge devices. This paper proposes an end-to-end training and inference scheme that eliminates multiplications by approximate operations
Multitaper Spectral Granger Causality With Application To SSVEP
The traditional parametric approach to Granger causality (GC), based on linear vector autoregressive modeling, suffers from difficulties related to the inaccurate modeling of the generative process. These limits can be solved by using non-parametric spect
Learning To Generate Diverse Questions From Keywords
Diverse text generation has been emerging as an important topic of natural language generation. Traditional studies on question generation mainly investigate how to generate one question based on a given input (one-to-one). In this paper, we focus on a mo
Synchronous Transformers For End-To-End Speech Recognition
For most of the attention-based sequence-to-sequence models, the decoder predicts the output sequence conditioned on the entire input sequence processed by the encoder. The asynchronous problem between the encoding and decoding makes these models difficul
Multi-Constraint Spectral Co-Design For Colocated MIMO Radar And MIMO Communications
Single waveform design for automotive joint radar-communications (JRC) has recently received increasing attention, as it addresses the problem of spectrum sharing between the two systems. The paper addresses the challenge of designing a waveform in MIMO-ra
Deep Audio-Visual Speech Separation With Attention Mechanism
Previous work shows that audio-visual fusion is a practical approach to deal with the speech separation task in the cocktail party problem. In this paper, we explore a better strategy to utilize visual representations with the attention mechanism. Compare
The Open Brands Dataset: Unified Brand Detection And Recognition At Scale
Intellectual property protection (IPP) has received more and more attention recently due to the development of global e-commerce platforms. Brand recognition plays a significant role in IPP. Recent studies for brand recognition and detection are based
Sequential IoT Data Augmentation Using Generative Adversarial Networks
Sequential data in industrial applications can be used to train and evaluate machine learning models (e.g. classifiers). Since gathering representative amounts of data is difficult and time consuming, there is an incentive to generate it from a small grou
A Connected Auto-Encoders Based Approach For Image Separation With Side Information: With Applications To Art Investigation
X-radiography is a widely used imaging technique in art investigation, whether to investigate the condition of a painting or provide insights into artists' techniques and working methods. In this paper, we propose a new architecture based on the use of 'c
Pre-Training For Query Rewriting In A Spoken Language Understanding System
Query rewriting (QR) is an increasingly important technique for reducing customer friction resulting from errors in a spoken language understanding pipeline originating from various sources such as speech recognition errors, language understanding errors
Learning Data Representation And Emotion Assessment From Physiological Data
Aiming at a deeper understanding of human emotional states, we explore deep learning techniques for the analysis of physiological data. In this work, two-channel pre-frontal raw electroencephalography and photoplethysmography signals of 25 subjects were c
Weighted Speech Distortion Losses For Neural-Network-Based Real-Time Speech Enhancement
This paper investigates several aspects of training an RNN (recurrent neural network) that impact the objective and subjective quality of enhanced speech for real-time single-channel speech enhancement. Specifically, we focus on an RNN that enhances short-t
Effective WaveNet Adaptation For Voice Conversion With Limited Data
WaveNet has shown its great potential as a direct conversion model in voice conversion. However, due to the model complexity, WaveNet always requires a large amount of training data, which has limited its applications in voice conversion, where training d
ScalpNet: Detection Of Spatiotemporal Abnormal Intervals In Epileptic EEG Using Convolutional Neural Networks
We propose ScalpNet, a deep neural network to detect spatiotemporal abnormal intervals from EEGs of epilepsy patients. Since the number of trained clinicians is very limited, it is crucial to establish automatic detection of abnormal signals caused b
Video Frame Interpolation Via Exceptional Motion-Aware Synthesis
In this paper, we propose a novel video frame interpolation method via exceptional motion-aware synthesis, in which accurate optical flow could be estimated even with exceptional motion patterns. Specifically, we devise two deep learning modules: exceptio
Fixed-Point Optimization Of Transformer Neural Network
The Transformer model adopts a self-attention structure and shows very good performance in various natural language processing tasks. However, it is difficult to implement the Transformer in embedded systems because of its very large model size. In this s
Bangla Voice Command Recognition In End-To-End System Using Topic Modeling Based Contextual Rescoring
In this work, we perform contextual rescoring using multi-label topic modeling to improve the performance of an End-to-End Bangla voice command recognition system. We use a hybrid of Connectionist Temporal Classification (CTC) and Attention mechanism in o
Study Of Closed Phase Resonance Bandwidths For Oral And Nasal Tracts Using Zero Time Windowing
The periodic opening and closing of the vibrating vocal folds changes the production system continuously during the production of voiced speech. The subglottal and supraglottal cavities have distinct structure and impedance. A coupling and decoupling of
A Robust Speaker Clustering Method Based On Discrete Tied Variational Autoencoder
Speaker clustering based on agglomerative hierarchical clustering (AHC) is a common method for solving two main problems: clustering without a preset number of categories and clustering with a fixed number of categories. In general, the model takes features like i-vectors as
Re-Translation Strategies For Long Form, Simultaneous, Spoken Language Translation
We investigate the problem of simultaneous machine translation of long-form speech content. We target a continuous speech-to-text scenario, generating translated captions for a live audio feed, such as a lecture or play-by-play commentary. As this scenari
Investigating Generalization In Neural Networks Under Optimally Evolved Training Perturbations
In this paper, we study the generalization properties of neural networks under input perturbations and show that minimal training data corruption by a few pixel modifications can cause drastic overfitting. We propose an evolutionary algorithm to search fo
Sequence-To-Sequence Singing Synthesis Using The Feed-Forward Transformer
We propose a sequence-to-sequence singing synthesizer, which avoids the need for training data with pre-aligned phonetic and acoustic features. Rather than the more common approach of a content-based attention mechanism combined with an autoregressive dec
ASR Error Correction And Domain Adaptation Using Machine Translation
Off-the-shelf pre-trained Automatic Speech Recognition (ASR) systems are an increasingly viable service for companies of any size building speech-based products. While these ASR systems are trained on large amounts of data, domain mismatch is still an iss
Anisotropic Guided Filtering
The guided filter and its derivatives have been widely employed in many image processing and computer vision applications due to their low complexity and good edge-preservation properties. Despite this success, these variants are unable to handle more agg
Fast High-Dimensional Kernel Filtering
The bilateral and nonlocal means filters are instances of kernel-based filters that are popularly used in image processing. It was recently shown that fast and accurate bilateral filtering of grayscale images can be performed using a low-rank approximatio
Enriched Speech For Effortless Listening
Human-machine speech interaction is increasingly common in the industrialised world. A (natural or synthetic) speech output that is optimised for high intelligibility and low cognitive load is of interest for both academia and industry: ENRICH (www.enrich
Video-Driven Speech Reconstruction
This demo will showcase our video-to-audio model which attempts to reconstruct speech from short videos of spoken statements. Our model does so in a completely end-to-end manner where raw audio is generated based on the input video. This approach bypasses
Clustering Of Nonnegative Data And An Application To Matrix Completion
In this paper, we propose a simple algorithm to cluster nonnegative data lying in disjoint subspaces. We analyze its performance in relation to a certain measure of correlation between said subspaces. We use our clustering algorithm to develop a matrix co
SNDCNN: Self-Normalizing Deep CNNs With Scaled Exponential Linear Units For Speech Recognition
Very deep CNNs achieve state-of-the-art results in both computer vision and speech recognition, but are difficult to train. The most popular way to train very deep CNNs is to use shortcut connections (SC) together with batch normalization (BN). Inspired
Exploring Entity-Level Spatial Relationships For Image-Text Matching
Exploring the entity-level (i.e., objects in an image, words in a text) spatial relationship contributes to understanding multimedia content precisely. The neglect of spatial information in previous works probably leads to misunderstandings of image con
Equalization Of OFDM Waveforms With Insufficient Cyclic Prefix
In this paper, a simple equalization strategy for OFDM waveforms is proposed that specifically targets the case where the cyclic prefix is insufficient to span the whole channel duration. The proposed architecture can be very efficiently implemented in th
An Improved Frame-Unit-Selection Based Voice Conversion System Without Parallel Training Data
A frame-unit-selection based voice conversion system proposed earlier by us is revisited here to enhance its performance in both speech naturalness and speaker similarity. Speaker independent, bilingual (Mandarin Chinese and American English) deep neural
Modelling Sea Clutter In SAR Images Using Laplace-Rician Distribution
This paper presents a novel statistical model for the characterisation of synthetic aperture radar (SAR) images of the sea surface. The analysis of ocean surface is widely performed using satellite imagery as it produces information for wide areas under v
Dual-Path RNN: Efficient Long Sequence Modeling For Time-Domain Single-Channel Speech Separation
Recent studies in deep learning-based speech separation have proven the superiority of time-domain approaches to conventional time-frequency-based methods. Unlike the time-frequency domain approaches, the time-domain separation systems often receive input
Faster-Than-Nyquist Signaling Via Spatiotemporal Symbol-Level Precoding For Multi-User MISO Redundant Transmissions
This paper tackles the problem of both multi-user and intersymbol interference stemming from co-channel users transmitting at a faster-than-Nyquist (FTN) rate in multi-antenna downlink transmissions. We propose a framework for redundant block-based symbol
A Differential Approach For Rain Field Tomographic Reconstruction Using Microwave Signals From LEO Satellites
A differential approach is proposed for tomographic rain field reconstruction using the estimated signal-to-noise ratio of microwave signals from low earth orbit satellites at the ground receivers, with the unknown baseline values eliminated before using
Allocation Of Computing Tasks In Distributed MEC Servers Co-Powered By Renewable Sources And The Power Grid
We consider a Multiaccess Edge Computing (MEC) network where distributed servers have energy harvesting (e.g., solar) and storage (e.g., batteries) capabilities. Energy from a connected power grid is also available, in case that harvested from ambient sou