Mask-based lensless cameras offer an alternative option to conventional cameras. Compared to conventional cameras, lensless cameras can be extremely thin, flexible, and light-weight. Despite these…
This paper presents a novel 3DoF+ system that allows navigation, i.e., changing position, in scene-based spatial audio content beyond the sweet spot of a Higher Order Ambisonics recording. It is one…
The polyphonic OpenMIC-2018 dataset is based on weak and incomplete labels. The automatic classification of sound events, based on the VGGish bottleneck layer as proposed before by the AudioSet,…
Increasing demand for high resolution and quality aggravates the heavy burden on cluster storage and on restricted bandwidth resources. Hence, video de-duplication in storage and…
We investigate the problem of machine learning with mislabeled training data. We try to make the effects of mislabeled training data better understood through analysis of the basic model and equations…
Beamspace processing is an efficient and commonly used approach in harmonic retrieval (HR). In the beamspace, measurements are obtained by linearly transforming the sensing data, thereby achieving a…
As handwriting input becomes more prevalent, the large symbol inventory required to support Chinese handwriting recognition poses unique challenges. This paper describes how the Apple deep learning…
Marine buoys aid in the battle against Illegal, Unreported and Unregulated (IUU) fishing by detecting fishing vessels in their vicinity. Marine buoys, however, may be disrupted by natural causes and…
We propose and evaluate transformer-based acoustic models for hybrid speech recognition. Several modeling choices are discussed in this work, including various positional embedding methods and an…
This paper investigates a self-adaptation method for speech enhancement using auxiliary speaker-aware features; we extract a speaker representation used for adaptation directly from the test…
The image-based fine-grained identification of individual giant pandas (Ailuropoda melanoleuca) is an emerging technology, and it is extraordinarily challenging due to the extremely subtle visual…
To rectify fisheye distortion from a single image, we advance self-supervised learning strategies and propose a unique deep learning model, Fisheye GAN (FE-GAN). Our FE-GAN learns pixel-level…
Gathering information about the acoustic environment of urban areas is now possible and is being studied in many major cities around the world. Part of the research is to find ways to inform citizens about their…
In this paper, we propose a novel soft and monotonic alignment mechanism used for sequence transduction. It is inspired by the integrate-and-fire model in spiking neural networks and employed in the…

Check out this discussion about wearable technology from IEEE @ SXSW 2015, featuring John C. Havens (Author), Dr. Leslie Saxon (USC Center for Body Computing) and Heather Schlegel (Futurist).

Recent studies have shown that convolutional neural networks (CNN) can effectively improve the performance of single image super-resolution (SR). However, previous methods rarely considered long-…
The colorization of gray-scale images has always been a challenging task in computer vision. Recently, novel approaches have been introduced for unsupervised image translation between two domains…
In this paper, we use a novel algorithmic approach to explore dialectal variation in American English speech. Without the need for human annotations, we are able to use a corpus transcribed in text…
We propose a new end-to-end neural acoustic model for automatic speech recognition. The model is composed of multiple blocks with residual connections between them. Each block consists of one or more…
In group object tracking, the identification of the group leader can be highly beneficial for predicting the intention and future manoeuvres of objects as well as learning the underlying group…
We introduce and analyze a novel approach to the problem of speaker identification in multi-party recorded meetings. Given a speech segment and a set of available candidate profiles, a data-driven…
Modeling the relationship between natural speech and a recorded electroencephalogram (EEG) helps us understand how the brain processes speech and has various applications in neuroscience and brain-…
Mobile-edge computing (MEC) is a promising technology to support computation-intensive and delay-sensitive applications at smart devices by offloading their local tasks to the network edge. In this…

At SXSW 2015, Jessica Colao (Director of Partnerships, iHub) provided context on this explosion in mobile innovation in Africa and explained why innovators in Africa value relationships over…
