Authors: Engin Bumbacher 1 and Vivienne Ming 2

Affiliations: 1 EPFL, Switzerland ; 2 U.C., United States

ISBN: 978-989-8425-99-7

Keyword(s): Pitch perception, Gist, Sparse coding, Generative hierarchical models, Gaussian mixture models, Bayesian inference, Auditory processing, Speech processing.

Related Ontology Subjects/Areas/Topics: Applications ; Audio and Speech Processing ; Bayesian Models ; Cardiovascular Imaging and Cardiography ; Cardiovascular Technologies ; Digital Signal Processing ; Exact and Approximate Inference ; Health Engineering and Technology Applications ; Multimedia ; Multimedia Signal Processing ; Pattern Recognition ; Perception ; Signal Processing ; Software Engineering ; Telecommunications ; Theory and Methods

Abstract: The neural basis of pitch perception, our subjective sense of the tone of a sound, has been one of the great ongoing debates in neuroscience. Variants of the two classic theories - spectral Place theory and temporal Timing theory - continue to drive new experiments and debates (Shamma, 2004). Here we approach the question of pitch by applying a theoretical model based on the statistics of natural sounds. Motivated by gist research (Oliva and Torralba, 2006), we extended the nonlinear hierarchical generative model developed by Karklin and Lewicki (2003) with a parallel gist pathway. The basic model encodes higher-order structure in natural sounds by capturing variations in the underlying probability distribution. The secondary pathway provides a fast biasing of the model’s inference process based on the coarse spectrotemporal structures of sound stimuli on broader timescales. Adapting our extended model to speech demonstrates that the learned code describes a more detailed and broader range of statistical regularities, reflecting abstract properties of sound such as harmonics and pitch, than models without the gist pathway. The spectrotemporal modulation characteristics of the learned code are better matched to the modulation spectrum of speech signals than those of alternate models, and its higher-level coefficients capture information that not only effectively clusters related speech signals but also describes smooth transitions over time, encoding the temporal structure of speech signals. Finally, we find that the model produces a type of pitch-related density component that combines temporal and spectral qualities.

License: CC BY-NC-ND 4.0

Paper citation in several formats:
Bumbacher, E. and Ming, V. (2012). PITCH-SENSITIVE COMPONENTS EMERGE FROM HIERARCHICAL SPARSE CODING OF NATURAL SOUNDS. In Proceedings of the 1st International Conference on Pattern Recognition Applications and Methods - Volume 1: ICPRAM, ISBN 978-989-8425-99-7, pages 219-229. DOI: 10.5220/0003786802190229

@conference{icpram12,
author={Engin Bumbacher and Vivienne Ming},
title={PITCH-SENSITIVE COMPONENTS EMERGE FROM HIERARCHICAL SPARSE CODING OF NATURAL SOUNDS},
booktitle={Proceedings of the 1st International Conference on Pattern Recognition Applications and Methods - Volume 1: ICPRAM},
year={2012},
pages={219-229},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0003786802190229},
isbn={978-989-8425-99-7},
}

TY - CONF
JO - Proceedings of the 1st International Conference on Pattern Recognition Applications and Methods - Volume 1: ICPRAM
TI - PITCH-SENSITIVE COMPONENTS EMERGE FROM HIERARCHICAL SPARSE CODING OF NATURAL SOUNDS
SN - 978-989-8425-99-7
AU - Bumbacher, E.
AU - Ming, V.
PY - 2012
SP - 219
EP - 229
DO - 10.5220/0003786802190229
