Computer Science Department
School of Computer Science, Carnegie Mellon University


Statistical Modeling and Synthesis of
Intrinsic Structures in Impact Sounds

Sofia C.F.M. Cavaco

July 2007

Ph.D. Thesis


Keywords: Impact sounds, independent component analysis, principal component analysis, environmental acoustics, natural sounds, acoustic signal processing, sound synthesis, sound modeling

A struck object produces sound that depends on the way the object vibrates. This sound is determined by physical properties of the object, such as its size, geometry, and material, and also by the characteristics of the event, such as the force and location of impact. It is possible to derive physical models of impact sounds given the relationship between the physical and dynamic properties of the object, and the acoustics of the resulting sound. Models of sounds have proven useful in many fields, such as sound recognition, identification of events or properties (e.g., material or length) of the objects involved, sound synthesis, virtual reality, and computer graphics. However, physical models are limited because of the a priori knowledge they require and because they do not successfully model all the complexities and variability of real sounds.

In this dissertation, we propose data-driven methods for learning the intrinsic features that govern the acoustic structure of impact sounds. The methods are able to characterize the structures that are common to sounds of the same type as well as their variability. They require no a priori knowledge and aim for low-dimensional characterizations of the sounds. In addition, they are not restricted to learning an explicit set of properties of the sounds (e.g., basic features such as decay rate and average spectra); instead, they learn the properties that best characterize the statistics of the data. The methods can learn properties of the sounds such as ringing, resonance, sustain, decay, and sharp onsets.

In this dissertation, we also explore the synthesis of impact sounds using the features learned by these methods. We will show that it is possible to manipulate the learned features in order to modify the original sounds, or even to create new sounds (for example, from the interpolation of the representation of recorded sounds). Finally, we will show that the sounds synthesized by the methods are realistic, as they are perceived more often as real than as synthesized.
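
As a rough illustration of the idea of interpolating learned representations, the following toy sketch learns a single principal component from a few synthetic decaying sinusoids and then synthesizes a new waveform from a coefficient halfway between two of them. It is not the dissertation's method: the synthetic signals, the choice of a single component, and all names (`damped_sine`, `top_component`, `synthesize`, `interpolate`) are assumptions made for this example only.

```python
import math

def damped_sine(freq_hz, decay, n=200, sr=8000.0):
    """Toy stand-in for an impact sound: an exponentially decaying sinusoid."""
    return [math.exp(-decay * t / sr) * math.sin(2 * math.pi * freq_hz * t / sr)
            for t in range(n)]

def mean_vector(rows):
    """Sample-wise mean of a list of equal-length waveforms."""
    count = len(rows)
    return [sum(col) / count for col in zip(*rows)]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def top_component(centered, iters=200):
    """First principal component via power iteration on X^T X."""
    # Start from the first centered waveform, which lies in the data's row space.
    start_norm = math.sqrt(dot(centered[0], centered[0]))
    v = [x / start_norm for x in centered[0]]
    for _ in range(iters):
        proj = [dot(row, v) for row in centered]          # X v
        w = [sum(p * row[j] for p, row in zip(proj, centered))
             for j in range(len(v))]                      # X^T (X v)
        norm = math.sqrt(dot(w, w))
        v = [x / norm for x in w]
    return v

# Hypothetical "recordings": same pitch, different decay rates.
sounds = [damped_sine(440.0, d) for d in (200.0, 400.0, 600.0, 800.0)]
mu = mean_vector(sounds)
centered = [[x - m for x, m in zip(s, mu)] for s in sounds]

pc = top_component(centered)                  # learned feature (unit vector)
coeffs = [dot(row, pc) for row in centered]   # one coefficient per sound

def synthesize(c):
    """Reconstruct a waveform from a coefficient on the learned component."""
    return [m + c * p for m, p in zip(mu, pc)]

def interpolate(c_a, c_b, alpha):
    """Linear interpolation between two coefficients (alpha in [0, 1])."""
    return (1 - alpha) * c_a + alpha * c_b

# A new sound halfway between the first and last examples.
new_coeff = interpolate(coeffs[0], coeffs[3], 0.5)
new_sound = synthesize(new_coeff)
```

With one component the reconstruction is only approximate. A fuller representation would keep several components, and the thesis keywords indicate that independent component analysis is used alongside principal component analysis.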

133 pages
