Chair of Multimedia Telecommunications and Microelectronics - Audio Research Group

AM/FM Modeling of Harmonic Sounds

Demonstration: flugelhorn

Download all the audio files from this page in a single ZIP archive (2MB).

In this example we compare the results of AM/FM modeling of a warm and slightly breathy flugelhorn sound with and without the prototype signal. For comparison, we also show the sounds reconstructed with the spectral modeling synthesis (SMS) technique by X. Serra. Note that while the SMS model is harmonic (in the sense that tracking is guided by the detected F0), it generates incoherent frequency trajectories that must be encoded individually. We also show a comparison with the bandwidth-enhanced sinusoidal model (LORIS) by K. Fitz, which generates bandwidth-enhanced partials modulated by random noise. For a fair comparison, we performed a partial selection operation (the "distill" command of the LORIS software) that constrains the partials to harmonic multiples of the fundamental frequency.
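The idea of constraining partials to harmonic multiples of the fundamental can be illustrated with a minimal sketch. This is not the LORIS implementation: the function name `select_harmonic_partials`, the `(frequency, amplitude)` peak representation and the `tolerance` parameter are assumptions made for illustration only.

```python
def select_harmonic_partials(peaks, f0, tolerance=0.25):
    """Keep at most one (freq, amp) peak per harmonic number.

    Hypothetical helper: each peak is assigned to the harmonic
    h = round(freq / f0) and kept only if it lies within
    tolerance * f0 of h * f0. Among competing peaks the strongest wins.
    """
    best = {}
    for freq, amp in peaks:
        h = round(freq / f0)
        if h < 1:
            continue  # below the fundamental: not a harmonic
        if abs(freq - h * f0) > tolerance * f0:
            continue  # too far from any harmonic multiple
        if h not in best or amp > best[h][1]:
            best[h] = (freq, amp)
    return best


# Toy spectral peaks around a 220 Hz fundamental (made-up values).
peaks = [(220.5, 1.0), (439.0, 0.5), (445.0, 0.7), (550.0, 0.2), (30.0, 0.1)]
harmonics = select_harmonic_partials(peaks, f0=220.0)
```

In this toy input the two peaks near 440 Hz compete for harmonic 2, and the peaks at 550 Hz and 30 Hz are discarded as inharmonic.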

Most of the techniques perform quite well in resynthesizing this sound, preserving its general timbre and character. The main challenge is reconstructing the appropriate amount of attack noise in each note. In a traditional sinusoidal model (and also in SMS) this attack noise is not carried by the sinusoidal partials, so it must be resynthesized by the background noise model. We use a 10th-order warped LPC (WLPC) model for this purpose.
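For readers unfamiliar with warped LPC, the following is a minimal sketch of one common formulation: the unit delays of the autocorrelation method are replaced by first-order all-pass sections, and the resulting warped autocorrelation is solved with the Levinson-Durbin recursion. It is not the code used for these examples. The warping coefficient `lam = 0.756` (a value often quoted as approximating the Bark scale at 44.1 kHz) and all function names are assumptions.

```python
import random


def allpass(x, lam):
    """First-order all-pass D(z) = (z^-1 - lam) / (1 - lam * z^-1)."""
    y, x_prev, y_prev = [], 0.0, 0.0
    for xn in x:
        yn = -lam * xn + x_prev + lam * y_prev
        y.append(yn)
        x_prev, y_prev = xn, yn
    return y


def warped_autocorrelation(x, order, lam):
    """r[k] = sum_n x[n] * (D^k x)[n]; lam = 0 gives ordinary autocorrelation."""
    r, xk = [], list(x)
    for _ in range(order + 1):
        r.append(sum(a * b for a, b in zip(x, xk)))
        xk = allpass(xk, lam)
    return r


def levinson_durbin(r, order):
    """Solve the normal equations; returns ([1, a1, ..., ap], error energy)."""
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= 1.0 - k * k
    return a, err


def warped_lpc(x, order=10, lam=0.756):
    """Coefficients of a warped prediction-error filter (assumed defaults)."""
    return levinson_durbin(warped_autocorrelation(x, order, lam), order)


# Fit a 10th-order model to a stand-in noise residual (white noise here).
rng = random.Random(0)
residual = [rng.gauss(0.0, 1.0) for _ in range(2000)]
coeffs, err = warped_lpc(residual)
```

The coefficients describe the spectral envelope of the residual on a warped frequency axis. Resynthesis would additionally require filtering white noise through the corresponding warped all-pole filter, which is not shown here.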

Original sound (WAV file, 44.1kHz, 16bit, 312kB)

Reconstructed sound (WAV file, 44.1kHz, 16bit, 370kB) obtained from synthesis based on F0 + Harmonic Envelope subsampled 1:1000
(A phase-incoherent baseline heterodyne analysis, acting as a mock-up of a perfect harmonic sinusoidal model without residual noise)
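As a rough illustration of what a heterodyne analysis with a 1:1000 subsampled envelope involves, the sketch below shifts one harmonic down to 0 Hz, low-pass filters it with a moving average and keeps every 1000th envelope value. It is a simplified sketch, not the analysis used for these examples: it assumes a constant, known F0 and a moving-average window of one fundamental period, and the function names are hypothetical.

```python
import cmath
import math


def heterodyne_envelope(x, fs, f0, k, win):
    """Amplitude envelope of harmonic k of a signal with constant F0.

    The signal is multiplied by exp(-j*2*pi*k*f0*n/fs), which moves
    harmonic k to 0 Hz, then averaged over `win` samples (moving average).
    """
    base = [xn * cmath.exp(-2j * math.pi * k * f0 * n / fs)
            for n, xn in enumerate(x)]
    env, acc = [], 0j
    for n, b in enumerate(base):
        acc += b
        if n >= win:
            acc -= base[n - win]  # slide the averaging window
        env.append(2.0 * abs(acc) / min(n + 1, win))
    return env


def subsample(env, factor=1000):
    """Keep every `factor`-th envelope value (1:1000 by default)."""
    return env[::factor]


# Toy test tone: two harmonics of 441 Hz with amplitudes 1.0 and 0.5.
fs, f0 = 44100, 441.0
x = [math.cos(2 * math.pi * f0 * n / fs)
     + 0.5 * math.cos(2 * math.pi * 2 * f0 * n / fs)
     for n in range(4410)]
period = int(fs / f0)  # 100 samples
env1 = subsample(heterodyne_envelope(x, fs, f0, 1, period))
env2 = subsample(heterodyne_envelope(x, fs, f0, 2, period))
```

At a 44.1 kHz sampling rate, 1:1000 subsampling leaves 44.1 envelope values per second for each harmonic. The first value of each envelope is affected by the start-up of the moving average.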

Reconstructed sound (WAV file, 44.1kHz, 16bit, 312kB) obtained from synthesis based on instantaneous F0 + Harmonic Envelope subsampled 1:1000 + prototype signal. NOTE: even though no residual background noise is modeled in this example, the resynthesized sound reproduces a fair amount of the mechanical noise, especially the attack noise.

Reconstructed sound (WAV file, 44.1kHz, 16bit, 312kB) obtained from synthesis based on F0 + Harmonic Envelope subsampled 1:1000 + a noise residual modeled by 10th-order WLPC.

Reconstructed sound (WAV file, 44.1kHz, 16bit, 312kB) obtained from synthesis based on instantaneous F0 + Harmonic Envelope subsampled 1:1000 + prototype signal. As above, the noise residual is modeled by 10th-order WLPC.

Problem revealed in this example:

A part of the attack noise is already covered by the modulated partials. This noise is not phase-aligned with the original sound, so it is not subtracted properly in the residual. As a result, the general amount of noise is audibly overestimated.
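The effect can be illustrated with a toy numerical experiment. It assumes the noise in the original and the noise carried by the modulated partials are independent white Gaussian sequences of equal power, which is a strong simplification of the real signals, and the function names are hypothetical.

```python
import random


def power(x):
    """Mean squared value of a sequence."""
    return sum(v * v for v in x) / len(x)


def residual_power_ratio(n=20000, seed=0):
    """Residual power relative to the original noise power.

    Toy model: the partials carry noise with the same power as the
    original noise but with unrelated phase (an independent draw), so
    subtracting one from the other cannot cancel it.
    """
    rng = random.Random(seed)
    original_noise = [rng.gauss(0.0, 1.0) for _ in range(n)]
    model_noise = [rng.gauss(0.0, 1.0) for _ in range(n)]
    residual = [a - b for a, b in zip(original_noise, model_noise)]
    return power(residual) / power(original_noise)


ratio = residual_power_ratio()
```

Under these assumptions the residual has roughly twice the power of the original noise rather than none. A noise model fitted to that residual therefore adds noise on top of what the partials already reproduce.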


Reconstructed sound (WAV file, 44.1kHz, 16bit, 312kB) obtained from the SMS technique with a frame rate of about 44 Hz (hop = 1000 samples) and no residual noise. There is a noticeable squeal-like artifact in the middle of the sound.


Reconstructed sound (WAV file, 44.1kHz, 16bit, 312kB) obtained from the SMS technique (as above) + background noise modeled using 10th-order WLPC. We used our own noise model in this example, since the traditional LPC-based model in the SMS software produced too many artifacts. Note that SMS is quite successful in resynthesizing a sound very similar to the original, thanks to an accurate noise model.

Reconstructed sound (WAV file, 44.1kHz, 16bit, 312kB) obtained from the LORIS technique with a bandwidth association region width of 100 Hz and partials constrained to harmonic frequencies. As in other examples, this technique generally overestimates the amount of noise in the sound.