SMS Analysis
Starting with the STFT that we have already implemented,
in this part we will add everything that is missing in
order to have a complete SMS analysis.
Here is the SMS analysis block diagram:
It's important to note that the input signal is split
into two branches, both of which are analyzed spectrally. One
of the resulting spectra is used to detect the
sinusoidal part of the signal; the other is used to form the
spectrum of the residual ('noisy') part of the signal by
subtracting the sinusoidal spectrum. It's a good idea to
make the properties of the two analyses adjustable independently,
since you might, for instance, want more frequency resolution
(through zero-padding or a bigger window size) before doing
the spectral peak detection. However, the hop size of both
analyses must be equal, as both must advance through the input
at the same rate. For this tutorial you may keep the window size
for the sinusoidal part of the signal constant, as making
it pitch synchronous would complicate things a lot.
For the reasons mentioned above, it's a good idea to make
a separate ProcessingComposite which only does the spectral
analysis. The SMS Analysis ProcessingComposite can then
use two of these objects, one for each branch, and allow
their properties to be set independently without too much
duplicate code. You can create a new ProcessingComposite
called MySMSAnalyzer and have it aggregate a MyAnalyzer
object (we will first do the sinusoidal component).
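As a rough illustration of the constraint described above, the following C++ sketch keeps the window size and zero-padding per branch while storing a single hop size for both. The names BranchSettings, SmsAnalysisSettings, nextPowerOfTwo and fftSize are hypothetical and are not CLAM classes; the zero-padding factor is assumed here to be a plain multiplier on the FFT size, which may differ from how CLAM defines it.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical settings for one spectral-analysis branch (not a CLAM type).
struct BranchSettings {
    std::size_t windowSize;     // analysis window length in samples
    std::size_t zeroPadFactor;  // assumed multiplier: 1 = none, 2 = double FFT size
};

// Hypothetical settings for the whole analysis: two branches, one hop size.
struct SmsAnalysisSettings {
    BranchSettings sinusoidal;
    BranchSettings residual;
    std::size_t hopSize;        // shared, so both branches advance in lockstep
};

// Smallest power of two that is greater than or equal to n.
std::size_t nextPowerOfTwo(std::size_t n) {
    std::size_t p = 1;
    while (p < n) p <<= 1;
    return p;
}

// FFT size of a branch: window length rounded up to a power of two,
// then multiplied by the zero-padding factor.
std::size_t fftSize(const BranchSettings& b) {
    assert(b.windowSize > 0 && b.zeroPadFactor > 0);
    return nextPowerOfTwo(b.windowSize) * b.zeroPadFactor;
}
```

With these definitions, a sinusoidal branch of {1025, 2} and a residual branch of {513, 1} can share a hop size of 256 while producing spectra of different sizes.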
The first thing we need to add is the Peak Detection
and Continuation. But before that we will have to understand
a little better how the SpectralPeak and SpectralPeakArray
classes work. Look at the code and the documentation of
these two classes.
Next, we will take a look at the PeakDetect Processing
class. Its only complexity is in the processing algorithm
and in how the configuration parameters are used. Here
you have a diagram that may help you in understanding
the algorithm.
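To make the general idea of spectral peak detection concrete, here is a minimal C++ sketch that picks local maxima of a dB magnitude spectrum above a threshold and refines each one with parabolic interpolation. It is not the PeakDetect implementation and uses none of its configuration parameters; the Peak struct, the detectPeaks function and the thresholdDb parameter are assumptions made for illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical peak record (not CLAM's SpectralPeak).
struct Peak {
    double bin;    // interpolated bin position
    double magDb;  // interpolated magnitude in dB
};

// Finds local maxima above thresholdDb in a dB magnitude spectrum and
// refines position and height with a parabola through three bins.
std::vector<Peak> detectPeaks(const std::vector<double>& magDb,
                              double thresholdDb) {
    std::vector<Peak> peaks;
    for (std::size_t k = 1; k + 1 < magDb.size(); ++k) {
        const double a = magDb[k - 1];
        const double b = magDb[k];
        const double c = magDb[k + 1];
        // Keep bins above the threshold that exceed the left neighbour
        // and are at least as large as the right neighbour.
        if (b <= thresholdDb || b <= a || b < c) continue;
        // Vertex of the parabola through (-1, a), (0, b), (1, c).
        // The denominator is strictly negative under the test above.
        const double offset = 0.5 * (a - c) / (a - 2.0 * b + c);
        peaks.push_back({k + offset, b - 0.25 * (a - c) * offset});
    }
    return peaks;
}
```

The interpolation is done on dB values because the main lobe of most analysis windows is closer to a parabola on a logarithmic scale than on a linear one.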
Now we are ready to add a PeakDetect Processing object
to the MyIn class.
Add peak detection to your application. For the time being
the user does not need to control the configuration;
we can just use the default configuration parameters. (Note:
at the time of this writing the PeakDetect algorithm needs
the input spectrum to be in dB, so you will have to convert
it to dB first and back to linear after the algorithm
is applied.)
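The conversion mentioned in the note can be sketched in plain C++ as follows. This assumes a magnitude spectrum stored as a vector of doubles and the usual 20*log10 amplitude convention; the function names and the floorDb parameter are hypothetical, and CLAM's Spectrum class has its own way of switching scales that is not shown here.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Converts linear magnitudes to dB. Values below the floor are clamped so
// that a zero magnitude does not produce minus infinity.
std::vector<double> linearToDb(const std::vector<double>& mag,
                               double floorDb = -200.0) {
    const double floorLin = std::pow(10.0, floorDb / 20.0);
    std::vector<double> out(mag.size());
    std::transform(mag.begin(), mag.end(), out.begin(), [floorLin](double m) {
        assert(m >= 0.0);  // magnitudes are non-negative by definition
        return 20.0 * std::log10(std::max(m, floorLin));
    });
    return out;
}

// Converts dB values back to linear magnitudes.
std::vector<double> dbToLinear(const std::vector<double>& db) {
    std::vector<double> out(db.size());
    std::transform(db.begin(), db.end(), out.begin(),
                   [](double d) { return std::pow(10.0, d / 20.0); });
    return out;
}
```

Because of the clamping, a round trip does not restore magnitudes that were below the floor; they come back as the floor value instead.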
Next we will add a Processing to perform sinusoidal peak
continuation (SinTracking). Look at the method
bool SinTracking::Do(const SpectralPeakArray& iPeakArray,
SpectralPeakArray& oPeakArray). This is the overload we call
if we are doing an inharmonic analysis and we don't have a
valid fundamental frequency in a given frame. It is based on
the classic McAulay & Quatieri algorithm.
Add a SinTracking object to your application, always using
the inharmonic version of the algorithm.
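The core idea of peak continuation can be illustrated with a deliberately simplified C++ sketch: each peak of the previous frame is linked to the nearest unused peak of the current frame, provided the frequency jump stays within a limit. This greedy nearest-frequency matching is neither the SinTracking implementation nor the complete McAulay & Quatieri procedure; continuePeaks and maxDeviationHz are hypothetical names.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// For every peak frequency of the previous frame, returns the index of the
// matched peak in the current frame, or -1 if the track ends here.
// Current-frame peaks that nobody claims would start new tracks.
std::vector<int> continuePeaks(const std::vector<double>& prevHz,
                               const std::vector<double>& currHz,
                               double maxDeviationHz) {
    assert(maxDeviationHz >= 0.0);
    std::vector<int> match(prevHz.size(), -1);
    std::vector<bool> taken(currHz.size(), false);
    for (std::size_t i = 0; i < prevHz.size(); ++i) {
        int best = -1;
        double bestDist = maxDeviationHz;
        for (std::size_t j = 0; j < currHz.size(); ++j) {
            const double dist = std::abs(currHz[j] - prevHz[i]);
            if (!taken[j] && dist <= bestDist) {
                best = static_cast<int>(j);
                bestDist = dist;
            }
        }
        if (best >= 0) {
            taken[static_cast<std::size_t>(best)] = true;
            match[i] = best;
        }
    }
    return match;
}
```

Processing the previous peaks in order means an earlier track can claim a peak that a later track would have matched more closely; resolving such conflicts is one of the things a full tracking algorithm has to handle.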
We can now understand much better what the xml configuration
file in the SMS Example is for. We keep
on adding more and more configuration parameters and we
would like them to be available to the user, but they are
very difficult to manage from a user menu. It would
be much better to have an xml configuration file that
holds the different parameters that are to be used
in the analysis/synthesis.
Add this functionality to your application. The new user
menu only needs the following options: Load Configuration
File, Display Input Audio, Play Input Audio, Analyze,
Synthesize, Display Output Audio, Play Output Audio (the
last three are still useless, but we can already add
them so they are available for future development).
- Comment the most important steps you have had to take
up until now.
We have so far analyzed the sinusoidal component; now
we have to do the same with the residual. But if we take
another look at the SMS analysis block diagram we see
that in order to obtain the residual spectrum, we have
to synthesize the sinusoidal component and subtract it
from the "original". This
so-called "original" spectrum does not necessarily have
to be the spectrum used as input of the PeakDetection
algorithm. It has to be the result of a spectral
analysis that meets two specific requirements: (1) the analysis
window has to be a Blackman-Harris 92 dB window (BH92), as
this window is used when synthesizing the spectral peaks of
the sinusoidal component (see below), and (2) the spectrum
must be of the same size as the spectrum we obtained by
synthesizing the spectral peaks (see below).
Note that the subtraction of the sinusoidal component from
the "original" component can be done in the spectral (frequency)
domain as well as in the time domain. In this tutorial we
use spectral domain subtraction.
To obtain a spectrum from a peak array, we have to use
the SynthSineSpectrum class. This Processing uses quite a
simple algorithm: we just convolve the spectral peak array
with the main lobe of the Fourier transform of a Blackman-Harris
92 dB windowing function.
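Since a peak array is a set of isolated spectral lines, convolving it with the window's main lobe amounts to placing one scaled copy of that lobe at every peak. The C++ sketch below illustrates this under several assumptions that are not taken from SynthSineSpectrum: the window is a zero-phase 4-term Blackman-Harris with the commonly quoted coefficients 0.35875, 0.48829, 0.14128 and 0.01168, bins are measured without zero-padding (so the main lobe spans 4 bins on each side), only positive-frequency bins are written, and the lobe is normalised to 1 at its centre. SinePeak, bh92Lobe and synthSineSpectrum are hypothetical names.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

const double kPi = 3.14159265358979323846;

// Hypothetical peak record: fractional bin position, amplitude, phase (rad).
struct SinePeak { double bin; double amp; double phase; };

// Normalised transform of a zero-phase 4-term Blackman-Harris window of
// even length n, evaluated d bins away from the centre of the main lobe.
double bh92Lobe(double d, int n = 512) {
    assert(n > 0 && n % 2 == 0);
    static const double a[4] = {0.35875, 0.48829, 0.14128, 0.01168};
    double sum = 0.0;
    for (int i = -n / 2; i < n / 2; ++i) {
        const double x = 2.0 * kPi * i / n;
        const double w = a[0] + a[1] * std::cos(x) + a[2] * std::cos(2.0 * x)
                       + a[3] * std::cos(3.0 * x);
        sum += w * std::cos(d * x);  // window is symmetric: real transform
    }
    return sum / (n * a[0]);
}

// Adds a scaled, phase-rotated copy of the main lobe around every peak.
std::vector<std::complex<double>> synthSineSpectrum(
        const std::vector<SinePeak>& peaks, std::size_t numBins) {
    std::vector<std::complex<double>> spec(numBins);
    for (const SinePeak& p : peaks) {
        const long first = static_cast<long>(std::ceil(p.bin - 4.0));
        const long last = static_cast<long>(std::floor(p.bin + 4.0));
        const std::complex<double> rot(std::cos(p.phase), std::sin(p.phase));
        for (long k = first; k <= last; ++k) {
            if (k < 0 || k >= static_cast<long>(numBins)) continue;
            spec[static_cast<std::size_t>(k)] +=
                p.amp * bh92Lobe(static_cast<double>(k) - p.bin) * rot;
        }
    }
    return spec;
}
```

Evaluating the lobe by a direct sum is slow but keeps the sketch short; a practical implementation would read it from a precomputed table.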
Add this synthesis class to your program.
- Does the zero-padding factor we have used in the spectral
analysis of the sinusoidal component affect the spectrum
synthesized from the spectral peaks in any way (e.g.
size of the spectrum, etc.)?
Now we are ready to subtract the synthesized spectrum
from the "original". But we have a problem: if the "original"
spectrum has been analyzed with a window other than
the BH92, we will be subtracting two very different things.
That is why the SMSAnalysis class has two different SpectralAnalysis
objects: one for the sinusoidal component and another
one for the residual. The sinusoidal component can be
analyzed using the most convenient window, but the residual
always has to use a BH92 because the sinusoidal spectrum
was synthesized using a BH92 window (or at least, its
main lobe).
Add this functionality to your application (residual spectral
analysis separated from sinusoidal).
Now you are ready to subtract the synthesized spectrum
from the original one.
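A minimal C++ sketch of this step, assuming both spectra are stored as vectors of complex bins of equal size (the Spectrum alias and subtractSpectra are hypothetical names, not CLAM classes):

```cpp
#include <cassert>
#include <complex>
#include <cstddef>
#include <vector>

using Spectrum = std::vector<std::complex<double>>;

// Returns original - sinusoidal, bin by bin. The subtraction is done on
// complex values so that phase is taken into account, not on magnitudes.
Spectrum subtractSpectra(const Spectrum& original, const Spectrum& sinusoidal) {
    assert(original.size() == sinusoidal.size());  // same size is required
    Spectrum residual(original.size());
    for (std::size_t k = 0; k < original.size(); ++k)
        residual[k] = original[k] - sinusoidal[k];
    return residual;
}
```

If the spectra are held as separate magnitude and phase arrays, they need to be converted to complex form before subtracting, since subtracting magnitudes alone ignores the phase relation between the two signals.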
And finally, we can add a couple of plots so we can see
the differences between the sinusoidal and residual spectra.
- Add these plots and explain what you see, using the
usual examples: sine.wav, sweep.wav, noise.wav, 1.wav, 2.wav and 3.wav.