Processing on the Spectral Domain
Before we continue with
the tutorial, we must restructure our application a bit.
Ultimately, we want an application which can analyze audio,
transform this analysis and resynthesize it. These three
parts of our application will be encapsulated as objects
which are ProcessingComposites. We must familiarize ourselves
with the concepts of Processings, ProcessingComposites
and Configs.
- What is a Processing? Look at the Processing base
class. What are its most important methods? Some
must always be implemented in any class
that derives from this base class. What are they?
- What is a ProcessingComposite? Look at the base class.
What is the difference between this and the basic Processing
class?
In the next part of the
tutorial we are going to focus on the analysis part of
our application. Now, we will create a new class called
MyAnalyzer, which is derived from ProcessingComposite.
(Note: to do so, we also need to create a Dynamic Type
class, derived from ProcessingConfig, to configure this class.)
For now, the class only needs to print "I'm doing
it!" in its Do() method. The configuration class needs
only a name field.
Add an instance of this
class to your application class, and add a temporary item
to the menu which calls MyAnalyzer::Do() in order to test
it.
Implement and compile this. Run it and verify that everything
is working as it should.
If this is all working as
it should (i.e. it's printing the correct string), it's
a good idea to make a copy of this 'empty' class, as we
must implement three such classes later on in the tutorial.
Possibly one of the major attractions of the CLAM
library is its spectral processing capabilities. From
here to the end of the tutorial we will focus on this
domain while becoming familiar with more CLAM
tools and ways of working.
First we have to talk about a very important ProcessingData:
the Spectrum. Open the Spectrum.hxx file. It is quite a
complex data class. Most of its complexity is due to the
fact that it allows for its data to be stored in different
formats.
- Explain the different formats the Spectrum offers
to represent spectral data. Note that the Spectrum is
the only ProcessingData in the CLAM
repository that has an associated configuration. Explain
the different attributes of this configuration
and what they mean.
In order to have a spectrum in our application, we will
have to deal with the FFT. At the time of this writing,
CLAM has three different implementations
of the FFT: one based on the Numerical Recipes book, another
that uses the new_Ooura implementation, and finally
one that uses the FFTW library from MIT. The
last one is the most efficient and is the one that is used
by default.
- Look at the FFT_Config.hxx file in the CLAM
repository. What is the only parameter used to configure
the FFT? What is its mathematical relation with the
size of the resulting spectrum?
Now we will add an FFT to our analysis composite (MyAnalyzer)
and we will add the "Analyze" option in the main user
menu in our application. Once the user chooses this option
we will ask for the FFT size. One of the problems we have
to face is how to "cut" the input audio into smaller chunks
or frames (for the time being they have to be the same
size as the FFT).
Add the FFT without adding any other Processing.
To debug CLAM applications, we can use
some of the tools available in the Visualization modules.
But sometimes it is more useful to generate an XML
file with the content of one of the objects that
are in memory at a given moment. Most CLAM
objects can be serialized to XML by calling their Debug()
method (a Debug.xml file will be created), either explicitly
in code or using a debugger. General purpose XML serialization/deserialization
is provided by the XMLStorage class.
Now we can consult the content of our spectrum in XML
and describe its main features.
- Debug your application, adding a breakpoint after
the FFT has been performed. Then Debug() your spectrum
and analyze the resulting XML file. Repeat this process
with the sine.wav, sweep.wav and noise.wav files. What is the content of the resulting spectrum
in each case?
As we have just seen, textual debugging is sometimes
not the most convenient approach when trying to analyze the effect
of a given algorithm or process. We will use a Spectrum
Snapshot to be able to inspect the result visually.
Add a Spectrum Snapshot to your application so you can visualize
the result of the FFT.
Now we can already see the result of the spectral analysis.
These are the main features that are usually analyzed
when looking at a spectrum: fundamental frequency (if
available), harmonicity, centroid (center of gravity of
the spectrum), spectral deviation (standard deviation
from the center of gravity, indicates whether the spectral
energy is concentrated at one point or spread throughout
the whole spectral range) and spectral irregularity (how
flat the spectrum is).
- Analyze the following files: sine.wav,
sweep.wav, noise.wav, 1.wav, 2.wav and 3.wav. Comment on the shape of the spectra, taking into
account the above properties.
To finish this part of the tutorial, note
that the spectrum snapshot we have just added has introduced
a big overhead because it opens for every audio frame.
Add an option to the user menu so the snapshot can be activated/deactivated.