Existing solutions for the descriptors problem
David Garcia

Introduction

This text complements the analysis carried out to design a general solution in CLAM for sharing descriptor extraction algorithms among projects.

DescriptorSet solution (Nir/Miquel)

Intent

The main idea of this solution is to provide a reusable descriptor container with no compile-time semantic bindings. The data type and name of each descriptor are provided at runtime. Algorithm results are inserted into and fetched from the descriptor container by name, using a type-safe interface.

Description

This solution is implemented using several maps from strings to values. Each map is dedicated to a single data type, so all the TData (float/double) descriptors go into one map and the string descriptors go into a different one. Each data type also has its own specialized access methods.
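The per-type maps can be sketched as follows. This is only an illustrative Python sketch, not the CLAM code: the class layout, the method names and the descriptor names ("SpectralCentroid", "Genre") are assumptions made for the example, and only two data types are shown.

```python
class DescriptorSet:
    """Hypothetical sketch: one name -> value map per data type."""

    def __init__(self):
        self._numbers = {}  # stands in for the TData (float/double) map
        self._strings = {}  # map for string descriptors

    def set_number(self, name, value):
        # The typed setter rejects values of the wrong type.
        if not isinstance(value, (int, float)):
            raise TypeError(f"{name!r} expects a number")
        self._numbers[name] = float(value)

    def get_number(self, name):
        return self._numbers[name]

    def set_string(self, name, value):
        if not isinstance(value, str):
            raise TypeError(f"{name!r} expects a string")
        self._strings[name] = value

    def get_string(self, name):
        return self._strings[name]


# Names and types are only known at runtime (hypothetical descriptors).
descriptors = DescriptorSet()
descriptors.set_number("SpectralCentroid", 1520.5)
descriptors.set_string("Genre", "jazz")
```

Because each type has its own map and accessors, asking for a descriptor through the wrong accessor fails at the lookup instead of returning a value of an unexpected type.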

Supported types are

Serialization can be done directly by storing each descriptor with its name. Deserialization is more complex because we may not know which names or types are involved, so each project needs a kind of data dictionary that defines them.
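A minimal sketch of why the data dictionary is needed, assuming a plain "name=value" text format (the format, the function names and the dictionary contents are all hypothetical): once written as text, the values carry no type, so the reader has to look the type up by name.

```python
# Hypothetical project data dictionary: descriptor name -> type tag.
DATA_DICTIONARY = {"SpectralCentroid": "number", "Genre": "string"}

CONVERTERS = {"number": float, "string": str}


def serialize(values):
    """Store each descriptor under its name, one 'name=value' per line."""
    return "\n".join(f"{name}={value}" for name, value in sorted(values.items()))


def deserialize(text, dictionary):
    """Rebuild typed values; the dictionary says which type each name has."""
    result = {}
    for line in text.splitlines():
        name, _, raw = line.partition("=")
        type_tag = dictionary[name]  # KeyError if the project does not define it
        result[name] = CONVERTERS[type_tag](raw)
    return result


stored = serialize({"SpectralCentroid": 1520.5, "Genre": "jazz"})
restored = deserialize(stored, DATA_DICTIONARY)
```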

Analysis

The main strength of this solution is that algorithms are independent of the final data dictionary definition. Currently, the application itself inserts into and retrieves from the DescriptorSet the descriptors that each processing produces or needs. In a port-based application this role can be played by special processings that are configured with the name of the descriptor to retrieve or insert.

Fact sheet

AudioClas descriptors (Nicolas Wack)

Intent

The AudioClas descriptors system was designed in order to reuse other projects' descriptor extraction in AudioClas. The final system addresses two problems:

Descriptor Abstraction

A 'Descriptor' is an object that calculates a given descriptor. Each Descriptor subclass has:

All you have to do to add a new descriptor is define a new class and use a registrator to add it to the descriptor factory.
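The registration step can be sketched as below. This is a Python illustration under assumptions, not the AudioClas code: the factory is a plain dictionary, the registrator is a class decorator, and the "Energy" descriptor is a made-up example.

```python
# Hypothetical descriptor factory: descriptor name -> class.
DESCRIPTOR_FACTORY = {}


def register_descriptor(cls):
    """Registrator: adds the class to the factory under its declared name."""
    DESCRIPTOR_FACTORY[cls.name] = cls
    return cls


class Descriptor:
    """Base class: an object that calculates a given descriptor."""

    name = None

    def compute(self, samples):
        raise NotImplementedError


@register_descriptor
class Energy(Descriptor):
    """Made-up example descriptor: sum of squared samples."""

    name = "Energy"

    def compute(self, samples):
        return sum(x * x for x in samples)


def create_descriptor(name):
    """Instantiate a descriptor knowing only its name."""
    return DESCRIPTOR_FACTORY[name]()


energy = create_descriptor("Energy").compute([1.0, 2.0, 2.0])
```

Adding a descriptor touches no existing code: the new class and its registration are enough for the factory to build it by name.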

Calculation with dependencies

The planner just takes each goal descriptor and:

  1. checks that the descriptor is not already calculated
  2. asks for the calculation of its dependencies
  3. takes pointers to the input dependencies and to the output in the data pool
  4. asks the descriptor to calculate itself
  5. when the descriptor is frame based, the above is done for each frame
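The steps above can be sketched as a small recursive function. This is an illustrative Python sketch: the descriptor table, the descriptor names and the use of one dictionary per frame as the data pool are assumptions made for the example.

```python
# Hypothetical descriptor table: name -> (dependency names, function).
DESCRIPTORS = {
    "Energy": (["Samples"], lambda samples: sum(x * x for x in samples)),
    "Mean": (["Samples"], lambda samples: sum(samples) / len(samples)),
    "EnergyOverMean": (["Energy", "Mean"], lambda energy, mean: energy / mean),
}


def calculate(goal, pool, order):
    """Calculate one goal descriptor, resolving its dependencies first."""
    if goal in pool:                           # 1. already calculated
        return
    dependencies, function = DESCRIPTORS[goal]
    for dependency in dependencies:            # 2. calculate dependencies
        calculate(dependency, pool, order)
    inputs = [pool[d] for d in dependencies]   # 3. references into the pool
    pool[goal] = function(*inputs)             # 4. the descriptor calculates itself
    order.append(goal)                         # record the calculation order


frames = [[1.0, 3.0], [2.0, 2.0]]
results = []
for samples in frames:                         # 5. repeated for each frame
    pool = {"Samples": samples}                # raw input is already in the pool
    order = []
    calculate("EnergyOverMean", pool, order)
    results.append(pool["EnergyOverMean"])
```

The check in step 1 is what keeps a dependency shared by several goals from being calculated twice within the same frame.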

Descriptor pool

All the descriptors, even the spectral data, are stored as raw data in two TData pools: one for frame descriptors and another for global descriptors. The frame descriptor pool is stored interleaved, that is, sorted by frame rather than by descriptor.
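A minimal sketch of such an interleaved frame pool, assuming a single flat buffer of doubles and a fixed layout per frame. The class, its methods and the descriptor names are hypothetical; only the frame-major ordering comes from the description above.

```python
from array import array


class FramePool:
    """Hypothetical frame pool: one flat buffer, stored frame after frame."""

    def __init__(self, frame_count, layout):
        # layout: list of (name, size); offsets are assigned in order.
        self.slots = {}
        offset = 0
        for name, size in layout:
            self.slots[name] = (offset, size)
            offset += size
        self.hop = offset  # number of values stored per frame
        self.data = array("d", [0.0] * (frame_count * self.hop))

    def write(self, frame, name, values):
        offset, size = self.slots[name]
        if len(values) != size:
            raise ValueError(f"{name!r} holds {size} values per frame")
        start = frame * self.hop + offset
        self.data[start:start + size] = array("d", values)

    def read(self, frame, name):
        offset, size = self.slots[name]
        start = frame * self.hop + offset
        return list(self.data[start:start + size])


pool = FramePool(2, [("Centroid", 1), ("Spectrum", 3)])
pool.write(0, "Centroid", [100.0])
pool.write(0, "Spectrum", [1.0, 2.0, 3.0])
pool.write(1, "Centroid", [200.0])
pool.write(1, "Spectrum", [4.0, 5.0, 6.0])
```

All the values of frame 0 come before any value of frame 1, so the descriptors of one frame are adjacent in memory while the values of one descriptor are one hop apart.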

Analysis

One of the highlights of that approach is the pluggability of new descriptors.

This approach separates the reference scope of each descriptor and its calculation dependencies. It only supports two scope levels: Frame and Global. The final system should support more scope levels than those.

Data storage is contiguous and sequential. Unlike the DescriptorSet approach, it gives faster descriptor retrieval because the lookup is done by integer indexing. A name-based retrieval is still needed to obtain the offset, size and hop used for the indexed lookup, but its result is reused across all the frames.
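The difference in lookup cost can be sketched as follows, assuming a flat interleaved buffer and a small table from names to (offset, size). The table, the buffer contents and the function name are made up for the example.

```python
# Hypothetical layout: every frame stores [Centroid, Energy, Pitch].
HOP = 3
SLOTS = {"Centroid": (0, 1), "Energy": (1, 1), "Pitch": (2, 1)}  # name -> (offset, size)

# Three frames, stored one after another.
buffer = [100.0, 0.5, 440.0,
          110.0, 0.6, 442.0,
          120.0, 0.7, 445.0]


def track(buffer, name):
    """Collect one descriptor along all the frames."""
    offset, size = SLOTS[name]  # the only name-based lookup
    frame_count = len(buffer) // HOP
    values = []
    for frame in range(frame_count):
        start = frame * HOP + offset  # plain integer arithmetic per frame
        values.append(buffer[start:start + size])
    return values


energy_track = track(buffer, "Energy")
```

The string lookup happens once per descriptor, not once per frame, which is where the speed advantage over a map keyed by name comes from.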

Fact sheet

BasicStatistics refactoring (Xavi Amatriain)

Intent

Most descriptors are obtained from statistics over an array, and many of them need the same intermediate calculations again and again. The intent of this approach is to minimize those calculations by caching them.

Description

The data structure Stats contains the cached data. That class also knows how to calculate the statistics from a given Array. Every time a client asks it for a given statistic, the object checks whether it has already calculated that statistic, and then either returns the cached value or calculates it.
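The caching behaviour can be sketched as below. This is a Python illustration, not the CLAM Stats class: the method names, the cache keyed by statistic name and the `computations` counter (added only to make the caching observable) are assumptions.

```python
class Stats:
    """Hypothetical sketch: statistics over an array, computed at most once."""

    def __init__(self, data):
        self._data = list(data)
        self._cache = {}
        self.computations = 0  # counts real calculations, for illustration

    def _cached(self, key, compute):
        # Return the cached value, or calculate and store it on first request.
        if key not in self._cache:
            self._cache[key] = compute()
            self.computations += 1
        return self._cache[key]

    def mean(self):
        return self._cached("mean", lambda: sum(self._data) / len(self._data))

    def variance(self):
        def compute():
            mean = self.mean()  # reuses the cached mean if it exists
            return sum((x - mean) ** 2 for x in self._data) / len(self._data)

        return self._cached("variance", compute)


stats = Stats([1.0, 2.0, 3.0, 4.0])
variance = stats.variance()  # calculates the mean, then the variance
mean = stats.mean()          # served from the cache
```

Statistics that depend on others, as the variance depends on the mean here, benefit the most, since the shared intermediate value is calculated only once.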

Analysis

The biggest flaw of this solution is its scalability, for example when aggregating frame statistics along one segment. Given an array of statistical descriptors, where each element is a tuple of descriptors, operations are applied over whole elements of the Array, so they cannot be applied to a single descriptor along the array of tuples. The provided solution is to compute the statistic (e.g. the mean) on every instantiated descriptor of a descriptor set (e.g. SpectralDescriptors) along the array. This is clearly not what we intended.
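The limitation can be illustrated with a reduced stand-in for a descriptor set. This is a hedged Python sketch: the two fields and the function name are hypothetical, and every field is assumed to be instantiated.

```python
from dataclasses import dataclass, fields


@dataclass
class SpectralDescriptors:
    """Reduced stand-in for a descriptor set; the fields are hypothetical."""

    centroid: float
    spread: float


def mean_along(frames):
    """Mean of every descriptor of the set along the array of tuples."""
    count = len(frames)
    return SpectralDescriptors(*(
        sum(getattr(frame, field.name) for frame in frames) / count
        for field in fields(SpectralDescriptors)
    ))


segment = [SpectralDescriptors(100.0, 10.0), SpectralDescriptors(200.0, 30.0)]
segment_mean = mean_along(segment)
```

A client interested only in the mean centroid still pays for the mean of every other descriptor in the set, because the operation works on whole elements of the array.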

Fact sheet

Possible enhancements

This system could be greatly enhanced by generalizing the interface used to collect data. The generalization can follow two alternative paths:

Either of these solutions would provide scalability when calculating statistics on statistics, but it would still fall short on project extensibility, discrete descriptors and algorithms based on data semantics.