GSoC 2008/ideas

From Clam
Jump to: navigation, search

Integrating external libraries

Integrating one or more of those libraries could be great


An existing experiment of processing exists as plugin. Still it is not working properly and proper configuration parameters must be found.

Working with different samplerates is a hard problem now in CLAM. One solution could be using network configuration parameters (see GSoC project). still, wave file could provide different samplerates thant the Also integration with FileReaders and Writers and with Backends will be useful.

This project consists in providing both a good Resampling processing and minimizing samplerate impedance between processings requiring a given samplerate, audio files and the audio backend.

Libflac and libspeex

CLAM's AudioFileReaders/Writers use different backends for audio loading and saving. Such backends offers the readers and writers a common interface to several libraries. Current underlying libraries are sndfile for wav, aiff, au, and others, libmad for MP3 and liboggvorbis for ogg/vorbis.

There are some other libraries developed at which support other formats:

  • libflac supports FLAC which is a lossless compression format
  • libspeex supports speex format which is specialized on voice compression

This project will consist on adding support for such libraries in CLAM.


Aubio provide nice realtime onset detection algorithms. One project could be integrating aubio as processing(s) into clam. Other task to be done is to create an Annotator extractor for Aubio. There is an outdated example in CLAM/Example/TickExtractor that did tempo tracking and meter estimation for the Annotator. Development on such example was discontinued during a period when aubio license was not clearly GPL. Now it needs some work to have it working again.

This project could consist in providing processings that makes aubio features availables on the NetworkEditor or/and making the TickExtractor example working again.

New standalone applications

End user applications are interesting because they increase the user base and project visibility. Building CLAM based applications require skills on Qt.

Standalone chord extractor application

We have the chord extraction running whithin Annotator, NetworkEditor, Offline, but none of these applications are addressed towards the real use that one can make of such technology being a guitar player.

This project consists on building an application which addresses usability issues for the use case that some one want to learn the chords of a song.

Some of the concrete use cases to be addressed are:

  • Most of user actions should consist on single key actions (no control keys) to facilitate usage with an musical instrument at hand.
  • Precomputing a song when loading it.
  • Playback
    • Preloading the song
    • Moving fastforward and backwards
    • Playing a limited length of the song and then back
    • Replay from the same point
    • Time stretch playback
  • Display
    • A scrolling segmentation view
    • Segmentation view should have chord text displayed
    • Segmentation view should have a different color for each chord for the user to associate them. Maybe different color association per song or maybe using a color/pattern combination for Root/Mode.

Educational vowel synthesizer

Build a educative program that would consist on:

  • synthesizing different vowels by placing a point within the Vowel triangle,
  • and, the reverse, given an input vowel from the microphone place a dot on the triangle, so the students can check their pronunciation. Here's a screenshot of what a Vowel triangle might look like, additionally with 'Voronoi' regions shown - showing what the near misses are most likely to be confused with in a particular language.

This could be enhanced by:

  • displaying the mouth position for the vowel,
  • visualizing the spectral peaks so they can identify the effect,
  • changing the pitch,
  • changing the synthesized gender,
  • changing the vocal track characteristics,
  • extracting user's vocal track characteristics for synthesized sound.

A teacher could limit the set of vowels to the ones used for a particular language such as Catalan or English, so that the students just see the relevant ones for the exercise.

Note: Last year a student focused on having some of the processing working. Processing still needs work but this year we will like to focus on the interface so we can say the application exists, and then enhance the processing as secondary goal.

Real-time synthesizer using SMS models

Salto is a CLAM legacy application to synthesize Saxo. This project consist on revamp a very simple Salto like synthesizer.

It consists on:

  • Building a db of sms samples
  • Building processings to
    • process control signals
    • choose samples
    • interpolate
    • create stream flow
    • etc.

Recovering from Processing errors

Some error conditions are not available until the processings are executed. Such conditions are raised on processing thread while it should be detected by the application interface thread.

This task (which is not a project by itself, but can complement others that are also small) consists in implementing a proper method of raising such error conditions and catching without the application dying, just recovering the backend and stopping the player.

Also graphical (qt) handler of assertion failures has been disabled because when the assertion is thrown from the processing thread the error message was even more obscure that the console message you get with the plain console handler. Another related task could be catching assertions and forward them to the application thread by any means without AssertFailure reaching the top of the processing thread stack.

Enhancing plugin system

TODO: Split that in small project packages

Plugin systems are both a way of extending available modules for CLAM (CLAM as a plugin host) and also a way of using CLAM networks in 3rd party contexts (CLAM as a development tool to build plugins).

Several plugins systems exist: Ladspa, DSSI, LV2, VST, AudioUnits... Also CLAM has its own plugin system to allow native 3rd party extensions. After the last GSoC, CLAM supports some of them as host and some other as development tool, but there is still interesting aspects to be addressed:

  • Integrating other kind of plugins being CLAM the host: LV2, VST...
  • Plugin reloading and checking: one should be able to reload one kind of plugins or explicitly loading a given plugin
  • Creating plugins given a CLAM network
    • CLAM network as Ladspa/LV2 plugin
    • CLAM network as VST plugin
    • One button generation of plugins from NetworkEditor (native, LADSPA, VST...)
  • Explicit plugin (re)loading
  • Enabling Prototyper Qt interfaces in plugins systems that allow that such as VST, CoreUnits or LV2.

See more here: Devel/Plugins TODOs

This project has room for several projects. It also has contacts with many other projects: Subnetworks, metainformation, visual definition of processings... So if you choose this project, be specific on the set of aspects that will be addressed.

Integrate Faust with CLAM

FAUST is a functional language for audio processing. A faust module can be compiled as a ladspa plugin and then loaded into a clam network. Screenshot showing SVG diagrams into CLAM NetworkEditor: Thigs to be added:

  • Display faust diagrams (svg) embedded in clam processing boxes. Navigable (hierarchically) faust diagrams
  • On-the-fly compilation of faust modules
  • Faust code editing within clam networkeditor

Network Scalability (aka Subnetworks)

One compeling issue in CLAM is addressing scalability for the visual builder. That means building processings modules out of a composition of existing ones. CLAM supports that just by coding (ProcessingComposites), but one could consider sinks and sources of a network the ports and controls of a brand new processing. A project to address that should include:

  • Infrastructure to control the flow of recursive networks
  • Integrating user-defined networks in the component library.
  • Providing user interface on the NetworkEditor.
  • ...

Relative configurations and network configuration

That is being able to configure some parameters of a module in function of other modules parameters or even network level ones.

  • Some configuration parameters should take the value from other module parameters.
  • We could consider output only parameters which are computed based on the input ones.
  • We could add network level parameters so that we can change a repeated parameter at once.
  • Network level parameters could become regular module configuration when used as a subnetwork
  • A way of setting the connections that could be some hideable control ports with a different shape and position. Other could be editable expressions.

Network Editor documentation system

The idea is providing networks and processing modules of documentation which can be browsable and navigable from the network editor interface. This project would include the interface and a system to obtain metadata (web, webservice or just compiled information).

Processing Editor

The idea is that one could build new compiled processing modules from the interface by graphically defining the structure of the module and coding with an editor the key methods.

  • Defining a new processing just by defining the ports and the code in the Do method
  • The system will compile it and will add it to the user's plugin library
  • It would be great to have it integrated on the NetworkEditor
  • This project could be related to the Python scripting project (see below) but we are considering C++ code here.

Scripting CLAM in Python

  • several approaches: Adhoc API or API export using SWIG or SIP
  • build and use a clam network with python
  • this could lead to a NetworkEditor python console with tab-completion (QPyConsole)
  • use clam processings and processing-data with python. use it to visualize data, etc.
  • write clam processings with python
    • Related to the 'Processing editor' project
  • Network scoring
    • Sequencing changes on a network along the time
    • Configurations, connections, control sending...


Improve usability of SMSTools workflow

  • Remove the files to process from the configuration and use configuration as presets
  • Data to open are just audio or analysis data
  • Object centric interaction: you apply an action (synthesis, analysis, transform) to an object
  • Minimize the provided data by using object information (sampling rate on analysis, and most of the analysis parameters embedded on the object for resynthesis)

Convergence of SMSTools and CLAM widgets infrastructure

  • Transformation parameter automation using time-lines (ala ardour automation)
  • Metadata based widgets
  • Widgets interaction (zoom)
  • Network based transformations. The user should be able to define the transformation chain using NetworkEditor.

Enhancing chord detection algorithm for real-time usage

Chord extraction on Network Editor is less precise than the one available on the Annotator because it is realtime and it cannot include the post processing Annotator does. This project addresses enhancements to the algorithm for realtime usage as well as visual representations to exploiting imprecise data so that user could figure out what's going on.

Due to the extend it could be split into a set of projects. Check the details on the Chord Extraction TODO's page.

Speech synthesis

CLAM for Video

  • Add video processing/playback capabilities
  • Extend current Port Monitor infrastructure to generative visual algorithms or user-defined mappings between the audio analysis features and the graphics
  • Interface CLAM with GStreamer

Qt based interfaces for CLAM Network based VST plugins

Currently the Prototyper enables wrapping network in an interface built in with Qt Designer and play them as Jack or PortAudio application.

We would like to do the same with VST plugins. VST plugins have their own graphical control loop but some people succeded on using Qt on it. The key then is using such trick to run a Qt interface in several steps. First a dummy programmed qt interface. Then a xml (ui file) based interface and finally to connect it to the network.

Annotator and description data pools

Core CLAM developers are dedicated to real time processing issues and they have interest on someone taking care of this key application. Tasks of the following proposals can be mixed up in a single proposal if they build a coherent set.

See also Devel/Annotator_TODO's for further ideas.

Coretize Annotator Pool features

High priority on Annotator front.

Parts of this project can be added to another Annotator based project to add priority bonus or it can be a project by itself if you are a refactoring fanatic.

Annotator schemas have some features that are implemented in specific classes not in the core CLAM Pool related classes. In consequence, Annotator has an ill duplication of structures, and also other CLAM based applications cannot benefit of such features. This project would consist on merging such features in the common Pool related classes in CLAM core, refactoring the Annotator code base and enhancing/redesigning the Pool API usability.

Annotator data type and view plugins

High priority on Annotator front.

Many parts of the annotator are defined as plugin based: Type management, type dependant table editors, instant views, time evolution views and auralization. Those plugins are not real plugins as they are not loaded dlls, but clam already has solved multiplatform dll loading for plugins.

This project would consist in cleaning up the API's of each set of plugins and making them loadable dll's so that third party users can extend pool types and annotator views.

Annotation Merger

High priority on Annotator front.

This project consist in implementing an interface in the Annotator to merge and select the outputs of different existing extractors into a single project. That would be useful for example to combine in a project, chords extracted automatically by an algorithm, with hand annotated ground truth while keeping both in separated files.

BOCA interface was able to merge several annotation sources, such as extractors, databases and webservices, as a unique Annotator descriptors file (data pool). Python scripts developed to implement that do pretty well to define merges, projections and attribute renaming of several annotation sources (extractors, databases, webservices...).

BOCA merging scripts are distributed with the annotator. With some knowledge you can build a python extractor that does the merge. But for regular users this is not that intuitive to write an script, so this project proposes to build:

  • A graphical interface to build merging script
  • Some extra application logic needed to launch several extractors and to mapping back pool modifications.

VAMP plugins on the Annotator

VAMP is the feature extraction plugin system used by SonicVisualizer. This project would consist in being able to lauch VAMP plugins within the Annotator and integrating its results in an Annotator data pool.

Web services based extractors

Building extractors that call public web services such as Lastfm, MusicBrainz, EchoNest... and wraps the results as CLAM data pool xml to be open by the Annotator.

Pool based web services

Another possibility is offering web services that given an audio excerpt they return you the descriptors as an xml pool with or as any other existing format.

Macro manipulation of descriptors with python on Annotator

Make the annotation data accessible to python scripts so that one can automate repetitive descriptors manipulation. Some use cases:

  • "I want to correct an offset that a given segment annotation adds to all segments"
  • "I want to extend a regular beat annotation and just correct when it varies"

Enhancing Segment Editor

  • Selection and edit actions (copy, paste..)
  • Overlapping segments
  • Displaying segment attributes as height
  • Displaying segment attributes as color
  • Using metadata: Units, intervals...

Enhancing Annotator Views

  • Bring more CLAM monitors as instant views
  • Time evolution views for multibin frame parameters (spectrums, pcp's, chord correlation...)
  • Better frame level float editor

Flexible auralization on Annotator

Annotator provides auralization for the user to check the descriptors. But auralization capabilities are limited to playing a click on segment onset and float value frame level descriptors controlling the pitch of an oscillator. By adding a general way to convert descriptors to control signals they can be used to control an arbitrary synthesizer set by the user. Possible solutions for that could be a CLAM network with selected hooks for descriptors or generating OSC or MIDI to control a synthesizer.

This project consists in:

  • Define controllers that can take attributes on the pool data to control a network
  • Recode Annotator auralizer to use networks and to feed in control values
  • Provide the interface to configure auralizators for each project
  • Build several useful auralizations.

Configuration parameters for extractors

Currently there is no way of specifying configuration parameters to the extractors. Each extractor could have their own parameters so we need a system to deal with such flexibility but still being simple to define new extractors.

  • Defining a special extractor option to query the parameters (name, type, range, defaults...) as the one we have to query the output data schema.
  • Implementing the interface to query such parameters depending on the extractor.
  • Launching the extractor feeding the parameters
  • Offering a C++/python solution to ease the programming of extractors with configuration.

Enhancing NetworkEditor Usability

  • Add undo/redo support.
  • Add the possibility of save all the controls values to have networks ready to play (like a patch).
  • Cut and paste.
  • Any other feature that could enhance the usability

Hot-wiring in NetworkEditor

Currently any change to the network structure such as adding or removing processings, connecting them are not realtime safe and the network playback is stopped for safety reasons. This project would enable changing connections without stopping the network.

That means doing real-time safe changes on the network structure. Some known techniques exist to do that and mentors will provide guidance on how to address core classes.

Proficiency on C++ and basic notions of threading are the key skills to have. Having some knowledge on the CLAM core classes implementation would be perfect but not required.

MIDI support

  • NetworkEditor: Realtime Input/Output using portmidi. Read/Write on files
  • Improve and update all the MIDI support of the framework

Still not included

Those ideas were discarded to be offered as SoC project but if you are still interested, just ask.

CLAM documentation

  • Document CLAM's examples and review tutorials
  • Ideal project for non-expert first timers

CLAM web browser plug-in

  • Develop a plugin for basic CLAM Network edition through a web browser.

Navigation menu