January 26, 2010

Libraries becoming plugins, plugins becoming libraries

Let me post about some of the latest changes to the CLAM build system.

The CLAM project has been distributing two kinds of shared libraries: the main library modules (core, audioio and processing) and a set of optional plugins (spacialization, osc, guitareffects...). Plugins are very convenient, since programs like the NetworkEditor can be extended without recompiling, and they enable third-party extensions. They are so convenient that we were seriously considering splitting processing and audioio off as plugins.

But the content of a CLAM plugin can only be used through the abstract Processing interface. No other symbols are visible from other components. That has a serious impact on the flexibility to define plugin boundaries. For example, if a plugin defines a new data token type, any processings managing that token type must be in that same plugin. Using symbols of one plugin from another implies installing headers and providing a soname, a linker name and so on. That is, building a module library.

So, if plugins want to be libraries and libraries want to be plugins, let them all be both. Each module will be compiled as a library (libclam_mymodule.so.X.Y.Z), with soname (libclam_mymodule.so.X.Y), linker link (libclam_mymodule.so), headers (include/CLAM/mymodule/*), pc file... and it will also provide a plugin library (libclam_mymodule_plugin.so) which has no symbols of its own but is linked against the module library. When a program loads such a plugin, all the processings in the module library become available via the abstract interface. On the other side, if a program or a module needs to explicitly use the symbols of another module, it just has to include the headers and link against the module library.
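To illustrate why a plugin library with no symbols of its own can still make processings available, here is a minimal sketch of static self-registration. All the names in it (ProcessingRegistry, Gain, GainRegistrator) are hypothetical stand-ins for this example and not the actual CLAM classes; the only assumption is that loading the library runs the constructors of its static objects, which register the concrete types behind the abstract interface.

```cpp
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for the abstract Processing interface.
struct Processing {
    virtual ~Processing() = default;
    virtual std::string Name() const = 0;
};

// Hypothetical registry: a host only needs this and the abstract interface.
class ProcessingRegistry {
public:
    using Creator = std::function<std::unique_ptr<Processing>()>;

    static ProcessingRegistry& Instance() {
        static ProcessingRegistry registry; // created on first use
        return registry;
    }
    void Register(const std::string& key, Creator creator) {
        mCreators[key] = std::move(creator);
    }
    std::unique_ptr<Processing> Create(const std::string& key) const {
        auto it = mCreators.find(key);
        if (it == mCreators.end()) return nullptr;
        return it->second();
    }
private:
    std::map<std::string, Creator> mCreators;
};

// "Module library" side: a concrete processing whose symbols other
// modules could also use directly by including its header.
struct Gain : Processing {
    std::string Name() const override { return "Gain"; }
};

// Static object whose constructor runs when the library is loaded,
// making Gain reachable through the abstract interface.
struct GainRegistrator {
    GainRegistrator() {
        ProcessingRegistry::Instance().Register(
            "Gain", []() -> std::unique_ptr<Processing> {
                return std::unique_ptr<Processing>(new Gain);
            });
    }
};
static GainRegistrator gainRegistrator;
```

A host would then ask for ProcessingRegistry::Instance().Create("Gain") without ever seeing the Gain header, while another module could still include that header and link against the module library.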

I just applied the changes to the plugins and it seems to work nicely. Adapting the main modules will be harder because of the old SConstruct file, but it is a matter of days before we can split them. I moved a lot of common SCons code to the clam.py scons tool. There is a cute environment method called ClamModule which generates a clam module including the module library, soname, linker name, plugin, pkg-config, and install targets. I also added some build system freebies for the modules and applications:

  • All the intermediate generated code kept apart in a 'generated' directory
  • Non-verbose command line
  • Coloured command line
  • Enhanced scanning methods with black lists

Well, do not rely too much on the current clam scons tool API. Changes are still flowing into the Subversion repository. Of course, warn me if I broke something. ;-)

January 19, 2010

Boolean Controls in CLAM

After a long refactoring to get typed controls into CLAM without breaking anything, we finally have them. Kudos for this achievement also go to Francisco (who started the whole thing as his GSoC project), Hernan, Nael and Pau.

Now controls are defined just like ports, with a type. So, for example, you can define an out control of type OutControl<MyType>. As a side effect, control callbacks came in a natural way. Instead of using a different class (formerly InControlCallback<MyType, HostProcessingType>), now you just have to pass a processing method to the control constructor. Templates do the magic too, but that is hidden from the processing programmer API, which is cool. See some examples below.
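As a rough sketch of the idea, and not the real CLAM classes, the following simplified InControl and OutControl templates show how a typed control can take a method of its host processing as callback. The Counter class and every signature here are assumptions made up for the example.

```cpp
#include <functional>
#include <vector>

// Hypothetical, simplified typed input control (not the CLAM class).
template <typename T>
class InControl {
public:
    // The host processing passes one of its own methods as the callback;
    // the template hides the member-pointer plumbing from the programmer.
    template <typename Host>
    InControl(Host* host, void (Host::*method)(const T&))
        : mCallback([host, method](const T& value) { (host->*method)(value); }) {}

    void DoControl(const T& value) {
        mLast = value;
        mCallback(value);
    }
    const T& GetLastValue() const { return mLast; }
private:
    std::function<void(const T&)> mCallback;
    T mLast{};
};

// Hypothetical, simplified typed output control.
template <typename T>
class OutControl {
public:
    void AddLink(InControl<T>& in) { mLinks.push_back(&in); }
    void SendControl(const T& value) {
        for (InControl<T>* in : mLinks) in->DoControl(value);
    }
private:
    std::vector<InControl<T>*> mLinks;
};

// Example host: counts how many times it receives 'true'.
class Counter {
public:
    Counter() : mTrigger(this, &Counter::OnTrigger) {}
    InControl<bool>& Trigger() { return mTrigger; }
    int Count() const { return mCount; }
private:
    void OnTrigger(const bool& on) { if (on) ++mCount; }
    InControl<bool> mTrigger;
    int mCount = 0;
};
```

Linking an OutControl<bool> to counter.Trigger() and sending true, false, true leaves the counter at 2, and the compiler rejects linking controls of different types.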

So, now that typed controls are working, it is time to have more than just float controls. The main problem is that, in order to make a new control type useful, you have to modify the NetworkEditor and the Prototyper.

Although the most demanding needs are for enum and integer controls, I started with simpler bool controls. My goal is that you don't have to modify the NetworkEditor or the Prototyper to introduce a new control type. That is, a plugin could add new control senders, control displays, default connected processings (double clicking on a control) and binders for the prototyper (to locate UI elements to link to processing controls). So I wanted a new, simple, non-float use case to explore a feasible API.

Below you can see a network with a BoolControlSender (widget hxx/cxx, processing hxx,cxx) and a BinaryCounter (hxx,cxx) outputting into two BoolControlDisplay (widget hxx/cxx, processing hxx,cxx). Maybe it is a little late for using CLAM for your Christmas lights, I guess ;-)

BinaryCounter, BoolControlDisplay and BoolControlSender

Currently, instead of bools we use floats, with a threshold deciding between true and false. A new ControlGate processing eases the transition by doing that translation:

ControlGate translates floats to booleans
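The translation itself is tiny. A sketch of it could look like the following, where the struct layout and the default threshold of 0.5 are assumptions for illustration and not values taken from the CLAM sources:

```cpp
// Hypothetical sketch of the float-to-bool translation.
struct ControlGate {
    float threshold = 0.5f; // assumed default, not taken from CLAM

    // Values strictly above the threshold are considered true.
    bool operator()(float value) const { return value > threshold; }
};
```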



So now it is time to look at the code I had to add and see what can be enhanced to ease adding new control types:
  • One of the things I didn't like during the implementation is having to add an entry into a long list of 'if' statements in ProcessingBoxEmbededWidget.cxx. This is clamming for a refactoring into a Factory.
  • Also the menu entry filling (to connect to a new processing) and the default create-and-connect action became a list of if clauses with the type as parameter.
  • Both prototyper binders and embedded widgets had to duplicate the control sending code. That could be generalized into a binder object that both use.
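The Factory refactoring from the first point could be sketched as below: a table from control type to widget creator replaces the chain of 'if' statements, so a plugin can register new pairs without touching the NetworkEditor. ControlWidgetFactory, Widget, Slider and CheckBox are hypothetical names for this sketch, not existing CLAM or Qt classes.

```cpp
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <typeindex>
#include <typeinfo>

// Hypothetical widget hierarchy standing in for the embedded widgets.
struct Widget {
    virtual ~Widget() = default;
    virtual std::string Kind() const = 0;
};
struct Slider : Widget {
    std::string Kind() const override { return "slider"; }
};
struct CheckBox : Widget {
    std::string Kind() const override { return "checkbox"; }
};

// Maps a control type to the widget that should be embedded for it.
class ControlWidgetFactory {
public:
    using Creator = std::function<std::unique_ptr<Widget>()>;

    template <typename ControlType, typename WidgetType>
    void Register() {
        mCreators[std::type_index(typeid(ControlType))] =
            []() -> std::unique_ptr<Widget> {
                return std::unique_ptr<Widget>(new WidgetType);
            };
    }
    // Returns nullptr when no widget is registered for the control type.
    std::unique_ptr<Widget> Create(const std::type_info& controlType) const {
        auto it = mCreators.find(std::type_index(controlType));
        if (it == mCreators.end()) return nullptr;
        return it->second();
    }
private:
    std::map<std::type_index, Creator> mCreators;
};
```

With factory.Register<bool, CheckBox>() done by whoever provides the bool control, the editor only needs to call factory.Create(typeid(bool)).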

So, it is clear that there is room for a lot of enhancement and it looks like those enhancements could also be applied to ports as well :-)

December 21, 2009

The hidden complexity of survey design (Part 1)

A couple of weeks ago I attended a two-day course on Survey Design and Evaluation. In my recent research (see for instance the Rate It Again publication in the last Recsys conference) I have become more and more interested in how people give their opinions.

The course was taught entirely by Professor Willem Saris, a very well-known researcher in survey design who was able to attract attendees from all over the world for this course. Although the course was fairly advanced, it touched upon the very issues that I wanted to see discussed. In this post I will try to very briefly mention some of them. More than trying to give a thorough explanation, I hope to draw your attention to some of these issues. Even if you are not into Recommender Systems, it is not unusual for Computer Science researchers to be involved in projects in which you need to do some sort of survey, and I am sure you will find some of these issues as interesting as I have.

I will summarize some initial issues in this first post and dive into others in future posts if you consider it interesting enough.

But, before I start, let me throw in a couple of surprising conclusions just to catch your attention:
  1. Batteries of agree/disagree questions are evil! Yes, I am sure you have come across them and possibly even designed a survey in which users are asked at the beginning something like "Mark how much you agree/disagree with the following statements". Well, this is to be avoided at any cost. I will clarify why and how you can replace these kinds of questions.
  2. You cannot compare results among different demographic groups assuming that the same response means the same to any group. It turns out that different countries, for instance, have different rating styles and understand questions differently. For instance, British respondents tend to be much milder in their responses than Spaniards. On a scale from 0 to 5, a British 3 might mean the same as a Spanish 5! In any case, this is not something you can assume in advance; it is something you need to analyze in order to guarantee fair comparisons. More on this later.
Ok, so I hope I have caught your attention by now and you agree with me that these issues are very interesting and seldom explained (actually, during the course we saw many examples of professional surveys that were plain wrong).

The method for developing a survey presented in the course was a three step procedure: (1) Distinguish between concepts by postulation and concepts by intuition; (2) Develop assertions for concepts by intuition; and (3) Develop requests for an answer from assertions. Let's see them in a bit of detail.

1. Concepts by postulation and concepts by intuition

One first important decision when trying to measure a given concept is whether we can measure it by intuition or by postulation. If we think that a concept is straightforward enough, we can directly ask the question we would like to be answered (e.g. How often do you watch sports on television?). However, many times we are trying to measure concepts for which a simple and direct question won't do (e.g. How interested in politics are you?) so we need to measure them by postulation.

When measuring a concept by postulation, we need to decompose the complex concept we want to measure into a series of indicators. These indicators can be either formative or reflective. Formative indicators are variables that define the concept. They should take into account all the necessary components and are not necessarily correlated. On the other hand, reflective indicators are consequences of the concept being measured (e.g. people watch the news because they are interested in politics). These indicators are correlated since they are all linked by the originating concept.

2. From concept to assertion

There are three forms of assertions for asking a concept by intuition:
  • Subject + LV predicator + subject complement (e.g. Politicians are fair)
  • Subject + predicator + direct object (e.g. I like conservatives)
  • Subject + predicator (e.g. The importance of world economics has changed)
On the other hand, a concept by intuition might be measuring different kinds of subjective variables that can be separated into categories such as: evaluation, importance, feelings, rights, policies... It turns out that, depending on the kind of subjective variable we are measuring, one kind of structure might or might not be appropriate. For instance, if you are measuring importance, only structure 1 will work (there is a complete table that I cannot reproduce where you see the relation between kind of variable and structure to use).

3. From assertions to requests for answers

One last step is to decide how to present the request to the survey participant. The following list summarizes the different options available:
  • Direct request
    • With WH word
    • Without WH word
      • Direct Instruction ("Please indicate....")
      • Direct Request ("Will you vote...")
  • Indirect Request (made of pre request and subordinate clause)
    • With WH word ("Tell me why you think...")
    • Without WH word ("Do you think....?")
Following this 3-step approach does not guarantee that you avoid all errors, but rather that you look into all the issues that are needed in order to decide what the right question is to measure a given concept.

And, if you cannot avoid all errors, what can you do about it? Well, you can measure them and take them into account and even predict them. In order to do that I would need to introduce the Multitrait Multimethod Approach and concepts such as reliability, validity, and quality. But that shall be in a second post if there is enough interest in this.

You can read more on these issues in Willem Saris' book "Design, Evaluation, and Analysis of Questionnaires for Survey Research".

December 13, 2009

On the uselessness of content for recommendations

This is one of the hot discussions that has sparked as a result of the Netflix Prize. During the competition several teams reported trying to use movie metadata always with discouraging results. This is probably best summarized by a 2008 post by Pragmatic Theory, one of the leading teams.

The issue was re-opened during the last Recsys conference in two ways: First, there was an interesting discussion during one of the panels including the leading teams. Second, a paper with a rather provocative title was published: "Recommending new movies: even a few ratings are more valuable than metadata".

After this, I have seen several discussions in which people used these findings to conclude that content-based recommendations are little more than a dead end, and that it is not worth investing in such research. One such discussion happened in the Recommender Systems group in LinkedIn. But it was in the Music-IR list where things heated up the most, turning into a long and interesting thread. Most of what follows is basically an edited version of what I already expressed in those two discussions.

In a few words, my take on this issue is that results reported in the context of the Netflix competition are (1) algorithm-dependent and (2) dataset-dependent. Although these findings are a valid explanation of why people found no use for metadata in the context of the Netflix prize, one cannot extrapolate this finding to other contexts. Why?
  1. Results related to the Netflix prize only refer to how some specific content features help improve the success measure chosen in this case (RMSE). It is a well-known fact that RMSE in a Recommender System does not correlate perfectly with user satisfaction. Things like, for instance, serendipity or novelty, are more likely to come out of a content-based than a CF Recsys since content-based approaches are better suited to explore the long tail.
  2. The dataset in Netflix is somewhat representative of many Recsys cases, but not all. For instance, the sparsity of the rating matrix is much greater in the "movie" dimension, than in the "user" dimension. That is, for a given movie, we are likely to have many ratings. On the other hand, for a given user, we are likely to have very few ratings. As some of the participants in the Recsys panel explained, the Netflix problem is more about how to fill in user "missing values" than movie "missing values". That is one of the reasons why movie content does not help much. Adding content to the user dimension (for instance by adding demographics) would probably have helped. Obviously, this is not easy to do unless Netflix had included the phone number or SSN of users in the dataset.
  3. When people talk about content information in the context of the Netflix Prize, they are referring to a very specific form of content: editorial metadata coming mainly from IMDb. But, in different settings, there are many other and better sources of content information. For instance, one can try to infer descriptors by automatically analyzing the signal (either video or audio) and use those features for content-based recommendations. We are still far from having automatic algorithms that can on their own bring useful enough features to map to user preferences. But that does not mean these features do not exist. Another approach to extracting those features is to have experts manually annotate the content. This is what Pandora does in their music recommendation system. And although I have not seen hard numbers, it seems users are more satisfied than when using CF alone.
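For readers not familiar with the success measure mentioned in point 1, RMSE is simply the square root of the mean squared difference between predicted and actual ratings. A minimal sketch follows; the function name and the error handling are choices made for this example, not anything prescribed by the competition.

```cpp
#include <cmath>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Root mean squared error between predicted and actual ratings.
double Rmse(const std::vector<double>& predicted,
            const std::vector<double>& actual) {
    if (predicted.empty() || predicted.size() != actual.size())
        throw std::invalid_argument("Rmse needs two non-empty vectors of equal size");
    double squaredErrorSum = 0.0;
    for (std::size_t i = 0; i < predicted.size(); ++i) {
        const double error = predicted[i] - actual[i];
        squaredErrorSum += error * error;
    }
    return std::sqrt(squaredErrorSum / static_cast<double>(predicted.size()));
}
```

Note that every rating counts the same here, whether or not the item would ever be shown to the user, which is one reason the measure does not track satisfaction perfectly.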
I think that we will probably see the use of content (and user demographics) in the second edition of the prize, since the dataset will be very different and will include fewer ratings per movie and more user info.

All that said, in the general case, and with no other info on the problem, I would probably venture to say that Collaborative Filtering is a more general solution than content-based. But clearly, the best solution is to combine both as each solves a part of the problem.

So, let me try to summarize my thinking in a set of simple statements:
  • CF is more effective than content-based recommendations in the general case.
  • The fact that editorial metadata has not proved useful to increase RMSE accuracy in the Netflix Prize does not mean that content-based recommendations are useless.
  • Adding some sort of content description helps recommendations as long as this description does effectively describe the content and maps into user preferences.
  • Editorial metadata maps directly neither to the content nor to user preferences, so its usefulness may be very limited.
  • Features automatically derived from the content map directly to the content but not to user preferences in the general case. Lots of research effort still needs to go into closing this semantic gap.
  • Manually annotated content features map to the content and to user preferences so they should prove useful as in the case of Pandora. But they might be expensive in the general case.
As always, looking forward to your comments.

November 24, 2009

Clam developers at the Blender conference


Clam developers Pau Arumí and Natanel Olaiz recently presented some new work in the fantastic Blender conference in Amsterdam. The talk was about a technology developed at BarcelonaMedia involving an innovative usage of Blender for 3D audio using CLAM for the audible-scene rendering and decoding and Ardour for playing out to any loudspeaker-layout. It was really nice to meet Blender developers and artists, and the overall conference was fun and a great experience! Now we expect to collaborate more with the Blender project in the future.

Our talk was entitled: Remixing of movie soundtracks into immersive 3D audio

The summary:
We present a use of Blender for an innovative purpose: the remastering of traditional movie soundtracks into highly-immersive 3D audio soundtracks. To that end we developed a complete workflow making use of Blender with Python extensions, Ardour (the Digital Audio Workstation) and audio plugins for 3D  spatialization and room acoustics simulation. The workflow consists in two main stages: the authoring of a simplified scene and the audio rendering. The first stage is done within Blender: taking advantage of the video sequence editor playing next to a 3D view, the operator recreates the animation of sound sources mimicking the original video. He then associates the objects in the scene with existing audio tracks of an Ardour session with the soundtrack mix and, optionally, adds acoustics properties to the scene prop materials (e.g. defining how a wooden room will sound) to render acoustics simulation using ray-tracing algorithms. In the second stage, a specification of the loudspeakers positions used in the exhibition is given, and the Ardour session with the soundtrack is automatically modified incorporating all the Blender’s edited sound scene, the necessary routing, and the 3D audio decoding plugins such as Ambisonics and other techniques implemented with CLAM.

The slides are available (we hope to add the accompanying videos soon).

November 04, 2009

The Wisdom of the Few

One of the most common approaches to Recommender Systems is the so-called Collaborative Filtering. The main rationale is the following: In order to predict items that you will like, we find the most similar users to you by looking at your previous likes and dislikes. We then recommend items that those users have liked, but you still don't know.

There are several caveats with this approach. One of them is that we need an effective way of capturing users' likes and dislikes. Most of the time we need to do this by asking users to explicitly rate items. This is the typical 1 to 5 star rating that you get in many services from Netflix to Amazon. But we know, as I commented in an earlier post, that users are noisy when giving that feedback.
So, because rating feedback is noisy, we are prone to make errors when predicting what a user likes or doesn't like.

But, standard Collaborative Filtering has several other problems. First, because we need to compute neighbors and predictions, we need to transmit all user ratings to a centralized server and this can compromise user privacy. The number of users and items is likely to be huge and applying this approach is computationally expensive and has scalability issues. And so on...

We have proposed a new approach called "Expert-based Collaborative Filtering". In this approach, instead of finding neighbors from a general pool of like-minded users similar to the target, we find neighbors in an expert database. The rationale is that these experts will be much more consistent in their ratings (i.e. less noisy) and data will be less sparse.
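To make the idea concrete, here is a minimal sketch of predicting a rating from expert neighbors only. It is not the algorithm from the paper: the plain cosine similarity over co-rated items, the similarity-weighted average and the names (Ratings, CosineSimilarity, PredictFromExperts, minSimilarity) are all assumptions chosen to keep the example short.

```cpp
#include <cmath>
#include <map>
#include <vector>

using Ratings = std::map<int, double>; // item id -> rating

// Cosine similarity computed over the items both profiles have rated.
double CosineSimilarity(const Ratings& a, const Ratings& b) {
    double dot = 0.0, normA = 0.0, normB = 0.0;
    for (const auto& entry : a) {
        auto other = b.find(entry.first);
        if (other == b.end()) continue; // not co-rated
        dot += entry.second * other->second;
        normA += entry.second * entry.second;
        normB += other->second * other->second;
    }
    if (normA == 0.0 || normB == 0.0) return 0.0;
    return dot / (std::sqrt(normA) * std::sqrt(normB));
}

// Predicts the user's rating for 'item' from similar experts only.
// Returns false when no sufficiently similar expert has rated the item.
bool PredictFromExperts(const Ratings& user,
                        const std::vector<Ratings>& experts,
                        int item, double minSimilarity,
                        double& prediction) {
    double weightedSum = 0.0, weightTotal = 0.0;
    for (const Ratings& expert : experts) {
        auto rated = expert.find(item);
        if (rated == expert.end()) continue;
        const double similarity = CosineSimilarity(user, expert);
        if (similarity < minSimilarity) continue;
        weightedSum += similarity * rated->second;
        weightTotal += similarity;
    }
    if (weightTotal == 0.0) return false;
    prediction = weightedSum / weightTotal;
    return true;
}
```

Only the target user's own ratings and the public expert ratings are needed, so in principle the prediction could run on the user's side without sending ratings to a central server.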

We have conducted experiments using movies and experts from Rotten Tomatoes and concluded that users prefer recommendations drawn from like-minded experts more than those predicted from (noisy) like-minded peers.

In the next SIGIR 2009 conference in Boston we will be presenting the paper entitled "The Wisdom of the Few: A Collaborative Filtering Approach Based on Expert Opinions from the Web". Here you can access a copy of the paper where you will find a complete explanation about this new approach.

Update: here are the slides I presented at SIGIR

October 26, 2009

Recsys 09


Last week I attended the 2009 ACM Conference on Recommender Systems, Recsys09 for short. The conference took place at New York University's Stern School of Business and was organized by Alex Tuzhilin. This was the 3rd edition of a conference that is very special for me, for several reasons: it is the main conference in the area on which I am focusing my research, and I am co-chairing next year's edition in Barcelona. The area of recommender systems also has a special attraction since it combines people with backgrounds as different as HCI, Marketing, Data Mining, Information Retrieval, or Mathematics. If you add the fact that there is an extremely important representation from industry, including many companies you won't easily see at other conferences, from Netflix to Autodesk and a great number of start-ups, you have an explosive cocktail. People in the audience who rave when they see a formula that cannot fit into one slide mix with senior committee members who propose to automatically reject papers that use the Greek alphabet.

The conference has been steadily growing for the past years. It started out of a workshop organized in Bilbao by Strands. The first edition was then held in Minneapolis, home to the Movielens group which could also be considered birth place of the area as a whole. Then off to EPFL and finally this year in NY. The numbers are astonishing for a conference as young (and presumably focused) as this one: more than 280 attendees and an acceptance rate of 19% make it look almost like a first-tier conference.

If you want to get a good idea of what went on during the conference I recommend you take a look at the tweets hashed with #recsys09. And if you want a really quick idea of what the core topics were, look at the beautiful tag cloud below, generated from the tweets by Barry Smyth. In the next paragraphs I will briefly highlight what I think were the most important ideas discussed during the conference.


The first day, we had 3 very interesting tutorials. These tutorials had the great virtue of already setting what would be 3 of the most important topics during the conference: Social Recommendations and Trust, Algorithms, and the Netflix Prize.

In the first tutorial, Jennifer Golbeck did an awesome job of introducing the field of Trust-based Recommendations and explaining its challenges. The tutorial was extremely interactive, with many questions and comments from the audience. It is true that the idea of trust is also one that very easily leads to passionate debates and opinions. The area of trust and social-based recommendations appeared again and again during the conference. There was a whole session devoted to it in the main track (or 2 if we include the one on tags and Social Networks) and a workshop on the last day. Interestingly enough, though, I did hear relevant people from the industry say that they did not believe social recommendations to be of any practical use. I don't really know what to make of that, though.

The second tutorial was more of a traditional, classical lecture on Bayesian Methods. Bayesian methods are the most popular (but not the only) approach to model-based recommendations. They have two main advantages: they allow the use of nice probabilistic formalisms, and they allow knowledge to be inferred from the resulting model. However, latent models based on Matrix Factorization have proved to be more reliable and, in principle, they also allow knowledge to be inferred from the latent variables. During the conference there were 2 different sessions on algorithms, which were dominated by different approaches to hybridize recommendations and by improvements over pre-existing collaborative filtering methods. Among the latter, I should mention the Best Paper winner, Benjamin Marlin. His paper proves that missing data (i.e. items that have not been rated) cannot be considered random and he introduces a way of taking some non-random effects into account. I found the conclusions of the paper not very striking, but the approach and scope of the idea are. And Marlin deserves the award for being the first to point to this issue, and also for all his great work in the area in general.

The last tutorial on day 1, which started a thread of its own, was a discussion on the lessons learned from the Netflix Prize. It was a very, very interesting discussion where some of the issues I mentioned in my previous blog post were brought up. For instance, I asked about the goodness of RMSE as a success measure. Everybody agrees that the only way to really evaluate a recommender is to do A/B tests on a real system, but you cannot do this in an unsupervised way such as the contest. However, I insisted on the possibility of using other measures such as top-N related ones (e.g. nDCG). The (not very convincing) answer from the participants was that it would be much harder to optimize algorithms for top-N measures than for the much simpler RMSE. The Netflix prize appeared now and again during the conference, especially since it was finally awarded recently. For instance, there was a very provocative paper by one of the participant teams proving that metadata is useless. This has stirred a heated discussion on whether that means that content-based approaches are useless altogether. The simple answer: NO. They are useless in the very specific case of the Netflix competition and dataset, and using RMSE as the success measure. Content-based approaches (and hybrids) are here to stay and need much more research.
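As an illustration of what a top-N measure looks like, here is a small sketch of nDCG over the first n positions of a ranked list. It uses the linear-gain form (relevance divided by log2 of rank + 1), which is one common variant and an assumption of this example; other formulations use 2^relevance - 1 as gain.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <functional>
#include <vector>

// Discounted cumulative gain over the first n positions of a ranking.
// relevances[i] is the true relevance of the item ranked at position i.
double Dcg(const std::vector<double>& relevances, std::size_t n) {
    double dcg = 0.0;
    for (std::size_t i = 0; i < relevances.size() && i < n; ++i)
        dcg += relevances[i] / std::log2(static_cast<double>(i) + 2.0);
    return dcg;
}

// DCG normalized by the DCG of the ideal (descending) ordering.
double Ndcg(std::vector<double> rankedRelevances, std::size_t n) {
    const double dcg = Dcg(rankedRelevances, n);
    std::sort(rankedRelevances.begin(), rankedRelevances.end(),
              std::greater<double>());
    const double ideal = Dcg(rankedRelevances, n);
    return ideal == 0.0 ? 0.0 : dcg / ideal;
}
```

Unlike RMSE, this only rewards putting the best items near the top of the list, and because it depends on the sort order it is not a smooth function of the predicted ratings, which is consistent with the participants' point that it is harder to optimize.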

The last thread that also started on the very first day was the industrial one. As I mentioned before, company presence in Recsys is very relevant. And this year it was kicked off by a panel where Netflix and Yahoo discussed the 8 challenges of the Recommender Systems field. The panel was extremely interesting because John Riedl did a great job of conducting it and of getting the two industry participants to prepare it for weeks. To summarize, the challenges were: transparency, exploration, navigation, time value, user action interpretation, evaluation, scalability, and the academy/industry relation. The next industrial activity in the program was Francisco Marin's keynote where, instead of challenges, he talked about the 10 lessons learned during his years of experience. It was a brilliant keynote that impacted many people (especially some students who were then deciding to change the orientation of their PhD). In Francisco's vision the algorithm is only 5% of the Recommender, while the most important part is the User Interface, which should take around 50% of the resources. But, if you want an excellent summary of this keynote, take a look at Neal Lathia's reconstruction from tweets. The last activity worth mentioning from this industrial thread was the Industry Workshop on the last day. It was organized by Marc Torrens (the other co-chair of next year's conference) and it attracted more than 45 people from industry.

A final thread that did not start on the first day was the application-related one. There was an applications session that was a sort of miscellany, but where Jill Freyne presented a very interesting and well-delivered paper on the effect of people recommendation on social networks. In this application thread I should include some of the very interesting posters in the poster session: applications that went all the way from a source code recommender from Karatzoglou and Weimer to IPTV or mobile tourist recommender systems.

Another very interesting thing left out of these 5 threads was the Workshop on Context-aware Recommender Systems, where I presented some of our preliminary work on time-dependent music recommendation.

As a final personal promotion note I should say that my paper was probably an interesting oddball in the conference. It was the only paper that addressed the issue of data quality and user feedback and the impact it has on the recommendations. It made it really tough on the organizers to decide what session it should belong to, so I ended up presenting in the Trust session. But my impression was that it was very well received and that it opens up a whole new avenue of future research in the field. Here you can check the slides I used during the presentation.

Overall, a great conference. And although the bar was set very, very high, we hope to exceed expectations in our 2010 Recsys in Barcelona. Hope to see everyone there!

(Btw, this is a very personal overview. Feel free to leave yours in the form of comments and let me know if there is any mistake or misinterpretation)

August 18, 2009

Showing a little about CLAM as a prototyping tool at the Audio Club of FIUBA

Last week, at the recently created ‘audio club’ of my university, I showed how to work with the CLAM framework as a tool to prototype realtime audio signal processing applications in a simple and fast way.

We started with an example network to show a little of the NetworkEditor capabilities: karaoke.clamnetwork
Karaoke

After that, we continued with a simple ‘diode distortion’ plugin:

We specified and generated the source code base in this way:
Diode-type distortion specification

We wrote this code:

        bool Do()
        {
            // Process one block from the input port into the output port
            bool result = Do( mEntrada.GetAudio(), mSalida.GetAudio() );

            mEntrada.Consume();
            mSalida.Produce();

            return result;
        }

        bool Do(const Audio& in, Audio& out)
        {
            int size = in.GetSize();

            const DataArray& inb = in.GetBuffer();
            DataArray& outb = out.GetBuffer();

            for (int i=0; i<size; i++)
            {
                // Hard-clip any sample whose magnitude exceeds 0.8
                if ( fabs(inb[i])>0.8 )
                    outb[i] = inb[i]<0. ? -0.8 : 0.8;
                else
                    outb[i] = inb[i];
            }
            return true;
        }

And built this network to try it:
Network for testing the diode distortion

The source code of the plugin, ready to be compiled, is here: pluginDistorsiónDiodo_ClubAudioFiuba.tar.gz

As an extra, here is pluginDistorsiónDiodoConControlDeClipping_ClubAudioFiuba.tar.gz: the same diode distortion, but with a clipping control to set the threshold while playing.

They liked this kind of prototyping feature a lot, so we're probably going to keep using it at the club.
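For those who just want the idea of the variant with a clipping control without downloading it, here is a standalone sketch. It is not the code from the tarball: it works on plain std::vector<float> buffers instead of CLAM's Audio and DataArray, and the function name ClipDistortion is made up for the example.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hard clipping with an adjustable threshold (hypothetical standalone
// version; the real plugin would read the threshold from a control).
void ClipDistortion(const std::vector<float>& in,
                    std::vector<float>& out,
                    float threshold) {
    out.resize(in.size());
    for (std::size_t i = 0; i < in.size(); ++i) {
        if (std::fabs(in[i]) > threshold)
            out[i] = in[i] < 0.f ? -threshold : threshold; // keep the sign
        else
            out[i] = in[i];
    }
}
```

Lowering the threshold clips more of the signal and makes the distortion more audible.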

June 01, 2009

Preparing for GSoC 2008

GSoC 2008 is already here! We are preparing our submission for CLAM as an organization and I hope we are as lucky as last year. For GSoC 2007 we got 6 fervent students who pushed CLAM a big step forward. We still don't know whether we will be selected as an organization or not. We haven't even filled in the submission data. But it is time to set some things in motion. So, what to do now?

If you are an experienced CLAM developer, please consider becoming a mentor. The more mentors, the more students we can cope with.

If you are a user, it is time to push your favourite feature into the GSoC project proposals.

If you are a student wanting to be part of the program, I advise you to get involved with the project from now on, as we will consider early involvement a big plus for eligibility.

If you are Xavi, Pau or myself, then you should fill in the CLAM submission instead of blogging ;-)

I love summer.

VST plugins with Qt user interface

I recently did a spike on what we need to make VST plugins first-class CLAM citizens. CLAM lets you visually build JACK and PortAudio based applications with Qt interfaces, as well as GUI-less VST and LADSPA plugins. The flashiest feature of VST is its user interfaces, which are mostly built using VSTGUI. We use Qt as the interface for JACK and PortAudio based apps because the Qt toolkit has nice features to dynamically bind the UI elements to the underlying processing. Moreover, Qt styling features enable shiny designer-made interfaces. Why not reuse the same interface for VST and JACK? That has been a long-standing TODO in CLAM, so now it is time to address it.

In summary, we fully solved cross-compiling VSTs from Linux and we even started using Qt interfaces as VST GUIs. On that last point there is still a lot of work to do, but the basic question of whether you can use Qt to edit a VST plugin is now beyond doubt.

To keep the spike simple, and in order not to collide with other CLAM developers currently working on it, I left aside all the CLAM wrapping part and just addressed VST cross-compiling and Qt with the SDK examples.

Cross-compilation was pretty easy. This time I found a lot more documentation on MinGW and even SCons. Just by adding the crossmingw SCons tool we already use for the apps, I managed to get Linux cross-compiled plugins running on Wine.

Adding a regular VSTGUI user interface is just a matter of compiling the VSTGUI sources along with the example editor that comes in the SDK.

Once there, we should address Qt. VSTGUI is just an implementation of the 'editor interface' plus a graphical toolkit with some provided widgets and, I guess, a way of automating the binding of controls to processing. So what we need for Qt is to implement the AEffEditor interface using the Qt toolkit instead. The first problem is the graphical loop. You have to create a QApplication and call qApp->processEvents() in the editor's idle method so that the Qt widgets stay responsive. The problem then is that, if you don't provide a QWidget as parent to your interface, it becomes a top-level window, ignoring the host-provided window, which still appears as an empty one.

The VST host provides such a window as a native Windows handle. How do you create a widget on an existing window handle? Months ago the trolls redirected me to a commercial solution. Not much of a 'solution' for us, a FLOSS project. So I was digging in the Windows Qt source code for a hack when I found the answer right in the public, multiplatform QWidget API. QWidget::create works like a charm. The following simple class is a native window wrapper you can use as a regular QWidget.


class QVstWindow : public QWidget
{
    Q_OBJECT
public:
    // Wrap the native window handle provided by the host
    QVstWindow(WId handle) {create(handle);}
    ~QVstWindow() {}
};

Still there are some issues: focus handling, reopening, drag&drop... But basic mouse clicking and resizing work.

Once I got that, loading a Designer ui file was very easy.

As I said, there are still many rough edges to solve. It is a matter of playing with it and refining things. Here is a list of TODOs:

  • Communicate controls from and to the interface
  • Handle focus and other events properly
  • Build a CLAM network wrapper which more closely resembles the one for LADSPA
  • Wiki documentation on how to build your own plugin
  • One button plugin generator like the one we have for LADSPA ;-)

I feel that there are more people in other projects interested in using Qt for VST plugins, so this is also a call for collaborative research on the pending issues, at least the generic ones. Contact us on the CLAM development list or, for a broader audience, on the Linux Audio Developers list.

May 17, 2009

CLAM at LAC 2009 and WWW 2009


Several nice CLAM-related presentations have been given at conferences during the last month. At the Linux Audio Conference in Parma, we presented an article on Blender-CLAM integration for real-time 3D audio (paper, slides, and video available at the link) and we also gave a workshop on CLAM app and plugin prototyping features. At WWW2009 in Madrid, we presented an article on the new web-services-based extractors for Annotator and the data source aggregation interface. Some videos of the presentation and demos are also available, featuring data source aggregation and live chord extraction from YouTube videos.

April 03, 2009

March 03, 2009

Google Summer of Code 2009 Warming Up


GSoC 2009

Google Summer of Code 2009 is warming up. We still don’t know whether CLAM will be hosted again in this program. But, in any case, we really encourage you to get involved in the program.

If you have doubts, we recommend you take a look at the following video.

And where to go from here? Take a look at this ToDo list for GSoC 2009 and, of course, read the program FAQ.

February 23, 2009

New Domain: clam-project.org


CLAM has moved to a new home: clam-project.org. We also changed the wiki URL scheme.

Home: http://clam-project.org/
Planet: http://clam-project.org/planet
Wiki: http://clam-project.org/wiki
Testfarm: http://clam-project.org/testfarm

And last but not least, we moved the subversion server to the new domain and we changed some repository names. You can easily migrate existing subversion sandboxes by using the following command:

svn switch --relocate [old-svn-root] [new-svn-root] [sandbox]

You can get the svn-root with 'svn info [sandbox]', and the new locations for the repositories are:
clam: http://clam-project.org/clam
clam-test-data: http://clam-project.org/clam_data
clam-oldapps: http://clam-project.org/clam_oldapps
clam-web: http://clam-project.org/clam_web
efficiencyguardian: http://clam-project.org/efficiencyguardian

A thousand thanks to the MTG and the IUA for hosting CLAM resources for so long after it stopped being an official MTG project. And special thanks to Jordi Funollet, the MTG sysop, who has helped us with the migration and responded to all our weird support requests during those three years ;-)

The CLAM Team.

February 20, 2009

?

Reading the newspaper:

The US says twenty countries will increase their civil or military contribution in Afghanistan
Gates says several NATO allies have assured him that they “are willing to increase their military, civil or training contribution”

Huh????

Krakow (Poland). (EFE).- The US Secretary of Defense, Robert Gates, said

Ahhh!

:)

February 19, 2009

New CLAM domain

For those interested, and in an attempt to help search engine indexing… The CLAM site has moved to http://clam-project.org

:)

February 05, 2009

Demo of CLAM Aggregator

Video
-- To view it on Youtube, click HERE (you may choose the HD mode to view a high quality version)
-- To download this video, click HERE


Screenshots

Project-1: Web-based extractor

[Screenshots: 2, 3, SemWebExtractor1]

Project-2: Clam Aggregator

--Setting the Configuration

[Screenshots: 4, 5, 6]

--Editing the Multi-level Descriptors

[Screenshots: 7, 8, 9]

--Viewing the Combined Schema

[Screenshot: 10]

January 02, 2009

Script to edit local changes in an SVN “sandbox”

I don't know how useful this may be to anyone else, since it is specific to programmers (who use SVN) and, on top of that, it opens the files in question with gvim (which also requires some knowledge), so those who might benefit from it surely know how to write this script themselves… Anyway, I found it useful, so I'm sharing it:

    gvim `svn stat -q | sed -e 's/^M[ \t]*\(.*\)/\1/g' | tr '\n' ' '`

… and to make this post a bit more useful for more people, I'll explain how that line works:

To begin with, gvim is a text editor that adds a graphical interface to good old vim. So, in principle, what we want is to call gvim followed by the list of files to open. Let's now see how that list is obtained, thanks to the power of Unix pipes, its powerful commands and the bash command interpreter:

Let's start with bash. Besides allowing the use of pipelines (which I'll explain shortly), in bash whatever is written between backquotes (``) is run in another instance (a subshell), and its output is inserted literally into the command line. A simple example: bash has, as DOS had, a command called “echo” that prints whatever it receives as parameters. That is, echo "hola mundo" will print hola mundo on the screen. Now, if we type `echo "hola mundo"` (backquotes included) as a command, bash will tell us: bash: hola: command not found. Why? Because bash has tried to run the command hola with the parameter mundo.

Well, that is what the line above does: gvim receives the output of the commands run between the backquotes.

Let's now look at those commands:

    svn stat -q

SVN (subversion) is a version control system widely used to develop software cooperatively (although it can serve any other purpose that requires a history of file modifications, whether cooperative or individual). Anyone used to editing wikis knows that you can easily go back to any previous version (revision) of each article or page. SVN is similar: the files are kept in a database on a server, and at any moment you can retrieve a specific revision of each file, compare one revision against another, and so on.
Well, what the “svn stat -q” command does is report the files that have been modified locally with respect to the server. For example, right now I have some local changes with respect to the HEAD (latest revision on the server) of CLAM. Running “svn stat -q” in the CLAM directory I get:

    M NetworkEditor/src/MainWindow.hxx
    M NetworkEditor/src/NetworkCanvas.hxx
    M CLAM/src/Flow/Networks/FlattenedNetwork.cxx
    M CLAM/src/Flow/Networks/FlattenedNetwork.hxx
    M CLAM/src/Flow/Networks/BaseNetwork.hxx
    M CLAM/src/Flow/Networks/BackEnds/JACKNetworkPlayer.cxx

Indeed, those are the files I want to edit. The M tells me that they have local modifications. But… how do we use that list to pass it to gvim?

This is where pipelines and the powerful Unix commands come in.

What does a pipeline do? It lets the text output that a program or command prints to the console be passed as input to the next one. In this case, the output of “svn stat -q” (the lines shown above, for example) is passed as input to the sed command, and then the output of sed is passed as input to the tr command.

What does sed do? It lets you search and replace text patterns, with all the versatility of regular expressions[1]. Regular expressions can be very complex, and explaining how they work would take a long time. If you want to dig deeper into the subject, I recommend taking a look at this site. Here it is enough to analyze how the expression used in the script above works:

    sed -e 's/^M[ \t]*\(.*\)/\1/g'

“sed -e” says that what follows is a sed expression, written between single quotes (' '). An expression of the form 's/inputpattern/outputpattern/g' replaces the input pattern with the output pattern. If we wanted to change every letter “a” into a letter “e”, we could put between the quotes: 's/a/e/g' (an anagram of the unnameable one :-/). In this case, the input pattern is '^M[ \t]*\(.*\)'

Let me explain:

^ means beginning of line.
M is the M of the modified files (which comes right after the beginning of the line in the output of svn stat -q).
[ \t]*: what goes between brackets (in this case a space and \t, which is equivalent to “tab”) is a “class” of characters. The * means that there may be any number (0 or more, as many as it finds) of those characters. Therefore, [ \t]* matches spaces and/or tabs, if there are any.
\(.*\): the . is a wildcard, that is, any character. .* matches any number (0 or more) of any character. The parentheses group the content they enclose. They carry a \ in front because, in the basic regular expressions sed uses by default, plain parentheses are literal characters, while \( and \) delimit a group (bash does not touch them, since the whole expression is inside single quotes).
\1 in the output pattern writes the first of the groups made by the parentheses in the input pattern.

Summing up: what 's/^M[ \t]*\(.*\)/\1/g' does is find all the lines that start with an M, followed by 0 or more spaces or tabs, followed by 0 or more characters, group the latter, and replace the whole line with that last group of characters. Lines that do not match are left unchanged.

Let's see, then, what svn stat -q | sed -e 's/^M[ \t]*\(.*\)/\1/g' returns:

    NetworkEditor/src/MainWindow.hxx
    NetworkEditor/src/NetworkCanvas.hxx
    CLAM/src/Flow/Networks/FlattenedNetwork.cxx
    CLAM/src/Flow/Networks/FlattenedNetwork.hxx
    CLAM/src/Flow/Networks/BaseNetwork.hxx
    CLAM/src/Flow/Networks/BackEnds/JACKNetworkPlayer.cxx

We have now obtained the names of the files we wanted!!!

What does the last pipe (tr) do, then?

The previous command returned the list of files, but one per line, while gvim needs them all on the same line. Since sed regular expression patterns are applied line by line, we need something else to remove the line breaks afterwards. That is the job of “tr '\n' ' '”: it replaces each '\n' (line break) with ' ' (a space).

Then, svn stat -q | sed -e 's/^M[ \t]*\(.*\)/\1/g' | tr '\n' ' ' returns:

    NetworkEditor/src/MainWindow.hxx NetworkEditor/src/NetworkCanvas.hxx CLAM/src/Flow/Networks/FlattenedNetwork.cxx CLAM/src/Flow/Networks/FlattenedNetwork.hxx CLAM/src/Flow/Networks/BaseNetwork.hxx CLAM/src/Flow/Networks/BackEnds/JACKNetworkPlayer.cxx

Since that is between the backquotes in the original command, it is what gvim receives as parameters, making it open those files.
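For readers more at home in C++ than in the shell, the sed and tr steps can be sketched with `std::regex`. This is only an illustration: `modifiedFiles` is a hypothetical helper name, the pattern is rewritten in ECMAScript syntax (where unescaped parentheses form the group), and, unlike the sed expression, lines that do not start with M are skipped rather than passed through.

```cpp
#include <regex>
#include <sstream>
#include <string>

// Hypothetical helper: takes the text printed by "svn stat -q" and returns
// the paths of the locally modified files joined by single spaces.
std::string modifiedFiles(const std::string& svnStatOutput)
{
    // Same idea as the sed pattern: an M at the start of the line,
    // optional spaces or tabs, then the path captured as group 1.
    static const std::regex modified("^M[ \\t]*(.*)$");
    std::istringstream lines(svnStatOutput);
    std::string line, result;
    while (std::getline(lines, line))
    {
        std::smatch match;
        if (!std::regex_match(line, match, modified))
            continue;                 // not a modified file: skip the line
        if (!result.empty())
            result += ' ';            // plays the role of tr '\n' ' '
        result += match[1].str();
    }
    return result;
}
```

Feeding it the `svn stat -q` output shown earlier would give the same space-separated list of paths, without the trailing space that tr leaves behind.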

Isn't Linux cool? ;-D


[1] I found the definition of regular expressions on the cited site quite funny: You can think of regular expressions as wildcards on steroids. :-D

December 22, 2008

A simple MIDI demo with CLAM

This is the first screencast I've made about my project for Google Summer of Code '08.
It is a simple network in which I load a MIDI Source (a Processing that creates a MIDI input port), a MIDINote2Freq (which turns the MIDI Note message into a pair of real numbers giving the note's frequency and amplitude), a simple sine oscillator, and an Audio Sink (which creates an audio output port). I also add an oscilloscope for visual feedback.
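As a rough idea of what a note-to-frequency conversion involves, here is a minimal sketch. It is not the actual MIDINote2Freq code: `NoteParams` and `noteToParams` are hypothetical names, and both the equal-temperament tuning (A4 = MIDI note 69 = 440 Hz) and the linear velocity-to-amplitude mapping are assumptions.

```cpp
#include <cmath>

// Hypothetical stand-in for the pair of values such a converter outputs.
struct NoteParams
{
    double frequency; // in Hz
    double amplitude; // in the range [0, 1]
};

NoteParams noteToParams(int midiNote, int velocity)
{
    NoteParams p;
    // Assumption: equal temperament, A4 (MIDI note 69) tuned to 440 Hz,
    // so each semitone multiplies the frequency by 2^(1/12).
    p.frequency = 440.0 * std::pow(2.0, (midiNote - 69) / 12.0);
    // Assumption: MIDI velocity 0..127 mapped linearly to 0..1.
    p.amplitude = velocity / 127.0;
    return p;
}
```

With these assumptions, `noteToParams(69, 127)` gives 440 Hz at full amplitude, and going up twelve notes doubles the frequency.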

Click here to view the embedded video.