Devel/Chord Extraction TODO's
From Clam
(Redirected from Chord Extraction TODO's)
Contents |
TODO's
Testing framework
Setting up a framework to compare the goodness of different implementations of a segmentation algorithms given a ground truth.
- write unit tests for the framework
- allow different attributes as Segmentation (not only Chords_Harte)
- allow different hop sizes
- generally there's room for improvement/rethinking:
- http://www.music-ir.org/mirex2007/index.php?title=Audio_Chord_Detection
- http://en.wikipedia.org/wiki/Recall_(information_retrieval)
- so a clear path needs to be decided upon...
-
A simple script with fixed hopsize and two pool files given at the command line
-
Compute Recall and Precision
- Get the Beatles wavesurfer files
- Write a bash script for:
- computing the segmentation on the songs
- comparing with ground truth
- Integrate into testfarm
Network editor integration
- Make the parameters of TonalAnalysis configurable in NetworkEditor, one by one:
- Change BinsPerOctave into BinsPerSemitone...
- ...first making BinsPerOctave configurable, to check if everyting works
- Make the algorithm "change the tunning" when starting from a different minimum frequency then 98 Hz
- Quick (hopefully) and dirty (hopefully not) solution (hopefully) - change the reference tunning while reconfiguring TonalAnalysis
- Find the reason of the "delete _implenentation; _implementation = new..." crash
- Change BinsPerOctave into BinsPerSemitone...
- Make a precomputed SparseKernel for the default configuration
- Seperate the inner workings of TonalAnalysis into Processing Composites
- Add a time input control to TonalAnalysis
- Use this time input to inform ChordExtractor of the time position (if ever we use AudioFileReaders with seeking)
- Internal time in TonalAnalysis, reset to 0 when the network starts
Chordata (old Turnaround)
- Bugfixing
-
Issue an error when the configuration fails
-
Do not play if the processings are not configured
- Fix wrong number of visualised frames
- The segmentation algorithm places the segments without considering the initial window size.
- The cpu usage is 100% after loading even if no playback is done ->
Lots of timer events are queued during analysis - The cpu usage is 100% after loading during playback -> Widgets have their own timer to update themselves that should be stoped during analysis or even when stopped.
- Investigate the semantics of isEnabled. Just if the semantics are suited we could reuse it for that.
- After any modification we should assure that the views still work well in NetworkEditor, Prototyper and Annotator.
- Segmentations are moved forward half a window (to fix on the analysis or to kludge for the app)
- On analysis, the configuration should be sensible to the file samplerate or we should resample the input
- On playback, the file should be adapted to the backend samplerate
- Check Id3 tags with encoded strings
-
- Release task:
- Generalization
- New spectrogram to access a block of the pcp data storage
- Make the TonalAnalysis configuration editable as settings for the application reusing NetworkEditor Configurators and storing them as QSettings or as an xml file at ~/.turnaroundrc or similar
- Disscuss with David how these features could be generalized for other similar applications (offline computing, dumping into an in-memory storage, syncing real-time and in-memory storage...) with the goal of providing more support from the framework.
- Extract the widgets into a shared library
- Converge Pool based data sources and array based data sources
- Generalize storages for its use in regular networks
-
Add more backends (e.g. JACK)
- Convenience and usability
- Define a convenient layout (low)
- Consider whether to show more information than title and artist (low)
- More keyboard Shortcuts (which ones)
- Looping the song
- Saving and loading analysis data (low)
- Future
- Use the new file loader when available
- Improve analysis
- Realtime mode (mid)
- Use a multichannel audioloader (very low)
Detection algorithm enhancements
Several algorithm enhancements are to be considered:
- Preprocessing
- [done] Compute instant tunning by fasor addition on chroma peaks mapped to a semitone
- [done] Limit the time scope of the global tunning computation (done but improvements needed)
- Improve the limited scope tunning
- Processing
- Find faster and more precise algorithms than the current one
- Emilia's algorithm (peak detect before folding)
- Wavelet based
- Self-Correlation based
- Find faster and more precise algorithms than the current one
- Postprocessing
- [done] Consider the None chord (all pitches) so that we can detect non chord segments and use it as reference for pitchness.
- Symbolical analysis: Instead of correlation, analyze the pcp content using heuristic reasoning (Harte did plain filtering and some )
- Double scope for analysis: Too large windows difuminate transitions but small ones fail to detect arpeggios based chords. We could choose depending on the number of high pitches on the PCP.
- Onset alignment: Use realtime onset detection (aubio?) to 'quantize' the chord segments limits.
Helper information
Enrich algorithm output so that the user may take profit of non-perfect algorithm or music that is not using recognized cords (fifths, rare chords...)
- Diffuse guessing: Minimize false positive impact to the user by computing a confidence value for each guess.
- Keeping several candidates so the user may view that he has more than one option.
- Rectified guess: Do a first realtime guess and correct it later if needed as the song goes on.
Visualization
In parallel to enhance the algorithm to realtime some views must be developed. Some views ideas:
- [done] KeySpace (Emilia and Jordi's)
- [done] Tonnetz (pcp)
- Add chord figures to Tonnetz
- Chord torus (map pcp into the tonal torus space)
- Vectors in chord torus (needed to disambiguate dim chords)
- [done] Chord ranking: all chords displayed as sorted probability bars
- Highlight or filter candidates on chord ranking
- Chord candidates: just the ones before the first strong decay
- Realtime segment construction:
- Instant chord segment: Display segment based on the best one on each instant.
- Delayed segment: don't display a segment until we have enough information on the future to make a post processing
- Guessed segments: Until sure, display the guess
- [done] Tunner displays the deviation from the central note
- Instrument fingering suggesting several forms
- Integrate a tunner in prototyper and chordata
Stand alone application (GSoC Pawel)
Big project milestones
-
Having an application that seeks allong the song.
-
Launching the analysis whenever you change the song
-
The playback controlling the time displayed on the views
-
Additional features
Done
-
Add a second tonnetz view sharing the same dataSource object
-
Add a second dataSource for the ChordCorrelation port
-
Bind it to a ChordRanking view
-
Proceed with KeySpace (notice that KeySpaceMonitor limits to 24 the number of bins so you might need to provide a solution)
-
Consider a new type not being a vector<float> such as PolarPeaks
-
Use the proper labels for the ChordRanking and the full vector (including quatriads and weird chords) copy the label generation code from the ChordRankingMonitor.
-
Store data in Storages, use DataSources only for passing data to widgets
-
Add separate DataSource for KeySpace with proper offset and labels
-
Build a segmentation view
-
Consider using the progress control to build a progress bar for the offline computing.
-
A Spectrogram widget to display the PCP instead the vectorView
-
Make available the metadata information (artist, title...) somewhere (a new panel, status bar, title bar...)
-
Make pause button functional
-
Enabling and disabling views
-
"Open recent files" menu
-
Saving view visibility settings
-
Starting and stopping event timer when necessary
-
Help/About -> shows an about box like the one in Network Editor
-
Help/Tutorial -> opens the wiki tutorial
-
- Preparation
-
Implement seeking (Simple hack)
-
Take a look at the FileReader classes
-
Write a processing which loads whole files using internal MonoAudioFileReader
-
Add internal MonoAudioFileReader error handling
-
Add position variable to it
-
Add output of current position to an OutControl
-
Add a position InControl to it
-
Add support for the loop option
-
Build a Control sender with feedback (to build a progress slider)
-
-
splitting monitors and widgets in NetworkEditor/src/monitors
Realtime segmentation (GSoC 2007 Roman)
- Improve segmantation in ChordExtractor
-
Extract the code responsible for segmentation and make a separate class out of it
- Setup framework for comparisons between different implementation of segmentation algorithms
-
Decide on a method for allowing use of different chord extraction algorithms
- A chord similarity based implementation
- Removing small segments
- Joining segments with identical chords
-
- Realtime segmentation in chord extraction
Some short term goals for getting accustomed with the code:
- Make the parameters of TonalAnalysis configurable in NetworkEditor, one by one:
Old Done Tasks
Exercise in using test with cppunit:
-
Adapt InstantTunningEstimator's tests to the changes in the class
-
Change assertFoundCenterIs to use the vector<pair> interface
-
Divide any position (positions and expectedCenter) by 3 and pass 1 as the second constructor parameter
-
Adapt the last two tests to use the vector<pair> interface
-
Remove the useless doIt
-
Change any occurrences of _binsPerSemitone within the class to a 1
-
Adapt user interface and user code
-
Change last two tests to use a special helper function
-