Abstract
Feature aggregation is an important step in semantic music retrieval that accumulates features obtained from local frames to produce a global song-level representation. A good aggregation scheme should capture both feature correlations and temporal information, whereas existing schemes focus on only one of these two aspects and neglect the other. In this paper, we present a new feature aggregation scheme that models the dependencies in both the feature and temporal domains. This is achieved by augmenting local feature vectors with second-order monomials that capture the correlations between different variables, and then performing temporal integration over the augmented features. To cope with the increased feature dimension, we further employ an embedded technique for feature selection by training an $\ell_{2,1}$-regularized linear classifier model for all label classes. The use of $\ell_{2,1}$ regularization produces a group-sparse solution for the classifier weight vectors, thus automatically eliminating irrelevant feature variables with vanishing weights. Our preliminary results demonstrate the effectiveness of the proposed feature aggregation scheme over existing aggregation schemes for large-scale music retrieval and annotation.
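The pipeline described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation, and several details are assumptions made for illustration: the function names are hypothetical, temporal integration is taken to be the per-dimension mean and standard deviation over frames (the abstract does not say which integration is used), and the group-sparsity step is shown through the standard row-wise soft-thresholding operator associated with the $\ell_{2,1}$ norm rather than through a full classifier training loop.

```python
import numpy as np


def augment_second_order(frames):
    """Append all second-order monomials x_i * x_j (i <= j) to each frame.

    frames: array of shape (n_frames, d), one local feature vector per row.
    Returns an array of shape (n_frames, d + d * (d + 1) // 2).
    """
    d = frames.shape[1]
    i, j = np.triu_indices(d)  # index pairs with i <= j
    monomials = frames[:, i] * frames[:, j]
    return np.hstack([frames, monomials])


def aggregate_song(frames):
    """Song-level representation from frame-level features.

    Assumption: temporal integration is the mean and standard deviation
    of each augmented feature over time.
    """
    augmented = augment_second_order(frames)
    return np.concatenate([augmented.mean(axis=0), augmented.std(axis=0)])


def l21_norm(W):
    """l_{2,1} norm: sum of the Euclidean norms of the rows of W.

    W has one row per feature variable and one column per label class.
    """
    return np.linalg.norm(W, axis=1).sum()


def prox_l21(W, threshold):
    """Row-wise soft thresholding (proximal operator of threshold * l21_norm).

    Rows whose norm is at most `threshold` become exactly zero, which removes
    the corresponding feature variable for every label class at once.
    """
    row_norms = np.linalg.norm(W, axis=1, keepdims=True)
    scale = np.maximum(0.0, 1.0 - threshold / np.maximum(row_norms, 1e-12))
    return W * scale


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    frames = rng.normal(size=(50, 4))  # 50 frames, 4 local features (toy data)
    song_vector = aggregate_song(frames)
    print(song_vector.shape)

    W = np.array([[3.0, 4.0], [0.1, 0.0]])
    print(prox_l21(W, threshold=1.0))
```

In this sketch, a row of the weight matrix that is driven to zero corresponds to a feature variable dropped for all label classes simultaneously, which is the group-sparse behaviour the abstract attributes to $\ell_{2,1}$ regularization.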
Original language | English |
---|---|
Title of host publication | MM '15: Proceedings of the 23rd ACM Multimedia Conference, 26-30 October 2015, Brisbane, Australia |
Publisher | ACM |
Pages | 1019-1022 |
Number of pages | 4 |
ISBN (Print) | 9781450334594 |
DOIs | |
Publication status | Published - 2015 |
Event | ACM International Conference on Multimedia - Duration: 26 Oct 2015 → … |
Conference
Conference | ACM International Conference on Multimedia |
---|---|
Period | 26/10/15 → … |
Keywords
- acoustics
- linear classifiers
- machine learning
- semantics