Proceedings. (ICASSP '05). IEEE International Conference on Acoustics, Speech, and Signal Processing, 2005.
In this paper, we present a novel approach to music summarization based on music structure analysis. From the audio signal, we first extract note onsets, which capture the tempo of the song; music structure analysis is then performed using this tempo information. After the music content has been structured into semantic regions such as Introduction (Intro), Verse, Chorus, and Ending (Outro), the final summary is created from the chorus together with the music phrases immediately before or after the selected chorus, extended until the desired summary length is reached. In this way, we guarantee that summaries begin and end at meaningful phrase boundaries, which is a difficult problem for existing music summarization methods. Experiments show that the proposed method captures the main theme of the music when compared against ideal summaries selected by music experts, and a subjective user evaluation confirms its good performance.
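The assembly step described above — anchoring on a chorus and growing the summary outward one whole phrase at a time so that it always starts and ends on a phrase boundary — can be sketched as follows. This is an illustrative sketch, not the authors' implementation; the section labels, the choice to anchor on the first chorus, and the grow-right-then-left order are assumptions.

```python
def build_summary(sections, target_len):
    """Assemble a summary anchored on a chorus.

    sections: list of (label, start, end) tuples in seconds, in song order
              (labels like "intro", "verse", "chorus", "outro" are assumed).
    target_len: desired minimum summary length in seconds.
    Returns (summary_start, summary_end) in seconds.
    """
    labels = [s[0] for s in sections]
    i = labels.index("chorus")  # anchor on the first chorus (an assumption)
    lo, hi = i, i

    def length(a, b):
        return sections[b][2] - sections[a][1]

    # Grow outward one whole section at a time, so the summary always
    # begins and ends at a phrase boundary rather than mid-phrase.
    while length(lo, hi) < target_len:
        if hi + 1 < len(sections):
            hi += 1
        elif lo > 0:
            lo -= 1
        else:
            break
    return sections[lo][1], sections[hi][2]
```

Because only whole sections are added, the returned span can overshoot `target_len` slightly; trimming to an exact length would reintroduce mid-phrase cuts.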
Ismir, 2002
We present methods for automatically producing summary excerpts or thumbnails of music. To find the most representative excerpt, we maximize the average segment similarity to the entire work. After window-based audio parameterization, a quantitative similarity measure is calculated between every pair of windows, and the results are embedded in a 2-D similarity matrix. Summing the similarity matrix over the support of a segment results in a measure of how similar that segment is to the whole. This can be maximized to find the segment that best represents the entire work. We discuss variations on the method, and present experimental results for orchestral music, popular songs, and jazz. These results demonstrate that the method finds significantly representative excerpts, using very few assumptions about the source audio.
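The core idea above — sum the similarity matrix over a segment's support and maximize — can be sketched in a few lines. This is a minimal sketch under assumptions the abstract does not fix: cosine similarity between feature windows, and a fixed excerpt length in frames.

```python
import numpy as np

def best_excerpt(features, seg_len):
    """Pick the excerpt most similar on average to the whole piece.

    features: (n_frames, n_dims) array of per-window audio features
    seg_len:  excerpt length in frames
    Returns the start frame of the best excerpt.
    """
    # Cosine similarity between every pair of windows: the 2-D similarity matrix.
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    normed = features / np.maximum(norms, 1e-12)
    sim = normed @ normed.T

    # Summing the matrix over a segment's support measures how similar that
    # segment is to the entire work; summing columns first lets a sliding
    # window (here via convolution) score every candidate start frame.
    col_score = sim.sum(axis=0)
    seg_scores = np.convolve(col_score, np.ones(seg_len), mode="valid")
    return int(np.argmax(seg_scores))
```

With real audio, `features` would come from a window-based parameterization such as MFCCs; any frame-level feature works with this scoring scheme.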
Archives of Acoustics, 2000
In the paper, various approaches to automatic music audio summarization are discussed. The project described in detail, is the realization of a method for extracting a music thumbnail - a fragment of continuous music of a given duration time that is most similar to the entire music piece. The results of subjective assessment of the thumbnail choice are presented, where four parameters have been taken into account: clarity (representation of the essence of the piece of music), conciseness (the motifs are not repeated in the summary), coherence of music structure, and overall quality of summary usefulness.
2003 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (IEEE Cat. No.03TH8684), 2003
We present a framework for summarizing digital media based on structural analysis. Though these methods are applicable to general media, we concentrate here on characterizing repetitive structure in popular music. In the first step, a similarity matrix is calculated from inter-frame spectral similarity. Segment boundaries, such as verse-chorus transitions, are found by correlating a kernel along the diagonal of the matrix. Once segmented, spectral statistics of each segment are computed. In the second step, segments are clustered based on the pairwise similarity of their statistics, using a matrix decomposition approach. Finally, the audio is summarized by combining segments representing the clusters most frequently repeated throughout the piece. We present results on a small corpus showing more than 90% correct detection of verse and chorus segments.
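The boundary-detection step above — correlating a kernel along the diagonal of the similarity matrix — can be illustrated with a checkerboard kernel, whose response peaks where two internally similar but mutually dissimilar blocks meet (e.g. a verse-chorus transition). A rough sketch, not the paper's exact kernel or parameters:

```python
import numpy as np

def novelty_curve(sim, half_width):
    """Correlate a checkerboard kernel along the diagonal of a
    self-similarity matrix; peaks mark segment boundaries.

    sim: (n, n) self-similarity matrix
    half_width: kernel half-width in frames
    """
    w = half_width
    # Checkerboard kernel: +1 on within-segment quadrants,
    # -1 on cross-segment quadrants.
    quad = np.ones((w, w))
    kernel = np.block([[quad, -quad], [-quad, quad]])

    n = sim.shape[0]
    novelty = np.zeros(n)
    for i in range(w, n - w):
        patch = sim[i - w:i + w, i - w:i + w]
        novelty[i] = np.sum(patch * kernel)
    return novelty
```

Peak-picking on the resulting curve then yields the segment boundaries fed into the clustering step.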
IEEE Transactions on Speech and Audio Processing, 2005
Automatic music classification and summarization are very useful for music indexing, content-based music retrieval, and on-line music distribution, but extracting the most common and salient themes from unstructured raw music data is a challenge. In this paper, we propose effective algorithms to automatically classify and summarize music content. Support vector machines are applied to classify music into pure music and vocal music by learning from training data. For pure music and for vocal music, respective sets of features are extracted to characterize the content. Based on the calculated features, a clustering algorithm is applied to structure the music content. Finally, a music summary is created based on the clustering results and domain knowledge about pure and vocal music. Support vector machine learning outperforms traditional Euclidean-distance and hidden Markov model methods in music classification. Listening tests are conducted to evaluate the quality of the summaries, and experiments on different genres of pure and vocal music show that the resulting summaries are effective.
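The classification step above uses a support vector machine on extracted audio features. As a self-contained illustration (the paper's features, kernel, and training procedure are not specified here), a linear SVM can be trained by subgradient descent on the hinge loss:

```python
import numpy as np

def train_linear_svm(X, y, lam=0.01, epochs=200, lr=0.1):
    """Tiny linear SVM trained by subgradient descent on the hinge loss.

    X: (n, d) matrix of audio feature vectors
    y: labels in {-1, +1} (e.g. +1 = vocal music, -1 = pure music)
    lam: L2 regularization strength; lr: learning rate.
    """
    n, d = X.shape
    w = np.zeros(d)
    b = 0.0
    for _ in range(epochs):
        for i in range(n):
            margin = y[i] * (X[i] @ w + b)
            if margin < 1:
                # Misclassified or inside the margin: hinge-loss subgradient step.
                w = (1 - lr * lam) * w + lr * y[i] * X[i]
                b += lr * y[i]
            else:
                # Correctly classified with margin: regularization shrink only.
                w = (1 - lr * lam) * w
    return w, b

def predict(X, w, b):
    return np.sign(X @ w + b)
```

A production system would more likely use a library SVM with a nonlinear kernel; this sketch only shows the decision rule being learned.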
2004 IEEE International Conference on Multimedia and Expo (ICME) (IEEE Cat. No.04TH8763), 2004
Automatic music summarization is very useful for music indexing, content-based music retrieval, and on-line music distribution, but automatically extracting the most common and salient themes from unstructured raw music data is a challenge. In this paper, we propose an effective approach to automatically summarize music content. First, a number of features are extracted to characterize the music content. Based on the extracted features, an adaptive clustering algorithm is then applied to structure the content. Finally, the music summary is created from the clustering results and domain-related music knowledge. A user study is conducted to evaluate the quality of the summaries, and experiments on different genres of music show that the resulting summaries match listeners' expectations well.
IEEE Signal Processing Letters, 2015
Several generic summarization algorithms have been developed in the past and successfully applied in fields such as text and speech summarization. In this paper, we review these algorithms and apply them to music. To evaluate summarization performance, we adopt an extrinsic approach: we compare a Fado genre classifier's performance on truncated contiguous clips against the summaries extracted with those algorithms on two different datasets. We show that Maximal Marginal Relevance (MMR), LexRank, and Latent Semantic Analysis (LSA) all improve classification performance on both test datasets.
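Of the algorithms named above, MMR is the simplest to sketch: greedily pick segments that are relevant to the piece as a whole but not redundant with segments already picked. This is an illustrative sketch, not the paper's implementation; using the mean feature vector as the "query" and cosine similarity are assumptions.

```python
import numpy as np

def mmr_summary(seg_features, k, lam=0.7):
    """Maximal Marginal Relevance over musical segments.

    seg_features: (n_segments, n_dims) array of per-segment features
    k:   number of segments to select
    lam: trade-off between relevance (lam) and redundancy (1 - lam)
    Returns the indices of the selected segments, in selection order.
    """
    def cos(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

    # Relevance: similarity of each segment to the piece's mean feature vector.
    centroid = seg_features.mean(axis=0)
    relevance = [cos(f, centroid) for f in seg_features]

    picked = []
    candidates = list(range(len(seg_features)))
    while candidates and len(picked) < k:
        def score(i):
            # Redundancy: maximum similarity to any already-picked segment.
            redundancy = max((cos(seg_features[i], seg_features[j])
                              for j in picked), default=0.0)
            return lam * relevance[i] - (1 - lam) * redundancy
        best = max(candidates, key=score)
        picked.append(best)
        candidates.remove(best)
    return picked
```

Lower `lam` favors diversity: near-duplicate segments (e.g. repeated choruses) are penalized once one of them has been selected.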
ACM Transactions on Multimedia Computing, Communications, and Applications, 2006
In this article, we propose a novel approach for automatic music video summarization. The proposed summarization scheme is different from the current methods used for video summarization. The music video is separated into the music track and video track. For the music track, a music summary is created by analyzing the music content using music features, an adaptive clustering algorithm, and music domain knowledge. Then, shots in the video track are detected and clustered. Finally, the music video summary is created by aligning the music summary and clustered video shots. Subjective studies by experienced users have been conducted to evaluate the quality of music summaries and effectiveness of the proposed summarization approach. Experiments are performed on different genres of music videos and comparisons are made with the summaries generated based on music track, video track, and manually. The evaluation results indicate that summaries generated using the proposed method are effective in helping realize users' expectations.
Proceedings of the International Conference on Image Processing (ICIP)
In this paper, a new automatic summarization approach for music videos is presented. The proposed method detects and recognizes the lyric captions that commonly appear in karaoke music videos and uses them to analyze the video's structure and identify the most salient musical part. The music video summary is then created from this salient part. Experimental results show that the proposed method is promising.
Proceedings 2003 International Conference on Image Processing (Cat. No.03CH37429)
In this paper, we propose a novel approach to automatically summarize musical videos. The proposed summarization scheme is different from the current methods used for video summarization. The musical video is separated into a musical track and a visual track. A music summary is created by analyzing the music content using music features, an adaptive clustering algorithm, and musical domain knowledge. Then, shots are detected and clustered in the visual track. Finally, the music video summary is created by aligning the music summary with the clustered video shots. Subjective studies by experienced users have been conducted to evaluate the quality of summarization. Experiments on different genres of musical video, and comparisons with summaries based only on the music track or only on the video track, indicate that the summaries produced by the proposed method are effective in meeting users' expectations.
2002
This paper presents a music summarization system called "Papipuun" that we are developing. Papipuun performs quick listening in a manner similar to a stylus skipping on a scratched record, but the skipping occurs precisely at the punctuations of musical phrases, not arbitrarily. First, we developed a method for representing polyphony based on time-span reduction in the generative theory of tonal music (GTTM) and the deductive object-oriented database (DOOD). The least-upper-bound operation plays an important role in similarity checking of polyphonies represented in our method. Next, in a preprocessing phase, a user analyzes a given piece by time-span reduction using a dedicated tool called TS-Editor. In the real-time phase, the user interacts with the main system, Summarizer, to perform music summarization. Summarizer discovers the piece's structure by similarity checking. When the user identifies the fragments to be skipped, Summarizer deletes them and concatenates the rest. Through interaction with the user, Papipuun produces music summaries of good quality that reflect the atmosphere of the entire piece.