Applied Sciences
Adequately annotated datasets of singing recordings are scarce. One available dataset that covers a variety of vocal techniques (n = 17) and multiple singers (m = 20) across many WAV files (p = 3560) is the VocalSet dataset. However, although the dataset is organized into categories such as technique, singer, tempo, and loudness, the recordings themselves are not annotated. This study therefore annotates VocalSet to make it a more useful dataset for researchers. The annotations generated for the VocalSet audio files include the fundamental frequency contour, note onsets, note offsets, transitions between notes, note F0, note duration, MIDI pitch, and lyrics. This paper describes the resulting dataset and explains our approaches to creating and testing the annotations. In addition, four different methods for defining onsets and offsets are compared.
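As a rough illustration of the kind of frame-level annotation described above, the sketch below extracts an F0 contour from a singing file and converts it to MIDI pitch with librosa's pYIN implementation. The file name and the F0 search range are hypothetical; this is not the pipeline the VocalSet annotations were produced with.

```python
# Sketch: deriving an F0-contour annotation for one singing file.
# Assumes librosa >= 0.8; the file name and F0 range are illustrative only,
# not the settings used by the VocalSet annotators.
import librosa
import numpy as np

y, sr = librosa.load("singer1_vibrato.wav", sr=None, mono=True)  # hypothetical file

# Frame-wise F0 with pYIN; unvoiced frames come back as NaN.
f0, voiced_flag, voiced_prob = librosa.pyin(
    y, sr=sr,
    fmin=librosa.note_to_hz("C2"),   # illustrative singing range
    fmax=librosa.note_to_hz("C6"),
)
times = librosa.times_like(f0, sr=sr)

# Convert voiced frames to (possibly fractional) MIDI pitch.
midi = np.full_like(f0, np.nan)
midi[voiced_flag] = librosa.hz_to_midi(f0[voiced_flag])

# One row per frame: time, F0 in Hz, MIDI pitch (NaN where unvoiced).
contour = np.column_stack([times, f0, midi])
print(contour[:5])
```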
Applied Sciences
This paper introduces a new method for detecting the onsets, offsets, and transitions of notes in real-time solo singing performances. It identifies onsets and offsets by locating the transitions from one note to another based on trajectory changes in the fundamental frequency. The accuracy of our approach is compared with eight well-known algorithms on two datasets containing 130 singing files; together the datasets span more than seven hours of audio and more than 41,000 onset annotations. Evaluation metrics include the average, the F-measure, and ANOVA. The proposed algorithm determined onsets and offsets more accurately than the other algorithms and, unlike them, can also detect the transitions between notes.
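The paper's algorithm itself is not reproduced here, but the general idea of deriving note boundaries from F0 trajectory changes can be sketched in a few lines: convert the contour to semitones and flag frames where the pitch moves by more than a threshold within a short lag. The half-semitone threshold and three-frame lag below are assumptions for illustration.

```python
# Minimal sketch of the general idea: flag note transitions where the F0
# trajectory (in semitones) moves more than a threshold within a short lag.
# The 0.5-semitone threshold and 3-frame lag are assumptions, not values
# taken from the paper.
import numpy as np

def transition_frames(f0_hz, jump_semitones=0.5, lag=3):
    """Return frame indices where pitch jumps; a run of indices marks one transition."""
    midi = 69.0 + 12.0 * np.log2(np.asarray(f0_hz, dtype=float) / 440.0)
    step = np.abs(midi[lag:] - midi[:-lag])
    return np.flatnonzero(step > jump_semitones) + lag

# Toy contour: one second on A4 then one second on B4 at 100 frames/s.
f0 = np.concatenate([np.full(100, 440.0), np.full(100, 493.88)])
print(transition_frames(f0))  # a short run of frames around index 100
```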
National Conference on Signal and Image Processing Applications
Music information retrieval is currently an active research area that addresses the extraction of musically important information from audio signals and the applications of such information. The extracted information can be used for search and retrieval of music in recommendation systems, to aid musicological studies, or even in music learning. Sophisticated signal processing techniques are applied to convert low-level acoustic signal properties into musical attributes, which are further embedded in a rule-based or statistical classification framework to link them with high-level descriptions such as melody, genre, mood, and artist type. Vocal music comprises a large and interesting category of music in which the lead instrument is the singing voice. The singing voice is more versatile than many musical instruments and therefore poses interesting challenges to information retrieval systems. In this paper, we provide a brief overview of research in vocal music processing, followed by a description of related work at IIT Bombay leading to the development of an interface for melody detection of singing voice in polyphony.
CMMR 2017 - 13th International Symposium on Computer Music Multidisciplinary Research - Music Technology with Swing. 25-28 September, 2017
In this paper we present a database of fundamental frequency series for singing performances to facilitate comparative analysis of algorithms developed for singing assessment. A large number of recordings were collected during conservatory entrance exams, in which candidates reproduce melodies (after listening to the target melody played on the piano) alongside other rhythm- and pitch-perception tasks. Leaving out the samples on which the jury members' grades did not all agree, we obtained a collection of 1018 singing and 2599 piano performances as instances of 40 distinct melodies. A state-of-the-art fundamental frequency (f0) detection algorithm is used to extract an f0 time series for each of these recordings to form the dataset. The dataset is shared to support research in singing assessment. Together with the dataset, we provide a flexible singing assessment system that can serve as a baseline for comparing assessment algorithms.
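A baseline of the kind this dataset is meant to support could, for instance, align a candidate's f0 series to the reference melody's f0 series and score the residual pitch deviation. The sketch below does this with a plain dynamic-time-warping alignment in cents; it is an illustrative baseline, not the assessment system distributed with the dataset.

```python
# Illustrative baseline, not the dataset's own assessment system: align a
# performance f0 series to a reference f0 series with DTW and report the
# mean absolute pitch deviation (in cents) along the alignment path.
import numpy as np

def dtw_mean_cents(perf_hz, ref_hz):
    a = 1200.0 * np.log2(np.asarray(perf_hz, dtype=float) / 440.0)  # cents re A4
    b = 1200.0 * np.log2(np.asarray(ref_hz, dtype=float) / 440.0)
    n, m = len(a), len(b)
    cost = np.abs(a[:, None] - b[None, :])
    acc = np.full((n + 1, m + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            acc[i, j] = cost[i - 1, j - 1] + min(acc[i - 1, j],
                                                 acc[i, j - 1],
                                                 acc[i - 1, j - 1])
    # Backtrack to collect local costs along the optimal path.
    i, j, path_costs = n, m, []
    while i > 0 or j > 0:
        path_costs.append(cost[i - 1, j - 1])
        _, i, j = min((acc[i - 1, j - 1], i - 1, j - 1),
                      (acc[i - 1, j], i - 1, j),
                      (acc[i, j - 1], i, j - 1))
    return float(np.mean(path_costs))

perf = np.full(80, 435.0)   # toy data: slightly flat A4
ref = np.full(100, 440.0)
print(round(dtw_mean_cents(perf, ref), 1))  # ~19.8 cents flat
```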
2004 IEEE International Conference on Multimedia and Expo (ICME) (IEEE Cat. No.04TH8763)
One of the most important characteristics of music is the singing voice. Although the identity and characteristics of the singing voice are important cues for recognizing artists, groups, and musical genres, these cues have not yet been fully utilized in computer audition algorithms. A first step in this direction is the identification of segments within a song where there is a singing voice. In this paper, we present experiments in the automatic extraction of singing voice structure. The main characteristic of the proposed approach is that the segmentation is performed specifically for each individual song using a process we call bootstrapping: a small random sample of the song is annotated by the user, this annotation is used to learn the song-specific voice characteristics, and the trained classifier is then used to classify and segment the whole song. We present experimental results on a collection of pieces with jazz singers that show the potential of this approach and compare it with the traditional approach of using multiple songs for training. We believe the idea of song-specific bootstrapping is applicable to other types of music and to computer-supported audio annotation.
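The bootstrapping idea maps naturally onto a standard frame-classification pipeline, sketched below with MFCC features and a random-forest classifier. The feature set, the 5% annotation sample, and the classifier are assumptions for illustration, not the paper's configuration, and `user_labels` stands in for the manual annotation step.

```python
# Simplified illustration of song-specific bootstrapping: the user labels a
# small random sample of frames in ONE song, a classifier is trained on just
# those frames, and the rest of the same song is segmented with it.
# MFCC features, the 5% sample size, and the random forest are assumptions.
import numpy as np
import librosa
from sklearn.ensemble import RandomForestClassifier

y, sr = librosa.load("jazz_song.wav", sr=None, mono=True)  # hypothetical file
X = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13).T          # frames x features

rng = np.random.default_rng(0)
n_frames = X.shape[0]
sample = rng.choice(n_frames, size=max(1, n_frames // 20), replace=False)

def user_labels(frame_indices):
    """Placeholder for the manual step: in the real workflow a person listens
    to each sampled frame and marks it voice (1) or accompaniment (0)."""
    return rng.integers(0, 2, size=len(frame_indices))  # random stand-in

clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X[sample], user_labels(sample))

# Classify every frame of the same song with the song-specific model.
voice_frames = clf.predict(X)
```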
2010
In this article we describe the approach we follow to analyze the performance of a singer singing a reference song. The idea is to rate the performance in the same way a music tutor would: not only giving a score but also giving feedback on how the user performed with respect to expression, tuning, and tempo/timing. We also discuss what visual feedback is relevant for the user. Segmentation at an intra-note level is done using an algorithm based on untrained HMMs, with probabilistic models built from a set of heuristic rules that determine regions and their probability of being expressive features. A real-time karaoke-like system is presented in which a user can sing while simultaneously visualizing feedback and performance results. The technology can be applied to a wide range of applications, from pure entertainment to more serious, education-oriented uses.
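The untrained-HMM segmenter is not reproduced here, but the flavor of its heuristic rules can be sketched: frame-level measurements such as pitch stability and energy slope are mapped to soft scores for regions like attack and sustain, the kind of quantity that could feed an HMM's observation probabilities. All window sizes and thresholds below are illustrative assumptions.

```python
# Illustrative flavor of heuristic-rule scoring for intra-note regions:
# each rule maps frame measurements to a soft [0, 1] score. Window sizes
# and thresholds are assumptions, not values from the paper.
import numpy as np

def stability(f0_semitones, win=10):
    """Low local pitch variance -> close to 1 (sustain-like), else -> 0."""
    pad = np.pad(f0_semitones, win // 2, mode="edge")
    var = np.array([np.var(pad[i:i + win]) for i in range(len(f0_semitones))])
    return 1.0 / (1.0 + 10.0 * var)          # soft score in (0, 1]

def rising_energy(rms, win=10):
    """Positive short-term energy slope -> attack-like score."""
    pad = np.pad(rms, (win, 0), mode="edge")
    slope = rms - pad[:-win]
    return np.clip(slope / (np.max(np.abs(slope)) + 1e-9), 0.0, 1.0)

# Toy note: gliding attack into a steady sustain, with a crescendo onset.
f0 = np.concatenate([np.linspace(67, 69, 20), np.full(80, 69.0)])
rms = np.concatenate([np.linspace(0.0, 1.0, 20), np.full(80, 1.0)])
attack_score = rising_energy(rms) * (1.0 - stability(f0))
sustain_score = stability(f0) * (1.0 - rising_energy(rms))
print(np.argmax(attack_score), np.argmax(sustain_score))
```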
This paper describes the challenges that arise when attempting to automatically extract pitch-related performance data from recordings of the singing voice. The first section of the paper provides an overview of the history of analyzing recorded performances. The second section describes an algorithm for automatically extracting performance data from recordings of the singing voice when a score of the performance is available. The algorithm first identifies note onsets and offsets. Once the onsets and offsets have been determined, intonation, vibrato, and dynamic characteristics can be calculated for each note.
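Once a note's boundaries are known, per-note descriptors of the kind listed above reduce to short computations on the note's F0 and amplitude segments. The sketch below shows one plausible version: intonation as mean deviation in cents from the score pitch, vibrato rate as the dominant frequency of the detrended pitch trace, and dynamics as mean RMS. The formulas and frame rate are assumptions, not the paper's.

```python
# Plausible per-note descriptors once onset/offset are known (illustration;
# the paper's exact formulas are not reproduced). Assumes an f0 segment in Hz
# and an amplitude segment for one note, sampled at `frame_rate` frames/s.
import numpy as np

def note_descriptors(f0_hz, rms, score_midi, frame_rate=100.0):
    # Deviation from the score pitch in cents (MIDI 69 = A4 = 440 Hz).
    cents = (1200.0 * np.log2(np.asarray(f0_hz, dtype=float) / 440.0)
             + 6900.0 - 100.0 * score_midi)
    intonation = float(np.mean(cents))        # + = sharp, - = flat

    # Vibrato rate: dominant frequency of the mean-removed pitch trace.
    detrended = cents - np.mean(cents)
    spectrum = np.abs(np.fft.rfft(detrended))
    freqs = np.fft.rfftfreq(len(detrended), d=1.0 / frame_rate)
    vibrato_hz = float(freqs[np.argmax(spectrum[1:]) + 1])  # skip DC bin

    dynamics = float(np.mean(rms))            # crude loudness descriptor
    return intonation, vibrato_hz, dynamics

# Toy note: A4 with a 6 Hz, +/-50 cent vibrato, one second at 100 frames/s.
t = np.arange(100) / 100.0
f0 = 440.0 * 2.0 ** ((50.0 / 1200.0) * np.sin(2.0 * np.pi * 6.0 * t))
print(note_descriptors(f0, np.full(100, 0.8), score_midi=69))
```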
Psychology of Music, 2000
In psychological and cross-cultural (e.g., ethnomusicological) research, the analysis of song-singing has always been an intricate and serious obstacle. Singing is a transient and mostly unstable patterning of vocal sounds that is organized by applying more or less linguistic and musical rules. Traditionally, a sung performance has been analyzed by mere listening and by using Western musical notation to represent its structure. Since this method neglects any in-between categories with respect to pitch and time, it proves to be culturally biased. Acoustic measures as used in speech analysis, however, have had limited application and were primarily used to quantify isolated parameters of sung performances. For analyzing and representing the organization of pitch in relation to the syllables of the lyrics, and its temporal structure, we devised a computer-aided method combined with a new symbolic representation. The computer program provides detailed acoustic measures on pitch…
Computer evaluation of singing interpretation has traditionally been based exclusively on tuning and tempo. This article presents a tool for the automatic evaluation of singing voice performances that considers not only tuning and tempo but also the expression of the voice. For this purpose, the system performs analysis at the note and intra-note levels. Note-level analysis outputs traditional note pitch, note onset, and note duration information, while intra-note-level analysis locates and categorizes the expression of each note's attacks, sustains, transitions, releases, and vibratos. Segmentation is done using an algorithm based on untrained HMMs with probabilistic models built from a set of heuristic rules. A graphical tool for evaluating and fine-tuning the system is presented. The interface gives feedback about analysis descriptors and rule probabilities.
Studies in Classification, Data Analysis, and Knowledge Organization, 2006
AES 121st Convention, 2006
Adaptive Multimedia Retrieval. Large-Scale Multimedia Retrieval and Evaluation, 2013
Multimedia Tools and Applications, 2024