Abstract—Automatic singer identification (SNID) aims to determine who among a set of singers performs a given music recording. So far, most existing SNID methods follow a framework stemming from speaker identification (SPID) research, which models each person's characteristics using his/her voice data. This framework, however, is impractical in many SNID applications, because acquiring solo a cappella recordings from each singer is usually not as feasible as collecting spoken data from each speaker in SPID applications. In view of the easy availability of spoken data, this work investigates the possibility of modeling singers' voices using spoken data instead of singing data. However, our experiments found it difficult to replace singing data entirely with spoken data in singer voice modeling, owing to the significant difference between singing and speaking for most people. Thus, we propose an alternative solution based on the use of a small amount of available singing data. The idea is to modify speech-derived voice models via MAP adaptation on the limited singing data, so that the adapted voice models can cover performers' singing characteristics. Our experiments show that most singing clips can be correctly identified using the adapted voice models.
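The abstract's central idea, adapting a speech-derived voice model toward a small amount of singing data, is commonly realized by MAP adaptation of Gaussian mixture model (GMM) means. The sketch below is an illustrative assumption, not the paper's exact formulation: the function name `map_adapt_means`, the relevance factor of 16, and the use of diagonal-covariance GMMs are all standard choices supplied here for illustration.

```python
import numpy as np

def map_adapt_means(means, covars, weights, X, relevance=16.0):
    """MAP-adapt the mean vectors of a diagonal-covariance GMM
    (e.g., a speech-derived voice model) toward a small set of
    adaptation frames X (e.g., a few singing feature vectors).

    means:   (K, D) component means
    covars:  (K, D) diagonal covariances
    weights: (K,)   mixture weights
    X:       (N, D) adaptation feature vectors
    """
    # Per-frame, per-component Gaussian log-likelihoods.
    diff = X[:, None, :] - means[None, :, :]              # (N, K, D)
    log_prob = -0.5 * (np.sum(diff ** 2 / covars, axis=2)
                       + np.sum(np.log(2 * np.pi * covars), axis=1))
    log_post = log_prob + np.log(weights)
    log_post -= np.max(log_post, axis=1, keepdims=True)   # numerical stability
    post = np.exp(log_post)
    post /= post.sum(axis=1, keepdims=True)               # responsibilities (N, K)

    n_k = post.sum(axis=0)                                # soft counts (K,)
    ex_k = post.T @ X                                     # weighted sums (K, D)
    # Interpolate between the adaptation-data statistics and the prior
    # (speech-trained) means; components seeing little singing data
    # remain close to their speech-derived values.
    alpha = (n_k / (n_k + relevance))[:, None]
    return alpha * (ex_k / np.maximum(n_k, 1e-10)[:, None]) + (1 - alpha) * means
```

With this scheme, identification then proceeds as in conventional GMM-based SPID, but scoring each test clip against the adapted models.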
Index Terms—Model adaptation, singer identification, speaker identification.
W. H. Tsai and H. C. Lee are with the Department of Electronic Engineering and Graduate Institute of Computer and Communication Engineering, National Taipei University of Technology, Taipei, Taiwan (tel.: +886-2-27712171 x 2257; fax: +886-2-27317120; e-mail: email@example.com).
Cite: Wei-Ho Tsai and Hsin-Chieh Lee, "Automatic Singer Identification Based on Speech-Derived Models," International Journal of Future Computer and Communication, vol. 1, no. 2, pp. 94-96, 2012.