STRUCTURED ABSTRACT

Introduction

Human voices are divided into female and male voices, and each group is traditionally classified into three main types (Lycke, 2013): soprano, mezzo-soprano, and alto for female voices, and tenor, baritone, and bass for male voices.

A singer's success during voice training and in professional life depends directly on the correct identification of the voice type, along with many other factors (Otacıoglu, 2012). In classical vocal training, voices are classified according to their color, range, type, and quality (Kazancıoğlu, 2008; Lycke, 2013). Voice classification should be approached carefully and without haste in the early stages of voice training, and it is important for correct voice use, vocal health, and voice protection (Ekici, 2016). Although it is important to classify the voice as early as possible and to structure the training accordingly (Ekici, 2016), the negative effects of an incorrect classification on a singer's training and development should not be forgotten. An incorrect classification can waste the time spent on voice training and can cause vocal health problems, frustration, and anxiety.

Formant analysis is an important tool for analyzing voices. Different definitions of the formant concept can be found in the literature (Polrolniczak and Kramarczyk, 2017). For example, a formant is defined as a peak in the spectrum of the human audio signal (Tan and Dong, 2011) and, in phonetics, as an acoustic resonance of the human vocal tract (Cleveland, 1977). Formants are shaped by the articulators (tongue, lips, jaw, etc.) during voice production (Dinler and Karabiber, 2017). Because they are distinctive frequency components of the acoustic signal produced in speech or singing, they are important for identifying the audio signal. Formants are usually measured as amplitude peaks in the frequency spectrum of the sound, using a spectrogram or a spectrum analyzer. The formant frequencies commonly used in speech analysis are denoted F1, F2, F3, F4, and F5. Although vowels have between four and six formants, three of them (F1, F2, F3) are sufficient to make a vowel intelligible (Önen, 2012).

A person's voice type is usually determined through the subjective evaluation of an expert voice teacher working at the piano. Such an evaluation can identify the voice type, but it may sometimes lead to inconsistent results. It is therefore important to establish a computer-aided objective evaluation that can support the determination of the voice type. This study aims to obtain initial results toward such an evaluation by using fundamental frequency analysis and formant analysis to classify tenor and baritone voices.

Materials and Methods

The study group consisted of 10 tenors and 10 baritones, selected by purposive sampling from the third year of the conservatory opera department of a state university in the Western Black Sea Region of Turkey. The participants stated that they had received between 6 months and 1 year of voice training in preparation for the conservatory entrance examinations. Only 3 of the baritones had graduated from a fine arts high school; the other 7 baritones and all 10 tenors had graduated from other high schools. In addition, all tenor and baritone participants have the same vocal color: lyric (a high-pitched, bright sound).

In the first stage of the research, the 20 participants sang the vowels “a” and “e” in response to different notes played on the piano, and professional recordings were made with the iTrack Dock Studio system. The vowels “a” and “e” were selected because they are among the basic vowels used in voice training (Appelman, 1986; Aycan and Neimetzade, 2018; Gendrot and Adda-Decker, 2007; McGinnis, Elnick and Kraichman, 1951; McKinney, 2005; Saruhan, 2014; Sundberg, 2013; Titze, 1994). The notes played on the piano were C4 (261.6 Hz), E4 (329.6 Hz), G4 (392 Hz), and C5 (523.3 Hz) for the tenors, and A3 (220 Hz), C#4 (277 Hz), E4 (329.6 Hz), and A4 (440 Hz) for the baritones.
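The piano frequencies quoted above can be reproduced with the standard equal-temperament formula. The following is a minimal sketch that assumes twelve-tone equal temperament with A4 = 440 Hz; the function name `note_frequency` and the note-name parser are illustrative and are not part of the study.

```python
import re

# Semitone offsets of the natural notes within an octave (C = 0).
NOTE_OFFSETS = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}


def note_frequency(name, a4_hz=440.0):
    """Return the equal-tempered frequency in Hz of a note such as 'C#4'.

    Assumption: twelve-tone equal temperament tuned to A4 = a4_hz.
    """
    match = re.fullmatch(r"([A-G])([#b]?)(-?\d+)", name)
    if match is None:
        raise ValueError(f"unrecognized note name: {name!r}")
    letter, accidental, octave = match.groups()
    semitone = NOTE_OFFSETS[letter] + {"#": 1, "b": -1, "": 0}[accidental]
    midi = 12 * (int(octave) + 1) + semitone  # MIDI convention: A4 = 69
    return a4_hz * 2 ** ((midi - 69) / 12)


# Notes played for each voice type, as listed in the text.
TENOR_NOTES = ["C4", "E4", "G4", "C5"]
BARITONE_NOTES = ["A3", "C#4", "E4", "A4"]

for label, notes in (("tenor", TENOR_NOTES), ("baritone", BARITONE_NOTES)):
    print(label, [(n, round(note_frequency(n), 1)) for n in notes])
```

Under this assumption the computed values agree with the listed frequencies to within rounding.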

The recordings were analyzed with the Praat v6.0.28 software (Boersma and Weenink, 2017) to obtain the fundamental frequency and formant values. All figures in this study were then produced with the MATLAB 2017a software (Mathworks, 2017) from the data obtained with Praat.
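The Praat settings used for the analysis are not given here. Purely to illustrate what estimating a fundamental frequency involves, the sketch below applies a basic autocorrelation method to a synthetic sine tone. It is not Praat's implementation; the function names, the 8000 Hz sampling rate, the 75-500 Hz search range, and the test tone are all assumptions made for the example.

```python
import math


def synth_tone(freq_hz, sample_rate, duration_s):
    """Generate a pure sine tone as a list of samples (a stand-in for a recording)."""
    n = int(sample_rate * duration_s)
    return [math.sin(2 * math.pi * freq_hz * i / sample_rate) for i in range(n)]


def estimate_f0(samples, sample_rate, f0_min=75.0, f0_max=500.0):
    """Estimate F0 as sample_rate / lag, where lag maximizes the autocorrelation.

    Only lags corresponding to frequencies between f0_min and f0_max are searched.
    """
    min_lag = int(sample_rate / f0_max)
    max_lag = int(sample_rate / f0_min)
    best_lag, best_value = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        # Unnormalized autocorrelation of the signal with itself shifted by `lag`.
        value = sum(samples[i] * samples[i + lag] for i in range(len(samples) - lag))
        if value > best_value:
            best_lag, best_value = lag, value
    return sample_rate / best_lag


# Hypothetical input: a 0.1 s synthetic tone at A3 (220 Hz), sampled at 8000 Hz.
tone = synth_tone(220.0, 8000, 0.1)
print(f"estimated F0: {estimate_f0(tone, 8000):.1f} Hz")
```

Because the lag is restricted to whole samples, the estimate is quantized and can differ from the true frequency by a few hertz at this sampling rate; practical tools refine the peak by interpolation.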

Results and Discussion

In this study, fundamental frequency analysis in terms of F0 and formant analysis in terms of F1, F2, and F3 were performed.

First, the fundamental frequency analysis shows that, for the same note, the fundamental frequency values of the tenors and baritones are much lower than the ideal fundamental frequency values of the piano. This is an expected result, because the frequency spectra covered by instrument sounds and by human voices are different. The main result of this analysis is that both the tenors and the baritones were able to raise the frequency of their voices in parallel with the notes played on the piano. In addition, all fundamental frequency values of the vowels ‘a’ and ‘e’ sung by the tenors and baritones were consistent with the corresponding ranges given in Table 1. When the mean fundamental frequencies of the vowels ‘a’ and ‘e’ sung by the two voice types at E4 were compared, the values for the tenors were slightly higher (by 0.45 Hz) than those for the baritones.

In the F1 formant analysis, the vowel ‘a’ yielded a range of 711-818 Hz for the tenors and 681.4-710.7 Hz for the baritones, and the vowel ‘e’ yielded a range of 575.5-663.5 Hz for the tenors and 524.5-592 Hz for the baritones. The F1 values of the vowel ‘a’ are therefore higher than those of the vowel ‘e’. In addition, the F1 range is much wider for the tenors than for the baritones, especially for the vowel ‘a’. When the F1 frequencies of the two voice types were compared at E4, F1 for the vowel ‘a’ was 32.9 Hz higher for the tenors than for the baritones, whereas F1 for the vowel ‘e’ was 5.2 Hz higher for the baritones than for the tenors.

In the F2 formant analysis, the vowel ‘a’ yielded a range of 1290.2-1352.8 Hz for the tenors and 1272-1371.9 Hz for the baritones, and the vowel ‘e’ yielded a range of 1640.1-1699.7 Hz for the tenors and 1595.4-1632.4 Hz for the baritones. The F2 values of the vowel ‘a’ are therefore lower than those of the vowel ‘e’, the opposite of the situation for F1. In addition, there was no significant difference between the F2 ranges of the tenors and the baritones, although the widest range was obtained for the baritones in the vowel ‘a’. When the F2 frequencies of the two voice types were compared at E4, the tenors had a higher F2 value than the baritones: the difference is 12.5 Hz for the vowel ‘a’ and 59.5 Hz for the vowel ‘e’.

Finally, in the F3 formant analysis, the vowel ‘a’ yielded a range of 2456.6-2530.1 Hz for the tenors and 2285.8-2330.1 Hz for the baritones, and the vowel ‘e’ yielded a range of 2298.7-2384.3 Hz for the tenors and 2230-2245.7 Hz for the baritones. The F3 values of the vowel ‘a’ are therefore higher than those of the vowel ‘e’. In addition, the F3 range is much wider for the tenors than for the baritones. When the F3 frequencies of the two voice types were compared at E4, the tenors had a higher F3 value than the baritones: the difference is 209.2 Hz for the vowel ‘a’ and 104.2 Hz for the vowel ‘e’.
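The comparisons of range widths in the three formant analyses can be checked directly from the reported bounds. The following sketch only tabulates the figures quoted above; the dictionary layout and the helper names `range_width` and `width_table` are illustrative assumptions, and no new data are introduced.

```python
# Reported formant ranges in Hz as (low, high), copied from the Results text.
REPORTED_RANGES = {
    "F1": {
        "a": {"tenor": (711.0, 818.0), "baritone": (681.4, 710.7)},
        "e": {"tenor": (575.5, 663.5), "baritone": (524.5, 592.0)},
    },
    "F2": {
        "a": {"tenor": (1290.2, 1352.8), "baritone": (1272.0, 1371.9)},
        "e": {"tenor": (1640.1, 1699.7), "baritone": (1595.4, 1632.4)},
    },
    "F3": {
        "a": {"tenor": (2456.6, 2530.1), "baritone": (2285.8, 2330.1)},
        "e": {"tenor": (2298.7, 2384.3), "baritone": (2230.0, 2245.7)},
    },
}


def range_width(bounds):
    """Width of a (low, high) range in Hz, rounded to one decimal place."""
    low, high = bounds
    return round(high - low, 1)


def width_table(ranges=REPORTED_RANGES):
    """Return {(formant, vowel, voice): width_hz} for every reported range."""
    return {
        (formant, vowel, voice): range_width(bounds)
        for formant, vowels in ranges.items()
        for vowel, voices in vowels.items()
        for voice, bounds in voices.items()
    }


for (formant, vowel, voice), width in width_table().items():
    print(f"{formant} '{vowel}' {voice:<8} width = {width:6.1f} Hz")
```

On these figures the F1 range for ‘a’ is 107.0 Hz wide for the tenors against 29.3 Hz for the baritones, and the widest F2 range is that of the baritones for ‘a’ (99.9 Hz), in line with the statements above.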

Conclusion and Suggestions

This study concludes that fundamental frequency analysis performed on the same note (E4 in this study) for the vowels ‘a’ and ‘e’ is not a meaningful tool for distinguishing tenors from baritones, whereas formant analysis in terms of F1, F2, and F3 can be used successfully to distinguish the two voice groups. The results show that the tenors have higher F1, F2, and F3 formants than the baritones in both vowels, except for F1 in the vowel ‘e’. In addition, in the formant analysis performed at E4, F1 and F3 were more effective in separating the two voice types for the vowel ‘a’, while F2 and F3 were more effective for the vowel ‘e’. Therefore, in addition to subjective auditory assessment, computer-aided formant analysis of the vowels ‘a’ and ‘e’ is a successful objective tool that can be used to classify tenor and baritone voices.
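The statement about which formants separate the two voice types best can be reproduced by ranking the reported E4 differences by magnitude. This is a minimal sketch using only the differences quoted in the Results; the sign convention (tenor minus baritone) and the function name `rank_formants` are assumptions made for the illustration.

```python
# Reported tenor-minus-baritone formant differences at E4, in Hz.
# A negative value means the baritone value was higher.
E4_DIFFERENCES = {
    "a": {"F1": 32.9, "F2": 12.5, "F3": 209.2},
    "e": {"F1": -5.2, "F2": 59.5, "F3": 104.2},
}


def rank_formants(vowel, differences=E4_DIFFERENCES):
    """Return formant names for a vowel, ordered by decreasing absolute difference."""
    by_size = sorted(
        differences[vowel].items(),
        key=lambda item: abs(item[1]),
        reverse=True,
    )
    return [name for name, _ in by_size]


for vowel in E4_DIFFERENCES:
    print(vowel, rank_formants(vowel))
```

Ranked this way, the two largest differences are F3 and F1 for ‘a’, and F3 and F2 for ‘e’, which matches the conclusion drawn above.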

This research is a pilot study. Studies with more participants and with other voice types (bass, soprano, mezzo-soprano, and alto) could yield more comprehensive and more generalizable results. A study could also be conducted with participants who have more singing experience. Only the vowels ‘a’ and ‘e’ were analyzed in this research; other vowels used in voice training, such as ‘i’ and ‘o’, could be examined in a similar study with different musical notes.

Keywords: Voice Type, Tenor, Baritone, Formant, Fundamental Frequency.
