The Effect of Sore Throat on Changes of Vowel Sounds

The aim of this research is to determine the effect of sore throat on speech characteristics. The sound characteristics considered are limited to pitch, formant, and spectrum pattern, and the observed sounds are limited to the vowels /a/, /i/, /u/, /e/, and /o/. There are seven male respondents aged 35 to 50 years. Changes in sound characteristics are obtained by comparing the pitch, formant, and spectrum pattern in the healthy condition with those in the sore throat condition. Each change is grouped into one of two categories: a shift to a lower frequency or a shift to a higher frequency. The analysis shows that, in general, 71.4% of the pitches shift to lower frequencies and 25.7% shift to higher frequencies, while the rest do not change. In the formant analysis, the largest change in formant pattern occurs in the vowel /a/, while the smallest occurs in the vowel /i/. Furthermore, the change in spectrum pattern is quantified using cross-correlation. The cross-correlation results show that the spectrum pattern of the vowel /a/ is the most affected by sore throat.


INTRODUCTION
Sound is a form of energy propagating through the air. Sound analysis is widely used in various fields, one of which is health. One use of sound analysis in health services was developed by Puspita (Puspasari, 2015) to characterize the frequency of coronary heart sounds. The results of that study show that the average coronary heart sound frequency is more than 200 Hz.
Unlike the heartbeat, the human speaking voice comes from the vibration of the vocal cords caused by the air flowing through them. The sound produced by this vibration then propagates out of the mouth through the oral cavity. Because each person has a different oral cavity structure, the sound produced by each person seems different even when it is produced at the same tone frequency (Rabiner & Schafer, 2010) and (Rahim & Malik, 2015).
Theoretically, the sound frequencies heard by humans lie between 20 Hz and 20,000 Hz, but the sound that humans can produce with the vocal cords lies only between 85 Hz and 1.1 kHz (Li, et al, 2017). Normal human conversation commonly does not exceed 2 kHz.
Pitch and formant are among the parameters used to recognize the characteristics of a human voice. According to (de Cheveigné, 2005), pitch is the main frequency of a sound. Pitch has the highest amplitude compared with the other constituent frequencies, while a formant is a combination of several frequencies that contribute to shaping the color of a sound. The extraction process for obtaining pitch and formant can refer to (Schnupp, 2014) and (Aadit, et al, 2017).
Many studies in the field of acoustics use pitch and formant for various purposes, including gender classification (Rahim & Malik, 2015), identification of the conversation characteristics of a nation (Aadit, et al, 2017) (Suyudi & Saptono, 2017), and prediction of a person's emotions (Mohanta & Mittal, 2016). In the health sector, sound analysis is mostly used to predict lung disorders such as respiratory syndrome (Gutierrez, et al, 2010), asthma (Batra, et al, 2015), and pneumonia (Maulidin, 2018). However, no studies have yet been found that apply sound analysis specifically to health problems in the upper throat or larynx.
The throat, as one of the organs involved in sound generation, is not always in a healthy condition. Throat disorders usually take the form of inflammation that results in coughing or hoarseness. These disorders change the characteristics of the sound produced. Several studies related to throat disorders have been carried out by other researchers, including (Prasetya, et al, 2015), which uses digital image processing to detect tonsillitis. In that study, abnormalities were decided based on the color and extent of the tonsils, with an accuracy of up to 90.6%. Other research related to respiratory disorders was carried out by (Shrivastava, et al, 2018), who use pitch and formant parameters to distinguish healthy people from those with respiratory problems. The results show that the average F1 formant frequency of people with respiratory problems is higher than that of normal people, while F2, F3, and pitch are lower.
Several aspects of (Shrivastava, et al, 2018) need to be clarified or discussed further. First, the number of respondents and the condition of the disease are not clear. Second, comparing healthy and unhealthy sounds from two different people is inappropriate because everyone has a different sound color. Third, respondents spoke 20 words, but the words are not clearly stated in the text. Fourth, the size of the change in pitch and formant values is not explicitly stated. The present research was conducted to obtain a more in-depth study of the changes in pitch, formant, and spectrum pattern between the healthy condition and the condition with health problems, especially larynx disorders. The analysis is conducted on the vowels that are considered the dominant sounds in conversation in Bahasa, i.e. /a/, /i/, /u/, /e/, and /o/.

MATERIAL AND METHODS
The steps taken in this study are data collection, data computation, and analysis of the results, as shown in Figure 1. Seven respondents were involved in the data collection process, all male and aged 35 to 50 years. The first voice data collected are the vowel sounds recorded while the respondents have a sore throat. After each respondent has recovered, the data are collected again for the healthy condition. During data collection, each respondent was asked to pronounce each vowel 10 times. Since this research focuses on the effect of sore throat, the sound characteristics in the two conditions must be compared. The comparison is made in terms of pitch, formant, and spectrum pattern. Obtaining this information requires frequency domain analysis, so the fast Fourier transform (FFT) in Matlab is used to carry out the computation. In general, the computation consists of data extraction, pitch and formant detection, and frequency domain transformation. The results are the pitch, formant, and sound spectrum pattern data for both the healthy and sore throat conditions. The changes in pitch and formant are then analyzed. To find the changes in spectrum pattern, the cross-correlation between the healthy and sore throat spectrum patterns is calculated for each vowel of each respondent. The cross-correlation value indicates the similarity of the two signals.

Data Collection
Data collection from respondents was carried out in two separate sessions in 2017. The first session recorded the voice with a sore throat. The raw data are signals in the time domain. Figure 2 shows a signal snippet of the vowel /a/ of respondent 1 with a sore throat. The other vowels were recorded in the same way, with 10 repetitions each. After the doctor declared respondent 1 healthy, the voice recording was repeated for all vowels. Figure 3 shows a signal snippet of the vowel /a/ of respondent 1 in the healthy condition. The same process was applied to all other respondents. The signal data from all respondents were stored for analysis in the next stage.

Characteristic Extraction
In this step, we extract the pitch, the formants, and the changes in spectrum pattern. The pitch period can be observed in the time domain from the signal pattern, as shown in Figure 4. It can also be obtained using the autocorrelation method given in Equation (1) (Piet M.T. Broersen, 2006).
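As an illustration of the autocorrelation approach, the following Python sketch estimates the pitch of one voiced frame from the lag of the strongest autocorrelation peak. It is not the Matlab implementation used in this study: the function name, the unnormalized autocorrelation estimate, the search range of 85 Hz to 1.1 kHz, and the synthetic test tone are all assumptions made for the example.

```python
import numpy as np

def estimate_pitch_autocorr(x, fs, fmin=85.0, fmax=1100.0):
    """Estimate the pitch (Hz) of a voiced frame from its autocorrelation peak.

    fmin and fmax bound the search range; they are assumed values taken
    from the vocal range quoted in the introduction.
    """
    x = np.asarray(x, dtype=float)
    x = x - x.mean()                                  # remove any DC offset
    # Full autocorrelation, keeping only the non-negative lags
    r = np.correlate(x, x, mode="full")[len(x) - 1:]
    lag_min = int(fs / fmax)                          # shortest period searched
    lag_max = min(int(fs / fmin), len(r) - 1)         # longest period searched
    # The lag of the strongest peak in the range is the pitch period in samples
    lag = lag_min + int(np.argmax(r[lag_min:lag_max + 1]))
    return fs / lag

# Synthetic frame (hypothetical, not recorded data): 125 Hz tone plus a harmonic
fs = 8000
t = np.arange(800) / fs
tone = np.sin(2 * np.pi * 125 * t) + 0.5 * np.sin(2 * np.pi * 250 * t)
pitch = estimate_pitch_autocorr(tone, fs)
print(f"estimated pitch: {pitch:.1f} Hz")
```

Restricting the lag search to a plausible vocal range keeps the zero-lag maximum and very short lags from being mistaken for the pitch period.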
The signal is transformed to the frequency domain with the FFT, which evaluates the discrete Fourier transform given in Equation (2):
$$X(k) = \sum_{n=0}^{N-1} x(n)\, W_N^{kn} \qquad (2)$$
where $X(k)$ is the signal in the frequency domain, $x(n)$ is the signal in the time domain, $N$ is the number of points in the FFT process, $W_N$ is the twiddle factor, equal to $e^{-j2\pi/N}$, and $k$ is the index $0, 1, 2, \ldots, N-1$.
Figure 6 shows the FFT result for the vowel /a/ of respondent 1. Since the sound spectrum lies below 2 kHz, the spectrum display is limited to 2 kHz. Several peaks can be recognized as the formants of the signal. We focus only on the four highest peaks, called formant-1 (F1) to formant-4 (F4), arranged from the lowest frequency to the highest. A peak whose value is lower than $1/(2\sqrt{2})$ is no longer considered a formant, as illustrated in Figure 7 for the vowel /i/ of respondent 6. All formant data are summarized in Table 2, where FXY denotes formant X of respondent Y.
Sound characteristics are then obtained from the spectrum pattern in the frequency domain. Even when two sounds have the same pitch and formants, they have a different sound color if their peak values differ. How much the spectrum pattern differs between the healthy and sore throat conditions can be measured by cross-correlation, as given in Equation (3) (Piet M.T. Broersen, 2006).
where $y(k)$ is the spectrum pattern when there is a sore throat and $z(k)$ is the spectrum pattern when the respondent is healthy. A smaller correlation value means less similarity between the two patterns, and vice versa. Applying Equation (3) to all patterns gives the cross-correlation values shown in Table 3.
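The spectrum, formant, and similarity steps described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the Matlab code of this study: the spectrum is assumed to be normalized to its largest peak before the $1/(2\sqrt{2})$ threshold is applied, the similarity is assumed to be a normalized zero-lag cross-correlation (the exact form of Equation (3) may differ), and the function names and synthetic signals are hypothetical.

```python
import numpy as np

def magnitude_spectrum(x, fs, fmax=2000.0):
    """Return (freqs, magnitude) up to fmax Hz, normalized to a peak of 1."""
    x = np.asarray(x, dtype=float)
    mag = np.abs(np.fft.rfft(x))
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    keep = freqs <= fmax                      # display limit of 2 kHz from the text
    mag = mag[keep]
    return freqs[keep], mag / mag.max()

def formant_peaks(freqs, mag, threshold=1 / (2 * np.sqrt(2)), max_peaks=4):
    """Local maxima at or above the threshold; keep the strongest max_peaks,
    returned in order of increasing frequency (F1, F2, ...)."""
    idx = [i for i in range(1, len(mag) - 1)
           if mag[i] > mag[i - 1] and mag[i] >= mag[i + 1] and mag[i] >= threshold]
    strongest = sorted(idx, key=lambda i: mag[i], reverse=True)[:max_peaks]
    return [float(freqs[i]) for i in sorted(strongest)]

def spectrum_similarity(y, z):
    """Normalized zero-lag cross-correlation of two spectra (assumed form).
    Returns 1.0 for identical shapes and smaller values for dissimilar ones."""
    y = np.asarray(y, dtype=float)
    z = np.asarray(z, dtype=float)
    return float(np.sum(y * z) / np.sqrt(np.sum(y ** 2) * np.sum(z ** 2)))

# Synthetic signals standing in for the recordings (hypothetical values)
fs = 8000
t = np.arange(4000) / fs
healthy = (np.sin(2 * np.pi * 200 * t) + 0.6 * np.sin(2 * np.pi * 400 * t)
           + 0.2 * np.sin(2 * np.pi * 600 * t))
sore = np.sin(2 * np.pi * 190 * t) + 0.6 * np.sin(2 * np.pi * 400 * t)

f_h, z_spec = magnitude_spectrum(healthy, fs)   # z(k): healthy spectrum
f_s, y_spec = magnitude_spectrum(sore, fs)      # y(k): sore throat spectrum
peaks = formant_peaks(f_h, z_spec)              # the 600 Hz peak falls below the threshold
sim_same = spectrum_similarity(z_spec, z_spec)
sim_diff = spectrum_similarity(y_spec, z_spec)
print(peaks, round(sim_same, 3), round(sim_diff, 3))
```

Because the comparison is made bin by bin, both spectra must come from recordings with the same sampling rate and FFT length.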

Discussion
This section discusses the results of the characteristic extraction carried out in the previous section. The discussion focuses on the changes in pitch, formant, and spectrum pattern.
(1) Change in Pitch: Based on the pitch frequencies presented in Table 1, the pitch value changes in almost all sounds, the exception being the vowel /e/ of respondent 4. The percentage change in pitch is presented in Table 4. From Table 4, it is known that 71.4% of the pitches change to a lower frequency when a throat disorder occurs. Overall, the pitch frequency changes by 5% towards a lower frequency.
From the respondents' point of view, six of the seven respondents tend to shift to a lower frequency. The only respondent who tends to shift to a higher frequency is respondent 2.
From the data and analysis above, it can be concluded that sore throat tends to change the pitch to a lower frequency. This result is consistent with the research in (Shrivastava, et al, 2018), where the pitch of people suffering from sore throat is lower than the pitch of normal people. However, an exact pattern cannot be obtained from the given data. The frequency change is very wide, ranging from 0% to 83%, with an average frequency change of 6%.
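The percentage analysis above can be sketched in Python as follows. The pitch pairs are hypothetical placeholders, not the values of Table 1, and the function names are assumptions. The counts of 25, 9, and 1 sounds are not stated in the text; they are inferred from the reported 71.4% and 25.7% of 35 sounds and the single unchanged sound.

```python
def pitch_change_percent(healthy_hz, sore_hz):
    """Signed percentage change of pitch from the healthy to the sore throat condition."""
    return 100.0 * (sore_hz - healthy_hz) / healthy_hz

def classify_shift(change_percent, tol=0.0):
    """Label a change as 'lower', 'higher', or 'unchanged' (tol is an assumed tolerance)."""
    if change_percent < -tol:
        return "lower"
    if change_percent > tol:
        return "higher"
    return "unchanged"

# Hypothetical (healthy, sore throat) pitch pairs in Hz; NOT the data of Table 1
pairs = [(120.0, 110.0), (130.0, 130.0), (115.0, 125.0), (140.0, 126.0)]
changes = [pitch_change_percent(h, s) for h, s in pairs]
labels = [classify_shift(c) for c in changes]
share_lower = 100.0 * labels.count("lower") / len(labels)

# Counts inferred from the reported percentages (7 respondents x 5 vowels = 35 sounds)
inferred = {"lower": 25, "higher": 9, "unchanged": 1}
total = sum(inferred.values())
reported = {k: round(100.0 * v / total, 1) for k, v in inferred.items()}
print(labels, share_lower, reported)
```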
(2) Formant Change: To determine the change in formant position, each formant frequency is compared with the pitch frequency, which gives Table 5. Note that the values given in Table 5 are rounded. A value of '1' means that the formant frequency is the same as the pitch frequency, while '2' means that the formant frequency is about twice the pitch frequency.
Based on Table 5, in 17 of the 35 sounds the formants have the same frequencies in the healthy and sore throat conditions. This means that about 48.57% of the sounds are not affected by sore throat, while the rest, about 51.43%, are changed. As shown in Table 6, there are 8 sounds (22.86%) with added formants, 3 sounds (8.57%) with fewer formant components, 7 sounds (20%) with formant shifts, and 17 sounds (48.57%) without any change. It can be concluded that sore throat changes the formants in 51.43% of the sounds. From Table 6, the vowel most affected by sore throat is /a/, for which all respondents show the same tendency, while the vowel /i/ is the least affected.
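The formant comparison and the percentages above can be reproduced with a short Python sketch. The category counts are those reported in the text. The classification rule (more formants means addition, fewer means reduction, same number at different positions means shift) is an assumed reading of the categories in Table 6, and the function names and example frequencies are hypothetical.

```python
def harmonic_positions(formants_hz, pitch_hz):
    """Round each formant frequency to the nearest multiple of the pitch,
    as in the rounded values described for Table 5."""
    return [round(f / pitch_hz) for f in formants_hz]

def classify_formant_change(healthy, sore):
    """Assumed rule for the four categories described for Table 6."""
    if healthy == sore:
        return "no change"
    if len(sore) > len(healthy):
        return "addition"
    if len(sore) < len(healthy):
        return "reduction"
    return "shift"

# Hypothetical example: formants near 1x, 2x, and 3x a 120 Hz pitch
positions = harmonic_positions([118.0, 245.0, 362.0], 120.0)
example_add = classify_formant_change([1, 2, 3], [1, 2, 3, 4])
example_shift = classify_formant_change([1, 2], [1, 3])

# Counts reported in the text (35 sounds in total)
counts = {"addition": 8, "reduction": 3, "shift": 7, "no change": 17}
total = sum(counts.values())
percent = {k: round(100.0 * v / total, 2) for k, v in counts.items()}
affected = round(100.0 * (total - counts["no change"]) / total, 2)
print(positions, percent, affected)
```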
From the respondents' point of view, every respondent shows a change in at least one vowel sound. Unfortunately, the pattern of the change cannot be determined exactly.
The results discussed above are not in line with the result given in (Shrivastava, et al, 2018), in which people who have a sore throat show a higher F1 and lower F2 and F3. However, we cannot claim which result is more valid, because many variables should be considered, such as gender, age, and ethnicity.
(3) Changes in Spectrum Patterns: The differences in spectrum patterns are indicated by the results of the cross-correlation calculations using Equation (3), which are given in Table 3. A greater value implies a greater similarity between the two patterns. From Table 3, the lowest similarity is found in the spectrum pattern of the vowel /a/, while the highest similarity is found in the spectrum patterns of the vowels /u/ and /i/. This means that sore throat brings a large change in the spectrum pattern of the vowel /a/, but not in those of the vowels /u/ and /i/. This finding is consistent with the formant analysis discussed previously.
From the discussion above, it can be emphasized that a sore throat around the pharynx strongly affects the pronunciation of the vowel /a/. This makes sense because the vowel /a/ originates from the larynx, where the disorder is located. In contrast, the vowels /i/ and /u/ are formed at the upper-front palate, which is less affected by pharyngeal disorders.
Furthermore, a sore throat detection system can be built by utilizing the results of this study. Sore throat can be detected by recognizing irregularities in the characteristics of the vowel /a/. A more in-depth study of these aspects and of their detection techniques is needed as further work.

CONCLUSION
The results of this research show that a sore throat around the pharynx affects the pitch frequency, the formants, and the spectrum pattern. Tests on seven respondents for the vowels /a/, /i/, /u/, /e/, and /o/ show that sore throat shifts 71.4% of the pitches towards lower frequencies. Unfortunately, the size of the pitch shift cannot yet be determined exactly. In the formant pattern, the vowel /a/ is the most affected by sore throat, while the vowel /i/ is the least affected. Furthermore, the cross-correlation calculation shows that the largest change in frequency spectrum pattern occurs in the vowel /a/, while the smallest changes occur in the vowels /u/ and /i/. Because the vowel /a/ shows the largest change in characteristics when there is a sore throat, it is quite possible to detect sore throat from the characteristics of the vowel /a/. However, further studies are needed to find the most appropriate detection technique.