Multi-Abnormal ECG Signal Classification using Dispersion Entropy and Statistic Feature

The electrocardiogram (ECG) is one of the most widely used tools for diagnosing heart disease. Abnormal ECG signals vary widely, and some types closely resemble one another. Therefore, in this study, we propose a method for classifying cardiac abnormalities based on ECG, using first-order statistical features and Dispersion Entropy (DisEn) for feature extraction. For the multi-abnormal ECG signal classification stage, we compare the Support Vector Machine (SVM) and K-Nearest Neighbor (KNN) methods. Seven ECG classes were classified, namely Normal, Atrial Fibrillation (AFIB), Atrial Flutter (AFL), Atrial Premature Beats (APB), Bigeminy, Left Bundle Branch Block (LBBB), and Premature Ventricular Contraction (PVC). In simulation, the system detects normal and abnormal signals with an accuracy of 85.1% using K-NN, while the seven-class classification reaches an accuracy of up to 75.1%.


INTRODUCTION
One of the most commonly used procedures for identifying heart disease is the electrocardiogram (ECG) (Martis et al., 2013) (Kaur and Arora, 2012). Using electrodes placed on the skin, the ECG records the heart's electrical activity over time. This signal is used to assess cardiovascular health and can be interpreted from its four basic components: the P wave, the QRS complex, the T wave, and the U wave (Manullang, Simanjuntak, and Ramdani 2019) (Pestana et al., 2020) (Sahoo et al., 2017). Our prior work built an ECG signal classification system using SVM as the classifier (Aulia and Hadiyoso, 2021). Its accuracy, sensitivity, and specificity were 81.1%, 89.8%, and 79.4%, respectively. However, that method still needs to be refined to aid clinical diagnosis. In this paper, we therefore aim to improve on the performance of the previous experiment (Aulia and Hadiyoso, 2021) and classify ECG signals into seven categories: Normal, Atrial Fibrillation (AFIB), Atrial Flutter (AFL), Atrial Premature Beats (APB), Bigeminy, Left Bundle Branch Block (LBBB), and Premature Ventricular Contraction (PVC).
Prior research on classifying ECG data has generally used the Support Vector Machine (SVM) technique, both for ECG classification and for epileptic ECG signal categorization (Rizal and Hadiyoso, 2018). Both experiments gave satisfactory results, with 93.8% and a precision of 97.7%, respectively. Another widely used classifier is the K-NN method (Hassanat et al., 2014), because it is simple and highly effective in image processing, machine learning, text analysis, data mining, object recognition, and other fields (Aulia et al., 2015; Zhang et al., 2017). Based on the state of the art above, in this study we use DisEn combined with first-order statistics (mean, variance, skewness, kurtosis) for feature extraction. For classification, we compare SVM and K-NN. Five-fold cross-validation was used to divide the data into training and test sets. The data originate from a single lead, in .mat format, retrieved from the MIT-BIH Arrhythmia database on PhysioNet.

MATERIAL AND METHODS
The system overview is presented in Figure 1. The essential features of each input ECG signal, which has a duration of 3 seconds, are extracted using statistical methods (mean, variance, skewness, and kurtosis) and DisEn. Each ECG signal yields 5 features, which together form a feature set. The feature set then serves as the predictor in the classification stage. The materials and methods used in this study are described in the following sub-sections.

ECG Dataset
The ECG signal database used in this research was recorded with a gain of 200 adu/mV at a sampling frequency of 360 Hz. Recordings were taken from 45 patients, with a minimum age of 23 years for women and 32 years for men, and a maximum age of 89 years. There were 19 female patients and 26 male patients, and the data were classified into 7 classes, i.e. Normal, Atrial Fibrillation (AFIB), Atrial Flutter (AFL), Atrial Premature Beats (APB), Bigeminy, Left Bundle Branch Block (LBBB), and Premature Ventricular Contraction (PVC). The data come from one lead, in .mat format, obtained from the MIT-BIH Arrhythmia database on PhysioNet (http://www.physionet.org). A total of 794 ECG signals, each about 3 seconds long, were used in this study.

First Order Statistical Feature Extraction
The first-order statistical features calculated are the mean, variance, skewness, and kurtosis. These features are calculated using Equations (1)-(4) (Esmael et al., 2013),
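where $N$ is the number of samples, $\mu$ is the mean, $x_i$ is the $i$-th data sample, $\sigma^2$ is the variance, and the skewness and kurtosis are derived from the third and fourth central moments.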
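As an illustration, the four first-order features can be computed with a short Python sketch. It assumes population (1/N) normalisation and non-excess kurtosis, which may differ from the convention used in Equations (1)-(4); the function name `first_order_features` is hypothetical.

```python
def first_order_features(x):
    """Return (mean, variance, skewness, kurtosis) of a 1-D sequence.

    Assumptions (not confirmed by the paper): 1/N normalisation and
    non-excess kurtosis. A constant signal (zero variance) is not handled.
    """
    n = len(x)
    mean = sum(x) / n
    # Central moments of order 2, 3 and 4.
    m2 = sum((v - mean) ** 2 for v in x) / n
    m3 = sum((v - mean) ** 3 for v in x) / n
    m4 = sum((v - mean) ** 4 for v in x) / n
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    return mean, m2, skewness, kurtosis


# Toy segment for illustration only (not ECG data).
segment = [1.0, 2.0, 3.0, 4.0, 5.0]
features = first_order_features(segment)
```

For a 3-second segment sampled at 360 Hz, `x` would hold roughly 1080 samples, and these four values plus DisEn would form the 5-element feature set.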

Feature Extraction Using Dispersion Entropy
Entropy is an effective measure of the irregularity and uncertainty of a time series. In 1948, Shannon introduced the concept of entropy, based on probability distributions, to measure the regularity of a time series. A system has maximum entropy when all of its states are equally likely, and minimum entropy when it always takes the same state, that is, when the probability of one state is one (Rostaghi and Azami, 2016) (Richman and Moorman, 2000).
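The two extreme cases can be checked numerically with a minimal sketch of Shannon entropy. The natural logarithm is an assumption here, and the function name is hypothetical.

```python
import math


def shannon_entropy(probs):
    """Shannon entropy in nats; zero-probability states contribute nothing."""
    return -sum(p * math.log(p) for p in probs if p > 0)


# Four equally likely states: maximum entropy, equal to ln(4).
uniform = shannon_entropy([0.25, 0.25, 0.25, 0.25])
# One state with probability one: minimum entropy, equal to zero.
certain = shannon_entropy([1.0, 0.0, 0.0, 0.0])
```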
Sample entropy (SE) (Humeau, 2018; Zaylaa et al., 2015) (Sharma et al., 2015) and permutation entropy (PE) (Zanin et al., 2012) are entropy measures commonly used for biomedical signals. The disadvantage of SE is that it is slow to compute, especially for long signals, while the drawback of PE is that it considers neither the mean amplitude value nor the differences between amplitude values (Redelico et al., 2017).
Dispersion entropy (DisEn) can overcome these shortcomings of SE and PE. The method converts the data into a new signal. Signal parameters such as amplitude, frequency, noise power, and bandwidth are taken into account as the signal ranges from randomness to periodic oscillation (Azami and Escudero, 2018). The signal processes considered include autoregressive processes, the MIX process, increasing noise bandwidth, and increasing additive noise power. The new signal is divided into patterns, and the probability of each pattern is calculated (Kafantaris et al., 2019).
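The DisEn algorithm includes 4 main steps for a univariate signal of length $N$, $x = \{x_1, x_2, \ldots, x_N\}$ (Rostaghi and Azami, 2016):
1. Map $x_j$ ($j = 1, 2, \ldots, N$) to $c$ classes, labeled from 1 to $c$. A number of linear and nonlinear approaches can be used for this mapping. Here, the normal cumulative distribution function (NCDF) is used to map $x$ into $y = \{y_1, y_2, \ldots, y_N\}$ with values from 0 to 1, and each $y_j$ is then assigned to a class $z_j^c = \mathrm{round}(c \cdot y_j + 0.5)$. The resulting signal $z^c$ has $N$ members, and each member is an integer from 1 to $c$.
2. Form the embedding vectors $z_i^{m,c} = \{z_i^c, z_{i+d}^c, \ldots, z_{i+(m-1)d}^c\}$, $i = 1, 2, \ldots, N-(m-1)d$, where $m$ is the embedding dimension (template length), $d$ is the time delay, and $c$ is the number of classes. Each embedding vector is mapped to a dispersion pattern, and the number of possible dispersion patterns is equal to $c^m$.
3. For each of the $c^m$ potential dispersion patterns $\pi$, compute the relative frequency $p(\pi)$, that is, the number of embedding vectors assigned to $\pi$ divided by $N-(m-1)d$.
4. Compute the entropy $\mathrm{DisEn}(x, m, c, d) = -\sum_{\pi=1}^{c^m} p(\pi) \ln p(\pi)$.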
Appropriate parameter selection must be considered in the DisEn approach. The number of potential dispersion patterns ($c^m$) must be smaller than the signal length $N$ for reliable results. When the value of $c$ is not suitable, the DisEn method becomes sensitive to noise. The choice of $m$ affects the detection of dynamic changes in the signal.
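A minimal Python sketch of DisEn, following the algorithm of Rostaghi and Azami (2016), is given here for illustration: the samples are mapped to classes through the NCDF, embedding vectors are formed, the relative frequency of each dispersion pattern is counted, and the Shannon entropy of those frequencies is returned. The defaults `m=2`, `c=6`, `d=1` are assumptions, since the parameter values used in this study are not stated, and the function name is hypothetical.

```python
import math
from collections import Counter


def dispersion_entropy(x, m=2, c=6, d=1):
    """Dispersion entropy (natural log) of a 1-D sequence x.

    m: embedding dimension, c: number of classes, d: time delay.
    Defaults are illustrative assumptions, not the paper's settings.
    """
    n = len(x)
    mu = sum(x) / n
    sigma = math.sqrt(sum((v - mu) ** 2 for v in x) / n)
    if sigma == 0:
        return 0.0  # constant signal: only one pattern can occur

    # Step 1: NCDF maps each sample into (0, 1), then into classes 1..c.
    # Python's round() uses banker's rounding on exact ties, which may
    # differ from other implementations; the clamp keeps classes in range.
    y = [0.5 * (1.0 + math.erf((v - mu) / (sigma * math.sqrt(2.0)))) for v in x]
    z = [min(c, max(1, round(c * v + 0.5))) for v in y]

    # Step 2: embedding vectors of length m with delay d, used as patterns.
    n_vec = n - (m - 1) * d
    counts = Counter(tuple(z[i + k * d] for k in range(m)) for i in range(n_vec))

    # Steps 3 and 4: relative frequencies and their Shannon entropy.
    return -sum((cnt / n_vec) * math.log(cnt / n_vec) for cnt in counts.values())


# Toy periodic signal for illustration only (not ECG data).
periodic_value = dispersion_entropy([0.0, 1.0] * 50)
```

With these defaults there are 6^2 = 36 possible patterns, well below the roughly 1080 samples in a 3-second segment at 360 Hz, so the reliability condition on the signal length is met.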

Performance Evaluation
For performance evaluation, several commonly used classifiers are applied, including Support Vector Machine (SVM) and K-Nearest Neighbor (KNN) (Madan and Gupta, 2014). SVM has a clearer mathematical formulation than other classification techniques.

Support Vector Machine
Support Vector Machine (SVM) is a supervised learning method commonly used for linear and non-linear classification and regression.
SVM finds a hyperplane, that is, a function that best separates the classes by maximizing the distance between them (Venkatesan et al., 2018). As shown in Figure 2, hyperplane 1 (H1) does not separate the classes, H2 separates the classes with a small margin, and H3 separates the classes with the maximum margin. Two classes are not always perfectly separable, so SVM needs to be reformulated using the soft margin technique. If the soft margin technique is still unable to find a separating hyperplane, a kernel is needed to transform the data to a higher-dimensional space.
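The difference between the three kinds of hyperplane can be illustrated by computing the geometric margin of a candidate hyperplane on toy data. This sketch only evaluates given hyperplanes and does not train an SVM; the data, the hyperplane coefficients, and the function name are all hypothetical.

```python
import math


def geometric_margin(w, b, points, labels):
    """Smallest signed distance from any point to the hyperplane w.x + b = 0.

    A negative value means at least one point lies on the wrong side.
    """
    norm = math.sqrt(sum(wi * wi for wi in w))
    return min(
        label * (sum(wi * xi for wi, xi in zip(w, p)) + b) / norm
        for p, label in zip(points, labels)
    )


# Toy 2-D data: class -1 on the left (x = 0), class +1 on the right (x = 4).
points = [(0.0, 0.0), (0.0, 2.0), (4.0, 0.0), (4.0, 2.0)]
labels = [-1, -1, 1, 1]

h1 = geometric_margin((0.0, 1.0), -1.0, points, labels)  # line y = 1: does not separate
h2 = geometric_margin((1.0, 0.0), -0.5, points, labels)  # line x = 0.5: small margin
h3 = geometric_margin((1.0, 0.0), -2.0, points, labels)  # line x = 2: largest margin
```

An SVM solver searches for the hyperplane that maximizes this quantity, which on this toy data is the line x = 2.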

K-Nearest Neighbor
K-Nearest Neighbor (KNN) is an algorithm that is widely used in machine learning for classification. It classifies data based on similarity, or proximity, to neighboring data points (Maniyan and Shivakumar, 2018). First, the number of neighbors (K) used to determine the class is chosen. Next, the distance from the new data point to each data point in the dataset is calculated. Finally, the class of the new data point is determined from the classes of the K closest reference points. KNN is illustrated in Figure 3: with K = 3, the new data point is assigned to the red class because, in a circle containing 3 members, there are more red points than blue ones. If K = 5 is used, the new (green) data point is assigned to the blue class.
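The three steps can be sketched in Python as follows. Euclidean distance and simple majority voting are assumptions, and the toy points are hypothetical, chosen so that the predicted class changes between K = 3 and K = 5 as in the illustration.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k=3):
    """Majority vote among the k training points closest to the query."""
    # Distance from the query to every training point, nearest first.
    dists = sorted(
        (math.dist(p, query), label) for p, label in zip(train_x, train_y)
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


# Toy data: two red points close to the query, three blue points slightly
# farther away, and one distant red point.
train_x = [(1.0, 0.0), (0.0, 1.0), (1.5, 0.0), (0.0, 2.0), (0.0, -2.0), (5.0, 5.0)]
train_y = ["red", "red", "blue", "blue", "blue", "red"]
query = (0.0, 0.0)

class_k3 = knn_predict(train_x, train_y, query, k=3)
class_k5 = knn_predict(train_x, train_y, query, k=5)
```

In the classification stage, each point would be a 5-element feature vector rather than a 2-D coordinate.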

RESULTS AND DISCUSSION
In this section, the results of feature extraction and the performance evaluation of the proposed method for classifying abnormal ECG signals are discussed. The mean and standard deviation of the statistical features and DisEn for each ECG signal type are presented in Figures 4, 5, 6, 7, and 8. Figure 4 shows the mean feature for each ECG signal type; the types differ from one another, but their standard deviations overlap. The AFL signal has the highest mean value. Figure 5 presents the variance feature, which tends to take similar values across types, with a high standard deviation. The kurtosis and skewness features, presented in Figures 6 and 7, show differences between ECG types. However, like the mean feature, they have a fairly high standard deviation. Meanwhile, the DisEn feature presented in Figure 8 shows differences between types with a low standard deviation. Given these results, all features are used in the classification simulation, with no feature selection step.
The final stage evaluates the performance of the proposed method using a classifier. At this stage, several classifiers are used to test the robustness of the proposed method: SVM with kernel variations and K-NN. Five-fold cross-validation was employed to split the data into training and test sets. Cross-validation was chosen to avoid overfitting, since the dataset is relatively small and the number of signals for some ECG types is limited.
Table 1 presents the accuracy of each classifier. Quadratic SVM and K-NN give the highest accuracy, which is 75.1%. The other classification methods produce similar accuracy (>70%), except for linear SVM. These results indicate that the proposed feature extraction method is robust.
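The five-fold split can be sketched as follows. The sketch assumes a plain shuffled, unstratified split, which may differ from the tool actually used in this study; the function name and the seed are hypothetical.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_idx, test_idx) pairs; each sample is tested exactly once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of nearly equal size.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test_idx = folds[i]
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx


# 794 signals, as in the dataset described in this study.
splits = list(k_fold_indices(794, k=5))
fold_sizes = [len(test_idx) for _, test_idx in splits]
```

With 794 signals, four folds contain 159 test signals and one contains 158; the reported accuracy would then be the average over the five test folds.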
The confusion matrix for the highest-accuracy case is presented in Table 2. Table 2 shows that AFL and APB produce the most misclassifications. This is because the two ECG types have similar characteristics, and because they have less training data than the other types. A second test scenario evaluates the performance of the proposed method in classifying normal versus abnormal ECG. The abnormal ECG class consists of six types: PVC, bigeminy, AFL, AFIB, LBBB, and APB. The results for this scenario are shown in Table 3. In this simulation, the system detects normal and abnormal signals with an accuracy of 85.1% using K-NN, while SVM yields slightly lower accuracy.
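For the two-class scenario, accuracy, sensitivity, and specificity follow directly from the four cells of a binary confusion matrix, as in this sketch. Treating the abnormal class as the positive class is an assumption, and the counts are toy values that are not taken from Table 3.

```python
def binary_metrics(tp, fn, fp, tn):
    """Accuracy, sensitivity and specificity from a 2x2 confusion matrix.

    tp/fn: abnormal signals detected / missed (abnormal = positive, assumed).
    tn/fp: normal signals correctly kept / wrongly flagged.
    """
    total = tp + fn + fp + tn
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


# Toy counts for illustration only (not the paper's results).
metrics = binary_metrics(tp=36, fn=4, fp=12, tn=48)
```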
The accuracy obtained in this study is lower than that reported in previous studies (Estananto, 2018) (Wijayanto et al., 2022) (Aulia and Hadiyoso, 2021). However, this study presents a system that classifies more ECG signal types, whereas the other studies only address two- or three-class classification. One limitation of this study is the small amount of data for some ECG types, which lowers the accuracy of the system. Future studies could explore other feature extraction methods to improve classification accuracy.

CONCLUSION
This study presents a system for classifying ECG signals into seven classes. The proposed feature extraction method uses the mean, variance, kurtosis, skewness, and DisEn. The feature extraction results show that the standard deviations overlap, with relatively small differences between classes, so no specific feature is selected as the predictor. SVM and KNN are used to test the performance of the classification system, with five-fold cross-validation used to separate the training data from the test data. In the simulations performed, the highest accuracy in the seven-class classification case is 75.1%. Meanwhile, the proposed method can distinguish normal ECG from abnormal ECG with an accuracy of 85.1%. Limited data keeps the system accuracy from being optimal; AFL and APB, for example, produce the most misclassifications. Future research could explore other feature extraction methods to improve accuracy. Deep learning methods may also yield higher classification accuracy.