EEG analysis – automatic spike detection

Abstract. Electroencephalography (EEG) is one of the main tools in the diagnosis and treatment of epilepsy. However, visual inspection of EEG is very time consuming. Automatic extraction of important EEG features not only saves neurologists a lot of time, but also enables a whole new level of EEG analysis by means of data mining methods. In this work we present and analyse methods to extract two of these features: the drowsiness score and centrotemporal spikes. For spike detection, a method based on morphological filters is used. A database design is also proposed in order to allow easy EEG analysis and provide data accessibility for data mining algorithms developed in the future.


Introduction
Electroencephalography (EEG) is a widely used medical technique for monitoring the electrical activity of the brain produced by neurons. Technically, an EEG consists of multiple channels that monitor the activity of neurons in a region; each channel represents an electrode on the patient's scalp.
Epilepsy is a neurological disorder manifesting in uncontrolled seizures. These seizures lead to a discharge in the brain, generating a disturbance in the EEG. Searching the EEG for these spikes over the background activity is the main method of epilepsy diagnosis and treatment. A regular EEG can have up to 20 electrodes and last more than an hour [1]. Currently this analysis is done visually: qualified doctors inspect the whole EEG. This is a very time consuming task. To improve EEG analysis, automatic tools are required.
The most time consuming task in EEG analysis is spike detection. Many algorithms have been suggested for this purpose. The most recent overview of such algorithms is given in [2], which provides an extensive list of the most essential approaches to spike detection and to the analysis of their performance. However, the author concentrates on measuring and comparing the performance of the algorithms using a combination of statistical parameters and gives almost no mathematical formulations. Based on the performance reported in that article and on the review in [3], we decided to experiment with mimetic algorithms combined with template matching. Therefore an automatic method based on a morphological filter is presented in this work. The main purpose of this investigation is to develop and evaluate the mathematical and logical architecture of such an approach. The filter was applied to data provided by Vilnius University Children's Hospital. It was found to be susceptible to high frequency noise. To solve this problem, a finite impulse response filter was used. Two further filters based on the definition of a spike are introduced to improve the results: spike rejection based on length and on neighbourhood. The resulting filter detects all known spikes in the provided data, but it turned out to be susceptible to noise events such as eye movement.
Epileptic spikes are precipitated by sleep. Therefore an important factor in EEG analysis is a person's drowsiness. A widely used method for measuring drowsiness is the Karolinska drowsiness score (KDS) [4]. This score is usually calculated by hand, but it can easily be implemented for computer detection. This method is described in this work. Unfortunately, KDS is not always accurate.
These and many other EEG feature extraction methods can be combined to provide additional data for epilepsy diagnosis. Moreover, they can be used with data mining algorithms to extract additional information, for example a new, more accurate drowsiness score. For this reason a system to store and access EEG features is required. In this work we present a document based database design. It provides storage to which newly extracted EEG features can be added as they become available, as well as easy access to them.

Electroencephalography (EEG)
EEG is the recording of electrical activity along the scalp produced by the firing of neurons within the brain [5].
EEG was discovered by the German psychiatrist Hans Berger in 1929. Electrical activity recorded by electrodes placed on the scalp mostly reflects summation of excitatory and inhibitory postsynaptic potentials in apical dendrites of pyramidal neurons in the more superficial layers of the cortex. Quite large areas of cortex, in the order of a few square centimeters, have to be activated synchronously to generate enough potential for changes to be registered at electrodes placed on the scalp [6]. EEG plays a central role in diagnosis and management of patients with seizure disorders.
Routine EEG is used in the following clinical circumstances:
• Epilepsy (to determine epileptic activity, epileptic focus, to diagnose epileptic syndrome).
• To differentiate encephalopathy, neurodegenerative disorders, to evaluate comatose patients.
• To serve as an adjunct test of brain death.

Brain rhythms
The frequency of brain waves can differ based on the state of the person being monitored. These brain waves are categorised into 5 brain rhythms based on their frequency [5]:
• Delta rhythm. 3.5 Hz or lower. Detected during deep sleep.

Electrode location by the international 10-20 system
This system ensures that the naming of electrodes is consistent across EEG laboratories. In most clinical applications, 19 recording electrodes (plus ground and system reference) are used. The modified combined nomenclature derived from the 10-20 system should be used for electrode location [5].
Before recording, electrodes are placed on the scalp with a conductive gel or paste, usually after preparing the scalp area by light abrasion to reduce impedance. A routine EEG should (at least) include bipolar montages with longitudinal and transverse chains. These chains should be used with equal electrode distances and side-to-side symmetry to avoid the artifact of false amplitude asymmetry [7].

Epilepsy
Epilepsy is the tendency to experience repeated seizures, which stem from activity originating in the brain [8]. The seizures happen because of abnormal, excessive or synchronous neuronal activity in the brain. People may have strange sensations and emotions or behave strangely. They may have violent muscle spasms or lose consciousness. Epilepsy has many possible causes, including illness, brain injury and abnormal brain development. In many cases, the cause is unknown. There are over 40 different types of epilepsy.

Benign epilepsy of childhood with centrotemporal spikes (rolandic epilepsy)
Benign epilepsy of childhood with centrotemporal spikes is the most common focal epilepsy in childhood. This disorder is also called rolandic epilepsy.
Loiseau and Duche provided five criteria for the diagnosis of benign childhood epilepsy with centrotemporal spikes: 1) onset between the ages of 2 and 13; 2) absence of neurologic or intellectual deficit before the onset; 3) partial seizures with motor signs, frequently associated with somatosensory symptoms or precipitated by sleep; 4) a spike focus located in the centrotemporal (rolandic) area with normal background activity on the interictal EEG; and 5) spontaneous remission during adolescence [7].
Rolandic epilepsy can start as early as 1 year of age or as late as 15 years of age, but seizure onset is mostly between 7 and 10 years. Boys are more often affected, with a ratio of 3:2.

Centrotemporal spikes
The cornerstone of the diagnosis of benign childhood epilepsy with centrotemporal spikes lies in the characteristic interictal EEG pattern: centrotemporal spikes on normal background activity. The centrotemporal spikes are typically seen independently on both sides of the head. Despite their name, these are usually high amplitude sharp and slow wave complexes localized to the central (C3/C4) electrodes or midway between the central and temporal electrodes (C5/C6). They are broad, diphasic, high-voltage (100 to 300 microvolts) spikes, with a transverse dipole, and they are often followed by a slow wave. The spikes may occur isolated or in clusters, with a rhythm of about 1.5 Hz to 3 Hz [7]. Sharp and slow wave complexes in areas outside the centrotemporal regions, such as occipital, parietal, frontal, and midline regions, may occur concurrently with centrotemporal spikes. They are of similar morphology to centrotemporal spikes. Normal sleep architecture is preserved. Generalized spikes are sometimes found as well [9].
The spikes retain the same morphology regardless of changes in the state of vigilance, but their number increases markedly during sleep. This makes them a good model for automatic spike detection.
The main characteristics of centrotemporal spikes in rolandic epilepsy are:
• Length of 40-200 ms.
• Amplitude two times higher than that of the base line.
• Must be detected in at least two neighbouring electrodes.

Database design
To reduce the time needed to develop EEG analysis algorithms and to allow large scale data processing, an efficient database was needed. The main requirements for this database are:
• Store EEG signals and allow fast access to them.
• Allow algorithms to store data related to an EEG signal (KDS, spikes, etc.), which might later be used in advanced analysis such as data mining algorithms.
• Should be very scalable, since the number of EEG signals can be quite large.

Database management system
Traditional relational database management systems (RDBMS) were found to be slow and not flexible enough for the given task. The analysis of EEG requires the ability to add new data types as they become available (such as KDS, spike locations, etc.). An RDBMS requires a schema to be defined, and altering it is very costly. Therefore a schema-less document database was used. Schema-less databases allow more flexible ways to define data types and require almost no setup time. MongoDB was found to be the best fit for the given task. MongoDB's MapReduce framework also provides an efficient way to distribute data processing over a large number of computer nodes, which shortens the time needed to process large amounts of EEG data.

Data
Our datasets are EEG records, approximately 1 hour long each. Each record contains data recorded over 20 or more EEG and non-EEG channels, such as EKG. The EEG records were subdivided into 30 s intervals. This length was chosen because it is the time interval usually analyzed by doctors; it also provides enough data for simple analysis (KDS, spike detection, etc.) while still being small enough to be processed efficiently.
Each interval is stored in a document. Along with the signal itself, meta-data such as sampling rate, length and patient information is stored. A tag property is used to identify which algorithms have been applied to this EEG.
Each database element is a document, therefore it is very easy to add additional data such as spike information or KDS. The analysis algorithm only needs to add a new property representing the data to the document and add its name to the tag property. The tag property makes it easy to distinguish which documents have additional data calculated by analysis algorithms. During the analysis of the KDS and spike detection algorithms this database design proved itself to be very efficient and flexible to work with.
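The document layout described above can be sketched with plain Python dictionaries. This is only a minimal illustration and not the schema actually used: the field names (`signal`, `sampling_rate`, `length_s`, `patient`, `tags`) and the helper functions are assumptions made for the example.

```python
def make_interval_document(signal, sampling_rate, patient_id):
    """Build one document for a single EEG interval (hypothetical field names)."""
    return {
        "signal": list(signal),
        "sampling_rate": sampling_rate,            # samples per second
        "length_s": len(signal) / sampling_rate,   # interval length in seconds
        "patient": {"id": patient_id},
        "tags": [],                                # names of algorithms already applied
    }


def attach_feature(document, name, value):
    """Store an extracted feature as a new property and record its name in the tags."""
    document[name] = value
    if name not in document["tags"]:
        document["tags"].append(name)
    return document


def with_feature(documents, name):
    """Select the documents for which the named feature has been calculated."""
    return [doc for doc in documents if name in doc["tags"]]


# Usage: two 30 s intervals sampled at 256 Hz, one of them scored for drowsiness.
first = make_interval_document([0.0] * (30 * 256), 256, "patient-001")
second = make_interval_document([0.0] * (30 * 256), 256, "patient-001")
attach_feature(first, "kds", 4)
scored = with_feature([first, second], "kds")
```

Because the tags are kept in a list inside each document, selecting the documents that already carry a given feature needs no change to any schema.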

EEG data analysis
In this section a method to calculate the Karolinska drowsiness scale is discussed.
To extract the required brain rhythm, the DFT is applied to the signal. All the frequencies not corresponding to the required rhythm are set to zero. The IDFT is then applied to the resulting data.
After filtering out all the waves except the required rhythm we can calculate its power. To do this we use a statistical value, the root mean square, which is defined as:
$$x_{rms} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} x_i^2}$$
Here x_i is the value of the signal at discrete time i and n is the number of samples. For every interval of EEG all of the brain rhythms and their powers are calculated. A rhythm is said to be dominant in an interval if its power is the highest.
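The filtering and power calculation above can be sketched as follows. This is a minimal illustration, not the implementation used in this work: the band edges in `BANDS` are illustrative assumptions, and a naive O(n²) DFT is written out only to keep the example self-contained (an FFT library would normally be used).

```python
import cmath
import math

# Illustrative band edges in Hz (assumed for this sketch, half-open intervals).
BANDS = {"delta": (0.5, 4.0), "theta": (4.0, 8.0),
         "alpha": (8.0, 13.0), "beta": (13.0, 30.0)}


def band_filter(signal, sample_rate, low_hz, high_hz):
    """Keep only the DFT bins inside [low_hz, high_hz) and transform back."""
    n = len(signal)
    spectrum = [sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n)) for k in range(n)]
    for k in range(n):
        # Bins k and n - k hold the same physical frequency (positive/negative).
        freq = min(k, n - k) * sample_rate / n
        if not (low_hz <= freq < high_hz):
            spectrum[k] = 0
    return [(sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
                 for k in range(n)) / n).real for t in range(n)]


def rms(values):
    """Root mean square of a sequence of samples."""
    return math.sqrt(sum(v * v for v in values) / len(values))


def dominant_rhythm(signal, sample_rate, bands=BANDS):
    """Return the name of the band whose filtered signal has the highest RMS."""
    powers = {name: rms(band_filter(signal, sample_rate, low, high))
              for name, (low, high) in bands.items()}
    return max(powers, key=powers.get)


# Usage: one second of a pure 10 Hz sine sampled at 64 Hz.
sine_10hz = [math.sin(2 * math.pi * 10 * t / 64) for t in range(64)]
label = dominant_rhythm(sine_10hz, 64)
```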

Karolinska drowsiness scale
Karolinska drowsiness scale (KDS) [4] is an objective method to calculate the drowsiness of a person based on a 20-30 s EEG interval. The scale assigns a score from 1 (completely awake) to 9 (very drowsy) to an EEG interval according to the person's drowsiness. The score is calculated as follows:
Input: an EEG interval I
Output: KDS score
BEGIN
Split I into 10 subintervals i_n.
Calculate the dominant brain rhythm for every i_n.
If 3 or more i_n have theta as the dominant rhythm, the person is asleep: return 10.
Else return the number of i_n in which alpha is the dominant rhythm.
END
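The scoring rule above translates almost directly into code. The sketch below assumes that the dominant rhythm of each of the 10 subintervals has already been determined and is passed in as a list of labels; the function name and the label strings are illustrative.

```python
def kds_score(dominant_rhythms):
    """Score one EEG interval from the dominant rhythm of its 10 subintervals."""
    if len(dominant_rhythms) != 10:
        raise ValueError("expected the dominant rhythm of exactly 10 subintervals")
    # Three or more theta-dominated subintervals: the person is considered asleep.
    if dominant_rhythms.count("theta") >= 3:
        return 10
    # Otherwise the score is the number of alpha-dominated subintervals.
    return dominant_rhythms.count("alpha")


# Usage: four alpha-dominated subintervals, the rest beta.
score = kds_score(["alpha"] * 4 + ["beta"] * 6)
```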

Automatic spike detection
Because an automatic spike detection method for electroencephalograms is needed, several algorithms have been proposed [3]. One of the most reliable of these is an algorithm based on mathematical morphology. This method was shown to be accurate on 91.62% of centrotemporal spikes [10]. Due to this high reliability, morphological filtering was chosen for automatic spike detection.

Morphological operations
In this section we give a short explanation of mathematical morphology used in the filter.
The filter uses two main operations: erosion and dilation. These two operations are also called Minkowski subtraction and addition respectively [10]. Let f(t) be a time series over time t (i.e. a single EEG channel) with domain D, and let g(t) be a function defining a structural element. The reflection of g(t) is defined as g^s(t) = g(−t). Then we can define:
Erosion:
$$(f \ominus g^s)(t) = \min_{\tau \in D} \{ f(\tau) - g(-(t - \tau)) \}$$
Dilation:
$$(f \oplus g)(t) = \max_{\tau \in D} \{ f(\tau) + g(t - \tau) \}$$
Using these operations we can define two new operators [10].
Opening operation:
$$(f \circ g)(t) = [(f \ominus g^s) \oplus g](t)$$
Closing operation:
$$(f \bullet g)(t) = [(f \oplus g) \ominus g^s](t)$$
The opening operation smooths f(t) from below by cutting off its spikes, while closing smooths the function from above by filling up the valleys between spikes. Therefore opening and closing can be used to detect spikes and valleys in the function f(t).
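For a sampled signal, the four operations can be sketched in a few lines. This is a minimal illustration under stated assumptions: the structural element is passed as a list of odd length centred on t = 0, and windows are simply truncated at the signal boundaries (the boundary handling is a choice made for this example).

```python
def erode(f, g):
    """Erosion: out[t] = min over u of f[t + u] - g[u], for u in [-r, r]."""
    r, n = len(g) // 2, len(f)
    return [min(f[t + u] - g[u + r] for u in range(-r, r + 1) if 0 <= t + u < n)
            for t in range(n)]


def dilate(f, g):
    """Dilation: out[t] = max over u of f[t - u] + g[u], for u in [-r, r]."""
    r, n = len(g) // 2, len(f)
    return [max(f[t - u] + g[u + r] for u in range(-r, r + 1) if 0 <= t - u < n)
            for t in range(n)]


def opening(f, g):
    """Erosion followed by dilation: removes peaks narrower than g."""
    return dilate(erode(f, g), g)


def closing(f, g):
    """Dilation followed by erosion: fills valleys narrower than g."""
    return erode(dilate(f, g), g)


# Usage with a flat three-sample structural element.
flat = [0, 0, 0]
peak = [0, 0, 5, 0, 0]
valley = [0, 0, -5, 0, 0]
opened = opening(peak, flat)    # the narrow peak is cut off
closed = closing(valley, flat)  # the narrow valley is filled
```

With the flat element, opening leaves the valley untouched and closing leaves the peak untouched, which is why each operation on its own exposes deflections of only one polarity.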

Morphological operation combination
Centrotemporal spikes can have both positive and negative amplitudes. Since the closing and opening operations can each detect spikes of only one polarity, two additional operators need to be defined:
Open-closing [10]:
$$OC(f(t)) = (f \circ g_1) \bullet g_2 \, (t)$$
Close-opening [10]:
$$CO(f(t)) = (f \bullet g_1) \circ g_2 \, (t)$$
Here g_1(t) and g_2(t) are two distinct structural elements. Both of these operations distort the amplitude of the function. Open-closing has a lower amplitude while close-opening has a higher amplitude. These distortions can cause false positive identifications or hide centrotemporal spikes. The effects of the two operations are opposite to each other. Therefore their average can be used instead [10]:
$$OCCO(f(t)) = \frac{1}{2} \left[ OC(f(t)) + CO(f(t)) \right]$$
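The combination of the two operators and their average can be sketched as below. The helper `_morph` is a compact, hypothetical stand-in for erosion and dilation with windows truncated at the signal boundaries; structural elements are assumed to be lists of odd length centred on t = 0.

```python
def _morph(f, g, s):
    """Dilation for s = +1 (max of f[t-u] + g[u]), erosion for s = -1 (min of f[t+u] - g[u])."""
    pick = max if s > 0 else min
    r, n = len(g) // 2, len(f)
    return [pick(f[t - s * u] + s * g[u + r]
                 for u in range(-r, r + 1) if 0 <= t - s * u < n)
            for t in range(n)]


def opening(f, g):
    return _morph(_morph(f, g, -1), g, +1)


def closing(f, g):
    return _morph(_morph(f, g, +1), g, -1)


def occo(f, g1, g2):
    """Average of open-closing and close-opening of f."""
    open_closing = closing(opening(f, g1), g2)
    close_opening = opening(closing(f, g1), g2)
    return [(a + b) / 2 for a, b in zip(open_closing, close_opening)]


# Usage: a flat background with one positive and one negative deflection.
signal = [0, 0, 5, 0, 0, -5, 0, 0]
background = occo(signal, [0, 0, 0], [0, 0, 0, 0, 0])
residual = [a - b for a, b in zip(signal, background)]
```

In this toy case both deflections are removed from the estimated background, so subtracting it from the signal leaves them exposed regardless of their polarity.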

Structural element
In order to distinguish between background activity and centrotemporal spikes, a structuring element is needed. This element needs to fit into regular EEG waves but not into spikes. A shape that matches the morphology of a single EEG wave is a parabola. Therefore two parabola shaped structuring elements are defined [11]:
$$g_i(t) = a_i t^2 + b_i, \quad i = 1, 2$$
Parameters a_i and b_i control the width and amplitude of the parabola. Since the width and amplitude of an EEG signal vary depending on a variety of factors, it is impossible to select a set of values that match every interval of a patient's EEG. Therefore each structuring element must be fitted to a small time frame of a single EEG channel. Another reason for this approach is that doctors detect centrotemporal spikes by their neighbouring data.
Let widths be an array holding the widths of the arcs of f(t), where the width of an arc is the distance between two neighbouring extrema (minimum or maximum) of f(t), and let |f| be an array of the amplitudes of the signal f(t). Then the width w_i and height h_i of the structural elements g_i can be calculated as: The structuring element g_1(t) is applied to the original signal and needs to fit the waves of the original. The second structuring element g_2(t) is applied to a signal already modified by a closing or opening operation, therefore its values need to be increased. When the width and height of the parabolas are known, parameters a_i and b_i can be calculated as: 5 * median(widths).
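The widths array described above can be computed from the positions of the local extrema. The sketch below is illustrative only: it measures widths in samples, and taking the median of the absolute signal values as the typical amplitude is an assumption of this example rather than a formula from the text.

```python
import statistics


def arc_widths(f):
    """Distances (in samples) between neighbouring local extrema of f."""
    extrema = [i for i in range(1, len(f) - 1)
               if (f[i] - f[i - 1]) * (f[i + 1] - f[i]) < 0]
    return [b - a for a, b in zip(extrema, extrema[1:])]


def typical_arc(f):
    """Median arc width and median absolute amplitude of a short time frame."""
    return statistics.median(arc_widths(f)), statistics.median(abs(v) for v in f)


# Usage: a regular triangular wave with extrema every two samples.
wave = [0, 1, 0, -1, 0, 1, 0, -1, 0]
widths = arc_widths(wave)
median_width, median_amplitude = typical_arc(wave)
```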

Application
After fitting the structural elements, the morphological filter can be applied:
$$x(t) = f(t) - OCCO(f(t))$$
where OCCO(f(t)) denotes the average of the open-closing and close-opening operations applied to f(t). The resulting signal x(t) will have its background activity diminished, while the spikes will be exposed. To identify the centrotemporal spikes, a threshold needs to be defined. This threshold is the limit which needs to be exceeded in order for a spike to be considered a sign of epilepsy [11]: Here the function extrema(x(t)) is defined as the amplitudes of the extrema of x(t).
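The thresholding step can be sketched as follows, assuming the filtered signal x(t) is already available as a list of samples. Both the form of the threshold (a constant multiple of the median extremum amplitude) and the constant `k` are assumptions made for this illustration.

```python
import statistics


def extrema_amplitudes(x):
    """Return (index, |x[index]|) for every local minimum or maximum of x."""
    return [(i, abs(x[i])) for i in range(1, len(x) - 1)
            if (x[i] - x[i - 1]) * (x[i + 1] - x[i]) < 0]


def detect_spikes(x, k=3.0):
    """Indices of extrema whose amplitude exceeds k times the median extremum amplitude."""
    extrema = extrema_amplitudes(x)
    if not extrema:
        return []
    threshold = k * statistics.median(amplitude for _, amplitude in extrema)
    return [i for i, amplitude in extrema if amplitude > threshold]


# Usage: small background oscillation with one large negative deflection.
filtered = [0, 1, 0, -1, 0, 1, 0, -10, 0, 1, 0]
spike_positions = detect_spikes(filtered)
```

Working on absolute amplitudes lets the same threshold catch deflections of either polarity.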

Finite impulse response filter
After applying the morphological filter to real life data it can be seen that it is very susceptible to high frequency noise. To remedy this problem a finite impulse response (FIR) filter can be applied to the EEG signal. An FIR filter is a discrete time filter whose impulse response is finite: it settles to zero in a finite number of steps. Here it is used as a low-pass filter, which means that it attenuates only the components of the signal above the cut-off frequency. An FIR filter is defined by a difference equation [12]:
$$y[n] = \sum_{i=0}^{N} b_i \, x[n - i]$$
where x[n] is the input signal, y[n] is the output signal, b_i are the filter coefficients and N is the order of the filter.
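The difference equation can be evaluated directly, as in the sketch below. The moving-average coefficients in the usage example are only a simple stand-in for a low-pass design and are not the coefficients used in this work; samples before the start of the signal are treated as zero.

```python
def fir_filter(x, b):
    """Apply y[n] = sum_i b[i] * x[n - i], treating x[n] as 0 for n < 0."""
    return [sum(b[i] * x[n - i] for i in range(len(b)) if n - i >= 0)
            for n in range(len(x))]


# Usage: a two-tap moving average (order N = 1) as a crude low-pass filter.
coefficients = [0.5, 0.5]
slow = fir_filter([1, 1, 1, 1], coefficients)     # constant input passes through
fast = fir_filter([1, -1, 1, -1], coefficients)   # fastest oscillation is cancelled
```

After the first sample, the constant input is reproduced unchanged while the alternating input is suppressed, which is the low-pass behaviour the text relies on.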

Filter tests on real life data
The morphological filter was applied to data provided by Vilnius University Children's Hospital. The data included 15 EEGs of children with rolandic epilepsy. Each EEG was around an hour long and included signals gathered by electrodes placed according to the 10-20 system.
For the tests the previously described threshold was used. While the filter worked on some of the data, on the rest it reported a spike at every extremum. This was due to 35 Hz noise distorting the original signal. The 35 Hz noise is of neurophysiological origin. Because of the noise the threshold would be set very close to 0. This problem was solved by applying an FIR filter with a cutoff frequency of 35 Hz. Since the length of a centrotemporal spike is 40-200 ms, the filter did not affect the detection of spikes in any other way. Test results showed that a 64th-order FIR filter was enough to process the signal without incurring a large performance penalty. Unfortunately the FIR filter also caused the morphological filter to identify false positive spikes in noisy regions, especially at electrodes F7 and F8 during eye movement. The morphological filter cannot handle noisy signals, therefore a different algorithm should be used to identify or correct these intervals.
Another approach to filtering out the 35 Hz noise is to change the threshold to: While this does solve the problem, it also causes many false negatives. After testing the threshold with the average and the median, as well as adjusting the multiplication constant, it was observed that a combination of the FIR filter and a threshold based on the median produced the best results. However, some identified spikes were much shorter than the definition of a centrotemporal spike allows. By definition the spike should be 40-200 ms long. Therefore another filter was applied which rejected spikes based on their length. This filter decreased the number of identified spikes by 12.53%, all of them false positives.
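Rejection by length can be sketched as below. The representation of a detected spike as a (start, end) pair of sample indices is an assumption of this example; the 40-200 ms limits come from the definition of a centrotemporal spike.

```python
def reject_by_length(spikes, sample_rate, min_ms=40.0, max_ms=200.0):
    """Keep only spikes whose duration lies within [min_ms, max_ms].

    spikes: list of (start_index, end_index) pairs in samples.
    sample_rate: sampling rate in Hz.
    """
    kept = []
    for start, end in spikes:
        duration_ms = (end - start) * 1000.0 / sample_rate
        if min_ms <= duration_ms <= max_ms:
            kept.append((start, end))
    return kept


# Usage at 1000 Hz, where one sample corresponds to one millisecond.
candidates = [(0, 10), (100, 150), (300, 600), (700, 740)]
accepted = reject_by_length(candidates, 1000)
```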
A centrotemporal spike is defined as occurring in at least two neighbouring electrodes. After analysing the data processed by the morphological filter it was noticed that many of the spikes occurred in only one channel. Therefore a method to reject single spikes needed to be defined. This algorithm uses a graph of electrode positioning according to the 10-20 system.
Input: A list of detected spikes L
A graph of electrode placement G
Output: Clusters of neighbouring spikes
BEGIN
Split L into sublists l_n, where each spike in l_n is within a 20 ms window of another spike in l_n.
Split each l_n into subgroups l_nn, where each spike in l_nn is either a neighbour or a transitive neighbour of all other spikes in l_nn according to G.
Remove l_nn elements with only one spike.
Present the remaining list of l_nn.
END
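The neighbourhood rejection can be sketched as a two-stage grouping: spikes are first chained into groups in which each spike lies within 20 ms of another, and each group is then split into connected components of the electrode graph. The spike representation (time in ms, electrode name), the adjacency dictionary and the small electrode graph in the usage example are illustrative assumptions, not the graph used in this work.

```python
def cluster_spikes(spikes, neighbours, window_ms=20.0):
    """Group spikes that are close in time and lie on (transitively) adjacent electrodes.

    spikes: list of (time_ms, electrode) tuples.
    neighbours: dict mapping an electrode to the set of its adjacent electrodes
                (assumed symmetric).
    """
    # Stage 1: chain spikes into time groups; a gap above window_ms starts a new group.
    groups = []
    for spike in sorted(spikes):
        if groups and spike[0] - groups[-1][-1][0] <= window_ms:
            groups[-1].append(spike)
        else:
            groups.append([spike])

    # Stage 2: split each time group into connected components of the electrode graph.
    clusters = []
    for group in groups:
        unvisited = set(group)
        while unvisited:
            stack = [unvisited.pop()]
            component = []
            while stack:
                current = stack.pop()
                component.append(current)
                linked = {s for s in unvisited
                          if s[1] in neighbours.get(current[1], ())}
                unvisited -= linked
                stack.extend(linked)
            # Components with a single spike are rejected.
            if len(component) > 1:
                clusters.append(sorted(component))
    return clusters


# Usage with a small, hypothetical fragment of an electrode graph.
graph = {"C3": {"T3", "F3"}, "T3": {"C3"}, "F3": {"C3"}, "O2": set()}
detected = [(100.0, "C3"), (105.0, "T3"), (110.0, "O2"), (500.0, "F3")]
result = cluster_spikes(detected, graph)
```

Here the spike on O2 falls in the same time group but has no adjacent electrode with a spike, and the spike on F3 is isolated in time, so only the C3/T3 pair survives.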

Results
The resulting filter recognised all of the known epileptic spikes in clear signals. Unfortunately it also recognised spikes not related to epilepsy and detected a lot of false positives in noisy areas. Detecting epileptic spikes in noisy signals is extremely hard or impossible even for trained doctors. Therefore a method for detecting noisy areas and rejecting them is required. Such a method is not discussed in this work.

Conclusion
In this work we analyzed methods for EEG analysis. Methods for spike detection and KDS calculation were analyzed. A database design to store EEG data was proposed.
The centrotemporal spike detection filter detected all known spikes in EEG signals. Unfortunately the filter does not function in noisy areas. To solve this problem an algorithm should be developed to recognise and exclude noisy areas. The current filter also only recognises spikes and does not classify them according to other features of morphology, such as a spike-slow wave complex. In the future a methodology for analysing spike shapes should be added. This analysis could also provide the ability to separate spikes caused by epilepsy from normally occurring spikes.
Quality parameters of spike detection such as sensitivity, specificity and selectivity will be evaluated in future research, when the algorithm is incorporated into a framework of other methods essential for better diagnosis support and finer selection of drugs. The algorithms in this article were not tested on any data other than the preselected records.
The KDS algorithm was integrated with the database, where it is used to analyse spikes according to drowsiness levels.
The suggested database was filled with data provided by Vilnius University Children's Hospital and used together with the previously described algorithms. In the future it will be used to store information extracted from EEG signals by new algorithms as well as to provide accessibility for calculating derivative data.
To further advance automatic EEG analysis these work items are planned for the future:
• Develop methods for noise detection (e.g. rapid eye movements).
• Develop methods for spike classification according to their shape.
• Separate spikes caused by epilepsy and naturally occurring spikes.
• High level analysis methods (e.g. epilepsy classification, a new drowsiness scale, etc.)
Input: A list of detected spikes L A graph of electrode placement G Output: Clusters of neighbouring spikes BEGIN Split L into sublists ln, where ln contains spikes where each spike is in a 20 ms window to another spikein ln.