Finished segmentation lit review section

Sam Perry
2017-08-05 14:19:09 +01:00
parent 5cf125210f
commit af522867ae
+118 -36
@@ -127,11 +127,15 @@ I'd like to thanks anyone and everyone...
\tableofcontents
\newpage
\section{Introduction}
% TODO: Write brief overview of history of PCG signal analysis
% TODO: Explain fundamental heart sounds
\section{Related Work}
There are currently a wide variety of methods employed for the analysis and
classification of PCG signals. Current research can be divided into 3 areas,
classification of PCG signals. Current research can be divided into 4 areas,
which are combined to create a full classification system. These areas
are: signal preprocessing, signal segmentation and feature extraction methods,
are: signal preprocessing, signal segmentation, feature extraction methods,
and classification methods.
The performance and evaluation of complete systems are also discussed in
section~\ref{performance}
@@ -139,61 +143,133 @@ section~\ref{performance}
\subsection{Signal Preprocessing}
There are a large number of factors that lead to variation in quality of PCG
recordings: stethescope type, make and model, its microphone/sensors used for
recordings: stethoscope type, make and model, its microphone/sensors used for
recording of the data, the position used to record (i.e.\ lower left sternal
border, apex, pulmonic area, aortic area), built in filters/signal processing
used by the stethescope (i.e.\ noise filters, anti-tremor filters), medication that
a pacient may be taking, as well as many other factors that may influence the
used by the stethoscope (i.e.\ noise filters, anti-tremor filters), medication that
a patient may be taking, as well as many other factors that may influence the
recorded signal~\parencite[p.4]{Pavlopoulos2004}. This presents a significant
issue when attempting to analyse and compare a dataset of signals, as
variations in recordings and artefacts caused by factors other than heart
sounds will most likely interfere with analysis and comparison methods. To
account for this, pre-processing methods are widely used aiming to standardize
account for this, pre-processing methods are widely used, aiming to standardize
a dataset. This is also used as a way to accentuate features of the data that
are expected to be relevant during classification.\\
A common method employed is the use of decimation and a static filter to remove
unwanted spectral content that is most likely noise~\parencite{Liang1997a,
Homsi2016, Springer2016, Gupta2007}. This helps reduce higher frequency noise
such as speech, microphone movement and other interference caused externally.
Decimation tends to downsample to around 1--4KHz, with anti-aliasing filter
specifications varying across the literature. Generally, highpass chebychev or
butterworth filters are favoured with cutoff frequencies ranging from
400--750Hz.\\
such as speech, microphone movement, breathing and other interference caused
externally. Decimation tends to downsample to around 1--4\,kHz, with
anti-aliasing filter specifications varying across the literature. Generally,
highpass Chebyshev or Butterworth filters are favoured with cutoff frequencies
ranging from 400--750\,Hz.\\
In addition, many methods decompose the filtered signal using wavelet based
methods such as the discrete wavelet transform
(DWT)~\parencite{Liang1997a, Pavlopoulos2004}, continuous
wavelet transform (CWT)~\parencite{Langley2016} or wavelet
packet decomposition (WPD)~\parencite{Liang1998}.
Wavelet transforms are popular as unlike Fourier transforms, they are well
Wavelet transforms are popular as, unlike Fourier transforms, they are well
localized in both the time and frequency domain. This allows for the analysis
of PCG signals across multiple frequency bands whilst maintaining transient
temporal events in the resulting decomposition~\parencite[p.93]{Ari2008}.
This may be used for analysis of transient events such as murmurs, that may
consist of higher frequency components than normal heart sounds.
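The multi-band decomposition described above can be illustrated with a minimal sketch. It uses the Haar wavelet purely because it is the simplest to write out by hand; the cited papers do not necessarily use it, and the function names \texttt{haar\_dwt} and \texttt{haar\_decompose} are hypothetical. The point is only that a short transient remains localised in time within each band.

```python
import math


def haar_dwt(signal):
    """One level of the Haar DWT: returns (approximation, detail) coefficients."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    scale = 1.0 / math.sqrt(2.0)
    # Pairwise sums give the low-frequency band, pairwise differences the high one.
    approx = [(signal[i] + signal[i + 1]) * scale for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * scale for i in range(0, len(signal), 2)]
    return approx, detail


def haar_decompose(signal, levels):
    """Repeatedly split the approximation band.

    Returns [detail_1, ..., detail_L, approx_L], finest band first.
    """
    bands = []
    current = list(signal)
    for _ in range(levels):
        current, detail = haar_dwt(current)
        bands.append(detail)
    bands.append(current)
    return bands


if __name__ == "__main__":
    # Illustrative input: silence with a single one-sample transient.
    x = [0.0] * 16
    x[5] = 1.0
    for level, band in enumerate(haar_decompose(x, 3), start=1):
        print(level, [round(c, 3) for c in band])
```

Each band is half the length of the previous one, so a coefficient index maps back to a time interval in the original signal; this is the time localisation that a plain Fourier transform does not offer.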
% TODO: Add reference to table of methods
\subsection{Signal Segmentation}
Algorithms for the segmentation of PCG data aim to extract the structure of
the signal over time. This is a key stage in the analysis of PCG signals as the
structure and relationships between the fundamental heart sounds (FHSs) form
the basis for much of the further analysis performed on PCG data. A number of
methods exist for the extraction of FHSs. Tradiational methods rely on direct
extraction of peaks in the time domain to determine the structure of a signal.
These methods perform various transformation in order to accentuate the
transient events with the intention of isolating them~\parencite{Liang1997b}.
However, these methods tend to suffer significantly from background noise and
so perform poorly in sub-optimal conditions.\\ More recent methods use spectral
representations to assist in the splitting of the FHSs, in particular using
wavelet decomposition~\parencite{Liang1997a, Vepa2008}. These methods tend to
perform more robustly on signals of varying conditions\\ In addition, Machine
learning algorithms have been employed, such as $k$-Nearest
Neighbour classifiers~\parencite{Gupta2007}, Neural
Networks~\parencite{Sepehri2010}, and Hidden Markov
Models (HMMs)~\parencite{Ricke2005} to improve segment classification. Particular
success has been observed in Springer's use of logistic regression and Hidden
semi-Markov models (HSMM)~\citeyearpar{Springer2016}.
the basis for much of the further analysis performed on PCG data.\\
% TODO: insert segmented graph of PCG cycle
A number of methods exist for the extraction of FHSs. Traditional methods rely
on direct extraction of peaks from envelopes in the time domain to determine
the structure of a signal. These methods perform various transformations in
order to accentuate the transient events with the intention of isolating
them.\\
Liang et~al.\ propose a method using the popular Shannon energy
envelope, achieving good accuracy across 37 recordings of
children~\citeyearpar{Liang1997b}. The algorithm aims to segment the data by
first extracting the envelope, then applying adaptive rule-based thresholds to
determine peaks corresponding to segmentation points. When comparing results to
hand-annotated ground truth, the system achieves a reported accuracy score of
84\%. However, due to the small sample size, and potential lack of noise in the
dataset used, this may not translate to a larger dataset recorded in
sub-optimal conditions.\\
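The envelope-then-threshold idea can be sketched as follows. This is a simplified illustration rather than the published algorithm: the adaptive rule-based thresholds are replaced by a single fixed threshold, and the frame length, hop size, threshold value and function names are all assumptions chosen for the example.

```python
import math


def shannon_energy_envelope(signal, frame_len, hop):
    """Normalised average Shannon energy, one value per frame."""
    peak = max(abs(v) for v in signal)
    if peak == 0:
        raise ValueError("signal is silent")
    energies = []
    for v in signal:
        p = (v / peak) ** 2
        # Shannon energy -x^2 log(x^2); defined as 0 where x == 0.
        energies.append(-p * math.log(p) if p > 0 else 0.0)
    frames = [
        sum(energies[start:start + frame_len]) / frame_len
        for start in range(0, len(energies) - frame_len + 1, hop)
    ]
    mean = sum(frames) / len(frames)
    std = math.sqrt(sum((f - mean) ** 2 for f in frames) / len(frames))
    if std == 0:
        return [0.0] * len(frames)
    # Zero mean, unit variance, so one threshold is comparable across recordings.
    return [(f - mean) / std for f in frames]


def pick_peaks(envelope, threshold):
    """Indices of local maxima above a fixed threshold (assumed, not adaptive)."""
    return [
        i
        for i in range(1, len(envelope) - 1)
        if envelope[i] > threshold
        and envelope[i] >= envelope[i - 1]
        and envelope[i] > envelope[i + 1]
    ]


if __name__ == "__main__":
    # Synthetic input: two short oscillating bursts in otherwise silent data.
    burst = [0.25, -0.5, 0.75, -1.0, 0.75, -0.5, 0.25]
    x = [0.0] * 64
    x[8:15] = burst
    x[40:47] = burst
    env = shannon_energy_envelope(x, frame_len=8, hop=4)
    print(pick_peaks(env, threshold=1.0))
```

Note that Shannon energy weights medium-intensity samples more heavily than the very loudest ones, so a full-scale sample contributes nothing to the envelope by itself.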
More recent methods use spectral representations to assist in the splitting of
the FHSs, in particular using wavelet decomposition. These methods tend to
perform more robustly on signals of varying conditions.\\
Building on previous work, Liang et~al.\ present an improved method, using the
discrete wavelet transform to decompose and reconstruct the signal into 7
distinct frequency bands~\citeyearpar{Liang1997a}. A similar method of envelope
extraction and peak picking is applied to each frequency band, and the best
estimate across all frequency bands is then chosen as the final result. The
criterion for this choice is based on the number of S1s and S2s detected, and
the number of artefacts discarded for each frequency band. This method achieved
an improved accuracy of 93\% across a larger dataset of 77 recordings. This
suggests that the algorithm is at least as robust as the previous work by
Liang et~al.\\
Vepa et~al.\ proposed a wavelet decomposition based method that uses a
combination of simplicity and envelope features~\citeyearpar{Vepa2008}. This
approach attempts to improve robustness when analysing signals of varying
quality by using multiple complementary features, allowing the method to base
decisions on a variety of statistical properties. Evaluating the algorithm on a
collection of 160 heart cycles from a variety of sources, a reported accuracy
of 84\% was achieved.\\
A variety of machine learning methods have been implemented with reasonable
success. Gupta et~al.\ present a method that applies $k$-means clustering to
replace standard threshold-based methods for determining peak classification in
a standard envelope-based segmentation algorithm~\citeyearpar{Gupta2007}. This
achieved a reported accuracy of 90.29\%. Due to the standard envelope-based
method for feature extraction, this method is still susceptible to noise and
artefacts that occur
within the frequency bands of the heart sounds.\\
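As a rough illustration of clustering in place of a hand-set threshold, the sketch below runs a two-cluster $k$-means on the intervals between detected peaks and labels a peak as S1 when the interval that follows it falls in the shorter (systolic) cluster. The choice of inter-peak interval as the only feature is an assumption made for this example and is not taken from the cited paper; the function names are hypothetical.

```python
def kmeans_1d(values, k=2, iterations=20):
    """Plain k-means on scalars with a deterministic initialisation (k >= 2)."""
    ordered = sorted(values)
    # Spread the initial centroids evenly across the sorted values.
    centroids = [ordered[(len(ordered) - 1) * j // (k - 1)] for j in range(k)]

    def nearest(v):
        return min(range(k), key=lambda j: abs(v - centroids[j]))

    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for v in values:
            clusters[nearest(v)].append(v)
        updated = [
            sum(c) / len(c) if c else centroids[j] for j, c in enumerate(clusters)
        ]
        if updated == centroids:
            break
        centroids = updated
    return centroids, [nearest(v) for v in values]


def label_peaks(peak_times):
    """Label every peak except the last as S1 or S2 from the interval after it."""
    intervals = [b - a for a, b in zip(peak_times, peak_times[1:])]
    centroids, labels = kmeans_1d(intervals, k=2)
    short = 0 if centroids[0] <= centroids[1] else 1
    # Assumption: systole (S1 to S2) is shorter than diastole (S2 to S1).
    return ["S1" if lab == short else "S2" for lab in labels]


if __name__ == "__main__":
    # Illustrative peak times in seconds, not taken from any dataset.
    peaks = [0.00, 0.30, 0.80, 1.10, 1.62, 1.91, 2.40]
    print(label_peaks(peaks))
```

The cluster boundary adapts to each recording's heart rate, which is the practical advantage over a fixed duration threshold.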
Sepehri et~al.\ propose a method that combines neural networks with power
spectral density (PSD) estimates~\citeyearpar{Sepehri2010}. This method
exploits the periodic nature of S1 and S2 heart sounds, combined with their
narrow frequency range, to train a neural network to separate these sounds from
other sounds and murmurs. This method achieves a reported 93.6\% accuracy on a
significantly larger database than other methods detailed.\\
The most significant success in segmentation algorithms has been observed
through use of probabilistic models such as hidden Markov models (HMMs). Early
research using these models by Ricke et~al.\ utilized embedded HMMs to model the 4
states of the PCG and their transitions~\citeyearpar{Ricke2005}. MFCCs and
Shannon energy are used as feature vectors for the models. Results of
98\% accuracy were reported, although this was tested on only a small database
of signals.\\
Gill et~al.\ achieve similar results, most notably with specific consideration
for the duration of each state in the HMM~\citeyearpar{Gill2005}. This is
handled through the extraction of 6 duration features based primarily on peaks,
which are then used as feature vectors for the HMM. Results of 98.6\%
sensitivity, 96.9\% positive predictivity for S1 sounds and 98.3\% sensitivity,
96.5\% positive predictivity for S2 sounds were reported.\\
The issue of state duration is further addressed by Schmidt et~al.\ through use
of a duration-dependent hidden Markov model (DHMM)~\citeyearpar{Schmidt2015}. The
DHMM is a modified HMM that considers the duration of the current state when
calculating the probability of transition to another state. This modification
scored a reported sensitivity of 98.8\% and a positive predictivity of
98.8\%.\\
Building on previous work using HMMs, Springer et~al.\ present a segmentation
algorithm using hidden semi-Markov models (HSMMs) in combination with
logistic regression~\citeyearpar{Springer2016}. Use of an HSMM
allows for a priori information on the duration of the current state to be used
in probability calculation of the subsequent state. In this case, the knowledge
that there is an upper and lower limit on the duration of each component is
used in calculation of transition probabilities. A modified Viterbi algorithm
is then used to calculate the most likely set of transitions based on observed
features. Logistic regression is then used to improve discrimination between
state features when compared to discriminatory methods used by previous work.
Performance was evaluated on a significantly larger database than previous
methods and achieved a reported accuracy of $95.63\% \pm 0.85\%$. Due to its
rigorous evaluation and high accuracy, this method is currently considered the
state-of-the-art for PCG signal segmentation.\\
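To make the role of duration limits concrete, the following is a minimal explicit-duration Viterbi decoder. It is a sketch under several assumptions and not the published method: states are visited in a fixed cycle, every duration inside the bounds is treated as equally likely, segments must start and end inside the recording, and the per-frame log-likelihoods are supplied directly instead of coming from logistic regression. The function name \texttt{hsmm\_viterbi} is hypothetical.

```python
import math

NEG_INF = float("-inf")


def hsmm_viterbi(loglik, dmin, dmax):
    """Most likely state per frame under hard duration bounds.

    loglik[t][j] is the log-likelihood of frame t under state j.
    States follow the cycle 0 -> 1 -> ... -> K-1 -> 0.
    dmin[j] and dmax[j] are inclusive duration bounds in frames.
    """
    T, K = len(loglik), len(dmin)
    # prefix[j][t] holds the summed log-likelihood of frames 0..t-1 under state j.
    prefix = [[0.0] * (T + 1) for _ in range(K)]
    for j in range(K):
        for t in range(T):
            prefix[j][t + 1] = prefix[j][t] + loglik[t][j]

    # delta[t][j]: best score for frames 0..t-1 with a state-j segment ending at t-1.
    delta = [[NEG_INF] * K for _ in range(T + 1)]
    back = [[None] * K for _ in range(T + 1)]
    for t in range(1, T + 1):
        for j in range(K):
            previous_state = (j - 1) % K
            for d in range(dmin[j], min(dmax[j], t) + 1):
                start = t - d
                # Any state may open the recording; otherwise follow the cycle.
                prior = 0.0 if start == 0 else delta[start][previous_state]
                if prior == NEG_INF:
                    continue
                score = prior + prefix[j][t] - prefix[j][start]
                if score > delta[t][j]:
                    delta[t][j] = score
                    back[t][j] = d

    j = max(range(K), key=lambda s: delta[T][s])
    if delta[T][j] == NEG_INF:
        raise ValueError("no segmentation satisfies the duration bounds")

    # Walk the stored durations backwards to recover the segmentation.
    path = [0] * T
    t = T
    while t > 0:
        d = back[t][j]
        for frame in range(t - d, t):
            path[frame] = j
        t -= d
        j = (j - 1) % K
    return path


if __name__ == "__main__":
    good, bad = math.log(0.9), math.log(0.1)
    # Frame-wise evidence with one corrupted frame (index 4).
    evidence = [0, 0, 0, 1, 0, 1, 0, 0, 0]
    loglik = [[good if e == s else bad for s in (0, 1)] for e in evidence]
    print(hsmm_viterbi(loglik, dmin=[2, 2], dmax=[4, 4]))
```

In the toy input a frame-by-frame decision would produce a one-frame segment at index 4; the minimum duration of two frames rules that out, so the decoder absorbs the corrupted frame into the surrounding state.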
Table~\ref{SegmentationTable} provides a brief overview of significant research
into PCG segmentation. For a more complete summary of the current state of PCG
@@ -203,26 +279,32 @@ segmentation, please refer to Liu et.\ al~\citeyearpar{Liu2016}
\begin{landscape}
\begin{table}[htbp]
\captionof{table}{Summary of Segmentation Algorithms} \label{SegmentationTable}
\footnotesize
\scriptsize
%\centering
\rowcolors{1}{gray!15}{white}
\doublespacing
\begin{tabulary}{\linewidth}{LLLLL}
\dtoprule
Author & Method & Datasets & \mbox{Reported} Results & Notes \\ \midrule
Springer et.\ al (2016) & HSMM/Logistic regression & 10,172s of recordings from 112 patients. 12,181 first and 11,627 second heart sounds. & $95.63\pm0.85\%\;Ac$ & Supervised algorithm. \\
Huiying et.\ al (1997b) & Normalised average shannon energy envelope/peak picking & 37 recordings, 14 pathological murmurs and 23 physiological murmurs. 515 cycles & $91.03\%\;Ac$ & Unsupervised Algorithm. Dataset consists entirely of child recording. Optimized on full dataset \\
Vepa et.\ al (2008) & Wavelet decomposition, energy and simplicity measurement & 160 heart cycles collected from a variety of sources (training CDs, web resources) & $84\%\;Ac$ & Unsupervised Algorithm, Optimized on full dataset \\
Sun et.\ al & Viola integral envelope extraction, short-time modified Hilbert transform, peak picking & 6949s of recordings, from 121 patients & $97.37\%\;Ac$ & Supervised algorithm. Tolerance for segmentation accuracy not specified \\
Sepehri et.\ al & Spectral density estimation, auto-regressive parameters, multi-layer perceptron neural network & 120 recording, from 60 patients & $93.6\%\;Ac$ & Supervised algorithm \\
Ricke et.\ al (2005) & Shannon energy (and related features), HMM & 9 Recordings, from 9 patients & $98\%\;Ac$ & Supervised algorithm \\
Gupta et.\ al (2007) & Homomorphic filtering/K-means clustering & 41 recordings (340 cycles). Mix of normal (32\%), systolic (36\%) and diastolic murmurs (32\%) & $90.29\%\;Ac$ & Unsupervised Algorithm. \\ \bottomrule
Springer et~al.\ \citeyearpar{Springer2016} & HSMM, Logistic regression & 10,172s of recordings from 112 patients. 12,181 first and 11,627 second heart sounds. & $95.63\pm0.85\%$ & Supervised algorithm. \\
Liang et~al.\ \citeyearpar{Liang1997b} & Normalised average Shannon energy envelope, peak picking & 37 recordings, 14 pathological murmurs and 23 physiological murmurs. 515 cycles & $91.03\%\;Ac$ & Unsupervised algorithm. Dataset consists entirely of child recordings. Optimized on full dataset \\
Vepa et~al.\ \citeyearpar{Vepa2008} & Wavelet decomposition, energy and simplicity measurement & 160 heart cycles collected from a variety of sources (training CDs, web resources) & $84\%\;Ac$ & Unsupervised algorithm. Optimized on full dataset \\
Sun et~al.\ \citeyearpar{Sun2014} & Viola integral envelope extraction, short-time modified Hilbert transform, peak picking & 6949s of recordings, from 121 patients & $97.37\%\;Ac$ & Supervised algorithm. Tolerance for segmentation accuracy not specified \\
Sepehri et~al.\ \citeyearpar{Sepehri2010} & Spectral density estimation, auto-regressive parameters, multi-layer perceptron neural network & 120 recordings, from 60 patients & $93.6\%\;Ac$ & Supervised algorithm \\
Ricke et~al.\ \citeyearpar{Ricke2005} & Shannon energy (and related features), HMM & 9 recordings, from 9 patients & $98\%\;Ac$ & Supervised algorithm \\
Schmidt et~al.\ \citeyearpar{Schmidt2015} & DHMM, Auto-correlation duration features, Homomorphic envelogram & 113 recordings, from 113 patients. 8s per recording. 15 abnormal recordings & $98.8\%\;Se,\;98.6\%\;P_+$ on test set & All data recorded ``lateral to the sternum in the fourth intercostal space on the left side''. Mix of noisy and clean recordings. 40 recordings used for training, 73 for testing \\
Gill et~al.\ \citeyearpar{Gill2005} & Homomorphic envelogram, Embedded HMMs & 44 recordings, 17 subjects. 30--60s per recording & $98.6\%\;Se,\;96.9\%\;P_+$ for S1. $98.3\%\;Se,\;96.5\%\;P_+$ for S2 & Recordings taken in sub-optimal environments (noisy hospitals, offices etc.) \\
Gupta et~al.\ \citeyearpar{Gupta2007} & Homomorphic filtering, $k$-means clustering & 41 patients, 340 heart cycles. 110 normal, 124 systolic murmur, 106 diastolic murmur & $90.29\%\;Ac$ & Unsupervised algorithm. \\ \hline
\dbottomrule\\
% TODO: Add footnote explanation for Ac = Accuracy
% TODO: Add citeyearpar references to authors
\end{tabulary}
\end{table}
\end{landscape}
\restoregeometry
\doublespacing
\subsection{Feature Extraction}
A wide variety of methods exist for the extraction of statistical
features from PCG data. These features are used for the creation of