Finished segmentation lit review section
+118
-36
@@ -127,11 +127,15 @@ I'd like to thanks anyone and everyone...
|
||||
\tableofcontents
|
||||
\newpage
|
||||
|
||||
\section{Introduction}
|
||||
% TODO: Write brief overview of history of PCG signal analysis
|
||||
% TODO: Explain fundamental heart sounds
|
||||
|
||||
\section{Related Work}
|
||||
There are currently a wide variety of methods employed for the analysis and
|
||||
classification of PCG signals. Current research can be divided into 3 areas,
|
||||
classification of PCG signals. Current research can be divided into 4 areas,
|
||||
which are combined to create a full classification system. These areas
|
||||
are: signal preprocessing, signal segmentation and feature extraction methods,
|
||||
are: signal preprocessing, signal segmentation, feature extraction methods,
|
||||
and classification methods.
|
||||
The performance and evaluation of complete systems are also discussed in
|
||||
section~\ref{performance}.
|
||||
@@ -139,61 +143,133 @@ section~\ref{performance}
|
||||
|
||||
\subsection{Signal Preprocessing}
|
||||
There are a large number of factors that lead to variation in quality of PCG
|
||||
recordings: stethescope type, make and model, its microphone/sensors used for
|
||||
recordings: stethoscope type, make and model, its microphone/sensors used for
|
||||
recording of the data, the position used to record (e.g.\ lower left sternal
|
||||
border, apex, pulmonic area, aortic area), built-in filters/signal processing
|
||||
used by the stethescope (i.e.\ noise filters, anti-tremor filters), medication that
|
||||
a pacient may be taking, as well as many other factors that may influence the
|
||||
used by the stethoscope (e.g.\ noise filters, anti-tremor filters), medication that
|
||||
a patient may be taking, as well as many other factors that may influence the
|
||||
recorded signal~\parencite[p.4]{Pavlopoulos2004}. This presents a significant
|
||||
issue when attempting to analyse and compare a dataset of signals, as
|
||||
variations in recordings and artefacts caused by factors other than heart
|
||||
sounds will most likely interfere with analysis and comparison methods. To
|
||||
account for this, pre-processing methods are widely used aiming to standardize
|
||||
account for this, pre-processing methods are widely used, aiming to standardize
|
||||
a dataset. This is also used as a way to accentuate features of the data that
|
||||
are expected to be relevant during classification.\\
|
||||
|
||||
A common method employed is the use of decimation and a static filter to remove
|
||||
unwanted spectral content that is most likely noise~\parencite{Liang1997a,
|
||||
Homsi2016, Springer2016, Gupta2007}. This helps reduce higher frequency noise
|
||||
such as speech, microphone movement and other interference caused externally.
|
||||
Decimation tends to downsample to around 1--4KHz, with anti-aliasing filter
|
||||
specifications varying across the literature. Generally, highpass chebychev or
|
||||
butterworth filters are favoured with cutoff frequencies ranging from
|
||||
400--750Hz.\\
|
||||
such as speech, microphone movement, breathing and other interference caused
|
||||
externally. Decimation tends to downsample to around 1--4\,kHz, with
|
||||
anti-aliasing filter specifications varying across the literature. Generally,
|
||||
low-pass Chebyshev or Butterworth filters are favoured with cutoff frequencies
|
||||
ranging from 400--750\,Hz.\\
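
As a rough illustration of the constraint linking decimation and filtering (the
symbols $f_s$, $M$ and $f_c$ are introduced here for illustration and are not
taken from the cited works), decimating a recording sampled at $f_s$ by an
integer factor $M$ gives a new sampling rate and Nyquist limit of
\[
  f_s' = \frac{f_s}{M}, \qquad f_{\mathrm{Nyq}}' = \frac{f_s'}{2},
\]
so the anti-aliasing filter cutoff $f_c$ should satisfy $f_c < f_{\mathrm{Nyq}}'$.
For example, assuming a target rate of $f_s' = 2$\,kHz, the Nyquist limit is
1\,kHz and any cutoff in the 400--750\,Hz range lies below it. At
$f_s' = 1$\,kHz the limit drops to 500\,Hz, so only cutoffs towards the lower
end of that range would avoid aliasing.\\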
|
||||
|
||||
In addition, many methods decompose the filtered signal using wavelet-based
|
||||
methods such as the discrete wavelet transform
|
||||
(DWT)~\parencite{Liang1997a, Pavlopoulos2004}, continuous
|
||||
wavelet transform (CWT)~\parencite{Langley2016} or wavelet
|
||||
packet decomposition (WPD)~\parencite{Liang1998}.
|
||||
Wavelet transforms are popular as unlike Fourier transforms, they are well
|
||||
Wavelet transforms are popular as, unlike Fourier transforms, they are well
|
||||
localized in both the time and frequency domains. This allows for the analysis
|
||||
of PCG signals across multiple frequency bands whilst maintaining transient
|
||||
temporal events in the resulting decomposition~\parencite[p.93]{Ari2008}.
|
||||
This may be used for analysis of transient events such as murmurs, which may
|
||||
consist of higher frequency components than normal heart sounds.
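
To make the multi-band view concrete, a dyadic DWT can be sketched as follows,
under the idealised assumption of perfect half-band filters (the notation
$d_j$, $a_J$ and $f_s$ is assumed here for illustration). For a signal sampled
at $f_s$, the detail coefficients $d_j$ at level $j$ and the approximation
coefficients $a_J$ at the final level $J$ cover approximately
\[
  d_j : \left[\frac{f_s}{2^{j+1}},\ \frac{f_s}{2^{j}}\right], \qquad
  a_J : \left[0,\ \frac{f_s}{2^{J+1}}\right].
\]
For example, with $f_s = 2$\,kHz and $J = 3$, $d_1$ spans 500--1000\,Hz, $d_2$
spans 250--500\,Hz, $d_3$ spans 125--250\,Hz and $a_3$ spans 0--125\,Hz. Real
wavelet filters overlap between adjacent bands, so these ranges are nominal.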
|
||||
|
||||
% TODO: Add reference to table of methods
|
||||
|
||||
\subsection{Signal Segmentation}
|
||||
Algorithms for the segmentation of PCG data aim to extract the structure of
|
||||
the signal over time. This is a key stage in the analysis of PCG signals as the
|
||||
structure and relationships between the fundamental heart sounds (FHSs) form
|
||||
the basis for much of the further analysis performed on PCG data. A number of
|
||||
methods exist for the extraction of FHSs. Tradiational methods rely on direct
|
||||
extraction of peaks in the time domain to determine the structure of a signal.
|
||||
These methods perform various transformation in order to accentuate the
|
||||
transient events with the intention of isolating them~\parencite{Liang1997b}.
|
||||
However, these methods tend to suffer significantly from background noise and
|
||||
so perform poorly in sub-optimal conditions.\\ More recent methods use spectral
|
||||
representations to assist in the splitting of the FHSs, in particular using
|
||||
wavelet decomposition~\parencite{Liang1997a, Vepa2008}. These methods tend to
|
||||
perform more robustly on signals of varying conditions\\ In addition, Machine
|
||||
learning algorithms have been employed, such as $k$-Nearest
|
||||
Neighbour classifiers~\parencite{Gupta2007}, Neural
|
||||
Networks~\parencite{Sepehri2010}, and Hidden Markov
|
||||
Models (HMMs)~\parencite{Ricke2005} to improve segment classification. Particular
|
||||
success has been observed in Springer's use of logistic regression and Hidden
|
||||
semi-Markov models (HSMM)~\citeyearpar{Springer2016}.
|
||||
the basis for much of the further analysis performed on PCG data.\\
|
||||
|
||||
% TODO: insert segmented graph of PCG cycle
|
||||
|
||||
A number of methods exist for the extraction of FHSs. Traditional methods rely
|
||||
on direct extraction of peaks from envelopes in the time domain to determine
|
||||
the structure of a signal. These methods perform various transformations in
|
||||
order to accentuate the transient events with the intention of isolating
|
||||
them.\\
|
||||
Liang et al.\ propose a method using the popular Shannon energy
|
||||
envelope, achieving good accuracy across 37 recordings of
|
||||
children~\citeyearpar{Liang1997b}. The algorithm aims to segment the data by
|
||||
first extracting the envelope, then applying adaptive rule-based thresholds to
|
||||
determine peaks corresponding to segmentation points. When comparing results to
|
||||
hand-annotated ground truth, the system achieves a reported accuracy score of
|
||||
84\%. However, due to the small sample size, and potential lack of noise in the
|
||||
dataset used, this may not translate to a larger dataset recorded in
|
||||
sub-optimal conditions.\\
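
For reference, the Shannon energy envelope is commonly written as follows. This
is a sketch in notation assumed here ($x_{\mathrm{norm}}$, $N$, $E_s$, $P_a$),
not a quotation from the cited work. For a signal $x_{\mathrm{norm}}$ normalised
to unit peak amplitude, the average Shannon energy over a window of $N$ samples
is
\[
  E_s = -\frac{1}{N} \sum_{i=1}^{N} x_{\mathrm{norm}}^2(i)\,
        \log x_{\mathrm{norm}}^2(i),
\]
and the normalised envelope at window position $t$ is
\[
  P_a(t) = \frac{E_s(t) - M(E_s)}{S(E_s)},
\]
where $M(E_s)$ and $S(E_s)$ are the mean and standard deviation of $E_s$ over
the recording. The term $-x^2 \log x^2$ gives most weight to medium-intensity
samples and suppresses low-intensity samples more strongly than high-intensity
ones, which is why it is favoured for accentuating heart sounds over background
noise.\\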
|
||||
More recent methods use spectral representations to assist in the splitting of
|
||||
the FHSs, in particular using wavelet decomposition. These methods tend to
|
||||
perform more robustly on signals of varying conditions.\\
|
||||
Building on previous work, Liang et al.\ present an improved method, using the
|
||||
discrete wavelet transform to decompose and reconstruct the signal into 7
|
||||
distinct frequency bands~\citeyearpar{Liang1997a}. Applying a similar method
|
||||
of envelope extraction and peak picking to each frequency band, the best
|
||||
estimate of all frequency bands is then chosen as the final result. The criterion
for this choice is based on the number of S1s and S2s detected, and the number of
artefacts discarded for each frequency band. This method achieved an improved
accuracy of 93\% across a larger dataset of 77 recordings. This
suggests that the algorithm is at least as robust as the previous work by
Liang et al.\\
|
||||
Vepa et al.\ proposed a wavelet-decomposition-based method that uses a
|
||||
combination of simplicity and envelope features~\citeyearpar{Vepa2008}. This
|
||||
approach attempts to improve robustness when analysing signals of varying
|
||||
quality by using multiple complementary features, allowing the method to base
|
||||
decisions on a variety of statistical properties. Evaluating the algorithm on a
|
||||
collection of 160 heart cycles from a variety of sources, a reported accuracy
|
||||
of 84\% was achieved.\\
|
||||
|
||||
A variety of machine learning methods have been implemented with reasonable
|
||||
success. Gupta et al.\ present a method that applies $k$-means clustering to
replace standard threshold-based methods for determining peak classification in
a standard envelope-based segmentation algorithm~\citeyearpar{Gupta2007}. This achieved a reported
accuracy of 90.29\%. Due to the standard envelope-based method for feature
extraction, this method is still susceptible to noise and artefacts that occur
|
||||
within the frequency bands of the heart sounds.\\
|
||||
|
||||
Sepehri et al.\ propose a method that combines neural networks with Power
|
||||
Spectral Density (PSD) estimates~\citeyearpar{Sepehri2010}. This method
|
||||
exploits the periodic nature of S1 and S2 heart sounds, combined with their
|
||||
narrow frequency range, to train a neural network to separate these sounds from
|
||||
other sounds and murmurs. This method achieves a reported 93.6\% accuracy on a
|
||||
significantly larger database than other methods detailed.\\
|
||||
|
||||
The most significant success in segmentation algorithms has been observed through use
of probabilistic models such as Hidden Markov Models (HMMs). Early research
using these models by Ricke et al.\ utilized embedded HMMs to model the 4
states of the PCG and their transitions~\citeyearpar{Ricke2005}. MFCCs and
Shannon energy are used as feature vectors for the models. Results of
|
||||
98\% accuracy were reported, although this was tested on only a small database
|
||||
of signals.\\
|
||||
Gill et al.\ achieve similar results, most notably with specific consideration
|
||||
for the duration of each state in the HMM~\citeyearpar{Gill2005}. This is
|
||||
handled through the extraction of 6 duration features based primarily on peaks,
|
||||
which are then used as feature vectors for the HMM. Results of 98.6\%
|
||||
sensitivity, 96.9\% positive predictivity for S1 sounds and 98.3\% sensitivity,
|
||||
96.5\% positive predictivity for S2 sounds were reported.
|
||||
The issue of state duration is further addressed by Schmidt et al.\ through use
of a duration-dependent hidden Markov model (DHMM)~\citeyearpar{Schmidt2015}. The
|
||||
DHMM is a modified HMM that considers the duration of the current state when
|
||||
calculating the probability of transition to another state. This modification
|
||||
scored a reported sensitivity of 98.8\% and a positive predictivity of
98.6\%.\\
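
For reference, sensitivity ($Se$) and positive predictivity ($P_+$) are
conventionally defined as
\[
  Se = \frac{TP}{TP + FN}, \qquad P_+ = \frac{TP}{TP + FP},
\]
where $TP$, $FP$ and $FN$ denote the numbers of correctly detected, falsely
detected and missed heart sounds respectively. What counts as a correct
detection (for instance, the tolerance window around an annotated sound) varies
between studies, so reported figures are not always directly comparable. As a
worked example with hypothetical counts, $TP = 95$, $FP = 3$ and $FN = 5$ give
$Se = 95/100 = 95\%$ and $P_+ = 95/98 \approx 96.9\%$.\\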
|
||||
Building on previous work using HMMs, Springer et al.\ present a segmentation
algorithm using hidden semi-Markov models (HSMMs) in combination with
logistic regression~\citeyearpar{Springer2016}. Use of a hidden semi-Markov model
allows for a priori information on the duration of the current state to be used
in probability calculation of the subsequent state. In this case, the knowledge
that there is an upper and lower limit on the duration of each component is
used in calculation of transition probabilities. A modified Viterbi algorithm
|
||||
is then used to calculate the most likely set of transitions based on observed
|
||||
features. Logistic regression is then used to improve discrimination between
|
||||
state features when compared to discriminatory methods used by previous work.
|
||||
Performance was evaluated on a significantly larger database than previous
|
||||
methods and achieved a reported accuracy of $95.63\% \pm 0.85\%$. Due to its
|
||||
rigorous evaluation and high accuracy, this method is currently considered the
|
||||
state-of-the-art for PCG signal segmentation.\\
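
One common way of writing a duration-dependent Viterbi recursion is sketched
below. The notation ($\delta_t$, $a_{ij}$, $p_j$, $b_j$, $o_t$, $q_t$) is
assumed here for illustration and may differ from that of the cited work. The
likelihood of the best state sequence that ends a stay of $d$ samples in state
$j$ at time $t$ is
\[
  \delta_t(j) = \max_{d} \, \max_{i \neq j}
    \left[ \delta_{t-d}(i) \, a_{ij} \, p_j(d)
    \prod_{s=0}^{d-1} b_j(o_{t-s}) \right],
\]
where $a_{ij}$ is the transition probability from state $i$ to state $j$,
$p_j(d)$ is the probability of remaining in state $j$ for $d$ samples, and
$b_j(o_t)$ is the likelihood of observation $o_t$ in state $j$. Upper and lower
duration limits correspond to setting $p_j(d) = 0$ outside an allowed range.
When a classifier such as logistic regression supplies the posterior
$P(q_t = j \mid o_t)$, an observation likelihood can be recovered through
Bayes' rule,
\[
  b_j(o_t) = \frac{P(q_t = j \mid o_t) \, P(o_t)}{P(q_t = j)}.
\]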
|
||||
|
||||
Table~\ref{SegmentationTable} provides a brief overview of significant research
|
||||
into PCG segmentation. For a more complete summary of the current state of PCG
|
||||
@@ -203,26 +279,32 @@ segmentation, please refer to Liu et.\ al~\citeyearpar{Liu2016}
|
||||
\begin{landscape}
|
||||
\begin{table}[htbp]
|
||||
\captionof{table}{Summary of Segmentation Algorithms} \label{SegmentationTable}
|
||||
\footnotesize
|
||||
\scriptsize
|
||||
%\centering
|
||||
\rowcolors{1}{gray!15}{white}
|
||||
\doublespacing
|
||||
\begin{tabulary}{\linewidth}{LLLLL}
|
||||
\dtoprule
|
||||
Author & Method & Datasets & \mbox{Reported} Results & Notes \\ \midrule
|
||||
Springer et.\ al (2016) & HSMM/Logistic regression & 10,172s of recordings from 112 patients. 12,181 first and 11,627 second heart sounds. & $95.63\pm0.85\%\;Ac$ & Supervised algorithm. \\
|
||||
Huiying et.\ al (1997b) & Normalised average shannon energy envelope/peak picking & 37 recordings, 14 pathological murmurs and 23 physiological murmurs. 515 cycles & $91.03\%\;Ac$ & Unsupervised Algorithm. Dataset consists entirely of child recording. Optimized on full dataset \\
|
||||
Vepa et.\ al (2008) & Wavelet decomposition, energy and simplicity measurement & 160 heart cycles collected from a variety of sources (training CDs, web resources) & $84\%\;Ac$ & Unsupervised Algorithm, Optimized on full dataset \\
|
||||
Sun et.\ al & Viola integral envelope extraction, short-time modified Hilbert transform, peak picking & 6949s of recordings, from 121 patients & $97.37\%\;Ac$ & Supervised algorithm. Tolerance for segmentation accuracy not specified \\
|
||||
Sepehri et.\ al & Spectral density estimation, auto-regressive parameters, multi-layer perceptron neural network & 120 recording, from 60 patients & $93.6\%\;Ac$ & Supervised algorithm \\
|
||||
Ricke et.\ al (2005) & Shannon energy (and related features), HMM & 9 Recordings, from 9 patients & $98\%\;Ac$ & Supervised algorithm \\
|
||||
Gupta et.\ al (2007) & Homomorphic filtering/K-means clustering & 41 recordings (340 cycles). Mix of normal (32\%), systolic (36\%) and diastolic murmurs (32\%) & $90.29\%\;Ac$ & Unsupervised Algorithm. \\ \bottomrule
|
||||
Springer et al.\ \citeyearpar{Springer2016} & HSMM, Logistic regression & 10,172s of recordings from 112 patients. 12,181 first and 11,627 second heart sounds. & $95.63\pm0.85\%$ & Supervised algorithm. \\
|
||||
Liang et al.\ \citeyearpar{Liang1997b} & Normalised average Shannon energy envelope, peak picking & 37 recordings, 14 pathological murmurs and 23 physiological murmurs. 515 cycles & $91.03\%\;Ac$ & Unsupervised Algorithm. Dataset consists entirely of child recordings. Optimized on full dataset \\
|
||||
Vepa et al.\ \citeyearpar{Vepa2008} & Wavelet decomposition, energy and simplicity measurement & 160 heart cycles collected from a variety of sources (training CDs, web resources) & $84\%\;Ac$ & Unsupervised Algorithm, Optimized on full dataset \\
|
||||
Sun et al.\ \citeyearpar{Sun2014} & Viola integral envelope extraction, short-time modified Hilbert transform, peak picking & 6949s of recordings, from 121 patients & $97.37\%\;Ac$ & Supervised algorithm. Tolerance for segmentation accuracy not specified \\
|
||||
Sepehri et al.\ \citeyearpar{Sepehri2010} & Spectral density estimation, auto-regressive parameters, multi-layer perceptron neural network & 120 recordings, from 60 patients & $93.6\%\;Ac$ & Supervised algorithm \\
|
||||
Ricke et al.\ \citeyearpar{Ricke2005} & Shannon energy (and related features), HMM & 9 recordings, from 9 patients & $98\%\;Ac$ & Supervised algorithm \\
|
||||
Schmidt et al.\ \citeyearpar{Schmidt2015} & DHMM, Auto-correlation duration features, Homomorphic envelogram & 113 recordings, from 113 patients. 8s per recording. 15 abnormal recordings & $98.8\%\;Se,\;98.6\%\;P_+$ on test set & All data recorded ``lateral to the sternum in the fourth intercostal space on the left side''. Mix of noisy and clean recordings. 40 recordings used for training, 73 for testing \\
|
||||
Gill et al.\ \citeyearpar{Gill2005} & Homomorphic envelogram, Embedded HMMs & 44 recordings, 17 subjects. 30--60\,s per recording & $98.6\%\;Se,\;96.9\%\;P_+$ for S1. $98.3\%\;Se,\;96.5\%\;P_+$ for S2 & Recordings taken in sub-optimal environments (noisy hospitals, offices, etc.) \\
|
||||
Gupta et al.\ \citeyearpar{Gupta2007} & Homomorphic filtering, $k$-means clustering & 41 patients, 340 heart cycles. 110 normal, 124 systolic murmur, 106 diastolic murmur & $90.29\%\;Ac$ & Unsupervised Algorithm. \\ \hline
|
||||
\dbottomrule\\
|
||||
% TODO: Add footnote explanation for Ac = Accuracy
|
||||
% TODO: Add citeyearpar references to authors
|
||||
\end{tabulary}
|
||||
\end{table}
|
||||
\end{landscape}
|
||||
\restoregeometry
|
||||
|
||||
|
||||
\doublespacing
|
||||
\subsection{Feature Extraction}
|
||||
A wide variety of methods exist for the extraction of statistical
|
||||
features from PCG data. These features are used for the creation of
|
||||
|
||||