squash commit
+198
-133
@@ -1,5 +1,4 @@
|
|||||||
\documentclass[titlepage]{scrartcl}
|
\documentclass[titlepage, 12pt]{scrartcl} \usepackage{enumitem}
|
||||||
\usepackage{enumitem}
|
|
||||||
\usepackage[british]{babel}
|
\usepackage[british]{babel}
|
||||||
\usepackage[style=apa, backend=biber]{biblatex}
|
\usepackage[style=apa, backend=biber]{biblatex}
|
||||||
\DeclareLanguageMapping{british}{british-apa}
|
\DeclareLanguageMapping{british}{british-apa}
|
||||||
@@ -11,32 +10,36 @@
|
|||||||
\MakePerPage{footnote}
|
\MakePerPage{footnote}
|
||||||
\usepackage{abstract}
|
\usepackage{abstract}
|
||||||
\usepackage{graphicx}
|
\usepackage{graphicx}
|
||||||
|
\usepackage{setspace}
|
||||||
% Create hyperlinks in bibliography
|
% Create hyperlinks in bibliography
|
||||||
\usepackage{hyperref}
|
\usepackage{hyperref}
|
||||||
\usepackage{amsmath}
|
\usepackage{amsmath}
|
||||||
|
|
||||||
|
\usepackage[pass]{geometry}
|
||||||
|
\usepackage{graphicx}
|
||||||
|
|
||||||
\usepackage[T1]{fontenc}
|
\usepackage[T1]{fontenc}
|
||||||
\usepackage[utf8]{inputenc}
|
\usepackage[utf8]{inputenc}
|
||||||
\usepackage{blindtext}
|
\usepackage{blindtext}
|
||||||
\setkomafont{disposition}{\normalfont\bfseries}
|
\setkomafont{disposition}{\normalfont\bfseries}
|
||||||
|
|
||||||
|
\usepackage{etoolbox}
|
||||||
\graphicspath{{./resources/}}
|
\graphicspath{{./resources/}}
|
||||||
\addbibresource{~/Documents/library.bib}
|
\addbibresource{~/Documents/library.bib}
|
||||||
|
|
||||||
\newsavebox{\abstractbox}
|
%\newsavebox{\abstractbox}
|
||||||
\renewenvironment{abstract}
|
%\renewenvironment{abstract}
|
||||||
{\begin{lrbox}{0}\begin{minipage}{\textwidth}
|
% {\begin{lrbox}{0}\begin{minipage}{\textwidth}
|
||||||
\begin{center}\normalfont\sectfont\abstractname\end{center}\quotation}
|
% \begin{center}\normalfont\sectfont\abstractname\end{center}\quotation}
|
||||||
{\endquotation\end{minipage}\end{lrbox}%
|
% {\endquotation\end{minipage}\end{lrbox}%
|
||||||
\global\setbox\abstractbox=\box0 }
|
% \global\setbox\abstractbox=\box0 }
|
||||||
|
|
||||||
\usepackage{etoolbox}
|
%\makeatletter
|
||||||
\makeatletter
|
%\expandafter\patchcmd\csname\string\maketitle\endcsname
|
||||||
\expandafter\patchcmd\csname\string\maketitle\endcsname
|
% {\vskip\z@\@plus3fill}
|
||||||
{\vskip\z@\@plus3fill}
|
% {\vskip\z@\@plus2fill\box\abstractbox\vskip\z@\@plus1fill}
|
||||||
{\vskip\z@\@plus2fill\box\abstractbox\vskip\z@\@plus1fill}
|
% {}{}
|
||||||
{}{}
|
%\makeatother
|
||||||
\makeatother
|
|
||||||
|
|
||||||
\DeclareCiteCommand{\citeyearpar}
|
\DeclareCiteCommand{\citeyearpar}
|
||||||
{}
|
{}
|
||||||
@@ -67,133 +70,195 @@
|
|||||||
showspaces=false,
|
showspaces=false,
|
||||||
showstringspaces=false}
|
showstringspaces=false}
|
||||||
|
|
||||||
|
|
||||||
\begin{document}
|
\begin{document}
|
||||||
\title{ECS750P --- Final Project}
|
\newgeometry{lmargin=1.5cm}
|
||||||
\subtitle{\LARGE{Extraction of Statistical Features from PCG Signals for the
|
\begin{titlepage}
|
||||||
Classification of Heart Abnormalities}}
|
|
||||||
|
|
||||||
\author{Sam Perry --- EC16039}
|
\begingroup
|
||||||
|
|
||||||
\maketitle
|
\setlength{\tabcolsep}{1.5cm}
|
||||||
|
|
||||||
|
\begin{tabular}[c]{p{0.30\textwidth} | p{0.4\textwidth}}
|
||||||
|
|
||||||
|
{\vspace{1.2cm} \Large School of Electronic Engineering and Computer Science \par}
|
||||||
|
&
|
||||||
|
{\vspace{1.2cm} \large Sound and Music Computing \newline Project Report \the\year \par}\\
|
||||||
|
|
||||||
|
& {\vspace{0.5cm} \Large \textbf{Extraction of Statistical Features from PCG Signals for the
|
||||||
|
Classification of Heart Abnormalities} \par}\\
|
||||||
|
|
||||||
|
\vspace{0.4\textheight}
|
||||||
|
\includegraphics[width=5cm]{qmul_logo}
|
||||||
|
&
|
||||||
|
{\vspace{1cm} \large \textbf{Samuel Perry}}\\
|
||||||
|
|
||||||
|
&
|
||||||
|
\multicolumn{1}{|r}{August \the\year}
|
||||||
|
|
||||||
|
\end{tabular}
|
||||||
|
|
||||||
\section{Literature Review}
|
\endgroup
|
||||||
There are currently a wide variety of methods are employed for the analysis and
|
|
||||||
classification of PCG signals. Current research focuses on a number of areas,
|
|
||||||
the most relevant of which are:
|
|
||||||
\begin{itemize}
|
|
||||||
\item Algorithms for the pre-processing and segmentation of PCG data,
|
|
||||||
aiming to extract the structure of the signal over time. This is a key
|
|
||||||
stage in the analysis of PCG signals as the structure and relationships between the
|
|
||||||
fundamental heart sounds (FHSs) form the basis for much of the further
|
|
||||||
analysis performed on PCG data. A number of methods exist for the
|
|
||||||
extraction of FHSs. Some rely on direct extraction of peaks in the time
|
|
||||||
domain to determine the structure of a signal. These methods perform
|
|
||||||
various transformations in order to accentuate the transient events with
|
|
||||||
the intention of isolating them~\parencite{Groch1992, Liang1997}.
|
|
||||||
However, these methods tend to suffer significantly from background
|
|
||||||
noise and so perform poorly in sub-optimal conditions.\\
|
|
||||||
Other methods rely on spectral representations to assist in the
|
|
||||||
splitting of the FHSs, in particular using wavelet
|
|
||||||
decomposition~\parencite{LiangHuiying1997, Vepa2008}. This allows for
|
|
||||||
the separation of components based on their frequency content in place
|
|
||||||
of, or in addition to, their temporal characteristics.\\
|
|
||||||
In addition, machine learning algorithms have been employed, such as
|
|
||||||
$k$-Nearest Neighbour~\parencite{Gupta2007} and Neural
|
|
||||||
Networks~\parencite{Oskiper2002} to improve segment classification.
|
|
||||||
More recently, particular success has been observed in Springer's use
|
|
||||||
of logistic regression and Hidden semi-Markov
|
|
||||||
models~\citeyearpar{Springer2016}.
|
|
||||||
|
|
||||||
\item A wide variety of methods exist for the extraction of statistical
|
\end{titlepage}
|
||||||
features from PCG data. These features are used for the creation of
|
\restoregeometry
|
||||||
robust, meaningful representations of the data.\\
|
|
||||||
The use of spectral representations for PCG data is prominent in the
|
|
||||||
literature. The ability to separate activity across the frequency
|
|
||||||
spectrum reveals patterns that may not be attainable by analysing the
|
|
||||||
time domain signal alone.\\
|
|
||||||
Due to the need for low frequency analysis and the high noise levels
|
|
||||||
found in PCG signals, it has been found that the traditional FFT
|
|
||||||
method for extracting spectral information may not be
|
|
||||||
suitable~\parencite{Akay1990}. For this reason, parametric methods for
|
|
||||||
spectral estimation have been a popular choice for extraction of such information.
|
|
||||||
Methods such as AR, ARMA, AR-HOS and MUSIC have been shown to provide spectral
|
|
||||||
representations suitable for analysis and classification of heart
|
|
||||||
sound~\parencite{Ergen2001, Schmidt2015}.\\
|
|
||||||
Other methods such as Wavelet Decomposition and MFCCs have also been
|
|
||||||
successfully employed for extracting spectral data for purposes such
|
|
||||||
as heart valve disease identification and heart murmur
|
|
||||||
detection~\parencite{Quiceno-Manrique2010a, Maglogiannis2009}.\\
|
|
||||||
|
|
||||||
In addition to direct analysis on the signal, the ability to segment
|
|
||||||
and extract RR values from the signal allows for their statistical
|
|
||||||
analysis, both in the time and frequency domain, for use as features.\\
|
|
||||||
Dash et al.\ use a number of time-based statistical analyses of the RR
|
|
||||||
time series for the detection of atrial fibrillation. Statistical
|
|
||||||
analyses such as RMSSD, Shannon Entropy and Turning-point Ratio are
|
|
||||||
used as feature vectors for classification of
|
|
||||||
signals~\citeyearpar{Dash2009}. A similar approach is used by Yaghouby
|
|
||||||
et al.\ for the generalized classification of heart abnormality. Here,
|
|
||||||
a selection of linear and non-linear features are used for
|
|
||||||
classification with promising results~\citeyearpar{Yaghouby2009}.\\
|
|
||||||
Frequency domain analysis of RR values is also performed by calculating the
|
|
||||||
PSD of the RR values via approaches such as VFCDM.\ This form of
|
|
||||||
approach allows for higher resolution time-frequency representations of
|
|
||||||
the RR data than approaches such as the FFT or wavelet transform~\parencite{Wang2006}.
|
|
||||||
From spectral representations such as this, Yaghouby et al.\
|
|
||||||
demonstrate the use of such descriptors for the discrimination between
|
|
||||||
sympathetic and parasympathetic contents of the signal, not directly
|
|
||||||
detectable through time domain analysis~\citeyearpar{Yaghouby2009}.\\
|
|
||||||
Further in-depth analysis of statistical features for HRV can be found
|
|
||||||
in~\parencite{Electrophysiology1996}.
|
|
||||||
|
|
||||||
\item Classification of signals for diagnostic purposes. The aim being to
|
\doublespacing
|
||||||
distinguish healthy signals from those with certain heart
|
\begin{abstract}
|
||||||
conditions or abnormalities. This is most commonly achieved by extracting
|
Things and stuff and words...
|
||||||
sets of feature vectors from PCG signals, followed by their
|
\end{abstract}
|
||||||
classification, most commonly using machine learning algorithms for
|
|
||||||
automatic classification. The features extracted and classification
|
|
||||||
algorithms applied vary across the literature based on factors such as
|
|
||||||
the diagnostic aims of the classification and computing performance
|
|
||||||
requirements.\\
|
|
||||||
|
|
||||||
Artificial neural networks and support vector machines have proven to
|
\renewcommand{\abstractname}{Acknowledgements}
|
||||||
be popular choices for classification. Much success has been seen in
|
\begin{abstract}
|
||||||
employing these machine learning techniques for classification across
|
I'd like to thank anyone and everyone...
|
||||||
both PCG and ECG data for conditions such as chronic heart failure,
|
\end{abstract}
|
||||||
atrial fibrillation and flutter, diastolic murmurs, and for general
|
|
||||||
pathology detection~\parencite{Cathers1995, Wu1995, Bung2000,
|
\tableofcontents
|
||||||
Lubaib2016, Maji2014, Ari2010, Maglogiannis2009}. Results do vary based
|
\newpage
|
||||||
on the combination of features and exact classification methods used.
|
|
||||||
However, encouraging results are presented with highly accurate
|
\section{Related Work}
|
||||||
classifications for general abnormality detection and for more specific
|
There are currently a wide variety of methods employed for the analysis and
|
||||||
pathological condition detection.\\
|
classification of PCG signals. Current research can be divided into three areas,
|
||||||
|
which are combined to create a full classification system. These areas
|
||||||
|
are: signal preprocessing and segmentation, feature extraction methods and
|
||||||
|
classification methods.
|
||||||
|
|
||||||
|
\subsection{Signal Preprocessing and Segmentation}
|
||||||
|
Due to factors such as recording conditions and
|
||||||
|
|
||||||
|
Algorithms for the pre-processing and segmentation of PCG data
|
||||||
|
aim to extract the structure of the signal over time. This is a key
|
||||||
|
stage in the analysis of PCG signals as the structure and relationships between the
|
||||||
|
fundamental heart sounds (FHSs) form the basis for much of the further
|
||||||
|
analysis performed on PCG data. A number of methods exist for the
|
||||||
|
extraction of FHSs. Some rely on direct extraction of peaks in the time
|
||||||
|
domain to determine the structure of a signal. These methods perform
|
||||||
|
various transformations in order to accentuate the transient events with
|
||||||
|
the intention of isolating them~\parencite{Groch1992, Liang1997}.
|
||||||
|
However, these methods tend to suffer significantly from background
|
||||||
|
noise and so perform poorly in sub-optimal conditions.\\
|
||||||
|
Other methods rely on spectral representations to assist in the
|
||||||
|
splitting of the FHSs, in particular using wavelet
|
||||||
|
decomposition~\parencite{LiangHuiying1997, Vepa2008}. This allows for
|
||||||
|
the separation of components based on their frequency content in place
|
||||||
|
of, or in addition to, their temporal characteristics.\\
|
||||||
|
In addition, machine learning algorithms have been employed, such as
|
||||||
|
$k$-Nearest Neighbour~\parencite{Gupta2007} and Neural
|
||||||
|
Networks~\parencite{Oskiper2002} to improve segment classification.
|
||||||
|
More recently, particular success has been observed in Springer's use
|
||||||
|
of logistic regression and Hidden semi-Markov
|
||||||
|
models~\citeyearpar{Springer2016}.
|
||||||
|
|
||||||
|
\subsection{Statistical Feature Extraction}
|
||||||
|
A wide variety of methods exist for the extraction of statistical
|
||||||
|
features from PCG data. These features are used for the creation of
|
||||||
|
robust, meaningful representations of the data.\\
|
||||||
|
The use of spectral representations for PCG data is prominent in the
|
||||||
|
literature. The ability to separate activity across the frequency
|
||||||
|
spectrum reveals patterns that may not be attainable by analysing the
|
||||||
|
time domain signal alone.\\
|
||||||
|
Due to the need for low frequency analysis and the high noise levels
|
||||||
|
found in PCG signals, it has been found that the traditional FFT
|
||||||
|
method for extracting spectral information may not be
|
||||||
|
suitable~\parencite{Akay1990}. For this reason, parametric methods for
|
||||||
|
spectral estimation have been a popular choice for extraction of such information.
|
||||||
|
Methods such as AR, ARMA, AR-HOS and MUSIC have been shown to provide spectral
|
||||||
|
representations suitable for analysis and classification of heart
|
||||||
|
sound~\parencite{Ergen2001, Schmidt2015}.\\
|
||||||
|
Other methods such as Wavelet Decomposition and MFCCs have also been
|
||||||
|
successfully employed for extracting spectral data for purposes such
|
||||||
|
as heart valve disease identification and heart murmur
|
||||||
|
detection~\parencite{Quiceno-Manrique2010a, Maglogiannis2009}.\\
|
||||||
|
|
||||||
|
In addition to direct analysis on the signal, the ability to segment
|
||||||
|
and extract RR values from the signal allows for their statistical
|
||||||
|
analysis, both in the time and frequency domain, for use as features.\\
|
||||||
|
Dash et al.\ use a number of time-based statistical analyses of the RR
|
||||||
|
time series for the detection of atrial fibrillation. Statistical
|
||||||
|
analyses such as RMSSD, Shannon Entropy and Turning-point Ratio are
|
||||||
|
used as feature vectors for classification of
|
||||||
|
signals~\citeyearpar{Dash2009}. A similar approach is used by Yaghouby
|
||||||
|
et al.\ for the generalized classification of heart abnormality. Here,
|
||||||
|
a selection of linear and non-linear features are used for
|
||||||
|
classification with promising results~\citeyearpar{Yaghouby2009}.\\
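As an illustration of two of the measures named above, the commonly used
textbook definitions are sketched here. The notation is an assumption made
for this sketch and is not taken from the cited papers: $\mathit{RR}_i$ is
the $i$-th RR interval, $N$ the number of intervals in the analysis window,
and $p_k$ the proportion of intervals falling in histogram bin $k$ of $K$ bins.
\begin{equation*}
  \mathrm{RMSSD} = \sqrt{\frac{1}{N-1}\sum_{i=1}^{N-1}
    \left(\mathit{RR}_{i+1} - \mathit{RR}_{i}\right)^{2}},
  \qquad
  H = -\sum_{k=1}^{K} p_k \log p_k
\end{equation*}
Bins with $p_k = 0$ are taken to contribute nothing to $H$. The cited works
may apply additional normalisation or outlier removal not shown here.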
|
||||||
|
Frequency domain analysis of RR values is also performed by calculating the
|
||||||
|
PSD of the RR values via approaches such as VFCDM.\ This form of
|
||||||
|
approach allows for higher resolution time-frequency representations of
|
||||||
|
the RR data than approaches such as the FFT or wavelet transform~\parencite{Wang2006}.
|
||||||
|
From spectral representations such as this, Yaghouby et al.\
|
||||||
|
demonstrate the use of such descriptors for the discrimination between
|
||||||
|
sympathetic and parasympathetic contents of the signal, not directly
|
||||||
|
detectable through time domain analysis~\citeyearpar{Yaghouby2009}.\\
|
||||||
|
Further in-depth analysis of statistical features for HRV can be found
|
||||||
|
in~\parencite{Electrophysiology1996}.
|
||||||
|
|
||||||
|
\subsection{Signal Classification}
|
||||||
|
Signals are classified for diagnostic purposes, the aim being to
|
||||||
|
distinguish healthy signals from those with certain heart
|
||||||
|
conditions or abnormalities. This is most commonly achieved by extracting
|
||||||
|
sets of feature vectors from PCG signals, followed by their
|
||||||
|
classification, most commonly using machine learning algorithms for
|
||||||
|
automatic classification. The features extracted and classification
|
||||||
|
algorithms applied vary across the literature based on factors such as
|
||||||
|
the diagnostic aims of the classification and computing performance
|
||||||
|
requirements.\\
|
||||||
|
|
||||||
|
Artificial neural networks and support vector machines have proven to
|
||||||
|
be popular choices for classification. Much success has been seen in
|
||||||
|
employing these machine learning techniques for classification across
|
||||||
|
both PCG and ECG data for conditions such as chronic heart failure,
|
||||||
|
atrial fibrillation and flutter, diastolic murmurs, and for general
|
||||||
|
pathology detection~\parencite{Cathers1995, Wu1995, Bung2000,
|
||||||
|
Lubaib2016, Maji2014, Ari2010, Maglogiannis2009}. Results do vary based
|
||||||
|
on the combination of features and exact classification methods used.
|
||||||
|
However, encouraging results are presented with highly accurate
|
||||||
|
classifications for general abnormality detection and for more specific
|
||||||
|
pathological condition detection.\\
|
||||||
|
|
||||||
|
However, there is a lack of research into other machine learning
|
||||||
|
techniques such as Bayesian classification~\parencite{Lubaib2016},
|
||||||
|
$k$-Nearest Neighbour~\parencite{Quiceno-Manrique2010a, Lubaib2016} and
|
||||||
|
Linear Regression~\parencite{Orhan2013}. Studies that utilize these
|
||||||
|
methods for classification have generated promising results. There is
|
||||||
|
therefore the potential for further research into exploiting the
|
||||||
|
benefits of these techniques for heart abnormality detection.\\
|
||||||
|
|
||||||
|
The selection of features used for classification also depends
|
||||||
|
predominantly on the aims of the classification. For general
|
||||||
|
abnormality classification, spectral representations such as wavelet
|
||||||
|
transformations, VFCDM, FFTs and MFCCs are a popular
|
||||||
|
choice~\parencite{Bung2000, Wu1995, Yaghouby2009, Dash2009}. Their
|
||||||
|
multi-dimensional representation of the data reveals details in the
|
||||||
|
signal that cannot be seen through a one-dimensional time series alone,
|
||||||
|
allowing for more accurate classification. Higher-level statistical
|
||||||
|
methods are also widely used for both time and spectral
|
||||||
|
representations~\parencite{Bung2000, Quiceno-Manrique2010a,
|
||||||
|
Schmidt2015, Dash2009, Yaghouby2009}. These allow for
|
||||||
|
classification based on more specific statistical properties of the
|
||||||
|
data. Orhan highlights that higher-level statistical methods
|
||||||
|
may add considerable complexity to computations, and so care should be
|
||||||
|
taken, particularly when considering systems in a real-time
|
||||||
|
context~\citeyearpar{Orhan2013}.
|
||||||
|
|
||||||
|
\section{Dataset}
|
||||||
|
|
||||||
|
\section{Design}
|
||||||
|
The system aims to provide robust heart abnormality detection for PCG signals,
|
||||||
|
such that use of the system could reliably recommend further medical attention
|
||||||
|
when necessary.
|
||||||
|
\subsection{Signal Segmentation}
|
||||||
|
\subsection{Choice of features}
|
||||||
|
\subsection{Feature selection method}
|
||||||
|
dimensionality reduction
|
||||||
|
\subsection{Classification Algorithm}
|
||||||
|
|
||||||
|
\section{Implementation}
|
||||||
|
\section{Evaluation}
|
||||||
|
Group cross-validation
|
||||||
|
Weighted specificity and weighted accuracy measures
|
||||||
|
\section{Conclusion}
|
||||||
|
|
||||||
However, there is a lack of research into other machine learning
|
|
||||||
techniques such as Bayesian classification~\parencite{Lubaib2016},
|
|
||||||
$k$-Nearest Neighbour~\parencite{Quiceno-Manrique2010a, Lubaib2016} and
|
|
||||||
Linear Regression~\parencite{Orhan2013}. Studies that utilize these
|
|
||||||
methods for classification have generated promising results. There is
|
|
||||||
therefore the potential for further research into exploiting the
|
|
||||||
benefits of these techniques for heart abnormality detection.\\
|
|
||||||
|
|
||||||
The selection of features used for classification also depends
|
|
||||||
predominantly on the aims of the classification. For general
|
|
||||||
abnormality classification, spectral representations such as wavelet
|
|
||||||
transformations, VFCDM, FFTs and MFCCs are a popular
|
|
||||||
choice~\parencite{Bung2000, Wu1995, Yaghouby2009, Dash2009}. Their
|
|
||||||
multi-dimensional representation of the data reveals details in the
|
|
||||||
signal that cannot be seen through a one-dimensional time series alone,
|
|
||||||
allowing for more accurate classification. Higher-level statistical
|
|
||||||
methods are also widely used for both time and spectral
|
|
||||||
representations~\parencite{Bung2000, Quiceno-Manrique2010a,
|
|
||||||
Schmidt2015, Dash2009, Yaghouby2009}. These allow for
|
|
||||||
classification based on more specific statistical properties of the
|
|
||||||
data. Orhan highlights that higher-level statistical methods
|
|
||||||
may add considerable complexity to computations, and so care should be
|
|
||||||
taken, particularly when considering systems in a real-time
|
|
||||||
context~\citeyearpar{Orhan2013}.
|
|
||||||
|
|
||||||
\end{itemize}
|
|
||||||
|
|
||||||
\pagebreak{}
|
\pagebreak{}
|
||||||
\printbibliography{}
|
\printbibliography{}
|
||||||
|
|||||||
Binary file not shown.
|
After Width: | Height: | Size: 14 KiB |