Final proof changes

2017-02-12 21:33:03 +00:00
parent 604132cb0b
commit f98231f9a2
1 changed files with 472 additions and 430 deletions
@@ -1,430 +1,472 @@
-\documentclass[titlepage]{scrartcl}
+\documentclass{scrartcl}
-\usepackage{enumitem}
+\usepackage{enumitem}
-\usepackage[british]{babel}
+\usepackage[british]{babel}
-\usepackage[style=apa, backend=biber]{biblatex}
+\usepackage[style=apa, backend=biber, maxnames=99]{biblatex}
-\DeclareLanguageMapping{british}{british-apa}
+\DeclareLanguageMapping{british}{british-apa}
-\usepackage{url}
+\usepackage{filecontents}
-\usepackage{float}
+\usepackage{url}
-\restylefloat{table}
+\usepackage{float}
-\usepackage{perpage}
+\restylefloat{table}
-\MakePerPage{footnote}
+\usepackage{perpage}
-\usepackage{abstract}
+\MakePerPage{footnote}
-\usepackage{graphicx}
+\usepackage{abstract}
-% Create hyperlinks in bibliography
+\usepackage{graphicx}
-\usepackage{hyperref}
+% Create hyperlinks in bibliography
-
+\usepackage{hyperref}
-\renewcommand{\familydefault}{\sfdefault}
+
-\usepackage{fontspec}
+\renewcommand{\familydefault}{\sfdefault}
-\setmainfont{Arial}
+\usepackage{fontspec}
-
+\setmainfont{Arial}
-\usepackage{blindtext}
+
-\setkomafont{disposition}{\normalfont\fontsize{12}{17}\bfseries}
+\usepackage{blindtext}
-\setkomafont{section}{\normalfont\fontsize{12}{17}\bfseries}
+\setkomafont{disposition}{\normalfont\fontsize{12}{17}\bfseries}
-\setkomafont{subsection}{\normalfont\fontsize{12}{17}\itshape}
+\setkomafont{section}{\normalfont\fontsize{12}{17}\bfseries}
-\setkomafont{subsubsection}{\normalfont\fontsize{12}{17}\itshape}
+\setkomafont{subsection}{\normalfont\fontsize{12}{17}\itshape}
-
+\setkomafont{subsubsection}{\normalfont\fontsize{12}{17}\itshape}
-\graphicspath{{./resources/}}
+
-\addbibresource{~/Documents/library.bib}
+\graphicspath{{./resources/}}
-
+\addbibresource{~/Documents/library.bib}
-\usepackage[affil-it]{authblk}
+
-
+% Hack to fix problem with underscores and other special characters in
-% \usepackage{etoolbox}
+% Mendeley bibliography.
-% \makeatletter
+\DeclareSourcemap{% Used when .bib/Bibliography is compiled, not when document is
-% \expandafter\patchcmd\csname\string\maketitle\endcsname
+    \maps{
-%   {\vskip\z@\@plus3fill}
+        \map{% Replaces '{\_}', '{_}' or '\_' with just '_'
-%   {\vskip\z@\@plus2fill\box\abstractbox\vskip\z@\@plus1fill}
+            \step[fieldsource=url,
-%   {}{}
+                  match=\regexp{\{\\\_\}|\{\_\}|\\\_},
-% \makeatother
+                  replace=\regexp{\_}]
-% 
+        }
-
+        \map{% Replaces '{'$\sim$'}', '$\sim$' or '{~}' with just '~'
-\DeclareCiteCommand{\citeyearpar}
+            \step[fieldsource=url,
-    {}
+                  match=\regexp{\{\$\\sim\$\}|\{\~\}|\$\\sim\$},
-    {\mkbibparens{\bibhyperref{\printdate}}}
+                  replace=\regexp{\~}]
-    {\multicitedelim}
+        }
-    {}
+    }
-
+}
-\begin{document}
+
-    \title{Descriptor Driven Concatenative Synthesis Tool for Python}
+\usepackage[affil-it]{authblk}
-    % \subtitle{\LARGE{Abstract Draft}}
+
-    \author{S. Perry\thanks{E-mail: \texttt{\href{mailto:u1265119@unimail.hud.ac.uk}{u1265119@unimail.hud.ac.uk}}}}
+% \usepackage{etoolbox}
-    \date{Dated: \today}
+% \makeatletter
-
+% \expandafter\patchcmd\csname\string\maketitle\endcsname
-    \maketitle
+%   {\vskip\z@\@plus3fill}
-
+%   {\vskip\z@\@plus2fill\box\abstractbox\vskip\z@\@plus1fill}
-    \begin{abstract} 
+%   {}{}
-    A command-line tool and Python framework is proposed for the exploration of
+% \makeatother
-    a new form of audio synthesis known as ``concatenative synthesis'': A form
+% 
-    of synthesis that uses perceptual audio analyses to arrange small segments
+
-    of audio based on their characteristics.  The tool is designed to
+\DeclareCiteCommand{\citeyearpar}
-    synthesise representations of an input sound using a database of source
+    {}
-    sounds. This involves the segmentation and analysis of both the input sound
+    {\mkbibparens{\bibhyperref{\printdate}}}
-    and database, matching of input segments to their closest segment from the
+    {\multicitedelim}
-    database, and the re-synthesis of the closest matches to produce the final
+    {}
-    result. The  project aims to provide a tool capable of generating high
+\newenvironment{keywords}%
-    quality sonic representations of an input, to present a variety of examples
+   {\begin{trivlist}\item[]{\bfseries\sffamily Keywords:}\ }%
-    that demonstrated the breadth of possibilities that this style of synthesis
+   {\end{trivlist}}
-    has to offer and to provide a robust framework on which concatenative
+
-    synthesis projects can be developed easily.\\
+\begin{document}
-
+    \section*{Descriptor driven concatenative synthesis tool for Python}
-    Results demonstrate the wide variety of sounds that can be produced using
+    Sam Perry\\
-    this method of synthesis. A number of technical issues are outlined that
+    E-mail: \href{mailto:u1265119@unimail.hud.ac.uk}{u1265119@unimail.hud.ac.uk}
-    impeded the overall quality of results and efficiency of the software.
+    \section*{Abstract} 
-    However, the project clearly demonstrates the strong potential for this
+    A command-line tool and Python framework is proposed for the exploration of
-    type synthesis to be used for creative purposes.
+    a new form of audio synthesis known as `concatenative synthesis', a form
-    \end{abstract}
+    of synthesis that uses perceptual audio analyses to arrange small segments
-
+    of audio based on their characteristics.  The tool is designed to
-    \section*{Background}
+    synthesise representations of an input target sound using a source database
-    The concept of constructing a new sound by arranging collections of smaller
+    of sounds. This involves the segmentation and analysis of both the input
-    sounds has gained popularity in the past 30 years through the introduction
+    sound and database, the matching of input segments to their closest segment
-    of ``Granular Synthesis''. Granular synthesis works on the theory that any
+    from the database, and the re-synthesis of the closest matches to produce
-    sound can be described through the arrangement of smaller samples (referred
+    the final result.\\
-    to as ``grains''). This representation of sound allows for the temporal
+    The  project aims to provide a tool capable of generating high-quality
-    decomposition and re-arranging of real-world samples, with the potential to
+    sonic representations of an input, to present a variety of examples that
-    create new ``complex, dynamically-evolving
+    demonstrate the breadth of possibilities that this style of synthesis has
-    sounds.''~\parencite[p.1]{Roads1988}\\
+    to offer and to provide a robust framework on which concatenative synthesis
-
+    projects can be developed easily. The purpose of this project was primarily
-    Concatenative synthesis (CS) is a form of synthesis that has developed
+    to highlight the potential for further development in the area of
-    significantly over the past 15 years, driven by recent advancements in
+    concatenative synthesis, and to provide a simple and intuitive tool that
-    technology. Key advancements have been in easy access to large databases of
+    could be used by composers for sound design and experimentation. The
-    audio and the development of methods for extracting useful information from
+    breadth of possibilities for creating new sounds offered by this method of
-    these databases automatically~\parencite[p.1]{Schwarz2006a}.  CS utilises
+    synthesis makes it ideal for digital sound design and electroacoustic
-    these technologies to provide a content-based extension to granular
+    composition.\\
-    synthesis; by analysing a database of source grains, grains can be
+    Results demonstrate the wide variety of sounds that can be produced using
-    differentiated based on their characteristics.  These characteristics can
+    this method of synthesis. A number of technical issues are outlined that
-    then be used for grain selection in the process of synthesizing output for
+    impeded the overall quality of results and efficiency of the software.
-    a wide range of applications~\parencite[p.102]{Schwarz2007}.
+    However, the project clearly demonstrates the strong potential for this
-
+    type of synthesis to be used for creative purposes.
-    \section*{Related Works}
+    \begin{keywords}
-    A number of programs utilize CS to achieve various goals. The process has
+        Concatenative synthesis; Python; audio descriptor; audio analysis; command-line tool; Python framework; Python sound
-    been used for applications in areas such as speech synthesis, instrument
+    \end{keywords}
-    synthesis and for applications in creative sound design.\\
+
-    The wide range of applications demonstrates the versatility of this
+    \section*{Acknowledgments}
-    synthesis technique. It differs from traditional synthesis methods through
+    I would like to thank A Harker for his advice and guidance as a mentor
-    the use of real recorded samples, as opposed to traditional methods that
+    throughout the project, and A Harker and P Chen for access to their
-    focus on defining sets of rules for emulating real sounds. By transforming
+    vocal samples database.  Thanks also to D Chaplin for his creative input
-    samples that have been directly recorded from a source, the subtle nuances
+    in generating results.
-    of the source's sound are preserved. These would be difficult to reproduce
+    \pagebreak
-    using other synthetic methods for modelling an
+    
-    instrument~\parencite[p.24]{Maestre2009}.
+    \section*{Background}
-
+    The concept of constructing a new sound by arranging collections of smaller
-    \subsection*{Speech Synthesis}
+    sounds has gained popularity in the past 30 years through the introduction
-    Creating a natural and intelligible realisation is an important factor when
+    of granular synthesis, which works on the theory that any sound can be
-    developing a speech synthesis system. The Talkapillar project is one such
+    described through the arrangement of smaller samples (referred to as
-    example of how highly convincing results are possible with CS. Through
+    `grains'). This representation of sound allows for the temporal
-    careful analysis of a vocal database, the project aims to impose the
+    decomposition and re-arranging of real-world samples, with the potential to
-    qualities of the database voice on an input voice. This would result in the
+    create new `complex, dynamically-evolving
-    words of the input speaker being transformed to appear as if they were
+    sounds'~\parencite[p.1]{Roads1988}.\\
-    spoken by the voice in the database.~\parencite{Hueber}
+
-    
+    Concatenative synthesis (CS) is a form of synthesis that has developed
-    \subsection*{Instrument Synthesis}
+    significantly over the past 15 years, driven by recent advancements in
-    Progress has also been made in improving the quality of instrumental
+    technology. The key advancements have been in ease of access to large databases of
-    synthesis. As with speech synthesis, the use of samples directly allows for
+    audio and the development of methods for extracting useful information from
-    natural sounding results, which provides a method for reproducing real
+    these databases automatically~\parencite[p. 1]{Schwarz2006a}.  CS utilises
-    instruments convincingly. Another important aspect of instrumental synthesis is that of performer
+    these technologies to provide a content-based extension to granular
-    expression. The reproduction of performance qualities such as dynamics,
+    synthesis; analysis of a database of source grains enables them to be
-    timbre and timing are essential when emulating a real instrument and CS has
+    differentiated based on their characteristics.  These characteristics can
-    been used to effectively reproduce these aspects. This is achieved through
+    then be used for grain selection in the process of synthesising output for
-    splicing of grains based on their expressive characteristics to form
+    a wide range of applications~\parencite[p. 102]{Schwarz2007}.
-    musical phrases.  For example, just as a violinist might transition
+
-    seamlessly from one articulation to the next, the CS software will join
+    \section*{Related works}
-    grains to produce the variation in articulations. This contrasts the
+    A number of programs utilise CS to achieve various goals. The process has
-    traditional approach to sampling, where samples are played in isolation,
+    been used for applications in areas such as speech synthesis, instrument
-    resulting in a discontinuity between adjacent samples~\parencite[p.82]{Lindemann2007}. 
+    synthesis and creative sound design.\\
-    The Catapillar project is one such example of this use of CS. 
+    The wide range of applications demonstrates the versatility of this
-    By using a Viterbi algorithm, the project is able to calculate the
+    synthesis technique. It differs from traditional synthesis methods as it
-    smoothest overall transition between grains across the output, resulting
+    uses real recorded samples, as opposed to traditional methods that focus on
-    in convincing synthesis of orchestral instrument performances~\parencite[p.5]{Schwarz2003}.
+    defining sets of rules for emulating real sounds. By transforming samples
-
+    that have been directly recorded from a source, the subtle nuances of the
-    \subsection*{Creative Sound Design}
+    source's sound are preserved. These would be difficult to reproduce using
-    The flexibility of CS allows for creativity in a broader context than simply
+    other synthetic methods for modelling an
-    emulating real-world instruments and speech. It can also be used as a tool
+    instrument~\parencite[p. 24]{Maestre2009}.
-    to explore the possibilities for synthesizing new abstract sounds for
+
-    creative purposes.\\
+    \subsection*{Speech synthesis}
-    A prominent project in this area of CS is IRCAM's CataRT
+    Creating a natural and intelligible realisation is an important factor when
-    project~\parencite{Schwarz2006a}. The project focuses on the playback of
+    developing a speech-synthesis system. The Talkapillar project is one such
-    source grains based on their proximity to a target in multi-dimensional
+    example of how highly convincing results are possible with CS. Through
-    descriptor space.  By providing a target point in the descriptor space, the
+    careful analysis of a vocal database, the project aims to impose the
-    user is able to navigate the database, playing selections of samples that
+    qualities of the database voice on an input voice. This would result in the
-    are nearest to the target. This allows the user to explore the database
+    words of the input speaker being transformed to appear as if they were
-    intuitively through a graphic user interface, selecting a point in
+    spoken by the voice in the database~\parencite{Hueber}.
-    2-dimensional space with the mouse. Grains are then played back in
+    
-    real-time to create an ``audio mosaic''.\\
+    \subsection*{Instrument synthesis}
-    Alternatively, target audio can be provided and analysed to create a target
+    Progress has also been made in improving the quality of instrumental
-    location based on it's location in the descriptor space.  Tremblay and
+    synthesis. As with speech synthesis, the use of samples directly allows for
-    Schwarz's~\citeyearpar{Tremblay2010} use of CataRT to explore
+    natural-sounding results, which provides a method for reproducing real
-    electroacoustic sample banks demonstrates the creative potential of this
+    instruments convincingly. Another important aspect of instrumental synthesis is that of performer
-    method. CS is used in this context as a means for synthesizing matches in a
+    expression. The reproduction of performance qualities such as dynamics,
-    corpus database to real-time input from an electric bass.  Significance is
+    timbre and timing is essential when emulating a real instrument and CS has
-    placed on linking the playback of grains to the expressivity of the
+    been used to effectively reproduce these aspects. This is achieved through
-    performer. The use of perceptually based audio descriptors to match the
+    splicing of grains based on their expressive characteristics to form
-    source to the target allows the performer to navigate the database
+    musical phrases.  For example, just as a violinist might transition
-    naturally based on factors such as the pitch and timbre of the bass
+    seamlessly from one articulation to the next, the CS software will join
-    guitar. The result is a performance that mixes characteristics of both the
+    grains to produce the variation in articulations. This contrasts with the
-    bass guitar output and the qualities of the corpus database to create a
+    traditional approach to sampling, where samples are played in isolation,
-    hybrid of the two.\\
+    resulting in a discontinuity between adjacent samples~\parencite[p. 82]{Lindemann2007}. 
-
+    The Caterpillar project is one such example of this use of CS. 
-    This is by no means an exhaustive overview of the projects and techniques
+    By using a Viterbi algorithm, the project is able to calculate the
-    that explore the vast possibilities of CS. For further information, please
+    smoothest overall transition between grains across the output, resulting
-    refer to: ``Concatenative Synthesis - The Early
+    in convincing synthesis of orchestral instrument performances~\parencite[p. 5]{Schwarz2003}.
-    Years''~\parencite{Schwarz2006b}
+
-
+    \subsection*{Creative sound design}
-    \section*{Concatenator}
+    The flexibility of CS allows for creativity in a broader context than simply
-    The concatenator project aims to provide an open source set of tools that
+    emulating real-world instruments and speech. It can also be used as a tool
-    allows composers to generate a variety of CS driven realisations for
+    to explore the possibilities for synthesising new abstract sounds for
-    sound design purposes.  In addition, the project aims to provide an
+    creative purposes.\\
-    intuitive API that Python programmers might use as the fundamental building
+    A prominent project in this area of CS is IRCAM's CataRT
-    blocks to build further concatenative synthesis applications on.  
+    project~\parencite{Schwarz2006a}. The project focuses on the playback of
-    The result is a framework and command-line interface, built in Python, for
+    source grains based on their proximity to a target in multi-dimensional
-    easy access to basic CS techniques.   
+    descriptor space. Providing a target point in the descriptor space enables the
-    The current implementation can be used for the concatenation of a source
+    user to navigate the database, playing selections of samples that
-    database onto target audio files, using a range of perceptual audio
+    are nearest to the target. This allows the user to explore the database
-    descriptors for matching. Database management, simple matching and
+    intuitively through a graphical user interface, selecting a point in
-    synthesis algorithms are used to achieve this, and are described in the
+    2-dimensional space with the mouse. Grains are then played back in
-    following sections.
+    real time to create an `audio mosaic'.\\
-
+    Alternatively, target audio can be provided and analysed to create a target
-    \section*{Program Design and Implementation}
+    location based on its location in the descriptor space.  Tremblay and
-    The Concatenator project consists of a number of components that work
+    Schwarz's~\citeyearpar{Tremblay2010} use of CataRT to explore
-    together to produce the final output. A complete description of all
+    electroacoustic sample banks demonstrates the creative potential of this
-    components and there usage in the concatenator project can be found in it's
+    method. CS is used in this context as a means of synthesising matches in a
-    complete documentation at:\\
+    corpus database to real-time input from an electric bass.  Significance is
-
+    placed on linking the playback of grains to the expressivity of the
-    *PERMANENT URL FOR DOCUMENTATION NEEDED*\\
+    performer. The use of perceptually based audio descriptors to match the
-
+    source to the target allows the performer to navigate the database
-    Output is generated by analysing overlapping segments of audio (known as
+    naturally based on factors such as the pitch and timbre of the bass
-    grains) from both the target sound and the source database, then searching
+    guitar. The result is a performance that mixes characteristics of both the
-    for the closest matching grain in the source database to the target sound.
+    bass guitar output and the qualities of the corpus database to create a
-    Finally, the output is generated by applying a hanning window and
+    hybrid of the two.\\
-    overlap-adding the best matches. Each component will be discussed in detail
+    This is by no means an exhaustive overview of the projects and techniques
-    in the following sections.\\
+    that explore the vast possibilities of CS. Further information can be found
-
+    in the article by~\textcite{Schwarz2006b}.
-    When designing the concatenator framework, ease of development, use and
+    \pagebreak
-    extensibility were primary considerations. It was for these reasons that
+
-    the framework was written in the Python programming language. Python has
+    \section*{Concatenator}
-    grown in popularity in the scientific community recently, primarily due to
+    The Concatenator project aims to provide an open-source tool that allows
-    it's focus on productivity, readability and the large number of efficient
+    composers to generate a variety of CS-driven realisations for sound design
-    numeric processing libraries available (Numpy, SciPy, Scikitlearn
+    purposes.  In addition, the project aims to provide an intuitive API that
-    etc...)~\parencite[p.11]{Fangohr2014}. This makes Python a good choice for
+    Python programmers might use as the fundamental building blocks on which to
-    quickly developing ideas in the context of audio signal processing.
+    build further CS applications.  The result is a framework and command-line
-    Unfortunately, the language does sacrifice processing speed for simplicity
+    interface, built in Python, for easy access to basic CS techniques. All
-    and as a result is not suitable for real-time signal processing. Other
+    relevant material including source code, results, and documentation can be
-    performance focused languages such as C++ are better suited to this type of
+    found in the official online project repository~\parencite{perry2016a}.
-    processing. However, it was decided that the increase in productivity, lack
+    The current implementation can be used for the concatenation of a source
-    of prior CS research in Python and the author's previous experience,
+    database onto target audio files, using a range of perceptual audio
-    made it the most suitable choice for this project.\\
+    descriptors for matching.  Database management, simple matching and
-
+    synthesis algorithms are used to achieve this, and are described in the
-    The choice to limit the project to offline processing has both positive and
+    following sections. \\
-    negative implications on the function of the project. A key disadvantage to
+
-    this type of processing is the lack of possibility for any live performance
+    The features and uses of this tool are most comparable to those of the
-    aspect. This method provides no way of exploring the feedback between
+    MATConcat project~\parencite{sturm2004}, which was developed to provide an
-    performer and system in a live environment, comparable to the work of
+    open source tool for generating similar representations of audio in MATLAB.
-    Tremblay and Schwarz's~\citeyearpar{Tremblay2010}.
+    Although there are technical differences such as the number of descriptors
-    However, there are advantages to offline processing that would not be
+    available for each project, both share a similar focus on the
-    possible in a real-time context.\\
+    electroacoustic compositional applications of CS. Results produced for the
-    One significant advantage is that databases can afford to be far larger
+    MATConcat project are comparable to those of the Concatenator project, and
-    than they could in real time. Without the requirement to process output in
+    both work offline to produce results. The Concatenator project builds on
-    a short period of time, more time can be taken to search vast databases in
+    this by providing a wider variety of descriptors and the ability to
-    the hope that the closest match to a target will be found.\\
+    artificially enhance matches (as discussed in the~\hyperref[sat]{Synthesis
-    Another advantage is in the global view of a target that can be taken in an
+    and Transformations section}).
-    offline approach. Because the complete audio file is available from the
+
-    start of processing, techniques can be applied that consider the output as
+    \section*{Program design and implementation}
-    a whole rather than on a grain by grain basis. This allows for algorithms
+    The Concatenator project consists of a number of components that work
-    such as the Viterbi algorithm to find the sequence of grains that provide
+    together to produce the final output. A complete description of all
-    the best continuity, as demonstrated in the Catapillar
+    components and their usage in the Concatenator project can be found in its
-    project~\parencite[p.4]{Schwarz2003} This would not be possible in
+    documentation.\\
-    real-time, as audio is processed on-the-fly.\\
+
-
+    Output is generated by analysing overlapping segments of audio (known as
-    An additional consideration was the method to be used for controlling the
+    grains) from both the target sound and the source database, then searching
-    target to be matched to. It was decided that the most interesting results
+    for the closest matching grain in the source database to the target sound.
-    would be produced through the matching of grains to a target audio file, as
+    Finally, the output is generated by applying a Hann window and
-    opposed to other approaches such as matching to MIDI scores. In this sense
+    overlap-adding the best matches. Each component is discussed in detail
-    the project is a form of offline audio-mosaicking tool similar to that of
+    in the following sections.\\
-    CataRT.
+
-    
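The output stage described above (windowing each matched grain and overlap-adding the results) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the Concatenator implementation: the names hann_window and overlap_add are hypothetical, all grains are assumed to have equal length, and plain lists are used in place of NumPy arrays to keep the sketch self-contained.

```python
import math


def hann_window(length):
    """Return a periodic Hann window of the given length as a list of floats."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * n / length) for n in range(length)]


def overlap_add(grains, hop):
    """Window each grain with a Hann window and sum the grains at hop-sample intervals.

    Assumption for this sketch: every grain has the same length.
    """
    if not grains:
        return []
    grain_len = len(grains[0])
    window = hann_window(grain_len)
    # The output must hold the last grain in full, starting at its hop offset.
    output = [0.0] * (hop * (len(grains) - 1) + grain_len)
    for index, grain in enumerate(grains):
        start = index * hop
        for n, sample in enumerate(grain):
            output[start + n] += sample * window[n]
    return output


# Hypothetical usage: four constant grains of 8 samples with 50% overlap.
grains = [[1.0] * 8 for _ in range(4)]
result = overlap_add(grains, hop=4)
```

With a hop of half the grain length, the periodic Hann windows sum to one wherever two grains overlap, so a constant input is reconstructed without amplitude modulation in the overlapped region.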
+    When designing the Concatenator framework, ease of development, use and
-    \subsection*{Descriptor Implementation}
+    extensibility were primary considerations. It was for these reasons that
-    In order to differentiate between grains, a number of audio descriptors
+    the framework was written in the Python programming language. Python has
-    were implemented. Audio descriptors are used to measure a specific
+    grown in popularity in the scientific community recently, primarily due to
-    characteristic of a signal~\parencite[p.31]{Lerch2012}. For example, an RMS
+    its focus on productivity, readability and the large number of efficient
-    descriptor was implemented to give an indication of the overall intensity
+    numeric processing libraries available (\cite{Pedregosa2011,
-    of the grain. Another example is the F0 descriptor implemented to give a
+    Fangohr2014, Scipy}). This makes Python a good choice for
-    value relating to pitch for harmonic grains. These values could then be
+    quickly developing ideas in the context of audio signal processing.
-    used by the matching algorithm in order to find the best match between the
+    Unfortunately, the language does sacrifice processing speed for simplicity, 
-    source and target grains. A full description of all descriptors implemented
+    and, as a result, is not suitable for real-time signal processing. Other
-    can be found in the Concatenator documentation.\\
+    performance-focused languages such as C++ are better suited to this type of
-    Due to time constraints on the project, only a limited number of basic
+    processing. However, it was decided that the increase in productivity, lack
-    descriptors were implemented. For this reason, it was ensured that new
+    of prior CS research in Python and the author's previous experience made
-    descriptors could be added easily to the project. The object oriented
+    it the most suitable choice for this project.\\
-    design of the descriptors provides the potential for quick development of
+
-    any future descriptors to be added to the project. 
+    The choice to limit the project to offline processing has both positive and
-
+    negative implications for the function of the project. A key disadvantage
-    \subsection*{Database Design}
+    of this type of processing is the lack of possibility for any live
-    When generating descriptors for large database, large amounts of data are
+    performance aspect. This method provides no way of exploring the feedback
-    produced and so an efficient method of storing and retrieving the data was
+    between performer and system in a live environment, as in the work
-    needed to manage this. The Python interface to the HDF5
+    of Tremblay and Schwarz~\citeyearpar{Tremblay2010}.
-    filesystem~\parencite{Collette2016} was chosen for it's simplicity and
+    However, there are advantages to offline processing that would not be
-    ability to compress the data automatically. Storing Numpy arrays of
+    possible in a real-time context.\\
-    descriptors in groups allowed for quick and easy access to analyses from a
+    One significant advantage is that databases can afford to be far larger
-    single, organized source.
+    than they could be in real time. Without the requirement to process output in
-
+    a short period of time, more time can be taken to search vast databases in
-    \subsection*{Matching Algorithms}
+    the hope that the closest match to a target will be found.\\
-    In order to match grains using the descriptor values, a matching algorithm
+    Another advantage is in the global view of a target that can be taken in an
-    was required. Initially a brute force matcher was used to compare each
+    offline approach. Because the complete audio file is available from the
-    descriptor value in the target to all values of the same descriptor type in
+    start of processing, techniques can be applied that consider the output as
-    the source. However, it quickly became apparent that this approach would be
+    a whole, rather than on a grain-by-grain basis. This allows for algorithms
-    far to slow, particularly for larger database.\\
+    such as the Viterbi algorithm to find the sequence of grains that provides
-    For this reason, a k-dimensional tree search algorithm was used in an
+    the best continuity, as demonstrated in the Caterpillar
-    effort to improve matching efficiency.  This approach produced the same
+    project~\parencite[p. 4]{Schwarz2003}. This would not be possible in
-    results as the brute force matcher, but by arranging descriptors in a tree
+    real time, as audio is processed `on the fly'.\\
-    structure, a far more efficient search to find the best match was possible.
+
-    This reduced matching time considerably.
+    An additional consideration was the method to be used for controlling the
-
+    target to which the grains would be matched. It was decided that the most
-    \subsection*{Synthesis and Transformations}
+    interesting results would be produced through the matching of grains to a
-    The final step in the program is to synthesize the matched output.
+    target audio file, as opposed to other approaches such as matching to MIDI
-    This process consisted of:
+    scores. In this sense the project is a form of offline audio-mosaicking
-    \begin{enumerate}
+    tool similar to that of CataRT.
-        \item Retrieving the best grain matches returned by the matching algorithm
+    
-        \item Applying a window function
+    \subsection*{Descriptor Implementation}
-        \item Overlapping the grains 
+    In order to differentiate between grains, a number of audio descriptors
-        \item Transforming grains to match target
+    were implemented. Audio descriptors are used to measure a specific
-        \item Saving the result to a file
+    characteristic of a signal~\parencite[p. 31]{Lerch2012}. For example, a
-    \end{enumerate}
+    root mean square (RMS) descriptor was implemented to give an indication of
-    Initially, grains were not transformed to better match the target.  This
+    the overall intensity of the grain. Another example is the fundamental
-    worked effectively for large databases, however it was observed that
+    frequency (F0) descriptor, which was implemented to give a value relating
-    results synthesized using small databases were of a lower quality as the
+    to pitch for harmonic grains. These values could then be used by the
-    chance of a closely matched grain was lower. To account for this, methods
+    matching algorithm in order to find the best match between the source and
-    for altering grains to better match their target were implemented.  It was
+    target grains.\\
-    decided that the two most significant characteristics to alter were the
+    Owing to time constraints on the project, only a limited number of basic
-    pitch and intensity of the grains.  By scaling the grains by the difference
+    descriptors were implemented. For this reason, the project was designed so that new
-    between the source and target RMS, it was possible to impose a closer
+    descriptors could easily be added. The object-oriented
-    intensity on a grain. Likewise, by shifting the pitch of a grain by the
+    design of the descriptors provides the potential for quick development of
-    difference, it was possible to better match the pitch contour of the output
+    any future descriptors to be added. 
-    to that of the target audio.  This improved the results significantly in
+
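    The object-oriented descriptor design described above can be sketched
    as follows. This is a minimal illustration only: the class names
    \texttt{Descriptor} and \texttt{RMSDescriptor}, the \texttt{analyse}
    method and the helper function are hypothetical and are not taken from
    the project's code. A grain is assumed to be a plain sequence of samples.

\begin{verbatim}
```python
import math
from abc import ABC, abstractmethod


class Descriptor(ABC):
    """Hypothetical base class: one subclass per audio descriptor."""

    name = "descriptor"

    @abstractmethod
    def analyse(self, grain):
        """Return a single value describing a grain (a sequence of samples)."""


class RMSDescriptor(Descriptor):
    """Root mean square: an indication of the overall intensity of a grain."""

    name = "rms"

    def analyse(self, grain):
        if not grain:
            return 0.0
        return math.sqrt(sum(s * s for s in grain) / len(grain))


def analyse_grain(grain, descriptors):
    """Run every descriptor on a grain and collect the values by name."""
    return {d.name: d.analyse(grain) for d in descriptors}
```
\end{verbatim}

    Under this assumed layout, a new descriptor would only need to subclass
    the base class and implement a single method.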
-    smaller databases, as poor matches could be improved to match the target
+    \subsection*{Database design}
-    more convincingly.
+    When generating descriptors for large databases, large amounts of data are
-
+    produced and so an efficient method of storing and retrieving the data was
-    \subsection*{Command line Interface}
+    needed in order to manage this. The Python interface to the HDF5
-    In order to make the framework accessible to users, a commandline interface
+    filesystem~\parencite{Collette2016} was chosen for its simplicity and
-    was developed. By supplying arguments to the program, users could alter
+    ability to compress the data automatically. Storing NumPy arrays of
-    parameters and experiment freely with the tool.  Although this interface
+    descriptors in groups allowed for quick and easy access to analyses from a
-    was sufficient for testing and experimentation, it quickly became apparent
+    single, organised source.
-    that there were too many parameters to pass to the program via the command
+
-    line interface on each run. A configuration file parser was created to
+    \subsection*{Matching algorithms}
-    address this issue, allowing users to specify default parameters that would
+    In order to match grains using the descriptor values, a matching algorithm
-    be used by the program on each run. The combination of these interfaces
+    was required. Initially a brute-force matcher was used to compare each
-    provided an effective means for accessing all of the framework's features.
+    descriptor value in the target to all values of the same descriptor type in
-
+    the source. However, it quickly became apparent that this approach would be
-    \subsection*{Documentation and API}
+    far too slow, particularly for a larger database.\\
-    Complete documentation for the project was created in order to make the
+    For this reason, a k-dimensional tree search algorithm was used in an
-    project as user friendly as possible for both developers and users.  As a
+    effort to improve matching efficiency.  This approach produced the same
-    result, a full API is available alongside examples of use and instructions
+    results as the brute-force matcher, but by arranging descriptors in a tree
-    for commandline operation. This was created in the hope that it might form
+    structure, a far more efficient search to find the best match was possible.
-    a usable package that developers can build on quickly and effectively to
+    This reduced matching time considerably.
-    build other CS projects, allowing for easier access to Python based CS than
+
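    The difference between the two matching approaches can be illustrated
    with the following pure-Python sketch. It is not the project's
    implementation: the function names are hypothetical, and each grain is
    assumed to be represented by a tuple of descriptor values. Both
    functions return a closest point, but the tree search skips branches
    that cannot contain a better match.

\begin{verbatim}
```python
import math


def brute_force_nearest(points, query):
    """Compare the query against every point: one distance check per point."""
    return min(points, key=lambda p: math.dist(p, query))


def build_kdtree(points, depth=0):
    """Recursively split the points on alternating axes at the median."""
    if not points:
        return None
    axis = depth % len(points[0])
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2
    return {
        "point": points[mid],
        "axis": axis,
        "left": build_kdtree(points[:mid], depth + 1),
        "right": build_kdtree(points[mid + 1:], depth + 1),
    }


def kdtree_nearest(node, query, best=None):
    """Return a stored point closest to the query, pruning distant branches."""
    if node is None:
        return best
    point, axis = node["point"], node["axis"]
    if best is None or math.dist(point, query) < math.dist(best, query):
        best = point
    diff = query[axis] - point[axis]
    if diff < 0:
        near, far = node["left"], node["right"]
    else:
        near, far = node["right"], node["left"]
    best = kdtree_nearest(near, query, best)
    # Search the far side only if the splitting plane is closer than the
    # best match found so far.
    if abs(diff) < math.dist(best, query):
        best = kdtree_nearest(far, query, best)
    return best
```
\end{verbatim}

    In this sketch the tree is built once per database and then reused for
    every target grain, which is where the saving over repeated brute-force
    comparison would come from.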
-    is currently available. The command line interface is equally documented to
+    \subsection*{Synthesis and transformations} \label{sat}
-    allow users to create their own realisations quickly and easily so that
+    The final step in the program was to synthesise the matched output.
-    this project may be used for creative sound design purposes.
+    This process consisted of:
-
+    \begin{enumerate}
-    \section*{Results and Evaluation}
+        \item Retrieving the best grain matches returned by the matching algorithm
-    Overall, results generated by this project showed promise; a variety of
+        \item Applying a window function
-    transformations were generated using open source instrument databases to
+        \item Overlapping the grains 
-    demonstrate the projects potential for sound design application. This
+        \item Transforming grains to match the target
-    tested the project's ability to convincingly impose qualities of an
+        \item Saving the result to a file
-    instrument onto target sounds. A variety of examples are provided that
+    \end{enumerate}
-    outline the style of synthesis aimed for. These range from imposing
+    Initially, grains were not transformed to better match the target.  This
-    acoustic guitar qualities on an electric guitar to imposing stringed
+    worked effectively for large databases; however, it was observed that
-    instrument qualities on vocal melodies. Current results have a clear
+    results synthesised using small databases were of a lower quality, as the
-    synthetic nature, but still clearly exhibit some of the main
+    chance of a closely matched grain was lower. To account for this, methods
-    characteristics of the database used.\\
+    for altering grains to better match their target were implemented.  It was
-
+    decided that the two most significant characteristics to alter were the
-    \noindent
+    pitch and intensity of the grains.  By scaling the grains by the difference
-    Concatenator project examples that demonstrate current results can be found at:\\
+    between the source and target RMS, it was possible to impose a closer
-
+    intensity on a grain. Likewise, by shifting the pitch of a grain by the
-    *PERMENANT URL FOR RESULTS NEEDED*\\
+    difference, it was possible to better match the pitch contour of the output
-
+    to that of the target audio.  This improved the results significantly in
-    \section*{Research Limitations/Potential Development}
+    smaller databases, as poor matches could be improved to match the target
-    In retrospect, a great deal of time was spent trying to improve the
+    more convincingly.
-    efficiency of the project. Although this was necessary, as initial tests
+
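    The two transformations described above can be sketched as follows. This
    is an illustration under stated assumptions rather than the project's
    method: the function names are hypothetical, the intensity match is
    expressed as a gain ratio between target and source RMS, and the pitch
    shift is performed by naive linear-interpolation resampling, which also
    changes the length of the grain.

\begin{verbatim}
```python
import math


def rms(grain):
    """Root mean square of a grain (a sequence of samples)."""
    return math.sqrt(sum(s * s for s in grain) / len(grain)) if grain else 0.0


def match_intensity(grain, target_rms):
    """Scale a grain so that its RMS equals the RMS of the target grain."""
    source_rms = rms(grain)
    if source_rms == 0.0:
        return list(grain)  # a silent grain cannot be scaled up
    gain = target_rms / source_rms
    return [s * gain for s in grain]


def pitch_shift_semitones(source_f0, target_f0):
    """Shift (in semitones) needed to move the source pitch onto the target."""
    return 12.0 * math.log2(target_f0 / source_f0)


def resample(grain, ratio):
    """Naive pitch shift by linear-interpolation resampling.

    A ratio of 2.0 raises the pitch by an octave but also halves the length.
    """
    if not grain:
        return []
    length = max(1, int(len(grain) / ratio))
    out = []
    for i in range(length):
        pos = i * ratio
        lo = int(pos)
        hi = min(lo + 1, len(grain) - 1)
        frac = pos - lo
        out.append(grain[lo] * (1.0 - frac) + grain[hi] * frac)
    return out
```
\end{verbatim}

    For a source grain at 220\,Hz and a target at 440\,Hz, the resampling
    ratio in this sketch would be 2.0, corresponding to a shift of twelve
    semitones.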
-    were not feasible on most databases, it had a negative impact on the time
+    \subsection*{Command-line interface}
-    available for developing perceptual qualities of the output. As a result of
+    In order to make the framework accessible to users, a command-line interface
-    this, the overall quality of output may perhaps not be as natural as that of
+    was developed. By supplying arguments to the program, users could alter
-    other projects in this area. This is apparent in the vocal -> string
+    parameters and experiment freely with the tool.  Although this interface
-    instrument examples. Phrases tend to begin and end abruptly, failing to
+    was sufficient for testing and experimentation, it quickly became apparent
-    replicate any defined attack or decay of the string instruments, as would
+    that there were too many parameters to pass to the program via the
-    does give output it's own synthetic characteristic, which may be desirable
+    command-line interface on each run. A configuration file parser was created to
-    does give output it's own synthetic characteristic, which may be desirable
+    address this issue, allowing users to specify default parameters that would
-    as perfect reproduction of an instrument may not be the reason for using
+    be used by the program on each run. The combination of these interfaces
-    this tool.\\
+    provided an effective means for accessing all of the framework's features.
-    In Addition, the high computation required results in large amounts of time
+
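    The combination of a configuration file and command-line arguments
    described above might look like the following sketch, which uses only
    the Python standard library. The section name, option names and default
    values are hypothetical and chosen purely for illustration.

\begin{verbatim}
```python
import argparse
import configparser

# Hypothetical option names and defaults, for illustration only.
BUILTIN_DEFAULTS = {"grain_size": "100", "overlap": "2"}


def load_defaults(config_text):
    """Read default parameters from INI-style text, falling back to built-ins."""
    parser = configparser.ConfigParser()
    parser.read_dict({"synthesis": BUILTIN_DEFAULTS})
    parser.read_string(config_text)
    return dict(parser["synthesis"])


def parse_args(argv, defaults):
    """Command-line arguments override the values from the configuration."""
    parser = argparse.ArgumentParser(description="Concatenative synthesis sketch")
    parser.add_argument("source")
    parser.add_argument("target")
    parser.add_argument("--grain-size", type=int,
                        default=int(defaults["grain_size"]))
    parser.add_argument("--overlap", type=int,
                        default=int(defaults["overlap"]))
    return parser.parse_args(argv)
```
\end{verbatim}

    With this arrangement, values that rarely change would live in the
    configuration file, and only the parameters being experimented with
    would need to be supplied on each run.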
-    needed to produce high quality results. An end user may not have the
+    \subsection*{Documentation and API}
-    patience required to to reach the quality of results that might be
+    Complete documentation for the project was created in order to make the
-    possible. This is in part a set back of the Python language, and could be
+    project as user-friendly as possible for both developers and users.  As a
-    better accounted for with further work on profiling the performance of the
+    result, a full API is available alongside examples of use and instructions
-    tool.\\
+    for command-line operation. This was created in the hope that it might form
-    However, the fundamental concepts such as descriptor matching and
+    a usable package that developers can build on quickly and effectively to
-    transforming matches to better fit the target, that are used in the most
+    build other CS projects, allowing for easier access to Python-based CS than
-    sophisticated CS projects, have been implemented in this project to
+    is currently available. The command-line interface is equally documented to
-    satisfying creative effect. As a proof of concept, this project displays
+    allow users to create their own realisations quickly and easily so that
-    the possibilities for CS in Python and there is evidently potential for
+    this project may be used for creative sound design purposes.
-    further development in this area.\\
+
-
+    \section*{Results and evaluation}
-    There are a number of further improvements that could be made to this
+    Overall, the results generated by this project showed promise; a variety of
-    project in order to improve the quality of results and extend it's overall
+    transformations were generated using open-source instrument databases to
-    usefulness. Some initial ideas for improvements are detailed in this
+    demonstrate the project's potential for sound design application. This
-    section below. These range from reasonably simple modifications that could
+    tested the project's ability to convincingly impose qualities of an
-    not be implemented purely due to time constraints, to more complex ideas
+    instrument onto target sounds. A variety of examples are provided that
-    that may take a considerable amount of work.\\
+    outline the style of synthesis aimed for. These range from imposing
-
+    acoustic guitar qualities on an electric guitar to imposing stringed
-    The current implementation uses only a small and relatively basic subset of
+    instrument qualities on vocal melodies. Current results have a clear
-    the audio descriptors available. This limits the analysis of audio and thus
+    synthetic nature, but still clearly exhibit some of the main
-    the quality of matches. Using a larger set of more advanced descriptors may
+    characteristics of the database used.
-    improve quality from this perspective. One way would be to incorporate the
+
-    open source Essentia audio descriptors~\parencite{Essentia2016} giving the
+    \section*{Research limitations/potential development}
-    project access to a vast quantity of descriptors for analysis.\\
+    In retrospect, a great deal of time was spent trying to improve the
-
+    efficiency of the project. Although this was necessary, as initial tests
-    Replacing the hanning window function used for grain windowing with a short
+    were not feasible on most databases, it had a negative impact on the time
-    cross fade at grain overlaps should reduce amplitude modulation, resulting
+    available for developing perceptual qualities of the output. As a result of
-    in smoother transitions between grains. This might be further improved
+    this, the overall quality of output may not be as natural as that
-    through calculating the point of maximum similarity by cross-correlating
+    of other projects in this area. This is apparent in the
-    overlapping sections, as described by~\textcite[p.191-193]{Zolzer2011} in
+    vocal~\textrightarrow~string instrument examples. Phrases tend to begin and
-    the SOLA algorithm.\\
+    end abruptly, failing to replicate any defined attack or decay of the
-
+    string instruments, as would be expected when hearing a string instrument
-    A lack of continuity between grains was observed in results, most likely
+    naturally. Conversely, this does give the output its own synthetic
-    due to the lack of any comparison of selected grains. A Viterbi algorithm
+    characteristic, which may be desirable as perfect reproduction of an
-    could be used to account for this, allowing for a search to be done amongst
+    instrument may not be the reason for using this tool.\\
-    the top matches to find the optimal set of grains. This takes advantage of
+    In addition, the amount of computation required means that a large amount of time is
-    the offline nature of the project and has been shown to work effectively in
+    needed to produce high-quality results. An end user may not have the
-    the Talkapillar project~\parencite{Hueber}.
+    patience required to reach the quality of results that might be
-
+    possible. This is in part a drawback of the Python language, and could be
-    Although the HDF5 filesystem allows for easy storage of descriptor values,
+    better accounted for with further work on profiling the performance of the
-    it also has drawbacks that limits the functionality of the project. One
+    tool.\\
-    significant problem is that it is difficult to implement parallel
+    However, the fundamental concepts such as descriptor matching and
-    processing using the library and for this reason asynchronous processing was
+    transforming matches to better fit the target, which are used in the most
-    not implemented in the project. An alternative method of storage may
+    sophisticated CS projects, have been implemented in this project to
-    accommodate this more easily, allowing for the speed-ups possible through
+    satisfying creative effect. As a proof of concept, this project displays
-    asynchronous processing. The overall design of the database management was
+    the possibilities for CS in Python and there is evidently potential for
-    also relatively naive and may benefit from being replaced by a technology
+    further development in this area.\\
-    such as an SQL database or similar. This has been shown to work effectively
+
-    in work such as the CataRT project~\parencite[p.3]{Schwarz2006a}.
+    There are a number of further improvements that could be made to this
-    
+    project in order to improve the quality of results and extend its overall
-    \section*{Conclusion}
+    usefulness. These range from reasonably simple modifications that could not
-    This project has provided a functioning Python based CS project with much
+    be implemented purely due to time constraints, to more complex ideas that
-    potential for further development. Given the number of technical issues
+    may take a considerable amount of work. The following is a list of some
-    faced with this style of synthesis (from the big data issues faced with
+    initial ideas for improvements.\\
-    analysis storage, to high efficiency requirements for processing the large
+
-    quantities of data), overall this project appears to work effectively. It
+    \begin{itemize}
-    provides a new and accessible means for tapping some of the vast amount of
+        \item The current implementation uses only a small and relatively basic
-    potential that concatenative synthesis has to offer.\\ With the ever
+            subset of the audio descriptors available. This limits the analysis
-    increasing quality of technology, it is predicted that new techniques such
+            of audio and thus the quality of matches. Using a larger set of
-    as concatenative synthesis may grow further in popularity, leading to an
+            more advanced descriptors may improve quality from this
-    increasing number of possibilities in this area of sound synthesis. It is
+            perspective. One way would be to incorporate the open-source
-    hoped that this project might aid in the highlighting the possibilities
+            Essentia audio descriptors~\parencite{Essentia2016}, giving the
-    offered by this form of synthesis and demonstrate some of the technical
+            project access to a vast quantity of descriptors for analysis.
-    obstacles that must be addressed to design a CS project successfully.
+
-
+        \item Replacing the Hann window function used for grain windowing
-    \section*{Acknowledgments}
+            with a short cross-fade at grain overlaps should reduce amplitude
-    The author would like to thanks A. Harker for his advice and guidance
+            modulation, resulting in smoother transitions between grains. This
-    as a mentor throughout the project, and to A. Harker and P. Chen for access
+            might be further improved through calculating the point of maximum
-    to their vocal samples database.  Thanks also to D. Chaplin for his
+            similarity by cross-correlating overlapping sections, as described
-    creative input in generating results.
+            by~\textcite[pp.~191--193]{Zolzer2011} in the Synchronous OverLap Add
-
+            (SOLA) algorithm.
-    \printbibliography
+
-\end{document}
+        \item A lack of continuity between grains was observed in results, most
            likely owing to the lack of any comparison of selected grains. A
            Viterbi algorithm could be used to account for this, allowing for a
            search to be done amongst the top matches to find the optimal set
            of grains. This takes advantage of the offline nature of the
            project and has been shown to work effectively in the Talkapillar
            project~\parencite{Hueber}.
        \item Although the HDF5 filesystem allows for easy storage of
            descriptor values, it also has drawbacks that limit the
            functionality of the project. One significant problem is that it is
            difficult to implement parallel processing using the library and
            for this reason asynchronous processing was not implemented in the
            project. An alternative method of storage may accommodate this more
            easily, allowing for the speed-ups possible through asynchronous
            processing. The overall design of the database management was also
            relatively naive and may benefit from being replaced by a
            technology such as an SQL database or similar. This has been shown
            to work effectively in work such as the CataRT
            project~\parencite[p. 3]{Schwarz2006a}.
    \end{itemize}
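    The Viterbi-based selection suggested in the list above can be sketched
    as follows. This is a minimal illustration, not an implementation from
    this project or from Talkapillar: the function name and the two cost
    functions are hypothetical, and every target grain is assumed to have at
    least one candidate match.

\begin{verbatim}
```python
def viterbi_select(candidates, match_cost, join_cost):
    """Pick one candidate per target grain, minimising the total cost.

    candidates[t] is the list of top matches for target grain t.
    match_cost(t, c) scores candidate c against target grain t.
    join_cost(a, b) scores the continuity of playing b directly after a.
    """
    if not candidates:
        return []
    # Each entry: (cost of the cheapest path ending in a candidate, that path)
    best = [(match_cost(0, c), [c]) for c in candidates[0]]
    for t in range(1, len(candidates)):
        step = []
        for c in candidates[t]:
            cost, path = min(
                ((prev_cost + join_cost(prev_path[-1], c), prev_path)
                 for prev_cost, prev_path in best),
                key=lambda item: item[0],
            )
            step.append((cost + match_cost(t, c), path + [c]))
        best = step
    return min(best, key=lambda item: item[0])[1]
```
\end{verbatim}

    Because the whole target is available offline, a search of this kind can
    prefer a slightly worse individual match when it gives a smoother
    sequence overall, which a grain-by-grain matcher cannot do.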
    \section*{Conclusion}
    This project has provided a functioning Python-based CS project with much
    potential for further development. Given the number of technical issues
    faced with this style of synthesis (from the big data issues faced with
    analysis storage, to high efficiency requirements for processing the large
    quantities of data), overall this project appears to work effectively. It
    provides a new and accessible means for tapping some of the vast amount of
    potential that concatenative synthesis has to offer.\\ 
    With the ever-increasing quality of technology, it is predicted that new
    techniques such as concatenative synthesis may grow further in popularity,
    leading to an increasing number of possibilities in this area of sound
    synthesis. It is hoped that this project might aid in highlighting the
    possibilities offered by this form of synthesis and demonstrate some of the
    technical obstacles that must be addressed to design a CS project
    successfully.
    \pagebreak
    \printbibliography
 \end{document}