\documentclass{scrartcl}
\usepackage{enumitem}
\usepackage[british]{babel}
\usepackage[style=apa, backend=biber]{biblatex}
\DeclareLanguageMapping{british}{british-apa}
\usepackage{url}
\usepackage{float}
\restylefloat{table}
\usepackage{perpage}
\MakePerPage{footnote}
\usepackage{abstract}
\usepackage{graphicx}
% Create hyperlinks in bibliography
\usepackage{hyperref}

\renewcommand{\familydefault}{\sfdefault}
\usepackage{fontspec}
\setmainfont{Arial}

\usepackage{blindtext}
\setkomafont{disposition}{\normalfont\fontsize{12}{17}\bfseries}
\setkomafont{section}{\normalfont\fontsize{12}{17}\bfseries}
\setkomafont{subsection}{\normalfont\fontsize{12}{17}\bfseries\itshape}
\setkomafont{subsubsection}{\normalfont\fontsize{12}{17}\itshape}

\graphicspath{{./resources/}}
\addbibresource{~/PerryPerrySource/LaTeX/library.bib}

\usepackage{etoolbox}
\makeatletter
\expandafter\patchcmd\csname\string\maketitle\endcsname
{\vskip\z@\@plus3fill}
{\vskip\z@\@plus2fill\box\abstractbox\vskip\z@\@plus1fill}
{}{}
\makeatother

\DeclareCiteCommand{\citeyearpar}
{}
{\mkbibparens{\bibhyperref{\printdate}}}
{\multicitedelim}
{}

\begin{document}
\title{Descriptor Driven Concatenative Synthesis Tool}
% \subtitle{\LARGE{Abstract Draft}}
\author{Sam Perry}

\maketitle

\begin{abstract}
A command-line tool is proposed for the exploration of a new form of audio
synthesis known as ``concatenative synthesis'' (CS): a form of synthesis that uses
perceptual audio analyses to arrange small segments of audio based on their
characteristics. The tool is designed to synthesise representations of an
input sound using a database of source sounds. This involves the
segmentation and analysis of both the input sound and the database, the matching of
input segments to their closest segments from the database, and the
re-synthesis of the closest matches from the database to produce the final
result.\\

The aim was to produce a tool capable of generating high-quality sonic
representations of an input, and to present a variety of examples that
demonstrate the breadth of possibilities that this style of synthesis has
to offer. There are a number of other projects that use this form of
synthesis; however, this project aims primarily to explore the further
potential offered through the offline processing of large databases, on
which considerably less research exists.\\

Results demonstrate the wide variety of sounds that can be produced using
this method of synthesis. A number of technical issues are outlined that
impeded the overall quality of results and the efficiency of the software.
However, the project clearly demonstrates the strong potential for this
type of synthesis to be used for creative purposes.
\end{abstract}
\section*{Background}
The concept of constructing a new sound by arranging collections of smaller
sounds has gained popularity in the past 30 years through the introduction
of ``granular synthesis''. Granular synthesis works on the theory that any
sound can be described through the arrangement of smaller samples (referred
to as ``grains''). This representation of sound allows for the temporal
decomposition and rearrangement of real-world samples, with the potential to
create new ``complex, dynamically-evolving
sounds''~\parencite[p.1]{Roads1988}.\\

Concatenative synthesis is a form of synthesis that has developed
significantly over the past 15 years, driven by recent advancements in
technology. The key advancements are easier access to large databases of
audio and the development of methods for extracting useful information from
these databases automatically~\parencite[p.1]{Schwarz2006}. CS utilises
these technologies to provide a content-based extension to granular
synthesis: by analysing a database of source grains, grains can be
differentiated based on their characteristics. These characteristics can
then be used for grain selection in the process of synthesising output for
a wide range of applications~\parencite[p.102]{Schwarz2007}.

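
As a rough formal sketch of the two ideas above (the notation is introduced
here purely for illustration and is not taken from the cited works), a
granular output signal $s$ can be written as a sum of $K$ windowed grains
$g_k$, each placed at an onset sample $n_k$:
\[
  s[n] = \sum_{k=1}^{K} w[n - n_k]\, g_k[n - n_k],
\]
where $w$ is a short window that is zero outside the length of a grain.
Content-based selection then amounts to choosing, for each input segment $i$
with descriptor vector $\mathbf{a}_i$, the database grain whose descriptor
vector $\mathbf{b}_j$ is closest:
\[
  \hat{\jmath}(i) = \mathop{\mathrm{arg\,min}}_{1 \le j \le N} \, d(\mathbf{a}_i, \mathbf{b}_j),
  \qquad
  d(\mathbf{a}, \mathbf{b}) = \sqrt{\sum_{m=1}^{M} (a_m - b_m)^2}.
\]
The Euclidean distance over $M$ descriptors is only one possible choice,
assumed here for simplicity; in practice, descriptors are often normalised or
weighted before comparison.
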
\subsection*{Related Works}
A number of programs utilise CS to achieve various goals. The process has
been used in areas such as speech synthesis, instrument
synthesis and creative sound design.\\
The wide range of applications demonstrates the versatility of this
synthesis technique. It differs from traditional synthesis methods, which
focus on defining sets of rules for emulating real sounds, through its use
of real recorded samples. Transforming
samples that have been directly recorded from a source preserves the subtle nuances
of the source's sound. These would be difficult to reproduce
using other synthetic methods for modelling an
instrument~\parencite[p.24]{Maestre2009a}.

\subsubsection*{Speech Synthesis}
Creating a natural and intelligible realisation is an important factor when
developing a speech synthesis system.
% TODO: add part about continuity here
The Talkapillar project is one example of how highly convincing results
are possible with CS. Through careful analysis of a vocal database, the
project aims to impose the qualities of the database voice on an input
voice. The words of the input speaker are thereby
transformed to appear as if they were spoken by the voice in the
database~\parencite{Hueber}.

\subsubsection*{Instrument Synthesis}
Progress has also been made in improving the quality of instrument
synthesis. As with speech synthesis, the direct use of samples allows for
natural-sounding results, which provides a method for reproducing real
instruments convincingly. An important aspect of instrument synthesis is
that of performer expression. The reproduction of performance qualities
such as dynamics, timbre and timing is an important factor, and CS has been
used to reproduce these aspects effectively. This is achieved through the
splicing of grains based on their characteristics to form musical phrases.
Just as a performer might transition seamlessly from one musical phrase to
the next, the CS software joins grains to produce the varying
articulations and transitions. This contrasts with the traditional approach to
sampling, where samples are played in isolation, resulting in a
discontinuity between adjacent samples. The commercial software synthesiser
``Synful'' (\url{www.synful.com}) successfully demonstrates the use of
CS to produce highly convincing recreations of orchestral instrument
performances~\parencite[p.82]{Lindemann2007}.

\subsubsection*{Creative Sound Design}
The flexibility of CS allows for creativity in a broader context than simply
emulating real-world instruments and speech. It can also be used as a tool
to explore the possibilities for synthesising new abstract sounds for
creative purposes.
One example of this is Tremblay and Schwarz's~\citeyearpar{Tremblay2010}
use of ``audio mosaicing'' to explore electroacoustic sample banks. In this
context, CS is used to synthesise matches from a corpus
database in response to real-time input from an electric bass. Significance is placed
on linking the playback of grains to the expressivity of the performer. The
use of perceptually based audio descriptors to match the source to the
target allows the performer to navigate the database intuitively based on
factors such as the pitch and timbre of the bass guitar. The result is a
performance that mixes characteristics of the bass guitar performance
with the qualities of the corpus database to create a hybrid of the two.

Further forms of concatenative synthesis include spectral resynthesis (see Tremblay, sect.~4.1.2).

\section*{Concatenator Program Design and Implementation}
Aims:
\begin{itemize}
  \item Instrument resynthesis onto a pre-existing source sound, rather than from scratch onto things like MIDI notes.
  \item Offline processing to allow large databases to be used. Disadvantage: loss of feedback between performer and system, as described in PA's paper.
\end{itemize}
\subsection*{Framework Design}
\subsection*{Descriptor Implementation}
\subsection*{Matching Algorithms}
\subsection*{Synthesis and Transformations}
\subsection*{Command-Line Interface}
A high quantity of parameters is very time consuming~\parencite{Petrushin2007}.

\section*{Results and Evaluation}

\section*{Research Limitations/Potential Development}
Given the limited time frame and complexity of modern approaches to this
form of synthesis, only a basic implementation was possible.

Use of RPM?~\parencite[p.82]{Lindemann2007}

\section*{Conclusion}

\printbibliography

\end{document}