Files
documents/DSP/Assignment_2_report/U1265119_Assignment2_report.tex
T

440 lines
20 KiB
TeX

\documentclass[titlepage]{scrartcl}
\usepackage{enumitem}
\usepackage[british]{babel}
\usepackage[style=apa, backend=biber]{biblatex}
\DeclareLanguageMapping{british}{british-apa}
\usepackage{url}
\usepackage{float}
\restylefloat{table}
\usepackage{perpage}
\MakePerPage{footnote}
\usepackage{abstract}
\usepackage{graphicx}
% Create hyperlinks in bibliography
\usepackage{hyperref}
\usepackage[T1]{fontenc}
\usepackage[utf8]{inputenc}
\usepackage{blindtext}
\setkomafont{disposition}{\normalfont\bfseries}
\graphicspath{
{./resources/},
}
\addbibresource{~/PerryPerrySource/LaTeX/DSP_Bibliography.bib}
\newsavebox{\abstractbox}
\renewenvironment{abstract}
{\begin{lrbox}{0}\begin{minipage}{\textwidth}
\begin{center}\normalfont\sectfont\abstractname\end{center}\quotation}
{\endquotation\end{minipage}\end{lrbox}%
\global\setbox\abstractbox=\box0 }
\usepackage{etoolbox}
\makeatletter
\expandafter\patchcmd\csname\string\maketitle\endcsname
{\vskip\z@\@plus3fill}
{\vskip\z@\@plus2fill\box\abstractbox\vskip\z@\@plus1fill}
{}{}
\makeatother
\DeclareCiteCommand{\citeyearpar}
{}
{\mkbibparens{\bibhyperref{\printdate}}}
{\multicitedelim}
{}
\begin{document}
\title{DSP Assignment 2\\Digital Audio Effects Implementation}
\subtitle{\LARGE{Technical Report}}
\author{Sam Perry\\U1265119}
\date{}
\begin{abstract}
This report outlines the implementation, testing and evaluation of
three real-time audio effects using the dsPIC 30F4013 digital signal
processor. Design choices, system specification, and final results
are analysed.\\
System specification is compared with alternative chips to understand
the quality of the processor in relation to the state of DSP
technology.\\
Design choices are compared to choices made in previous work to outline
the changes required to implement such effects outside unlimited
resource systems.\\
Final system performance is then discussed to determine further
changes that could be made to improve performance.
\end{abstract}
\maketitle
\section{Background/Literature:\\Digital Signal Processor/Microcontroller
Overview}
A digital signal processor (DSP) is a form of specialized microprocessor
designed specifically for the processing of signals (such as audio signals
in this case).~\parencite[p.11-12]{libtak2006ieh}. The DSP is used as part
of a microcontroller that provides an interface for components such as
Memory, data IO and peripherals that form the signal processing system.
When considering the quality of a DSP system, there are many component
specifications to consider, that contribute to the overall performance of
the system. These include:
\begin{itemize}
\item CPU
\item Memory
\item System bus
\item Bit depth
\item Samplerate
\item D/A \& A/D converters
\end{itemize}
In order to understand the specification of the
dsPIC~\parencite{mt2004dspic} in relation to the standard of DSPs
available, it will be compared to three other digital signal processors:
\begin{itemize}
\item Texas Instruments TMS320F2806~\parencite{ti2016pm}
\item Freescale 56F8025~\parencite{fs2006dsc}
\item Analog Devices ADSP-2126~\parencite{ad2012adsp}
\end{itemize}
\subsection{General Computing Factors}
Factors such as memory and CPU speed affect all computing systems. These
address the system's ability to perform calculations and handle data.
\subsubsection{Memory}
There are two types of memory that constitute the total memory of a
microcontroller: program memory and data memory.
\paragraph{Program Memory}~\\
Most modern microcontrollers use flash memory for the storage of code
executed at runtime. This is known as ROM (read-only memory) and may be
reffered to as the ``program memory''. The size of this memory determines
the amount of code that can be stored at any one time in the system.
This has implications with regards to the complexity of the program, as
insufficient program memory will limit the number of instructions that can
be used for programming.~\parencite[p.317]{raf2014fdlm}
\begin{table}[H]
\centering
\caption{DSP maximum available ROM memory}
\label{my-label}
\begin{tabular}{ll}
\textbf{Model} & \textbf{Program Memory}\\
dsPIC30F4013/PIC24 & 48kb \\
TMS320F2806 & 256kb \\
56F8025 & 32kb \\
ADSP-2126 & up to 4mb
\end{tabular}
\end{table}
These specifications indicate that the memory of the hardware used is
relatively small compared to other devices available. This limits the
complexity of programs that can be stored on the device. This accounts for
the limited complexity of designs described in section \ref{design}.
Clearly the ADSP has much more memory and has been designed for highly
complex applications, however the TMS32F2806 series would also have
sufficient memory for the programs designed in this task.
\paragraph{Data Memory}~\\
RAM (random-access memory) is volatile memory that is used for storing data
used when executing instructions. Unlike ROM memory, RAM can be both read
from and written to at runtime and is used for the storage of data that can
change as instructions are executed. This is used for the storage of data
such as audio buffers and parameter
variables.~\parencite[p.317]{raf2014fdlm} The amount of RAM available
determines the maximum size of data such as buffers for audio delays. The
speed of the RAM is also integral to the overall performance of the system,
as sufficient speed is required to read and write buffers as instructions
are executed by the CPU (see section \ref{CPU})
\begin{table}[H]
\centering
\caption{DSP maximum available RAM memory}
\label{my-label}
\begin{tabular}{ll}
\textbf{Model} & \textbf{RAM Size}\\
dsPIC30F4013/PIC24 & 2kb \\
TMS320F2806 & 100kb \\
56F8025 & 4kb \\
ADSP-2126 & up to 2mb
\end{tabular}
\end{table}
With the lowest memory of any processor in this comparisson, it is clear
that the dsPIC is lacking in this area. This creates severe limitations in
buffers and in the case of this project, resulted in severe delay line
limitations. Again, the TMS32F2806 series would have a far more acceptable
performance, allowing for multiple delays in a range of seconds as opposed
to the small fractions of a second possible with the dsPIC.
\subsubsection{System Bus}
The system bus handles data IO between components such as the CPU
and memory. For performance comparisson, the system bus's data width is
used to determine the maximum amount of memory that the CPU is able to
write to directly.~\parencite[p.318]{raf2014fdlm}
For example a 16BIT system can support a maximum of $2^{16}$ memory
addresses. This equals a maximum memory size of 64Kb of memory directly
accessible by the CPU. However, a 32BIT system can support $2^{32}$ memory
addresses which results in
{\raise.17ex\hbox{$\scriptstyle\mathtt{\sim}$}}4GB of potential
memory.~\parencite[p.34]{sd2006mfes}\\
(When analysing specifications of DSP systems it is important not to
confuse the processing architechuture with the bit depth of the DSP
components as they affect different aspects of the system.)
\begin{table}[H]
\centering
\caption{DSP bus architecture}
\label{my-label}
\begin{tabular}{ll}
\textbf{Model} & \textbf{Architecture Type}\\
dsPIC30F4013/PIC24 & 16BIT \\
TMS320F2806 & 32BIT \\
56F8025 & 16BIT \\
ADSP-2126 & 32/40BIT
\end{tabular}
\end{table}
The bus architecture is largely dictated by the total amount of memory
available to the system and this is reflected in the specification shown in
the table above. A 16BIT architecture is expected, given the 2kb of memory
supported by the dsPIC and is marginally faster than higher bit
architectures as there are less memory locations to allocate.
\subsubsection{CPU}\label{CPU}
The CPU (Central Processing Unit) is the component that executes
instructions and performs calculations on data. The speed at which the CPU
can execute instructions and store results is critical to the performance
of the system.
\paragraph{Clock Speed}~\\
Measured in cycles per second (Hz), the clock speed defines the number of
calculations that can be performed per second. A higher clock speed
indicates a higher number of calculations performed per second. This is
particularly significant in realtime DSP application as a sufficient number
of calculations must be performed in a set period of time in order to
process audio as quickly as it is provided to the system. If the clock
speed is not sufficient, this may result in instructions being missed due
to an interrupt before the processor has been able to complete them. This
can create artefacts in output audio.~\parencite[p.34]{sd2006mfes}\\
It should be noted that this is not an entirely accurate measurement for
speed as different manufacturers have different definitions of a
``calculation''.~\parencite[p.3-4]{bdti2000cdp} For a more accurate
representation, an impartial benchmark from BDTI has been included where
possible.~\parencite[p.1]{bdti2013pg}
\begin{table}[H]
\centering
\caption{DSP CPU clock speed}
\label{my-label}
\begin{tabular}{lll}
\textbf{Model} & \textbf{CPU Clock Speed}
&\textbf{BDTI(sim)mark2000\textsuperscript{TM}}\\
dsPIC30F4013/PIC24 & 70MHz & 190\\
TMS320F2806 & 90MHz & n/a\\
56F8025 & 80MHz & 110\\
ADSP-2126 & 200MHz & 1090\\
\end{tabular}
\end{table}
From this, it can be seen that the dsPIC and 56F8025 perform significantly
worse than the ADSP. This has a negative impact on samplerate as it must be
lowered to account for the lack of processing power. A sample rate of 8Khz
was used as opposed to the 44.1Khz samplerate that would be feasable on the
ADSP processor.
\paragraph{Architecture}~\\
The CPU architecture refers to the design of memory units and bus layouts.
The most common designs are von Neumann and Harvard architectures. The von
Neumann architecture combines program and memory data, allowing only serial
access of memory. By seperrating program and data memory, the Harvard
architecture allows for simulataneous access of data and program memory,
making it the more efficient of the two designs.~\parencite[p.320-321]{raf2014fdlm}
\begin{figure}[H]
\caption{von Neumann CPU architecture~\parencite[p.320]{raf2014fdlm}}
\makebox[\textwidth]{\includegraphics[width=0.75\textwidth]{neumann}}
\end{figure}
\begin{figure}[H]
\caption{Harvard CPU architecture~\parencite[p.321]{raf2014fdlm}}
\makebox[\textwidth]{\includegraphics[width=0.75\textwidth]{harvard}}
\end{figure}
\begin{table}[H]
\centering
\caption{DSP bus architecture}
\label{my-label}
\begin{tabular}{ll}
\textbf{Model} & \textbf{Architecture Type}\\
dsPIC30F4013/PIC24 & Modified Harvard\\
TMS320F2806 & Harvard\\
56F8025 & Dual Harvard\\
ADSP-2126 & Super Harvard
\end{tabular}
\end{table}
Most modern DSPs use variations on the Harvard architecture for it's
performance. Variations of this have been used in all examples.
\section{DSP Specific Factors}
DSP specific factors relate to components specifically affecting the system's
ability to handle audio signals. These will determine the quality of audio
manipulation and affect the computational requirements for the system. This
section briefly covers the computational impact of the DSP specific
factors.
\subsection{A/D \& D/A Converters}
A/D and D/A converters are required for audio input and output. Depending
on the microcontroller used, these may be integrated in the main circuit
board or available as peripherals that can be attached via ports (as is the
case for the PIC24 board used). The quality and accuracy of these
converters will clearly have an effect on the audio output and using
converters that can perform as transparently as possible is essential for a
high quality system.~\parencite[p.147-152]{kadis2012sosr}
\subsubsection{Samplerate}
The samplerate defines the frequency at which a measurement will be taken
from the input audio. This is significant due to the quantity of
information returned from the A/D converter for processing. Higher sample
rates generate more measurements per second and thus require more values
to be computed per second as discussed in section \ref{CPU}
\begin{figure}[H]
\caption{Illustration of sine wave sampling~\parencite[p.140]{kadis2012sosr}}
\makebox[\textwidth]{\includegraphics[width=\textwidth]{quantization}}
\end{figure}
\begin{figure}[H]
\caption{Illustration of quantization error resulting from a low sample
rate.~\parencite[p.146]{kadis2012sosr}}
\makebox[\textwidth]{\includegraphics[width=\textwidth]{sampling_error}}
\end{figure}
\subsubsection{Bit Depth}
The audio bit depth determines the accuracy to which amplitudes can be
differentiated. Higher bit depths result in a higher dynamic range in the
signal. This has implications for the converters as higher bit rates
require higher accuracy in generating values for each sample.~\parencite[p.143-145]{kadis2012sosr}
\section{Design/Analysis}\label{design}
Effect implementation was largely dicatated by the limitations of the
dsPIC. As the device had severe memory and processing limitations, it was
not possible to create effects to the standard of the first assignment. As
a result, effects were created to emulate the perceptual effect of an echo,
reverb and chorus under these limitations.
\subsection{Echo}
The echo was implemented using a single tap FIR filter. This was intended
to maximise available memory for the delay time. Through stripping out all
unnessesary features, a maximum delay size of 700 samples was acheived with
the addition of two UI switches that could be used for increasing and
decreasing delay size at runtime. At a samplerate of 8Khz, this allowed
for a single delay of \textgreater50ms (93.750ms) defined as the minimum
for the definition of an echo
by Z{\"o}lzer~\citeyearpar[p.]{zolzer2011dafx}
\begin{figure}[H]
\caption{Echo example}
\makebox[\textwidth]{\includegraphics[width=0.4\textwidth]{echo_measure}}
\end{figure}
\subsection{Chorus}
Three delays of variable size were used to emulate the multi-instrument
effect created by a chorus. This created three phase shifted versions of
the original signal which created the perception of multiple instruments.
The delay time modulation was not possible due to the computational power
required to implement a modulated delay line on a sample by sample basis.
\subsection{Reverb}
The reverb implementation involved a combination of an FIR and IIR filter,
as defined by Z{\"o}lzer~\citeyearpar{zolzer2011dafx}. This performed
poorly when compared to the moorer reverb structure used in assignment 1,
however the complexity of such a structure would require superior
performance in almost all aspects of the system.
The design used created a delayed echo that could act as a crude reverb.
\subsection{User Interface}
The UI was designed using eight switches and the LCD to create a
navigatable menu that can be used for the selection of effect, effect
parameters and volume control. The effect parameter menu is able to update
it's items dynamically based on the active effects. The desired effect can
then be selected by cycling through, using repeated presses of the effects
menu button. Parameter variables can then be increased and decreased for
the selected effect using two switches.\\
It was not possible to use the UI in conjunction with any actual audio
effect as the UI would need to be processed after any audio effects. in
practice this caused problems as the program did not always have time to
complete UI logic before the interupt, causing graphical errors and
unexpected results. As a result, the project presented is a prototype to
demonstrate possibilities given a capable system.
\section{Results}
Overall, the results produced were not of a high standard. The low signal
to noise ratio, low sample rate and over-simplification of designs made it
impossible to create results usable in a professional context. With a
sample rate of 8Khz, a cutoff sampling frequency of 4000khz was created.
This resulted in a telephony frequency response that removed higher
frequencies due to the low nyquist rate. Poor converters added significant
noise to the output which further degraded results. However, steps were
taken to create the best quality outcome with the resources available.
\subsection{Echo}
A maximum single tap delay of 750 samples was achieved through the
stripping of all unnecessary components. Without a visual UI,
this was controlled through the use of two switches that
incremented/decremented the delay in values of 10 samples every 100ms.
The lack of a feedback loop resulted in a very simplistic delay with
equally simplistic results in terms of perception. A feedback was not
implemented to differentiate this from the reverb design.
\subsection{Chorus}
The implementation provided a perceptually similar alternative to the
infeasible modulated delay line design from the previous assignment.
Each delay can be set to up to a maximum delay time of 250 samples.
This allows for manual shifting of phases to taste. Overall this
results in a functional alternative at the cost of perceptual quality.
Through the use of a spectrogram, it can be seen in figure
\ref{fig:cho_spec} that the signal is clearly noisy and distorted with
harmonics produced far above the nyquist rate. This is most likely due
to the poor converter performance
\begin{figure}[H]
\caption{Chorus spectrogram displaying distortion}
\makebox[\textwidth]{\includegraphics[width=\textwidth]{Chorus_spectrogram}}
\label{fig:cho_spec}
\end{figure}
\subsection{Reverb}
A crude slapback reverb was created using an FIR/IIR filter
combination. The result was a short reverb tail with decaying
repetitions of the dry sample. This would have been vastly improved
with the addition of further parallel comb filters or all pass filters.
A clear decaying echo can be see in figure \ref{fig:rev_wav}
\begin{figure}[H]
\caption{Reverb waveform}
\makebox[\textwidth]{\includegraphics[width=\textwidth]{Click_reverb_waveform}}
\label{fig:rev_wav}
\end{figure}
\section{Further Work}
Building on skills learnt in this project, results could be improved
significantly through the use of C code directly, allowing for greater
control over the underlying logic than is possible through Flowcode. This
would also offer greater scope for design as it would no longer be limited
to the relatively limited Flowcode macros.\\
The main improvement would be to use a more capable system. It is clear
that the limitations of the system have impeded the quality of results to
an unreasonable degree.
\section{Conclusions}
Overall this project has offered insight into the performance issues and
limitations involved when programming for integrated systems of finite
resources. It has also highlighted the limitations of real-time processing
as opposed to offline processing.\\
When planning a hardware system it is clear that the system specification
and design of it's code must be as efficient as possible in order to
produce the quality of results that are possible on unlimited resource
systems such as a standard computer.
\printbibliography
\end{document}