Research Seminars
Algorithms for discovering repeated patterns in multidimensional
representations of polyphonic music
David Meredith, City University, London
19 February 2003
In this talk I address the problem of designing an algorithm for
discovering repeated patterns in music. Such an algorithm could have
both scientific and engineering applications. For example, it could
form an important component in a computational model of expert music
cognition and it could also be used to build useful software tools
for music analysts and composers. A music analyst might find such
an algorithm useful for discovering characteristic structural features
in the works of a particular composer or for analysing the structure
of a work into its elemental `building blocks'. A composer who is
suffering from writer's block could use such an algorithm to find
significant repeated structures in an unfinished work thus discovering
a fresh perspective that might stimulate further progress on the work.
Another important practical use for such an algorithm is for music
database indexing.
In most previous approaches to repetition discovery in music, the
music to be analysed has been represented using strings. However,
there are certain types of interesting musical repetitions that cannot
be discovered using string algorithms. I propose a geometric approach
to repetition discovery in which the music is represented as a multidimensional
dataset (a set of points in a multidimensional space). Certain types
of interesting musical repetition that cannot be found using string
algorithms can efficiently be found using algorithms that process
multidimensional datasets.
My approach allows polyphonic music to be analysed as efficiently
as monophonic music and it can be used to discover polyphonic repeated
patterns "with gaps" in the timbre, dynamic and rhythmic
structure of a passage as well as its pitch structure. I shall present
three related algorithms: SIA, SIATEC and COSIATEC. SIA computes all
the maximal repeated patterns in a multidimensional dataset and SIATEC
computes all the occurrences of all the maximal repeated patterns
in a dataset. For a k-dimensional dataset of size n, the worst-case
running time of SIA is O(kn^2 log_2n) and the worst-case running time
of SIATEC is O(kn^3).
COSIATEC is a lossless compression algorithm based on SIATEC. It
generates an economical structural description of a dataset in terms
of maximal translatable patterns. The structural descriptions generated
by COSIATEC for music representations resemble motivic and thematic
analyses carried out by expert human music analysts.
|