Algorithmic Composition
Author: Sarah King
Supervisor: Dr. Andrea Schalk
Acknowledgements
Firstly, I would like to thank my supervisor Dr. Andrea Schalk for being a
constant support throughout this project and my time at university.
Thirdly, to Rebecca Doran for sticking with me through thick and thin.
Your friendship is worth the world to me, and I hope to never lose that.
Fourthly, to all my friends and family who have acted as coding ducks dur-
ing this past year: I could not have got to this stage without you all. Thank
you for keeping me sane.
Finally, to Richard Hartnell and Merle Calderbank — thank you for intro-
ducing me to the wonderful worlds of mathematics and music. You are both
a great inspiration and without your enthusiasm, the idea for this project
could never have been born.
Contents
Abstract
Acknowledgements
1 Introduction
  1.1 Motivation
  1.2 Aims and Objectives
  1.3 Structure of the Report
2 Background Research
  2.1 Previous Attempts
    2.1.1 Canonical Composition (c. 16th Century)
    2.1.2 Dice Music (c. 1780)
    2.1.3 Twelve-tone Music (c. 1910)
    2.1.4 The Illiac Suite (c. 1955)
    2.1.5 Musicomp (c. 1960)
    2.1.6 Formalised Music (c. 1960)
    2.1.7 Experiments In Music Intelligence (c. 1980)
    2.1.8 Genetic Programming (c. 1995)
  2.2 Analysis and Lessons Learned
3 Design
  3.1 Musical Theory
    3.1.1 Note Selection
    3.1.2 Note Duration
    3.1.3 Cadences
  3.2 Markov Modelling
    3.2.1 Overview
    3.2.2 The Mathematics Behind The Models
    3.2.3 Representing Music
    3.2.4 Selecting a Note
  3.3 Critic Function
    3.3.1 Note Repetition
    3.3.2 Cadences
4 Implementation
  4.1 Software Design
  4.2 Handling Music — The JFugue API
    4.2.1 JFugue MusicString
  4.3 Improving Algorithm Output
    4.3.1 Random Chance (A1)
    4.3.2 Basic Markov Modelling (A2)
    4.3.3 Markov & Critic Function (A3)
    4.3.4 Markov, Critic & Variable Note Lengths (A4)
    4.3.5 Summary
  4.4 Parser
Bibliography
A Music Terminology
B JFugue Details
Chapter 1
Introduction
1.1 Motivation
Mathematics is used in almost every academic discipline, so it follows that
mathematics is heavily involved in the creation of music. The Ancient Greek
mathematician Pythagoras is credited with the creation of the musical scale,
noticing that strings split in the ratio of 3 : 2 produce notes that are a perfect
fifth apart. Different ratios then produce different note intervals [Fra01].
Obviously, mathematics and mathematical theory are also heavily in-
volved within computer science. Without mathematics giving us the ability
to express ideas to a computer, new technologies and the field of computer
science itself would not be as widespread.
There are plenty of examples from history of people composing music
using a computer. The earliest example of this is the ILLIAC computer at
the University of Illinois [Edw11]. The Illiac Suite for String Quartet was
completed in 1956 and makes use of Markov chains to generate random-
Chapter 1. Introduction 1.2. Aims and Objectives
• Input pieces from human composers and have the ability to load the
random probabilities with probabilities relating to a certain composer.
• Allow the user to change tempo, instruments and style of music cre-
ated.
Chapter 2
Background Research
Chapter 2. Background Research 2.1. Previous Attempts
Chapter 2. Background Research 2.2. Analysis and Lessons Learned
function which ‘listens’ to the music generated by the algorithm and decides
if it is a suitable representation of music [Mau99].
Chapter 3
Design
There has been a lot of debate about what makes ‘good music’ good. This
chapter looks at some of the bigger concepts in this debate, and then discusses
how these concepts were combined into a single Critic Function used to judge
a piece of music generated by a computer.
Chapter 3. Design 3.1. Musical Theory
a minim. Quavers and semiquavers are the quickest moving notes, with a
quaver having half the duration of a crotchet and a semiquaver having half
the duration of a quaver. This hierarchy is shown in Figure 3.1.
3.1.3 Cadences
A cadence is a sequence of notes or chords that generally signifies the end of
a musical piece or phrase. There are 4 types of cadence that are commonly
used in music. Chord progressions are generally written as Roman numerals,
where major chords are upper case numerals, and minor chords are lower
case numerals. Finally, diminished chords have a small circle to signify that
they are diminished. The application of Roman numerals to the C major
scale is depicted in Figure 3.2.
Chapter 3. Design 3.2. Markov Modelling
Finished Cadences
A perfect cadence is a chord progression from V to I. This creates the feeling
that the music has come to a definitive end, and as such it is usually used at
the end of a piece of music.
A plagal cadence is a chord progression from IV to I. This also creates
the feeling that the music has come to a definitive end, and can also be found
at the end of a piece of music. The plagal cadence was traditionally used in
plainchant songs that emerged around 100 A.D. [Est15] as it is commonly
sung at the end of hymns to the ‘A–men’.
Unfinished Cadences
An imperfect cadence is a chord progression from I to V. Unlike the perfect
or plagal cadences, an imperfect cadence does not sound finished. They
are used at the end of movements (as the music is carrying on into another
movement) or in the middle of a piece at the end of a particular section.
Imperfect cadences sound as though they want to carry on to complete the
music properly.
An interrupted cadence is a chord progression from V to vi. An interrupted
cadence does not provide a satisfactory end to a piece of music, and
is used in the same way as an imperfect cadence.
3.2.1 Overview
Markov chains, named for Andrey Markov, are mathematical systems that
move between ‘states’ which represent a situation or some values. Alongside
state names, there is also a set of probabilities that represent the chance
of moving from one state to the next. Markov models take into consid-
eration the events that occurred immediately before (and the probabilities
of these events happening), implying that the outcome could be changed
dramatically depending upon the events that precede a particular event.
In a two-state system, there are 4 possible transitions that the model
must take into consideration: A → A, A → B, B → A, and B → B (as
states can always transition to themselves). In this simple system, depicted
in Figure 3.3, the probability of each transition out of a state is 0.5, as
at each state there are two places it can transition to (with even
weighting). Expanding this model, if a state has N evenly weighted links,
there is a 1/N chance of following each link.
Of course, a certain path may be more favourable than another, in which
case the transition probabilities are weighted accordingly. The skewing of
transition probabilities helps to model real-life situations accurately.
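The weighted transitions described above can be sketched in Java as follows. This is a minimal illustration rather than the project's own code: the class name `TransitionSketch`, the method `pick` and the skewed weights are assumptions made for the example. The method takes one row of transition probabilities and a uniform random number in [0, 1), then walks the cumulative distribution until that number is covered.

```java
import java.util.Random;

// Hypothetical sketch: choosing the next state from one row of transition probabilities.
class TransitionSketch {

    // Returns the index of the chosen state. `u` is a uniform random number in [0, 1).
    static int pick(double[] row, double u) {
        double cumulative = 0.0;
        for (int i = 0; i < row.length; i++) {
            cumulative += row[i];
            if (u < cumulative) {
                return i;
            }
        }
        // Guard against rounding error: fall back to the last state that can be reached.
        for (int i = row.length - 1; i >= 0; i--) {
            if (row[i] > 0.0) {
                return i;
            }
        }
        throw new IllegalArgumentException("row has no outgoing transitions");
    }

    public static void main(String[] args) {
        double[] even = {0.5, 0.5};   // the two-state system with even weighting
        double[] skewed = {0.9, 0.1}; // an assumed weighting that favours the first state
        Random rng = new Random();
        System.out.println(pick(even, rng.nextDouble()));
        System.out.println(pick(skewed, rng.nextDouble()));
    }
}
```

Passing the random number in as a parameter, instead of drawing it inside `pick`, keeps the selection step repeatable when it is tested.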
So,
\[
P(A, B) = P(B \mid A) \cdot P(A) = \frac{5}{9} \times \frac{1}{2} = \frac{5}{18}
\]
However, Markov models differ in the sense that they only consider the
event immediately prior in the calculation, and then multiply all of these
conditional probabilities together. Applying this to Equation 3.2 gives
(adapted from [Lee10]):
\[
P\left( \bigcap_{k=1}^{n} A_k \right) = \prod_{k=1}^{n} P(A_k \mid A_{k-1}) \tag{3.4}
\]
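As a rough illustration of Equation 3.4, the sketch below multiplies the transition probabilities along a path of states. It is not taken from the project: the class and method names are hypothetical, and the first state in the path is assumed to be given, so only the transitions contribute to the product.

```java
// Hypothetical sketch: probability of a state sequence under a first-order Markov model.
class SequenceProbabilitySketch {

    // transition[i][j] is the probability of moving from state i to state j.
    // The first state is taken as given, so only the transitions are multiplied.
    static double sequenceProbability(double[][] transition, int[] states) {
        double probability = 1.0;
        for (int k = 1; k < states.length; k++) {
            probability *= transition[states[k - 1]][states[k]];
        }
        return probability;
    }

    public static void main(String[] args) {
        double[][] even = {{0.5, 0.5}, {0.5, 0.5}}; // two states, A = 0 and B = 1
        int[] path = {0, 1, 1, 0};                  // A -> B -> B -> A
        System.out.println(sequenceProbability(even, path)); // 0.5 * 0.5 * 0.5
    }
}
```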
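\[
\begin{array}{c|ccccccc}
  & C & D & E & F & G & A & B \\ \hline
C & 2 & 0 & 0 & 0 & 3 & 0 & 0 \\
D & 3 & 2 & 0 & 0 & 1 & 0 & 0 \\
E & 0 & 4 & 4 & 0 & 0 & 0 & 0 \\
F & 0 & 0 & 4 & 4 & 0 & 0 & 0 \\
G & 0 & 0 & 0 & 4 & 4 & 2 & 0 \\
A & 0 & 0 & 0 & 0 & 2 & 2 & 0 \\
B & 0 & 0 & 0 & 0 & 0 & 0 & 0
\end{array}
\tag{3.6}
\]
\[
\begin{array}{c|ccccccc}
  & C & D & E & F & G & A & B \\ \hline
C & 0.4 & 0 & 0 & 0 & 0.6 & 0 & 0 \\
D & 0.5 & 0.\dot{3} & 0 & 0 & 0.1\dot{6} & 0 & 0 \\
E & 0 & 0.5 & 0.5 & 0 & 0 & 0 & 0 \\
F & 0 & 0 & 0.5 & 0.5 & 0 & 0 & 0 \\
G & 0 & 0 & 0 & 0.4 & 0.4 & 0.2 & 0 \\
A & 0 & 0 & 0 & 0 & 0.5 & 0.5 & 0 \\
B & 0 & 0 & 0 & 0 & 0 & 0 & 0
\end{array}
\tag{3.7}
\]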
Chapter 3. Design 3.3. Critic Function
3.3.2 Cadences
The application of a cadence is something that requires a little more thought.
As is discussed in subsection 3.1.3, there are a number of cadences that can
be applied at the end of a composition generated by a computer.
The cadence to be selected will be chosen using a random-number generator,
with the emphasis on selecting a finished cadence. If an unfinished
cadence is selected, the Critic function will add on more notes (using Markov
modelling) and apply a new cadence at the end of this extended piece. This
continues until a finished cadence is added to the end of a piece. This process
guarantees that the composition will always end on a satisfying note.
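The selection loop described above might look something like the following sketch. It is an illustration under stated assumptions, not the project's Critic function: the class name, the enum and the 0.8 weighting towards finished cadences are all hypothetical, and the step that would generate extra notes after an unfinished cadence is only marked by a comment.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical sketch: keep choosing cadences until a finished one is applied.
class CadenceChoiceSketch {

    enum Cadence {
        PERFECT(true), PLAGAL(true), IMPERFECT(false), INTERRUPTED(false);

        final boolean finished;

        Cadence(boolean finished) {
            this.finished = finished;
        }
    }

    // Assumed weighting: a finished cadence is chosen 80% of the time.
    static final double FINISHED_WEIGHT = 0.8;

    static Cadence choose(Random rng) {
        boolean finished = rng.nextDouble() < FINISHED_WEIGHT;
        boolean first = rng.nextBoolean();
        if (finished) {
            return first ? Cadence.PERFECT : Cadence.PLAGAL;
        }
        return first ? Cadence.IMPERFECT : Cadence.INTERRUPTED;
    }

    // Returns the cadences in the order they were applied; the last one is always finished.
    static List<Cadence> cadencesUntilFinished(Random rng) {
        List<Cadence> applied = new ArrayList<>();
        Cadence cadence;
        do {
            cadence = choose(rng);
            applied.add(cadence);
            // After an unfinished cadence, more Markov-generated notes would be added here.
        } while (!cadence.finished);
        return applied;
    }
}
```

Because every pass has a fixed, non-zero chance of choosing a finished cadence, the loop ends with probability 1, which is what allows the piece to finish on a perfect or plagal cadence.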
Another thing to consider is the chord progression that is used when
applying a cadence. Initially, the composition is in a fixed key, allowing us
to select the first, fourth, fifth, sixth, and seventh of a scale as required.
In order to select the correct octave to apply to the cadence, the octave of
the last note used in the composition is calculated. Finding the octave can
be achieved simply, by taking the floor of the MIDI value of the note
divided by 12 (the number of semitones in an octave).
Once the octave is calculated, the cadence is created by simply working
with the MIDI values and adding (or subtracting) intervals as required. The
pseudocode for this algorithm can be found in Algorithm 3.
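The octave calculation and the interval arithmetic can be sketched as below. This is not Algorithm 3 itself. It is a simplified illustration that only produces the root notes of the two finished cadences in a major key, and the class name, method names and the `tonicPitchClass` parameter (0 for C, 2 for D, and so on) are assumptions made for the example.

```java
// Hypothetical sketch: octave lookup and cadence root notes from MIDI values.
class CadenceNotesSketch {

    static final int SEMITONES_PER_OCTAVE = 12;

    // Octave of a MIDI value: the floor of the value divided by 12.
    static int octaveOf(int midi) {
        return Math.floorDiv(midi, SEMITONES_PER_OCTAVE);
    }

    // MIDI value of the tonic in the same octave as the last note of the piece.
    static int tonicNear(int lastNoteMidi, int tonicPitchClass) {
        return octaveOf(lastNoteMidi) * SEMITONES_PER_OCTAVE + tonicPitchClass;
    }

    // Roots of a perfect cadence (V then I): the fifth lies 7 semitones above the tonic.
    static int[] perfectCadenceRoots(int lastNoteMidi, int tonicPitchClass) {
        int tonic = tonicNear(lastNoteMidi, tonicPitchClass);
        return new int[] {tonic + 7, tonic};
    }

    // Roots of a plagal cadence (IV then I): the fourth lies 5 semitones above the tonic.
    static int[] plagalCadenceRoots(int lastNoteMidi, int tonicPitchClass) {
        int tonic = tonicNear(lastNoteMidi, tonicPitchClass);
        return new int[] {tonic + 5, tonic};
    }
}
```

Under this numbering, MIDI value 60 falls in octave 5, so a piece in C major whose last note is 64 would receive a perfect cadence with roots 67 and 60.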
Chapter 4
Implementation
Chapter 4. Implementation 4.2. Handling Music — The JFugue API
Notes
In order to specify a note, it is enough to specify the note name: ‘C’, ‘D’,
‘E’, ‘F’, ‘G’, ‘A’, ‘B’, or ‘R’ (to specify silence). After this specification, it
is simple to sharpen (by appending a ‘#’) or flatten (by appending a ‘b’) a
note. Appending a number in the range of 0 – 10 after the complete note
name selects the octave that the note will sound from. The notes available
in JFugue are shown in Appendix B, Figure B.1.
Duration
The duration (length) of a note can be appended to the note in the
MusicString after the octave marking. There are 8 different lengths of notes
that can be applied, which are shown in Appendix B, Figure B.2. As more
markings are added to the MusicString, it becomes less readable by humans.
However, the format follows a strict pattern, so even detailed music strings
are easy to build up using String objects in Java.
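A small sketch of building such strings with plain Java string handling is given below. It does not call JFugue at all. The class and method names are hypothetical, and the duration letters `q` and `h` are assumed here to stand for a crotchet and a minim; Figure B.2 remains the definitive list of lengths.

```java
// Hypothetical sketch: assembling MusicString tokens from their parts.
class MusicStringSketch {

    // Note name, optional accidental ("#", "b" or ""), octave, then duration letter.
    static String note(char name, String accidental, int octave, char duration) {
        return new StringBuilder()
                .append(name)
                .append(accidental)
                .append(octave)
                .append(duration)
                .toString();
    }

    // A rest is written as 'R' followed by its duration; it has no octave.
    static String rest(char duration) {
        return "R" + duration;
    }

    public static void main(String[] args) {
        String melody = String.join(" ",
                note('C', "", 5, 'q'),
                note('F', "#", 5, 'q'),
                rest('q'),
                note('B', "b", 4, 'h'));
        System.out.println(melody); // C5q F#5q Rq Bb4h
    }
}
```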
Chapter 4. Implementation 4.3. Improving Algorithm Output
Limitations
There are, however, limitations created by using JFugue. Additional mark-
ings that a musician would typically expect in a piece of music, such as
markings showing accents placed on a note, are not yet supported by JFugue.
This lowers the realism of a composition that can be produced by the algo-
rithms.
generate the next most likely note. Figure 4.2 shows a typical output from
this algorithm.
Some of the five qualities for a ‘good’ piece of music have been achieved.
The music now has some degree of tonality as all the probabilities in the
Markov matrices are realistic, and become increasingly realistic as more
pieces are analysed. Secondly, the Markov models enable the algorithm to select
notes that commonly follow a particular note, achieving the second aim.
Chapter 4. Implementation 4.4. Parser
Figure 4.4: Markov Applied to Notes & Lengths, & Critic Function
4.3.5 Summary
This table summarises how each algorithm met the aims of ‘good music’.
4.4 Parser
An important aspect of the project is the Parser, as it allows the Markov
models to be updated in real time, and adds extra functionality to the overall
piece of software. When the program starts running, the user is presented
with a choice of generating a piece of music or adding information to the
database. The user inputs a piece of music in the format of a MusicString,
and in order to support the addition of Markov–modelled note lengths, the
notes and durations are separated. Temporary Markov matrices are created,
and then combined with the existing matrices.
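The two parsing steps described above, separating each note from its duration and tallying how often one note follows another, could be sketched as follows. This is an illustration only: the class name, the regular expression and the nested-map representation of the temporary matrix are assumptions, not the project's Parser.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: splitting MusicString tokens and counting note transitions.
class ParserSketch {

    // Note letter (or R for a rest), optional accidental, optional octave, then duration letters.
    private static final Pattern TOKEN = Pattern.compile("([A-GR][#b]?\\d*)([a-z]*)");

    // Splits a token such as "C#5q" into its note part "C#5" and duration part "q".
    static String[] split(String token) {
        Matcher m = TOKEN.matcher(token);
        if (!m.matches()) {
            throw new IllegalArgumentException("Unrecognised token: " + token);
        }
        return new String[] {m.group(1), m.group(2)};
    }

    // Extracts only the note parts from a whitespace-separated music string.
    static List<String> notesOf(String musicString) {
        List<String> notes = new ArrayList<>();
        for (String token : musicString.trim().split("\\s+")) {
            notes.add(split(token)[0]);
        }
        return notes;
    }

    // Counts how often each note is immediately followed by each other note.
    static Map<String, Map<String, Integer>> countTransitions(List<String> notes) {
        Map<String, Map<String, Integer>> counts = new LinkedHashMap<>();
        for (int i = 0; i + 1 < notes.size(); i++) {
            counts.computeIfAbsent(notes.get(i), k -> new LinkedHashMap<>())
                  .merge(notes.get(i + 1), 1, Integer::sum);
        }
        return counts;
    }
}
```

The same counting step could be run over the duration parts to build the matrix for note lengths.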
As an example, consider the matrices used in eqs. (3.6) and (3.7), and
then combine these existing matrices with the temporary matrices created
when parsing ‘Frère Jacques’, the notes for which are:
{G4 A4 B4 G4 G4 A4 B4 G4 B4 C5 D5 B4 C5 D5 D5 E5 D5 C5 B4 G4
D5 E5 D5 C5 B4 G4 G4 D4 G4 G4 D4 G4}
Formulating a matrix based off the newly parsed piece gives:
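\[
\begin{array}{c|cccccccccc}
   & C4 & D4 & E4 & F4 & G4 & A4 & B4 & C5 & D5 & E5 \\ \hline
C4 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
D4 & 0 & 0 & 0 & 0 & 2 & 0 & 0 & 0 & 0 & 0 \\
E4 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
F4 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
G4 & 0 & 2 & 0 & 0 & 3 & 2 & 1 & 0 & 1 & 0 \\
A4 & 0 & 0 & 0 & 0 & 0 & 0 & 2 & 0 & 0 & 0 \\
B4 & 0 & 0 & 0 & 0 & 2 & 2 & 0 & 2 & 0 & 0 \\
C5 & 0 & 0 & 0 & 0 & 0 & 0 & 2 & 0 & 2 & 0 \\
D5 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 2 & 1 & 2 \\
E5 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 2 & 0
\end{array}
\tag{4.1}
\]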
The Parser function performs matrix addition on the new and existing ma-
trices of integers (eqs. (3.6) and (4.1)) and then recalculates the new prob-
abilities by summing the rows of this combined matrix and dividing each
element by the sum. This yields:
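\[
\begin{array}{c|cccccccccc}
   & C4 & D4 & E4 & F4 & G4 & A4 & B4 & C5 & D5 & E5 \\ \hline
C4 & 0.4 & 0 & 0 & 0 & 0.6 & 0 & 0 & 0 & 0 & 0 \\
D4 & 0.375 & 0.25 & 0 & 0 & 0.375 & 0 & 0 & 0 & 0 & 0 \\
E4 & 0 & 0.5 & 0.5 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
F4 & 0 & 0 & 0.5 & 0.5 & 0 & 0 & 0 & 0 & 0 & 0 \\
G4 & 0 & 0.105 & 0 & 0.211 & 0.368 & 0.211 & 0.053 & 0 & 0.053 & 0 \\
A4 & 0 & 0 & 0 & 0 & 0.\dot{3} & 0.\dot{3} & 0.\dot{3} & 0 & 0 & 0 \\
B4 & 0 & 0 & 0 & 0 & 0.\dot{3} & 0.\dot{3} & 0 & 0.\dot{3} & 0 & 0 \\
C5 & 0 & 0 & 0 & 0 & 0 & 0 & 0.5 & 0 & 0.5 & 0 \\
D5 & 0 & 0 & 0 & 0 & 0 & 0 & 0.1\dot{6} & 0.\dot{3} & 0.1\dot{6} & 0.\dot{3} \\
E5 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0
\end{array}
\tag{4.2}
\]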
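The addition and renormalisation step can be illustrated with the short sketch below. It is not the project's Parser: the class and method names are hypothetical, and both count matrices are assumed to have already been expanded to cover the same set of notes (ten here) so that they can be added element by element. A row whose total is zero is left as zeros rather than divided.

```java
// Hypothetical sketch: combining two count matrices and recalculating probabilities.
class MatrixUpdateSketch {

    // Element-wise sum of two count matrices of identical shape.
    static int[][] add(int[][] existing, int[][] parsed) {
        int[][] sum = new int[existing.length][];
        for (int r = 0; r < existing.length; r++) {
            sum[r] = new int[existing[r].length];
            for (int c = 0; c < existing[r].length; c++) {
                sum[r][c] = existing[r][c] + parsed[r][c];
            }
        }
        return sum;
    }

    // Divides each element by its row total; rows with no counts stay at zero.
    static double[][] normalise(int[][] counts) {
        double[][] probabilities = new double[counts.length][];
        for (int r = 0; r < counts.length; r++) {
            int total = 0;
            for (int value : counts[r]) {
                total += value;
            }
            probabilities[r] = new double[counts[r].length];
            if (total == 0) {
                continue; // a note that has never been seen has no outgoing transitions
            }
            for (int c = 0; c < counts[r].length; c++) {
                probabilities[r][c] = (double) counts[r][c] / total;
            }
        }
        return probabilities;
    }
}
```

For the D4 row, the existing counts 3, 2 and 1 (towards C4, D4 and G4) plus the two new D4 to G4 transitions give totals of 3, 2 and 3 out of 8, which is where the values 0.375, 0.25 and 0.375 come from.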
Thus, the probabilities become more accurate for the notes A4 . . . D5 as
these are the most common notes in the parsed pieces. The new notes are
Chapter 5
Testing & Statistical Analysis
This chapter focuses on testing the output of the algorithm, rather than
detailing software tests.
An online survey was created, which linked to a number of computer
and human generated pieces in an open Dropbox folder. The instructions in
the survey told the respondent which pieces to play for each question, and
invited them to select which they thought was computer generated (with an
option of ‘Can’t Decide’ for the indecisive). Each question had a human and
computer generated piece, so there were no ‘trick’ questions. In total, there
were 54 respondents of varying musical ability. Raw data can be found in
Appendix C.
Chapter 5. Testing & Statistical Analysis 5.2. Basic Markov Modelling
Chapter 5. Testing & Statistical Analysis 5.3. Markov & Critic Function
Chapter 5. Testing & Statistical Analysis 5.4. All Improvements
Chapter 5. Testing & Statistical Analysis 5.5. Study of Musical Ability
Chapter 6
Furthering The Project
Chapter 6. Furthering The Project 6.1. Retrospective Look at Development
Time Signature
The time signature of a piece of music defines the number and type of
notes that each bar contains. The time signature is usually expressed as a
fraction, with the numerator showing the number of beats in a bar, and the
denominator showing the division of a semibreve [The15].
Time signatures, again, give an idea of how a piece of music will sound
before it is played. A 3/4 time signature, i.e. 3 crotchets in a bar, suggests
that the music will have a traditional waltz feel to it. A time signature gives an
indication to the player how to play the piece, and so can heavily influence
the music that is generated.
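As a small worked illustration of how a time signature fixes the length of a bar, the sketch below converts the fraction into a number of crotchets. The class and method names are hypothetical, and nothing like this was part of the implemented algorithms.

```java
// Hypothetical sketch: length of one bar, measured in crotchets, for a given time signature.
class TimeSignatureSketch {

    // `beats` is the numerator; `division` is the denominator (the division of a semibreve).
    // A semibreve lasts four crotchets, so each beat lasts 4 / division crotchets.
    static double crotchetsPerBar(int beats, int division) {
        return beats * 4.0 / division;
    }
}
```

A 3/4 bar and a 6/8 bar both come out at three crotchets in length, which is one reason why bars and phrases would be needed before different time signatures became audibly distinct.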
This would give the composer a little more flexibility in the piece that
would be generated. However, it would be difficult to differentiate between
pieces of music with different time signatures without the implementation
of bars and phrases, as there would be no audible division from one beat (or
bar) to the next.
Rests
A rest represents a period of silence in a bar [The15], and each type of rest
has the same duration as a corresponding type of note. Rests can occur within
a bar or — as is more common in longer pieces of music — rests can last
multiple bars.
Rests in music can dramatically alter the timbre of the piece, as they
enforce silence on certain instruments whilst others continue playing. This
can make a section of music seem delicate or heavy depending on the instru-
ments left playing.
JFugue does support the inclusion of rests — simply add an ‘R’ to the
MusicString in the fashion that a note would be added. This was originally
implemented in the Markov models along with the normal notes, but upon
testing it was noted that the rests seemed too forced and obviously placed.
As a result, this feature was removed. However, if a method could be devised
to cleverly add rests into the music, it would certainly make the piece more
realistic. With the implementation of phrases, there could be a slight rest
at the end of a phrase before repetition or modulation.
Key Change/Modulation
All of the elements of music listed above are the building blocks for creating
a piece of music. However, only writing in a single key sounds boring and
predictable. Humans expect to hear the same tune repeatedly when listening
to music, so when something does change within the piece, it grabs the
attention of the listener [Dew14].
This change can be achieved by changing the key during the piece, changing
the chord progressions used underneath the main melody, or changing
the melodies used for the different sections of the piece. Composers often
take a main theme and then vary the theme throughout different sections
of the piece in order to keep the listener interested.
There is no way within JFugue to modulate a section of music (or Pattern,
in JFugue syntax). This would be an excellent addition to the API
and would easily create the illusion that the key has been changed.
6.2 Suggestions for Future Projects
• Build a data structure to contain music in the form of bars and phrases.
This would initially be difficult, but would enable more complex algo-
rithms to be created.
6.4 Conclusion
In conclusion, there have been some small victories for computer gener-
ated music. In the survey, 87% of all respondents were unable to correctly
identify all four pieces of computer generated music, with the average per-
son identifying 2.39 pieces of music correctly. However, there were a lot
of respondents who commented that the generated pieces were ‘obviously
computer generated’. Whilst some advancements have been made, algorith-
mic composition is still not as effective as human composition. Humans
naturally look for progression and emotions in music, and as yet, even a
supercomputer is unable to meet these requirements [Wil13].
Bibliography
Appendix A
Music Terminology
This appendix serves as a glossary for all musical terms that are used
throughout the report. All definitions are taken from the Oxford Pocket
Dictionary of Current English [CE15].
• Bar: Any of the short sections or measures, typically of equal time value,
into which a piece of music is divided
• Note: A sign or character used to represent a tone, its position and form
indicating the pitch and duration of the tone.
• Octave: The interval between one musical pitch and another with half
or double its frequency.
• Quaver: A note with half the duration of a crotchet. Two quavers make
up the same length as a crotchet.
• Round: A minimum of three voices sing the same melody, but each voice
starts at a different time.
Appendix B
JFugue Details
Appendix C
Raw Data for Chapter 5
– “Piece 3 feels more fluid like than Piece 4, more so that a human would
have composed it. The start of Piece 4 sounds quite ‘complex’ (?) but
it doesn’t have as much of a flow.”
– “I’m going to go with Piece 4 being the one made by the computer.
Again, it had less of a flow to it, and the tempo seemed to be a bit
off.”
– “Piece 4 was more complex and didn’t sound as ‘one note per sample’.
So I think Piece 3 was composed by a computer.”
– “They both sound plausible, the first one sounds cruel to play, but still
good.”
– “Piece 3 has way too random notes compared to 4th one, so I think
that Piece 3 was composed by a computer.”
– “This is one I find hard to decide on. I’m inclined to go for Piece
6 being the one generated by the computer as the ending of Piece 5
ended with the note you’d expect the piece to end and it seemed in
place but I might be totally wrong.”
– “Piece 7 felt as though it had a rhythm and melody and was consistent
throughout (notes weren’t as random). Piece 8 feels generated because
some of the notes didn’t seem as though they fitted well into the piece.”
– “Piece 7 was a very nice little melody. Piece 8 didn’t seem to have
the melody which Piece 7 had, and I would have been surprised if a
computer was able to generate Piece 7.”
– “The trills in Piece 8 sound like a human would have composed them
rather than being generated”