
Atti del X Colloquio di Informatica Musicale
Milan, 2-4 December 1993

Editors: Goffredo Haus & Isabella Pighi

AIMI
Associazione di Informatica Musicale Italiana

LIM-DSI
Laboratorio di Informatica Musicale
Dipartimento di Scienze dell'Informazione
Università degli Studi di Milano

SCIENTIFIC COMMITTEE

Mario Baroni (Università di Bologna)
Antonio Camurri (Università di Genova)
Jacques Chareyron (Università di Milano)
Giovanni De Poli (Università di Padova)
Goffredo Haus (Università di Milano)
Aldo Piccialli (Università di Napoli)
Sylviane Sapir (IRIS)

MUSIC COMMITTEE

Lelio Camilleri (Conservatorio di Bologna)
Mauro Graziani (Università di Padova)
Alessandro Melchiorre (Civica Scuola di Musica di Milano)
Angelo Paccagnini (Università di Milano)
Nicola Sani (RAI-SAT)

ORGANIZING COMMITTEE

Goffredo Haus; Angelo Paccagnini; Isabella Pighi; Dante Tanzi
LIM-DSI, Università degli Studi di Milano

The X Colloquio di Informatica Musicale is promoted by:

AIMI, Associazione di Informatica Musicale Italiana
and
LIM-DSI, Laboratorio di Informatica Musicale
Dipartimento di Scienze dell'Informazione
Università degli Studi di Milano

under the patronage of:

IEEE Computer Society
Task Force on Computer Generated Music
&
North Italy Section

with the support of:

Consiglio Nazionale delle Ricerche,
Comitato Scienze e Tecnologie dell'Informazione
and Progetto Finalizzato "Sistemi Informatici e Calcolo Parallelo"

Silicon Graphics Computer Systems

Civica Scuola di Musica
Comune di Milano

INTRODUCTION

Goffredo Haus
Scientific Director
LIM - Laboratorio di Informatica Musicale
Dipartimento di Scienze dell'Informazione
Università degli Studi di Milano

After the now distant edition of 1977, the Colloquio di Informatica
Musicale returns to the Università degli Studi di Milano. Much time has
passed and much has changed. Computer music has grown as a scientific
and artistic discipline, and its fields of application have become
increasingly diversified and specialized. In Italy, the Colloqui di
Informatica Musicale have taken on an increasingly effective role in
communicating, nationally and internationally, the most advanced
research results in computer music.

The X edition of the Colloquio is, as has been the case since 1981,
promoted by AIMI (Associazione di Informatica Musicale Italiana) and is
organized by LIM (Laboratorio di Informatica Musicale) of the
Dipartimento di Scienze dell'Informazione of the Università degli Studi
di Milano.

The host of this edition has been active in computer music since 1975
and is a center of scientific and technological research devoted mainly
to defining and testing formal methods for processing musical
information, at both the symbolic and the subsymbolic level. Alongside
the research activities, which are supported mainly by the Consiglio
Nazionale delle Ricerche, a complementary course in "Informatica
Musicale" is held every year within the second two-year period of the
Corso di Laurea in Scienze dell'Informazione.

The patronage of two bodies of the IEEE Computer Society, one
international and specific to the field (the Task Force on Computer
Generated Music) and one national and of general interest (the North
Italy Section), encourages researchers in computer music and testifies
to the international recognition of the maturity the discipline has
reached.

The contributions presented at the Colloquio, in the form of papers,
posters, videos, demonstrations and musical compositions, are all
documented in the Proceedings and organized into chapters corresponding
to the main themes. As usual, the chapters on digital audio signal
processing and on music processing systems (workstations and software
and hardware tools) are substantial; of particular current interest are
the chapters on perception and on multimedia systems, virtual reality
and spatialization.

Many organizations, companies and above all people contributed to the
X Colloquio di Informatica Musicale. I wish to warmly thank: Prof. Denis
Baggi, chair of the IEEE Computer Society Task Force on Computer
Generated Music; Prof. Roberto Negrini, chair of the IEEE Computer
Society North Italy Section; the Civica Scuola di Musica di Milano; the
Comitato Scienze e Tecnologie dell'Informazione of the Consiglio
Nazionale delle Ricerche; the management of the Progetto Finalizzato
"Sistemi Informatici e Calcolo Parallelo" of the Consiglio Nazionale
delle Ricerche; Silicon Graphics spa, in the person of Dr. Pierpaolo
Muzzolon; Intersoft srl, in the person of Dr. Rubens Malloggi; the
members of the Scientific and Music Committees of the X Colloquio di
Informatica Musicale; the members of the Organizing Committee, Maestro
Angelo Paccagnini, Dr. Isabella Pighi and Dr. Dante Tanzi; the
collaborators and students of LIM who lent a hand with the
organization; the director, Prof. Giancarlo Mauri, and the non-teaching
staff of the Dipartimento di Scienze dell'Informazione; and all the
authors of the contributions who, through their work, made possible an
edition of the Colloquio so rich in innovative, high-level scientific
and musical content.

Milan, 20 October 1993

PRESENTATION

Lelio Camilleri
President
AIMI - Associazione di Informatica Musicale Italiana

AIMI (Associazione di Informatica Musicale Italiana) is now in its
twelfth year and, in the same year, sees the organization of the X
Colloquio di Informatica Musicale. Not that numbers matter much, but the
longevity and continuous growth of the activities coordinated by AIMI
show that the association was not founded merely to bring together a
group of researchers and musicians working in this field. Its main goal
is to promote the growth and development of scientific and musical
activities in the application of new technologies to music.

Over these years AIMI has organized, directly or as co-organizer,
national and international workshops and conferences. It has, of
course, organized the Colloqui di Informatica Musicale, whose content
has gradually been enriched by contributions from foreign researchers
and musicians, giving these events an increasingly international
character. Indeed, of the 28 papers that make up the sessions of this
colloquium, one third are by foreign researchers. Such contributions
are also present in the demonstrations, the posters and the concerts.

This is a first realization of the wish, expressed by the previous
President of AIMI, Giovanni De Poli, at the IX Colloquio held in Genoa,
to open our meetings to a European scope so that we can learn about,
and measure ourselves against, the work done in other countries. De
Poli deserves the members' thanks for having worked well, among many
other things, to set out on the right path toward this goal.

As for the content of the conference, it is organized into a series of
sessions covering both strictly technological and musical topics. Two
of these focus on two systems of notable interest, the Stazione di
Lavoro Musicale Intelligente and MARS. The scientific part of the
colloquium is completed by a series of demonstrations and poster
sessions.

Another aspect of definite interest is the Tutorial on "Standards in
Computer Generated Music", followed by a panel sponsored by the IEEE
Computer Society Task Force on Computer Generated Music. The presence of
the Task Force as a sponsor of the conference is no accident, since its
recent creation is also due to the collaboration of the Italian
computer music community.

As for the musical part, the colloquium includes the performance of 12
works representing distinct aspects of the relationship between
technology and composition.

Finally, I would like to thank the organizing committee, the scientific
and music committees, the sponsors, and the authors for their valuable
contribution to the success of this event.

Florence, 20 October 1993

CONTENTS

Introduction
Goffredo Haus, Scientific Director LIM-DSI 5
Presentation
Lelio Camilleri, President AIMI 7

Contents 9

Chapter 1: TUTORIAL

D. Sloan
From DARMS to SMDL, and Back Again 19

Chapter 2: MUSIC THEORY, COMPOSITION AND MUSICOLOGY

M. Baroni, L. Finarelli
Alcune osservazioni sulla esecuzione della Quinta Sinfonia
di Beethoven 31

L. Camilleri, F. Carreras, F. Giomi
Sistemi esperti in musicologia: un prototipo per l'analisi
TIME-SPAN reduction 36

A. De Matteis, G. Haus
Formalizzazione di strutture generative all'interno de
"La sagra della primavera" 48

B. Fagarazzi, M. Sebastiani
Using Self-Affine Fractal Coding to Model Musical Sequences 55

U. Merlone
Analisi statistiche nel riconoscimento degli intervalli 59

S. Sargenti
Definizione di reti di Petri per l'analisi della musica
elettroacustica 63

N. Zahler
The Compositional Process and Technological Tools:
an Appraisal of Algorithmic Composition as it Relates to
Compositional Process 67

Chapter 3: PERCEPTION

D. R. Keane, L. L. Cuddy, C. A. Lunney, J. Dufton
The Perception of Musical Structure and Time 79

M. Leman
Tone Center Attraction Dynamics: an Approach to
Schema-Based Tone Center Recognition of Musical Signal 86

Chapter 4: NEURAL NETWORKS

G. U. Battel, R. Bresin, G. De Poli, A. Vidolin
Automatic Performance of Musical Scores by Means of Neural
Networks: Evaluation with Listening Tests 97

G. De Poli, P. Prandoni, P. Tonella
Timbre Clustering by Self-Organizing Neural Networks 102

M. Johnson
Neural Networks and Style Analysis: a Neural Network that
Recognizes Bach Chorale Style 109

P. Toiviainen
Modelling Harmony-Based Jazz Improvisation: an Artificial
Neural Network Approach 117

Chapter 5: DIGITAL SIGNAL PROCESSING

M. Barutti, G. Bertini
Una nuova tecnica di sintesi additiva basata sulla trasformata
inversa di Fourier 127

L. Bazzanella, G. B. Debiasi
Analisi dell'effetto del tocco sul transitorio d'attacco dei suoni
di un organo a canne a trasmissione meccanica 134

A. Bernardi, G. P. Bugna, G. De Poli
Sound Analysis Methods Based on Chaos Theory 142

A. Chandra
Counterwave. A Program for Controlling Degrees of
Independence between Simultaneously Changing Waveforms 151

A. Di Scipio, G. Tisato
Granular synthesis with Interactive Computer Music System 159

S. Dubnov, N. Tishby, D. Cohen
Bispectrum of Musical Sounds: an Auditory Perspective 166

C. Lippe
Real-time Control of Granular Sampling via Nonlinear Processes
Using the IRCAM Signal Processing Workstation 178

S. Mariuz
A Program for Analysis, Separation and Synthesis of Musical
Signals Spectrum 184

A. Pellecchia, A. de Vitis
Sintesi Polare: applicazioni in campo musicale di filtri digitali
operanti al limite della stabilità 188

A. Piccialli, S. Cavaliere, I. Ortosecco
Analysis, Synthesis and Modification of Pseudoperiodic Sound
Signals by Means of Pitch Synchronous Techniques 194

D. Rocchesso
Multiple Feedback Delay Networks for Sound Processing 202

D. Rocchesso, F. Turra
A Real-Time Clarinet Model on MARS Workstation 210

Z. Settel, C. Lippe
FFT-based Resynthesis for Real-Time Transformation of Timbre 214

Chapter 6: MUSIC WORKSTATIONS

Section 6a: Stazione di Lavoro Musicale Intelligente

A. Camurri
The Cognitive Level of the Intelligent Music Workstation 225

A. Camurri, A. Cartoncini, M. Frixione, C. Innocenti,
A. Massari, R. Zaccaria
Toward a Cognitive Model for the Representation and
Reasoning on Music and Multimedia Knowledge 231

J. Chareyron, D. Rizzi
Due ambienti sperimentali dedicati alla sintesi LASy 244

P. Fischetti
PC-Music: evoluzione del linguaggio CMUSIC
per ambiente MS-DOS 248

G. Haus, I. Pighi
"Stazione di Lavoro Musicale Intelligente":
l'ambiente integrato Macintosh-NeXT 254

G. Haus, A. Sametti
L'ambiente per l'analisi/re-sintesi di partiture della
"Stazione di Lavoro Musicale Intelligente" 262

C. Massucco, M. Mercurio, G. Palmieri
Real-Time Processing and Performance Using WinProcne/HARP 270

Section 6b: MARS

P. Andrenacci, F. Armani, A. Prestigiacomo, C. Rosati
APPLI20: a Development Tool for Building MARS Application
with an Easy to Use Graphical Interface 277

E. Favreau, S. Sapir
La stazione MARS: dalla progettazione di algoritmi alla
realizzazione di ambienti esecutivi dedicati 285

E. Guarino, R. Bessegato, E. Maggi
Celle-funzione per la realizzazione di sistemi musicali elettronici 293

E. Maggi, A. Prestigiacomo
Portability of the MARS System 301

Section 6c: Other workstations

G. Bertini, D. Fabbri, M. Marani, L. Tarabella
MUST C 25 - Stazione di lavoro musicale con schede DSP
Leonard C25 307

P. Prevot, A. Debayeux
Constraint Satisfaction Programming in Computer Aided
Composition on a Highly Gestual Devoted System, Based on
a VME- Multi-Processor Joining True UNIX and Real-Time 311

Chapter 7: S/W AND H/W TOOLS FOR COMPOSITION AND PERFORMANCE

S. Bettini
Music 5 Mac 321

R. Bresin
MELODIA: a Program for Performance Rules Testing,
Teaching, and Piano Scores Performing 325

N. Larosa, C. Rosati
MEDUSA: a Powerful MIDI Processor 328

M. Laurson
PWConstraints 332

A. Provaglio
SoundLib 2.0 - Una libreria di classi C++ per l'elaborazione
di segnali audio campionati 336

G. Ramello
"HIPPOPOTAMUS": un sistema di performance interattivo 340

L. Tarabella, G. Bertini, M. Romboli
Le Twin Towers: un dispositivo per esecuzioni interattive
di computer music 344

Chapter 8: MULTIMEDIA SYSTEMS, VIRTUAL REALITY, SPATIALIZATION

A. Belladonna, A. Vidolin
Applicazione MAX per la simulazione di sorgenti sonore
in movimento con dispositivi commerciali a basso costo 351

A. Camurri, F. Giuffrida, G. Vercelli, R. Zaccaria
A System for Real-Time Control of Human Models on Stage 359

S. T. Pope and L. E. Fahlen
The Use of 3-D Audio in a Synthetic Environment:
an Aural Renderer for a Distributed Virtual Reality System 366

Chapter 9: STUDIO REPORT

P. Dutilleux
Center for Art Mediatechnology Karlsruhe: the Institute
for Music and Acoustics 379

L. Gamberin, S. Mosca
La biblioteca, il computer e la musica 382

L. Garau, G. Tedde
L'attività dell'Associazione Ricercare ed il suo studio per
la ricerca musicale e artistica 387

Chapter 10: MUSICAL COMPOSITIONS

Ludger Bruemmer
The Effect of Digital Synthesis Language on the Conception
and Process of Composition 393

Luigi Ceccarelli
DOPPIO SOLO 397

Fabio Cifariello Ciardi
FINZIONI 401

Alessandro Cipriani
VISIBILI 404

James Dashow
RECONSTRUCTIONS 407

A. Di Scipio
Sulla composizione di ZEITWERK (l'orizzonte delle cose) 410

Amedeo Gaggiolo, Silvia Dini
ANIMALI IN SOFFITTA 414

Francesco Galante
METAFONIE 418

Francesco Giomi
Alcune riflessioni intorno al brano elettroacustico
CHROMATISM 422

David Keane
WERVELWIND 425

Cort Lippe
A Composition for Clarinet and Real-Time Signal Processing:
Using Max on the IRCAM Signal Processing Workstation 428

Matteo Pennese
IHADA 433

Chapter 1

TUTORIAL

From DARMS to SMDL, and Back Again
Donald Sloan

Music Department
Ashland University
Ashland, OH 44805 (USA)
fax 419-289-5333
email authreen@class.org

Abstract.

The information standards HyTime and SMDL (ISO/IEC 10744 and ISO/IEC CD
10743, respectively) were created to handle representation and
manipulation of hypermedia documents. HyTime was designed to handle any
combination of media, while SMDL is being designed as an application of
HyTime specific to music information. In part, SMDL is intended to help
those working with databases of musical information. This paper contains
a demonstration of how SMDL can provide a standard for the exchange of
database information, regardless of the original database format. The
information captured in one such music representation scheme, DARMS, can
be shown to map into SMDL, and vice versa. In a like manner, other
representation schemes, including but not limited to SCORE, MUSTRAN,
ENIGMA, MIDI and CCARH, can have a standard base for information
exchange, without requiring that a separate pair of translators or
interpreters be designed for each pairwise combination of schemes.
Instead, each format would only need two such translators: one from the
original format into SMDL, and one from SMDL into the original format.
This would greatly reduce the number of translators necessary today, and
would ensure that in the future, should new representation schemes
appear, the need for translators would only grow arithmetically, not
geometrically.

Introduction.

Since 1986, the X3V1.8M committee of the American National Standards
Institute (ANSI) has been conducting a public standards process for the
representation of static and dynamic information in hypertext and
multimedia documents. This process has produced two related standards.
Hypermedia/Time-based Structuring Language (HyTime) has already been
approved by the International Organization for Standardization (ISO), of
which ANSI is the United States representative. HyTime has been assigned
the number ISO/IEC 10744:1992. Standard Music Description Language
(SMDL) is still in development, and in its current draft form bears the
number ISO/IEC CD 10743.

Both of these standards rely on an existing standard, SGML (ISO
8879:1986), a markup language which allows documents to be tagged in
such a way as to allow representation of virtually any document
architecture, by the representation of semantic and syntactic
information. SGML is used today by many publishers world-wide, both in
the public and private sectors. HyTime is an application of SGML; that
is, HyTime relies on certain functions and capabilities outlined in the
SGML standard. A HyTime user could design a Document Type Definition
(DTD) based on SGML conventions, plus those parts of HyTime that are
needed for that type of document. Many different instances of a given
document type can use the same DTD. In that sense, HyTime is an example
of an object-oriented approach; the objects (called elements in SGML)
are defined in the DTD, while their actual attributes can vary from
instance to instance. In a similar manner, SMDL is an application of
HyTime. It uses a DTD that relies on HyTime's facilities, plus those
unique to SMDL.

It is important to note that both HyTime and SMDL are enabling
standards. They place no restrictions on the content of what is
represented, nor do they enforce a single kind of document architecture.
By following the standards, one may gain those things that a standard
should confer: a representation scheme that is public, a means for
expressing the structure of information, and a set of tools for
performing useful operations on the information, such as data queries
and hypertext links. The standards do not constrain document styles; for
example, two different publishers may use different formats for chapter
headings. The tagging of a certain datum as a chapter heading is,
however, an important piece of information, regardless of what a user
may choose to do with it. This principle guided the developers of HyTime
and SMDL; a proper level of abstraction and an appropriate set of tools
will allow any author to retain both the data content and the document
structure in a useful way. These standards do not force an author to use
all of the facilities of the standards. One can include a non-SGML
document, and label it as such; one would lose much of the power of the
standards, but at least the choice is up to the user.

HyTime and SMDL are intended for use by virtually the
entire community of information processing, from paper to electronic
publishers, from hypertext documents to multimedia documents, from
authors to end users. It is hoped that developers will create a set of
software tools similar to what now exists for SGML, so that the
intricacies of HyTime and SMDL will be hidden behind an interface that
allows a wide range of users to transparently author and read HyTime and
SMDL documents. The HyTime standard can be obtained from national
standards organizations, as well as from the ISO/IEC Copyright Office,
Case Postale 56, CH-1211 Genève 20, Switzerland. Annex B of the HyTime
standard lists additional sources of information. SMDL is still in
development, and as such, is in draft form.

Why Two Standards?

Although this project began with only one standard in mind, it became
clear that the requirements of the music community were far more
detailed than those of the hypermedia community in general. HyTime
contains a highly abstract approach to information architecture,
consisting of several modules representing these functions: 1) a way to
measure the position and extent of objects, 2) a way to address objects
that cannot be given unique identifiers in SGML, 3) a way to link
objects, whether or not in the same document, 4) a way to schedule
events so that not only the proper sequence, but also the proper
relationship between events is maintained, and 5) a way to render a
particular instance of such a schedule. The scheduling and rendering
modules owe a particular debt to the music world; it was determined that
the scheduling model for musical events was at least as sophisticated as
that for any other medium, and thus any model that could represent
musical events was robust enough for virtually any schedule. The content
of SMDL, therefore, contains only those items unique to music
representation. The time model is already present from SMDL's
inheritance of HyTime facilities, meaning that an SMDL document will
invoke HyTime elements to represent durations. Those HyTime modules that
are not applicable to music representation may be left out; there are
different levels of conformance depending on user needs. Thus, if a user
does not need hypertext linking facilities, this need not be included.
This feature of modular architecture will make it easier for those using
HyTime and SMDL for music applications.

HyTime's Time Model.

HyTime uses its scheduling module to represent sequences of events,
durations, and other temporal relationships between objects. In HyTime,
a work may be represented by a number of coordinate axes, with each axis
representing a temporal or spatial dimension. Each event has a position
and extent on an axis. For temporal events, this corresponds to start
time and duration. Thus, it may be possible to show the staging of an
opera with a temporal axis showing the flow of the music, and two
spatial axes showing where each character is on stage at any given
moment of the music.

These axes together make up what is called a finite coordinate space, or
FCS. The measurement system for each axis is defined by the DTD, and
remains fixed for that work. An axis may use a standard measurement unit
(SMU) such as meters, seconds, etc., or a virtual measurement unit, with
its temporal or spatial meaning defined by the user. The latter system
is what would be useful to represent music that is written in Common
Western Notation. When musicians think of half notes and quarter notes,
these do not have an absolute duration in seconds until the time of
performance; even then, no two performances would be the same
temporally. Thus, a logical representation in virtual units is what is
required to properly capture the musical information.

While it is beyond the scope of this paper to give a tutorial extensive
enough to teach fluency in HyTime scheduling, a certain amount of this
can be shown in the following way: a potential DTD for an SMDL document
could contain the following information:

<!DOCTYPE work SYSTEM "smdl.dtd" [
<?HyTime VERSION "ISO/IEC 10744:1992" HYQCNT=32 >
<?HyTime MODULE base
    desctxt dvlist lextype refctl >
<?HyTime MODULE measure
    axismdu dimref fcsmdu HyFunk HyOp markfun >
<?HyTime MODULE sched
    grpdex >
<?HyTime MODULE rend
    modify patch profun project >
<!NOTATION virtime PUBLIC -- virtual time --
    "+//ISO/IEC 10744//NOTATION Virtual Time Unit//EN" >
<!ENTITY tactvtu -- number of vtus per tactus (beat) --
    "80640" >

<!ENTITY % av.wxdm -- dimension of wfaxis -- "4294967295" >
<!ENTITY % av.wxfm -- mdu def for workfcs -- "'SIsecond 1 1000'" >
<!ENTITY % av.wxbg -- basegran of wfaxis -- "msec" >
<!ENTITY % av.wxgh -- gran2hmu of wfaxis -- "'1 1'" >
<!ENTITY % av.wxpg -- pls2gran of wfaxis -- "'1 1'" >

<!ENTITY % av.cxdm -- dimension of mustime axis -- "4294967295" >
<!ENTITY % av.cxbg -- basegran of mustime axis -- "vtu" >
<!ENTITY % av.cxgh -- gran2hmu of mustime axis -- "'1 1'" >
<!ENTITY % av.cxpg -- pls2gran of mustime axis -- "'1 &tactvtu;'" >
<!ENTITY % av.cxfm -- mdu def for mustime axis -- "'virtime 1 1'" >

The first line of code gives a name to the DTD. The next group of lines
names those HyTime modules, and within each module, those functions
which will be supported by this DTD. The notation is declared as virtual
time, and the units of measurement virtual time units, or vtus. Each
beat or tactus is given a total of 80640 vtus, which ensures that equal
division of the beat into 2, 3, 4, 5, 6, 7, 8, 9, etc. units can be
achieved with integer totals in each case. This number may change
according to user needs; for example, there are certain Chopin works in
which a beat is divided into 17 notes. In such a case, the number of
vtus per tactus could be a multiple of 17 as well. This would guarantee
an integer number of vtus for each note, avoiding the problems inherent
in floating-point calculations. One may always arrive at a safe
'tactvtu' number by taking a common multiple of the various divisions of
the beat in a given piece. The least common multiple would be most
economical, of course, but since this DTD may serve several different
pieces, it is probably safer to choose a larger multiple that will not
have to be adjusted from piece to piece.

The next five entities define a real time axis, upon which a performance
of the musical work could be represented. Notice that the measurements
are in seconds and milliseconds, real time units.

The last five entities define a virtual time axis, upon which the
temporal information inherent in the score would reside. This axis is
defined in terms of virtual time units, or vtus. The user may declare in
another part of the document what the tactus represents. In this way, a
single DTD may be flexible enough to handle several different types of
pieces, or even a single piece in which the tactus changes.

These lines of code could be used potentially as a typical SMDL DTD,
thus obviating the
need to redefine all of the HyTime terms for each work of music
represented. There are other parts to this typical DTD that refer to
resource tables and other SMDL devices; these have been omitted in this
example in order to isolate those things relating directly to the time
model, but would ordinarily be included in a typical SMDL DTD.

Since this DTD allows for an axis in virtual time, to capture the
logical information, and an axis in real time, to capture a potential
performance, there must be a method for mapping from one to the other.
HyTime has elements in the rendition module called batons and wands to
create this conversion. A baton is a schedule of projectors which can
map virtual units to real units, with extensive facilities for
determining how to reconcile one to the other. A wand is a schedule of
modifiers which act upon HyTime objects. An example of a modifier would
be a filter, either sonic or visual. The modifying abilities of these
scopes could be defined by the user; for example, those interested in
representing electronic music may find it useful to define scopes in
terms of the formulas used to manipulate waveforms, rather than have to
name a specific pitch for a sound that is being processed in such a way
as to make pitch difficult to define. HyTime does not provide a standard
set of tools to describe sound manipulation; rather, it has a place in
the standard for those who wish to describe their sound manipulation
methods in a representation of their choosing.

From DARMS to SMDL...

The Digital Alternate Representation of Musical Scores, or DARMS, is one
of several data content notations currently in use in the music world
for the representation of music information. There exist musical
databases encoded in DARMS, as well as other data content notations,
such as SCORE, MUSTRAN, the CCARH code, MIDI and others. The inclusion
of MIDI in this list is an illustration that not all musical information
resides in scores; there is a need for capturing potential performances
as well. Since humans use a musical score as a blueprint for
performances, some people have believed that a data content notation
that represents a score represents a performance as well. Any programmer
can tell you that there is a good deal of interpretation that a machine
must accomplish in order to render a score in actual sound. DARMS is one
of those codes that captures the appearance of the musical symbols on a
score, but requires an intelligence to give sonic definition to these
symbols.

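Part of the interpretation needed to turn a score into sound is the mapping from virtual to real time described earlier. The following Python sketch is only an illustration under stated assumptions: it takes the 80640 vtus per tactus from the sample DTD, treats a note as an event with an integer position and extent on the virtual time axis, and projects it onto a millisecond axis at one constant tempo. The names `Event`, `safe_tactvtu`, `subdivide` and `project_to_msec` are hypothetical and are not HyTime or SMDL constructs; a real baton can describe far more than a single fixed tempo.

```python
from dataclasses import dataclass
from fractions import Fraction
from functools import reduce
from math import gcd

TACTVTU = 80640  # vtus per tactus (beat), as in the sample DTD


def lcm(a: int, b: int) -> int:
    """Least common multiple of two positive integers."""
    return a * b // gcd(a, b)


def safe_tactvtu(divisions) -> int:
    """Smallest vtu count per beat that every listed division divides evenly."""
    return reduce(lcm, divisions, 1)


@dataclass(frozen=True)
class Event:
    """A note on the virtual time axis: integer position and extent in vtus."""
    start: int
    extent: int


def subdivide(beat_index: int, parts: int, tactvtu: int = TACTVTU):
    """Split one beat into equal events, refusing non-integer extents."""
    if tactvtu % parts != 0:
        raise ValueError(f"{tactvtu} vtus cannot be split into {parts} equal parts")
    extent = tactvtu // parts
    origin = beat_index * tactvtu
    return [Event(origin + i * extent, extent) for i in range(parts)]


def project_to_msec(event: Event, beats_per_minute: int, tactvtu: int = TACTVTU):
    """Constant-tempo projection of a virtual-time event onto a msec axis.

    Returns (start, extent) as exact fractions of a millisecond.
    """
    msec_per_vtu = Fraction(60_000, beats_per_minute * tactvtu)
    return event.start * msec_per_vtu, event.extent * msec_per_vtu
```

With 80640 vtus per beat, divisions by 2 through 9 all give integer extents, while a 17-note beat is rejected until the count is raised to a multiple of 17, which is the adjustment the Chopin example calls for. Keeping positions as integers and deferring the conversion to exact fractions mirrors the paper's point about avoiding floating-point calculations in the logical representation.
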
SMDL is being designed to cover both the visual aspects of music
representation, such as scores, and the auditory aspects of music
representation, such as gestural information for performances. There
exist several domains for the information in an SMDL work. The set of
logical information, that is, the temporal relationships, pitch
information, and the like, is represented in a domain called the cantus.
The purely visual phenomena such as how the musical information appears
in a score, is in the visual domain. The purely audible phenomena, such
as how the music is manifest in a certain performance, is in the
gestural domain. Finally, manipulations that rely on the work's musical
information, but are not properly part of the piece of music reside in
an analytic domain. It is the cantus that contains that which uniquely
defines the work; one could imagine taking the information in the cantus
and creating several different-looking scores (e.g., two representations
of Hebrew cantillation, one using the original tropes, and another in
Western notation), or two different performances. And yet, in spite of
the different ways of representing it, there remains only a single work
of music in the cantus. This cantus represents the abstract information
of a piece of music.

Thus, a DARMS-encoded score could be represented in SMDL both in the
cantus, which would contain pitch names, articulations and relative note
proportions, and in the visual domain, which would contain such purely
visual phenomena as clefs, articulation symbols, beaming information,
etc. For each symbol in DARMS (or any other existing coding scheme, for
that matter), there will be a corresponding way to capture all of the
musical information in SMDL.

The latter point is crucial if SMDL is to be useful as a standard. SMDL
should be able to represent all that we regard as musical information.
In certain cases, this representation may be nothing more than an
ability to specify the extent of a chunk of data, and a label for what
it is. For example, rather than convert or translate a DARMS database
into SMDL data, one can include it in an SMDL document as an
uninterpreted section of data called "DARMS code." This will not allow
many operations that SMDL could otherwise perform on musical data, but
at least would provide a standard way to exchange data with proper clues
as to what the data content notation is. Similarly, a bitstream of sound
samples may remain undifferentiated in an SMDL document, but bear the
label "digital sound samples," so that a device that could potentially
use this information will know what it is.

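The division of an SMDL work into domains can be pictured with a small data-structure sketch in Python. The class and field names below (`Work`, `CantusEvent`, `VisualSymbol`, `OpaqueData`) are hypothetical and are not taken from the SMDL draft; the gestural and analytic domains are left out for brevity. The sketch only illustrates a single cantus shared by several visual renderings, plus a labeled block of uninterpreted data.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class CantusEvent:
    """Logical information: what the note is, independent of its appearance."""
    event_id: str
    pitch: str
    start_vtu: int
    extent_vtu: int


@dataclass(frozen=True)
class VisualSymbol:
    """Purely visual information that points back to one cantus event."""
    glyph: str
    cantus_ref: str


@dataclass(frozen=True)
class OpaqueData:
    """Uninterpreted data carried along with a label naming its notation."""
    notation: str
    payload: bytes


@dataclass
class Work:
    cantus: Dict[str, CantusEvent] = field(default_factory=dict)
    visual: Dict[str, List[VisualSymbol]] = field(default_factory=dict)  # keyed by score name
    opaque: List[OpaqueData] = field(default_factory=list)

    def add_event(self, event: CantusEvent) -> None:
        self.cantus[event.event_id] = event

    def add_symbol(self, score: str, symbol: VisualSymbol) -> None:
        # Every visual symbol must refer to an existing logical event.
        if symbol.cantus_ref not in self.cantus:
            raise KeyError(f"unknown cantus event: {symbol.cantus_ref}")
        self.visual.setdefault(score, []).append(symbol)

    def resolve(self, symbol: VisualSymbol) -> CantusEvent:
        """Return the logical event that a visual symbol stands for."""
        return self.cantus[symbol.cantus_ref]
```

In this sketch, two scores of the same work (say, one in tropes and one in Western notation) would hold different `VisualSymbol` lists whose references resolve to the same `CantusEvent` objects, while an `OpaqueData` entry labeled "DARMS code" would carry an encoding through unchanged without making it available for queries.
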
...And Back Again.

Since SMDL can represent at least as much information as would be in a DARMS-encoded score, conversion from SMDL to DARMS would be a matter of using only that information which could be captured by DARMS. If one used only the information present in the cantus of an SMDL work, then there is no guarantee that the resulting DARMS translation would be exactly like the original that was translated to SMDL in the first place. This is because the cantus contains information about the exact pitch, but not how it is represented on a page. Thus, in order to get from an SMDL work back to the DARMS code with which one started, one would have to use both the cantus and a visual domain together. Fortunately, SMDL has an element in the visual domain in which any datum in that domain can be referenced to its corresponding logical information in the cantus. Thus, one can be sure that an event in the cantus and an event in the visual domain refer to the same musical object, and can be used together to translate to the corresponding DARMS code. The link direction of such a reference is from the visual domain to the cantus, not the other way around, since there are potentially limitless ways to represent the logical information visually, but each visual symbol should refer uniquely to its logical information.

Why is it important to be able to translate from DARMS to SMDL, and vice versa? If one wanted to be able to share databases of musical information, without a standard, it would be necessary to create a pair of translators for each pairwise combination of codes. Thus, if there are ten codes commonly in use, a full translating capability would require 90 different translators. With a standard, one would only need a translator from each code to the standard, and back again. The same ten codes could be fully shared with only 20 different translators. As representation standards proliferate, the need for translators would grow quadratically without a standard, but only linearly with a standard. Such a standard need not crowd out all other methods of representation; rather, it can be used only to the extent that it is valuable to do so. It has been suggested that using an already-existing code would save the trouble of developing a standard such as SMDL. Yet, it has not been demonstrated that any existing code can capture all of the information necessary for both musical score and musical performance.
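The translator counts quoted above (90 versus 20 for ten codes) follow from simple counting: n codes need n(n-1) directed translators when every code is translated directly to every other, and 2n when each code is translated only to and from a common standard. The short Python sketch below reproduces the arithmetic; the function names are illustrative and not taken from the paper.

```python
def pairwise_translators(n_codes: int) -> int:
    """One translator in each direction for every pair of codes: n * (n - 1)."""
    return n_codes * (n_codes - 1)


def standard_translators(n_codes: int) -> int:
    """One translator to the standard and one back for every code: 2 * n."""
    return 2 * n_codes


for n in (2, 5, 10, 20):
    print(n, pairwise_translators(n), standard_translators(n))
```

For n = 10 the two functions give the 90 and 20 translators mentioned in the text. Doubling the number of codes roughly quadruples the first figure but only doubles the second.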

Current Status.
Unfortunately, development
of SMDL has been slowed while
its parent application, HyTime,
undergoes the inevitable ironing
out of wrinkles during its initial
implementations. At the present
time, SMDL is incomplete, and is
not ready for developers to use.
This will change shortly, as
HyTime matures. In the
meantime, those who have been
using another coding system for
databases can continue to use it;
when SMDL is ready, a
translator can convert the data.
While SMDL will not include
such a translator in the standard,
for those codes where such a
need is evident in the music
community, it is hoped and
anticipated that developers will
create such tools as translators.
The success of a music standard
such as SMDL and the needs of
the music community are thus
intertwined.

Chapter 2

MUSIC THEORY,
COMPOSITION
AND MUSICOLOGY

SOME OBSERVATIONS ON THE
PERFORMANCE OF
BEETHOVEN'S FIFTH SYMPHONY
Mario Baroni, Luigi Finarelli

Dipartimento di Musica e Spettacolo, Università di Bologna
via Galliera, 3
I-40125 Bologna (Italy)
fax +3951 231183
E-mail g3ubouc1@icineca

Abstract

This paper presents a tentative approach to the study of an orchestral musical performance. The opening theme of Beethoven's Fifth Symphony has been analyzed. Only timing is taken into consideration in this first phase of the analysis. Some results are presented, but mainly methodological problems are discussed.

a) Objectives

A survey of the literature on musical performance shows that, after the pioneering period of past years, scientific reflection is now reaching greater confidence and stability. Awareness of this state of affairs prompted us to probe, on an experimental basis, a terrain that, in at least two respects, has not so far been frequently explored.

The first aspect concerns the very framing of the problem. As a rule, the literature on musical performance has its background in psychology or psychoacoustics, and its primary aim is to examine in detail the relationship between sound phenomena and their cognitive processing. In our case the aim is instead to begin to see to what extent the results obtained so far can be used to answer questions coming from the field of musicology.

The second aspect concerns the sound sources examined. In the great majority of the cases considered in the literature, awareness of the limits of today's technology has appropriately suggested keeping to the examination of fragments performed on monophonic instruments or on the piano. In our case too we have kept to monophonic examples, but we have
taken into consideration a complex instrument such as the orchestra.

In this first approach we limited ourselves to a few simple investigations, partly because the initial intention is to find out at first hand how far it is possible today to venture into a field of this kind. We considered one of the most famous themes in all Western music: the opening of Beethoven's Fifth Symphony, performed by Solti, Abbado, Karajan, Toscanini, Knappertsbusch and Harnoncourt.

b) Technological tools, measurements, problems of method

To collect the data described, we used the digital editing functions of the Sonic Solutions system [1], which, by means of 12 virtual parallel tracks, also allows an immediate visual comparison of the amplitude profiles of the sampled recordings.

In analysing performances of orchestral music we encountered a series of problems that are not easy to solve: the technique and period of the recording, the acoustic response of the venue and, in addition, the response times typical of the orchestra, due to the size of the ensemble, the asynchrony of the attacks, the attack transients and the transmission time of the acoustic signal in the hall.

We considered, however, that each recording was listened to again and approved for release by the conductor, as principal interpreter, as a well-defined performance from both the artistic and the historical point of view.

For the moment we have limited ourselves to timing measurements only, but we plan to extend the investigation by also taking amplitude and timbre into account.

This limitation also greatly reduces the possibilities for interpreting the results. Our musicological aim is in fact to reconstruct, by interpreting the sound data, the expressive orientations each conductor holds at the moment he uses the margins of freedom that the written text allows him. The first condition for beginning work of this kind is that measurements on several parameters can be made for the same fragment. There are in fact documentable correlations between duration choices of various kinds and amplitude choices (and probably timbral choices too) that obey principles of expressive coherence [2]. Limiting ourselves to a single parameter (duration) reduces the scope for verifying our interpretation.

The only way to investigate the coherence of the choices is, in our case, to compare durations in
different fragments of the same performance, or the durations of the same fragment in different performances. It is clear, however, that in these cases uncontrolled external variables interfere and make the interpretation uncertain. Nevertheless, as we have said, our aim is only to begin exploring the field. Broadly speaking, we shall therefore confine ourselves to pointing out problems rather than giving results.

c) Comparison between the theme and its repetitions

In the score the main theme (bars 1-5) returns exactly unchanged on two other occasions: the repetition after the repeat sign at bar 124, and the recapitulation at bar 248.

Comparing these three fragments raises a problem well known to scholars in the field: performance does not merely reproduce the written notes, but is always applied to a structure. Before being performed, every fragment must be inwardly conceived and defined by its performer, by assigning it a precise functional role in the syntax of the piece as well as particular expressive characteristics. Only at this point does the performer settle his interpretative choices [3].

Now, the three appearances of the theme in the first movement have three quite different syntactic functions: the first has the role of the initial statement, the second is a repeat, and the third (the recapitulation) arrives only at the culmination of the phrase that precedes it. The last appearance therefore has two functions: it is a recapitulation with respect to the large-scale form, and a climax with respect to the phrase in which it is embedded. The durations of the three instances vary as shown in Fig. 1.

Fig. 1. Duration in seconds of the theme of the first movement of Beethoven's Fifth Symphony in its repetitions; from left: Solti, Abbado, Karajan, Toscanini, Knappertsbusch, Harnoncourt.

Karajan, Abbado and Knappertsbusch, as can be seen, slow the tempo in the third repetition. It is very likely that this slowing can be interpreted in relation to the climax, since in a great many examples emphatic underlining presupposes precisely a slowing down [4]. This would, however, need to be confirmed by further measurements of amplitude and timbre. Nor is it surprising that Solti, Toscanini and Harnoncourt instead adopt exactly the opposite strategy. In this connection Sundberg [5] reports significant cases that
he calls "synonymies" (the same effect obtained with different or even opposite structures). Even in this case, however, it would remain to be decided what kind of emotional properties to attribute to this acceleration, and this can be done on the basis of the "motion-emotion" theory set out by Gabrielsson [6]. To do so, however, other, more precise data will be needed.

d) Comparison between the different performances of the theme

Fig. 2. Total duration in minutes of the first movement of Beethoven's Fifth Symphony; from left: Solti, Abbado, Karajan, Toscanini, Knappertsbusch, Harnoncourt.

The theme was divided into two elements (G-G-G-E flat, F-F-F-D), and we measured the duration of each.

In this context we shall be concerned above all with the initial statement of the theme (disregarding the durations of the repeat and the recapitulation, which are nevertheless shown in the figure). Contrary to what one might expect, the fastest statement is by no means Toscanini's: the performances by Karajan and Harnoncourt are much shorter. Knappertsbusch, on the other hand, keeps his reputation as a solemnly slow performer. If, however, the overall durations of the performance (Fig. 2) are compared with this result, it is clear that the choice of tempo for the theme does not coincide with that of the so-called mean pulse [6]: here Toscanini regains his primacy and Karajan proves considerably slower than him. The choice of tempo shown in Fig. 3 therefore referred not to the symphony as such, but specifically to its theme.

Fig. 3. Durations in seconds of the two elements of the theme of the first movement of Beethoven's Fifth Symphony (G-G-G-E flat in black, rest in white and F-F-F-D in grey); from left: Solti, Abbado, Karajan, Toscanini, Knappertsbusch, Harnoncourt.

Here too it is obvious that duration pure and simple cannot provide any indication of interpretative intentions. It is very likely, for example, that between Harnoncourt's "speed" and Karajan's there exists the kind of opposition that
Clarke [2] defines as "understatement" (excess of restraint) vs. "emphasis" (great impetus); but to say this, we repeat, further acoustic evidence is needed that we have not yet collected.

As regards the relationship between the first and second elements of the theme (Fig. 3), G-G-G-E flat generally seems decidedly faster (tense? impetuous?) than F-F-F-D. But this interpretation will need refining. It lacks an important piece of data, namely the duration of the E flat and the D under their fermatas relative to the three repeated notes. If, as all scholars in the field unanimously insist, the specific character of performance lies in deviations from the notated duration, which is to be understood as a category of duration and not as a real duration [7], then the fermata is already an eminently interpretative sign that Beethoven himself suggests to his performers. Some, however, take it more dramatically and some less. Unfortunately, we had doubts about the measurement of this particular note, because its attack and above all its decay pose serious problems of interpretation. The rest in Toscanini and Knappertsbusch is evidently due to the recording technique. But for the other performers, how long exactly did it last?

References

[1] The Sonic System, User Manual, Software version 2.0.5, Sonic Solutions, San Rafael, 1993
[2] E.F. Clarke: "Generative principles in music performance", in "Generative Processes in Music", J.A. Sloboda ed., Clarendon Press, Oxford, pp. 1-26, 1988
[3] B.H. Repp: "Patterns of expressive timing in performances of a Beethoven minuet by nineteen famous pianists", Journal of the Acoustical Society of America, Vol. 88, N. 2, pp. 622-641, 1990
[4] J. Sundberg: "Music performance research: An overview", in "Music, Language, Speech and Brain", J. Sundberg, L. Nord, R. Carlson eds., Macmillan, London, pp. 173-183, 1991
[5] J. Sundberg: "Computer synthesis of music performance", in "Generative Processes in Music", J.A. Sloboda ed., Clarendon Press, Oxford, pp. 52-69, 1988
[6] A. Gabrielsson: "Timing in music performance and its relations to music experience", in "Generative Processes in Music", J.A. Sloboda ed., Clarendon Press, Oxford, pp. 27-51, 1988
[7] E.F. Clarke: "Categorical rhythm perception: an ecological perspective", in "Action and Perception in Rhythm and Music", A. Gabrielsson ed., Royal Swedish Academy of Music, Stockholm, pp. 19-34, 1987

EXPERT SYSTEMS IN MUSICOLOGY:
A PROTOTYPE FOR
TIME-SPAN REDUCTION ANALYSIS

L. Camilleri, F. Carreras, F. Giomi

Divisione Musicologica CNUCE/C.N.R.
Conservatorio di Musica "L. Cherubini"
P.zza Belle Arti, 2
I-50122 Firenze (Italy)
Tel.: +39-55-282105
E-mail: conserva@vm.idg.fi.cnr.it

Abstract

The project of developing an expert system for tonal harmonic analysis started from some theoretical considerations about existing theories (mainly Lerdahl and Jackendoff's GTTM) and pursued two purposes: 1) the creation of an analysis environment that can easily be improved and extended; 2) the testing and improvement of the theory's contents, together with verification of how well different theoretical assertions integrate; for example, the theory of tonal pitch spaces, proposed again by Lerdahl in 1988, can give useful hints for finding the harmonic modulation points.

The system is composed of a harmonic lexicon and a rule system. At first the system divides the piece into time slices, as in the analytical methodology; they are catalogued according to their harmonic properties, and at this stage the ambiguous or unconventional slices (with regard to their relation to a triadic chord) are interpreted as probable chords and labelled. Then the probable cadential points are found: this part of the system deals with the representation of the structural properties of the "harmonic phrase". After this surface analysis, the system starts to abstract the distinct hierarchical levels up to the deepest one, using different rules and also different degrees of influence for each rule at each level.

The implementation makes use of different rule-based AI techniques: the algorithmic part is written in Pascal, while the facilities of the IBM shell ESE are particularly useful for speeding up the development process. Forward and backward chaining as well as rule execution control statements are used.
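The first stage outlined in the abstract, labelling each time slice by its relation to a triadic chord, might be sketched as follows. This is only an illustration under stated assumptions, not the authors' Pascal/ESE implementation: pitches are assumed to be given as pitch classes (C = 0 ... B = 11), the names `TRIAD_LEXICON` and `label_slice` are hypothetical, and slices that do not form a plain triad are simply returned as `None` rather than being interpreted as "probable chords".

```python
# Tiny "harmonic lexicon": the two stacked thirds (in semitones) of each triad type.
TRIAD_LEXICON = {
    (4, 3): "major",
    (3, 4): "minor",
    (3, 3): "diminished",
    (4, 4): "augmented",
}


def label_slice(pitch_classes):
    """Return (root, quality) if the slice is a triad in any inversion, else None."""
    pcs = sorted(set(pc % 12 for pc in pitch_classes))
    if len(pcs) != 3:
        return None  # not a three-note chord; left unlabelled in this sketch
    for root in pcs:
        # Intervals of the other two notes above the candidate root, ascending.
        above = sorted((pc - root) % 12 for pc in pcs if pc != root)
        thirds = (above[0], above[1] - above[0])
        quality = TRIAD_LEXICON.get(thirds)
        if quality is not None:
            return root, quality
    return None


print(label_slice([4, 7, 0]))   # C major in first inversion
print(label_slice([9, 0, 4]))   # A minor
print(label_slice([0, 2, 7]))   # no stack of thirds
```

Trying every note as a candidate root makes the label independent of inversion. An augmented triad is symmetrical, so the sketch reports its lowest pitch class as root, which a real system would have to disambiguate from context.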

Introduction

The research was carried out by the Divisione Musicologica of CNUCE/C.N.R. at the Conservatorio di Musica "L. Cherubini" in Florence, with the aim of applying artificial intelligence techniques to the harmonic analysis of Western tonal music.

The work began with a first phase in which some recent music theories were studied in depth, extended and integrated in order to create an environment for harmonic analysis; a prototype expert system was then developed whose theoretical foundations lie mainly in the generative theory of tonal music (GTTM) of Fred Lerdahl and Ray Jackendoff [1][2]. The final goal is the automated execution of one of the paradigms of the theory, time-span reduction: a set of rules and criteria that allow a piece to be progressively reduced, within the time spans identified at each step, to its harmonically most important elements. This reduction procedure, organised on several levels, supports a process of structural analysis whose theoretical assumptions we have extended, compared with principles derived from other music theories, and integrated with some specially developed theoretical components.

In the light of these considerations, in designing and building the system we set ourselves two main objectives:

1) the creation of an environment for harmonic analysis, open to improvement and extension;
2) the verification and enrichment of the theoretical content employed, on the basis of the analytical results obtained from the integrated use of several theoretical sources.

What may be regarded as the predecessor of the system, both for the similarity of the development tools employed and for the kind of music theory on which it is based, is the prototype for identifying the grouping structure of tonal melodic pieces, also built by our research group [3]. The segmentation rules it contains are based on the part of Lerdahl and Jackendoff's theory that addresses a crucial factor in musical understanding: the segmentation, at several levels, of a melodic line into hierarchies of groups [1][2]; this part was also supplemented with contributions from other theories of semiological and psychological origin. The input is the encoding of a tonal melody, and the output of the program is the hierarchical structure of the division into groups at every level, together with the elements (sequences of notes or sequences of lower-level groups) contained in each group.

Two other important works in this field could not be ignored in the design: the expert systems of John Maxwell and Kemal Ebcioglu. The first [4] is a rule-based prototype that analyses the harmonic function of the chords in a piece of Baroque music, whereas Ebcioglu's system [5] relies on a process of synthesis, rather than analysis, to investigate the repertoire of Bach chorales, which are harmonised and analysed automatically with analytical techniques of a largely Schenkerian cast.

Analytical approaches

GTTM can serve as a model of analysis that by its nature lends itself well to implementation, and hence to verification and, where appropriate, to integration and extension [6]. First, it attempts to produce formal descriptions of the pieces of music analysed. The analytical results can be used as a term of comparison, in relation to common musical intuitions, for verifying the theoretical principles asserted. Second, the theory is psychological in nature, since it attempts to explain cognitive abilities: attention is directed to the structural organisation that the experienced listener unconsciously attributes to a piece of music, and to the "universal" cognitive principles through which that organisation is determined.

Among the scholars of computational musicology who have attempted direct implementations of the theoretical ideas of GTTM, it is worth mentioning Michael Baker, known for building systems for automatic analysis based on grouping and for forms of interaction between grouping analysis, musical parallelism and levels of reduction [7]; the Brooklyn College team, with its rule system and a connectionist model for discovering the metrical and grouping structures of tonal pieces [8]; and, recently, Frode Holm, who proposes applying communicating sequential processes to time-span reduction, with an algorithm for the audible reproduction of the levels of analysis [9].

Although well formalised and documented, Lerdahl and Jackendoff's theory undoubtedly has some gaps, and the implementation problems to be solved are not few, since many aspects that cannot be identified simply by reading the text are left rather vague. One of these concerns, for example, some of the criteria for the hierarchical reduction of a piece of music at the harmonic level. Concepts such as harmonic proximity are introduced without a precise definition, or at least without indications for their actual use in choosing the harmonic elements to be reduced.

One of the distinguishing features of our research is that we have built a model in which the hierarchical reduction procedures interact with the identification of the tonal spaces of the piece, through the examination of stability values obtained by following and modifying a recent study by Lerdahl himself [10]. He proposes a model of "tonal space" whose distinctive feature, as the author himself says, is that it treats pitches, chords and tonal regions within one and the same structure. The aim of that work is precisely to describe the harmonic relations among these three categories of elements in terms of so-called proximity conditions. The terms "proximity" and "stability" denote broadly the same concept: both express numerically the "closeness" (and hence the distance) of a note, a chord or a region with respect to its reference element, namely the tonic, the first-degree chord or the home key of the piece.

By assigning stability values it is possible to build a hierarchy among distinct musical events, which differ from one another in harmonic closeness or remoteness.

A method for recognising modulations can be designed on the basis of considerations involving the stability values assigned to the chords of a piece: the tonal space model proposed by Lerdahl can in turn be extended and modified so as to obtain, after a first phase of stability modelling, a complete sequence of the values assigned to the chords. Examining the sequence and its trend can give information about the presence of a possible modulation.

The computational approach is indispensable, since it would be practically unthinkable to build dozens of tonal spaces by hand in a short time, compare chord notes, compute stability values and their averages, and finally decide whether or not a modulation is present.

Both GTTM and the recent extensions just described form the theoretical basis of a harmonic analysis system based on hybrid procedural and knowledge-representation techniques.

One of the most significant aspects of GTTM, from the harmonic point of view, is time-span reduction. The analysis of the piece consists of a progressive simplification in which, at each step, less important events are deleted in favour of structurally more important ones (the vertices, or heads, of the reduction) until a sort of skeleton of the piece emerges.

In time-span reduction, the domains of harmonic and melodic elaboration within which to choose the
events are defined by the rhythmic context of the metrical and grouping structures. Each successive line of the analysis is the result of deleting relatively less important events from the line above. The procedure is repeated from the level of the shortest notes up to the level that groups the entire fragment. If the analytical reductions are correct, each level sounds like a natural simplification of the preceding line.

The rules that carry out time-span reduction starting from the surface level can be divided schematically into two parts. The first part derives the domains within which the reductions are to be made, starting from the smallest metrical units and moving up towards the larger domains. The second part contains the rules that identify, within a domain, the elements acting as head or vertex of the reduction (the hierarchically most important element).

Let us see which factors determine the choice of an element as vertex, starting from domains (the time spans) containing two or more events. No single factor is decisive in itself; each contributes to some extent to the overall decision. Initially, metrical position is taken into account: a metrically stronger element (with respect to its position within the bar) will be preferred as head of the reduction. Two other important factors concern the "tonal" properties of the events: intrinsic harmonic consonance and stability with respect to the tonic. Relatively consonant events with a higher degree of stability are naturally preferred as vertices. The melodic voice leading is also taken into account, favouring reductions that lead to melodic profiles that are as stable as possible.

As one moves away from the surface level, other factors, linked above all to macro-structural considerations, gradually gain in importance. In particular, events that belong to a structural beginning or a "structural ending" (for example, cadences) become relevant in the reduction; the motion of a musical phrase usually unfolds precisely between these two structurally important points of relative formal stability.

Rules must also be included to check for certain conditions of parallelism, which may occur in either a melodic or a harmonic sense between adjacent events or pairs of events.

Structure of the expert system

The system is organised into several blocks, representing the different
logical functions of the process of harmonic analysis of the score:

1. ENCODING AND LEXICAL ANALYSIS OF THE SCORE
2. CHORD RECOGNITION
3. COMPUTATION OF STABILITY VALUES
4. CADENCE RECOGNITION
5. MODULATION RECOGNITION
6. FIRST LEVEL OF REDUCTION
i. N-TH LEVEL OF REDUCTION

Initially the analytical system carries out a harmonic-lexical investigation. The piece is encoded as a division into time slices, each comprising one chord, which are recognised and labelled according to their harmonic properties. After a part devoted to determining the tonal spaces, the cadential points and the modulations are identified, completing the structural representation of the "harmonic phrases". After this kind of surface analysis, the various hierarchical levels of the harmony begin to be distinguished according to the reduction schemes, until the structure of the piece is fully brought out.

In detail, the slice is a list of six fields containing the notes of the chord, represented in Anglo-Saxon notation with the name-accidental-octave system, the duration of the chord, the dynamic of the chord, any dynamic or interpretative markings and, finally, the bar number. This representation is progressively enriched automatically: at the end of the second block, the information derived from recognising the type of chord represented is added.

For this problem we chose a classic, intuitively simple solution: it is based on an algorithm that computes the intervals between all the notes of the chord, orders the notes by thirds and, on the basis of the succession of thirds, identifies the type of chord and assigns it a name. Major and minor, diminished and augmented triads are recognised, as well as the same chords with added sevenths and ninths. Once the name of the chord is known, some further properties can be established, such as whether or not it belongs to the home key of the piece.

As mentioned earlier, the original theory of tonal pitch space can be modified and extended and, above all, used in ways and for purposes rather different from the canonical ones: it is possible to formulate an algorithm for detecting possible changes of key by examining the individual stability values and their sequences.

The method used to attempt to identify modulations draws both on information about the relative stability of the chords and on some
global data about the piece, including the cadential points. After the stability-modelling phase, the complete sequence of the values assigned to the chords is available. Repeated experimental constructions of the tonal spaces made it possible to establish a first threshold value for the stability of a chord, above which the chord can be regarded as the beginning of a modulation. The hypothesis that a modulation is present must be corroborated by the results of analysing the subsequent stability values. If the following chords, up to the next cadential point, continue this tendency to exceed the threshold, it can be stated with a fair degree of reliability that a modulation is present. The new tonic chord of such a modulation will be the chord of resolution of the cadence that marks the limit of the span examined.

The list of cadential points provides information about the structure of the piece to the parts that detect modulations and carry out the levels of hierarchical reduction: we have developed a set of rules for identifying them. These rules run tests on three chords: the two chords of the cadence and the one preceding them. They examine the mode of the chords, their positions, any markings such as fermatas, and the melodic profile of the bass line, as well as other factors that exclude configurations arising from conditions similar to cadential ones. The rules are organised according to an activation hierarchy that allows a cadential point to be assigned with different degrees of reliability.

Carrying out the first level of reduction, on the other hand, means expressing a hierarchy among particular musical events within a fixed rhythmic unit, the time span. At the initial level the factors that come into play in determining the head are mainly metrical and/or concern harmonic relations. At this level, for example, appoggiaturas, passing notes, neighbour notes and other events of this class are reduced.

To compare the consonance of two chords hierarchically, we introduced a form of classification that can be deduced from the chord type. The classification used is our own variant of the one introduced by Maxwell in his system [4], and divides chords as follows:

1) major and minor triads;
2) diminished triads, major and minor seventh chords;
3) augmented triads;
4) other triads.

The chord is assigned a value corresponding to the number of its class. There is also an intermediate category between the first and the second, containing only the dominant seventh chord of the key. Chords in first or second inversion have half a point added to compensate for the "slight loss" of consonance.

Each rule governing the reduction processes is assigned a certainty factor that determines its weight in choosing the vertices. These weights can be changed by the analyst to control the progressive influence of the factors involved, or the predominance of one of them over the others.

In particular, the reduction operates at the level of the smallest rhythmic unit of the piece, preferring

Its length depends not only on the minimum durations of the events present, but also on the complexity of the piece under examination.

At the levels after the first, all the factors listed above remain active, but others begin to come into play that require more than two chords to be read at once, for example groups of four events. The choice of the width of the time span is precisely one of the most difficult aspects of this part of the analysis. The rules must take into account the presence of anacruses, and of conditions of ending and beginning
come vertici accordi che risul- battuta, dei battiti metrici intemi
tano pili consonanti, che sono pili aIle battute e di molti altri para-
"vicini" alIa tonica (tramite i metri. Da notare che il concetto
valori di stabilita) e che cadono stesso di battuta perde di signifi-
inoltre su una posizione metrica cato man mana che ci si allon-
forte. Ulteriore consistenza alIa tana dal HveUo superficiale; in
riduzione viene data da un con- genere i livelli dal terzo in poi
trollo sulle altezze pill acute della non ne fanno praticamente pill
melodia e su queUe pill gravi uso.
della linea del basso. Ognuno dei La scelta degli elementi da con-
fattori presi in esame contribui- servare al Hvello via via in og-
sce aUa determinazione finale getto pub essere regolata da
dell'accordo da eliminare in mi- considerazioni riguardanti anche
sura variabile ed eventualmente certe caratteristiche di conti-
modificabile da parte dell'anali- guita. Per esempio se all'intemo
sta. Proprio la possibilita di in- della quatema gli accordi 1 e 4
tervenire su questi fattori, va- hanno Ie voci del soprano che
riandone il peso relativo, pub es- distano un tone 0 semitono, op-
sere veicolo di una interessante pure queUe del basso che distano
sperimentazione circa la diversa 10 stesso intervallo, si preferisce
importanza degli aspetti sotto- ridurre gli altri e lasciare questi,
stanti i processi di riduzione e o quantomeno attribuire loro una
sulla loro interazione. probabilita alta di non riduzione,
II procedimento di riduzione valutabile successivamente as-
prosegue per un numero di li- sieme ai dati provenienti dai con-
velli non determinabile a priori. fronti degli altri fattori. Regole

43
Dedicated rules also assign a good probability of being a reduction head to elements belonging to the "structural beginnings or endings" of musical phrases. Since these are among the elements that determine the supporting framework of a phrase or of a piece, it is natural to think that the events contributing to their function should also receive special treatment, delaying their reduction as long as possible.

Another very important factor is the one based on the parallelism rules, which can be effectively defined in the words of Lerdahl and Jackendoff themselves: "if two or more time spans can be construed as parallel from the point of view of motive or rhythm, then assign them parallel reduction heads" [1]. This is of course a general definition, since in practice verifying the conditions of parallelism is very difficult and computationally expensive. Many different cases can arise, such as the presence in contiguous spans of progressions of equal intervals, or of equal intervallic relationships with respect to the degree of the chords, and others. This lack of specificity about parallelism is perhaps one of the weak points of the generative theory of tonal music, and in fact it offers a ready field of experimentation for music analysis, especially from the cognitive point of view.

The last level of reduction, the lowest in musical notation and the root of the tree in linguistic notation, is represented by the single tonic chord of the key which, trivially, often coincides with the last chord of the piece.

Implementation and experimentation
As regards the implementation, the system consists of a knowledge base made up of rules for representing the data and the analytical model. The rules are activated by an inference engine that uses forward and backward chaining techniques, together with a set of control structures that represent the analytical strategy (Focus Control Blocks, FCBs). The system includes a procedural part, written in Pascal, which handles microstructural problems such as purely algorithmic functions and the exchange of data with the outside.

Returning to the structure of the logical system, the hierarchical division of the analysis tasks is shown by the tree of FCBs (Fig. 1). Each FCB represents a logical set of operations and contributes to a tree structure whose nodes are visited in cycles to follow the analytical reasoning and bring out the results.
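The consonance classification and the analyst-adjustable weights described in the reduction section can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' rule base: the value 1.5 for the dominant seventh, the `Chord` fields `tonic_distance` and `metric_strength`, and the linear scoring in `head_score` are all hypothetical choices made only to show how such factors could be combined.

```python
from dataclasses import dataclass

# Class numbers follow the text: 1 = major/minor triads, 2 = diminished triads and
# major/minor seventh chords, 3 = augmented triads, 4 = everything else.
# The dominant seventh lies between classes 1 and 2; 1.5 is an assumed value.
CLASS_VALUE = {
    "major": 1.0, "minor": 1.0,
    "dominant7": 1.5,
    "diminished": 2.0, "major7": 2.0, "minor7": 2.0,
    "augmented": 3.0,
}

@dataclass
class Chord:
    kind: str
    inversion: int = 0            # 0 = root position, 1 or 2 = inversions
    tonic_distance: float = 0.0   # hypothetical: smaller means closer to the tonic
    metric_strength: float = 0.0  # hypothetical: larger means stronger beat

def consonance_value(chord):
    """Lower values mean more consonant; inversions add half a point."""
    value = CLASS_VALUE.get(chord.kind, 4.0)
    if chord.inversion in (1, 2):
        value += 0.5
    return value

def head_score(chord, weights):
    """Assumed linear combination of the factors named in the text."""
    return (-weights["consonance"] * consonance_value(chord)
            - weights["stability"] * chord.tonic_distance
            + weights["metric"] * chord.metric_strength)

def choose_head(span, weights):
    """Return the chord of a time span with the highest score."""
    return max(span, key=lambda chord: head_score(chord, weights))
```

Changing the entries of `weights` plays the role of the analyst changing the certainty factors: raising `weights["metric"]`, for instance, lets a less consonant chord on a strong beat win over a consonant chord on a weak beat.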
Fig. 1. Activation tree of the knowledge base.

The ANALISI FCB controls the overall flow of operations and, after acquiring the general data of the piece, establishes the list of its strong beats by means of specific rules. It then activates its child nodes in sequence, corresponding to the subsequent analytical operations. At every level of the tree, external procedures are activated to exchange intermediate data and to pass on the results obtained.

The CADENZE FCB activates the part that identifies the cadences, the LIVELLO1 FCB carries out the first level of reduction, and the LIVELLO n FCB repeats the reduction procedure from the second level to the last.

Each group of FCBs needs particular information about the piece being analysed. We therefore chose not to represent the slices in full and in the same way throughout the consultation, but to provide a system able to change the set of data available, case by case, for the choices guided by the rules. This is a "variable" form of representation, in the sense that each subproblem refers to a specific subset of the available information, organized in list structures.

The harmonic analysis system takes as input the alphanumeric encoding of the piece to be analysed and, at the end of the analytical procedures, produces a harmonic and structural analysis of the piece expressed in terms of reduction levels. Between these two extremes a whole series of intermediate results is available, equally interesting from a musicological point of view and also important for constant control over the execution of the analytical process.
These results, which we might call partial, consist of:
1) chord recognition, expressed as a labelling of the chords and verifiable in a dedicated control file;
2) a description of the cadence points, available within the knowledge base by printing the specific list;
3) a description of the sections of harmonic modulation, available by examining the file that contains the information on the slices.

An indirect result, useful for integrating, verifying and extending the theory, is the trace of the reasoning (trace file), available at different levels of detail as the outcome of a consultation of the knowledge base. It can show the passing of control between blocks, the search for values and their assignment to parameters, the resolution of constraints and, above all, the testing and firing of the rules, their state at every instant and other secondary information.

As for the consultation, every intermediate result that appears on the screen can be modified by the music analyst simply by overwriting the values shown. As already mentioned, the analyst can also act on the weights of certain reduction rules by changing their values from the keyboard.

Initially the tests, both of the pre-processing phases and of the inference phases, were carried out on relatively simple pieces, typically from the Baroque repertoire, and then continued with pieces of progressively increasing complexity. As the degree of musical complexity rose, new variables and new points of view were taken into account by adding rules and modifying the procedural parts.

This feedback process revealed, already in some of the first analysis examples, some discrepancies with the manual results obtained by Lerdahl and Jackendoff themselves. These are mainly differences in the results of the reductions, and they should not be interpreted as errors in the formalization of the generative theory but rather as the expression of a different interpretation of certain musical cases. It will be interesting to examine, both through a discussion with the authors of the GTTM and through comparison with other theoretical frameworks, the reasons for results of this kind and the conclusions to be drawn from them.

Conclusions
The high degree of complexity of musical phenomena calls for forms of implementation that are adequate and that, for the sake of correctness and completeness, address the problems from a sufficiently wide number of angles [6]. For these reasons it appears increasingly interesting to integrate, in a coherent way, formalisms of different nature for representing musical knowledge within so-called hybrid systems [11].

The expert system we propose belongs to this context. With this work we have tried to extend, integrate and verify the assumptions of the generative theory of tonal music proposed by the American scholars Lerdahl and Jackendoff, and of Lerdahl's own theory of tonal regions. Where certain analytical and musical aspects had not previously been treated, new specific theoretical solutions were sought. From a music-theoretical point of view too, therefore, the possibility of integrating different approaches appears to be a promising path. It is precisely the new artificial intelligence techniques that make such forms of integration and verification of results feasible.

References
[1] F. Lerdahl, R. Jackendoff: A Generative Theory of Tonal Music, Cambridge Mass., The MIT Press, 1983.
[2] F. Lerdahl, R. Jackendoff: "An Overview of Hierarchical Structure in Music", Music Perception, Vol.1, N.2, 1984.
[3] L. Camilleri, F. Carreras, C. Duranti: "An Expert System Prototype for the Study of Musical Segmentation", Interface, Vol.19, N.2-3, 1990.
[4] J.H. Maxwell: "An Expert System for Harmonic Analysis of Tonal Music", Proc. of the First International Workshop on Artificial Intelligence and Music, AAAI Press, 1988.
[5] K. Ebcioglu: "An Expert System for Harmonizing Four-part Chorales", Computer Music Journal, Vol.12, N.3, 1988.
[6] F. Giomi: "Analisi musicale e intelligenza artificiale. Problemi, metodi ed esempi", Eunomio, N.20, 1993.
[7] M.J. Baker: "An Artificial Intelligence Approach to Musical Grouping Analysis", Contemporary Music Review, Vol.3, N.1, 1989.
[8] D. Scarborough et al.: "An Expert System for Music Perception", Proc. of the First International Workshop on Artificial Intelligence and Music, AAAI Press, 1988.
[9] F. Holm: "Machine Tongues XIV: CSP-Communicating Sequential Processes", Computer Music Journal, Vol.16, N.1, 1992.
[10] F. Lerdahl: "Tonal Pitch Space", Music Perception, Vol.5, N.3, 1988.
[11] A. Camurri: "On the Role of Artificial Intelligence in Music Research", Interface, Vol.19, N.2-3, 1990.

FORMALIZATION OF GENERATIVE STRUCTURES
WITHIN
"THE RITE OF SPRING"
Adriano De Matteis, Goffredo Haus

Laboratorio di Informatica Musicale
Dipartimento di Scienze dell'Informazione
Universita degli Studi di Milano
via Comelico 39
I-20135 Milano (Italia)
fax +39 2 55006373
e-mail: music@imiucca.csi.unimi.it

Abstract

In this paper we show how Generative Structures (GSs) of musical pieces can be formally described. As a case study we consider "The Rite of Spring" by I. Stravinsky.
The first goal of our research has been to identify relationships among Music Objects (MOs) belonging to the piano duet reduction, the orchestral score, or both. We have defined two kinds of GS: Essential Generative Structures (EGSs) and Additional Generative Structures (AGSs). We call an EGS a structure that determines the development of musical material only in a temporal sense (horizontal development), while the concept of AGS concerns the orchestration of EGSs. We think that orchestration must be understood not only at a timbral level but also at a harmonic level, through the superimposition of newly generated material (vertical development). In general, the same EGS occurs both in the piano reduction and in the orchestral score, while an AGS occurs only in the latter.
Our formalization uses an ad hoc arrangement of hierarchical Petri Nets (PNs) and a music algebra: music objects are associated with places; music transformations are described by algorithms associated with transitions and are coded by expressions based on the set of operators and the syntactic rules we have defined.
The models we have produced have been implemented with the ScoreSynth software module of the Intelligent Music Workstation.

This research has been partially supported by the Italian National Research Council in the
frame of the MUSIC Topic (LRC C4): "INTELLIGENT MUSIC WORKSTATION",
Subproject 7: SISTEMI DI SUPPORTO AL LAVORO INTELLETTUALE, Finalized
Project SISTEMI INFORMATICI E CALCOLO PARALLELO.

Introduction.

In this article we show how Generative Structures (GS) of musical pieces can be formally described. As a case study we consider "The Rite of Spring" by I. Stravinsky.

The concept of Music Object (MO) plays a fundamental role in our research. An MO can be anything that has a musical meaning and that we can think of as an entity, simple or complex, abstract or detailed: an entity with a name and with some relationship to other music objects.

The first goal of our research was to identify the relationships among MOs belonging to the reduction for piano four hands [2], to the orchestral score [3], or to both.

We have defined two types of GS:
1) Essential (EGS);
2) Additional (AGS).

We call an EGS a structure that determines the development of the musical material exclusively from a temporal point of view (horizontal development). In Stravinsky's music this can mean the opposition and/or juxtaposition of two or more periodic and variable melodic-rhythmic patterns, the synchronization and interlocking of periods to obtain effects of asymmetry, and so on.

The concept of AGS instead concerns the orchestration of EGSs. Orchestration must be understood in a general sense as a means of generating new material, not only at the timbral level but also at the harmonic level, by superimposing newly generated material (vertical development).

As a general idea, the same EGS is found both in the piano reduction and in the corresponding orchestral version, while an AGS is found only in the latter.

Our formalization uses a suitable adaptation of Petri Nets (PN) and a music algebra [1]. The MOs are associated with the places of the net; the musical transformations are described by algorithms associated with the transitions and coded by expressions based on the set of operators and the syntactic rules that we have defined. This approach lets us describe music objects and their transformations at various levels of representation.

An EGS is formalized by a PN whose input MOs represent melodic-rhythmic ideas, and the PN determines the development of the generated material in the horizontal sense only. An AGS is a net that refines a corresponding EGS. An AGS therefore realizes a morphism between PNs which, following (or conceived as) a top-down procedure, preserves the temporal structures established by the corresponding EGS net.

An exemplary case.

We now show, through a characteristic example, how our research proceeds. The example highlights the following points, which are also the general goals of our research:
1) it is possible to find a set of PNs that works on a good number of pieces;
2) a set of MOs and a small number of these PNs can describe, partially or totally, the development of the musical material of a composition;
3) through a refinement process, characteristic of PNs, one can pass from the piano score to the corresponding orchestral score;
4) this formalization is closely connected to musical thought and practice, since it preserves and shows the hidden structures of musical scores; it is also sufficiently easy to handle.

The score generated in the example is the model of the score from number 181 to number 185, and it is the result of a series of transformations of a single MO that we call SOGGETTO.

From it derives a subject-answer group S-R, at the interval of a fourth above. In parallel with it another subject-answer group, S1-R1, develops with slight chromatic differences; it is also derived from SOGGETTO and is present in the orchestral score but not in the piano score.

Figure 1.

At first a series of variations of the subject is presented. These variations concern the first note (the head of the subject) and alter both its repetitions and its duration. This series of variations is called S. For example, in the first six bars the head of the subject is repeated five times and, except for the first, the repetitions last twice as long. The subject is then restated a fourth above and its head is again modified as before. For example, in bar 15 the head is repeated three times, with no change of duration. This series of variations is called R. The group S1-R1, which proceeds in parallel with the group S-R, is treated in a similar way.

Figure 2: complete score of the model.

In addition, the subject undergoes chromatic alterations. In Figure 2, for simplicity, the labels S, R, S1, R1 mark only the first repetition of each group of variations; moreover, the group S1-R1 is shown transposed an octave lower.

Let us see how the model proceeds by means of PNs. Figure 3 shows the net at the highest level.

The subject is associated with the place SOGGETTO. Initially the place INIZIO holds a token while the others hold none. Transition A can therefore fire.

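The firing rule used in this description (a transition can fire when each of its input places holds a token) can be sketched in a few lines. This is a generic place/transition net with unit arc weights, not ScoreSynth itself. The place and transition names INIZIO, SOGGETTO, S, A and T come from the text, but the arcs connecting them are an assumption made for illustration, since the figure showing the actual topology is not reproduced here.

```python
class PetriNet:
    """Minimal place/transition net with unit arc weights."""

    def __init__(self, marking):
        self.marking = dict(marking)   # place name -> number of tokens
        self.transitions = {}          # transition name -> (input places, output places)

    def add_transition(self, name, inputs, outputs):
        self.transitions[name] = (list(inputs), list(outputs))

    def enabled(self, name):
        """A transition is enabled when every input place holds at least one token."""
        inputs, _ = self.transitions[name]
        return all(self.marking.get(place, 0) >= 1 for place in inputs)

    def fire(self, name):
        """Consume one token from each input place and add one to each output place."""
        if not self.enabled(name):
            raise ValueError(f"transition {name} is not enabled")
        inputs, outputs = self.transitions[name]
        for place in inputs:
            self.marking[place] -= 1
        for place in outputs:
            self.marking[place] = self.marking.get(place, 0) + 1


# Hypothetical arcs: A moves the token from INIZIO to SOGGETTO, T from SOGGETTO to S.
net = PetriNet({"INIZIO": 1, "SOGGETTO": 0, "S": 0})
net.add_transition("A", ["INIZIO"], ["SOGGETTO"])
net.add_transition("T", ["SOGGETTO"], ["S"])
```

With this initial marking only A is enabled; after `net.fire("A")` the token has moved and T becomes enabled, which mirrors the sequential behaviour described for the top-level net.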
Transition A is associated with the algorithm:

    P:1,$,[SOGGETTO,1],?

whose effect is to load the subject.

Figure 3: S-R net.

If no subnet is associated with transition T, T fires and the sequential group S-R is executed. When T is instead associated with a suitable subnet, it realizes the refinement mentioned above, which we examine later. The places S and R are associated with the same subnet, called MACRO, which produces the exposition of the subject when it is called by S, and of the answer when it is called by R, with the respective variations already described. This macro is shown in Figure 4.

Figure 4: MACRO net.

The net receives from S or from R the subject, not yet modified. The firing of the transition DISTRIBUZIONE creates three copies of the subject, which are associated with the places COPIA1, COPIA2, COPIA3. The necessary modifications are then carried out in the transitions labelled TRASF.

For example, in the list of call parameters of the place S we find:

    TRASF1:
    {M:1,1,5
    repeat the first note five times
    D:2,4,?*2
    D:$,$,?*2}
    double the durations from the second to the fourth note, and then that of the last note

In this way the series of repetitions of the subject contains the variations indicated. Note that TRASF1 is the only transition that can fire, because TRASF2 and TRASF3 have as inputs the places SINCRO1 and SINCRO2, which hold no tokens. The place RIP1 is associated with the actual performance of the result of the transformations carried out in TRASF1.

Once this first repetition has been performed, the place SINCRO1 holds a token. Transition TRASF2 can therefore fire, and its associated algorithm generates a new repetition of the subject with the necessary variations, and so on. Note that the place SINCRO1 has a logical function: it signals the end of the first exposition and hence that another can begin. The other repetitions develop in the same way. Finally the place OUT takes us back to the main net S-R. The same procedure, with the appropriate changes, is associated with the place R. For example, the call parameters of the MACRO subnet in the place R are:

    DISTRIBUZIONE:
    {P:1,$,[SOGGETTO,1],?+5}
    transpose the subject a fourth above

    TRASF1:
    {M:1,1,3}
    repeat the first note three times

Up to this point the model represents the development of the piano score. Let us see how the orchestral version is reached with an AGS. This AGS is a subnet associated with transition T and is shown in Figure 5. The transition IN creates a parallelism, because the effect of its firing is to put one token in the place P and one in the place S1. The execution of the group S1-R1 can therefore begin and, in parallel, that of the group S-R, through the firing of the transition OUT, which takes us back to the main net S-R. What was said above for S and R applies in the same way to the places S1 and R1.

Figure 5: RETE.

This refinement leads to the orchestral version.

We stress that the same subnet MACRO is associated with S, S1, R and R1.

It follows that from a single MO (SOGGETTO), using a few nets, one of which (MACRO) is called several times, large parts of a score can be derived. Moreover, a single auxiliary net is enough to orchestrate that score.

Conclusions.

Naturally, many different descriptive approaches are possible. We followed analysis criteria that seemed particularly suited to describing the generative structures within "The Rite of Spring", and for brevity we have reported only one exemplary sample that characterizes the method.

The research can be useful not only for musicological analysis, but also as a tool to illustrate the possibilities of such a formalization in the composition of new pieces, for example by using all or part of the generative structures identified and described in the course of the research.

The models we produced were implemented with the ScoreSynth software module of the Intelligent Music Workstation, so that we can obtain both scores and direct musical performances as a check on our models in the different variants formulated.

References.

[1] G. Haus, A. Sametti: "SCORESYNTH: a System for the Synthesis of Music Scores based on Petri Nets and a Music Algebra", in "Readings in Computer Generated Music", D. Baggi Ed., pp.53-78, IEEE Computer Society Press, 1992.

[2] Stravinsky, I., "The Rite of Spring - Reduction for Piano Duet", Boosey & Hawkes, London, 1947.

[3] Stravinsky, I., "The Rite of Spring", Boosey & Hawkes, London, 1967.

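The call parameters quoted for the MACRO subnet (M:1,1,5, D:2,4,?*2, D:$,$,?*2 and P:1,$,[SOGGETTO,1],?+5) can be illustrated with a small sketch. It is based only on the annotations given with those expressions (repeat the first note, double some durations, transpose a fourth above) and is an assumed reading, not the actual ScoreSynth syntax or semantics. The note representation as (MIDI pitch, duration) pairs, the function names and the sample subject are all hypothetical.

```python
def _bounds(first, last, length):
    """Convert 1-based note positions ('$' meaning the last note) to a Python slice."""
    start = length if first == "$" else int(first)
    stop = length if last == "$" else int(last)
    return start - 1, stop

def repeat_notes(notes, first, last, times):
    """Assumed reading of M:first,last,times: repeat the selected notes."""
    i, j = _bounds(first, last, len(notes))
    return notes[:i] + notes[i:j] * times + notes[j:]

def scale_durations(notes, first, last, factor):
    """Assumed reading of D:first,last,?*factor: multiply the selected durations."""
    i, j = _bounds(first, last, len(notes))
    return [(p, d * factor) if i <= k < j else (p, d)
            for k, (p, d) in enumerate(notes)]

def transpose(notes, first, last, semitones):
    """Assumed reading of P:first,last,...,?+semitones: shift the selected pitches."""
    i, j = _bounds(first, last, len(notes))
    return [(p + semitones, d) if i <= k < j else (p, d)
            for k, (p, d) in enumerate(notes)]

# Hypothetical three-note subject, as (MIDI pitch, duration in beats).
subject = [(60, 1), (62, 1), (64, 1)]

# TRASF1 as called from place S: M:1,1,5 then D:2,4,?*2 then D:$,$,?*2
s_variation = scale_durations(
    scale_durations(repeat_notes(subject, 1, 1, 5), 2, 4, 2), "$", "$", 2)

# DISTRIBUZIONE as called from place R: transpose the whole subject by 5 semitones
answer = transpose(subject, 1, "$", 5)
```

Five semitones correspond to the perfect fourth mentioned in the annotation, which is consistent with the `?+5` in the quoted expression.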
USING SELF-AFFINE FRACTAL CODING TO
MODEL MUSICAL SEQUENCES
Bruno Fagarazzi*° & Massimo Sebastiani*

* D.E.I. Dipartimento di Elettronica e Informatica
University of Padua, via Gradenigo 6/a
35131 Padova (Italy)
tel +39(0)498287500 - fax +39(0)498287699
E-mail adtpoli@ipduniv.bitnet

° C.S.C. Centro di Sonologia Computazionale
University of Padua, via S.Francesco 11
35121 Padova (Italy)
tel +39(0)498283757 - fax +39(0)498283733
E-mail musicOl@ipdunivx.it

Abstract

Starting from the interpolation algorithm described in the paper presented at the IX CIM [1], we have developed a new method to compress sound events. This paper presents the method and suggests different implementations of it, so that musicians and researchers can apply it in the way that best fits their own needs.

Interpolation

It is worth briefly recalling interpolation by means of IFS, which lends itself to the generation of signals and functions for envelopes and monophonic melodies, while it is poorly suited to polyphony and to the representation of time and rhythm.

The functions are generated by fixing: the starting points between which to interpolate, the fractal dimension for each interval and, in the multidimensional case, the correlation parameters between one function and another. Each interval is associated with an affine transformation that maps the whole domain onto that interval. It is of the form:

[...]

It is important to stress that for each interval there is only one function that has it as its codomain: in essence the codomains of the functions, which all share the same domain, do not overlap.

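As an illustration of the interpolation scheme recalled above, the sketch below uses the standard fractal interpolation maps from Barnsley's book, w_i(x, y) = (a_i x + e_i, c_i x + d_i y + f_i), where each map sends the whole x range onto one interval. Using this particular form is an assumption, since the paper's own equation is not reproduced in this text; the vertical scaling factors `d` stand in for the per-interval fractal dimension parameter, and the function names are hypothetical.

```python
def ifs_maps(points, d):
    """Build one affine map per interval from interpolation points and scalings d.

    Each map sends the first point to the left end of its interval and the last
    point to the right end. Requires |d[i]| < 1 for the maps to be contractive.
    """
    (x0, y0), (xn, yn) = points[0], points[-1]
    span = xn - x0
    maps = []
    for (xa, ya), (xb, yb), di in zip(points, points[1:], d):
        a = (xb - xa) / span
        e = (xn * xa - x0 * xb) / span
        c = (yb - ya - di * (yn - y0)) / span
        f = (xn * ya - x0 * yb - di * (xn * y0 - x0 * yn)) / span
        maps.append((a, c, di, e, f))
    return maps

def apply_map(m, x, y):
    a, c, di, e, f = m
    return a * x + e, c * x + di * y + f

def interpolate(points, d, iterations=4):
    """Approximate the attractor by repeatedly applying every map to the point set."""
    maps = ifs_maps(points, d)
    current = list(points)
    for _ in range(iterations):
        current = sorted({apply_map(m, x, y) for m in maps for (x, y) in current})
    return current

# Hypothetical three-point envelope with one scaling factor per interval.
envelope = interpolate([(0.0, 0.0), (1.0, 1.0), (2.0, 0.0)], d=[0.3, -0.3])
```

With all scaling factors set to zero the result reduces to ordinary piecewise linear interpolation; larger absolute values of `d` give rougher curves.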
Compressione dove e e l'errore calcolato ai
minimi quadrati ed s e il fattore
La compressione tramite IFS si di contrazione della mappa che
presta moho bene all'approssi- deve essere sempre minore 0
mazione di parametri fisici cosl uguale ad 1 (mappe contrattive
come alIa rappresentazione delle in cui i domini sono pi6 ampi dei
microvariazioni timbriche rna codomini). Si pub vedere per-
non tanto alIa compressione di tanto che, a differenza di quanto
melodie. Essa si ottiene diretta- succede nel caso dell'interpola-
mente dall'algoritmo relativo zione, nella compressione siamo
all'interpolazione basandosi suI liberi di scegliere i domini come
fatto che la funzione da appros- vogliamo ed il tipo di funzione
simare e un punto fisso (cioe un pub anche non essere rigorosa-
attrattore) della trasformazione: mente autosimile.

Modello
dove W j sono singole trasfor-
mazioni affini. Per ottenere la Questo nuovo modello di IFS
compressione e sufficiente tro- permette la rappresentazione di
vare Ie wiehe mappano parte eventi musicali di qualsivoglia
della funzione in altre sue parti complessita. Si utilizza una
con eventuali sovrapposizioni. rappresentazione di eventi
Inoltre, una volta fissati dominio costituita da n-uple di valori che
e codominio, per trovare la Wi possono contenere informazioni
pili appropriata si possono stabi- riguardanti durata, altezza,
lire i gradi di liberta rimanenti timbro oltre alIa componente
minimizzando l'errore di ap- temporale peraltro
prossimazione col metodo dei indispensabile.
minimi quadrati. La grossa differenza rispetto al
n problema si riduce quindi a modello relativo all'interpolazio-
trovare, per ogni intervallo ne consiste nella dimiuzione della
(codominio), il dominic che mi- dimensione frattale dell'attratto-
nimizza l'errore ed a ridurre re che diventa minore di 1
l'intervallo considerato nel caso essendo l'insieme puntiforme.
in cui l'errore ecceda il massimo Per il caso bidimensionale Ie
consentito. II collage theorem [2] trasformazioni affini sono del
garantisce che l'errore finale E tipo:
commesso nell'approssimazione
e:
e
E<-
l-s dove, per quanto riguarda Ie du-

56
rate, si pub anche usare un arti- il sistema e di facile soluzione in
ficio: quanto, sotto opportune condi-
zioni, esso si abbassa di grado.
duratan+1 =a . duratan + e Nel caso di spazi di dimensione
superiore a due, oltre al metodo
in quanto e verosimile che Ie tra- dei momenti, la tecnica che sem-
sformazioni che agiscono sulla bra fomire i risultati pin interes-
componenete temporale (x) pos- santi e quella relativa agli algo-
sana interessare anche Ie durate. ritmi genetici.
The affine transformation can easily be extended to the n-dimensional case as follows:

[...]

Much more simply, when the interactions among the various descriptive parameters need not be taken into account, the extension to several dimensions can be obtained by repeating the procedure for the two-dimensional case separately for each dimension.
It is evident that in this model the approximation error plays a decisive role.

Approximation

Suppose we want to approximate an interval in an n-dimensional space (an n-cube) with a prefixed domain. The w_i that minimizes the approximation error can be found in more than one way.
In the two-dimensional case the problem can be solved with the least-squares method, for which [...]

As in the one-dimensional case, the approximation algorithm must select the suitable domains and codomains (n-cubes) that minimize the transformation error. If the number of points to be approximated is not large, an exhaustive search algorithm (with a minimum of 'branch and bound') can be effective. The search ends when, for every interval, a function has been found that admits it as its codomain.
It is worth stressing that this is not a simple decomposition into musical operators (transpositions, inversions, etc.): the events are synthesized by the IFS as the limit of a sequence of function applications, and for this reason they are intrinsically structured both locally and globally.

Model with Condensation

When the approximation of an interval turns out to be too complex, so that the number of transformations to be added becomes too large, the analysis can be simplified by leaving the interval uncompressed, that is, without functions that admit it as codomain, and adding in their place a tabulated identity function on that interval (w(A)=A).
In this case A is called the 'condensation set'; it can be chosen according to structural criteria concerning the composition, so as to represent a theme or a phrase from which the rest is derived by means of transformations.
It is also evident that even slight modifications of the starting set A yield infinitely many attractors that keep the same structure.
The condensation set can be used in one as well as in n dimensions.

'Evolutionary' Model

If the time coordinate is not included among those to be analysed, one obtains a set of transformations, each of which represents a 'frame' of the state of the composition at a certain instant (t1).
A fundamental property of the IFS is its stable behaviour with respect to perturbations of the coefficients of the transformations. Therefore, if another 'frame' is computed at a later instant (t2) using the same number of functions and the same intervals as the subdivision into domains and codomains, the 'frame' at an intermediate instant can be computed by linearly interpolating the coefficients, obtaining an attractor whose morphology is transitional between those of the two instants t1 and t2.
Any number of 'instants of metamorphosis' between one structure and another can thus be obtained, up to an impression of continuity compatible with the minimum resolution.
Finally, one can operate directly on the functions or on the structure in the course of the temporal evolution: an ideal situation for controlling timbral bands or for granular synthesis.

Bibliography

[1] B. Fagarazzi, M. Sebastiani: "Codifica di eventi sonori e dei relativi parametri formali con il metodo dell'I.F.S.", Atti IX C.I.M., pp. 314-322, AIMI-DIST Publ., 1991.

[2] M. F. Barnsley: "Fractals Everywhere", Springer Verlag, N.Y., 1989.
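The 'evolutionary' model described above (two 'frames' with the same number of functions, and an intermediate frame obtained by linear interpolation of the coefficients) can be illustrated with a minimal sketch. It assumes two-dimensional affine maps stored as coefficient tuples (a, b, c, d, e, f) and renders the attractor with the random-iteration method; the function names and the two example frames FRAME_T1 and FRAME_T2 are hypothetical and are not taken from the paper.

```python
import random

# Assumption: each affine map is w(x, y) = (a*x + b*y + e, c*x + d*y + f),
# stored as the tuple (a, b, c, d, e, f).


def apply_map(w, point):
    """Apply one affine map to a point of the plane."""
    a, b, c, d, e, f = w
    x, y = point
    return (a * x + b * y + e, c * x + d * y + f)


def interpolate_ifs(ifs_t1, ifs_t2, s):
    """Linearly interpolate the coefficients of two frames (s=0 gives t1, s=1 gives t2)."""
    if len(ifs_t1) != len(ifs_t2):
        raise ValueError("both frames must use the same number of maps")
    return [
        tuple((1.0 - s) * p + s * q for p, q in zip(w1, w2))
        for w1, w2 in zip(ifs_t1, ifs_t2)
    ]


def attractor_points(ifs, n_points=2000, skip=50, seed=0):
    """Approximate the attractor by random iteration, discarding the first `skip` points."""
    rng = random.Random(seed)
    point = (0.0, 0.0)
    points = []
    for step in range(n_points + skip):
        point = apply_map(rng.choice(ifs), point)
        if step >= skip:
            points.append(point)
    return points


# Hypothetical frames: two descriptive parameters normalized to the unit square.
FRAME_T1 = [
    (0.5, 0.0, 0.0, 0.5, 0.0, 0.0),
    (0.5, 0.0, 0.0, 0.5, 0.5, 0.0),
    (0.5, 0.0, 0.0, 0.5, 0.25, 0.5),
]
FRAME_T2 = [
    (0.5, 0.0, 0.0, 0.5, 0.0, 0.5),
    (0.5, 0.0, 0.0, 0.5, 0.5, 0.5),
    (0.5, 0.0, 0.0, 0.5, 0.25, 0.0),
]

if __name__ == "__main__":
    # Three 'instants of metamorphosis' between the two frames.
    for s in (0.0, 0.5, 1.0):
        frame = interpolate_ifs(FRAME_T1, FRAME_T2, s)
        pts = attractor_points(frame, n_points=500)
        print(s, len(pts), pts[-1])
```

Because every map in both frames is a contraction, every interpolated map is a contraction as well, which is one simple way to see why the intermediate attractors stay well defined.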
STATISTICAL ANALYSES IN INTERVAL RECOGNITION
Ugo Merlone

(A.M.E.T.-Conservatorio G.Verdi Torino)

P.zza Galimberti 1, 10134 Torino
Tel. 011-3177119

Abstract

With this paper I propose a statistical, quality-control-like approach to the recognition of harmonic and melodic intervals.
The importance of ear training for the musician is well known; nevertheless, not everybody has these abilities developed, so a long training is often needed to reach the "pass mark".
Programs dealing with these problems already exist and are available from some software houses, but this paper suggests a new approach. Each user's successes and failures are recorded, and statistical analyses are built on these data (type I and type II errors, Pareto charts).
These analyses can be performed on each kind of proposed interval and are aimed at identifying the causes of failure in correct recognition.
Moreover, targeted exercises can be tailored on the basis of the analysis results. This methodology is implemented under MS-DOS on an IBM-compatible computer with VGA and a Roland MPU-401 MIDI interface; the source code is written in C++, and versions are available for melodic intervals within the octave and beyond, for harmonic intervals, and for note recognition with the movable-Do method.

1 Statistical analysis of errors.

This work aims to give a problem-solving role to the statistical analyses normally found in the many ear-training programs. The approach has been to treat the errors made in recognizing a given
interval as raw data for performing several statistical analyses:

• Type I and type II errors

• Pareto charts

1.1 Type I and type II errors.

Two very useful expressions have been borrowed from the statistical theory of hypothesis testing to interpret the errors made in interval recognition.
When a statistical test of a hypothesis is considered, two kinds of error can be made:

• Type I error: a true hypothesis is rejected

• Type II error: a false hypothesis is accepted

Consider a given interval i. When an interval x = i is proposed, I will call type I errors the response intervals g ≠ i given erroneously in place of i; when an interval x ≠ i is proposed, I will call type II errors the responses g = i, which are obviously wrong.
Formally, let:

• $I$ : the set of intervals considered

• $x \in I$ : the interval to be recognized

• $g \in I$ : the response interval

• $E_i^{1}$ : the set of type I errors with respect to the interval $i \in I$

• $E_i^{2}$ : the set of type II errors with respect to the interval $i \in I$

Then, for a given $i \in I$:

$E_i^{1} = \{\, g \in I \mid g \neq i \wedge x = i \,\}$

and

$E_i^{2} = \{\, g \in I \mid g = i \wedge x \neq i \,\}$

1.2 Pareto charts.

In 1897 the Italian economist Vilfredo Pareto, examining statistical data from various countries, concluded that income is not uniformly distributed; a similar theory was presented, in the form of a diagram, by the American economist M. C. Lorenz in 1907. Both scholars showed that most wealth is held by a very small number of people.
In the field of quality control, J.M. Juran succeeded in applying the method of the Lorenz diagram as a tool for classifying
primary problems, few but with considerable effects, and secondary problems, numerous but with limited effects. With this method, which he called Pareto analysis, he showed that in many cases most defects and their costs are due to a relatively small number of causes.
There are various kinds of Pareto chart; the one considered here is simply a frequency distribution (or histogram) of the data, with the type of defect on the horizontal axis, ordered by decreasing frequency, and the percentage of the defect on the vertical axis.

2 Structure of the software

The structure of the software is very simple: once the data on previously committed errors have been loaded, the user can proceed to recognize:

• a series of intervals determined independently of the previous errors

• a series of intervals drawn from a distribution whose probabilities are directly proportional to the errors made

3 Possible developments

Although the original version is devoted to the recognition of ascending and descending melodic intervals within the perfect octave, versions have been developed for the recognition of harmonic intervals within two octaves, and of ascending melodic intervals also within two octaves.
Also under development are a version devoted to the absolute recognition of intervals, one for the recognition of elementary rhythmic figures, and a program that uses the data on previous errors to generate simple melodic dictations.

Bibliography

[1] A. Agamenone, "Teoria Fondamentale della Musica", Carish, Milano.

[2] J.K. Galbraith, "Storia della Economia", Rizzoli, Milano, 1988.

[3] H. Kume, "Metodi statistici per il miglioramento della qualità", Isedi Petrini, Torino, 1988.

[4] D. C. Montgomery, "Statistical Quality Control", John Wiley & Sons, New York, 1985.
[5] J. A. Paulos, "Innumeracy", Penguin Books, London, 1990.

[6] S. M. Ross, "Introduction to Probability and Statistics for Engineers and Scientists", John Wiley & Sons, New York, 1987.
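The analyses described in sections 1.1, 1.2 and 2 can be sketched in a few lines. The original program is written in C++ and its source is not reproduced in the paper, so the following Python code is only an illustration under stated assumptions: a session is a list of (x, g) pairs (proposed interval, answer), and the function names error_sets, pareto and next_interval are hypothetical. For type II errors the sketch records which interval was actually proposed, since the answer is always i by definition.

```python
import random
from collections import Counter


def error_sets(trials, i):
    """Count type I and type II errors with respect to interval i.

    Type I : i was proposed (x == i) but the answer g differs from i.
    Type II: the answer is i although a different interval x was proposed.
    """
    type_1 = Counter(g for x, g in trials if x == i and g != i)
    type_2 = Counter(x for x, g in trials if g == i and x != i)
    return type_1, type_2


def pareto(trials):
    """Rows (interval, errors, percent, cumulative percent), most frequent first."""
    errors = Counter(x for x, g in trials if g != x)
    total = sum(errors.values())
    rows = []
    cumulative = 0
    # Ties are broken by name so that the ordering is reproducible.
    for interval, count in sorted(errors.items(), key=lambda kv: (-kv[1], kv[0])):
        cumulative += count
        rows.append((interval, count, 100.0 * count / total, 100.0 * cumulative / total))
    return rows


def next_interval(intervals, trials, rng=random):
    """Draw the next interval with probability proportional to past errors.

    Assumption: with no recorded errors the choice falls back to a uniform draw.
    """
    errors = Counter(x for x, g in trials if g != x)
    weights = [errors[i] for i in intervals]
    if sum(weights) == 0:
        return rng.choice(intervals)
    return rng.choices(intervals, weights=weights, k=1)[0]


if __name__ == "__main__":
    session = [("M3", "m3"), ("M3", "M3"), ("m3", "M3"), ("P5", "P5"),
               ("P4", "P5"), ("M3", "m3"), ("TT", "P4")]
    print(error_sets(session, "M3"))
    for row in pareto(session):
        print(row)
    print(next_interval(["m3", "M3", "P4", "TT", "P5"], session))
```

With strictly proportional weights an interval that was never missed is never proposed again; a real exercise generator might add a small constant to every weight, which is a design choice the paper does not discuss.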
DEFINITION OF PETRI NETS FOR
THE ANALYSIS OF
ELECTROACOUSTIC MUSIC
Simonetta Sargenti

Civica Scuola di Musica di Milano


Corso di Porta Vigentina 15, I-20122 Milano tel. 58314433

Abstract

This paper discusses how to analyse and represent a piece of electroacoustic music by means of Petri nets. The piece is Resonant Wholes by Claudio Gabbiani and Simonetta Sargenti, realised entirely with the M.A.R.S. (Musical Audio Research Station, Iris, Bontempi-Farfisa).
In general, electroacoustic music cannot be represented by a traditional score; Petri nets are a modelling tool that makes it possible to describe electroacoustic music events and their relations. Many kinds of nets can be created at different levels: 1) to represent the thematic structure of the piece; 2) to represent algorithms and tones. This example of analysis shows only the first section of the composition; the whole piece is represented at the executive level, with nets that can be displayed during the performance. Petri nets seem to be a very good model for analysing the structural and thematic events of an electroacoustic composition.
It is more difficult to describe the different levels of the timbral parameters; in fact, these can be associated with the MIDI channel only.

This article proposes an example of analysis and representation of a piece of electroacoustic music by means of Petri nets. As is well known, this musical genre cannot easily be represented on a staff, so its events are difficult to analyse and visualize [1]. Petri nets seem to offer at least a partial solution to these problems by making it possible to create alternative scores. Musical events, with reference to the parameters of pitch, duration, intensity, etc., are associated with places, while the relations among the events and their structural relationships depend on the unfolding and the structure of the nets [2]. A composition can thus be described in terms of its thematic elements, or its characteristics can be identified in relation to parameters that are more complex to
analyse, such as timbre. The piece described in this article is Resonant Wholes, by Claudio Gabbiani and Simonetta Sargenti, realised entirely on the M.A.R.S. workstation by Iris (Farfisa-Bontempi group), a composition for synthesized sounds and live electronic processing. In order to relate the algorithms to the representation by Petri nets, the whole piece was described at the executive level with nets that can be displayed while listening. This article instead presents an example of analysis covering only the opening section of the piece, about one and a half minutes long. We first tried to bring out the unfolding of the composition and its, so to speak, thematic characteristics, also using the multiple levels of abstraction that the nets provide. We then moved on to the description of the synthesis algorithms on the same principles.
For the section analysed, the composition comprises four tracks in all, recorded on a sequencer so that MIDI files could be used.
The first part of the analysis consists in describing the various hierarchical levels of the piece, in order to identify its structural and thematic characteristics and to reconstruct a score. The most general representation is given by a net that summarizes the unfolding of each track in places that refer to different subnets (example 1).
Each track can be displayed starting from the corresponding macro node: from the place called jap ch.2 one obtains the subnet linked to it, which comprises a single musical object further articulated at the executive level into two places, a pause and the sound event jap; this last net can be executed and yields a score.

[Figure: EXAMPLE 1]

The same procedure can be followed for the other tracks: the macro node jap ch.3 of the highest-level net calls a subnet consisting of a single musical object, which at the executive level comprises a pause and a sound event (example 2).

[Figure: EXAMPLE 2]

At this point the analogy between the tracks analysed can be seen.
The macro node gliss ch.5
calls a subnet depicting the loop structure by which the musical object contained in the track is executed twice. In this example, too, the musical objects undergo no transformations and are simply executed. The node string ch.4 contains another high-level net, from which three different executions of the same object string are obtained, original and transformed (example 3).

[Figure: string]

[Figure: EXAMPLE 4, executive level, tracks 1-2-3]

From this first description, obtained with a top-down procedure, a second one was then derived that tends to bring out further the characteristics already defined.
Since the object gliss consists of an autonomous motif A and of a motif B that takes up the thematic element string, this track was articulated into two different ones.
Here, too, the top-down method was chosen, starting from the highest level and arriving at the executive level. This yields two representations at the more general level and several detailed subnets (example 4). The nets of the executive level are built deterministically, so that the events of the piece follow one another in their original order. In addition, the top-down procedure adopted also yields the thematic motifs defined in the places (example 5).

[Figure: EXAMPLE 5]

At this point of the analysis it can be stated that Petri nets offer a valid tool for identifying musical events determined by the parameters of pitch and duration and by their transformations, and that they also prove very flexible, allowing different representations of the same structures. Electroacoustic music, however, is characterized mainly by timbral parameters; we therefore wanted to produce a representation of this particularly significant aspect, again by means of nets, starting from the assumption that a musical object can be associated not only with sequences of notes but also with control parameters;
timbre can therefore be defined simply by the MIDI channel with which it is associated. The workstation used to compose the piece analysed here makes it possible to create sound synthesis and processing algorithms with the aid of a graphical program that displays the components of the algorithm and their relations, giving an adequate understanding of it without a further description by means of nets, which is nonetheless possible. It is interesting to note, however, that these algorithms contain several hierarchical levels: the algo, or basic structure, which represents the connections among the modules for the synthesis or processing of the signal, and the various tones, which are a further articulation of the parameters of the algorithm itself. The algo corresponds to the highest level of representation with Petri nets, and the tone to the level of the subnet to be executed. One way of representing these algorithms with Petri nets therefore seems to be to define each timbre as a place, giving it no parameter other than the MIDI channel. Since each algorithm can give rise to several different timbres starting from the same structure, this difference can be defined in hierarchical terms by building places that call subnets.
Our composition contains four different algos; Japdrums, which comprises two tones, is described in a macro node to which correspond the two subnet objects jap and jap2, defined by MIDI channels 2 and 3 respectively (example 6).

[Figure: EXAMPLE 6]

Petri nets thus constitute a valid model for representing the thematic structures of electroacoustic music, which are otherwise difficult to describe; a thorough description of the timbral parameters associated with the MIDI channels proves more problematic.

References.
[1] P. Schaeffer: "Traité des objets musicaux", Paris, 1966, 1985.
D. Smalley: "Spectro-morphology and structuring processes", in S. Emmerson: "The language of electro-acoustic music", London, 1986.
S. Sargenti: "Per una definizione delle unità percettive di Aquatisme", in "Secondo convegno di Analisi Musicale", Trento, 1992.
[2] G. Haus, A. Sametti: "SCORESYNTH: a System for the Synthesis of Music Scores based on Petri Nets and a Music Algebra", IEEE Computer, vol. 24, N. 7, 1991.
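The hierarchical description used in the analysis above (macro nodes that call subnets, and executive-level places bound to a sound event on a MIDI channel) can be illustrated with a minimal sketch. It is not the SCORESYNTH formalism: the dictionary layout, the function execute, and the modelling of the loop of gliss ch.5 with a two-token counter place are all assumptions made for illustration. Only the place names jap ch.2, jap ch.3, gliss ch.5, jap, jap2 and the channel numbers come from the text.

```python
def execute(name, nets, events, score=None, max_steps=1000):
    """Run the net `name` deterministically and return the events met, in firing order."""
    score = [] if score is None else score
    net = nets[name]
    marking = dict(net["marking"])

    def visit(place):
        if place in nets:
            # Macro node: expand the subnet that has the same name.
            execute(place, nets, events, score, max_steps)
        elif place in events:
            # Executive-level place: emit its (object, MIDI channel) pair.
            score.append(events[place])

    for place, tokens in net["marking"].items():
        if tokens:
            visit(place)
    for _ in range(max_steps):
        enabled = [t for t, (ins, outs) in net["transitions"].items()
                   if all(marking.get(p, 0) for p in ins)]
        if not enabled:
            break
        ins, outs = net["transitions"][enabled[0]]
        for p in ins:
            marking[p] -= 1          # consume one token per input place
        for p in outs:
            marking[p] = marking.get(p, 0) + 1
            visit(p)
    return score


# Hypothetical nets: transitions map a name to (input places, output places).
NETS = {
    "piece": {
        "marking": {"begin": 1},
        "transitions": {"start": (["begin"], ["jap ch.2", "jap ch.3", "gliss ch.5"])},
    },
    "jap ch.2": {
        "marking": {"pause ch.2": 1},
        "transitions": {"t": (["pause ch.2"], ["jap"])},
    },
    "jap ch.3": {
        "marking": {"pause ch.3": 1},
        "transitions": {"t": (["pause ch.3"], ["jap2"])},
    },
    "gliss ch.5": {
        # Two tokens in a counter place make the object play twice (loop).
        "marking": {"count": 2},
        "transitions": {"play": (["count"], ["gliss"])},
    },
}

EVENTS = {
    "pause ch.2": ("pause", 2), "jap": ("jap", 2),
    "pause ch.3": ("pause", 3), "jap2": ("jap2", 3),
    "gliss": ("gliss", 5),
}

if __name__ == "__main__":
    for event in execute("piece", NETS, EVENTS):
        print(event)
```

The sketch lists the tracks one after another; it does not model the simultaneity of the tracks or any timing, which a real score-generating net would need.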
THE COMPOSITIONAL PROCESS AND
TECHNOLOGICAL TOOLS: AN APPRAISAL OF
ALGORITHMIC COMPOSITION AS IT
RELATES TO COMPOSITIONAL PROCESS
Dr. Noel Zahler, Co-director
Professor of Music
Center for Arts and Technology
Chair, Department of Music
P.O. Box 5632
Connecticut College
270 Mohegan Avenue
New London, Connecticut 06320
nbzah@mvax.cc.conncoll.edu
nbzah@conncoll.bitnet

Introduction

"The misunderstanding that lies in wait for us and which we should mistrust terribly: is that of confusing composition and organization."

Pierre Boulez, "...Auprès et au loin" [1, p.183]

The statement above, made in 1954, might be considered more prophetic today than we might have suspected at the time of its pronouncement. We are now inundated with computer assisted composition programs that generate output ad infinitum. The results of these calculations must not be considered compositions. They are "raw data" from which composers may wish to draw compositional material. To interpret these calculations in any other manner is to misunderstand the very act of composition itself. When we are asked to accept this output as a "work of art" we usually do so dependent on philosophic rather than aesthetic or artistic criteria. Such compositions are, more often than not, appreciated for their academic significance as they lack a sufficient complexity of relations to engage the listener more fully.

In fact, because of our interest in identifying formalisms to be included in computer assisted composition programs one might argue that we have become so fascinated with process, data manipulation for its own sake, that we may, at times, forget that these formalisms are a direct result of the perception of compositional details. It is the consequence of a compositional strategy which allows us, through reflection, to perceive a multifarious network of relations which is the composition. If we try to convince ourselves that we hear
relationships that we want to hear in a given composition because of an a priori assumption then we are doomed to repeat the mistakes of the past. For some, the engagement with, and the purpose of, the compositional process seems to have become obscured entirely because of a reliance on the strategies of a computer assisted composition program.

As users of a powerful and growing ever more powerful technology we must ask questions about the need and use for the tools we build in light of the empirical and aesthetic demands we make on them. What are the advantages of these tools and what are our own needs for them?

This paper explores the ground upon which a number of well-known algorithmic composition programs have been based and scrutinizes both the procedures, as well as the results in search for a model by which we may reflect on the significance of the work of art and the means by which we create and perceive that work. Inquiry is made of a number of methodologies associated with algorithmic composition and we consider from an analytic (perceptual) viewpoint both the process and the result. The author's own compositional program (Music Matrix) is considered in light of the discussion. The appraisal suggests that it might be more profitable to put our faith in the faculties of professional composers and our energies into specific types of programs that, in fact, "organize" and allow the micro and meta decision making process to take place using strategies other than the "automatic."

Computer Assisted Composition vs Automatic Composition

We must, from the start, distinguish between "computer assisted composition programs" and "automatic composition programs." The former seeks to help a composer with the various and laborious manipulation of data while the latter seeks to have the machine compose. We are not interested in machine composition in the present discussion. We are interested in the machine obviating details of a musical structure which the composer deems relevant to his/her conception of the composition.

Surveying the literature concerning computer music composition one is immediately struck by the confusion which seems to surround computer assisted composition of any type. It is our
contention that computer assisted composition is a tool for the experienced composer. It is a means for exploring compositional material which has been invented by the user and helps in the discovery of relationships which may not be immediately obvious or which the composer hopes to employ as a Geheimnis of the envisioned composition.

First, and foremost, is the composition. We define that composition as a mental construct found in the mind of the composer. It is the task of any computer assisted composition program to help facilitate the composer's realization of that composition as distinct from the computer's realization of some facet of the composer's idea. All too often computer realizations are one dimensional generalizations dependent on specific algorithms rather than on the composer's own vision or compositional imagination.

Algorithmic Composition

"An algorithm may describe a set of rules or give a sequence of operations for accomplishing some task, or solving some problem [2]." In essence, with regard to music composition, an algorithm is a set of instructions which seeks to simulate a compositional formalism.

Algorithmic composition has in many ways defied definition. We recognize the terrain, but in a similar manner to a "colloquialism" there is no generally agreed upon definition or list of techniques or methods for implementing such a program. The approaches given to devising such formalisms are as numerous as those who attempt to write the programs. Perhaps this is how it should be, for the compositional process, in terms of a methodology, is entirely subjective. By its very nature it eludes formal definition. So what is it that we hope to capture when writing an algorithmic computer assisted composition program? What part of the compositional process can we possibly hope to capture? In our opinion the answer to both questions lies in dealing with the most formal procedures of music composition in as straight
forward and intuitive a manner as possible. This is no simple task, for as Gareth Loy argues "[t]he creation - and sometimes even the appreciation - of formalisms generally takes an exertion of great mental effort because one must be holding the goal of the method and the requisite actions in mind, while simultaneously observing and recording what one is doing to achieve the goal." But, while extolling the virtues of formalisms Loy quickly points out that "most formal procedures used in music are not strictly algorithmic [2]." If the latter statement is correct then it must follow that any attempt to create an "algorithmic composition program" is contrary to the very nature of the beast.

The Music Matrix

The Music Matrix is an original composition tool created by this writer and Jonathan Kozzi. Although it is still in its development stage it offers a number of advantages to composers and theorists studying the realm of pitch relations. The program is not intended to compose. It is simply an aid to the experienced composer needing to explore pitch material he/she has already invented. The Music Matrix organizes pitch according to very specific instructions from the composer. The Music Matrix is a Macintosh program written in the C language. It is a composer's/analyst's tool for defining pitch structures. Music Matrix will generate lists of pitch transformations for user defined sets of any ordering or cardinality. It will list transpositions, inversions, retrogrades, retrograde inversions, M5 and M7 operations, as well as other user defined transformations. It will decline and define subsets, list interval vectors, Z related pairs, and build combinatorial, as well as user defined matrices. The program is equipped so that users may specify the types of transformations to be used in building matrices. When directed by the user the program will build aggregate or weighted pitch structures. All information compiled by Music Matrix can be saved, printed or transferred to any word processing program for further formatting. In addition, all output is midi compatible and can be
routed to external synthesis equipment to be heard.

Music Matrix is a typical Macintosh program using pull down menus and dialogue boxes to communicate with its user. When launched, three menus and a blank window are presented (ex. 1). The File and Edit menus, respectively, allow the user the usual Macintosh options for opening, closing, printing, saving, and allocating preferences; cut, paste, and copy functions, etc. It is the third menu, Sets, (ex. 2) which initiates the most important features of the program.

[Figure: Ex. 1]

[Figure: Ex. 2]

Pitch is referenced throughout the program through integer notation with the numbers 0-11 representing the chromatic scale modulo 12. When accessing the "Preferences" level of the File menu a dialogue box is activated (ex. 3) which gives the user the choice of specifying pc's (pitch classes) using hexadecimal notation, decimal notation, or a generally accepted alternate notation using the integers 0-9 and the letters t (ten) and e (eleven) to eliminate any confusion in base 10 notation.

[Figure: Ex. 3]

Once the form of input notation is selected a second dialogue box (the "Control Panel") is accessed (ex. 4) which allows the user to select the types of lists they would like to have declined when selecting the Transformations level of the Sets menu in screen 1 (each category of
transformation appears in its own screen).

[Figure: Ex. 4]

These include a choice of all forms of the entered set listed according to "normal form" or the original form which was input by the user. If the "set" was a twelve tone set there is an option for a "magic square." Traditional serial transformations such as transposition, retrogression, inversion and retrograde inversion are all available. In addition, tables can be generated for M5, M7, and any user defined multiplicative operation deemed necessary. Common pitches between the original pc set and its transformations can be listed and combinatorial matrices will be generated for collections possessing those qualities. The latter function calls up yet another dialogue box (ex. 5) which allows the user to choose which of the already derived transformations will be used in generating the matrices. Two other options are available, "Shuffle List" and "Use Normal Form." The "Shuffle List" command randomizes the order of matrices and the "Use Normal Form" allows the user to disregard the ordering of the original set and generate all matrices using the normal form of the collection.

[Figure: Ex. 5]

Having navigated through these last three dialogue boxes we have set all the preference parameters for a particular look at a pitch class set. In referencing the Input selection of the Sets menu we access the input screen (ex. 6). This window allows entry of up to twenty-five (25) characters horizontally and fifteen (15) vertically. A single line or matrix of the user's choice may be entered allowing the exploration of harmonic or melodic elements of the material.
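The pitch-class operations listed above (input with t and e, transposition, inversion, retrogression, the multiplicative operations M5 and M7, and the twelve-tone "magic square") follow standard definitions modulo 12, which can be sketched as below. This is not the Music Matrix source, which is a Macintosh program written in C; the function names and the example row are hypothetical, and the interval-class vector is included only because the program is said to list interval vectors.

```python
def parse_pcs(text):
    """Parse pitch classes written with the digits 0-9 plus t (ten) and e (eleven)."""
    symbols = "0123456789te"
    return [symbols.index(ch) for ch in text.lower() if ch in symbols]


def transpose(pcs, n):
    """T_n: add n to every pitch class, modulo 12."""
    return [(p + n) % 12 for p in pcs]


def invert(pcs, n=0):
    """T_n I: map every pitch class p to n - p, modulo 12."""
    return [(n - p) % 12 for p in pcs]


def retrograde(pcs):
    """Reverse the order of the set."""
    return list(reversed(pcs))


def multiply(pcs, m):
    """Multiplicative operation M_m (M5 and M7 are the usual cases)."""
    return [(p * m) % 12 for p in pcs]


def interval_class_vector(pcs):
    """Count the interval classes 1..6 among all pairs of distinct pitch classes."""
    vector = [0] * 6
    unique = sorted(set(pcs))
    for index, a in enumerate(unique):
        for b in unique[index + 1:]:
            distance = (b - a) % 12
            vector[min(distance, 12 - distance) - 1] += 1
    return vector


def row_matrix(row):
    """12x12 matrix: the first row is the prime form, the first column its inversion."""
    inversion = invert(row, 2 * row[0])  # inversion that keeps the first pitch class fixed
    return [transpose(row, start - row[0]) for start in inversion]


if __name__ == "__main__":
    triad = parse_pcs("047")
    print(transpose(triad, 2), invert(triad), multiply(triad, 5), multiply(triad, 7))
    print(interval_class_vector(triad))
    for line in row_matrix(parse_pcs("0e3487956 12t")):  # arbitrary example row
        print(line)
```

Retrograde forms of a row can be read from each matrix row right to left, and retrograde inversions from each column bottom to top, which is why a single square is enough to list the 48 classical forms.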
The first data screen (ex. 7) displays the original string (set) at the top of the screen followed by its normal form, interval class vector, and prime form name (from the Morris Listing). If the set has a "Z relation," the prime form, name, and normal form of that related set are displayed as well. Below this are the lists of transformations previously tailored in the "Control Panel."

[Figure: Ex. 6]

[Figure: Ex. 7]

By accessing the "Set" menu again and clicking on the "Build Matrices" item another dialogue box appears (ex. 8). This panel is at the heart of the program's effectiveness. Here the user may use any of the stock rules to operate on their material or create their own rules to search and find transformations of the collection they believe might be useful. The rules already incorporated into the program appear in the boxes on the left side of the screen. The various other buttons allow the user to customize their selection. The Max Count window allows the user to specify the number of iterations of a particular transformation. This is useful in situations where there is limited memory.

[Figure: Ex. 8]

The box in the lower right hand side of the screen allows the user to weight collections by changing the
index in each of the small boxes. The twelve boxes represent each of the possible twelve pitches found in the user's collection. The Do button allows the user to apply the rules selected and move to the next data window (ex. 9). Returning to the Set menu again and accessing the Build Regions and Build Domains levels (examples 10 and 11) brings about similar actions organizing transformations of this hexachord into full twelve-tone sets and 12X12 matrices of this set.

[Figure: Ex. 9]

[Figure: Ex. 10]

[Figure: Ex. 11]

While the above discussion has concentrated on twelve tone applications, the authors want to emphasize that a much broader range of applications is possible. In addition to refining the present program we are developing a program to transform time dimensions in a similar way.

Conclusion

As our tools become ever more powerful and we are tempted by the proverbial "automaton" we must be careful not to destroy the very essence of the act of music composition. We must take special care to allow the proper amount of interaction between each individual and the machine. Again, Boulez comments and we who create these tools should take heed:

"let us rather see the work as a series of refusals in the midst of so many possibilities; one must make a choice
and it is there that we encounter the
difficulty so well side stepped by the
expressed desire for 'objectivity:' Such
[5]Morris, Robert D.
choosing is precisely what constl~utes Composition with Pitch-Classes,
the work, being renewed at each .1~Stant Yale University Press, New
of composing; the act of composltlOn Haven and London (1987).
will never be identical with the
juxtaposition of ~e confronta~o~s "
established in an Immense statlstlc. [6]Roads, Curtis, ed. The Music
Machine, The MIT Press,
Pierre Boulez, "...Aupres et au loin" Cambridge, Massachusetts
[1, p. 204]
(1989).
Acknowledgments [7]Roads, Curtis, and John
Strawn, eds. Foundations of
Gratitude to Analysis Computer Music, The MIT Press,
and Technology, Inc. for Cambridge, Massachusetts
financial support of this (1989).
project and to the Center
for Arts and Technology at [5]Cope, David. Computers and
Connecticut College for Musical Style, A-R Editions, Inc.,
continuing support. Madison (1991).
Selected References [6]Ames, Charles. "Automated
Composition in Retrospect,"
[1]Boulez, Pierre. Notes of an Leonardo 20(2) (1987).
Apprenticeship, Alfred A.
Knopf, New York (1968). [7]Knuth, Donald E. The Art of
Computer Programming,
[2]Loy, Gareth, "Composing with Volume I-Fundamenta
computers," in Current directions Algorithms, Addison-Wesley,
in computer music research, ed. Reading, Massachusetts (1973).
Max Mathews and John Pierce,
The MIT Press, Cambridge, [8]Baird, Bridget, Donald
Massachusetts(1989). Blevins, and Noel Zahler.
"Implementing an Interactive
[3]Bain Reginald. "A Musico- Computer Performer."
linguistic Composition Computer Music Journal, Vol.
Environment." Proceedings, 17, N.2, pp.73-79, 1993.
ICMC 1988, San Francisco,
p.102.

[4]Baisnee, Pierrre-Francois, et
al. "Esquisse:A Compositional
Environment." Proceedings,
ICMC 1988, San Francisco,
p108.

75
76
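The step from a hexachord to a full twelve-tone set and its 12X12
matrix, mentioned in the discussion of the Build Regions and Build
Domains levels, can be illustrated with a minimal sketch. The program's
own procedure is not described in the text, so what follows is only the
conventional textbook construction; the function names
complete_hexachord and twelve_tone_matrix, and the choice to append the
complementary pitch classes in ascending order, are assumptions made
for illustration.

```python
def complete_hexachord(hexachord):
    """Extend a hexachord to a twelve-tone row (hypothetical helper).

    The six missing pitch classes are appended in ascending order;
    this ordering is an assumption, not the program's documented rule.
    """
    if len(set(hexachord)) != 6 or any(pc not in range(12) for pc in hexachord):
        raise ValueError("a hexachord needs six distinct pitch classes 0-11")
    return list(hexachord) + [pc for pc in range(12) if pc not in hexachord]


def twelve_tone_matrix(row):
    """Build the conventional 12x12 matrix for a twelve-tone row.

    Each matrix row is a transposition of the prime form; read downward,
    each column is a transposition of the inversion.
    """
    if sorted(row) != list(range(12)):
        raise ValueError("row must contain each pitch class 0-11 exactly once")
    first = row[0]
    return [[(pc - start + first) % 12 for pc in row] for start in row]


if __name__ == "__main__":
    row = complete_hexachord([0, 1, 4, 5, 8, 9])
    for line in twelve_tone_matrix(row):
        print(" ".join(f"{pc:2d}" for pc in line))
```

In this construction the main diagonal always holds the first pitch
class of the row, which gives a quick visual check of the output.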
Chapter 3

PERCEPTION
THE PERCEPTION OF MUSICAL STRUCTURE
AND TIME

David R. Keane, Lola L. Cuddy, Carole A. Lunney, and Jennifer Dufton

School of Music and Department of Psychology
Queen's University, Kingston, Ontario, Canada K7L 3N6
FAX 1-613-545-6808
E-mail keane@QUCDN.QueensU.Ca

Introduction.

The experimental research reported in this presentation was part of a
project concerned with the perception of musical time within
large-scale structures. Specifically, we were interested in
quantifying the perceived passage of time and its relationship to the
formal structure of a musical piece. A musical piece was composed
especially for the experiment. From a research aspect, this strategy
had a number of advantages: it ensured that, within a musically valid
stimulus context, a number of methodological constraints on the
experimental techniques could nevertheless be honoured, and it ensured
that the piece was equally unfamiliar to all listeners tested in the
experiment. Moreover, it provided an opportunity to compare
experimental findings with the composer's intentions and intuitions,
thus yielding hypotheses for further investigation.

The objective, from the composer's point of view, was to make a piece
that would be engaging and listenable, using 20th-century harmonic and
melodic features. Although the piece was written specifically for the
experiment, the aesthetic approach was virtually the same as the
composer would use in any musical composition. The one difference in
technical approach is that the structure was derived primarily from
textural, rhythmic, dynamic and melodic processes. Typically, the
composer would emphasize timbre in equal measure with the foregoing
parameters, but he decided in this instance, for the purpose of
simplification, to limit the timbral range to that comparable to a
conventional acoustic instrument.

The musical piece.

The composer was asked to provide a short (4-min) polyphonic
composition. Conventional (Western tonal) functional harmony was to be
avoided, but the piece was to contain a formal, metric, and motivic
structure that could be apprehended by our sample of listeners. A
requirement of the experimental technique was that the piece contain
no literal repetition of phrases, or repeats. The composer provided a
detailed segmentation analysis of the piece, but was not told anything
about the specific motivation for the experiments.

The composer's description follows: he set out to create a work with
some reference to tonal centres, but with distinctly nonfunctional
relationships among pitches. The resulting music is rhythmically and
metrically fairly conventional, but, in the main, the harmonies and
melodic line avoid triadic or other formulaic reference. The work
draws quite freely upon the 12 elements of the chromatic scale. It has
a clear sense of phrase and of cadence, but there is not-infrequent
elision of phrases. The first (30 bars) of the three sections treats
several thematic/textural ideas. The second section (57 bars) develops
these melodically and harmonically and the third (31 bars)
recapitulates the first section, but the materials are somewhat
elaborated and the segmentation into phrases is less simple and clear
cut. The work is structured largely through progressive (especially
dynamic, harmonic, and melodic) strategies, harmonic/melodic
sequences, and cadential breaks. The general flow of the piece,
however, could best be described as capricious.

Figure 1 displays the first two notated lines of the piece. The piece
was composed using Notator 3.1 software on an Atari 1040ST computer.
The MIDI file for the piece was then transferred to a Zenith Z-200
computer, which controlled both the presentation of the piece to
listeners and experimental tests based on the piece. Cakewalk
Professional Software 4.0 was used to create the test files used in
the experiment. The piece and the experimental tests were realized
with the pick-guitar setting of a Yamaha TX-802 synthesizer.

Experimental procedures.

For one component of the experiment, the methodology was adapted from
a study by Clarke and Krumhansl [1] in which listeners were asked to
identify the location and duration of short segments extracted from
musical pieces. These authors proposed that

the pattern of listeners' judgments reflected their sense of the rate
of musical development over time. The present experiment examines and
expands this proposal.

Figure 1. The first eight bars of the piece.

A second component of the experiment collected listeners' impressions
of the piece. This aspect of the experiment was of particular interest
to the composer because it offered systematic feedback of listeners'
impressions as the listeners became acquainted with the composition.
Rarely does a composer have much more than his/her intuition as a
guide to such decisions as amount of repetition, degree of digression,
length of sections, etc. Although having responses of the kind
generated by this study does not relieve composers of difficult
decisions, the responses do offer some support for strategies that
previously had no more frame of reference than a composer's hunch.

Sixteen musically trained listeners (6 males and 10 females, aged 19
to 22) participated. They heard the entire musical piece 20 times: 4
sessions of 4 listenings each, plus 4 additional listenings on the
testing date. After each session, listeners completed a two-part
questionnaire.

The first part asked the subjects to rate the piece on 10 subjective
dimensions, such as "active vs. inactive" and "pleasant vs.
unpleasant". The second part contained five questions addressing
listeners' impressions of specific features of the piece (i.e.,
rhythm, melody, motifs, pitch structure).

On the testing date, after hearing the piece four times, the listener
heard a series of 22
segments abstracted from the piece. The entire piece was represented
across the segments. The segments varied in duration from 11.5 to 25
sec, and were presented in random order. Listeners were asked to
estimate the location and duration of each segment on a horizontal
time-line, which represented the total duration of the piece (1 mm = 1
sec). After hearing each segment, listeners estimated where it started
and ended by marking two vertical lines on the time-line. A separate
time-line was provided for each segment. After all 22 judgments were
completed, the segments were played again, and listeners were asked to
rate the complexity and completeness of each segment. The listener was
asked to consider complexity as the perceived amount of musical
information contained in the segment, and completeness as the
perceived form of the musical phrase, similar to that of the perceived
form of complete and incomplete sentences.

Results from the first component of the experiment.

For each segment, for each listener, the midpoint between the two
vertical lines drawn on the time-line was measured in mm and it served
as the estimate of perceived location of the segment. A correlation
was computed, for each listener, between perceived location and actual
location of the segments. (A high correlation is evidence of a linear
relation between estimates of musical time and real time.)

For half the listeners, the correlation was statistically significant
(p < .05). The average data for these listeners, called the High
correlation group, is shown in Figure 2. Also in Figure 2 is the
average data for the remaining listeners, called the Low correlation
group.

The functions shown in Figure 2 were obtained by applying a resistant
smoothing technique to the obtained data. There is a nonlinear trend
in both functions, which is much more apparent in the Low group. Both
groups display a monotonic rise in the function up to a point toward
the end of the middle section of the piece (real time about 150 sec,
bars 74-75). The slope is much steeper for the High than for the Low
group. After that, the function flattens for the High group and
actually reverses for the Low group. At the end of the piece, for both
groups, the last segment was clearly distinctive and was located at
the far end of the time-line. (Note: The mean for the final segment
was not included in the smoothing procedure, to avoid undue influence
from this point.)

[Figure 2. Perceived location of segments as a function of actual
locations in the piece. Two panels, High Group and Low Group;
horizontal axis: Actual Location (seconds), 0 to 250; vertical axis:
Perceived Location (mm), 0 to 250.]

What is responsible for the nonlinearity in the estimates of perceived
location? A possible answer to this question lies in the listeners'
ratings of complexity and completeness. As the piece progressed, there
were gradual shifts in both these ratings for both groups of
listeners. Perceived complexity tended to increase during the
development section and remained high, on the average, until the end
of the piece. Perceived completeness (with the exception of the final
segment which was judged "very complete") tended gradually to fall
from the end of the development to the penultimate segment of the
piece. There were, of course, differences in ratings from segment to
segment, many of which corresponded to the composer's segmentation
analysis. It is the general trends, however, that are instructive on
the question above. These general trends reveal perceptual differences
between the first and third section of the piece, differences which
converge with the composer's description of the greater elaboration
and less simple segmentation of the third section.

The results shown in Figure 2 suggest that, as the piece progressed,
there was a tendency to overestimate the amount of real time remaining
until the end of the piece. The trends in the complexity and
completeness ratings suggest that this extension of time was
accompanied by a sense of increased musical information, and a sense
of increasing incompleteness, likely resulting in increasing musical
tension. Musical time "slows down" to accommodate this increasingly
rich musical experience.

Results of the second component of the experiment.

Eight of the 10 rating scales yielded judgments that differed
statistically from the neutral point of the scale. The listeners'
responses showed that they found the piece "active, restless,
unconventional, structured, interesting, and playful". Ratings on two
scales shifted during the listening sessions. The piece was judged as
becoming more "predictable" and more "pleasant". If we assume that the
piece was initially difficult to understand on the listeners' part,
the shift in perceived predictability and pleasingness with stimulus
repetition is supportive of Berlyne's theoretical approach to
experimental aesthetics [2] [3]. However, it will be necessary to
study musical
pieces of differing structural content to evaluate the contribution of
this finding fully.

Concluding remarks.

We have shown that the methodology proposed by Clarke and Krumhansl
[1] is a promising approach to studying listeners' sense of musical
time. Our conclusions are necessarily speculative: we need next to
know to what extent the experimental technique can identify
sensitivity to different styles and structures, and to what extent the
listeners' sense of musical time can be manipulated by varying
properties of musical structure.

The composer finds the initial results supportive for his operating
premise that, while contemporary works must evidence novel or
innovative techniques (which may be initially distracting or
alienating for the listener), such conventional musical features as
cadential divisions and progressive patterns can be used to reduce
bafflement, to engage listeners' attention, and to offer musical
pleasure. At the same time, the degree of accuracy with which
listeners are able to identify segments of the work in time and the
generally more favourable response to the music upon increasing
familiarity with it speak well for the average listener's ability to
enjoy and understand contemporary music. Of course, it is now
important to vary structural strategies in similar studies to
determine what effects each variable has on the listeners' responses.

[Experimental work was supported by a research grant from the Natural
Sciences and Engineering Council of Canada to L. L. Cuddy. D. R. Keane
is the composer of the piece.]

References.

[1] E. Clarke, C.L. Krumhansl: "Perceiving musical time", Music
Perception, Vol.7, N.3, pp. 213-252, 1990.
[2] D.E. Berlyne: Aesthetics and psychobiology, New York:
Appleton-Century-Crofts, 1971.
[3] K.C. Smith, L.L. Cuddy: "The pleasingness of melodic sequences:
contrasting effects of repetition and rule-familiarity", Psychology of
Music, Vol. 14, N.1, pp. 17-32, 1986.
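The location measure used in the first component of the experiment
(the midpoint of the two time-line marks, correlated per listener with
the actual segment location) can be sketched as follows. This is only
an illustration: the function names are hypothetical, the data in the
usage example are invented for demonstration, and the significance
test (p < .05) and the resistant smoothing are not reproduced.

```python
import math


def midpoint(start_mm, end_mm):
    """Perceived location: midpoint of the two marks on the time-line (mm)."""
    return (start_mm + end_mm) / 2.0


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    denom = math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    if denom == 0:
        raise ValueError("correlation is undefined for constant data")
    return sum(a * b for a, b in zip(dx, dy)) / denom


def location_correlation(marks_mm, actual_sec):
    """Correlate one listener's perceived locations with actual locations.

    marks_mm holds one (start, end) pair per segment; since the time-line
    is scaled at 1 mm = 1 sec, no unit conversion is applied.
    """
    perceived = [midpoint(start, end) for start, end in marks_mm]
    return pearson(perceived, actual_sec)


if __name__ == "__main__":
    # Invented marks for three segments, for demonstration only.
    marks = [(5, 20), (60, 80), (150, 170)]
    actual = [10.0, 90.0, 200.0]
    print(round(location_correlation(marks, actual), 3))
```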
TONE CENTER ATTRACTION DYNAMICS
AN APPROACH TO SCHEMA-BASED TONE
CENTER RECOGNITION OF MUSICAL
SIGNALS.
Marc Leman
University of Ghent
Institute for Psychoacoustics and Electronic Music
Blandijnberg 2
B-9000 GHENT
Belgium
Email: ml@flw.rug.ac.be

1 Introduction
In this paper we focus on the problem of tone center recognition and
interpretation. The model is part of a theory of musical morphology
whose aim is to provide an operational account of music cognition in
terms of psychoacoustics and dynamic systems theory. The model forms a
basis for non-symbolic research in music imagination and has
applications for music analysis as well as interactive music making.
Computer simulations, based on ear models and principles of
self-organization, show that networks of artificial neurons, exposed
to music, can develop the functional organizations that are relevant
for tone center perception [8] [9] [10] [11] [12] [13]. The network,
trained with music, develops a response structure divided into ordered
areas that reflect the circle of fifths for chords and tone centers.
The ordering, as well as the response to chords and tone centers, can
be considered the most important aspect of its emerging behavior. The
results, which show a good agreement with the mental structures for
tone center perception known from psychological investigations [7],
have led to a theory of tone semantics [10] [11] [12].
The computer model shows that a schema for tone center perception
emerges automatically - without effort. The schema is produced by a
self-organizing neural network of the Kohonen-type [6] and the inputs
are patterns that come from an ear model. No information from higher
levels or "domain specific knowledge" is needed to explain the
emergence of the response structure, so the approach is purely
data-driven.
But in tone perception there is more involved than just long term
learning. Mechanisms at the periphery of the auditory system are
continuously processing new information into auditory images. While
these short term images may contribute to learning at long
term, their relevance as isolated auditory "objects" is far from
evident with respect to intelligent action and decisions in an
environment. To make any sense at short term, we must suppose that the
information flow is somehow related to information that is stored in a
long term memory. It can be argued that organisms equipped with such
mechanisms have a high survival value. The basic principle of
recognition and interpretation is that rapidly changing patterns (that
come from the senses) are related to memory structures (schemata) that
are less vulnerable to such rapid changes. New information is resolved
by the existing knowledge frame such that the organism can efficiently
react to the stimuli of the environment.
The aim of this paper is to explore aspects of schema-driven
perception in a tone center perception task. Although schema-driven
perception is not well understood [2], we present a framework of how
stable structures of tone perception, once established, might be used
in recognition and interpretation of musical signals. This involves a
short term and schema driven interaction because it relies on
organised information in memory.

2 A Model of Tone Center Perception
The computer model consists of two modules: a perception module and a
cognition module. The perception module is based on a psychoacoustical
model of the ear [14]. The function of the ear model is to extract the
relevant pitch information that is present in the signal. For tone
center recognition we estimate that the relevant pitch information is
to be found in the residue region, that is, between about 50 and 500
Hz [15]. The frequencies of this region convey the tone information of
the music at a particular point in time. This information is
represented by a so-called "tone residue image".
The ear model consists of two parts: an analytical part and a
synthetical part. The analytical part, adapted from Van Immerseel and
Martens, is based on a model of hair cells implemented as a bank of
asymmetric bandpass filters at distances of one bark (one critical
band). 20 such filters are used in the range of 220 to 7075 Hz (center
frequencies). The ear model operates at a sampling rate of 20000
sa/sec, but the extraction of envelope patterns allows a down-sampling
to 2500 sa/sec. The filters perform a non-linear frequency analysis of
the musical signal. The output is a vector of 20 elements, each
element representing the probability of neural firing within an
interval of 0.4 msec.
The synthetic part of the ear model comprises a periodicity analysis
in each channel and a
summation over all channels. Periodicity analysis is carried out by
autocorrelation of the firing patterns. To sharpen the peaks, we clip
the firing values to half of the mean value of the analysed frame. The
autocorrelations are computed at intervals of 10 msec.
To account for the temporal dependencies between patterns, the tone
residue images are integrated within a window of 3 seconds. The
resulting images are called "tone context images" [13].
To obtain the tone center patterns, we calculate the last tone context
pattern of 72 cadences, to wit: all cadences of degree I-IV-V7-I,
I-II-V7-I and I-VI-V7-I for major and minor, as well as their
transpositions over the chromatic scale (3 * 2 * 12 = 72). Given these
72 tone context patterns, we take the mean of the three corresponding
cadences for each tone center, which reduces our data to 24 (72 / 3 =
24) patterns. The relationships between the patterns show a similarity
of 0.98 and 0.96 with the relationships in the data of Krumhansl.

3 The Metaphor
The tone center attraction dynamics operates at short term and relates
tone context information to the schema. In particular, recognition is
associated with the localization of short-term auditory images in a
space of tone center images, while interpretation is associated with
an adaptive dynamics of the position of the short-term image within
this schema. While we define recognition as a purely data-driven
feature, it is assumed that interpretation involves the
reconsideration of past interpretations in view of new contextual
evidence.
The tone center attractor dynamics (TCAD) is based on a model of brain
dynamics. It describes the behavior of the brain in terms of
attractors, stable states and transitions towards stable states [5]
[1]. The states that attract other, nearby, states are called
attractor states. We assume that tone center images are stable states
that serve as attractors for other, less stable, states - such as the
tone context images. An image corresponds to the activity of neurons
and is represented by an ordered array of numbers - a vector or
pattern. A state is an image at a certain point in time.
Two features of this dynamics require particular attention. First of
all, in tone center perception, it turns out that recognition and
interpretation is often ambiguous - in the sense that the perceived
key is related to different tone centers at the same time. The
attractor states are indeed highly related to each other. In Jazz
music, for example, tone context patterns often have no pronounced
tone center and part of the art is just to avoid the
attraction of tone centers. The tone context is then localized
somewhere in the middle of different attractor states, and the forces
of attraction have a relatively weak influence on the position of the
percept - which gives more freedom to the performer. In addition, a
feedback strategy may be necessary in order to adapt previous
interpretations in the light of new evidence. For example, in the
sequence IV-V-I, the first chord points to the tone center of the
tonic of the chord and it is only after hearing the rest of the
sequence that the interpretation of the first chord can be given in
terms of a sub-dominant and dominant function to the tonic of I.

a Recognition
To simulate recognition we restrict the model to a determination of
the position of the tone context image in a framework of tone center
images. In other words, we just follow the tone context as it is
produced by the data-driven processes, and compute the distance of the
pattern to all tone centers in terms of similarity relationships.
Examples of such an analysis are given in Leman [12].

b Interpretation
The approach described above is limited to a registration of the
position of the tone context in a framework of tone centers. It does
not involve any activity of the schema. Therefore, problems such as
those related to the interpretation of the IVth degree are not
resolved.

The metaphor of TCAD is that of an elastic snail-like object moving in
the state space. The head follows the time index of the information
stream and the tail corresponds to the P-states. The position of the
object P is given by an interpretation state whose dimensionality is
equal to the number of stable states. Its vector contains the
distances of P to the tone center images and thus determines the
position of P in the framework of stable points. The path followed by
the head of our elastic snail-like object corresponds with the
recognition process described in the previous paragraph. The tail,
however, corresponds to a frame that keeps track of the past P-states.
Thus, it consists of an array of interpretation-vectors, one for each
P-state in the memory frame. In the current implementation, the
determination of the position of the head is important because the
interpretation of the past, the tail, depends on it. And by changing
from one attractor to the other, the interpretations will change too.
The position of the tail is then reconsidered in the light of the new
position of the head. But because the tail is susceptible to the
forces of attractors, it may happen that the positions of parts
of the tail remain near one attractor, while the head and another part
of the tail are near another attractor. That is what is meant by
"elasticity": a demarcation of past interpretation states that is
caused by the forces of different attractors.

4 Dynamics
In what follows we simulate aspects of the attractor dynamics by a
computational method. The aim is basically to clarify our intuitions
in these matters. The law of adaptation is defined as:

\[
P(t,\tau)_i = P(t-1,\tau-1)_i
  + \alpha \sum_{T_k \in A(t,0)} \mathrm{cor}\left[P(t,0)_i, T_{k,i}\right] T_{k,i}
  + \beta \sum_{T_k \in A(t-1,\tau-1)} \mathrm{cor}\left[P(t-1,\tau-1)_i, T_{k,i}\right] T_{k,i}
\tag{1}
\]

where i is an element of the vector P and α and β are scaling
parameters. The summations run over all tone center images T that are
within a distance (defined by a threshold) from P. These tone center
images belong to the attractor-set A. The double time index is used to
specify the absolute time (t) and the offset towards the past (τ). The
equation says that the adaptation of P(t,τ) is based on the proper
history, P(t-1,τ-1), plus a change based on the contribution of
P(t,0)-attractors and P(t-1,τ-1)-attractors. Cor(.) is the correlation
function.
In general, the adaptation depends on the correlation between P and T.
When the correlation is above the threshold (it satisfies the
condition of the A-set), then P is in the attractor-basin of T and
will be adapted such that it shifts a little bit more to T. There
might be more than one T in the A-set. Obviously, Eq. 1 applies only
to a limited history of P-states. Therefore, the range of τ is limited
to a few seconds. P(t,0) by itself is not adapted. To allow a better
concurrence between the members of the A-sets we have rescaled the
output cor() such that all values above or equal to the threshold are
rescaled to values between 1 and 0.

5 TCAD at work
The test example is the chord sequence
FM-FM-CM-Cm-Gm-Dx(5)7-Gm-Gm-BbM-BbM-FM-Am(7)9-FM-FM. The expression
"x(5)7" means dominant chord minus the fifth and "m(7)9" means minor
chord with ninth minus the seventh. The sound was generated by a
Csound program, using Shepard-tones with a duration of 0.5 sec per
chord and a short attack and decay (to avoid clicking). For practical
reasons we have summarised the results on a tone center map (Figure
1). The map shows the trajectory of the purely data-driven method
("recognition") in full line and
the schema-driven method ("interpretation") by the dotted line. For
the latter, the position of the states was taken after two seconds of
adaptation. The trajectory of the interpretation compensates the
time-delay that is introduced by the integration of the tone residue
patterns (not shown on the figure). Also the path is slightly
different. In general, the TCAD-analysis seems to give a more accurate
estimate of the tone context.

6 An approach to psychoacoustic-based harmonic analysis

a An approach to psychoacoustic-based harmonic analysis
A TCAD-analysis outputs the position of a particular tone context
image with reference to a frame of tone center images. Apart from its
interest as a "fine-grained" tonal analysis, it is to be expected that
this information can be of help in defining functional roles of tones
and chords.
We are currently investigating this application in the framework of
HARP (Hybrid Action Representation and Planning) [3] [4].

b Tone center information in interactive systems
In interactive music systems, by which we understand systems whose
real-time sound generation is based on a sound environment that
includes its own acoustical signals, we believe that the model of tone
center recognition might provide important information for the
navigation in the tone center space. Indeed, it is not unrealistic to
suppose that very soon, the recognition part of our model can be
realised in real-time. Much more computational power would be needed,
however, to cope with the interpretational part of the model.

7 Conclusion
We presented a framework for schema-based tone center recognition and
interpretation. Recognition was associated with the localization of
short-term auditory images within a schema of tone center images.
Interpretation was associated with an adaptive dynamics of the
position of the percept within this schema. The model assumes that
interpretation involves the reconsideration of past interpretations in
view of new contextual evidence.

Acknowledgements
We wish to thank H. Sabbe and the Belgian National Science Foundation
(F.K.F.O) for support, as well as N. Cufaro Petroni for discussion
during a stay in Ghent. We are grateful to A. Camurri, J. P. Martens,
L. Van Immerseel and B. Willems for their help. The scientific
responsibility is assumed by the author.
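A single step of the law of adaptation (Eq. 1 in Section 4) can be
sketched in a few lines. This is an illustration under stated
assumptions, not the implementation used for the results above: cor is
read as the Pearson correlation between whole vectors, the state P and
the tone center images T are taken to have the same dimension, the
rescaling maps correlations linearly from [threshold, 1] onto [0, 1],
and the names adapt and attraction as well as the default values of
alpha, beta and threshold are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length vectors (0.0 if one is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    denom = math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    return sum(a * b for a, b in zip(dx, dy)) / denom if denom else 0.0


def attraction(p, centers, threshold):
    """Sum of cor(p, T) * T over the attractor set A of the state p.

    A center belongs to A when its correlation with p reaches the
    threshold; the correlation is then rescaled to the range 0..1
    (assumed linear rescaling).
    """
    pull = [0.0] * len(p)
    for center in centers:
        c = pearson(p, center)
        if c >= threshold:
            weight = (c - threshold) / (1.0 - threshold)
            pull = [acc + weight * ti for acc, ti in zip(pull, center)]
    return pull


def adapt(p_prev, p_head, centers, alpha=0.1, beta=0.1, threshold=0.8):
    """Compute P(t, tau) from P(t-1, tau-1) and the head state P(t, 0)."""
    if not 0.0 <= threshold < 1.0:
        raise ValueError("threshold must lie in [0, 1)")
    from_head = attraction(p_head, centers, threshold)
    from_past = attraction(p_prev, centers, threshold)
    return [p + alpha * h + beta * q
            for p, h, q in zip(p_prev, from_head, from_past)]


if __name__ == "__main__":
    # Two toy "tone center images" in a 4-dimensional space (invented data).
    centers = [[1.0, 0.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0]]
    past = [0.9, 0.1, 0.0, 0.1]
    head = [1.0, 0.0, 0.1, 0.0]
    print(adapt(past, head, centers))
```

A state whose correlation with every tone center stays below the
threshold has an empty A-set and is returned unchanged, which matches
the remark that only states inside an attractor-basin are shifted.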
References

[1] D.J. Amit, "Modeling brain function: the world of attractor neural
networks", Cambridge University Press, Cambridge, MA, 1989.

[2] A.S. Bregman, "Auditory scene analysis: the perceptual
organization of sound", The MIT Press, London, 1990.

[3] A. Camurri, C. Canepa, M. Frixione, R. Zaccaria, "HARP: A
framework and a system for intelligent composer's assistance", In D.
Baggi ed., Readings in computer generated music, IEEE Computer Society
Press, Los Alamitos, CA, 1992.

[4] A. Camurri, C. Innocenti, C. Massucco, R. Zaccaria, "A software
architecture for sound and music processing", 35 (1-5),
Microprocessing and Microprogramming (Proc. EUROMICRO-92), Paris,
1992.

[5] H. Haken, M. Stadler, "Synergetics of cognition", Springer-Verlag,
Berlin, 1990.

[6] T. Kohonen, "Self-organization and associative memory",
Springer-Verlag, Berlin, 1984.

[7] C.L. Krumhansl, "Cognitive foundations of musical pitch", Oxford
University Press, New York, 1990.

[8] M. Leman, "Symbolic and subsymbolic information processing in
models of musical communication and cognition", 18(1-2), 141-160,
Interface - Journal of New Music Research, 1989.

[9] M. Leman, "Emergent properties of tonality functions by
self-organization", 19(2-3), 85-106, Interface - Journal of New Music
Research, 1990.

[10] M. Leman, "Een model van toonsemantiek: naar een theorie en
discipline van de muzikale verbeelding", Doctoral Dissertation,
University of Ghent, Ghent, 1991a.

[11] M. Leman, "The ontogenesis of tonal semantics: results of a
computer study", In P. Todd & G. Loy (eds.), Music and connectionism,
The MIT Press, Cambridge, MA, 1991b.

[12] M. Leman, "The theory of tone semantics: concept, foundation, and
application", (2), 345-363, Minds and Machines, 1992a.

[13] M. Leman, "Tone context by pattern-integration over time", In D.
Baggi ed., Readings in computer generated music, IEEE Computer Society
Press, Los Alamitos, CA, 1992b.

[14] L. Van Immerseel, J. P. Martens, "Pitch and voiced/unvoiced
determination with an auditory model", 91(6), 3511-3526, The Journal
of the Acoustical Society of America, 1992.

[15] E. Zwicker, H. Fastl, "Psychoacoustics: facts and models",
Springer-Verlag, Berlin, 1990.
Figure 1. Trajectories of recognition and interpretation on a tone center map. The figure is a torus in that the left and right sides, as well as the upper and lower sides, are connected.

Chapter 4

NEURAL NETWORKS
AUTOMATIC PERFORMANCE OF MUSICAL
SCORES BY MEANS OF NEURAL NETWORKS:
EVALUATION WITH LISTENING TESTS.
G. U. Battel*, R. Bresin+, G. De Poli+, A. Vidolin*

*Conservatorio di Venezia "Benedetto Marcello"

+D.E.I. - C.S.C. Universita degli Studi di Padova


via Gradenigo 6/a - 35129 Padova - Italy
phone: +39 49 8287631 - fax: +39 49 8287699
E-mail: depoli@dei.unipd.it, rb@csc.unipd.it

Introduction

Musicians, depending on the instrument they play, introduce loudness, duration and timbre deviations on the notes of the score they are performing, since traditional musical notation does not fully convey the composer's real intentions and leaves some degrees of freedom to the player. These deviations distinguish the performing characteristics of one pianist from those of another. Furthermore, a literal performance of a musical score would sound extremely mechanical and unnatural to the listener.

The present work starts from the research of Sundberg and co-workers on the automatic performance of scores [1] [2] and continues the research on real-time performance of piano scores by means of particular artificial neural networks. In our previous works [3] [4] we showed that it is possible to build neural networks which can learn some performance rules. These nets show good generalization properties and, after a training phase, are able to perform any score in real time, introducing appropriate deviations.

In the present research we propose a comparison test between various performances in order to evaluate, by means of listening tests, the use of trained neural nets in automatic performance.

Description of the experiment

Methods

Materials

Two tonal melodies were chosen to be used in this study.

The first one is the theme of the third movement of Mozart's piano sonata K. 284; the second is the theme of the first movement of Mozart's piano sonata K. 331. For the experiment we used a subset of Sundberg's rules: the selected rules mainly influence the relations between neighbouring notes and do not involve larger segments. The selected rules are therefore best suited to a structurally simple musical example and to a strictly classical repertoire. This formal requirement, together with the fact that the sonata K. 331 had previously been used in other works [1], led us to choose the two Mozart themes. There are two starting hypotheses: it is possible to obtain an acceptable performance with small performance deviations and without involving the large-scale form; and, owing to the intrinsic characteristics of Mozart's music, the melodies can be performed in a meaningful way on their own, with deviations that sound pleasant and understandable to the listener.

We performed each of the two melodies in three ways: deadpan (with no expression); with expression given by a subset of Sundberg's performance rules (in the following these melodies will be called rules-melodies); and with expression given by two neural networks trained with the same subset of Sundberg's rules (in the following these melodies will be called nn-melodies). The rules that we applied are the following [1]:
- durational contrast
- melodic charge
- articulation of repetition
- leap tone duration
- leap articulation
- high loud
- phrase.

As an example of comparison, Figure 1 shows the time deviations (in milliseconds) in the two non-deadpan performances of the theme of the K. 284 sonata. The value 0 is the nominal value; it corresponds to the deadpan version (i.e. no time deviations).

[Figure 1 plot not reproduced: time deviations per note; legend entry: Neural network; horizontal axis: Notes]

Fig. 1. Time deviations in the K. 284 sonata (detail).

Equipment

The melodies were performed via MIDI by a Yamaha Disklavier Grand Piano connected to an 80386/40 MHz PC compatible. To

obtain the deadpan melodies and the rules-melodies we used a program called MELODIA, developed at the C.S.C. [5]. To obtain the nn-melodies we trained two neural networks: one for the loudness deviations, and another for the time deviations (see Figure 2).

[Figure 2 diagram not reproduced: network inputs ND, MC, LP, S, P, AR, LA; outputs ΔDR, ΔDRO]

Fig. 2. Neural network for time deviations (ND = Nominal Duration; MC = Melodic Charge; LP = Leap Presence; S = Semitones in a leap; P = Phrase; AR = Articulation of Repetition; LA = Leap Articulation).

Subjects

Subjects for the study were 20 professional musicians and students in the final years of the music conservatory of Venice, who volunteered for the experiment. They were 12 men and 8 women. The youngest was 15 years old and the oldest 32 years old. 17 of them were pianists, undergraduate and postgraduate.

Procedure

Before the experiment began, subjects were asked to read a sheet containing the instructions for the experiment and the cells in which to write their judgements. Each subject was given a copy of the sheet. The sheet explained that the aim of the experiment was to compare three different piano performances built with the help of a computer, and listed the titles and the authors of the melodies. The text of the sheet was the following:

"You will listen to three different performances of each melody, and you'll have to evaluate the musical quality of each performance as if a student were playing. You must express your evaluation with a mark from 1 to 10, using the whole scale if possible. 1 is the worst mark, 10 is the best one: avoid giving too many marks in the intermediate range (between 5 and 6), and try to use extreme values (1 and 10).
The judgement doesn't have to be critical in an absolute sense, but has to show the qualitative differences which you find in the three different performances of the same melody. There aren't right or wrong answers: the aim of this test is to find the performance you think is the best. Between two melodies you have 30 seconds to judge the previous three different

performances of the melody just heard, and to write your judgement in the appropriate cells."

The three different performances of each melody were played in a random order.

The total duration of the test was 4'50".

Results

The results can be analyzed by using an ANOVA with repeated measures on each of two factors (version, melody). The analysis shows significant effects for version [F(2,38) = 10.36; p = .000]. The nn-melody is the most preferred version (preference rating, 6.73), followed by the rules-melody (6.3) and the deadpan (4.28).

The most important result is the preference given by the subjects to the performed melodies (rules-melodies and nn-melodies), which obtained a mean score 2 points greater than the deadpan version. Furthermore, if we consider the Italian school grading tradition, the subjects gave a more than fair rating to the performed versions and an unsatisfactory rating to the deadpan version, even if these values do not reach the extreme values of the scale (from 1 to 10) (see Figure 3).

[Figure 3 plot not reproduced: mean ratings by version (D, RM, NNM) for the two melodies K284 and K331]

Fig. 3. Means for the interaction between the effects of version and melody (D = deadpan; RM = rules-melody; NNM = nn-melody).

Discussion

The subjects found that the greatest difficulty was the small difference between the three performances of the same melody; they often said: "They are quite equal". This fact validates the initial hypothesis, and stresses the need to continue our research considering a larger number of performance rules and a larger repertoire.

The principal outcomes of this experiment are the equivalence of the nn-melodies and the rules-melodies, and the preference given to them over the deadpan melodies. There are two main reasons why the differences are not so marked: we used k=1 in the Sundberg rules (this means little emphasis in the performance rules, since none of the melodies has a slow metronome marking); from the discussion we had with the subjects after the test emerged the difficulty to find great differences

between the three performances of the same melody. On average the subjects found the performances to be poor, perhaps because of the absence of an accompaniment.

Conclusions

In our opinion, the main reason for the preference of the nn-melodies over the rules-melodies is that the deviations depend on the contribution of several rules. When only one of these contributions is responsible for the deviations, neural networks and Sundberg's rules give the same results. When several factors act together, the additive action of Sundberg's rule system and the interpolation properties of the neural networks give different results. From this observation and from the results of the test it emerges that neural nets follow strategies which are closer to the performing action of a human performer, and so they can better simulate the process of performance (see Figure 1).

References

[1] Friberg A. "Generative Rules for Music Performance: A Formal Description of a Rule System", Computer Music Journal, vol. 15, No. 2, pp. 56-71, Summer 1991

[2] Sundberg J. et al. "Performance Rules for Computer-Controlled Contemporary Keyboard Music", Computer Music Journal, MIT Press, vol. 15, No. 2, pp. 49-55, Summer 1991

[3] Bresin R., G. De Poli, A. Vidolin "Un approccio connessionistico per il controllo dei parametri nell'esecuzione musicale", Atti IX Colloquio di Informatica Musicale, Genova, pp. 88-102, 1991

[4] Bresin R., De Poli G., Vidolin A. "Symbolic and sub-symbolic rules system for real time score performance", Proceedings of the 1992 ICMC, International Computer Music Association, San Francisco, 1992

[5] Bresin R. "MELODIA: a program for performance rules testing, teaching, and piano scores performing", elsewhere in these proceedings, 1993

Timbre clustering by self-organizing neural
networks
Giovanni De Poli, Paolo Prandoni, Paolo Tonella
CSC-DEI University of Padova, Via Gradenigo 6a,
35131 Padova, Italy
tel: +39-49-8287631, fax: 8287699, email: depoli@dei.unipd.it

Introduction

Timbre is that attribute of auditory sensation which allows listeners to rate as different sounds presented in ways altogether similar with respect to intensity, duration, and pitch. The similarity between two sounds can be characterized in physical and mathematical terms only with difficulty because it is a subjective attribute and it depends on a large number of parameters.

J. M. Grey, in his classic work [1], introduces the concept of "timbre space", a means with which he conveyed the vague notion of similarity between timbres into the precise notion of a metric rule in a three-dimensional space. This space was the result of a multidimensional scaling applied onto a large set of subjective similarity ratings obtained in experimental sessions. A physical interpretation of the reasons for such a spatial distribution was also provided.

In this work we will try to follow the lines of Grey's experiment, but using a neural network as the means to rate timbre differences and to transform them into metric relations. Neural nets have been used already in this field of research [4]; the aim of our work is to simplify timbre multidimensionality, following the lines of Grey's experiment, and to obtain similar results in terms of clusterization and of timbre space. The tools we use are Kohonen self-organizing neural networks (KNN): they show an ability to correctly classify items outside the training set, and they prove highly insensitive to noise. Another reason for their use comes from neurophysiology: the principles of self-organization Kohonen proposes were derived from a model of the cerebral cortex; it is therefore interesting to compare our results with those obtained by Grey starting from subjective judgments.

Grey timbre space

J. Grey's experiments at Stanford University in 1975 were aimed at a thorough investigation in the field of musical timbre. He considered the following synthetic test sounds, obtained from a spectral analysis of recorded true instruments: bassoon (BN), normal cello (S2), E flat clarinet (C1), flute (FL), French horn (FH), English horn (EH), muted cello (S3), oboe (O1, O2), cello sul ponticello (S1), soprano sax (X1, X2, X3), trombone (TM), and trumpet (TP); during the experimental sessions, a group of musically trained listeners provided subjective ratings of the differences between tones. These perceptual data were averaged and arranged in a similarity matrix. This matrix was then processed using a multidimensional scaling (MDS) algorithm; the result was the distribution of the timbres in an n-dimensional space; at the same time, the matrix was analyzed using a hierarchical clustering algorithm based on the diameter method, and the result was an independent timbre grouping. The most interesting result was that the clusters thus obtained enclosed timbres located at low distance in the three-dimensional timbre space produced by the MDS algorithm. Grey proposed a physical interpretation for the three dimensions, showing the first dimension to be related to the spectral distribution of energy, the second dimension to the presence of synchronicity in the attack stage through the harmonics, and the third dimension to the presence of high frequency inharmonic noise with low amplitude, during the attack segment.

As timbre, in its definition, is the feature which differentiates sounds under the same conditions of pitch, intensity and duration, Grey first had to equalize the sample sounds with regard to those parameters. This equalization stage featured many psycho-acoustic sessions aimed at the comprehension of the phenomena underlying subjective perception and finally produced a set of sound samples where the timbral issue was the only discriminant.

We used the same data Grey used; they consist of a line-segment approximation of the true evolutions both in frequency and in amplitude of the sound partials as they resulted from a heterodyne analysis of the equalized analog signal.

Self-organizing neural networks

Kohonen's neural networks are inspired by the process that seems to be responsible for the map-like organization of the cerebral cortex; the observable organization of the cortex neurons shows that some zones of the cortex are sensitive to certain stimulations and indifferent to others. The basic mechanism, believed to account for this process of self-organization of the brain, is called the Hebb principle; it asserts that if a particular neuron has a considerable reaction to a stimulation, its synapses adapt themselves to the acting stimulus, and a lateral feedback process takes place; an activity bubble is formed in the close neighbourhood of the cell, while cells surrounding the bubble are inhibited. In this way a clusterization process is generated, and the activity bubbles come to be located in different zones of the neural map according to the stimulations to which they are most sensitive.

T. Kohonen formalized this process into a simple numerical algorithm [5]. The arising neural model shows surprising properties of self organization: its inner structure modifies to become an n-dimensional projective model

of the m-dimensional probability space from which the input samples come. As it is generally $n < m$ (while $n = 1, 2$), the neural map performs a feature extraction: along the n axes of the map those input features are mapped which have the largest numerical variance. This also explains the good behavior these models exhibit in the presence of noise: KNN can maximize the amount of information stored because they organize complying to two conflicting requirements: to increase the variance of the outputs of all neurons, with the purpose of recognizing the main features of the inputs; and to introduce a certain degree of redundancy, with the purpose of obtaining correct answers even in presence of noise [6].

3D Kohonen Nets

The first experiments we carried out referred directly to Grey's main result: the three-dimensional timbre space. To obtain results to be compared directly with Grey's, we planned using a 3D neural model, extending Kohonen's equations into the third spatial dimension.

The basic algorithm ruling the self-organization process is the following: at each training step $t$ a new input vector $x(t)$ is presented to the net; the neuron $i$ whose inner values vector $m_i$ is closest to the input vector $x$ is selected as the best matching neuron. Different metric rules can underlie this matching criterion; for instance, we adopted the euclidean metric and the "city block distance". Around the best matching neuron a topological neighborhood $N_c(t)$ is defined as the spatial region where the actual learning occurs: for all the indices $i \in N_c(t)$

$$m_i(t+1) = m_i(t) + \alpha(t)\,[x(t) - m_i(t)],$$

while the other neurons are not updated. Both $N_c$ and $\alpha$, the learning rate, decrease with time: the main structural changes in the net happen at the beginning of the process, when the neighborhood is large, while the remaining steps allow a fine tuning of the neuronal inner values. In our case the topological neighborhood is three-dimensional; its actual shape, cubic or spherical, is not essential, nor is its shrinking rule, which could be linear or exponential.

Literature, however, offers almost no example of three-dimensional forms of this equation. A mathematical analysis of KNN dynamics is extremely difficult; their properties were discovered through experimental simulation and practical applications. For this reason some preliminary experiments were performed to verify this extension.

A first task was to run the classical self-organization test [5, page 133] on our new structure: if the input vector $x$ is a random variable with a stationary probability density function $p(x)$, then an ordered image of $p(x)$ will be formed onto the input weights $m_i$ of the processing units. If a uniform distribution over a cubic region is used for the input space, an ordered cubic lattice of points should be obtained as the ultimate structure of the map. Kohonen suggested a minimal number of training steps of 500 times the number of neurons in the net [6, page 1496]; working with this too conservative an estimate. Probably, the more complex lateral interferences

in the solid case require allowing a longer phase to structural modifications.

[Figure 1 plots not reproduced]

Figure 1: Three stages of evolution in the process of self ordering for a three-dimensional map. (a) startoff, (b) 1000000 iterations; (c) 2000000 iterations.

Figures 1.a, 1.b, and 1.c illustrate two significant steps of this expected evolution and show the validity of the three-dimensional model. In these plots the neuron values $m_i$ are represented as circles in the same coordinate system of the input values, with lines connecting those units which are adjacent in the neuronal array. Figure 1.c shows how adjacent units end up assuming adjacent values.

3D Clusterization

At this point we presented the net with numerical data obtained directly from those used by Grey in his listening sessions. We used samples of the frequency and amplitude evolutions of the sound signal in their line-segment approximation, so that all of the processing was made by the neural network; we also tried with data resulting from a pre-processing of sounds, so that the network operated only at the most critical stage, the classification stage [8]. In all cases, the way we used Kohonen networks was somewhat fragile because only few learning samples were available with respect to the number of neurons in the network. A lack of samples causes a great sensitivity in the network final state to the starting random values of the weights. It happens that some of the neurons remain untouched by the learning process and the inner structural organization cannot unfold. It is possible to reduce this sensitivity to the initial conditions by computing the mean of the different results obtained in a series of experiments, so that the effect of the initial random weights is canceled by the average [7, page 7]; we studied the convergence properties of the average, and we noted the presence of a final mean configuration with low values of variance, and, accordingly, of the relative error (3% is a typical value).

We obtained the best results using Grey's data directly: in the original line-segment representation all the necessary information to reconstruct the complete heterodyne analysis of the timbre is contained and it could thus be used as an input to the neural network. We built an input file containing, for the first 20 harmonics of 12 signals, 10 amplitude envelope samples and 5 frequency envelope samples, and we fed it to a network sized 8 * 8 * 8 = 512 neurons. After the learning phase, we computed the matrix of relative timbre distances from the spatial location of the best-matching neurons in the map; to this matrix, the same clusterization algorithm used by Grey was applied and the result was:

{(BN FH) [TP (FL S2)] [S1 S3]}
{[(C1 EH TM) O2] X3}

where the brackets split successive levels in the clusterization process. The analogies with Grey's results,

{[(BN TP) FH] [(S2 S3) (FL S1)]}
{C1 (EH X3)}
[O2 TM],

are encouraging; the mismatches are due more to the different times at which grouping occurs, than to actual grouping differences.

Timbre Interpolation

The most critical point in the previous experiments was the small number of learning samples, which were just the 14 original timbres. Besides averaging results, we tried to increase the number of samples: starting from Grey's original data, and following a line exploited by Grey himself, even though for other purposes [1, pages 75-95], [3], we considered each one of the possible couples of tones and obtained, by an algorithm of linear interpolation of the spectral envelopes, two artificial tones at 1/3 and 2/3 of the distance between the couple extremes. In so doing, we implicitly discarded all information about the frequency evolution of partials, adopting a coding of sounds which Grey calls the fixed frequency model. In the end we reached a data set of 200 units.

Clustering algorithms are generally very sensitive to little perturbations in the data points; therefore, even if the timbre space built by the net were not so much different from Grey's, the clustering algorithm could produce a completely mismatching result. Committing the accuracy judgement to a close match between clusterings seems too strict a requirement; however, since Grey does not provide the similarity matrices he used, a comparison between them and our distance matrix, which would be the best criterion, is impossible. In order to obtain a significant index of similarity for the timbre spaces, we exploited the information contained in [1, page 60], that is, the order in which clustering occurs: 1.(S1, S3), 2.(O1, O2), 3.(BN, TP), 4.(X1, X2), 5.(C1, C2), 6.(X3, EH), 7.(FL, S1), 8.[(BN, TP), FH], 9.[(O1, O2), TM], 10. ...

We define the following index of disorder

$$D = \sum_{k=1}^{N} \left| (d_{k+1} - d_k) - \left| d_{k+1} - d_k \right| \right|$$

where $d_k$ is the euclidean distance

between net points which, in Grey's space, belong to the k-th cluster. Distances are computed according to the diameter algorithm, that is, the maximum among all possible distances in a group. Clearly $D \ge 0$, and $D = 0$ only if $d_0, \dots, d_N$ are in ascending order. This latter case does not imply spatial equality, but grants a good similarity. When $D > 0$, distances are in a scrambled order; each inverted group contributes to $D$ with a term proportional to the degree of inversion.

Experiments using the extended data set were run on both two- and three-dimensional networks; convergence now required a huge number of steps and rendered our work a trial to patience. Evaluations of the index of disorder proved consistent in all of the experiments: after a first widely varying phase, due to the large structural changes occurring at the beginning of the self-organization process, the index settled around a value of fifteen (fig. 2), which indicates that few timbres are slightly misplaced. In the analysis of the index behavior, no substantial differences were found between plane and solid networks, suggesting that two dimensional structures manage to locate timbres well enough; since there is a great saving in time, the bulk of the experiments was then conducted on plane nets sized 10 * 5 = 50 neurons.

Another topological issue came to validate our results: Grey showed how artificial timbres obtained through linear interpolation are acoustically perceived as "half way" between the two timbres from which they originate. Similarly, such an artificial timbre is mapped by the net approximately half way along the line between the two "parents" (fig. 3); this is not obvious if we recall that KNN are nonlinear projectors: if linear relations in the input space are preserved, this means that the numerical form into which input samples are coded is well representative of their differences. We refer to this finding as to the inner "coherence" of the neural space.

Figure 2: Typical evolution of the index of disorder (horizontal axis: cycles x100)

Figure 3: Neural space coherence
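The index of disorder D and the diameter rule used for the cluster distances can be illustrated with a short Python sketch. The function names `diameter` and `index_of_disorder` and the toy data are hypothetical, introduced here only for illustration; the sketch assumes that the distances are supplied as a list already ordered according to Grey's clustering sequence.

```python
def diameter(points):
    """Diameter of a group: the largest Euclidean distance between any two of its points."""
    return max(
        (
            sum((p - q) ** 2 for p, q in zip(a, b)) ** 0.5
            for i, a in enumerate(points)
            for b in points[i + 1:]
        ),
        default=0.0,  # a single-point group has no pair, so its diameter is taken as 0
    )


def index_of_disorder(d):
    """Sum over consecutive distances of |(d[k+1] - d[k]) - |d[k+1] - d[k]||.

    An ascending step contributes 0; a descending step contributes twice the drop.
    """
    return sum(abs((b - a) - abs(b - a)) for a, b in zip(d, d[1:]))


# Toy example (made-up numbers): the second and third distances are inverted.
print(index_of_disorder([1.0, 3.0, 2.0]))
```

With distances in ascending order every term vanishes and the index is zero; each inversion adds a term that grows with the size of the drop, which matches the "proportional to the degree of inversion" remark in the text.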

Conclusion

KNN are an interesting tool for the classification of a data set belonging to a space with large dimensionality, a task where classical tools for the extraction of high-variance features fail. We obtained maps which, even though different from Grey's timbre space, were not so far away. This suggests that the model underlying the artificial networks' principles of self organization resembles, in a way, the features of biological neurons' organization. We could ask ourselves, however, which are the legitimate expectations in such experiments. From the wide spatial separation between tones and the inner coherence of the neural space we can infer that the net is capable of efficiently handling a multidimensional feature like timbre; it would be unlikely, however, to have the same results as those obtained from a group of trained listeners. After all, Grey's model was developed in a peculiar environment, and need not be assumed as an absolute target. It would be interesting, among other things, to repeat the tests that led to it with a group of untrained people, or with sound samples of a better quality. In fact, it should be noticed that Grey's synthetic tones are of a low sound quality; future developments will surely profit from a higher quality sampling of the test timbres, and from an adequate signal pre-processing.

With regard to the data reduction techniques, deeper studies are under completion at Padova University; the best results have been obtained using pre-processing based upon Grey's observations, while Charbonneau's methods gave worse final configurations. A development of this work is the substitution, at the initial stage in the process of timbre recognition, of the heterodyne analysis with a simulator of the human ear; in this way the operations made on input signals by biological organs and neurons are entirely reproduced by an artificial system.

References

[1] Grey J.M., An exploration of musical timbre, Rep. STAN-M-2, Stanford University, 1975.

[2] Grey J.M., Multidimensional perceptual scaling of musical timbres, J. Acoust. Soc. Am., 61(5): 1270-1277, 1977.

[3] Gordon J.W., Grey J., Perception of spectral Modifications on Orchestral Instrument Tones, Comp. Music J., 2(1): 24-31, 1978.

[4] Feiten B., Frank R., Ungvary T., Organizations of sounds with neural nets, Proc. ICMC 91, p. 441-444, 1991.

[5] Kohonen T., Self-organization and associative memory, Springer-Verlag, Berlin, 1984.

[6] Kohonen T., The Self-Organizing Map, Proc. of the IEEE, 78(9): 1464-1480, 1990.

[7] The Self-Organizing Map Program Package, Helsinki University of Technology, 1992.

[8] De Poli G., Tonella P., Self Organizing Neural Networks and Grey's Timbre Space, Proc. ICMC 93, 1993.

NEURAL NETWORKS AND STYLE ANALYSIS:
A Neural Network that Recognizes Bach Chorale Style

Margaret Johnson

Department of Computer Science


Stanford University
Stanford, CA 94305-3068
johnson@cs.stanford.edu

I. Neural Networks in Music

A neural network is an analysis tool that is very loosely modeled on the structure of the human brain. It is composed of elements that imitate the most elementary functions of a biological neuron. These elements are organized in a way that generally resembles the anatomy of the brain. Despite this superficial resemblance, neural nets exhibit a surprising number of the brain's characteristics, although much smaller in scope. For example, they seem capable of learning specific patterns after training. They also appear to generalize from previous examples to new ones, and they can often abstract the essentials from inputs containing irrelevant data.

In recent years, neural networks have been applied to music in various ways. For example, networks have been trained to "compose" music in a particular style (Kohonen's work [1], Mozer's CONCERT system [2], or Todd's algorithmic composition network [3]). Networks have also been used to classify rhythms (David & Sandon's Tempnet [4]); for pitch perception (Sano & Jenkins [5], or Bharucha & Todd [6]); for tonal analysis (Scarborough, Miller & Jones [7], Laden & Keefe [8], or Bharucha [9]); to determine optimal fingerings for string instruments (Duff [10], or Sayegh [11]); for understanding musical perception (Bharucha [12]); even for the study of jazz and improvisation (Baggi's Neurswing [13] and Alpaydin [14]).

II. Project Description

The present project represents an exploration of the uses of neural networks in style analysis. Style analysis is defined as the identification of characteristic features in the music of a composer by comprehensive

analysis of harmony, rhythm, melody, sound and form. We assess the potential for using a neural network as a general tool in identifying key features of a composer's style.

We construct a neural network, and train it on patterns derived from the analysis of a set of Bach chorales, and a set of chorales taken from the 1940 Hymnal [15]. Another term for a chorale is a hymn, which is a religious song of praise characterized by a chordal style in four parts, suitable for congregational singing. The 1940 Hymnal is a primary collection of hymns used in Protestant Episcopal churches. Included are hymns from medieval times to the present, including many works by Bach.

Once the neural net is trained on the two sets of patterns, we test it to see if it can recognize whether or not a chorale is written by Bach, when presented with any chorale (in a major key) from the 1940 Hymnal. In addition, we analyze the resulting values in the network to see what happened during the training. If we can understand what the neural net learned during training, we may be able to learn something about Bach's compositional style in composing chorales.

Details of Implementation

The neural network for this project is a standard back-propagation model with 23 inputs, 3 neurons in the hidden layer, and 1 output. The output is a real number in the 0.0 to 1.0 range. If the output is close to 1.0, the input pattern represents a chorale similar in style to Bach. Outputs closer to 0.0 represent chorales of other composers and styles.

The inputs are a series of descriptive categories. This information is encoded and organized, or "pre-processed", to make training easier. The encoding is sometimes binary (1 if true and 0 if false), and other times a real number. Note that binary values are 0.99 and 0.01 rather than 1 or 0 because the use of 0 in the neural network calculations complicates the training process. The categories are given below:

1) Meter: 0.99 if 4/4, 0.5 if 3/4 and 0.01 if anything else.
2) Key: The key is encoded as a real number based on the following scale. These are the only keys found in the analyzed Bach chorales:
Key:    G    A    D    F    Bb   C    Eb   E    Ab
Value:  .9   .8   .7   .6   .5   .4   .3   .2   .1
The more common keys are assigned higher values. If a chorale from the 1940 Hymnal is in a key other than those given above, it is encoded with 0.001.
3) Chord Frequency: Based on a complete harmonic analysis, we determine the number of times each chord is used. The chords
110
we count are: I, ii, iii, IV, V, vi, any vii°, any secondary dominant, and inversion chords.
4) Non-harmonic tones: We count how many non-harmonic tones are used in the entire chorale. These are tones that are not a part of the chord currently playing. The actual value used in the input pattern is the frequency. For example, if there are 112 notes in the entire piece and 16 of these are non-harmonic tones, the value of this input is 0.143.
5) Unaccented passing tones: In addition to counting the total number of non-harmonic tones, we also count the number of unaccented passing tones.
6) Cadences: We count the total number of cadences, and the frequency of cadences on I, ii, IV, V, and vi. So, if there are 8 cadences in the piece and 3 are cadences on I, the value for the I category is 0.375.
7) Modulations: We look for modulations to the keys of ii, IV, V, and vi. If we find them, we assign a weight based on the duration of the modulation. If a modulation extends over more than two measures, we are firmly based in the new key, so this modulation would be assigned 0.9. If a modulation extends over 1/2 measure, this is a brief excursion (if really a modulation at all), so we assign 0.1. In between, we assign specific values for different durations.
8) Length: The longest chorale we analyzed was 88 measures, so we compare each chorale to this maximum. If a chorale is 44 measures long, it has a length value of 0.5.

The set of input patterns consists of the first 85 Bach chorales from the 371 Four-Part Chorales [16], and 85 chorales from the 1940 Hymnal, starting with number 275. We analyze only those in major keys. This set of 170 patterns is used to train the network. We use the standard back-propagation training algorithm [17].

The set of input patterns is presented to the network 4000 times, at which point the error level is 0.008. This indicates that the network has learned all of the patterns with only a slight chance of error. We go through several iterations of this process changing various parameters of the network. The primary parameter that we study is the number of hidden neurons. In neural networks, it is essential to determine the appropriate number of hidden neurons. If we have too few neurons, the network will not train at all. In the present project, we were unable to train a network with just one hidden neuron. If we have barely enough neurons, the network may train, but it might not be robust in the face of irrelevant data, or it won't
recognize patterns that it has not seen before. In this project, this was the case with a network of two hidden neurons.

Testing

In order to determine that the network really has learned something more than just the specific patterns in the training set, we run additional tests using chorales not included in the training set. We encode 10 chorales from Bach and 30 from the 1940 Hymnal (including 5 by Bach). The error level for this test set is 0.005, indicating that the patterns are recognizable. The output of the network for these 40 patterns is correct in all cases (close to 1.0 for Bach and close to 0.0 for chorales not composed by Bach).

III. Analysis of Trained Network

The weights in a neural network are modified during training until we end up with values that allow the network to recognize patterns and classify them correctly. These values may represent what the network actually learned during the training. Thus, an analysis of the weight values in the trained network may provide clues to the "essence" of Bach's chorale style.

A diagram of the structure of the network appears below:

[Diagram: network with 23 inputs, weight vectors W1 and W2, three hidden neurons, and one output.]

The values in this network consist of the following:
1) The 170 input patterns (85 Bach and 85 from the 1940 Hymnal) each with 23 inputs.
2) The weights in the trained network connecting each input to each of the three hidden neurons. In the diagram this is weight vector W1, of which we show only the weights from each input going into the first hidden neuron (W1 1,1 to W1 23,1). The network is fully connected, so there are 23 connections going into each hidden neuron (i.e., W1 1,2 to W1 23,2, W1 1,3 to W1 23,3). We do not show all these other connections in the diagram.
3) The values in the hidden neurons. The hidden neurons each have a specific value when a particular input pattern is presented to the network. This value is the sum of the inputs times the corresponding weights coming into the neuron. A sigmoid function (this function is commonly used in the back-propagation training algorithm) is applied to this sum to provide the actual value of the hidden
neuron. This value is between 0 and 1.
4) The weights connecting each hidden neuron to the output. This is W2 in the diagram; it has three components (W2 1,1, W2 2,1, W2 3,1).

To do the analysis, we do not look at the specific weights in W1 and W2, but rather at what is termed the weight-state vectors [18]. This is the value of the product of the input times its corresponding weight when an input pattern is presented to the network. Just looking at the weights themselves does not give as much information as looking at different patterns of the weight-state vectors when input patterns are applied.

Each input pattern has 3 weight-state vectors with 23 components, one set for each hidden neuron. For example, the weight-state vector for the first hidden neuron (shown in the diagram) would consist of the following 23 components: Input 1 * W1 1,1; Input 2 * W1 2,1; Input 3 * W1 3,1; etc.

The basic idea of the analysis is to start with a set of chorales which are most typical of Bach (have an output of near 1.0). Then, we work backwards through the network analyzing the values of the weight-state vectors of W2, the values of the hidden neurons, and finally, the weight-state vectors of W1, where we hopefully learn which specific inputs are causing an output near 1. These inputs represent the distinguishing characteristics of Bach chorales as learned by the network.

The analysis proceeds as follows:
1) Determine which chorales have the highest output, i.e., the output closest to 1.
2) Look at the W2 weight-state vectors, to determine which hidden unit(s) are critical to obtaining the output.
3) Look at the specific hidden unit value(s), as determined in step 2, for each of the chorales obtained in step 1.
4) Look at the weight-state vectors for the specified hidden unit(s) and the specified chorales to determine which inputs excite or inhibit the hidden unit(s).

The following chorales from the 371 Four-Part Chorales have outputs closest to 1.0: 5, 20, 24, 64, 84, 102, 124, 135, and 179. An analysis of the weight-state vector W2 indicates that the first hidden unit is critical; specifically, this hidden unit must be a large value (greater than 0.9) in order for the output to be near 1. This being the case, we look at the W1 weight-state vectors for large positive values. The inputs associated with these values are the ones that excite the hidden unit to larger values.
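
The quantities used in this analysis can be sketched in a few lines of Python. This is a minimal illustration, not the program used in the project: the function names are hypothetical, bias terms are left out because the text does not mention them, and applying the sigmoid to the output neuron as well is an assumption based on the stated 0.0 to 1.0 output range.

```python
import math

def sigmoid(x):
    """Logistic function; maps any real number into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def weight_state_vector(values, weights):
    """Component-wise products value_i * weight_i for one receiving neuron."""
    return [v * w for v, w in zip(values, weights)]

def analyze_pattern(pattern, w1, w2):
    """Forward pass that keeps the weight-state vectors for inspection.

    pattern : encoded inputs (23 values in the paper)
    w1      : one weight list per hidden neuron, each as long as pattern
    w2      : one weight per hidden neuron
    Assumption: no bias terms, sigmoid on hidden and output neurons.
    """
    ws1 = [weight_state_vector(pattern, w) for w in w1]
    hidden = [sigmoid(sum(ws)) for ws in ws1]
    ws2 = weight_state_vector(hidden, w2)
    output = sigmoid(sum(ws2))
    return ws1, hidden, ws2, output

def strongest_inputs(ws, top=3):
    """Indices of the largest positive weight-state components,
    i.e. the inputs that most excite the receiving neuron."""
    ranked = sorted(range(len(ws)), key=lambda i: ws[i], reverse=True)
    return [i for i in ranked[:top] if ws[i] > 0]
```

Calling `analyze_pattern` on each high-output chorale and then `strongest_inputs` on the weight-state vector of the critical hidden unit mirrors steps 2 to 4 of the procedure above.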

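
The encoding rules that the text states explicitly under Details of Implementation (meter, key, frequency ratios, and length) can also be sketched. The function names are hypothetical, rounding frequencies to three decimals is an assumption inferred from the worked values 0.143 and 0.375, and the modulation weights are omitted because the text gives only their two end points.

```python
# Key values as listed in the paper; more common keys get higher values.
KEY_VALUES = {"G": 0.9, "A": 0.8, "D": 0.7, "F": 0.6, "Bb": 0.5,
              "C": 0.4, "Eb": 0.3, "E": 0.2, "Ab": 0.1}
OTHER_KEY = 0.001      # any key not found in the analyzed Bach chorales
LONGEST_CHORALE = 88   # measures, the longest chorale analyzed

def encode_meter(meter):
    """0.99 for 4/4, 0.5 for 3/4, 0.01 for anything else."""
    if meter == "4/4":
        return 0.99
    if meter == "3/4":
        return 0.5
    return 0.01

def encode_key(key):
    """Look up the key value; unknown keys get 0.001."""
    return KEY_VALUES.get(key, OTHER_KEY)

def encode_frequency(count, total):
    """Share of `count` in `total` (e.g. non-harmonic tones among all notes).
    Rounding to 3 decimals and returning 0.0 for an empty total are
    assumptions of this sketch."""
    if total == 0:
        return 0.0
    return round(count / total, 3)

def encode_length(measures):
    """Length relative to the longest chorale analyzed."""
    return measures / LONGEST_CHORALE
```

With these helpers, the examples in the text come out as stated: 16 non-harmonic tones out of 112 notes, 3 cadences on I out of 8, and a 44-measure chorale.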
We find that the inputs having the strongest effect on the first hidden unit are: Meter, Key, Non-Harmonic Tones, Unaccented Passing Tones, Cadences on I, Frequency of Secondary Dominants. Those inputs having a lesser effect, but still a positive one are: Cadences on ii, Cadences on vi, Modulations to ii, Modulations to vi, Frequency of ii chords, Frequency of vi chords. These inputs represent the characteristics that distinguish a Bach chorale from one by another composer. Further analysis of the specific values of the inputs for the set of chorales with the highest output provides more information:

• 89% are in 4/4
• 89% are in the keys of D, A or G
• 79% have non-harmonic tone frequency of 12%
• 79% have unaccented passing tone frequency of 9%
• 89% cadence half the time on I
• 89% have secondary dominant frequency of 12%
• 79% cadence 6% of the time on ii
• 79% cadence 12% of the time on vi
• 79% have ii modulation strength of 0.23
• 79% have vi modulation strength of 0.33
• 79% have ii chord frequency of 8%
• 79% have vi chord frequency of 10%

What happens during training is each time a chorale with these characteristics appears as an input pattern (which is quite often), the weights connecting the inputs specified above to the first hidden unit are strengthened. Later during testing, when a chorale with these characteristics appears, the first hidden unit is excited to the point that it produces an output near 1. When a chorale without these characteristics is presented, the effect of the first hidden unit is negated by the effects of the second and third hidden units, causing the output to be near 0.

We perform one more test to determine if the characteristics listed above are typical of Bach chorales. To do this, we analyze the entire set of 85 Bach chorales. This is done with a computer program that processes the input pattern file. First, the average score for each of the inputs is determined. Then, we find what percent of the chorales is in close proximity to the average. The results are close to those given above:

• 88% are in 4/4
• 70% are in the keys of D, A or G
• 72% have non-harmonic tone frequency of 11%
• 67% have unaccented passing tone frequency of 8%
• 83% cadence half the time on I
• 85% have secondary dominant frequency of 11%
• 83% cadence 5% of the time on ii
• 83% cadence 10% of the time on vi
• 71% have ii modulation strength of 0.20
• 84% have vi modulation strength of 0.29
• 68% have ii chord frequency of 7%
• 71% have vi chord frequency of 10%

Based on these results, we conclude that the network did indeed learn the characteristics which distinguish Bach chorales from those written by other composers.

IV. Conclusion

It does not come as a surprise that the distinguishing characteristics of Bach chorales are those discovered in this project. Anyone who hears both a Bach chorale, and perhaps one by Thomas Tallis, knows there are more non-harmonic tones, more chord variety, and more interesting modulations.

The value of this project is in determining that a neural network is capable of recognizing distinctive features of a composer's style. The next step is to make the process more dependent on the network and less on the "pre-processor" who analyzed the chorales and created the patterns. Now that we know neural networks can provide insight into a composer's characteristic style, we can experiment with less precise inputs to further test a neural net's usefulness as a style analytic tool.

References

1. T. Kohonen, "A Self-Learning Musical Grammar or Associative Memory of the Second Kind," Proc. Int'l Joint Conf. Neural Networks, Vol. 1, 1989, p. 1.
2. M. Mozer, "Simulating Melodies in Style of Bach," Computing in Musicology, 1991, p. 54.
3. P. Todd, "A Connectionist Approach to Algorithmic Composition," Computer Music Jour., Vol. 13, No. 4, 1989, p. 27.
4. I.L. David and P. Sandon, "Temporally Sensitive Neural Networks," Proc. Int'l Joint Conf. on Neural Networks, Vol. 2, 1991, p. 2104.
5. H. Sano and B.K. Jenkins, "A Neural Net Model for Pitch Perception," Computer Music Jour., Vol. 13, No. 3, 1989, p. 41.
6. J. Bharucha and P. Todd, "Modeling the Perception of Tonal Structure with Neural Networks," Computer Music
Journal, Vol. 13, No. 4, 1989, p. 44.
7. D. Scarborough, B. Miller, and J. Jones, "Connectionist Models for Tonal Analysis," Computer Music Journal, Vol. 13, No. 3, 1989, p. 49.
8. B. Laden and D. Keefe, "The Representation of Pitch in a Neural Network Model of Chord Classification," Computer Music Jour., Vol. 13, No. 4, 1989, p. 12.
9. J. Bharucha, "MUSACT: A Connectionist Model of Musical Harmony," Proc. Cognitive Science Society, Hillsdale, NJ: Erlbaum Press, 1987.
10. M.O. Duff, "Backpropagation and Bach's Fifth Cello Suite (Sarabande)," Proc. Int'l Joint Conf. on Neural Networks, Vol. 1, 1989, p. 121.
11. S. Sayegh, "Fingering for String Instruments with Optimal Path Paradigm," Computer Music Jour., Vol. 13, No. 3, 1989, p. 76.
12. J. Bharucha, "Music Cognition and Perceptual Facilitation: A Connectionist Framework," Music Perc., Vol. 5, No. 1, 1987, p. 1.
13. D. Baggi, "Neurswing: An Intelligent Workbench for the Investigation of Swing in Jazz," Readings in Computer Generated Music, edited by Denis Baggi, Los Alamitos, CA: IEEE Computer Society Press, 1992.
14. R. Alpaydin, "Connectionist Approach to Improvisation Using Fingering Pattern Association," M.S. Thesis, Bogazici University, Istanbul, Turkey, 1992.
15. The Hymnal of the Protestant Episcopal Church, New York: Church Pension Fund, 1940.
16. J.S. Bach, Four-Part Chorales, Wiesbaden: Breitkopf & Hartel.
17. D.E. Rumelhart and J.L. McClelland, Parallel Distributed Processing, Exploration in the Microstructure of Cognition, 3 volumes, Cambridge: MIT Press, 1986.
18. R. Gorman and T.J. Sejnowski, "Analysis of Hidden Units in a Layered Network Trained to Classify Sonar Targets," Neural Networks, Vol. 1, 1988, p. 75.

MODELLING HARMONY-BASED
JAZZ IMPROVISATION:
AN ARTIFICIAL NEURAL NETWORK APPROACH
Petri Toiviainen

University of Jyväskylä
Department of Musicology
PL 35
SF-40351 Jyväskylä (Finland)
fax +35841 601 331
E-mail ptoiviai@tukki.jyu.fi

Introduction.

In cognitive science and research of artificial intelligence, there are two central paradigms: symbolic and analogical. In the symbolic approach, the cognitive models are based on entities, which are symbols both semantically (referring to external objects), and syntactically (being part of the rule-based manipulation of symbols). In the analogical approach, one tries to imitate the cognitive phenomena using some other system. Within the analogical paradigm, artificial neural networks (ANNs) have recently been successfully used for the modelling and simulating of cognitive phenomena. ANNs are non-linear dynamical systems consisting of a large number of massively interconnected simple processing elements, or neurons. ANNs are parallel (the neurons interact essentially simultaneously and independently) and distributed (their knowledge is represented by the connection strengths between the neurons; the cognitive entities are represented by activation patterns of the neurons). The most prominent feature of ANNs is their ability to learn by example, and, to a certain extent, generalise what they have learned.
Improvisation, the art of spontaneously creating music while playing, is the basic element of nearly all musical cultures of the world. In the western tradition, the art of improvisation reaches its highest level among jazz musicians. In jazz, one improvises on melodic, harmonic, as well as rhythmic level. The basis of improvisation differs considerably between different styles of jazz: a Dixieland musician bases his improvisation, more than the others, on the structure of the melody of the composition; a bebop musician leans rather on the harmonic structure; whereas a free jazz group, for example, may operate solely within the framework of some prearranged
musical structure.
As a cognitive process, the improvisation of jazz is extremely complicated. Irrespective of the jazz style in question, the improviser has to take into account constraints on various hierarchical levels. Every musician has, too, a personal way of approaching this problem.
The art of improvisation is learned mostly by example. Instead of memorising explicit rules, the student mimics the playing of other musicians. On the basis of this material, he then forms implicit rules for the style of improvisation concerned. This kind of learning procedure cannot be easily modelled with rule-based expert systems. On the other hand, artificial neural networks, or connectionist systems, are suitable for this purpose. Neural networks are systems consisting of massively interconnected simple processing elements, or neurons. Their most prominent feature is the ability to learn by example. Recently, neural networks have been successfully used for the modelling and simulating of cognitive phenomena of music (cf. e.g. [1][2][3]). In this paper, a connectionist model of bebop-style melodic improvisation, based on harmony, is described; some results, achieved by simulating the model, are presented. The basis of the model is the target note technique, one of the cornerstones of bebop-style jazz improvisation.

About harmony-based improvisation.

There are several factors influencing the structure of an improvised jazz melody: e.g., the harmonic structure, the melody of the theme, and the style of the accompaniment. Among the essential factors are, also, the instrument used - its fingering and range of pitches; the musical background of the soloist; and the musical interaction with the other players of the group during the playing.
The target note technique [4][5], a common way of explaining the microstructure of an improvised jazz melody, can be described as follows: the notes of a tetrachord and its upper structure (9, 11, 13) are regarded as principal tones; when approaching a chord, one of its principal tones is chosen as a target note; the target note is reached through a learned melodic pattern. For the simulation model, the target note technique can be described as a dynamic process with feedback, as presented in fig. 1. At the beginning of the process, some starting notes - here two - are fixed, as well as the present and following chords. The starting notes, together with the present chord, determine the possible melodic patterns, whereas the following chord determines the possible target notes. From the interaction of those constraints a new melodic pattern emerges. The target notes of the pattern are then
used as the starting notes of the next pattern.

Figure 1. The target note technique as a dynamic process with feedback. Cf. text.

The architecture of the model.

An essential step in constructing a model for the simulation of the production of music, is to decide how time is represented. When using neural networks, a natural choice is to use a representation analogical to "piano roll" notation, used in the old player pianos. In "piano roll" notation, time has been translated into an ordered spatial dimension. Accordingly, in neural network, the melody can be represented in a two-dimensional grid of neurons, where one dimension corresponds to time, and the other to pitch. In the improvisation model in question, the network processes one melodic pattern at a time, using a modified "piano roll" representation.
The rhythmic structure of both the melody and harmony of bebop jazz is very regular - if the phrasing is ignored. Firstly, an overwhelming majority of all jazz composed and performed is in 4/4 time [6]. Secondly, it can be said that all jazz from 1900 to the present day has employed the following ratio of time values: a quarter-note pulse, representing the rhythmic unit of a jazz composition; a half-note pulse, representing the harmonic unit; and an eighth-note pulse, representing the melodic unit [7]. Consequently, it is justified to make the following simplifications concerning the architecture of the network: (1) only 4/4 time is possible; (2) the network processes one half-measure-long melodic pattern at a time - faster chord progressions are not possible; and (3) the smallest possible time value is an eighth-note.
For the representation of pitch in neural networks, there are several possibilities [8]. In this model, invariant pitch class representation is used. The pitches are, thus, represented relative to the root of the present chord.
In the network, the melody is represented as follows: for every eighth-note, 12 neurons represent the pitch classes of the chromatic scale, relative to the root of the present chord. Moreover, one
neuron represents the rest, and another the ligature. In order to simplify the model, the following restrictions are made: (1) the contour of the melody is ignored; (2) the network produces only monophonous melodies; (3) phrasing, dynamics, and octave are ignored. In the network, one eighth note corresponds, thus, to a group of 14 neurons. Because of syncopation, characteristic of jazz music, the target note is often played on the eighth note preceding the first beat. Therefore it is reasonable to use six groups of neurons: the first corresponding to the eighth note preceding the half of the measure in question; the following four corresponding to the half of the measure in question; and the last corresponding to the eighth note following the half of the measure in question. The network consists, thus, of 6 x 14 = 84 neurons altogether. The neurons of the network are fully interconnected. There are two types of connections: (1) neurons belonging to the same group (column) have fixed inhibitory interconnections: this guarantees that, in the relaxed state, only one neuron is active in every group. (2) neurons belonging to different groups have excitatory interconnections, which are modified during the learning phase, according to the Hebbian learning rule [9]. The network is, actually, a modified auto-associator.
The consideration of the type of the present chord is carried out by feeding external activation into the auto-associator. For a given chord-type, the most often used notes receive the strongest external activation. The consideration of the possible target notes, within the melodic pattern being produced, is carried out by feeding external activation into the last two groups of neurons of the auto-associator: the neurons representing the most often used target notes receive the strongest external activation.
During the learning phase, one melodic pattern at a time is presented to the network: the neurons representing the notes of the melodic pattern are activated, together with the neurons representing the roots and the types of the present and the following chord. The learning occurs in two distinct groups of neurons: (1) the melodic patterns are learned through strengthening the connections between the active neurons of the auto-associator, and the neuron representing the type of present chord, according to the Hebbian learning rule; (2) the target notes are learned through strengthening the connections between the neurons representing the type of the following chord, and the neurons representing the invariant target notes. Inside these two groups of neurons, differing learning rates can be used, resulting in melodies of different styles. In fig. 2, the architecture of the whole network, with an example of the representation and
learning of one melodic pattern, is presented.
During the testing phase, the activations of the neurons representing the root and the type of the present and the following chord are set to maximum. The starting note of the melody is chosen, and the corresponding neuron is activated. During the relaxation process, the activations of the neurons are updated asynchronously: the activation of one - randomly chosen - neuron is updated at a time. The updating procedure consists of: (1) calculating the incoming activation, according to the formula

$$\mathrm{input}_i = \sum_j w_{ij}\, a_j$$

where $w_{ij}$ is the connection strength between neurons i and j, and $a_j$ is the activation value of neuron j; and (2) calculating the new activation according to the formula

$$a(\mathrm{input}) = \begin{cases} 0, & \text{if } \mathrm{input} \le 0, \\ \mathrm{input}, & \text{if } 0 < \mathrm{input} \le 1, \\ 1, & \text{if } \mathrm{input} > 1 \end{cases}$$

The updating procedure continues until the network has reached a stable state. In every group of neurons, there is now one active neuron. These form the melodic pattern the network has produced. After feeding back the last two notes of the melodic pattern as the starting notes of the next pattern, and updating the chords, the relaxation process is, again, started.

Observations about the experiments.

The network was taught solos played by trumpet player Clifford Brown. The material consisted of excerpts, typically 32 measures long, from six solos, taken from [10]. After the learning phase, the network was tested with Rhythm Changes chord progression, one of the most widely used harmonic structures in bebop. When teaching the network, various learning rates were used for the melodic patterns and the target notes. It was found that the ratio of the learning rates had a considerable influence on the style of the melodies produced. When using proper values for the learning rates and inhibition, the network was found to be capable of producing stylistically fairly consistent melodies - on the micro level. On the other hand, it cannot deal with larger structures, like melodic phrases. In this respect, the melodies produced by the network resemble those of a beginning improviser. An interesting aspect is that the model has a sort of creativity: it can produce new melodic patterns, based on the patterns it has learned. In fig. 3, examples of the melodies, produced by the network, are presented.

Conclusions.

Above, an artificial neural network model describing the learning of melodic jazz
improvisation is presented. The starting-point of the model is learning by means of examples: the student is presented melodic patterns, and after having learned them, he is able to use them in a given context, i.e., harmonic progression, and to generalise what he has learned. Instead of presenting the student explicit rules concerning the style of improvisation in question, he himself builds implicit rules, based on the material learned, and applies these in new situations.

[Figure 2 diagram not reproduced. Recoverable labels: chords Dm7 and G13#11; "Auto-associator"; "Target notes relative to the root of the following chord"; "type of present chord"; "type of following chord".]

Figure 2. An example of the representation and learning of one melodic pattern in the network. The black arrows represent the connections, the strengths of which are increased.

The simulations made showed that the model is able to apply the material it has learned to a new context. If it cannot find a proper melodic pattern for a given chord progression, it can create a new pattern, based on the patterns it has learned. This kind of adaptability - or creativity - is a direct consequence of the connectionist paradigm. The structure of each melodic pattern is not presented symbolically, but rather resides in a distributed form in the network - in the connection strengths between the neurons. To construct a rule-based expert system behaving in the same manner would be laborious. The epistemological relevance of such a system, based on numerous
explicit rules, would also be small.
The main shortcoming of the system described above is the inability to operate on higher hierarchical melodic levels, e.g. phrases. This might be overcome by, for example, using a higher-level network. This network would learn and produce only the target notes of a longer harmonic progression; e.g., four measures long. Feeding the target notes produced by this network into the lower-level network, and taking into account the respective position in the four-measure structure, would probably yield better results.

References.

[1] J. Bharucha, P. Todd: "Modeling the Perception of Tonal Structure with Neural Nets", Computer Music Journal, Vol. 13, N. 4, pp. 44-53, 1989.
[2] P. Todd: "A Connectionist Approach to Algorithmic Composition", Computer Music Journal, Vol. 13, N. 4, pp. 27-43, 1989.
[3] M. Leman: "Tonal context by pattern integration over time", Informatica Musicale, pp. 21-39, Genova: AIMI, DIST, 1991.
[4] J. Mehegan: "Jazz Improvisation I: Tonal and Rhythmic Principles", pp. 127-131, Watson-Guptill Publ., 1959.
[5] S. Berg: "Jazz Improvisation: the Goal Note Method", Lou Fischer Publ., 1990.
[6] J. Mehegan: "Jazz Improvisation II: Jazz Rhythm and the Improvised Line", pp. 17-22, Watson-Guptill Publ., 1962.
[7] J. Mehegan: "Jazz Improvisation II: Jazz Rhythm and the Improvised Line", p. 22, Watson-Guptill Publ., 1962.
[8] J. Bharucha: "Pitch, Harmony, and Neural Nets: A Psychological Perspective", in "Connectionism and Music", P. Todd & G. Loy ed., MIT Press, 1991.
[9] J. McClelland, D. Rumelhart: "Explorations in Parallel Distributed Processing", pp. 84-86, MIT Press, 1988.
[10] D. Baker: "The Jazz Style of Clifford Brown. A Musical and Historical Perspective", Studio P/R, 1982.

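
As a rough illustration of the representation and learning described under "The architecture of the model", the following Python sketch encodes one six-slot melodic pattern relative to a chord root and applies a Hebbian update between groups. It is a simplified assumption-laden sketch, not the author's program: the names are hypothetical, events are given as MIDI pitch numbers, and the chord-type and target-note neurons are left out so that only the 84-neuron auto-associator is shown.

```python
GROUPS = 6            # eighth-note slots in one melodic pattern
UNITS = 14            # 12 relative pitch classes + rest + ligature
REST, LIGATURE = 12, 13
N = GROUPS * UNITS    # 84 neurons, as in the text

def unit_index(group, unit):
    """Flat index of a neuron in the group-by-unit grid."""
    return group * UNITS + unit

def encode_pattern(events, root_pc):
    """Return the active neuron of each group.

    events  : six items, each a MIDI pitch (int), "rest" or "tie"
    root_pc : pitch class (0-11) of the root of the present chord
    Octave is discarded, as in the model.
    """
    if len(events) != GROUPS:
        raise ValueError("a pattern has exactly six eighth-note slots")
    active = []
    for group, event in enumerate(events):
        if event == "rest":
            unit = REST
        elif event == "tie":
            unit = LIGATURE
        else:
            unit = (event - root_pc) % 12
        active.append(unit_index(group, unit))
    return active

def hebbian_update(weights, active, rate=0.1):
    """Strengthen connections between co-active neurons of different groups.
    Connections inside a group are the fixed inhibitory ones and are not
    touched here. The rate value is an arbitrary assumption."""
    for i in active:
        for j in active:
            if i // UNITS != j // UNITS:
                weights[i][j] += rate
    return weights
```

For example, `encode_pattern([62, 65, 69, 72, "tie", "rest"], root_pc=2)` describes a pattern over a chord rooted on D, and passing its result to `hebbian_update` with an 84 x 84 zero matrix stores that pattern in the connection strengths.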
[Figure 3: music notation for melodies a) and b), each over the Rhythm Changes chord progression; notation not reproduced.]

Figure 3. Examples of melodies produced by the network on the Rhythm Changes chord progression. The network was taught excerpts from Clifford Brown's solos on a) "All the things you are" and "Gertrude's bounce", b) "Confirmation" and "Donna Lee". The rectangles indicate melodic patterns which occur in the learning material. Letters "A", "G", "C" and "D" refer to the names of the compositions.

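
The relaxation process that produces patterns such as those in Figure 3 might be sketched as follows. This is only an illustration under stated assumptions: the function names, the shuffled visiting order, the `max_sweeps` bound, and the toy weights are not from the paper; the activation function follows the piecewise definition given in the text.

```python
import random

def activation_fn(net):
    """Activation from the text: 0 if net <= 0, net if 0 < net <= 1, 1 if net > 1."""
    if net <= 0:
        return 0.0
    return net if net <= 1 else 1.0

def relax(weights, activation, clamped=(), max_sweeps=100, seed=0):
    """Asynchronously update neurons until no activation changes.

    weights[i][j] : connection strength between neurons i and j
    activation    : list of starting activations, modified in place
    clamped       : indices held fixed (e.g. chord and starting-note neurons)
    Assumptions of this sketch: neurons are visited one at a time in a
    shuffled order, and max_sweeps bounds the loop if no stable state is found.
    """
    rng = random.Random(seed)
    fixed = set(clamped)
    free = [i for i in range(len(activation)) if i not in fixed]
    for _ in range(max_sweeps):
        changed = False
        rng.shuffle(free)
        for i in free:
            net = sum(w * a for w, a in zip(weights[i], activation))
            new = activation_fn(net)
            if abs(new - activation[i]) > 1e-9:
                activation[i] = new
                changed = True
        if not changed:
            break
    return activation

# Toy network with hypothetical weights: two groups of two neurons each.
# Neurons in the same group inhibit each other; neuron 0 excites neuron 2.
toy_weights = [
    [0.0, -1.0, 1.0, 0.0],
    [-1.0, 0.0, 0.0, 0.2],
    [1.0, 0.0, 0.0, -1.0],
    [0.0, 0.2, -1.0, 0.0],
]
result = relax(toy_weights, [1.0, 0.0, 0.0, 0.0], clamped=[0])
```

In the toy case the clamped neuron 0 drives neuron 2 to full activation while the inhibitory links keep neurons 1 and 3 silent, so each group ends with one active neuron. In the full model, the last two notes of the relaxed pattern would then be clamped as the starting notes of the next relaxation.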
Chapter 5

DIGITAL SIGNAL PROCESSING
A NEW ADDITIVE SYNTHESIS TECHNIQUE
BASED ON THE INVERSE
FOURIER TRANSFORM
M. Barutti, G. Bertini

Istituto di Elaborazione della Informazione-CNR
Via S. Maria, 46, I-50126 Pisa (Italy)
Fax +39 050 554342 E-mail: bertini@iei.pi.cnr.it

Abstract

Additive synthesis can be considered the most powerful and general synthesis technique, but it has very high computational costs. Sound synthesis systems which make massive and exclusive use of additive synthesis are now available on the market; however, their cost is much higher than that of systems providing an equivalent degree of polyphony based on other synthesis techniques.
The method presented here makes it possible to generate complex musical signals using the Inverse Fourier Transform with a reduced number of harmonics, thanks to a mechanism by which further sine waves can be added. Simulation and testing of complex signal synthesis have been carried out with good results. An evaluation of a system having several thousand sine waves with good frequency and time resolution and an SNR higher than 80 dB leads to a computational cost of less than one tenth of that of systems using digital oscillators.

Introduction

Among the techniques for generating musical signals, additive synthesis is known to offer great flexibility in the sound design phase (choice of timbre, etc.) and to make the relationship between changes in the signal parameters and the resulting sound fairly intuitive. Its major drawback is the considerable amount of computation required to implement it. Thanks to the continuing development of electronic technology, some systems that make massive use of additive synthesis are beginning to appear on the market [1, 2]. The cost of this equipment nevertheless remains high, more than an order of magnitude above that of systems with the same degree of polyphony that use other synthesis techniques.
Using signal analysis and synthesis operations based on the Fourier transform over short time intervals, it is possible to generate segments of signal digitally by means of the inverse FFT (IFFT), evaluated from the amplitude and phase of
This procedure also has a high computational cost, given the high frequency resolution needed for good quality in the synthesis of musical signals. Methods based on the inverse FFT have recently been proposed in which simplifications are introduced that make implementation feasible [3].
A solution proposed by M. Barutti [4] (patent application no. PI93A000003) makes it possible to generate complex signals starting from IFFTs with a reduced number of harmonics, thanks to the possibility of adding further components of arbitrary frequency through a mechanism called the "phase-shift method" (metodo degli sfasamenti).
The principle on which the method is based is illustrated below; the actual implementation uses optimisations that make it computationally convenient.

Requirements of additive synthesis

A device implementing additive synthesis must, besides meeting all the requirements that other synthesis techniques also impose [5] (sampling rate, parameter update rate, signal-to-noise ratio, etc.), be able to generate a large number of sine waves simultaneously.
Studies on the sound of bowed string and wind instruments (see e.g. [6]) suggest that 30 sinusoids are sufficient to generate single sounds of excellent quality and timbral variety; considering the degree of polyphony currently offered by electronic instruments operating in real time with a good level of performance, a few thousand sine waves are needed.
As for the update rate of amplitude and frequency, starting from previous implementations [7] [8] we carried out further checks using boards with fixed- and floating-point DSPs and specific test procedures [9]; it turned out that a very good value for the frequency update interval lies between 2 and 3 ms, while for the amplitude a faster update is preferable, at the limit equal to the sampling rate.

Methods based on the Fourier transform

Additive synthesis is generally implemented with the table-lookup digital oscillator; the high computational cost is due to the fact that the instantaneous value of every sinusoid is computed for each generated sample, e.g. 44,100 times per second.
Another possible way to obtain a signal as the sum of a large number of sine waves is to use the IFFT (Inverse Fast Fourier Transform) algorithm [10]. The final signal is obtained by juxtaposing segments (frames) taken from the sequence of one (or more) IFFTs and computed at time intervals no longer than 3 ms.

Recall that if Fo is the frequency resolution of the FFT, the frequency of the j-th line and the number of lines (or harmonics) on which the IFFT is computed are linked by the following relations:

    Farm_j = j·Fo,   j = 0, ..., Narm
    Fo = Fc / Nifft
    Narm = Nifft / 2

where Narm is the number of lines of the IFFT, Nifft is the number of points of the IFFT and Fc is the sampling frequency.

The drawback of generating signals with an IFFT is that signals containing sine waves whose frequency differs from j·Fo cannot be generated. A further drawback of this technique is that juxtaposing the frames may cause abrupt discontinuities in amplitude; to prevent this, suitable interpolation mechanisms known from the literature are used (linear mixes, sine-squared windows, etc.) [11].
The computational cost of the methods that use the IFFT algorithm consists of the cost of computing the IFFTs and of the final mix of the resulting signals, plus a variable amount tied to the preparation of the starting parameters, i.e. the amplitude and phase of the harmonics as a function of the signal to be generated.

Phase-shift method

The mechanism that, in our method, lets us generate signals with components (approximating sinusoids) whose frequencies do not coincide with any of the IFFT harmonics is based on the following observation. Consider two sine waves of equal amplitude and frequency (A, Fa) but with different phase:

    a(t) = A·sin(2π·Fa·t + φ)
    b(t) = A·sin(2π·Fa·t + φ + α)

Let t vary in the interval I = (0, T), and in this interval sum two functions a'(t) and b'(t) obtained from the original ones as follows:

    a'(t) = a(t)·(1 − t/T),   b'(t) = b(t)·(t/T)

The result is a function c(t) (Fig. 1) which, for small values of α, approximates a sinusoidal signal d(t) of frequency F = Fa + α/(2πT), different from that of the original functions: over the time T the phase advances by an additional α.

Fig. 1. Example of phase shift.

In the synthesis algorithm, as will be seen below, the functions a(t) and b(t) will correspond to two harmonics of the same order from two different IFFTs: the results of the latter will be suitably summed so as to obtain signals with components of type c(t).

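The cross-fade at the heart of the phase-shift method can be checked numerically. The following is a minimal sketch in Python with NumPy, not the authors' implementation: it builds a(t), b(t), c(t) and d(t) exactly as defined above and measures the SNR between d(t) and d(t) − c(t), as Table 1 does. The function name and the values chosen for Fa, T and the sampling rate are assumptions made only for illustration (a frame containing many cycles of Fa is assumed). Under these assumptions the computed values should come out close to those listed in Table 1.

```python
import numpy as np

def crossfade_snr_db(alpha_deg, fa=5000.0, T=0.01, fs=441000.0, phi=0.0, A=1.0):
    """SNR (dB) between the target sinusoid d(t) and the error d(t) - c(t).

    fa, T, fs and phi are illustrative assumptions, not values from the paper.
    """
    alpha = np.deg2rad(alpha_deg)
    n = int(round(T * fs))
    t = np.arange(n) / fs                               # one frame, t in [0, T)
    a = A * np.sin(2 * np.pi * fa * t + phi)            # a(t)
    b = A * np.sin(2 * np.pi * fa * t + phi + alpha)    # b(t), shifted by alpha
    c = a * (1 - t / T) + b * (t / T)                   # c(t) = a'(t) + b'(t)
    f_target = fa + alpha / (2 * np.pi * T)             # F = Fa + alpha/(2*pi*T)
    d = A * np.sin(2 * np.pi * f_target * t + phi)      # d(t)
    return 10 * np.log10(np.sum(d ** 2) / np.sum((d - c) ** 2))

if __name__ == "__main__":
    # K = 2*pi/alpha, i.e. alpha = 360/K degrees, as in Table 1.
    for k in (2, 4, 8, 16, 32):
        print(k, 360.0 / k, round(crossfade_snr_db(360.0 / k), 1))
```

The sketch only covers the plain linear cross-fade; the additional devices that raise the SNR above 80 dB are described in [4] and are not reproduced here.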
Table 1 reports the SNR between the sinusoid d(t) and the signal d(t) − c(t), as a function of the phase shift α and of K, a parameter defined as 2π/α.

Table 1
K      α (degrees)   SNR (dB)
2      180           2.8
4      90            13.4
8      45            25.1
16     22.5          37.0
32     11.25         49.1
64     5.63          61.1
128    2.81          73.2

Only positive values of α are considered, because the SNR is invariant with respect to the sign of the phase. With a series of specific devices [4], an SNR > 80 dB can be obtained even with α = 45°, as shown in Table 2.

Table 2
K      α (degrees)   SNR (dB)
2      180           2.2
4      90            39.4
8      45            83.0
16     22.5          84.5
32     11.25         93.8
64     5.63          105.3
128    2.81          117.2

Exploiting the phase-shift mechanism, components of arbitrary frequency can be generated with resolutions of 0.2-0.3 Hz, using suitable IFFTs with a small number of harmonics and thus reducing the final computational cost.

Synthesis algorithm

Suppose we must generate a monophonic signal composed of M simultaneous sine waves. The input consists of two vectors of M elements; the j-th elements of these vectors are the frequency F and the amplitude A that the j-th wave must have in the frame under consideration.
The block structure of the main algorithm is the following:

    while true do
    begin
      for j:=1 to M do
      begin
        < INVILUPPO >
        < SFASAMENTO >
        < TRASFORMAZIONE >
        < ACCUMULO >
      end;
      < IFFT >
      < MIX FINALE >
    end;

We briefly describe how the individual blocks work. The operations of the inner loop are performed M times, i.e. once for every sine wave, and the whole is repeated for every frame to be created:

<INVILUPPO> (envelope). Fetches the frequency (F) and amplitude (A) values from a predefined memory area.
<SFASAMENTO> (phase shift). Given the frequency value (F) and the base frequency of the IFFT (Fo), it derives the number of the harmonic with the nearest frequency, j = 0.5 + F/Fo (integer part), and the phase shift needed to approximate the given wave, α = 2πT(F − jFo).
<TRASFORMAZIONE> (transformation). This block converts from polar coordinates to the corresponding complex numbers required by the IFFT.
<ACCUMULO> (accumulation). This block performs the accumulations into the vectors that form the input of the IFFTs.
<IFFT>. Two real IFFTs (equivalent to one complex IFFT) are used to obtain each monophonic signal: the first handles all the harmonics of type a(t), the second those of type b(t).
<MIX FINALE> (final mix). The vectors of real numbers produced by the IFFTs are mixed by means of suitable correction tables, so as to obtain the sum of all the waves of type c(t).

In a hypothetical real-time implementation, the last two blocks must be executed at intervals of exactly T seconds.

Simulations and evaluation of the computational cost

Fig. 2 shows examples of signals obtained with the phase-shift method, in the version that includes all the optimisations: the error introduced is shown full-wave and amplified 4000 times to make it visible. Three frames are considered, and the interval displayed corresponds to a time span of 4T.
The computational cost was evaluated by considering the main operations of multiplication, addition (or subtraction) and memory access, assuming equal memory read and write times. The remaining operations are neglected, because their number is small compared with that of the main operations.
The results for a stereo signal were compared with those of the table-lookup digital oscillator method and are reported in Table 3.

Results and conclusion

To verify the potential of the proposed method, synthesis tests were carried out (off line) with simulations of FM, filtering, white noise and sampling, and with signals having dynamic spectra. Particularly interesting results were obtained in imitating the release phase of sounds, in the reverberation of instruments such as the pipe organ and the saxophone, and in imitating the piano.
In conclusion, the phase-shift method makes it possible to use additive synthesis to generate signals made of sine waves of arbitrary frequency within a resolution of 0.03 Hz, with a signal-to-noise ratio above 80 dB, at a computational cost which, for a few thousand components, is less than one tenth of that required by the digital oscillator method.

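The <SFASAMENTO>, <TRASFORMAZIONE> and <ACCUMULO> blocks of the synthesis algorithm can be sketched as follows. This is only an illustration in Python under stated assumptions: the formulas for j and α are the ones given in the block descriptions, while the function names, the per-partial starting phase phi and the list-based spectra are hypothetical and do not come from the paper. The optimisations and correction tables of the real implementation are not shown.

```python
import cmath
import math

def sfasamento(F, Fo, T):
    """Nearest IFFT harmonic j and phase shift alpha for a partial of frequency F."""
    j = int(math.floor(0.5 + F / Fo))          # j = 0.5 + F/Fo, integer part
    alpha = 2.0 * math.pi * T * (F - j * Fo)   # alpha = 2*pi*T*(F - j*Fo)
    return j, alpha

def trasformazione(A, phi, alpha):
    """Polar to complex: bins for the a-type and b-type harmonics of one partial.

    phi (starting phase of the partial in this frame) is an assumed input.
    """
    return cmath.rect(A, phi), cmath.rect(A, phi + alpha)

def accumulo(partials, Fo, T, n_harm):
    """Accumulate all partials into the input vectors of the two IFFTs."""
    spec_a = [0j] * (n_harm + 1)   # harmonics of type a(t)
    spec_b = [0j] * (n_harm + 1)   # harmonics of type b(t)
    for F, A, phi in partials:
        j, alpha = sfasamento(F, Fo, T)
        a_bin, b_bin = trasformazione(A, phi, alpha)
        spec_a[j] += a_bin
        spec_b[j] += b_bin
    return spec_a, spec_b

if __name__ == "__main__":
    # Two partials near the 4th and 5th harmonics of an assumed Fo = 100 Hz.
    sa, sb = accumulo([(445.0, 1.0, 0.0), (455.0, 0.5, 0.0)], Fo=100.0, T=0.0025, n_harm=8)
    print(sa[4], sb[4], sa[5], sb[5])
```

Several partials that fall nearest to the same harmonic simply add up in the same bin, which is why the cost of the two IFFTs does not grow with M.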
[Fig. 2. Examples of signals obtained with the phase-shift method. Panels: amplitude and frequency constant, SNR = 88.5 dB; amplitude constant, frequency variable, SNR = 85.9 dB; amplitude and frequency variable, SNR = 84.9 dB; two components with variable amplitude and frequency, SNR = 84.9 dB.]

Table 3 reports the values for the two methods: the phase-shift method proves advantageous from 250 oscillators upward.
Table 3
         Digital oscillator         Phase-shift method
M        Mult.   Add.   Mem.        Mult.   Add.   Mem.
125      12      30     60          16      23     37
250      24      60     120         17      25     39
500      48      120    240         19      29     4"1
1000     96      240    480         22      37     50
2000     192     480    960         30      52     66
4000     384     960    1920        46      82     97
8000     768     1920   3840        78      142    158
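
The comparison in Table 3 can be made explicit with a few lines of arithmetic. The sketch below, in Python, simply re-reads the table: the row for M = 500 is left out because one of its entries is illegible in the source, and adding multiplications, additions and memory accesses with equal weight is an assumption made here for illustration, not a cost model stated by the authors.

```python
# Rows of Table 3: M -> ((mult, add, mem) digital oscillator,
#                        (mult, add, mem) phase-shift method).
# M = 500 is omitted: one of its entries is illegible in the source.
TABLE_3 = {
    125: ((12, 30, 60), (16, 23, 37)),
    250: ((24, 60, 120), (17, 25, 39)),
    1000: ((96, 240, 480), (22, 37, 50)),
    2000: ((192, 480, 960), (30, 52, 66)),
    4000: ((384, 960, 1920), (46, 82, 97)),
    8000: ((768, 1920, 3840), (78, 142, 158)),
}

def cost_ratio(m):
    """Total phase-shift cost divided by total oscillator cost (equal weights assumed)."""
    osc, shift = TABLE_3[m]
    return sum(shift) / sum(osc)

def break_even_multiplications():
    """Smallest listed M at which the phase-shift method needs fewer multiplications."""
    return min(m for m, (osc, shift) in TABLE_3.items() if shift[0] < osc[0])

if __name__ == "__main__":
    print("break-even M (multiplications):", break_even_multiplications())
    for m in sorted(TABLE_3):
        print(m, round(cost_ratio(m), 3))
```

Read this way, the break-even on multiplications falls at 250 oscillators, and the total cost ratio drops below one tenth from 2000 components upward, in line with the figures quoted in the text.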

This work was carried out with the support of the CNR Progetto Finalizzato "Sistemi Informatici e Calcolo Parallelo", Sottoprogetto 2.

References

[1] J. Snell: "Design of a Digital Oscillator that Will Generate up to 256 Low-Distortion Sine Waves in Real-Time", Computer Music Journal, Vol. 1, n. 2, pp. 4-25 (1977).
[2] "The FDSS Studio, System software for Apple Macintosh Computer", Computer Music Journal, Vol. 16, n. 3, p. 99 (1988).
[3] X. Rodet, Ph. Depalle: "A new additive synthesis method using inverse Fourier transform and spectral envelope", ICMC Proceedings (San Jose, CA, USA), pp. 410-411 (1992).
[4] M. Barutti: "Un Metodo Efficiente per la Sintesi di Segnali Acustico-Musicali Mediante Trasformata Inversa di Fourier", degree thesis, Corso di Scienze dell'Informazione, Univ. di Pisa (1992-'93).
[5] G. De Poli: "A Tutorial on Digital Sound Synthesis Techniques", Computer Music Journal, Vol. 7, n. 4, pp. 8-26 (1983).
[6] J.A. Moorer: "Signal Processing Aspects of Computer Music: A Survey", Proceedings of the IEEE, vol. 65, n. 8, pp. 1108-1132 (1977).
[7] G. Bertini, M. Chimenti, F. Denoth: "TAU2: Un Terminale Audio per Esperimenti di Computer Music", Alta Frequenza, Vol. XLVI, n. 12, pp. 600-609 (1977).
[8] Cor Jansen: "Sine Circuitu", ICMC Proceedings (Montreal, Canada), pp. 223-225 (1991).
[9] M. Barutti, G. Bertini: "Valutazioni sul costo computazionale di una tecnica di sintesi basata su FFT inversa", Nota Int. IEI (in press).
[10] E.O. Brigham: "The Fast Fourier Transform", Prentice-Hall (1974).
[12] H. Chamberlin: "Musical Applications of Microprocessors", Hayden (1980).

ANALYSIS OF THE EFFECT OF TOUCH ON THE
ATTACK TRANSIENT OF THE SOUNDS OF A
MECHANICAL-ACTION PIPE ORGAN
Laura Bazzanella, Giovanni B. Debiasi
C.S.C. - D.E.I., Università di Padova, via Gradenigo, 6A - 35131 Padova
tel. +39 49 8287500 - fax +39 49 8287699 - Email debiasi@paola.dei.unipd.it
Research carried out with the support of General Music S.p.A., S. Giovanni in Marignano (FO)
ABSTRACT
In order to obtain an accurate electronic imitation of a pipe organ, it is necessary to pay attention to the characteristics of pipe organ sounds. In particular, we consider here the problem of "touch". In pipe organs with mechanical action, organists notice a response of the instrument that depends on the way the key is struck, the so-called "touch", and in describing the phenomenon they stress above all the key velocity. To explain this intuitive observation, we considered a pipe organ with mechanical action (built by the organ builder Mascioni, in Padua). We considered the sounds of various stops of the organ, using for every note two different kinds of touch: slow and fast. These two categories were defined by the organists who played the instrument; in some cases a "mechanical finger" was used. We then studied the sounds with the STFT (Short-Time Fourier Transform) algorithm and with our own software for the analysis and representation of the spectral components of a sound, paying particular attention to the transient. The results show a real difference between the same note played with different "touch". The STFT revealed a remarkable sensitivity of the second spectral component: with fast "touch" the amplitude reaches the stationary value with oscillations, while with slow "touch" an exponential evolution is observed. Using proprietary software for the definition and representation of the timbre of a sound, we obtained the evolution of the timbre and of the complexity of the sounds considered, finding very different behaviours. Applying our method to different pipe organs will provide more information on the effect of "touch", with a view to the synthesis of more natural sounds with electronic instruments.
Introduction
The organ is an instrument in which sound is produced by pipes fed with air supplied by a bellows and controlled by means of keys. Its main components are, besides the pipes, the windchests, the bellows, the console, the keyboards and the action.

From the bellows, whose function in modern organs is performed by electric blowers, the air reaches the windchest (a box on top of which the pipes are mounted); it then passes to the pipes through special valves whose opening is controlled by the keyboard and by the stop controls. The connection between the keys and stop controls of the console and the windchest is provided by the action.
In mechanical-action organs the keys are connected to the valves (pallets) through a system of trackers or levers; the other types of action instead use an electromagnetic or electropneumatic connection between key and valve.
Most organists consider mechanical action better suited to influencing the attack of each note through different kinds of touch; the so-called suspended action is particularly appreciated, in which the keys are held in the rest position by the spring that keeps the pallet closed, from which they hang directly by means of a tracker.
Others are instead convinced that the attack transient is absolutely independent of the type of action and of touch. There are thus conflicting opinions (for just two of the various opposing views see, for example, [1][2]).
The question therefore still appears open, and it seemed useful to offer a new contribution towards its solution. To this end a series of investigations was undertaken on the influence of touch in fine organs of both modern construction and the Baroque period.
Presented here are the results of analyses carried out on the sounds of a fine modern mechanical-action organ, obtained with different kinds of touch so as to modify, where possible, the attack transient of its sounds.
Three different procedures were considered for investigating the recorded and subsequently sampled signal:
application of the STFT (Short-Time Fourier Transform);
application of an algorithm separating the harmonic and inharmonic parts of the sound;
representation of the timbre on a polar or Cartesian diagram.
2. Instrument and playing technique
The instrument used for the recordings is the Mascioni organ of the "Cesare Pollini" Conservatory in Padua, built in 1979 and installed in 1980; it is a three-manual instrument with mechanical action and 52 electrically controlled stops.

The materials used to build the pipes are the traditional ones, namely tin alloy for the flue pipes, copper for the reeds and brass for the trumpets.
Recordings were made for several stops of the organ, considering two kinds of touch for every note: a "slow" touch and a "fast" one.
These two categories were defined by organists who offered their collaboration in playing the instrument; to obtain an extremely low and constant key velocity (0.5 mm/s), a mechanical finger was used.
Each note was played individually, with a duration of at least 5 seconds.
As regards the choice of stops, the third manual was preferred, because its action is of the suspended type.
The stops analysed are: Principale 8', Bordone 8', Quintadena 16', Viola da Gamba 8'.
3. Acquisition
The signal was acquired with a directional (cardioid) Sennheiser MD421 microphone and a dedicated microphone preamplifier.
The organ signal was recorded on a DAT recorder, Denon model DTR 2000, with a dynamic range of 90 dB and a linear frequency response between 2 Hz and 22 kHz.
A B&K 2603 sound level meter, with its condenser microphone, was also used to measure the Intensity Level (dB) of the recorded sounds.
A digital oscilloscope was also used for a visual check of the background noise.
Given the symmetrical layout of the pipes of each stop (even notes all on the right side, odd notes all on the left), the microphone was placed in two distinct positions (2.5 m from the front pipes) depending on whether an even or an odd note was being recorded.
The recorded sound was then divided into attack transient and steady-state part.
4. Analyses performed
The analyses were carried out with the SPC90 program, developed at General Music S.p.A., supplemented by programs for plotting the evolution of timbre on the polar diagram [3][4] or on a Cartesian diagram.
The main functions of the programs used are briefly described below.
4a. STFT (Short-Time Fourier Transform)
If a time series has time-varying spectral characteristics, such as a harmonic that shifts in frequency, its representation in terms of the Fourier transform gives no information about them.

The Short-Time Fourier Transform, by contrast, consists in principle of a series of Fourier transforms, each relating to a particular instant of the analysis interval (each instant considered is associated with the Fourier transform of the portion of signal lying "around" that instant); it therefore makes it possible to follow the time evolution of the signal spectrum [5].
The portions of signal to be analysed are obtained by multiplying the signal by an analysis window that is shifted in time by a constant step.
In this specific case a Hamming window of 256 or 512 points was used, shifted each time by 64 or 128 samples respectively.
In this way the time trends of the amplitude and frequency of the signal harmonics were obtained; in particular, the first six harmonics were considered.
4b. Separation of the harmonic and inharmonic parts of the sound
Given the (mean) spectrum of the signal, a filter is built whose frequency response is a series of windows centred on the significant harmonics; after multiplying the signal spectrum by the frequency response of the filter, an inverse Fourier transform of the result yields the time evolution of the harmonic part of the sound.
The inharmonic part, consisting mainly of the wind noise of the pipe, is then computed by difference.
4c. Graphical representation of the evolution of timbre
The method already cited for evaluating the timbre of pipe organs, presented at the VIII Colloquio di Informatica Musicale [4], was used; among other things it allows a sound to be represented on a polar diagram in which the distance from the centre indicates the complexity, while the angle with respect to the vertical axis indicates the "colour" of the sound under examination.
To improve the display during the transient, a Cartesian diagram was also used, with the angle (colour) on the abscissa and the complexity on the ordinate; for the same reason, it is also possible to select (with a zoom function) only the region of the plane crossed by the trajectory.
5. Presentation of the results
5a. STFT (Short-Time Fourier Transform)
Observation of the amplitude trends of the first six harmonics (see Figs. 1 and 2) reveals different behaviour between identical notes played with different kinds of touch.

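The amplitude trends discussed here come from the STFT analysis of section 4a. A minimal sketch of that kind of analysis is given below in Python with NumPy; it is not the SPC90 program. The Hamming window of 512 points and the hop of 128 samples are taken from the text, while the function name, the known fundamental f0 and the simple nearest-bin peak picking are assumptions made for illustration.

```python
import numpy as np

def harmonic_amplitudes(x, fs, f0, n_harm=6, win_len=512, hop=128):
    """Track the amplitude of the first n_harm harmonics of f0 over time.

    Returns an array of shape (n_frames, n_harm). f0 is assumed to be known;
    each harmonic is read from the FFT bins nearest to h * f0.
    """
    window = np.hamming(win_len)
    n_frames = 1 + (len(x) - win_len) // hop
    freqs = np.fft.rfftfreq(win_len, d=1.0 / fs)
    amps = np.zeros((n_frames, n_harm))
    for m in range(n_frames):
        frame = x[m * hop : m * hop + win_len] * window
        # Scale so that an on-bin sinusoid of amplitude A reads as A.
        spectrum = np.abs(np.fft.rfft(frame)) * 2.0 / window.sum()
        for h in range(1, n_harm + 1):
            k = int(np.argmin(np.abs(freqs - h * f0)))
            lo, hi = max(k - 1, 0), min(k + 2, len(freqs))
            amps[m, h - 1] = spectrum[lo:hi].max()
    return amps

if __name__ == "__main__":
    fs = 44100.0
    f0 = 4 * fs / 512                      # synthetic test tone, exactly on a bin
    t = np.arange(int(fs)) / fs
    x = sum(np.sin(2 * np.pi * h * f0 * t) / h for h in range(1, 7))
    print(harmonic_amplitudes(x, fs, f0)[10])
```

With a 512-point window the harmonics must be spaced by several bins to be read independently, so the window length has to be chosen with the pitch of the note and the sampling rate in mind.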
This difference in behaviour was found, almost uniformly, for the analysed notes of the various stops considered.
It was thus found that, when playing with fast touch, the steady-state value is reached after an overshoot, sometimes considerable, of the amplitude of the 2nd harmonic; with slow touch, the trend is exponential or in any case much less abrupt.
For stops with stopped pipes (Bordone 8' and Quintadena 16'), whose spectra lack the even-order harmonics, the same behaviour is found for the 3rd harmonic.
This feature of the transient behaviour fully confirms what has already been stated by various authors [6][7][8], while it contradicts what is stated in [1].
5b. Separation of harmonic and inharmonic parts
The time evolutions of the inharmonic part (wind noise) under the two touch conditions are also extremely different (see Figs. 3c and 4c).
Here too the results were on the whole homogeneous across the notes examined.
The conclusion to be drawn in this case is that, with fast touch, the initial emission of noise (wind noise) is more pronounced.
This conclusion further confirms the real influence of touch on the initial transient of pipe sounds; this kind of confirmation does not appear to have been considered by other authors.
5c. Graphical representation of timbre
Considering the evolution of timbre during the attack transient (Figs. 5, 6, 7 and 8), different trajectories can be seen.
The successive points of the trajectories correspond to time intervals of 20 ms.
The orientation of the trajectories, very evident on the expanded Cartesian diagram, allows a qualitative assessment of how the complexity and the angle related to timbre change during the attack transient. In Figure 6 the timbre can be seen to evolve from larger to smaller values of complexity and angle.
This too is a fairly uniform feature, already found in previous analyses [9].
The influence of touch, in this case, lies in making this mode of evolution different when passing from slow touch to fast touch.
All this provides further confirmation of the sensitivity of the instrument to touch, with a completely new approach, never considered so far for lack of a method suited to highlighting the overall evolution of timbre during the transient.

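The separation into harmonic and inharmonic parts described in section 4b can be sketched as follows, again in Python with NumPy. This is an illustration under stated assumptions, not the authors' program: the raised-cosine shape of the windows, their half width in Hz, the number of harmonics and all names are hypothetical; the paper only states that the filter response is a series of windows centred on the significant harmonics and that the inharmonic part is obtained by difference.

```python
import numpy as np

def split_harmonic_inharmonic(x, fs, f0, n_harm=10, half_width_hz=20.0):
    """Split x into a harmonic part and an inharmonic (residual) part.

    The frequency response is a set of raised-cosine windows centred on
    h * f0, h = 1..n_harm (window shape and width are assumptions).
    """
    n = len(x)
    spectrum = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    mask = np.zeros_like(freqs)
    for h in range(1, n_harm + 1):
        dist = np.abs(freqs - h * f0)
        inside = dist <= half_width_hz
        bump = 0.5 * (1.0 + np.cos(np.pi * dist[inside] / half_width_hz))
        mask[inside] = np.maximum(mask[inside], bump)
    harmonic = np.fft.irfft(spectrum * mask, n=n)   # filtered spectrum, back to time
    inharmonic = x - harmonic                       # residual computed by difference
    return harmonic, inharmonic

if __name__ == "__main__":
    fs, n = 8000.0, 8000
    t = np.arange(n) / fs
    rng = np.random.default_rng(0)
    tones = np.sin(2 * np.pi * 200 * t) + 0.5 * np.sin(2 * np.pi * 400 * t)
    x = tones + 0.01 * rng.standard_normal(n)
    harm, inharm = split_harmonic_inharmonic(x, fs, 200.0)
    print(np.sqrt(np.mean(inharm ** 2)))
```

Because the residual is defined by difference, the two parts always add back to the original signal; whatever noise falls inside the windows is attributed to the harmonic part, so narrower windows give a cleaner separation at the price of losing fast amplitude changes in the transient.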
6. Conclusions
The desire to understand the behaviour of pipe organs ever more accurately, on the one hand, and the need to make the sound of electronic liturgical organs ever more realistic, on the other, lead to an increasingly detailed analysis of the characteristics of traditional organs.
In particular, the possibility that touch influences the attack transient of the sounds of these instruments was investigated here.
The results of the experiments carried out confirm that, in mechanical-action pipe organs, different ways of playing the notes produce different characteristics in their attack transient.
Of the methods of investigation adopted, two are to be considered innovative, and these too agree fully with the results obtained with the other analysis methods already used by various authors, so that it can be stated with even greater confidence that touch does influence the attack transient of the sounds of mechanical-action pipe organs.

References
[1] E. Girardi, "Problematiche Relative all'Esecuzione su Organi Liturgici in Situazioni e Modalità Diverse di Alimentazione", III Convegno di Organologia sul Tema La Riforma dell'Organo Italiano, Pisa 1990.
[2] P. Barbieri, "La Controriforma dell'Organo in Italia: Meccanica e Fonica dal 1930 al 1990", III Convegno di Organologia sul Tema La Riforma dell'Organo Italiano, Pisa 1990.
[3] M. Dal Sasso, G.B. Debiasi, "Metodo per la Valutazione Automatica dei Timbri di Organi a Canne", VIII Colloquio di Informatica Musicale, Cagliari 1989.
[4] G.B. Debiasi, G. Spagiari, "Metodi di Analisi delle Microvariazioni di Ampiezza e Frequenza dei Suoni Musicali e loro Applicazioni allo Studio di un Organo Barocco", IX Colloquio di Informatica Musicale, Genova 1991.
[5] J.S. Lim, A.V. Oppenheim, "Advanced Topics in Signal Processing", Prentice Hall, 1988.
[6] N.H. Fletcher, "Transients in the Speech of Organ Flue Pipes - A Theoretical Study", ACUSTICA, vol. 34, pp. 224-233, 1976.
[7] T.L. Finch, A.W. Nolle, "Pressure Wave Reflections in an Organ Note Channel", Journal of the Acoustical Society of America, vol. LXXIX-5, pp. 1584-1591, 1986.
[8] S. Caddy, H.F. Pollard, "Transient Sounds in Organ Pipes", ACUSTICA, vol. 8, pp. 277-280, 1957.
[9] M. Dal Sasso, G.B. Debiasi, G. Spagiari, "Method for Automatic Evaluation of Timbre and Fluctuations of Pipe Organ Sounds", International Computer Music Conference, Montreal 1991.

[Fig. 1. Time evolution of the amplitude of the first six harmonics - Note RE2 - Stop PRINCIPALE 8' - SLOW touch.]
[Fig. 2. Time evolution of the amplitude of the first six harmonics - Note RE2 - Stop PRINCIPALE 8' - FAST touch.]
[Fig. 3. Time evolution of the attack transient - Note RE2 - Stop PRINCIPALE 8' - SLOW touch: a) overall sound, b) harmonic part, c) inharmonic part.]
[Fig. 4. Time evolution of the attack transient - Note RE2 - Stop PRINCIPALE 8' - FAST touch: a) overall sound, b) harmonic part, c) inharmonic part.]

[Fig. 5. Polar diagram of the evolution of timbre - Note RE2 - Stop PRINCIPALE 8' - SLOW touch.]
[Fig. 6. Expanded (zoomed) Cartesian diagram of the evolution of timbre - Note RE2 - Stop PRINCIPALE 8' - SLOW touch.]
[Fig. 7. Polar diagram of the evolution of timbre - Note RE2 - Stop PRINCIPALE 8' - FAST touch.]
[Fig. 8. Expanded (zoomed) Cartesian diagram of the evolution of timbre - Note RE2 - Stop PRINCIPALE 8' - FAST touch.]

SOUND ANALYSIS METHODS BASED ON CHAOS THEORY

Angelo Bernardi, Gian-Paolo Bugna, Giovanni De Poli

CSC-DEI, Universita' di Padova, Via Gradenigo 6a, 35131 Padova PD, Italy
email: depoli@dei.unipd.it

Abstract

Musical signals present several aspects that are not detected by classical analysis methods, such as time-frequency analysis techniques. For instance, in signals produced by real instruments the non-linear dynamics of the exciter may often result in some small or large degree of turbulence during the evolution of the sound, or in the production of non-periodic sounds (such as multiphonic tones). We investigate the possibility of using analysis methods based on chaos theory for studying significant properties both of the signal and of the production mechanism.

1. MUSIC SIGNAL AS FRACTAL (CHAOTIC) SIGNAL

Chaotic signals are of particular interest and importance in experimental physics because of the wide range of physical processes that apparently give rise to chaotic behaviour. In particular, the physical systems involved in this kind of signal are characterised by strong non-linearity and a fractal dynamic nature. From the point of view of signal processing, the detection, analysis and characterisation of signals of this type present a significant challenge and an opportunity to explore and develop completely new classes of signal processing algorithms. There is experimental evidence that some music signals show a fractal nature; a first example is the musical sound produced by a non-linear dynamic exciter with chaotic behaviour [1]. Moreover, it has been shown that musical sound presents a pitch fluctuation (turbulence) that can be modelled and quantified by fractals. Spectral density measurements of the pitch variation in various types of music show their common correlation with the fractal world. Furthermore, some kinds of computer-aided musical composition are tied to the self-similarity property of the generator of event sequences. All the above theoretical considerations and the experimental evidence lead us to use fractals as a mathematical model for the study of music signals. The music signal waveform is studied by new analysis techniques based on the concepts of fractal geometry.
One of the main purposes of this
kind of approach is to determine whether the structure of a musical signal can be considered self-similar, and hence whether it is possible to quantify the degree of turbulence of the signal itself as reflected in the fragmentation of its time graph. The most important idea we focus on is the fractal dimension of sound signals, because it can quantify the fragmentation of their graph. Since the relationship between turbulence and its fractal dimension, or the fractal dimension of the resulting time graph, is little understood, in this paper we conceptually equate the amount of turbulence in a musical sound with its fractal dimension. To compute it, we use an algorithm based on morphological filters that iteratively expand and contract the signal's graph. Thus it is possible to determine the most significant properties of a music signal if the starting hypothesis of a fractal signal is sufficiently verified. Besides, with this new analysis method based on fractal geometry it is possible to obtain useful information for sound characterisation even when the music signal is not closely related to a fractal.

2. FRACTAL SIGNAL MODEL AND LOCAL FRACTAL DIMENSION

To understand how fractal geometry and chaotic signals are related to musical signals, we can examine the basic concepts of exact self-similarity and statistical self-similarity. An example of exact self-similarity is the von Koch curve: each small portion of the curve, when magnified, reproduces exactly a larger portion. The curve is said to be invariant under changes of scale. Actual objects rarely exhibit exact self-similarity at every scale factor; however, they often possess a related property of statistical self-similarity, where a small portion of the curve looks like (but not exactly like) a larger portion. Formally, we can say that a signal is statistically self-similar if the stochastic description of the curve is invariant under scale change. Fractal sets are characterised by statistical self-similarity properties; objects obviously remain different in the details related to scale change. The way in which the detail varies as one changes scale can be characterised by a parameter called the fractal dimension.

From the application of signal theory it is possible to verify that there is a direct relationship between the fractal dimension and the logarithmic slope of the spectral density of a fractal signal [2]. For instance, a one-dimensional fractional Brownian motion signal with fractal dimension D is characterised by a spectral density proportional to $1/f^{\beta}$, where $\beta = 5 - 2D$.

However, the statistical self-
similarity considered above extends from arbitrarily large to arbitrarily small scales. For actual signals this property can be formulated only over a finite range of scale changes. Therefore, in the case of real signals, the hypothesis of statistical self-similarity can be sustained only over a finite range of time scales. For instance, sounds have a finite duration, and in discrete time they also have a finite bandwidth. Within this range the fractal dimension should appear constant; on the other hand, for real signals the hypothesis of self-similarity is only approximately verified, so the fractal dimension also turns out to be a function of the time scale factor ε. Moreover, for music signals that can be described by a non-stationary process, the characteristics of the signal vary as a function of time: hence the fractal dimension will also reflect this property.

The estimation of a global fractal dimension gives an average value of this variation, which is of little interest for sound characterisation. Hence, instead of a global dimension, we estimate a local fractal dimension (LFD) computed on a window covering a portion of the musical signal. Since the music signal is generally non-stationary, the measurements on different windows will not be the same. If we move the window along the time evolution of the signal, we can highlight various behaviours in different portions of the signal, characterised by different fractal dimensions.

The local fractal dimension is therefore variable as a function of time and of scale factor, LFD(t, ε). We can note the similarity with the Short Time Fourier Transform (STFT), which defines local time/frequency variations of the signal, and recall the relationship between fractal dimension and spectral slope.

Numerous algorithms exist to estimate the fractal dimension. We employed an efficient algorithm, recently developed by Maragos [3] [4], based on morphological filtering of the time graph of the signal to compute the variation of the cover areas vs. time scale. This algorithm allows us to estimate LFD(ε), which for each ε is equal to the slope of a line fitted to the log-log plot of the cover area over a moving window. We employed a window of 10 scales {ε, ε+1, ..., ε+9}.

3. FRACTAL DIMENSION AS SCALE FACTOR FUNCTION

Here we report some fractal dimension analyses of real musical signals, such as multiphonic sounds on woodwinds (clarinet and oboe) and the wolf note on the double bass, together with conventional sounds on the same instruments.
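The cover-area estimate described in Section 2 can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes a flat structuring element spanning 2ε+1 samples, edge padding at the signal boundaries, and NumPy 1.20 or later; the function names are hypothetical. The dimension is taken as the slope of log(A(ε)/ε²) against log(1/ε), where A(ε) is the area between the dilated and the eroded signal, fitted over ten consecutive scales as in the text.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def cover_area(x, eps):
    """Area between the flat dilation and erosion of x at half-width eps (in samples)."""
    padded = np.pad(x, eps, mode="edge")               # assumption: repeat the edge samples
    windows = sliding_window_view(padded, 2 * eps + 1)
    dilation = windows.max(axis=1)                     # running maximum
    erosion = windows.min(axis=1)                      # running minimum
    return float(np.sum(dilation - erosion))


def local_fractal_dimension(x, eps_start, n_scales=10):
    """Estimate LFD at scale eps_start from the scales eps_start .. eps_start + n_scales - 1."""
    x = np.asarray(x, dtype=float)
    scales = np.arange(eps_start, eps_start + n_scales)
    areas = np.array([cover_area(x, int(e)) for e in scales])
    # Least-squares line through the log-log points; its slope is the dimension estimate.
    # Assumes the signal is not constant, otherwise the areas are zero.
    slope, _intercept = np.polyfit(np.log(1.0 / scales), np.log(areas / scales**2), 1)
    return float(slope)
```

With this sketch a straight ramp gives an estimate close to 1, while uniform white noise gives a value noticeably closer to 2, in line with the idea that the dimension measures the fragmentation of the graph.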
Fig. 1.a and 1.b show the time behaviour of a clarinet sound and its fractal dimension vs. scale factor. It can be seen that fractal dimension analysis of a real musical signal shows interesting properties. In particular, we can observe that many of the signals considered show fractal properties, such as the invariance of the local fractal dimension vs. scale factor.

Fig. 1.a

Fig. 1.b

We can also see that the fractal dimension is not representative of the analysed instrument: indeed the same instrument can generate sounds with different fractal dimensions.

After a great number of tests we established that we obtain the best results using time windows of 1 or 2 periods of the examined signal. Indeed, with a larger number of periods we can reach a saturation condition in which a variation of the scale factor is not matched by an adequate variation of the cover area. However, to avoid this problem we can use a higher sampling frequency.

We can also observe that fractal analysis is not sensitive to a magnitude (attenuation) factor applied to the musical signal. Indeed, the fractal dimension is estimated as the slope of a straight line on a logarithmic scale, so a magnitude/attenuation factor affects only the intercept of the fitted straight line, not its slope. To obtain the best results we suggest using the full range of signal amplitude; in this way the estimated dimension depends less sensitively on quantization errors.

In the previous example we applied a fractal model to a substantially quasi-periodic signal. To better characterise turbulence we prefer to separate this component from the quasi-periodic part of the signal. We extract the frequency and amplitude deviations of the harmonics by STFT analysis and apply the fractal model. Fig. 2.b shows the estimated fractal dimension of the amplitude deviations (Fig. 2.a) of the first partial of a C4 played in the principal register with
a pipe organ. These fluctuations are well approximated by fractal dimension values of 1.6. Successive partials show similar behaviour. This evidence reveals that fractal modelling of control signals is widely justified.

Fig. 2.a

Fig. 2.b

A great number of the signals we analysed keep a sufficiently uniform local fractal dimension as the scale factor ε varies. Certain signals, however, do not satisfy this condition. Some seem to stabilise very slowly on a steady fractal dimension. Others never exhibit a stable fractal dimension, reed-excited registers for instance. Such musical signals are probably characterised by an absence of turbulence, so the primary conjecture of fractal modelling is not verified.

The information deduced from the previous analysis could be employed in a sound synthesis model. We can think of an additive synthesis model in which the amplitude and frequency fluctuations of the partials are governed by control signals consistent with the fractal analysis. In simpler cases it should be enough to generate a control signal whose local fractal dimension stays constant as the scale factor varies. Fractal interpolation can also be used when we wish to specify a general behaviour.

Synthesis techniques could presumably also be applied to signals whose local fractal dimension varies (vs. scale factor) according to an assigned function of the scale factor. In this way it should be possible to approach sound synthesis for signals where the invariance of the local fractal dimension of the control signals vs. scale factor is not verified.

4. FRACTAL DIMENSION AS FUNCTION OF TIME

Computing the variations of the fractal dimension as a function of time for a musical signal may show the time-varying turbulence of the various phases of the sound.

To bring out this information we choose the value of the fractal dimension corresponding to a fixed scale factor, and we calculate this value for different windows,
namely different portions of the same signal at successive times. The choice of the scale factor is determined by two main considerations: too low a factor ε could give rise to a wrong value of the dimension, while a large scale factor implies a long computation time. Because of these factors we estimate that the best reference value is of the order of tens.

By estimating the dimension for that particular scale factor on consecutive signal sections, it is possible to plot a graph of the estimated dimension vs. time.

Results of this analysis are shown in Fig. 3, in which we report the time behaviour of the signal and the corresponding fractal dimension variations. In this figure we used observation windows of 1,000 samples; the scale factor was fixed at ε = 10, while the slope of the straight line (the dimension) was estimated on ten points.

Fig. 3 (panel title: Fractal Dimension)

After a transient where fluctuations may be perceptible, the fractal dimension of a real instrument sound generally shows a tendency to settle on a constant value. We made several tests on the same signal, varying the number of samples in the window used for the dimension computation. Using windows with a smaller number of samples, the behaviour is similar to that shown in the reported figure, but with large fluctuations around the average value. Clearly, a reduction of the number of samples reduces the time required for the computation.

Nevertheless, the previous figures clearly show fluctuations of the fractal dimension in the attack (and decay) transient of sounds radiated by real instruments. Of course, such information could usefully be applied in synthesis models so as to give the artificial sounds greater naturalness.

5. GENERATOR'S DYNAMIC ANALYSIS IN STEADY SOUNDS

Besides fractal modelling of signals, chaos theory has developed tools for the analysis of the sound generation mechanism. Namely, it is possible to resolve the particular behaviour originating the sound. This approach is particularly useful in musical acoustics for steady-state behaviours.
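The windowed analysis of Section 4 can be sketched along the following lines. This is an illustrative sketch under stated assumptions rather than the procedure actually used for Fig. 3: it uses non-overlapping windows (the text does not say whether the windows overlap), a flat structuring element with edge padding for the morphological cover, and hypothetical function names. The defaults follow the values quoted in the text: windows of 1,000 samples, ε = 10, and a line fitted over ten points.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def cover_area(frame, eps):
    """Area between running maximum and running minimum over 2*eps + 1 samples."""
    padded = np.pad(frame, eps, mode="edge")           # assumption: repeat the edge samples
    windows = sliding_window_view(padded, 2 * eps + 1)
    return float(np.sum(windows.max(axis=1) - windows.min(axis=1)))


def dimension_vs_time(x, window=1000, eps=10, n_points=10):
    """Return one fractal dimension estimate per consecutive window of the signal."""
    x = np.asarray(x, dtype=float)
    scales = np.arange(eps, eps + n_points)            # scales eps .. eps + n_points - 1
    dims = []
    for start in range(0, len(x) - window + 1, window):
        frame = x[start:start + window]
        areas = np.array([cover_area(frame, int(e)) for e in scales])
        # Slope of log(A/eps^2) against log(1/eps) is the dimension estimate.
        slope, _ = np.polyfit(np.log(1.0 / scales), np.log(areas / scales**2), 1)
        dims.append(slope)
    return np.array(dims)
```

Plotting the returned array against the window index gives a curve of the same kind as the dimension-vs-time graph described above; a smaller `window` makes the computation cheaper but, as the text notes, the estimates fluctuate more.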
Reconstruction of the attractor in time-delayed phase space [5] is the most important tool, together with the Poincaré map technique, which allows a reduction of one dimension in the description of the system. These methods complement classical time-frequency analysis techniques and can show many aspects that are not detected by a simple FFT. In particular, we investigate the regimes produced by some self-sustained musical instruments, with the aim of exploring the physical phenomena underlying the sound production mechanism. Multiphonics of oboe, clarinet and recorder and the wolf note of the double bass are analysed. In this way the experimental evidence for chaos in multiphonic tones derived from measurements of their power spectra can be confirmed by the reconstructed attractors in the phase space of the dynamic system. Fig. 4 shows a 3D phase space reconstruction and the corresponding Poincaré map for a clarinet multiphonic. The attractor is chaotic (biperiodic).

Fig. 4

Fig. 5

We can evaluate the fractal dimension of the attractor reconstructed in phase space: this parameter is quite different from the fractal dimension of the signal (vs. time) graph discussed above. The dimension of attractors can be evaluated by embedding the time series in a higher-dimensional space. The
correlation dimension D [6] is measured by embedding a single measured time series in a higher-dimensional space, so as to reconstruct the phase space of the dynamic system. In particular, the estimated dimensions of the attractors can show that sometimes the sound has the characteristics of chaotic dynamics with a fractal (non-integer) dimension, and at other times it shows a behaviour with a biperiodic spectrum. Thus the specific typology of the reconstructed attractors shows that self-sustained musical instruments can be modelled by non-linear dynamic systems with few degrees of freedom. In fact a phase-locked biperiodic spectrum is characteristic of low-dimensional chaotic attractors in mechanical or fluid-dynamic systems; in particular, it seems that the real musical instruments analysed show a behaviour like a quasi-periodic route to chaos.

We examined the multiphonic sound of Fig. 4 and found a dimension of 2.01, suggesting biperiodic dynamics. For the clarinet multiphonic #33, instead, we detected three successive kinds of steady-state behaviour, with attractor dimensions 1.23 (like a periodic and slightly noisy sound), 3.24 and 3.82, suggesting a quasi-periodic route to chaos (see Fig. 5, 6 and 7 respectively).

Fig. 6

Fig. 7

Only the first regime is quite different from the others: frequency
analysis appears similar in the second and third cases, and so does the sound on listening.

The choice of the reconstruction delay in phase space is a very critical factor for an expressive description of the phenomena. For musical signals we obtained good results by picking a zero (generally the first) of the autocorrelation function.

The bifurcation diagram is another useful tool for the analysis of chaotic systems: it shows how the steady-state behaviour evolves as a system parameter varies. This tool helps in the correct choice of parameters in the simulation of physical models. Often it reveals the particular route to chaos of the model (often we find that it is different from the route to chaos of the real instrument). Thus a basic shortcoming of many physical models for sound synthesis is pointed out. In fact these models do not allow one to obtain the global behaviour (e.g. the route to chaos) of a real instrument. For example, Fig. 8 shows a bifurcation diagram for the classic physical model of the clarinet [7], exhibiting a non-realistic period-doubling route to chaos.

Fig. 8 (axes: acoustical pressure vs. mouth pressure)

6. REFERENCES

[1] D.H. Keefe, B. Laden: "Correlation dimension of woodwind multiphonic tones", J. Acoust. Soc. Am., vol. 90, n. 4, pp. 1754-1765, 1991.
[2] R.F. Voss: "Random fractal forgeries", in Fundamental Algorithms for Computer Graphics, R.A. Earnshaw ed., Springer-Verlag, 1985.
[3] P. Maragos: "Fractal aspects of speech signals: dimension and interpolation", Proc. ICASSP 91, pp. 417-420, 1991.
[4] P. Maragos, Fang-Kuo Sun: "Measuring the fractal dimension of signals: morphological covers and iterative optimization", IEEE Transactions on Signal Processing, 1993.
[5] W. Lauterborn, U. Parlitz: "Methods of chaos physics and their application to acoustics", J. Acoust. Soc. Am., vol. 84, n. 6, pp. 1975-1993, December 1988.
[6] P. Grassberger, I. Procaccia: "Measuring the strangeness of strange attractors", Physica, vol. 9D, pp. 189-208, 1983.
[7] A. Bernardi, G.P. Bugna, G. De Poli: "Analysis of musical signal with chaos theory", Proc. Int. Work. Models and Representations of Musical Signal, Capri 1992.
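Returning to Section 5, the time-delay reconstruction and the delay choice mentioned there can be illustrated with a short sketch. It is not the authors' code: the function names are hypothetical, the delay is taken as the first lag at which the mean-removed autocorrelation becomes non-positive, and the correlation dimension itself is not computed here.

```python
import numpy as np


def first_zero_autocorr(x):
    """Return the first lag at which the autocorrelation of x is zero or negative."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    # Keep only the non-negative lags of the full autocorrelation sequence.
    r = np.correlate(x, x, mode="full")[len(x) - 1:]
    crossings = np.nonzero(r <= 0)[0]
    if len(crossings) == 0:
        raise ValueError("autocorrelation never reaches zero")
    return int(crossings[0])


def delay_embed(x, dim, delay):
    """Build the points (x[t], x[t + delay], ..., x[t + (dim - 1) * delay])."""
    x = np.asarray(x, dtype=float)
    n = len(x) - (dim - 1) * delay
    if n <= 0:
        raise ValueError("signal too short for this dimension and delay")
    return np.column_stack([x[i * delay:i * delay + n] for i in range(dim)])
```

For a pure sine wave the first zero of the autocorrelation falls near a quarter of the period, so a two-dimensional embedding traces an almost circular closed orbit, the simplest periodic attractor. A Poincaré map would then be obtained by intersecting the three-dimensional embedding with a plane.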
COUNTERWAVE
A program for controlling degrees of
independence between simultaneously
changing waveforms
Dr. Arun Chandra
Institute of Applied Arts
National Chiao Tung University
Hsinchu, Taiwan 30050
Republic of China
fax: +886-35-712-332
email: arunc@cc.nctu.edu.tw

Introduction

In the 1950s, W. Ross Ashby published his book An Introduction to Cybernetics, which presented structural transformations in a way that allows their description to be independent from their medium of implementation. At that time, the science of cybernetics, started by the work of mathematician Norbert Wiener and engineer Claude E. Shannon, generated an enormous amount of interest due to the possibilities it offered with regard to devising a language for the analysis of systems that was independent of any one field. People from many disciplines were attracted to it: anthropologist Margaret Mead, biologist Heinz von Foerster, mathematician John von Neumann, psychologist Gregory Bateson, and many others, all took part in the "Macy" conferences in the early 50s, and the subsequent founding of the American Society for Cybernetics.

Also in the 1950s, Lejaren A. Hiller and Leonard Isaacson programmed a computer (the Illiac II) to generate output based on applications of traditional contrapuntal rules. This output was transcribed into music notation, and performed by a string quartet, and was the first composition created with the assistance of a computer.

In the 1970s, the composer Herbert Brün (who had been involved with compositional experimentation with technology since his work in the 1950s at the Cologne Radio Studios in West Germany) began work at the University of Illinois on his project SAWDUST, with the initial assistance of Gary Grossman, and later Jody Kravitz and Keith Johnson. SAWDUST, originally written for a
VAX 11/780 and then ported to a 386-PC by Johnson, took its synthesis paradigm not from the mathematical models of Fourier synthesis, but the transformational ideas of cybernetics. SAWDUST allows for the specification of square waves, which are then subject to linear transformation from a specified initial to a specified final state. The program currently has five transformational algorithms: vary, turn, merge, mingle, and link.

My work on CounterWave began by wondering what would happen if the idea of counterpoint were applied, not to notes, but to states of waveforms?

CounterWave

Using CounterWave requires two steps: [1.] Specification of the waveforms and their changes. [2.] Specification of the relationships between waveforms.

Definitions

Relationship: The degree of consequence the presence or absence of one waveform has on the state of another.
Waveform: An iterated sequence of states, where a state may or may not change upon iteration, depending on its constituent segments.
State: A specified sequence of segments. A state can be made of any combination of segment types. A segment can occur more than once within a state.
Segment: A specified sequence of samples. As of this writing, a segment can be one of three types:

1. A wiggle: All samples have the same amplitude.

2. A twiggle: The samples have amplitudes that rise to a potentially fluctuating "peak," then return to their starting value. Both rise and fall are linear.

3. A ciggle: The amplitudes rise to and fall from a potentially fluctuating "peak," following the paths of two second-order polynomials.

All three types can be "slanted," that is, a segment's starting amplitude is the previous segment's ending amplitude. The segment's specified amplitude is taken as its ending amplitude. If a slanted segment is the first member of a state (i.e., if there is no "previous segment"), its starting amplitude is assumed to be zero.

Change

Each segment type has either two or four variables, described below. Every variable is given an initial value, minimum and maximum limits, and a rate of change. Upon each iteration of a state:

• If a variable has a non-zero rate of change, its rate is added to or subtracted from its current magnitude.

• If a variable has a rate of change that is zero, it repeats its magnitude.

• If a variable reaches its minimum or maximum limit, it changes the direction of its
changing, i.e., if it was growing, it will start shrinking, and vice-versa.

Thus, every variable that has a non-zero rate of change has a cycle length of

$$\mathit{cycles} = \frac{2\,(\mathit{max} - \mathit{min})}{\mathit{rate}}$$

After the state has iterated cycles number of times, the variable will return to its initial magnitude.

Since a segment can have either two or four variables, and each variable can have a unique cycle length, the return of a variable to its initial magnitude does not necessarily result in the return of a segment to its initial configuration.

A wiggle

A wiggle has two variables: [1.] Length in samples (0-1000). [2.] Height (amplitude) (±32767).

Below is an example of a data file for creating seven segments that are wiggles.

The first line gives a unique identifier for the segment, followed by its type. The second line gives, from left to right, the initial length of the segment, its maximum and minimum limits, and its rate of change. The third line gives the initial height (amplitude), its maximum and minimum limits, and its rate of change.

#init max min rate
w0 wiggle
100 100 20 4.5
10000 20000 -20000 10
w1 wiggle
20 130 20 4.5
-10000 10000 -10000 20
w2 wiggle
50 100 20 4.5
10000 20000 -20000 30
w3 wiggle
1 130 20 4.5
-10000 10000 -10000 40
w4 wiggle
1 100 20 4.5
10000 20000 -20000 50
w5 wiggle
1 130 20 4.5
-10000 10000 -10000 60
w6 wiggle
1 100 20 4.5
10000 20000 -20000 70

These seven segments are combined to form a state. The sequence of segments in this state is: w0, w1, w2, w3, w4, w5, w6.

Plots of this state at its 100th and the 700th iterations, followed by their respective FFTs, are displayed below (figures 1-4). In addition to the change in the amplitude configuration, please notice the change in length of the state.

The next example (Figures 5 and 6) uses the same segments as above, but four of the seven segments are slanted. The iteration number for each state is the same as that of Figures 1 and 2.

#init max min rate
w0 wiggle
100 100 20 4.5
10000 20000 -20000 10
w1 wiggle
20 130 20 4.5
-10000 10000 -10000 20
w2 wiggle slanted
50 100 20 4.5
10000 20000 -20000 30
w3 wiggle slanted
1 130 20 4.5
-10000 10000 -10000 40
w4 wiggle slanted
Figure 1: 7 segments, wiggles, 100th iteration

Figure 2: same as figure 1, 700th iteration

Figure 3: FFT of figure 1

Figure 4: FFT of figure 2
1 100 20 4.5
10000 20000 -20000 50
w5 wiggle slanted
1 130 20 4.5
-10000 10000 -10000 60
w6 wiggle
1 100 20 4.5
10000 20000 -20000 70

A twiggle

A twiggle is a triangular wiggle. The idea behind a twiggle was to have a "triangle" in which all three sides could be in flux. Thus, there are four variables: [1.] Base length in samples (0-1000). [2.] Base height (amplitude) (±32767). [3.] Peak height (amplitude) (±32767). [4.] Peak location relative to the length of the base (0-1).

Figures 7 and 8 show a state that has 15 segments, all twiggles. Please notice that, unlike the wiggles example above, the length of the state in iterations 300 and 900 is approximately the same, although the waveform has substantially changed.

A ciggle

A ciggle is a twiggle with curved sides. Whereas the twiggle connected the base of its triangle to its peak with straight lines, a ciggle connects them with curved lines. A ciggle uses the same set of variables as a twiggle.

In Figures 9 and 10, four ciggles (c0-c3) are used to construct a state that has eight segments in the following sequence: c0 c1 c2 c3 c1 c0 c2 c1.

Coordination

Each waveform records aspects of its current state in a "window" variable, accessible by all other waveforms in formation. This window variable contains:
1. The number of segments in the state and their type.
2. The current maximum and minimum amplitudes of the entire state.
3. The current length in samples of the entire state (the sum of all the segments).

Upon beginning construction of a new state, each waveform checks the windows of all other waveforms in creation. After reading this information, the waveform can either ignore it, or act on it. If ignored, the waveform continues its configuration of changes. If not ignored, the waveform can:
1. Change the minima and maxima for some or all its variables.
2. Change the rate of change for some or all its variables.
3. Change some or all segments from being "slanted" to "straight," or vice-versa.
4. Set the number of iterations for which it will ignore other waveforms.

The criteria used to decide whether to change a waveform's variables can be relatively simple, such as "if x other waveforms are present, change some variables," or variations on it. The criteria can also be more affectionate:
Jealousy: if another waveform's dynamic range is greater, match its dynamic range.
Figure 5: 7 segments, wiggles, some slanted, 100th iteration

Figure 6: same as figure 5, 700th iteration

Figure 7: 15 segments, twiggles, 300th iteration

Figure 8: same as figure 7, 900th iteration
Figure 9: 8 segments, ciggles, slanted, 200th iteration

Figure 10: same as figure 9, 600th iteration

Revulsion: if another waveform's limits are within x, change all limits by a factor of y.
Me-too: if another waveform's rates of change are faster, increase rates by a factor of x.
Stubborn: if any other waveforms are sounding, repeat the current state.
Shy: if any other waveforms are present, remain silent.
Loud-mouth: if any other waveforms are present, make the dynamic range greater than the largest of the others.

Observations on Results

1. Relatively prime cycle lengths generate richer harmonic fields than cycle lengths that are multiples of each other.
2. When the length of a state generates a fundamental that is below 20Hz (sub-audio), the resulting sounds resemble those of heavy machinery.
3. A state can generate a temporary steady pitch if the rates of change of its segments' sample lengths add up to zero, or close to it. We hear the repetition of the state's length (or near repetition) as a steady pitch.
4. When adjacent segments have amplitude relationships that are less than 60dB apart, their movement relative to each other seems to have no consequence on the resulting spectrum, although the lengths of the segments have a consequence on the fundamental frequency of the state. There ought to be an intelligent way to address this relationship.
5. The flexibility in sequence specification for segments, and the
possibility of segment repetition within a sequence, was done on the hypothesis that the parallel movement of separated segments could generate a second "fundamental" frequency, i.e., an overtone whose amplitude was as strong as the fundamental.
This did not happen. Occasions have occurred of radical shifts in timbral presence (as I hope are hinted at by the above FFT plots), but I cannot yet successfully predict them.
6. Currently, there is no reasonable way to organize the data controlling the relationships between the waveforms. Changes in the relationships require modification of the source code. A useful organization would allow for greater flexibility of experimentation.

Acknowledgements

Herbert Brün and Keith Johnson taught me about SAWDUST, its algorithms, and its radical address to compositional premises.

Jerry Keiper and Robert Naiman, both of Wolfram Research, Inc., generously contributed their skills in mathematics and numerical analysis, and patiently answered the many questions I had on implementation.

All plots in this article were generated with gnuplot, version 3.0, written by Thomas Williams and Colin Kelley, and available from the Free Software Foundation. gnuplot ran as a child process under CounterWave, plotted the data, and converted the result to PostScript.

The FFT algorithms are from the book Numerical Recipes in C.

I'd like to thank Wolfram Research, Inc., for their contributions of time and support on their computers and equipment.

I prototyped the synthesis algorithms in Mathematica, V2.1, then rewrote them in C for speed. The program currently runs on a NeXT 68040 workstation, running NeXTStep 2.1.
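As an illustration of the update rule described under "Change" and of the wiggle and "slanted" definitions, here is a minimal Python sketch. It is not CounterWave's C source: the class and function names are hypothetical, a variable is assumed to start out growing, limits are handled by folding the overshoot back into range, the rate is assumed smaller than the distance between the limits, slanted wiggles are assumed to ramp linearly to their ending amplitude, and lengths are rounded to whole samples.

```python
class Variable:
    """A bounded value that moves by `rate` per iteration and reverses at its limits."""

    def __init__(self, init, lo, hi, rate):
        self.value, self.lo, self.hi, self.rate = init, lo, hi, rate
        self.direction = 1  # assumption: variables start out growing

    def step(self):
        """Advance one iteration of the state and return the new magnitude."""
        if self.rate == 0:
            return self.value  # zero rate: the magnitude simply repeats
        nxt = self.value + self.direction * self.rate
        if nxt > self.hi:      # past the maximum: fold back and start shrinking
            nxt, self.direction = 2 * self.hi - nxt, -1
        elif nxt < self.lo:    # past the minimum: fold back and start growing
            nxt, self.direction = 2 * self.lo - nxt, 1
        self.value = nxt
        return nxt


def cycle_length(var):
    """Iterations needed for the variable to return to its initial magnitude."""
    return 2 * (var.hi - var.lo) / var.rate


def render_state(segments):
    """Render wiggles given as (length, height, slanted) tuples into a list of samples."""
    samples, previous_end = [], 0.0  # a slanted first segment starts from zero
    for length, height, slanted in segments:
        n = int(round(length))       # assumption: fractional lengths are rounded
        start = previous_end if slanted else height
        for i in range(n):
            # Straight wiggles stay at `height`; slanted ones ramp from `start` to `height`.
            samples.append(start + (height - start) * (i + 1) / n)
        previous_end = float(height)
    return samples
```

For example, a variable created as `Variable(5.0, 0.0, 10.0, 2.5)` has a cycle length of 2 (10 - 0) / 2.5 = 8, and eight calls to `step()` bring it back to 5.0, in agreement with the formula above. When the cycle length is not a whole number, as with the length variables in the data files (2 (100 - 20) / 4.5), the variable does not land exactly on its initial magnitude after a whole number of iterations under this folding rule.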
GRANULAR SYNTHESIS WITH
INTERACTIVE COMPUTER MUSIC SYSTEM

Agostino Di Scipio
Graziano Tisato
Centro di Calcolo di Ateneo - Via S. Francesco 11, I-35121 Padova
Fax +39498283733 - E-mail musicOI@unipad.unipd.it

Introduction

As a form of sound synthesis grounded on a microstructural representation of the musical signal, granular synthesis raises a twofold set of strictly related problems, concerning 1) the inexpensiveness and effectiveness of the signal processing algorithms (oscillators, envelope generators, phase-level controls) and 2) the design of a high-level control-structure, defined as a "front-end parameter processor" [1]. Typically, the density of grains, a determinant factor, is dependent on the inexpensiveness of the signal processing algorithms in terms of their computational load. In turn, the flexibility of controls and the degree of generality of the synthesis process are heavily dependent on the density of grains.

Of course, the higher the density of grains, the more critical the role of high-level controls becomes. As a consequence, the characteristics of the parameter control-structure are a major factor influencing the opacity of granular synthesis methods.

The task of the control-structure should be conceived as the task of providing a description of how grains overlap and repeat through time. Here lies the potential of granular synthesis, if considered not only as a technique of sound synthesis, but also as a powerful method of microstructural time modelling of sound [2], by which we mean a perspective on electroacoustic and computer music involving the composition of timbral events starting at the level of micro-time patterns of elementary signals. Thus conceived, granular synthesis acts as a model of materials and a model of musical design at once [3], a case of compositional approach and creative sonic design whose music-theoretical implications deserve special attention.

Control-structure relevance

In order to yield the superposition and juxtaposition of signals, each having its own amplitude and frequency, like

$$s(t) = \sum_{n,k} a_{n,k}\, g(t + k\tau)\, e^{jn\omega t}$$

(where $e^{j\omega t} = \cos \omega t + j \sin \omega t$), software synthesis may resort to different modules addressing different scales of time: signal generators, envelope generators, local controls pertaining to the parameters of each single grain, and more global controls describing the
evolution of low-level parameters throughout the synthesis process. The control-structure should interconnect all these modules. It represents, indeed, the operationalization of some theoretical model of how the low-level organization and temporal displacement of myriads of grains give rise to a global, coherent acoustical behaviour [2]. This point reflects a cognitive perspective of major relevance in music: the problem of making some higher-level morphological coherence emerge from many partial details is a compositional problem tout court, and remains a question of major importance for the composer even when it concerns the level of sound materials, i.e. when timbre is conceived as the central dimension of musical articulation [4].

Significantly, recent studies in the field of auditory scene analysis [5] highlight the fact that granular representations can well characterize textural percepts and other dynamical sonic phenomena (transients, for example). One needs not only a good description of the basic grain itself (or several grains); equally determinant, if not more so, is a good description of how grains repeat through time [5], hence a model of how tiny irregularities coalesce in a more or less homogeneous auditory image.

Precedents of the work

After Di Scipio's long experimentation devoted to designing and testing programs for granular synthesis and processing on an IBM 80486, and the realization of three musical works, "ikon" (quad tape, 1991), "plex" (double bass and quad tape, 1991) and "Kairos" (soprano sax and stereo tape, 1992), all the software was finally ported and further improved by Tisato in his Interactive Computer Music System [6]. ICMS is a venerable (first release in 1975), powerful software system for the analysis, synthesis and editing of sound, with some emphasis on linear predictive coding (LPC) and methods of analysis/resynthesis of sound [7]. Today, the engine of ICMS is an IBM 9121 mainframe computer, used for time-sharing applications, which returns minutes of 4-channel audio in very few seconds. One can also access ICMS from PC and NeXT computers via an Ethernet network.

Basic technical criteria

The very basic idea is that granular synthesis can be reduced to a particular case of granular processing (the GRANULAR PROC. options of ICMS are selected in the SOUND PROCESSING menu). In practice, each grain is generated following three main steps:
1) a pointer to a pre-existent source soundfile is calculated;
2) n samples are read from the file;
3) the sequence of n samples is processed, enveloped by a given function and written into a target soundfile. The system then jumps back to update the pointer.

a) Case of granular synthesis: The source file contains a signal, such as a single sine wave of linearly (or exponentially) increasing frequency; the pointer, then, will be in some direct relation with the frequency of
the grain being generated. While in the linear variation case the spectral bandwidth of grains remains constant regardless of the frequency, in the exponential case it broadens as frequency gets higher. The latter case better matches perceptual criteria and theoretical issues of signal representation [1], [8].
b) Case of granular processing: If a sampled sound is stored in the source file, the pointer history results in a sequence of grains of variable waveform, depending on the sequence of fragments read from the soundfile.
The two cases are distinct only in that completely different signals are being processed, and no other practical distinction can be drawn. This simplification radically decreases the computational load of the signal generators and allows one to experiment with different approaches to the design of the control-structure. Also, high-level strategies can be studied and tested with different source soundfiles.

Dynamical parameters controls

Clearly, the synthesized sound is dependent on the content of the source soundfile, as well as on the pointer history. The latter is itself determined by the parameter control-structure. In the ICMS granular options, the pointer moves either with linear motion (sequential access to the sourcefile) or non-linear motion (random access to the sourcefile). The pointer moves linearly in the "direction of time" only when option 1) in the GRANULAR PROC. submenu is selected (fig. 1). It is the only option, moreover, which makes little sense if not used for the processing of sampled sounds. Time-stretching and time-contracting of sound can be performed, as well as various mixtures of the two in the same process (for example, a musical gesture beginning much faster than reality and then slowing down thousands of times).
Options 2-7 refer to various methods of random access to the source file, according to the equations of fig. 1. The difference equations utilized by selecting options 4-7 represent simple iterated models with complex dynamics, i.e. featuring non-linear behaviours with both chaotic, unpredictable patterns and more structured, recognizable ones (plus interesting melanges of the two). The scientific literature about the topic is vast; we only refer, here, to May [9] for an introduction, and to Collet [10] for analytical insight. Similar methods have been proposed and adopted as interesting controls for granular synthesis [11], [12], [13], [14], [15], [16].
When options 4-7 are selected, the user must declare the coefficient a of the particular difference equation, which influences the global behaviour of the iterated equation, and the initial condition $x_0$, the initial state of the iteration, which determines the sequence of the following states (slightly different initial states cause completely different sequences having the same overall behaviour). In the course of the synthesis process, the coefficient a can vary if the user typed in a final value different from the starting one. The linear variation of the coefficient corresponds to moving the pointer history through different regions in the bifurcation diagram proper to the equation; different regions give
1) CONSTANT-VARIABLE STEP GRANULATION
2) GRANULATION WITH BROWNIAN MOTION 1/f^2 noise
3) GRANULATION WITH GAUSSIAN DISTRIBUTION
4) GRANULATION WITH EQ. "DISCUBIC"   x_n = (1-a) x_{n-1} + a x_{n-1}^3
5) GRANULATION WITH EQ. "LOGISTIC"   x_n = a x_{n-1} (1 - x_{n-1})
6) GRANULATION WITH EQ. "VERHLUST"   x_n = (1+a) x_{n-1} - a x_{n-1}^2
7) GRANULATION WITH EQ. "MAY"        x_n = 1 - a x_{n-1}^2

Fig. 1 ICMS granular processing menu
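
The four difference equations of options 4-7 in Fig. 1 can be sketched as plain functions. This is only an illustration, not the ICMS code: the function names and the `iterate` helper are hypothetical, and the coefficient 4.0 in the demonstration is chosen simply because the logistic map is known to be chaotic there.

```python
# Iterated maps of Fig. 1 (options 4-7); names are illustrative only.
def discubic(x, a):
    # x_n = (1 - a) x_{n-1} + a x_{n-1}^3
    return (1.0 - a) * x + a * x ** 3


def logistic(x, a):
    # x_n = a x_{n-1} (1 - x_{n-1})
    return a * x * (1.0 - x)


def verhulst(x, a):
    # x_n = (1 + a) x_{n-1} - a x_{n-1}^2
    return (1.0 + a) * x - a * x * x


def may(x, a):
    # x_n = 1 - a x_{n-1}^2
    return 1.0 - a * x * x


def iterate(step, x0, a, n):
    """Return the n states x_1..x_n produced by the map `step` from x0."""
    states, x = [], x0
    for _ in range(n):
        x = step(x, a)
        states.append(x)
    return states


if __name__ == "__main__":
    # Two slightly different initial states soon give different sequences.
    s1 = iterate(logistic, 0.10000, 4.0, 60)
    s2 = iterate(logistic, 0.10001, 4.0, 60)
    print("largest gap:", max(abs(u - v) for u, v in zip(s1, s2)))
    # With a = 2 the same map settles on the stable point 1 - 1/a = 0.5.
    print("a = 2 settles at:", iterate(logistic, 0.1, 2.0, 50)[-1])
```

Varying `a` between calls, rather than keeping it fixed, would correspond to moving through different regions of the bifurcation diagram as described in the text.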

rise to different micro-structural conditions in the generated sound. The value of the current state $x_n$ of the iterated model is rescaled and utilized as the pointer to the source soundfile (or as the value of some synthesis parameter within the defined range). This projects the conditions of regularity or chaos of the iterated model onto the output sequence of grains, which can yield a variety of effects. Timbral events and sound textures can be synthesized according to the evolution of the difference equation adopted; one must experiment with good combinations of synthesis parameters and equation parameters to find the most appropriate mappings.

Fig. 2 shows the synthesis parameters shared by all options. For some parameters, a tendency-mask control is available, which makes the range of possible values change through time. Value assignment, in that case, is done using a random number generator (gaussian distribution). All this applies to the parameters "grain duration", "grain delay" (time delay between two grains), "grain amplitude" (amplitude rescaling factor) and the portion of the source soundfile submitted to granulation. To avoid unexpected effects, the parameters "grain duration", "grain delay" and "scanning step" (increment of the pointer to the source soundfile) must be assigned carefully related values. The values of the remaining parameters must be well "tuned" in order to minimize (or maximize) side-effects due to the periodic phase roll-off in the generated sound. A small range of random values in these parameters helps to avoid such by-products, if necessary.
The remaining parameters are switches. If "on", they cause phase-level modifications in the generated grain, such as writing samples backwards [14], adding an amplitude offset, phase inversion and a number of repetitions of the current grain before generating the next one. This last is a kind of "group delay unit" with effects similar to comb-filtering by-products; however, the amount of delay ("grain delay") can be dynamically controlled, here, since it can be dependent on the feature particular to the selected option of granular processing.
If active, each modification is actually operated only when an internal white-noise random generator gives a number greater than 0.5 in the range 0-1; so, phase inversion, backwards writing of samples, etc. apply only for
*** GRANULAZIONE CON PUNTATORE VAR. DA EQ. "LOGISTIC"
COEFFICIENTI DI MISSAGGIO C1, C2            0.00000   1.00000
PUNTO SOVRAPPOSIZIONE FILE-LAV.             0                   MSEC
CANALE FILE-AUX (1=MONO,2=STERE01,ECC.)     1
LIMITI INF. FILE-AUX SU CUI GRANULARE       0         0         MSEC
LIMITI SUP. FILE-AUX SU CUI GRANULARE       5000      15000     MSEC
PERIODO MIN MODULAZIONE GRANO               10        10        MSEC
PERIODO MAX MODULAZIONE GRANO               40        70        MSEC
RITARDO MINIMO FRA I GRANI                  0         5         MSEC
RITARDO MASSIMO FRA I GRANI                 10        30        MSEC
RISCALAMENTO MINIMO DI AMPIEZZA             0.10000   0.50000
RISCALAMENTO MASSIMO DI AMPIEZZA            1.00000   1.70000
NUMERO DI ITERAZIONI                        10000
RETROGRADAZIONE DEL GRANO (N=0,Y=1)         1
SOMMA AL GRANO UN OFFSET IN % AMPIEZZA      0
NUMERO DI RIPETIZIONI DEL GRANO             1
INVERSIONE FASE RIPETIZIONI (N=0,Y=1)       1
ESPONENTE DELLA FUNZIONE COSINUSOIDALE      4.00000
PARAMETRO DI CONTROLLO EQUAZIONE (0-4)      3.74700   3.789
VALORE X0 INGRESSO EQU. (0.00001-.99999)    0.10000
RITORNO==>"ANNUL"   HELP==>"AP1"

Fig. 2 ICMS granulation parameters with "logistic" equation
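
As a rough illustration of how settings like those of Fig. 2 might drive the three grain-generation steps (compute the pointer, read n samples, envelope and write), here is a minimal sketch. It is not the ICMS implementation: all function names are hypothetical, the envelope $1 - |\cos\omega t|^p$ is assumed to span one half-period of the cosine per grain, and tendency masks, random switches, grain repetitions and recursive mixing with already stored samples are left out.

```python
import math


def grain_envelope(n, p=4.0):
    """Envelope 1 - |cos(w t)|^p at n points, with w t running from 0 to pi.

    Assumption: one half-period spans the grain, so the envelope is 0 at
    both ends and 1 in the middle. Requires n >= 2.
    """
    return [1.0 - abs(math.cos(math.pi * i / (n - 1))) ** p for i in range(n)]


def logistic_pointers(a_start, a_end, x0, n_iter, lo_ms, hi_ms):
    """Logistic-map states rescaled to pointer positions in [lo_ms, hi_ms].

    The coefficient a moves linearly from a_start to a_end over the run.
    """
    x, pointers = x0, []
    for k in range(n_iter):
        a = a_start + (a_end - a_start) * k / max(n_iter - 1, 1)
        x = a * x * (1.0 - x)
        pointers.append(lo_ms + x * (hi_ms - lo_ms))
    return pointers


def granulate(source, sr, pointers_ms, grain_ms, delay_ms, p=4.0, gain=1.0):
    """Write one enveloped grain per pointer into a new target list."""
    n = max(2, int(sr * grain_ms / 1000))   # samples per grain
    hop = n + int(sr * delay_ms / 1000)     # grain length plus grain delay
    env = grain_envelope(n, p)
    if not pointers_ms:
        return []
    target = [0.0] * ((len(pointers_ms) - 1) * hop + n)
    for g, ptr in enumerate(pointers_ms):
        # Step 1: pointer into the source, clamped so a whole grain fits.
        start = min(max(int(sr * ptr / 1000), 0), len(source) - n)
        # Step 2: read n samples.
        chunk = source[start:start + n]
        # Step 3: envelope, rescale and write into the target.
        base = g * hop
        for i, s in enumerate(chunk):
            target[base + i] += gain * env[i] * s
    return target


if __name__ == "__main__":
    sr = 8000
    # Five seconds of a 220 Hz sine stand in for the source soundfile.
    source = [math.sin(2 * math.pi * 220 * t / sr) for t in range(5 * sr)]
    ptrs = logistic_pointers(3.747, 3.789, 0.1, 200, 0.0, 5000.0)
    out = granulate(source, sr, ptrs, grain_ms=10, delay_ms=5)
    print(len(out), "samples written")
```

The coefficient range 3.747 to 3.789, the initial state 0.1, the exponent 4 and the 0-5000 ms source limits are taken from the first column of Fig. 2; the sample rate and test tone are arbitrary choices for the demonstration.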

approximately 50% of generated grains. That produces aleatory micro-modulations and, together with the effects of other parameters, gives the sound characteristics of "turbulence" and irregularity in the time domain. In the frequency domain, spectral lines corresponding to the energy of grains turn into broader frequency bands and may even act as a proper formant structure. Perceptually richer sounds are obtained, with complex micro-time details and, perhaps, noisy components. Finally, the parameter p ("cosinusoidal function exponent") determines the shape of the grain envelope according to the equation

$$s(t) = 1 - |\cos\omega t|^{p}$$

Recursive processes

Each time the system writes new samples in the target soundfile, it rescales and mixes them with already stored samples (rescaling factors are declared in the synthesis parameters submenu, see fig. 2). That gives the user the opportunity of layering any number of streams of grains, if rescaling and grain delay are well-studied. The superposition of entire generations of grains yields very high granular densities and gives the sound a certain smooth continuity, notwithstanding the underlying discrete-time representation. Perception detects this character of continuity, even if the sound retains characteristics of roughness and richness due to irregularities in its detailed evolution. Too many superpositions, however, result in the irreversible degradation of sound, because of too many micro-modulations and a lack of local regularities in the signal.
Furthermore, interesting sounds are produced by granular
processing of previously generated streams of grains. The utilization of difference equations in such a process yields peculiar results when stable points arise: grains accumulate around particular moments in time, and such behaviours in density are perceived as "spontaneous" amplitude curves, in themselves consistent but fragmented and not perfectly smooth. This possibility was explored in "zeitwerk (l'orizzonte delle cose)" (1992), whose realization included no amplitude envelope generators. Rather, all musical gestures were brought forth by this process of accumulation of grains and by recursive processing using larger and larger grain durations (up to 0.2 sec.). Internally articulated textures of a somewhat noisy nature were synthesized, reminiscent of natural, environmental phenomena. All four sections of the piece were composed using nothing but 8 sinusoidal, fixed-frequency grains (of 48, 105, 232, 511, 1124, 2473, 5442 and 11972 Hz). These basic elements were stored in a single sourcefile utilized all through the compositional process. The challenge consisted in confining sonic design to time-domain operations exclusively. No matter how different the musical events heard in the piece sound to perception, their global morphology reflects only different properties of continuity, discontinuity, linearity and non-linearity in the "local", short-term structure of grains.
A frequency representation, to be sure, would highlight the invariable presence of the above listed frequencies. Spectral energy, however, appears more spread around those formant peaks than precisely concentrated. The listener cannot really grasp this stable formant structure, which remains at a subliminal level for the entire duration of the piece. The four sections sound radically different, if not contrasting, in their general timbral character, exactly because of the different ways grains follow each other and overlap, i.e. because of the micro-time structural properties of the sonic materials.
A final aspect that Di Scipio should stress concerns the sense of "temporal horizon" he tried to capture. Each section results from a quasi-deterministic process, and develops as long as transformations of timbre and novel gestures arise. If a "termination-point" is reached, meaning that the sonic materials seem no longer susceptible to evolution, then the process stops and the next musical section begins. For each section, this process is activated with quasi-identical initial conditions, but the resultant long-term musical structure features timbral evolutions of its own. (In this sense, the concept of sensitive dependence on the initial conditions, which is fundamental in chaos theory, acted as more than a suggestive metaphor in the application of the design strategy.) Notice that what he called "termination-point" was not a "goal" orienting the whole process; rather it came out of the low-level strategy only after the latter was activated. Indeed, it was impossible to foresee for how long the process would give rise to fresh materials, i.e. to predict the temporal horizon of the materials.
This impossibility gives the music a sense of continual, and not
particularly goal-oriented, exploration.

Final observations

Independently of the implementation (in the near future in a real-time version on a NeXT computer), further work should include the automation of operations that currently can only be done "by hand", including the processing of previously generated streams of grains and the layering of arbitrarily large numbers of streams. Some sort of synthesis by rules can be devised, based on the particular microstructural approach. In that case, a single rule may instantiate multiple operations in the realization of an entire process. This higher-level approach would represent a "step towards the abstract" for a perspective of sonic design which, by definition, is closely bound to the level of sound materials.

References

[1] Roads, C., Asynchronous granular synthesis, in "Representations of musical signals", MIT Press, 1991.
[2] Di Scipio, A., Microstructural time modelling of sound. A perspective of sonic design, Proc. "2nd International Workshop on Models and Representations of musical signal", 1992.
[3] Laske, O., Towards an epistemology of composition, Interface 20(3-4), 1991.
[4] Di Scipio, A., La musica di due culture. Tracce di una mutazione, in "Musica e scienza. Il margine sottile", ISMEZ, 1991.
[5] Bregman, A., Auditory Scene Analysis, MIT Press, 1990.
[6] Tisato, G., Interactive Computer Music System. Manuale operativo. Centro di Calcolo di Ateneo, Università di Padova, 1990.
[7] Tisato, G., Un sistema interattivo per la sintesi dei suoni e la loro analisi mediante elaboratore, Proc. II CIM, AIMI, 1977.
[8] Jones, D. & Parks, T.W., Generation and composition of grains for music synthesis, Computer Music Journal, 12(2), 1988.
[9] May, R., Simple mathematical models with very complicated dynamics, Nature (261), 1976.
[10] Collet, P. & Eckmann, J.P., Iterated maps on the interval as dynamical systems, Birkhauser, 1980.
[11] Di Scipio, A., Composition by exploration of non-linear dynamical systems, Proc. of the ICMC, 1990.
[12] Di Scipio, A., Caos deterministico, composizione e sintesi del suono, Atti del IX CIM, AIMI/DIST, 1991.
[13] Di Scipio, A., An overview of digital sound synthesis by models of non-linear dynamic systems, Bulletin of the Inst. of Mathematics and its Application, South-end-on-sea, 1992.
[14] Truax, B., Chaotic non-linear systems and digital synthesis. An exploratory study, Proc. of the ICMC, 1990.
[15] Truax, B., Real-time granular synthesis with the DMX-1000. Software documentation. S. Fraser Univ., 1991.
[16] Hamman, M., Mapping complex systems using granular synthesis, Proc. of the ICMC, 1991.

BISPECTRUM OF MUSICAL SOUNDS:
AN AUDITORY PERSPECTIVE

Shlomo Dubnov and Naftali Tishby
Institute for Computer Science and
Center for Neural Computation
Hebrew University, Jerusalem 91904, Israel
and
Dalia Cohen
Department of Musicology
Hebrew University, Jerusalem 91904, Israel
E-mail: dubnov@cs.huji.ac.il

1 Introduction

Research into the realm of musical timbre has recently become one of the major research topics, central to the understanding of music and of musical practice itself. With more and more musicians and scientists involved in the research, contributions considering the artistic, psychophysical, cognitive and physical-mathematical properties of musical signals shed new light on this fascinating subject. In our research we chose to treat the issue mainly from the mathematical point of view, regarding the acoustic signal as a stochastic process and suggesting new aspects for its treatment. In contrast to other musical properties such as pitch, interval and meter, which we clearly perceive and whose physical characterization we then search for, the timbral parameter has neither a simple perceptual characterization nor obvious physical properties. For these reasons, the research of timbre sometimes takes a reverse methodology, i.e. starting with a physical-mathematical characterization of the acoustic signal, we seek its perceptual meaning. Stochastic processes, such as acoustic signals, are characterized in general by an infinite series of correlation functions. An important subset of these processes, known as Gaussian, are completely determined by their autocorrelation, or equivalently,
their power spectrum. Much of acoustic signal processing so far is based on power-spectral properties. This is mainly because linear systems are fully determined by their effect on the spectrum, and linear systems are sufficient to describe most acoustic phenomena, and are of course easier to understand. Yet, musical instruments have highly nonlinear characteristics which affect their tone, timbre, and sound quality. In this paper we suggest the use of higher-order statistics (polyspectra) [1] [2] for the analysis and evaluation of acoustic signals and instruments. In recent years bispectral methods have been applied in various signal processing fields, such as sonar, radar, image processing, adaptive filtering, etc. [3] [4]. Surprisingly, polyspectra have hardly been used so far in auditory and acoustic signal processing, primarily due to the difficulties in their estimation and analysis. These higher order correlations, known as cumulants, and their associated Fourier transforms, known as polyspectra, not only reveal all the amplitude information of the process, but also maintain the phase information. Since for Gaussian processes all the high (greater than second order) cumulants vanish, the third and fourth order polyspectra provide an indication of the non-Gaussian nature of a random process. These mathematical facts have interesting parallels in the acoustic realizations of signals, on which we focus in this paper.
Polyspectra are the natural mathematical generalization of the power spectrum and as such naturally provide the next step in acoustic research. The bispectra contain more information about the signal than the power spectrum, particularly about its phase. In the case of skewed signals, for instance, which have non-zero bispectrum, the signal can be reconstructed from its bispectrum up to a constant time shift, much more than can be achieved from the power spectrum alone. In addition, it is easy to analyze and manipulate the bispectral characteristics of signals in linear systems and in quadratic non-linear systems. From the physical point of view, the bispectral parameters correspond, in some models, to specific mechanisms, such as characteristics of reverberant environments, or the non-Gaussian nature of a source signal passing through a resonator. This correspondence provides us with an insight into the questions of modeling particular systems, and suggests new techniques for signal manipulation and synthesis.
Finally, the acoustic perception of the bispectrum is one of the main issues of this study. We present several ideas and experiments concerning aspects of acoustic perception, sound quality, and the nature of the human auditory apparatus that are directly related to bispectra.

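The statement that third-order cumulants vanish for Gaussian processes but not for skewed ones can be checked numerically in its simplest form, the zero-lag third-order cumulant (the third central moment). The sketch below is only an illustration under that simplification: it does not estimate a full bispectrum, the function names are hypothetical, and the exponential distribution is used merely as a convenient example of a skewed signal.

```python
import random


def third_cumulant(x):
    """Zero-lag third-order cumulant: the third central moment of the samples."""
    n = len(x)
    mean = sum(x) / n
    return sum((v - mean) ** 3 for v in x) / n


def skewness(x):
    """Third cumulant normalized by the 3/2 power of the variance."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x) / n
    return third_cumulant(x) / var ** 1.5


if __name__ == "__main__":
    rng = random.Random(0)
    gaussian = [rng.gauss(0.0, 1.0) for _ in range(20000)]
    skewed = [rng.expovariate(1.0) for _ in range(20000)]
    print("Gaussian noise:   ", skewness(gaussian))   # close to zero
    print("exponential noise:", skewness(skewed))     # clearly non-zero
```

The normalization by the variance plays the same role here as the bicoherence index does for the bispectrum: it makes the measure independent of the overall signal level.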
The following issues are discussed in this paper:

• Polyspectral criteria for "quality" design of musical instruments.

• Effects of reverberation and chorusing on the perception of tone color. Within this framework an artificial all-pass reverberator is demonstrated.

• Tone separation by means of bispectral detection: questions of timbral fusion/segregation are believed to be influenced by the presence of a strong bispectral ingredient. The ear, though almost "blind" to phase, is sensitive to long term phase behavior. This phase coherence is clearly detected by bispectral estimators.

2 Mathematical Preliminaries

In this section, multiple correlations and cumulants of time signals are defined. To simplify matters, the discussion focuses on discrete signals. Some of the important relationships and properties of finite impulse response (FIR) signals and linear, time invariant systems are briefly reviewed. The results for general non-linear systems are more complex and require the use of Volterra-Wiener system theory [6], which is beyond the scope of this presentation.

2.1 Multiple Correlations and Cumulants

Extracting information from a signal is a basic question in every branch of science. The lack of complete knowledge of the signal exists in many physical settings due to the nature of the observed signal or the type of measurement devices. In information processing we encounter the inverse problem: given the signal, we want to extract information from it in order to perform basic tasks such as detection and classification. We presume that any biological information processing system acts in a similar manner. For instance, our ears perform analysis of the acoustic signal by extracting pitch and timbre information from it. To understand our motivation to study higher order correlations it is worthwhile to recapitulate briefly some of the reasons for using the ordinary double correlation. A customary assumption is that our ears perform spectral analysis of the incoming signal. Naturally, not all of the signal information is retained in our ears, and the simplest assumption is that the phase is neglected. It is well known that the squared amplitude of the Fourier spectrum equals the Fourier transform of the signal's autocorrelation. This double correlation in the time domain is the basic type of information extracted from the signal by our ears. This information has the meaning of the signal's
spectral envelope in the frequency domain. Now we intend to widen the scope of acoustical analysis by suggesting the use of triple, quadruple and higher correlations, which are also known as polyspectra in the frequency domain.
The kth-order correlation $h_k(i_1,\dots,i_{k-1})$ of a signal $\{h(i)\}_{i=0}^{N}$ is defined as:

$$h_k(i_1,\dots,i_{k-1}) = \sum_{i=0}^{N} h(i)\, h(i+i_1) \cdots h(i+i_{k-1}) \qquad (1)$$

and in the frequency domain it corresponds to the kth-order spectrum:

$$H_k(\omega_1,\dots,\omega_{k-1}) = H(\omega_1) \cdots H(\omega_{k-1})\, H(-\omega_1 - \dots - \omega_{k-1}) \qquad (2)$$

Under some common assumptions, the time domain correlation converges to the kth-order moment of the process. The kth-order cumulant is derived from the kth and lower order moments, and contains the same information about the process. We prefer to use cumulants in our definition of spectra since for Gaussian processes all cumulants of order higher than two vanish. For zero mean sequences, the second and third order moments and cumulants coincide. Thus we arrive at an equivalent definition of the kth-order spectrum as the (k-1)-dimensional Fourier transform of the respective kth-order cumulant of the process.
Let $y(i)$ be the output of an FIR system $h(i)$, which is excited by an input $x(i)$:

$$y(i) = \sum_{j=0}^{N} h(j)\, x(i-j) \qquad (3)$$

Using the definition (1) it is easy to show that:

$$y_k(i_1,\dots,i_{k-1}) = \sum_{j_1,\dots,j_{k-1}=-N}^{N} h_k(j_1,\dots,j_{k-1})\, x_k(i_1-j_1,\dots,i_{k-1}-j_{k-1}) \qquad (4)$$

where $y_k$, $h_k$, $x_k$ are defined as in (1). Further, employing (1) and (2) we arrive at the frequency domain relations:

$$Y_k(\omega_1,\dots,\omega_{k-1}) = H_k(\omega_1,\dots,\omega_{k-1})\, X_k(\omega_1,\dots,\omega_{k-1}) \qquad (5)$$

An important property of the polyspectra is that if we are given two signals f and g that originate from stochastically independent processes and their sum signal z = f + g, then:

$$Z_k(\omega_1,\dots,\omega_{k-1}) = F_k(\omega_1,\dots,\omega_{k-1}) + G_k(\omega_1,\dots,\omega_{k-1}) \qquad (6)$$

This property is important when considering the perception of simultaneously sounding independent signals, as will be discussed later.

3 Sound Quality of Musical Instruments

The first attempts to use bispectral considerations for sound quality characterization can be traced back to Gerzon [5].
For a musical tone, the power spectrum analysis shows how the fundamental frequency and its higher harmonics combine to form the timbre of the tone. However, the power spectrum, being "phase-blind", cannot reveal the relative phases between the sound components. Although the human ear is almost deaf to phase differences, it can perceive time-varying phase differences. The bispectral analyzer is the generalization of the power spectrum to the third order statistics of the signal. The bispectrum reveals both the mutual amplitude and phase relation between the frequency components $\omega_1$, $\omega_2$. If sound sources are stochastically independent, their bispectra will be the sum of their separate bispectra. In order that a bispectral analyzer should be able to recognize the characteristic signature of the sound in the bispectral plane, the excitation at a given $\omega_1$, $\omega_2$ should be distinguishable from the background noise. Thus, a "good" instrument is supposed to produce the maximum bispectral excitation possible for a given signal energy. Stating the problem as "can we predict the properties of a Stradivarius?", Gerzon claimed that the design requirement for musical instruments is that "they should have a third formant frequency region containing the sum of the first two formant frequencies". Surprisingly enough, this theoretical criterion seems to be satisfied by many orchestral instruments. Examples are particular cases of Stradivarius violin (435 Hz, 535 Hz, 930 Hz), Contrabassoon (245 Hz, 435 Hz, 690 Hz) and English Horn (985 Hz, 2160 Hz, 3480 Hz). In a later work, Lohmann and Wirnitzer [11] analyzed two flutes by calculating their bispectra. Their results demonstrate that a higher intensity of the phase of the complex bispectra is achieved for the flute of good quality. This also suggests that the intelligibility of speech could be determined by looking at the bispectral signature and might even be enhanced by adding an artificial third formant at the sum of the momentary two lowest formant frequencies. Such a device can be easily constructed by means of a quadratic filter or other non-linear speech clipping system. One must note that such
a simple device will modify the spectrum also, which might be undesirable.

4 Effects of Reverberation and Chorusing

Other more subtle problems of intelligibility can be considered by looking at the effects of reverberation and chorusing. This being an important musical issue, we note, quoting Erickson [9], that "there is nothing new about multiplicity and the choric effect. What is new is the radical extension of the massing idea in contemporary music, and the range of its musical applications; but a great deal more needs to be known before the choric effect is fully understood or adequately synthesized".
As mentioned previously, if the sounds are stochastically independent, then their bispectra will simply be the sum of the separate bispectra. Assume a sound source with energies $S_1, S_2, S_3$ at frequencies $\omega_1$, $\omega_2$, $\omega_3 = \omega_1 + \omega_2$ and bispectrum level $B$ at $(\omega_1, \omega_2)$, subject to a reverberation effect. Now let us assume that this effect can be modeled as a linear filter acting as a reverberator added to the direct sound. Suppose that the effect of the reverberation alone is to produce a proportionate spectrum energy $kS_1, kS_2, kS_3$ at $\omega_1, \omega_2, \omega_3$.
A plausible model for the linear filter describing the reverberator part alone could be an approximation of its impulse response by a long sample of a random Gaussian process. According to Eq.(10), the bispectral response of such a filter is zero, which results in zero bispectrum of the output signal. The total resultant signal contains a (stochastically independent) mixture of the direct and the reverberant sound. The spectral energy of the combined sound will be $(1+k)S_1, (1+k)S_2, (1+k)S_3$ at $\omega_1, \omega_2, \omega_3$, and the bispectrum level remains $B$ at $(\omega_1, \omega_2)$. Naturally, the proportion of the bispectral energy to the spectral energy of the signal has deteriorated. For a signal with complex spectrum $I(\omega)$ the power spectrum equals $S(\omega) = |I(\omega)|^2$ and the bispectrum is:

$$B(\omega_1, \omega_2) = I(\omega_1)\, I(\omega_2)\, I(-\omega_1 - \omega_2)$$

Taking a bicoherence index:

$$b(\omega_1, \omega_2) = \frac{B(\omega_1, \omega_2)}{\sqrt{S(\omega_1)\, S(\omega_2)\, S(\omega_1 + \omega_2)}}$$

we arrive at a dimensionless measure of the proportionate energy between the spectrum and the bispectrum of a signal. If $b = b_{in}$ for the original signal, then after reverberation $b_{out} = (1+k)^{-3/2}\, b_{in}$. Thus for a reverberation energy gain k, the relative bispectral level has been reduced by a factor $(1+k)^{-3/2}$ [5]. Now consider the very similar effect of chorusing. For N identical but stochastically independent sound sources the resultant spectral energies at $\omega_1$, $\omega_2$,
$\omega_3 = \omega_1 + \omega_2$ are $NS_1, NS_2, NS_3$ and the resulting bispectrum is $NB$ at $(\omega_1, \omega_2)$. Comparing again the bicoherence indexes we arrive at $b_{out} = N^{-1/2}\, b_{in}$, giving a relative attenuation of $N^{-1/2}$ due to this chorus effect. It is worth mentioning once again the importance of stochastic independence. The chorusing as described above might be confused with a simple multiplication of the original signal energy by a gain factor N. Such a gain is not stochastically independent and the resulting bispectrum would be augmented by $N^{3/2}$ instead of N. Only a true lack of coherence between the replicated signals will cause the resulting bispectrum to be actually $NB$.

4.1 Experimental results

In order to demonstrate the above effects, we have performed an analysis of sampled signals of a solo instrument (Solo Viola) and of an orchestral section of the same instruments (Arco Violas). (The signals were recorded from a sample-player synthesizer and are believed to be true recordings of the above instruments.)
The signals have very similar spectral characteristics and the "chorusing" feature, dominantly present in the "Arco Violas" signal, cannot be extracted from the spectral information alone. It is manifested, though, in the signal's bispectral contents. We plotted the amplitude of the bicoherence index for each of the two signals. As we can clearly see from Fig. 1, there is a significant reduction of the bispectral amplitude for the "Arco Violas" signal. Note also that the bispectral excitation pattern is different for the two signals, with the "Solo Viola" signal having few clear peaks while the "Arco Violas" signal has a much more spread and noise-like pattern.

Figure 1: Bicoherence index amplitude of Solo Viola and Arco Violas signals. The x-y axes are normalized to the Nyquist frequency (8 kHz).

4.2 Artificial all-pass filter

As seen from Eq. (5), the bispectrum of the output signal $y(i)$ resulting from passing a signal $x(i)$ through
a linear filter h(i) equals the product of their respective bispectra.
An equivalent relation holds for the linear random process, i.e. when
the output signal results from passing a stationary random signal
through a deterministic linear filter.

Consider now a device whose impulse response resembles a long segment of
a Gaussian process. Although the filter is, on the whole, deterministic,
it can be considered a random signal for any practical purpose.
Applying, for instance, a bispectral analyzer of finite temporal
aperture to such an impulse response would average its bispectral
contents to zero, giving us a filter with zero bispectral
characteristics. Naturally, the output signal resulting from passing a
deterministic signal through such a filter will have a zero bispectrum.
Since the impulse response resembles a white noise signal, its spectral
characteristics are flat, giving us an all-pass filter. Also, by
properly scaling the impulse response we can ensure that the filter gain
equals 1.

The following figure describes the result of passing the original "Solo
Viola" signal through a linear filter whose impulse response was created
by taking a 0.5 sec. sample of a Gaussian process. The bispectral
analysis of the signal was performed by averaging over 32 frames of
16 msec. each. The subjective auditory result seems to resemble a
reverberation device. Fig. 2 shows the bicoherence index of the signal
after filtering.

Figure 2: Bicoherence index amplitude of the output signal resulting
from passing the "Solo Viola" signal through a Gaussian 0.5 sec. long
filter.

5 Tone Separation and Timbral Fusion/Segregation

Among the various questions dealing with the timbral characteristics of
sounds, the problem of concurrent timbres [7] [8] is basic to musical
practice itself, manifesting itself in daily orchestration practice, in
the choice of instruments and in the ability to perceive and
discriminate individual instruments in a full orchestral sound. The
problem was originally treated in a semi-empirical way by the
orchestration manuals, which presented vague criteria for evaluating
orchestral choices. In recent times, more quantitative acoustical
studies have pointed out several features in the temporal and spectral
behavior of sounds which are pertinent for instrument recognition and
for modeling spectral blend. None of these attempts has realized the
power of polyspectral techniques for the analysis of spectral blend.

One of the most basic applications of bispectral methods reported in the
literature is the detection of phase coupling between harmonic
components of a signal [3]. Such phase coupling exists in most sounds
produced by a single musical instrument (except idealized sine-tone
generators). As claimed in section 3, the two strongest lowest-frequency
components of a musical signal are assumed to be harmonically related to
another strong component at a higher frequency. In other words, strong
coupling should exist between two harmonic components of a sound and a
component at their sum frequency. Since the power spectrum suppresses
all phase relations, it cannot provide the answer. The bispectrum,
however, is capable of detecting and quantifying phase coupling. An
illustrative example of such an effect can be found in [3], which treats
a 'quadratic phase coupling' phenomenon that occurs due to quadratic
nonlinearities existing in a stochastic process. The non-zero bispectrum
does not depend on this particular mechanism, and it will hold for any
case of statistical dependence between the phases. Such dependence is
natural for musical sounds, as mentioned above.

Having at our disposal such a powerful tool for detecting coherence
between spectral components of a signal, we claim that the ear performs
grouping of the various spectral components present in the sound,
relating strong bispectral peaks to one source or another. Thus,
spectral blend would be a blend between bispectral patterns, with sounds
of close bispectral signature being impossible to separate by a
bispectral analyzer. Concluding this discussion, we must mention that
this bispectral mechanism is one among many others that influence tone
color separation/blending.

6 Conclusions and Future Trends

This paper presented an important additional characteristic of musical
timbre which originates in the non-Gaussian and non-linear
characteristics of the signal. The higher order spectral characteristics
provide an "explanation" of various phenomena in auditory perception,
which are basic to the understanding of the perception of tone color and
are central to musical research and practice. Currently we are
interested in applying the above ideas to a simple model that assumes a
linear auto-regressive filter driven by a white non-Gaussian noise (WNG)
excitation. Although the details of this work are beyond the scope of
this presentation, we shall mention that this model enables us to "lump"
all of the non-Gaussian/polyspectral properties
of the signal upon the characteristics of the input. The purpose of the
work is to derive an extended feature representation of the musical
signal, which would capture more signal characteristics than the
standard representations such as, for instance, the widely used
LPC-coefficient-based representation [12] [13] [14].

The relevance of these ideas to music extends beyond the mathematical
modeling of timbre. Direct applications lie in the field of musical tone
synthesis, in understanding the mechanisms responsible for timbral
fusion and segregation, and also in the field of music theory. We expect
that the bispectral characteristics would be closer to the musician's
characterization of the 'density' or 'complexity' of the musical sound.
Informally, one could say that the bispectrum contributes to some sort
of 'focusing' quality of the signal, thus enabling us to distinguish
between 'focused' timbre and 'dispersed' or 'chorused' timbre. These
random fluctuations between the harmonic partials of the sound add a
sense of vitality to the signal, and we might suggest that there exists
an analogy between this dispersion and the known meaningful variations
in other musical parameters, such as the quality of intonation [15].

There is plenty of room for research in this area, with several lines of
investigation to be pursued. Mainly, there are many open
psychoacoustical questions that need study and extensive
experimentation. Naturally, the implications of such results for
modeling of the auditory mechanism could be very substantial. Also, it
is widely recognized that most timbral characteristics are time
dependent and thus cannot be analyzed by the stationary methods
discussed in this work. An extension of polyspectral methods to
transient signal analysis seems to yield promising results [16]. One
could study the use of nonlinear filters [17] [18] for processing audio
signals and obtaining better control over the higher order spectra.
Additionally, adaptation of the above techniques to the investigation of
other musical parameters is suggested. Also, it might be possible to
systematize the rules of orchestration or tone-color production in
orchestral and electronic music by using a bispectral description. We
hope to draw a bispectral description in a manner similar to the way the
spectrum-related musical staves have systematized speech. If such a
far-fetched ideal could be accomplished, it would provide a huge leap
towards creating a 'timbral composition' methodology, so desperately
sought in our days.
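The chorus-effect argument of Section 4 (a sum of $N$ independent voices scales the third-order statistics by $N$, while a coherent energy gain of $N$ scales them by $N^{3/2}$) can be checked numerically. The sketch below is only a time-domain analogue, not the bispectral analysis used in the paper: it uses the zero-lag third-order cumulant and the skewness (third cumulant normalized by the variance to the power 3/2) as a stand-in for the bispectrum and the bicoherence index. The exponential noise source and all function names are illustrative assumptions.

```python
import math
import random


def variance(x):
    """Second central moment of a sequence."""
    m = sum(x) / len(x)
    return sum((v - m) ** 2 for v in x) / len(x)


def third_cumulant(x):
    """Zero-lag third-order cumulant (third central moment)."""
    m = sum(x) / len(x)
    return sum((v - m) ** 3 for v in x) / len(x)


def skewness(x):
    """Third cumulant normalized by variance**1.5 (a bicoherence-like index)."""
    return third_cumulant(x) / variance(x) ** 1.5


def source(n, rng):
    """Hypothetical non-Gaussian 'solo' signal: exponential noise."""
    return [rng.expovariate(1.0) for _ in range(n)]


rng = random.Random(0)
n_samples, n_voices = 60_000, 4

solo = source(n_samples, rng)

# Chorus: sum of N statistically independent voices.
voices = [source(n_samples, rng) for _ in range(n_voices)]
chorus = [sum(column) for column in zip(*voices)]

# Coherent gain: the same signal with its energy multiplied by N.
louder = [math.sqrt(n_voices) * v for v in solo]

# Normalized index: attenuated by N**-0.5 for the chorus, unchanged for the gain.
index_ratio_chorus = skewness(chorus) / skewness(solo)
index_ratio_louder = skewness(louder) / skewness(solo)

# Unnormalized cumulant: grows by N for the chorus, by N**1.5 for the gain.
cumulant_gain_chorus = third_cumulant(chorus) / third_cumulant(solo)
cumulant_gain_louder = third_cumulant(louder) / third_cumulant(solo)
```

With four voices the normalized index of the chorus should come out near one half of the solo value, while the merely louder signal keeps its index unchanged, which is the distinction the text draws between true chorusing and a gain factor.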
References

[1] K. Hasselmann, W. Munk, G. MacDonald, "Bispectra of Ocean Waves", in
Proceedings of the Symposium on Time Series Analysis, Brown University,
June 11-14, 1962 (ed. Rosenblatt), New York, Wiley, 1963.

[2] D.R. Brillinger, "An Introduction to Polyspectra", Ann. Math. Stat.,
Vol. 36, 1361-1374, 1965.

[3] C.L. Nikias, M.R. Raghuveer, "Bispectrum Estimation: A Digital
Signal Processing Framework", Proceedings of the IEEE, Vol. 75, No. 7,
July 1987.

[4] J.M. Mendel, "Tutorial on Higher-Order Statistics (Spectra) in
Signal Processing and System Theory", Proceedings of the IEEE, Vol. 79,
No. 3, July 1991.

[5] M.A. Gerzon, "Non-Linear Models for Auditory Perception", 1975,
unpublished.

[6] M. Schetzen, "Nonlinear System Modelling Based on Wiener Theory",
Proceedings of the IEEE, Vol. 69, No. 12, July 1981.

[7] S. McAdams, A. Bregman, "Hearing Musical Streams", Computer Music
Journal 3(4):26-43, 60, 63, 1979.

[8] S. McAdams, "Spectral Fusion, Spectral Parsing and the Formation of
Auditory Images", Ph.D. dissertation, Stanford University, CCRMA Report
no. STAN-M-22, Stanford, CA, 1984.

[9] R. Erickson, "Sound Structure in Music", Berkeley, CA, University of
California Press.

[10] F. Winckel, "Music, Sound and Sensation", New York, Dover, 1967,
pp. 12-23, 112-119.

[11] A. Lohmann and B. Wirnitzer, "Triple Correlations", Proceedings of
the IEEE, Vol. 72, No. 7, July 1984.

[12] R. Cann, "An Analysis/Synthesis Tutorial", Computer Music Journal
3(3):6-11; 3(4):9-13, 1979; and 4(1):36-42, 1980.

[13] P. Lansky, "Compositional Applications of Linear Predictive
Coding", Current Directions in Computer Music Research, MIT Press, 1989.

[14] R. Gray, A.H. Gray, G. Rebolledo, J.E. Shore, "Rate-Distortion
Speech Coding with a Minimum Discrimination Information Distortion
Measure", IEEE Transactions on Information Theory, 27(6), November 1981.

[15] D. Cohen, "Patterns and Frameworks of Intonation", Journal of Music
Theory, 1969.

[16] J.R. Fonollosa, C.L. Nikias, "Wigner Higher Order Moment Spectra:
Definition, Properties, Computation and Application to Transient Signal
Analysis", IEEE Transactions on Signal Processing, 41(1):245-266,
January 1993.

[17] G.L. Sicuranza, "Quadratic Filters for Signal Processing",
Proceedings of the IEEE, 80(8):1263-1285, August 1992.

[18] I. Pitas, A.N. Venetsanopoulos, "Nonlinear Digital Filters", Kluwer
Academic Publishers, 1990.
REAL-TIME CONTROL OF GRANULAR
SAMPLING VIA NONLINEAR PROCESSES USING
THE IRCAM SIGNAL PROCESSING
WORKSTATION
Cort Lippe

IRCAM, 31 rue St-Merri, Paris, 75004, France


email: lippe@ircam.fr

Introduction.

Interest in granular synthesis, along with compositional strategies for
exploring this technique, has been presented by various composers [1],
[2]. More recently, important compositional and technical results have
been presented in the domain of real-time granular sampling [3], which
has proven to be a powerful technique for timbral transformation of
sampled sounds in real time. This paper discusses essential differences
between granular synthesis and granular sampling techniques and
describes a musical application using the IRCAM Signal Processing
Workstation (ISPW) [5] in which granular sampling is controlled via
nonlinear processes in a real-time compositional environment [6], [7],
[8]. Compositions by the author for instruments and live ISPW are
presented which make extensive use of "expressive" control of granular
techniques via the detection and tracking of musical parameters of live
instruments in real time [9], [10]. Finally, a user interface, developed
by the author using the program Max [11], [12] running on the ISPW, is
described.

Granular Techniques.

A simple description of a granular synthesis model includes the
following constants: a sinusoidal waveform, a bell-shaped amplitude
envelope with a duration of 20 milliseconds, and an overlap time of 5
milliseconds between successive grains. Grains of sound, produced at a
high rate of speed, are usually overlapped with neighboring grains in
order to produce a certain density and continuity of sound. Pitch,
maximum amplitude of individual grains, and grain density (rate of grain
production and overlap of successive grains) may be considered as
compositional variables in this model.

The technique of granular sampling involves the application of the
above-described technique
whereby the waveform used in granular synthesis is replaced by a small
chunk of sampled sound. Thus, for each grain, the onset time into a
sampled sound becomes a compositional variable, along with the pitch,
amplitude, and grain density. In a more detailed model of granular sound
production, the waveform (or sampled sound), envelope description, grain
duration, spatial location of each grain, etc., may also be considered
as compositional variables.

With this large palette of available parameters, it is clear that an
immense quantity of data may be required in choosing individual values
for each grain of sound. Historically, compositional algorithms have
often been employed to automate these choices. The practicality and
necessity of automating control of granular parameters was obvious to
Xenakis, who, prior to working with granular techniques in electronic
music, had already explored similar problems in the instrumental domain
during the 1950's with works such as Metastaseis, in which he employed
techniques akin to a kind of "granular" conception of instrumental
music.

Compositional Implications.

While granular synthesis and granular sampling are variants of the same
technique, their musical essences lie at opposite poles of the
electronic music paradigm. One is immediately confronted, historically
speaking, with the two main categories of electronic music: granular
synthesis is elektronische Musik, making use of purely synthetic sounds,
while granular sampling is part of the world of musique concrète, in
which recorded sounds are manipulated and transformed. As the Canadian
composer Jean Piché has suggested, granular sampling is an "input
dependent" technique. Thus, using granular techniques on sampled sounds
offers an obvious level of musical implication which does not exist in
granular synthesis: one is acting on and transforming a pre-existing
sound object.

As mentioned above, in granular synthesis the parameters most often
controlled algorithmically are the pitch, amplitude, and density of
grains. While the ordering of grains in a coordinate space is
calculated, giving some sort of density distribution, the concept of
grains in an ordinal sense remains somewhat abstract. The results of
arbitrarily different orderings yield sounding variants which, although
possibly very different in nature, remain abstract synthetic sounds.
Since the synthetic waveform used in granular synthesis is replaced by a
small portion of a stored sampled sound in granular sampling, an
additional parameter exists: onset time into the stored sound. This
additional parameter can be of primary importance in granular sampling.
No longer a kind of "commutative" or arbitrary parameter, grain order
may have important consequences, creating an implicit hierarchy of
parameters. Using spoken speech as a sampled sound, if onset times
descend in an ordinal fashion from high to low, while density
distributions of all other parameters are randomly calculated, the
sounding result will always be recognized as spoken speech played
backwards, even though variants may sound quite different. Furthermore,
one of the principal musically interesting characteristics of granular
sampling is the ability to "deconstruct" sounds via the manipulation of
onset times, moving between the boundaries of recognizability and
non-recognizability on a continuum.

ISPW User Interface.

Using Max on the ISPW, I constructed an interface for controlling
granular sampling in real time. Parameter settings can be given using
sliders and number boxes, and parameters can be changed independently
over time by automating control processes. Max also allows for real-time
switching from one sampled sound to another, either by reading elsewhere
in memory, loading soundfiles from disk, or sampling anew (all of which
can be done while the granular reading continues to take place).
Independent granular sampling tasks can run at the same time. Each task
can produce a single stream of grains or multiple, simultaneous grain
attacks (the number of tasks and the number of overlapping grains within
a task being limited by real-time constraints). A recent addition to the
system allows for real-time mixing and sampling of the granular output
of simultaneous tasks, which then may be reused as stored samples for
other granular sampling tasks. This "recursive" aspect offers
exponential increases in densities, and a musically "reflexive"
dimension, namely, the ability to recall earlier musical material in a
real-time context.

Initial Experiments.

My initial experiments with granular sampling were extremely simple. The
auditory result of randomly choosing onset times into a stored sound,
while producing a single stream of grains at the original pitch of the
sound, is fairly statistical. Using samples of instrumental music, one
has the impression with certain phrases that, for example, an entire
10-second phrase is sounding simultaneously. This is not surprising,
since any onset is just as possible as any other, and since, in using
20-millisecond grain durations with overlaps of 5 milliseconds between
successive grains, more than 60 grains are produced each
second. (Increasing the overlap time between grains will greatly
increase the density of grains per second.) One sampled clarinet phrase,
in particular, made up of approximately 5 seconds of rapid short notes
and then a 5-second held note, was noteworthy because of the
omnipresence of the long note in the statistical sound mass. It was
immediately obvious that the musical content of the stored sounds being
operated on was not a trivial aspect of the procedure, and that mapping
algorithmic calculations onto a stored sound might produce more or less
successful results if the musical content of the sound was taken into
account.

Nonlinear Control.

A first attempt at controlling granular sampling using nonlinear mapping
was simply to choose grains statistically within defined "tendency
masks" (constantly moving windows with varying sizes in which grains are
statistically chosen). For instance, a window with the size of a single
grain moving forward in a sound, which expands to the size of a full
10-second stored sample over a specified time, produces a sound which
begins untransformed and, over time, becomes a statistical sound mass.
These tendency masks of constantly moving window sizes and window
locations can be used to read through sounds quite freely in a kind of
statistical "scrubbing" fashion, creating more or less recognizable
playback of the original sounds with a rich amount of timbral variation.
Random walks through the sound can be calculated and combined with
control over the numerous other parameters available: pitch, amplitude,
choice of stored sample, envelope description, grain duration, rate of
grain production, overlap of grains, and spatial location of each grain,
giving one a vast amount of transformational flexibility. My composition
Music for Clarinet and ISPW employs granular sampling, making use of
tendency masks to control virtually all parameters of granular sampling.
In a work in progress, algorithms have been tried for controlling
different parameters using chaotic equations. These algorithms can be
predictively and easily controlled, enabling smooth transitions from the
seemingly random towards stability.

Mapping Musical Expression.

The compositions mentioned above involve the use of live performers.
Since the ISPW offers tools for real-time audio signal analysis of
acoustic instruments for the extraction of musical parameters, another
level of control over the granular sampling comes directly from the
performers, giving musicians a degree of expressive control over the
electronic transformations. In Music for Clarinet and ISPW, the sampled
sounds used for granular sampling are taken directly from the performed
score, either sampled on-the-fly, or prerecorded and loaded into memory
during performance. Continuous pitch and amplitude tracking of a
performance offers musically relevant data which can be used to control
aspects of an electronic score, and perceptually create coherence
between the instrument and electronics. In the clarinet piece,
continuous pitch data taken from the clarinet is used to control the
pitch of grains, and continuous amplitude data controls the windowing of
the tendency masks of certain parameters. In a work in progress, control
information gathered via spectral analysis is applied to parameters,
thus allowing for timbral control of the sampling by way of instrumental
color changes. (See figure 1 below.)

Figure 1. Mapping performer expression (the diagram shows the derivation
of discrete and continuous control signals, such as articulation, note
density, center of spectral mass, pitch stability, etc.).

Conclusion.

Granular sampling is a powerful tool for transforming sampled sounds.
Control of granular sampling via nonlinear processes in a real-time
compositional context, and via continuous control signals made available
by the detection and tracking of musical parameters of live instruments
in real time, offers composers and performers a rich palette of
possibilities. In addition, sampling of the output of a performer in a
real-time environment, while allowing the performer a certain degree of
control over the granular sampling of this same material, can ultimately
provide an instrumentalist with a high degree of intimate expressive
control over an electronic score.

Acknowledgements.

I would like to thank Miller Puckette, Jean Piché, and Agostino Di
Scipio for their invaluable technical and musical insights.

References.

[1] I. Xenakis. Formalized Music. Bloomington: Indiana University Press,
1971. (Pendragon, 1991.)
[2] C. Roads. "Automated Granular Synthesis of Sound." Computer Music
Journal 2(2):61-62, 1978.
[3] B. Truax. "Real-Time Granulation of Sampled Sound
with the DMX-1000." In J. Beauchamp, ed. Proceedings of the 1987
International Computer Music Conference. San Francisco: International
Computer Music Association, 1987.
[4] D. Jones and T. Parks. "Generation and Combination of Grains for
Music Synthesis." Computer Music Journal 12(2):27-34, 1988.
[5] E. Lindemann, M. Starkier, and F. Dechelle. "The IRCAM Musical
Workstation: Hardware Overview and Signal Processing Features." In S.
Arnold and G. Hair, eds. Proceedings of the 1990 International Computer
Music Conference. San Francisco: International Computer Music
Association, 1990.
[6] B. Truax. "Real-Time Granular Synthesis with a Digital Signal
Processor." Computer Music Journal 12(2):14-26, 1988.
[7] R. Waschka and A. Kurepa. "Using Fractals in Timbre Construction: An
Exploratory Study." Proceedings of the 1989 International Computer Music
Conference. San Francisco: International Computer Music Association,
1989.
[8] A. Di Scipio. "Composing with Granular Synthesis of Sound in the
Interactive Computer Music System." In D. Smalley and N. Zahler, eds.
Proceedings of the Fourth Biennial Arts & Technology Symposium. New
London: Center for Arts & Technology at Connecticut College, 1990.
[9] C. Lippe and M. Puckette. "Musical Performance Using the IRCAM
Workstation." In B. Alphonce and B. Pennycook, eds. Proceedings of the
1991 International Computer Music Conference. San Francisco:
International Computer Music Association, 1991.
[10] D. Wessel, D. Bristow and Z. Settel. "Control of Phrasing and
Articulation in Synthesis." Proceedings of the 1987 International
Computer Music Conference. San Francisco: International Computer Music
Association, 1987.
[11] M. Puckette. "The Patcher." In C. Lischka and J. Fritsch, eds.
Proceedings of the 1988 International Computer Music Conference. San
Francisco: International Computer Music Association, 1988.
[12] M. Puckette. "Combining Event and Signal Processing in the Max
Graphical Programming Environment." Computer Music Journal 15(3):68-77,
1991.
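As an illustration of the granular sampling model described in the preceding paper (20-millisecond grains with a bell-shaped envelope, 5 milliseconds of overlap, and onsets drawn statistically inside a tendency mask that moves forward while widening), a minimal offline sketch could look as follows. It is not the ISPW/Max implementation: the sample rate, the Hann envelope, the particular mask shape and all function names are assumptions made for the example.

```python
import math
import random

SR = 8000          # sample rate in Hz (assumed; kept low for a quick demo)
GRAIN_MS = 20      # grain duration given in the text
OVERLAP_MS = 5     # overlap between successive grains given in the text


def hann(n):
    """Bell-shaped amplitude envelope of n samples."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def tendency_mask(t, max_onset):
    """Allowed onset window (lo, hi) at normalized time t in [0, 1].

    The window follows an untransformed reading position while widening,
    so that it covers the whole stored sound as t approaches 1.
    """
    position = t * max_onset
    lo = position * (1.0 - t)
    hi = position + (max_onset - position) * t
    return int(lo), int(hi)


def granulate(stored, out_seconds, rng):
    """Single stream of grains read from `stored` at its original pitch."""
    grain_len = SR * GRAIN_MS // 1000
    hop = SR * (GRAIN_MS - OVERLAP_MS) // 1000   # 15 ms between grain starts
    envelope = hann(grain_len)
    out_len = int(out_seconds * SR)
    out = [0.0] * (out_len + grain_len)
    onsets = []
    for start in range(0, out_len, hop):
        lo, hi = tendency_mask(start / out_len, len(stored) - grain_len)
        onset = rng.randint(lo, hi)              # statistical choice in the mask
        onsets.append(onset)
        for i in range(grain_len):
            out[start + i] += envelope[i] * stored[onset + i]
    return out, onsets, hop


rng = random.Random(1)
# Hypothetical stored sound: one second of a 220 Hz tone.
stored = [math.sin(2 * math.pi * 220 * n / SR) for n in range(SR)]
out, onsets, hop = granulate(stored, 1.0, rng)
grains_per_second = SR / hop
```

With these settings a new grain starts every 15 milliseconds, so `grains_per_second` is about 66.7, in line with the "more than 60 grains" per second mentioned in the text. The first grain is read from onset 0 (untransformed playback), and later onsets scatter over a progressively larger portion of the stored sound.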
AS4
A program for analysis, separation and synthesis of
musical signals spectrum
Sandro Mariuz
C.S.C. - DEI, Università degli Studi di Padova
Via Gradenigo, 6A 35100 Padova Italy
fax: +39 49 8287699

ABSTRACT

The main objective of AS4 is to provide a study environment for musical
signals that allows a large number of transformations to be applied to
the analysis data before synthesis.
The processing of a sound is closely tied to its structure and first
requires an evaluation of the signal that identifies the main
characteristics affected by the applied transformations.
An aspect that is sometimes neglected, and that makes the difference
between a synthetic sound and a real one, is the presence of a noisy
component in the latter (think of the continuous rubbing of a
guitarist's fingertips while playing, or of the breath present in the
sound of a flute). This component influences, to a greater or lesser
degree, the quality of the processed sound as the model used to
represent the original signal changes.
If we consider a real sound as composed of a deterministic component and
a noisy one, AS4 lets us study the characteristics of these two
components with different analysis and synthesis procedures.

1. INTRODUCTION

An often neglected aspect that distinguishes a synthetic sound from a
real one is the presence of a noisy component in the latter. This
component affects, to a greater or lesser degree, the quality of the
processed sound as the model adopted to represent the original signal
changes. The study of the spectral characteristics of this component,
together with the possibility of separating it from the rest of the
sound (and hence of processing it independently), becomes essential for
obtaining good-quality results as the structure of the sound varies.
2. ANALYSIS AND SYNTHESIS

From a theoretical point of view, AS4 rests on two assumptions:
1) every sound can be represented equivalently in the time domain by a
waveform or in the frequency domain by a set of spectra;
2) most sounds can be decomposed into a deterministic component (the
predictable part of the sound, approximated by a sum of time-varying
sinusoids) and a noisy component consisting of the unpredictable part of
the sound under examination [1].

The time-frequency framework in which AS4 operates allows a first stage
of analysis and synthesis based on the Short Time Fourier Transform
(STFT) and Sinusoidal models, using the overlap-add and additive
techniques. Starting from a waveform, a window is applied; the
subsequent FFT yields the magnitude and phase spectra to which the
desired transformations are applied. The window type, the window length,
the hop size and the FFT size are the fundamental parameters that
characterize analysis and synthesis. Synthesis is carried out by an IFFT
followed by overlap-add (when the Short Time Fourier model is used) or
by additive synthesis (a sum of sinusoidal waveforms previously
identified by a spectral analysis algorithm) when the sinusoidal model
of the musical signal is used.

During processing (time stretching and pitch transposition), problems
arise from the noisy component present in real sounds. This component
cannot be modeled well as a series of sinusoids and causes a noticeable
degradation in the quality of the synthesized sound.

To overcome this drawback, a real sound is considered as composed of a
deterministic component and a noisy one, and the two are separated so
that processing can be applied to each independently before the final
synthesis. The STFT and Sinusoidal models, besides being analysis and
synthesis tools in their own right, form the basis for the
Deterministic-residual and Deterministic-stochastic models that make
this separation possible.

If the noisy component is obtained as the time-domain difference between
the original signal and the deterministic component (previously
synthesized by additive synthesis), it is called the residual; if
instead it is obtained by a subtraction in the frequency domain, it is
treated as a stochastic signal whose synthesis uses a mixed technique
based on overlap-add and on the theoretical properties of this type of
signal.

Starting from a waveform, the STFT is used to obtain the spectrum of the
signal. A spectral peak detection algorithm identifies only the
sinusoidal components of the sound, which are synthesized by additive
synthesis, thus providing the deterministic component. To obtain the
residual, this waveform is subtracted in the time domain from that of
the original sound.

If instead the separation is carried out in the frequency domain, then
after the deterministic component has been obtained by the procedure
just described, the STFT is applied to it to obtain its spectrum;
finally, the magnitude of this spectrum is subtracted from that of the
original signal. This gives the magnitude spectrum of the noisy
component, which will be modeled as a stochastic signal. Since such a
signal is completely described by its amplitude and frequency
characteristics alone, it is not necessary to keep exact information on
both instantaneous frequency and phase.

By identifying the spectral envelope (obtained through a
piecewise-linear or stepwise approximation [2]) and with the aid of a
random number generator, the complex spectrum of the stochastic
component is constructed. Applying the IFFT followed by overlap-add then
synthesizes it.

Fig. 1 shows an example of the separation of the deterministic and
stochastic components of a flute sound.

3. CONCLUSIONS

The working environment created by applying the concept of separation of
the two components of the sound opens up many processing possibilities,
in addition to the traditional ones of altering the time and frequency
scales. Some examples are reverse synthesis, in which one or both
components have a reversed time axis, and cross synthesis, which
consists of applying the spectral characteristics of one sound to
another in order to create hybridization effects. Finally, the
possibility offered to the operator of acting on the spectral envelope
of the noisy component makes it possible to obtain sounds that differ
from those of the traditional instruments on which the separation was
performed, and consequently provides a system for producing totally new
sounds. Interesting results are also obtained by applying the separation
process to the voice.

The computation procedures used in AS4 were developed in Turbo Pascal
and produce files containing the result of the operation performed. If
further processing with other tools (phase vocoder ...) is desired, two
procedures are available that perform the required numerical
conversions.

Fig. 1. Complete sound, deterministic component, stochastic component.
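The synthesis of the stochastic component described in Section 2 (a piecewise-linear spectral envelope, random phases from a random number generator, then an inverse transform followed by overlap-add) can be sketched as below. This is a hedged illustration, not the AS4 Turbo Pascal code: the naive DFT, the frame size, the Hann window, the breakpoint values and all function names are assumptions chosen to keep the example self-contained.

```python
import cmath
import math
import random


def dft(x):
    """Naive discrete Fourier transform (adequate for small frames)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spec):
    """Naive inverse discrete Fourier transform."""
    n = len(spec)
    return [sum(spec[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]


def piecewise_linear(breakpoints, n_bins):
    """Interpolate (bin, magnitude) breakpoints onto bins 0..n_bins-1."""
    env = []
    for k in range(n_bins):
        for (k0, m0), (k1, m1) in zip(breakpoints, breakpoints[1:]):
            if k0 <= k <= k1:
                env.append(m0 + (m1 - m0) * (k - k0) / (k1 - k0))
                break
    return env


def stochastic_frame(envelope, rng):
    """One real noise frame whose magnitude spectrum follows `envelope`.

    `envelope` holds magnitudes for bins 0..N/2. Phases are random and the
    spectrum is made Hermitian-symmetric so that the frame is real.
    """
    half = len(envelope) - 1
    n = 2 * half
    spec = [0j] * n
    spec[0] = complex(envelope[0], 0.0)
    spec[half] = complex(envelope[half], 0.0)
    for k in range(1, half):
        spec[k] = cmath.rect(envelope[k], rng.uniform(-math.pi, math.pi))
        spec[n - k] = spec[k].conjugate()
    return [v.real for v in idft(spec)]


def overlap_add(frames, hop):
    """Window each frame (Hann) and sum the frames at multiples of `hop`."""
    n = len(frames[0])
    window = [0.5 - 0.5 * math.cos(2 * math.pi * i / n) for i in range(n)]
    out = [0.0] * (hop * (len(frames) - 1) + n)
    for f, frame in enumerate(frames):
        for i in range(n):
            out[f * hop + i] += window[i] * frame[i]
    return out


rng = random.Random(0)
# Hypothetical envelope: rises to bin 4, then decays towards the Nyquist bin.
envelope = piecewise_linear([(0, 0.0), (4, 1.0), (16, 0.1)], 17)
frames = [stochastic_frame(envelope, rng) for _ in range(8)]
noise = overlap_add(frames, hop=8)
```

Each frame has exactly the prescribed magnitude spectrum but a different random phase, which is why only the envelope (and no exact frequency or phase information) needs to be stored for this component.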

REFERENCES
[1] De Poli, G., A. Piccialli and C. Roads. 1991. Representations of
Musical Signals. Cambridge, Massachusetts: MIT Press.

[2] Mariuz, Sandro. 1992. Separazione della componente deterministica e
stocastica dei suoni. Tesi di laurea, Università di Padova, Padova.
SINTESI POLARE
Applicazioni in campo musicale di filtri
digitali operanti allimite della stabilita'

Antonio Pellecchia, Alessandra de Vitis

CRM, Centro Ricerche Musicali


Via La Marmora 18,00185 Roma

Abstract

This paper describes the fundamental aspects of the scientific research carried out by the CRM - Centro Ricerche Musicali in Rome on sonological and musical problems.

A requirement common to both the scientific and the artistic fields is experimentation on the audio signal: analysis and controlled manipulation of data, which are possible only with powerful numerical computation.

Advances in informatics and in audio signal technology have substantially changed both the working methods and the quality of sonological and musical research.

The technological peculiarity of the Fly30 system is described in terms of its hardware and software characteristics, as well as its possibilities for real-time applications.

The central subject of the paper is the implementation of special algorithms for the processing and synthesis of acoustic signals which do not make use of pre-memorized waveforms, but rather of numerical filters operating at the limit of stability.

The paper presents the Polar Synthesis system, which is described analytically and completed by application data.

Polar Synthesis Software

This section describes the software of the Fly30 system for the musical implementation of digital filters operating at the limit of stability.

Polar Synthesis is an interactive program running under the Microsoft Windows operating environment. The hardware it requires is a MIDI interface with a Windows driver and a digital signal processing board based on the TMS320C30 floating-point processor.

The algorithm on which the Polar Synthesis program is based draws on the classical digital signal processing literature, and in particular on Klatt's studies of voice synthesis through band-pass filters.

This algorithm uses a number of second-order digital filters placed in parallel or in cascade. The resonance frequency and bandwidth of each digital filter are controlled through virtual potentiometers or manual numeric entry.

The excitation source of the filters can be captured from the outside world through microphones, or it can be synthesized internally. In the latter case, waveforms can be generated by cyclic reading of pre-stored tables or by a pulse generator controllable in amplitude, frequency and duty cycle. A noise generator is used to give naturalness and a certain degree of randomness to the generated sound.

These possible sound sources are mixed together, producing a digital signal that is processed by the envelope generator. This realizes the amplitude function over time, starting from the NOTEON command coming from the MIDI interface and lasting until the NOTEOFF command for that key.

This digital signal is then used to trigger the oscillations of the five resonant filters that follow. Each filter implements a second-order recursive difference equation, made up of two delays and two multiplications by two coefficients that are nonlinearly related to the resonance frequency and to the bandwidth.

A graphic panel displays the polar diagram with the positions of the filter poles. With the mouse it is possible to move the poles and hear in real time the changes in sound caused by the changes in the filter coefficients.
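As an illustration of the second-order resonant filters described in this section, the following Python sketch implements one such recursive difference equation with two delays and two coefficients. The text does not give the mapping from resonance frequency and bandwidth to the coefficients; the form used here (pole radius r = exp(-pi*B/fs), pole angle 2*pi*f/fs) is a common textbook choice and is an assumption, as are the function names.

```python
import math

def resonator_coefficients(freq_hz, bandwidth_hz, sample_rate_hz):
    """Return (c1, c2) for y[n] = x[n] + c1*y[n-1] + c2*y[n-2].

    Assumed mapping (not taken from the paper): the pole pair sits at
    radius r = exp(-pi * B / fs) and angle theta = 2 * pi * f / fs.
    """
    r = math.exp(-math.pi * bandwidth_hz / sample_rate_hz)
    theta = 2.0 * math.pi * freq_hz / sample_rate_hz
    return 2.0 * r * math.cos(theta), -r * r

def resonator(x, c1, c2):
    """Run the second-order recursion over the input sequence x."""
    y1 = y2 = 0.0          # the two delay elements
    out = []
    for sample in x:
        y = sample + c1 * y1 + c2 * y2
        out.append(y)
        y2, y1 = y1, y
    return out

if __name__ == "__main__":
    c1, c2 = resonator_coefficients(1000.0, 50.0, 44100.0)
    impulse = [1.0] + [0.0] * 999
    response = resonator(impulse, c1, c2)
    print(c1, c2, response[:4])
```

With the bandwidth set to zero the pole radius becomes 1, so the poles lie on the unit circle and the filter works at the limit of stability, ringing indefinitely after a single trigger impulse.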

[Figure: screenshot of the Windows Polar Synthesis 1.0 program]

In the illustration, at the top left is the main window with the diagram of the algorithm: several sound sources are routed to an envelope generator and a filtering module, and then on to the output module. In the center, starting from the left, are the control windows, both numeric and with virtual potentiometers, for the various parameters of the algorithm: besides the level of each source there are the frequency control for the oscillators and the duty cycle control. At the bottom, highlighted, is the MIDI control: each MIDI key is associated with the entire set of parameters of the algorithm. At the bottom right is the envelope generator, which handles envelopes of 8 segments, optionally cyclic, with controls for key release and for sensitivity to key velocity. At the top right is the control of the second-order digital filters: resonance frequency and bandwidth. The digital filters can also be controlled by using the mouse to change the position of the poles corresponding to the filters.

Multiple Polar Oscillators

This section describes a method for synthesizing canonical waveforms such as the square wave, the triangle wave and the sawtooth wave by means of digital filters of order higher than two operating at the limit of stability.

The simple polar oscillator generates a pure sinusoidal component whose frequency and amplitude depend on its coefficients. The oscillation is started by a trigger impulse, which determines the amplitude of the generated wave. The finite difference equation on which the simple polar oscillator is based is the following:

$$y_n = 2\cos(2\pi f / f_c)\, y_{n-1} - y_{n-2}$$

where $f_c$ is the working sampling frequency and $f$ is the frequency of the generated sinusoid.

The problem is to generate waveforms with complex harmonic content without resorting to the superposition of several simple polar oscillators, each generating one of the harmonic components. To achieve this, the order of the finite difference equation must be increased so as to obtain more modes of oscillation. The number of harmonic components generated equals the number of poles with positive imaginary part and unit modulus, and therefore has a maximum value equal to half the order of the generating difference equation.

In the examples that follow, the order of the finite difference equations was chosen so as to keep the polar plots easy to read. Finite difference equations of arbitrarily high order can nevertheless be obtained simply by changing the degree of the equations given here.

Square Wave

The finite difference equation, the transfer function and the polar diagram of the digital filter generating a square wave are given here.

$$y_n = y_{n-1} - y_{n-8} + y_{n-9} + x_n - x_{n-8}$$

The fundamental frequency of the generated wave is proportional to the position of the first pole with

positive phase, and in this case it equals one sixteenth of the sampling frequency. The pole at zero frequency is cancelled by the zero at the same frequency, which guarantees that the generated wave has no mean value. As can be seen from the position of the poles, the filter generates all the odd harmonics of the fundamental up to half the sampling frequency.

Triangle Wave

The finite difference equation, the transfer function and the polar diagram of the digital filter generating a constant-sign triangle wave are now given.

The fundamental frequency of the generated wave is proportional to the position of the first pole with positive phase, and in this case it equals one sixteenth of the sampling frequency. One pole at zero frequency is cancelled by the zero at the same frequency, while the remaining pole at zero frequency determines the mean value of the waveform. In this case too, all the odd harmonics of the fundamental up to half the sampling frequency are generated.

Sawtooth Wave

The finite difference equation, the transfer function and the polar diagram of the digital filter generating a sawtooth wave are given here.

$$y_n = 2y_{n-1} - y_{n-2} + y_{n-8} - 2y_{n-9} + y_{n-10} + x_n - a\,x_{n-1} + a\,x_{n-8} - x_{n-9}$$

$$H(z) = \frac{(z^9 - a z^8 + a z - 1)\, z}{(z^8 - 1)(z - 1)^2}$$

The fundamental frequency of the generated wave is proportional to the position of the first pole with positive phase, and in this case it equals one eighth of the sampling

frequency. The three poles at zero frequency are cancelled by three zeros at the same frequency, so the generated wave has zero mean value. As can be seen from the position of the poles, all the even and odd harmonics of the fundamental up to half the sampling frequency are generated. The coefficient a must take the value (k-1)/(k-3), where k is the order of the filter (in this case 10).

Conclusions

This paper has presented some examples of the use of high-order digital filters for sound synthesis. These results were obtained thanks to the high computational precision of our real-time digital signal processing system, Fly30. Synthesis with polar oscillators does not suffer from the truncation and aliasing problems typical of table-lookup synthesis. Moreover, the use of digital filters allows the simulation of physical models of musical instruments, of sympathetic resonances between oscillating elements such as strings, of noisy transients in attacks, of listening environments and so on, all of which are impossible to obtain with table-lookup synthesis alone.

BIBLIOGRAPHY
[1] Adrien J., Causse M. and Ducasse E., 1988, "Sound synthesis by physical models, application to strings", in Proceedings 1988 Audio Engineering Society Convention, Paris, New York.
[2] Mathews M.V. and Pierce J.R., 1989, "Current Directions in Computer Music Research", Cambridge: MIT Press.
[3] De Vitis A., Pellecchia A., 1991, "Fly30: a DSP system for Real-time control of audio signals. Aspects of research and musical interaction", in International Workshop on Man-Machine Interaction in Live Performance, Pisa.
[4] De Vitis A., Lupone M., Pellecchia A., 1991, "CRM: from the Fly10 to the Fly30 system", in Atti del IX Colloquio di Informatica Musicale, Genova.
[5] De Vitis A., Pellecchia A., 1992, "Fly30: un sistema programmabile per l'elaborazione numerica dei segnali musicali in tempo reale", in Atti del XX Convegno Nazionale di Acustica in Italia, Roma.
[6] De Vitis A., Pellecchia A., 1992, "Music in discrete time", in International Workshop on Models and Representation of Musical Signal, Capri.
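The difference equations of the simple polar oscillator and of the square-wave filter presented in this paper can be checked numerically. The Python sketch below computes the impulse response of both; feeding the trigger impulse in as an added input term x_n in the oscillator recursion, and the function names, are assumptions made for illustration.

```python
import math

def polar_oscillator(freq_hz, sample_rate_hz, num_samples):
    """Impulse response of y[n] = x[n] + 2cos(2*pi*f/fc)*y[n-1] - y[n-2].

    The x[n] term is the assumed way of injecting the trigger impulse.
    """
    k = 2.0 * math.cos(2.0 * math.pi * freq_hz / sample_rate_hz)
    y1 = y2 = 0.0
    out = []
    for n in range(num_samples):
        x = 1.0 if n == 0 else 0.0      # unit trigger impulse
        y = x + k * y1 - y2
        out.append(y)
        y2, y1 = y1, y
    return out

def square_wave_filter(num_samples):
    """Impulse response of y[n] = y[n-1] - y[n-8] + y[n-9] + x[n] - x[n-8]."""
    x = [1.0] + [0.0] * (num_samples - 1)
    y = []

    def past(seq, n, delay):
        # Value seq[n - delay], taken as zero before the start of the signal.
        return seq[n - delay] if n - delay >= 0 else 0.0

    for n in range(num_samples):
        y.append(past(y, n, 1) - past(y, n, 8) + past(y, n, 9)
                 + x[n] - past(x, n, 8))
    return y

if __name__ == "__main__":
    # Eight samples at +1 followed by eight at -1: a period of 16 samples,
    # i.e. a fundamental at one sixteenth of the sampling frequency.
    print(square_wave_filter(32))
```

The oscillator output follows sin((n+1)w)/sin(w) with w = 2*pi*f/fc, so its amplitude depends on both the trigger impulse and the chosen frequency.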

ANALYSIS, SYNTHESIS AND
MODIFICATION OF PSEUDOPERIODIC
SOUND SIGNALS BY MEANS OF PITCH
SYNCHRONOUS TECHNIQUES

Piccialli A. , Cavaliere S. , Ortosecco I.

Dipartimento di Scienze Fisiche Universita' di Napoli


Mostra d'Oltremare PAD. 20
80125 Napoli
fax +39 81 2394508
E-mail acel@na.infn.it .

Introduction

In our communication, we revisit pitch synchronous techniques for the analysis and synthesis of quasi steady-state signals, on the basis of some new results. This approach is naturally connected to the techniques of synchronous granular synthesis (see [1]), giving them both a proper tool for the analysis of natural sounds and an efficient algorithm for resynthesis and modification.

It is well known that exact periodicity is never met in real sounds, from either music or voice: at most, real world acoustical signals show a clear underlying periodic feature; added to it, in pseudoharmonic sounds we can find fast attack transients, slow amplitude and frequency modulation, and finally aperiodic components.

The basic idea, therefore, is to separate the underlying quasi periodic feature from the aperiodic components, which can be done by means of properly tuned comb filters or related techniques. Only after this separation will both pitch detection and pitch synchronous analysis produce proper results,

avoiding inaccuracies coming from superimposed aperiodic components. On this basis, also, the characterization of the sound under analysis and the estimate of the underlying impulse response will yield more accurate results. For this estimate we propose a novel method of interpolation in the frequency domain, which gives good results both for resynthesis and modifications.

Finally the estimated impulse responses can be used for resynthesis and modifications, giving both efficient and high quality reconstruction.

Separating the pseudoperiodic features from noise and transients

As far as real world pseudoperiodic signals are concerned, the underlying periodic features can be hidden by aperiodic components such as noise, transients, and even slow modulations. These components, though relevant from a psychoacoustical point of view (e.g. see [2], [3]), can introduce severe errors in the estimation of both pitch period and pitch synchronous spectral components.

The signals to be analysed are generally time limited, and show an amplitude envelope acting as a time window on the oscillating components. The spectrum therefore is made of sharp bands tuned on a discrete set of harmonically related frequencies (see fig. 1).

Figure 1: Typical pseudoperiodic spectrum with superimposed Comb filter

The use of Comb filters, adaptively tuned on the frequency peaks (see fig. 1), allows the separation, on a proper scale, of the harmonic signal from the aperiodic components. Psychoacoustical criteria [3] can also be used to choose the proper scale of analysis, or equivalently the proper filter bandwidth. Two different approaches can be used for filtering:

• Use of Wavelet Comb Transform [4] (multiresolution analysis), allowing the decomposition of the signal at different scale levels. The so called residual at the properly chosen scale is the harmonic signal, while the other part is just the aperiodic component.

• Use of classical Comb filters, tuned on the frequency peaks of the spectrum (actually on the barycenter of a frequency band); they have been designed using a prototype passband with selected features: length, bandwidth and bandpass shape.

As a result of this analysis, fig. 2a shows a time domain signal together with its evaluated periodic and noisy components.

[Figure 2a: trumpet signal with its filtered and noisy components; panel titles "File: trumpet Filtered Signal" and "File: trumpet Noisy Signal"]

Extracting pitch-synchronous grains

The second step in the analysis of pseudoharmonic sounds is the extraction of pitch synchronous grains from a whole sound [5][6][7]. The separation described above eliminates errors in the evaluation of pitch periods due to noise, fast transients and sound articulation. Processing of the "clean" signal may nevertheless call for an upsampling of the signal [8], useful mainly because the pitch period does not equal an integer number of samples. This may happen when the starting sampling rate is not high enough, and when we are analyzing high frequency signals. In this circumstance proper upsampling with interpolating filters may produce significant improvements.

After that we proceed with the determination of pitch periods. Zero crossing detection leads to the determination of local energies, computed as the absolute value of the negative to positive transitions. Referring to an underlying physical model, excitation plus filtering, energy will have local maxima where the exciting pulses are located (see fig. 2).

Figure 2: Energy distribution for signal [ai]

Referring to the same

figure, it is a straightforward task to locate a grid in the time domain which will then ease the determination of the pitch synchronous transitions. Due to the preliminary separation, this will be possible even for low energy parts of the signal, or during transitions or sound/voice articulation. These transitions will allow the separation of individual frames, and will lead to a representation of the signal made of pitch synchronous vectors.

Figure 3: Pitch evolution for [ai]

Figure 4: Formant evolution for [ai]

Spectra of pitch synchronous grains and time/frequency representation

The main purpose of pitch synchronous techniques is the evaluation of the Fourier coefficients for the selected pitch period of the signal. Such analysis allows a careful estimate of the slow evolution over time of the significant features of the signals, avoiding the use of classical analysis tools such as windowing in the time domain. Using the DFT, we can then either describe the spectral content, frame after frame, or describe the evolution of each harmonic over time. The advantages of the proposed separation described above also apply here, leading to more accurate and reliable evaluation of all these parameters, discarding misleading contributions. In fig. 3 we show the evolution of the pitch and in fig. 4 the evolution of the first formant in the sound [ai].

Estimate of the pulse response of the individual grains

If the purpose of the analysis is to evaluate the parameters for resynthesis, then it is necessary to estimate the impulse response of the system under analysis (instrument or voice); that is, to obtain the proper deconvolution of the sound signal from the exciting pulses. If the signal belongs to a class for which a source excitation model is valid, as in the case of voice, we can use well known methods from the literature, such as linear prediction, homomorphic processing or Discrete All Pole modelling [10]. This is the case of fig. 5 for vowel [a] and fig. 6 for [i].

Figure 5: Impulse response for vowel [a]

Figure 6: Impulse response for vowel [i]

In the general case of harmonic signals, we propose the use of interpolation in the frequency domain. In fact the harmonic components of single grains can be seen as samples of the continuous Fourier Transform of the system (in the case of an underlying excitation mechanism); these samples are taken at pitch intervals. Therefore, using proper interpolating laws between the known frequency samples, we can make an estimate of the complete Fourier Transform, or at least of a sampled version of it on a frequency grid finer than the spectral lines of the starting grain. Using interpolating laws which better embody features of the underlying physical mechanism, we can make estimates of the FT and therefore of the impulse response of the system. The interpolating law can allow both reproduction of the sound and modification of its features. We show in fig. 7 the estimate of the pulse response of a violin and in fig. 8 that of a trumpet, obtained with this method.

Figure 7: Impulse response for a violin tone
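The frequency-domain interpolation idea can be sketched in Python with NumPy as follows. This is only a minimal illustration under stated assumptions: the interpolating law is plain linear interpolation of the magnitude spectrum, the phase is taken as zero, and the function name and the `oversampling` parameter are hypothetical; the paper does not specify which interpolating law it uses.

```python
import numpy as np

def estimate_impulse_response(grain, oversampling=4):
    """Estimate an impulse response from one pitch-synchronous grain.

    grain: samples of exactly one pitch period.
    Returns (response, fine_magnitude), where the response is
    oversampling times longer than the grain.
    """
    grain = np.asarray(grain, dtype=float)
    period = len(grain)

    # The DFT of one period gives samples of the spectrum at the harmonics.
    harmonics = np.fft.rfft(grain)
    magnitude = np.abs(harmonics)
    harmonic_index = np.arange(len(harmonics), dtype=float)

    # A finer frequency grid: `oversampling` points per harmonic spacing.
    response_length = period * oversampling
    fine_index = np.arange(response_length // 2 + 1) / oversampling

    # Assumed interpolating law: linear interpolation of the magnitudes.
    fine_magnitude = np.interp(fine_index, harmonic_index, magnitude)

    # Assumed zero phase; the result is rotated so the peak sits mid-buffer.
    response = np.fft.irfft(fine_magnitude, n=response_length)
    return np.roll(response, response_length // 2), fine_magnitude

if __name__ == "__main__":
    n = np.arange(64)
    grain = np.sin(2 * np.pi * n / 64) + 0.5 * np.sin(4 * np.pi * n / 64)
    response, fine_magnitude = estimate_impulse_response(grain)
    print(len(response), fine_magnitude[:9])
```

A finer frequency grid corresponds to a longer time support, which is why the estimated response spans several pitch periods. A different interpolating law, or a minimum-phase reconstruction, could be substituted where the physical mechanism suggests one.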

Figure 8: Impulse response for a trumpet tone

Resynthesis and modification

Resynthesis can be obtained using two channels: the first one carries pseudoharmonic signals, while the second one carries asynchronous inharmonic signals. The first part is the convolution of the impulse response with a pulse train (with proper amplitudes in order to obtain a requested sound envelope, and proper pitch evolution to obtain a requested sound inflection and articulation). This basic mechanism requires a very low processing effort, due to the zeros in the excitation.

More complex excitation and multipulse techniques [9] require higher computational rates but give high quality results both for reproduction of sound and its modifications.

The inharmonic part is simply superimposed with the proper amplitude envelope (it can be expressed with relatively few bits, due to the reduced dynamics, and also compacted in proper loops). Further analysis can be carried out to obtain the features and the evolution of this aperiodic part, leading, at least in some cases, to even more efficient resynthesis.

Using the above techniques some modifications of a starting sound, such as time stretching or lengthening, pitch modifications and cross synthesis, are easily performed with both efficiency and quality.

References

[1] De Poli G., Piccialli A., Roads C. "Representations of Musical Signals" MIT Press, 1991.

[2] Plomp R. "Aspects of Tone Sensation" Academic Press, New York, 1976.

[3] Lieberman P., Blumstein S.E. "Speech physiology, speech perception and acoustic phonetics" Cambridge University Press,

1988.

[4] Evangelista G. "Comb and multiplexed wavelet transforms and their applications to Signal Processing". IEEE Trans. on ASSP (coming issue).

[5] Medan Y., Yair E. "Pitch synchronous spectral analysis scheme for voiced speech" IEEE Trans. on ASSP, 1989.

[6] Hess W. "Pitch determination of Speech Signals" Springer Verlag, New York, 1983.

[7] Nathan K.S., Lee Y.T., Silverman H.F. "A time-varying analysis method for rapid transitions in speech" IEEE Trans. on SP Vol 39 n. 4 1991.

[8] Medan Y., Yair E. and Chazan D. "Super resolution pitch determination of speech signals" IEEE Trans. on SP Vol 39 n. 1 1991.

[9] Kroon P., Deprettere E.F., Sluyter R. "Regular-pulse excitation" IEEE ASSP vol ASSP-34 1986.

[10] El-Jaroudi A., Makhoul J. "Discrete All Pole Modelling" IEEE ASSP vol 39 n 2 1991.

MULTIPLE FEEDBACK DELAY NETWORKS FOR
SOUND PROCESSING

Davide Rocchesso

C.S.C. - D.E.I., Università degli Studi di Padova
via Gradenigo, 6/A 35100 PADOVA Italy
fax: +39498287699 E-mail: roc@paola.dei.unipd.it

1 Introduction

Recursive comb filters are widely used in signal processing, particularly in audio applications. Digital reverberators are often based on a parallel of comb filters or on a series of all-pass filters, obtained from the comb by addition of a feedforward path. Comb structures are also used for sound synthesis in techniques derived from physical models like the Karplus-Strong algorithm [4].

In the recent past, the necessity has been underlined [2, 1] of generalizing the comb filter through a multiple feedback delay network (MFDN) in order to obtain a sufficient time density coupled with a high frequency density in digital reverberators. I think that the properties of MFDN's are largely unknown and it would be useful to know much more about these structures to evaluate the possibility of using them in a wider range of applications.

The main purpose of this paper is to explore the mathematical properties of MFDN's, focusing on some possible applications, and to discuss some efficient realizations. For the moment we will focus on single-input-single-output networks, even if in many applications it is advantageous to adopt multi-input and/or multi-output; for example, this is the case of multichannel reverberators [8], where uncorrelated outputs are needed to enhance the spatial quality of sound.

2 Basic Formulation

Our MFDN is built by using N delay units. Each of them has a time length in seconds of $\tau_j = m_j T$, where $T = 1/F_s$ is the sampling period. The complete structure is reported in Figure 1 and is represented by the following relations:

$$\begin{cases} y(t) = \sum_{i=1}^{N} c_i s_i(t) + d\,x(t) \\ s_i(t + \tau_i) = \sum_{j=1}^{N} a_{i,j}\, s_j(t) + b_i\, x(t) \end{cases} \qquad (1)$$

where $s_i$, with $1 \le i \le N$, are the outputs of the delay lines and they are a subset of the state of this linear system. Using the Z-transform, we

can rewrite equations 1 in the following matrix representation:

$$\begin{cases} y(z) = c^T s(z) + d\,x(z) \\ s(z) = D(z)\,[A\,s(z) + b\,x(z)] \end{cases} \qquad (2)$$

where

$$s^T(z) = [\, s_1(z)\;\; s_2(z)\;\; \dots\;\; s_N(z) \,],$$
$$b^T = [\, b_1\;\; b_2\;\; \dots\;\; b_N \,],$$
$$c^T = [\, c_1\;\; c_2\;\; \dots\;\; c_N \,],$$

$$D(z) = \begin{bmatrix} z^{-m_1} & 0 & \dots & 0 \\ 0 & z^{-m_2} & \dots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \dots & z^{-m_N} \end{bmatrix}$$

will be called the "delay matrix" and

$$A = \begin{bmatrix} a_{1,1} & a_{1,2} & \dots & a_{1,N} \\ a_{2,1} & a_{2,2} & \dots & a_{2,N} \\ \vdots & & \ddots & \vdots \\ a_{N,1} & a_{N,2} & \dots & a_{N,N} \end{bmatrix}$$

will be called the "feedback matrix".

Figure 1: General Scheme for a Multiple Feedback Delay Network

From equations 2 we can easily obtain the transfer function of our MFDN:

$$H(z) = \frac{y(z)}{x(z)} = c^T [I - D(z)A]^{-1} D(z)\, b + d. \qquad (3)$$

Since $D(z)$ is diagonal, $D(z)^{-1} = D(z^{-1})$, and we can rewrite equation 3 as follows:

$$H(z) = c^T [D(z^{-1}) - A]^{-1} b + d. \qquad (4)$$

In order to obtain a state variable formulation of 1, we define its state as the collection of the outputs of each unitary delay. Under this definition, the state-transition matrix $\tilde{A}$ can be written in such a way that expression 5 holds.

[Equation (5): an expression for $\tilde{A}\tilde{A}^T$; the right-hand side is not recoverable from the source]

System poles can be found to be the solutions of either:

$$\det[A - D(z^{-1})] = 0 \qquad (6)$$

or:

$$\det[zI - \tilde{A}] = 0, \quad z \neq 0 \qquad (7)$$

From 5 it is clear that $\tilde{A}$ is unitary if and only if $A$ is unitary. It follows that if $A$ is unitary, then $\tilde{A}$ is unitary and its eigenvalues have magnitude one; in this case, system poles are of magnitude one and the MFDN has only non-decaying eigenmodes. J.M. Jot called this particular MFDN a "reference filter".

3 Some Useful Restrictions

We can further restrict the range of possibilities for matrix $A$ by imposing a circulant (Toeplitz) structure on it. It is known that this structure allows an easy computation of its eigenvalues from the Fast Fourier Transform of a row (or a column).

Consider the matrix $A$, having a Toeplitz structure:

$$A = \begin{bmatrix} a(0) & a(1) & \dots & a(N-1) \\ a(N-1) & a(0) & \dots & a(N-2) \\ \vdots & & \ddots & \vdots \\ a(1) & a(2) & \dots & a(0) \end{bmatrix}$$
We can say that the eigenvalues of $A$ are:

$$\{\lambda(A)\} = \{\Lambda(k)\} = \mathrm{DFT}([\,a(0)\;\; \dots\;\; a(N-1)\,]^T) \qquad (8)$$

If we have a matrix that is unitary and circulant at the same time, we have eigenvalues on the unit circle and can determine their exact phase. In this way we can know exactly the frequencies where resonances could occur. We are not sure that each system pole corresponds to a resonance because we have not considered the effect of zeros in 4; in particular, cancellations may occur, but this will be clearer later.

Example 1

Consider an MFDN with N = 3 and constrain the feedback matrix:

$$A = \begin{bmatrix} a_0 & a_1 & a_2 \\ a_2 & a_0 & a_1 \\ a_1 & a_2 & a_0 \end{bmatrix}$$

to be unitary and circulant. This leads to:

$$\begin{cases} a_0^2 + a_1^2 + a_2^2 = 1 \\ a_0 a_1 + a_1 a_2 + a_2 a_0 = 0 \end{cases} \qquad (9)$$

There are several solutions to this system. For example we can add the condition $a_0 = a_2$, in order to reduce the complexity of multiplications. If $a_0 = a_2$ we obtain:

$$a^T = \tfrac{1}{3}[\,2\;\; -1\;\; 2\,] \quad\text{and}\quad A = \tfrac{1}{3}\begin{bmatrix} 2 & -1 & 2 \\ 2 & 2 & -1 \\ -1 & 2 & 2 \end{bmatrix}$$

The eigenvalues are easily computed using a DFT:

$$\Lambda = \mathrm{DFT}\!\left(\tfrac{1}{3}[\,2\;\; -1\;\; 2\,]^T\right) \qquad (10)$$

and they have phases:

$$\phi = [\,0\;\; \tfrac{\pi}{3}\;\; -\tfrac{\pi}{3}\,]^T \qquad (11)$$

This example will be used in some illustrations that follow. □

Using N equal-length delay lines, where m is the length, one can spread $N \times m$ poles over the range $[0 \dots F_s]$ and one can compute the exact value of every pole. For the structure to be practically useful, an attenuation coefficient must be multiplied by the output of each delay line to give the feedback signal. Call

$$g = [\,g_1\;\; g_2\;\; \dots\;\; g_N\,]$$

the vector of these coefficients. In the special case where all the coefficients are equal we define $g_m \stackrel{\mathrm{def}}{=} g_i$, for every i. Unfortunately, for some choices of $b$ and $c$ this case reduces to a simple comb filter because of cancellations that occur in equation 4. If we change $b$ and $c$, other resonances appear around the preceding peaks.

If m is constant for all delay lines, equation 6 becomes:

$$\det[A - z^m I] = 0 \qquad (12)$$

and it is clear that system poles correspond to the m-th complex roots of $\{\lambda_i\}$. Suppose that we can arrange the system coefficients in order to have a number of resonances equal to the number of poles. If we define the frequency density $D_f$ as the number of resonances per Hertz, in our case we have:

$$D_f = \frac{Nm}{F_s} \qquad (13)$$

Define the time density $D_t$ as the number of pulses per second found

in the impulse response. We distinguish between the initial time density $D_{ti}$ and the final time density $D_{tf}$, where the former is computed at the beginning of the impulse response, and the latter is computed after the reverberation time $R_t$. $R_t$ is defined as the time taken for an attenuation of 60 dB in the impulse response.

In order to obtain dense reverberation after the first reflections (i.e. after 80 msec), we could use slightly different delay lengths, for example by changing each delay length by one or more samples. In this way, every $mT$ seconds, the time density is multiplied by N.

With N = 3, m = 882 and $F_s$ = 44100 Hz, the frequency density is $D_f = 0.06$, usually too low for good reverberation. With N = 8 we could have $D_f = 0.16$, which is enough for a good reverberator.

It is possible to control the decay characteristics by modifying the transfer function of each delay unit instead of modifying the feedback matrix. In general, if one inserts a gain $g_i$ at the output of each delay line in the MFDN, one can replace $D(z)$ with $D(z/\alpha)$ in equation 2 simply by respecting the following condition:

$$g_i = \alpha^{m_i} \qquad (14)$$

The advantage of choosing nearly-equal-length delay lines is that one can control the decay time with a single coefficient.

Suppose that $g_1 = \alpha^m$ and $g_2 = \alpha^{m+x}$, where x is a variation in length. Under these assumptions we have:

$$\frac{\partial g_2}{\partial x} = \ln\alpha \cdot \alpha^{m+x} \approx \ln\alpha \cdot \alpha^{m}$$

where the approximation is valid if $x \ll m$. The shape of $\partial g_2/\partial x$ is depicted in Figure 2, for $\alpha$: $0.99 \le \alpha \le 1$.

Figure 2: $\partial g_2/\partial x$

The first derivative of $g_2$ has a minimum at $\alpha = e^{-1/m}$, where it is equal to $-\frac{1}{em}$; with m = 882 it is not lower than $-4.2 \cdot 10^{-4}$. It means that, with m = 882, a variation of 20 units in the delay lengths implies a maximum deviation in the exact coefficient of $8.4 \cdot 10^{-3}$. Generally speaking, if $P_d$ is the percent deviation in delay length, it determines a maximum of $-P_d/e$ percent deviation in the exact coefficient. For example, with $g_m = 0.933$, a $P_d = 1\%$ produces a deviation of decay time of less than -0.107 seconds on an $R_t \approx 1.992$ sec.

There is a strict relationship between MFDN's and waveguides. Consider the parallel junction of N waveguides (e.g. cylindrical tubes). Let us name $a_{j,j}$ the reflection coefficient for the j-th waveguide and $a_{k,j}$ the transmission coefficient from the j-th to the k-th waveguide. If $\Gamma_j$ is the characteristic admittance of the j-th waveguide, then we have [7]:

[Equation (15) is not recoverable from the source]

In this way we obtain a scattering matrix that is symmetrical. If we properly scramble the order of incoming and outgoing waves at the parallel junction of three equal impedance waveguides, the matrix A of Example 1 can be viewed now as a scattering matrix.

Some better feedback matrices may be found, departing from a physical point of view and looking at the frequency response of the filter. In order to obtain colorless reverberation we would like to avoid clustering of resonance peaks. In principle, this could be achieved by choosing equally spaced eigenvalues for the feedback matrix, but this choice would lead to an identity matrix, which simply corresponds to a parallel of comb filters. This solution must be discarded because of its poor time response.

The latter observation looks like another form of the well-known "Indeterminacy Principle", always found in time-frequency analysis. In order to achieve a dense time response, we must look at solutions where the elements of the feedback matrix are all significantly non zero.

Example 2

A good solution for N = 3 is the following:

$$a^T = \tfrac{1}{\sqrt{3}}[\,\alpha\;\; -1+\alpha\;\; 1+\alpha\,] \quad\text{and}\quad A = \tfrac{1}{\sqrt{3}}\begin{bmatrix} \alpha & -1+\alpha & 1+\alpha \\ 1+\alpha & \alpha & -1+\alpha \\ -1+\alpha & 1+\alpha & \alpha \end{bmatrix}$$

where $\alpha = \tfrac{1}{\sqrt{3}}$. This choice gives the distribution of poles of Figure 3, given a delay length m = 8. □

Figure 3: Distribution of Poles for Example 2

4 Hints for Implementation

In section 3 we have shown a general way of finding feedback matrices, regardless of their dimension. The circulant structure offers the possibility of finding eigenvalues easily, and is suitable for a VLSI implementation because it involves a repetitive pattern of operations.

Nevertheless, even under this restriction, the product of an $N \times N$ matrix by an N-dimensional column involves $N^2$ multiplications/additions.

For low values of N it is possible to find matrices allowing a number of operations that is linear in N. This

is the case of the matrices of Examples 1 and 2. For the matrix of Example 2 we can arrange the operations associated with each delay line as in Figure 4.

Figure 4: Operations Associated with Every Delay Line of Example 2 (block labels: Sum of Outputs, multiplication by $\alpha$, Out_i - Out_j)

J.M. Jot [3] proposes to use matrices that are a generalization of that of Example 1 to any given dimension. We have just underlined the fact that they are related to the parallel junction of N waveguides. This choice suffers from the fact that for increasing dimensions the matrix tends to become an identity matrix, thus providing an unsatisfying temporal behavior. To overcome this drawback Jot proposes the use of block unitary matrices. Our approach of studying the position of eigenvalues via a short FFT allows one to experiment with many different structures for an arbitrary dimension, in order to find both an efficient implementation and a good distribution of poles.

Sometimes it could be convenient to depart slightly from reference filters by choosing matrices whose eigenvalues are not exactly one in absolute value. This choice can be justified by the fairly simple operations involved, and it does not dramatically change the spectral evolution, at least in the case of slight perturbations of a reference filter. Furthermore, starting from eigenvalues that are strictly inside the unit circle it is possible to design an MFDN without attenuation coefficients, thus saving N multiplications. In this case we can control reverberation time by using a table look-up that gives values for N coefficients, for each discrete value of the control variable.

A simple way of simulating frequency dependent losses is the use of interpolated delay lines. Some digital sound processors implement such delays in an efficient manner [6]. By choosing the interpolation coefficient one obtains a different amount of dissipation over frequency, having the maximum filtering effect at half the distance between two samples. This technique preserves the linearity of phase by giving a fractional constant delay at all frequencies. It also helps to increase temporal density by reducing the superimposition of echoes.

5 Simulating Resonators

MFDNs with short delay lines may be used to produce resonances irregularly spread over frequency. A possible application could be the simulation of resonances in the body of a string instrument. A well-known work by Mathews and Kohut [5] has shown that in this kind of simulation the exact position and height of the resonances are not important; rather, they stated that:
- the peak frequencies must be irregularly spaced with respect to the harmonic frequencies of any tone.
- the Q's of the resonances must be sufficiently large so the response curve is "steep" almost everywhere, that is to say, the magnitude of the derivative of the impulse response curve must be large almost everywhere.
- the peaks must be sufficiently close together so the depth of the intervening valleys does not exceed about 15 dB.
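
Returning to the circulant feedback matrices of Section 4, the idea of locating eigenvalues with a short FFT can be sketched as follows. This is a minimal illustration, not the authors' code: it uses a plain DFT written with the standard library, and the values `ALPHA` and `SCALE` are assumptions (both set to 1/sqrt(3), the values that would make the Example 2 matrix unitary), since they are not legible in the source.

```python
import cmath
import math


def circulant(first_row):
    """Build a circulant matrix: row i is the first row rotated right i times."""
    n = len(first_row)
    return [[first_row[(j - i) % n] for j in range(n)] for i in range(n)]


def circulant_eigenvalues(first_row):
    """Eigenvalues of a circulant matrix, computed as the DFT of its first row."""
    n = len(first_row)
    return [
        sum(c * cmath.exp(-2j * math.pi * j * k / n) for j, c in enumerate(first_row))
        for k in range(n)
    ]


# Assumed values (illegible in the source): they make the matrix unitary.
ALPHA = 1.0 / math.sqrt(3.0)
SCALE = 1.0 / math.sqrt(3.0)

# First row of the Example 2 matrix: SCALE * [alpha, -1 + alpha, 1 + alpha].
FIRST_ROW = [SCALE * ALPHA, SCALE * (-1.0 + ALPHA), SCALE * (1.0 + ALPHA)]

A = circulant(FIRST_ROW)
EIGENVALUES = circulant_eigenvalues(FIRST_ROW)
MODULI = [abs(lam) for lam in EIGENVALUES]  # all equal to 1 for a unitary matrix
```

Under these assumed values every eigenvalue lies on the unit circle, which is the reference (lossless) case; moving `ALPHA` slightly away from 1/sqrt(3) changes only the modulus of the eigenvalue associated with the sum of the outputs.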
Figure 5 reports the frequency responses of the filter having the matrix A of Example 1 and d = 0, for different values of the delay lengths, B and C. Acting on the delay lengths it is possible to move poles in frequency, while acting on B and C we can reshape the frequency response. The gain g determines the maximum peak-to-valley distance.

Figure 5: Frequency Responses for Different Values of B and C (three panels, with Delays [Samples] = [64 64 64], [64 59 67] and [64 62 67]; the B and C vectors printed with each panel are only partly legible in this copy)

Using unitary and circulant matrices we can compute the exact position of system poles under a small delay length deviation. System poles are the solutions of equation 6, which can be written in terms of the algebraic complements $D_{i,j}$:

\[ \det[A - D(z^{-1})] = (a_{1,1} - z^{m_1}) D_{1,1} - a_{1,2} D_{1,2} + \dots = 0 \qquad (16) \]

Putting $m_1 + \delta m$ in place of $m_1$ we obtain:

\[ (a_{1,1} - z^{m_1}) D_{1,1} - a_{1,2} D_{1,2} + \dots + z^{m_1} (1 - z^{\delta m}) D_{1,1} = 0 \]

If $\lambda$ is a system pole using $m_1$, suppose $\lambda + \delta\lambda$ is a pole using $m_1 + \delta m$. With the approximation

\[ (\lambda + \delta\lambda)^{m_1} \approx \lambda^{m_1} + m_1 \lambda^{m_1 - 1} \delta\lambda \qquad (17) \]

equation 16 becomes:

\[ m_1 \, \delta m \, \lambda^{-1} \, \delta\lambda^2 + (m_1 + \delta m) \, \delta\lambda - \lambda^{1 - \delta m} + \lambda \approx 0 \qquad (18) \]

This equation can be solved to give an approximate formula for $\delta\lambda$:

\[ \delta\lambda \approx \frac{-(m_1 + \delta m) \pm \sqrt{\Delta}}{2 \, m_1 \, \delta m \, \lambda^{-1}} \qquad (19) \]
where $\Delta$ is the discriminant of equation 18, and the - or + sign must be chosen according to the fact that a delay expansion implies a frequency contraction ($\delta f < 0$), while a delay contraction implies a frequency expansion ($\delta f > 0$).

In a practical case, we can give a worst-case value to the deviation $\delta\lambda$ by considering a deviation $\delta m$ equal to the maximum deviation for each delay line. In this case we have a change in every peak frequency equal to:

\[ \delta\lambda = \left( \frac{1}{m + \delta m} - \frac{1}{m} \right) m \lambda \]

References

[1] M.A. Gerzon, "Unitary (Energy Preserving) Multichannel Networks with Feedbacks", Electronics Letters 12(11): 278-279, 1976.
[2] J.M. Jot and A. Chaigne, "Digital Delay Networks for Designing Artificial Reverberators", 90th Convention of the Audio Engineering Society, Paris, Feb. 1991.
[3] J.M. Jot, "Etude et Realisation d'un spatialisateur de son", PhD Thesis, TELECOM Paris 92 E 019, Paris, Sept. 1992.
[4] K. Karplus and A. Strong, "Digital Synthesis of Plucked-String and Drum Timbres", Computer Music Journal 7(2): 43-55, 1983.
[5] M. Mathews and J. Kohut, "Electronic Simulation of Violin Resonances", The Journal of the Acoustical Society of America 53(6): 1620-1626, 1973.
[6] A. Paladin and D. Rocchesso, "A Dispersive Resonator in Real-Time on MARS Workstation", International Computer Music Conference, San Jose, California, Oct. 1992.
[7] J.O. Smith, "Music Applications of Digital Waveguides", CCRMA Report No. STAN-M-39, 1987.
[8] J. Stautner and M. Puckette, "Designing Multichannel Reverberators", Computer Music Journal 6(1): 52-65, 1982.
A REAL TIME CLARINET MODEL
ON MARS WORKSTATION
Davide Rocchesso, Fabio Turra

C.S.C. - D.E.I., Università degli Studi di Padova


via Gradenigo 6/A 35100 PADOVA, Italy
fax +39.49.8287699 E-mail roc@paola.dei.unipd.it

This paper presents the development of a workbench for studying a model of the clarinet. The purpose was to verify the existing models, to study their possible simplifications, and to obtain a real-time realization suitable for interactive use. Moreover, starting from the clarinet model, we wished to obtain general computational schemes for wide families of instruments, without leaving the physical-model representation.

The model was implemented on MARS (IRIS, Rome), a workstation based on a board containing two X20 DSPs, digital audio signal processors operating in fixed point. The whole work was done using the assembly language of the processors, in order to obtain a high level of parallelism.

The starting model is a clarinet model widely tested in the literature [1]. The clarinet is divided into two blocks: an exciter, the reed, and a resonator, the bore. The bore is divided into a cylindrical tube and a flare, the bell. The variables interacting between exciter and resonator are the incoming and the outgoing pressure waves, travelling inside the tube. The tube behaves as a lossless waveguide, and can be simulated by a simple delay line. The bell behaves as a pair of filters, reflecting low frequencies back into the tube, with a sign change, and transmitting high frequencies outside.

In an early realization we adopted a static reed model, where the air flow depends only on the pressure difference across the reed, and can be obtained by a table lookup. The input variables are the incoming pressure and the mouth pressure. The output variable is the outgoing pressure, obtained by adding the incoming pressure and the product of the air flow and the acoustic impedance of the tube.

The base model was gradually enriched. The first improvement consisted in using a dynamic model of the reed, i.e. one with memory [2]. In this model a differential equation represents the reed dynamics. The reed position at every time sample is obtained by solving this equation by means of backward finite differences. A special block controls the non-linearity introduced by the closed-reed situation: when the reed beats on the mouthpiece, it sets reed velocity and reed acceleration to zero (we assume an inelastic collision). The information about reed velocity, in this quantized model, is carried by the reed position at the previous sample instant.
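
The static-reed scheme described above (delay line for the tube, sign-inverting lowpass for the bell, table lookup for the reed) can be sketched in a few lines. This is only an illustrative sketch under stated assumptions, not the MARS implementation: the flow curve `hypothetical_flow`, the table size, the two-point averaging bell filter, the bell gain, and the approximation of the pressure gap as mouth pressure minus incoming pressure are all invented here for illustration.

```python
TABLE_SIZE = 256
DP_MIN, DP_MAX = -1.0, 2.0  # assumed range of the normalized pressure gap


def hypothetical_flow(dp):
    """Invented curve for (impedance * air flow) versus pressure gap dp.

    It is zero for dp <= 0 and for dp >= 1 (reed assumed fully closed).
    """
    return dp * (1.0 - dp) if 0.0 < dp < 1.0 else 0.0


# Precomputed look-up table, as in a table-driven static reed.
FLOW_TABLE = [
    hypothetical_flow(DP_MIN + i * (DP_MAX - DP_MIN) / (TABLE_SIZE - 1))
    for i in range(TABLE_SIZE)
]


def lookup_flow(dp):
    """Read the table with linear interpolation, clamping dp to the table range."""
    clamped = min(max(dp, DP_MIN), DP_MAX)
    pos = (clamped - DP_MIN) / (DP_MAX - DP_MIN) * (TABLE_SIZE - 1)
    i = min(int(pos), TABLE_SIZE - 2)
    frac = pos - i
    return (1.0 - frac) * FLOW_TABLE[i] + frac * FLOW_TABLE[i + 1]


def simulate(n_samples, mouth_pressure=0.5, round_trip_delay=64, bell_gain=0.95):
    """Run the reed/tube/bell loop and return the outgoing pressure samples."""
    delay = [0.0] * round_trip_delay  # tube: round-trip delay line
    prev_delayed = 0.0
    idx = 0
    out = []
    for _ in range(n_samples):
        delayed = delay[idx]
        # Bell: two-point average (lowpass) with sign change on reflection.
        p_in = -bell_gain * 0.5 * (delayed + prev_delayed)
        prev_delayed = delayed
        # Static reed: flow term from the table, indexed by the pressure gap.
        dp = mouth_pressure - p_in
        p_out = p_in + lookup_flow(dp)
        delay[idx] = p_out
        idx = (idx + 1) % round_trip_delay
        out.append(p_out)
    return out
```

Changing the instrument then amounts to replacing `FLOW_TABLE`, while the loop itself stays the same.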

Air flow calculation is based on an experimental law for wind instruments, which gives a relation between the reed position x, the pressure gap across the reed dp, and the air flow u through the narrow reed aperture. In the case of the clarinet that law is:

\[ u = A \, |dp|^{2/3} \, x^{4/3} \, \mathrm{sign}(dp), \]

where A is a physical constant typical of woodwind instruments. Air inertance should be added to this equation, but its effect is the same as that of a lowpass filter, and it can be included in the effect of the resonator filters. The air flow calculated using the above equation must be added to the air flow pushed by the reed in its movement. All these relations are summed up in a nonlinear equation, which can be solved by numerical methods. The values of air flow for given values of pressure gap, reed position and reed velocity are written in a three-dimensional look-up table. Figure 1 shows a view of air flow variation (Zu), depending on pressure gap (dp) and reed displacement (x), for a given constant value of the previous-step reed displacement (x1).

Figure 1. View of air-flow variation

The three-dimensional look-up table was implemented on the DSP using three tables. The minimum size needed to have no noticeable loss of sound quality is 8 kwords.

A second improvement of the model, which brought a further increase in realism, was the introduction of a noise generator. The sounds produced by actual instruments are not exactly periodic: they contain a noise component. That noise is a small part of the sound, but a very significant one, and its absence can give a sensation of artificiality. In a woodwind instrument, the noise is caused by turbulent air motion, producing random pressure variations that depend on flow velocity and reed displacement. A noise generator model, developed for the human voice [3], was adapted to the clarinet by applying it to the narrow aperture produced by the reed. In that model considerable simplifications may be introduced, neglecting the dependence on reed position and approximating the internal resistance of the noise generator as a constant, so that it can be included in the tube impedance. A comparison between the original model output and the approximated model output showed the validity of the simplifications. The noise pressure can be represented, in the end, as a pseudo-random sequence having a nonlinear function of air flow as its envelope. Air flow is filtered by a lowpass filter to ensure stability. In figure 2 a block scheme of the model of

dynamic reed with noise generator is shown. The variables shown in the picture are the mouth blowing pressure (P0), the pressure gap across the reed (dp), the reed position at the present time (x) and one or two samples before (x1, x2), the noise pressure (pn), the incoming pressure wave (Pi), the pressure wave reflected by the tube (pr) and the air flow (Zu) normalized to pressure dimensions.

Figure 2. Model of dynamic reed

Figure 3. Scheme of the instrument (blocks: REED, delay line, hole, delay line, BELL)

Finally, the clarinet model was enriched by adding a hole that can be gradually closed and moved along the tube. For the hole, we adopted a model which takes into account the effects of a key (or a finger) positioned above the hole aperture [4]. The hole behaves as a discontinuity of the tube and produces, from a travelling pressure wave, a reflected wave and two transmitted waves, one into the tube and one into the hole. This kind of junction is simulated by adopting a lattice scheme [5] and introducing a lowpass filter to realize the frequency dependent reflection coefficient ρ. As ρ also depends on the distance between the key and the hole, the filter bandwidth is made variable: the gradual closure of the hole corresponds to the gradual closure of the filter. The reflection coefficient is negative, so the scheme also contains a sign change.

The hole can be moved along the tube by varying the length of both the delay lines simulating the tube, as illustrated in figure 3.

The scheme identified here has a good degree of generality, because it can easily be adapted to the simulation of other wind instruments (oboe, trumpet, etc.) and, with some effort but without structural modifications, also of instruments of other families (strings, etc.). The choice of table lookup for the solution of nonlinear equations, beyond making calculation more efficient, facilitates instrument changes, requiring only the substitution of tables [6]. The nonlinear relation seen for the clarinet becomes valid for double reed woodwinds by just changing the exponents of pressure gap and reed position. By also changing the sign of the constant in the differential equation, it is also possible to simulate brass instruments [2]. The most interesting point is that it is possible, by modifying the nonlinear equation (i.e. changing the three-dimensional look-up table),

to obtain new woodwinds which have intermediate characteristics between single reed and double reed instruments.

The insertion of a hole in the resonator is able to produce some subtle phenomena that can be encountered in real instruments, particularly during transition phases, and that cannot be obtained with simpler models.

The result given by this model when simulating the gradual opening of the hole is interesting. Figure 4 shows the evolution of the spectrum and the waveform of the pressure wave travelling inside the tube, during hole opening. The hole was, in this case, placed quite near the reed, to exaggerate the frequency change (almost to the fifth harmonic).

Figure 4. Evolution of the spectrum and the waveform during hole opening

The final result is a satisfying model of the clarinet, where different parameters were made available for real-time variation, in order to make a gradual and intuitive change of instrument timbre possible.

References.

[1] M. E. McIntyre, R. T. Schumacher, J. Woodhouse: "On the oscillations of musical instruments", Journal of the Acoustical Society of America 74(5): 1325-1345, 1983.
[2] D. H. Keefe, M. Park: "Tutorial on Physical Models of Wind Instruments: II, Time Domain Simulations", Seattle: University of Washington, Systematic Musicology Technical Report No. 9002, 1990.
[3] M. M. Sondhi, J. Schroeter: "A hybrid time-frequency domain articulatory speech synthesizer", IEEE Transactions on Acoustic Speech and Signal Processing 35(7): 955-967, 1987.
[4] G. R. Plitnik, W. J. Strong: "Numerical method for calculating input impedances of the oboe", Journal of the Acoustical Society of America 65(3): 816-825, 1979.
[5] G. Borin, G. De Poli, S. Puppin, A. Sarti: "Generalizing physical models timbre class", Proc. Int. Conf. on Physical Models, Grenoble, 1990.
[6] D. Rocchesso, F. Turra: "A generalized excitation for real-time sound synthesis by physical models", Stockholm Music Acoustics Conference, Stockholm, 1993.

FFT-BASED RESYNTHESIS FOR REAL-TIME
TRANSFORMATION OF TIMBRE
Zack Settel & Cort Lippe

IRCAM, 31 rue St-Merri, Paris, 75004, France

email: settel@ircam.fr & lippe@ircam.fr

Introduction.

The Fast Fourier Transform (FFT) is a powerful general-purpose algorithm widely used in signal analysis. FFTs are useful when the spectral information of a signal is needed, such as in pitch tracking or vocoding algorithms. The FFT can be combined with the Inverse Fast Fourier Transform (IFFT) in order to resynthesize signals based on its analyses. This application of the FFT/IFFT is of particular interest in electro-acoustic music because it allows for a high degree of control of a given signal's spectral information (an important aspect of timbre), allowing for flexible and efficient implementation of signal processing algorithms.

This paper presents real-time musical applications using the IRCAM Signal Processing Workstation (ISPW) [1] which make use of FFT/IFFT-based resynthesis for timbral transformation in a compositional context. Taking a pragmatic approach, the authors have developed a user interface in the Max programming environment [2] for the prototyping and development of signal processing applications intended for use by musicians. Development in the Max programming environment [3] tends to be simple and quite rapid: digital signal processing (DSP) programming in Max requires no compilation; control and DSP objects run on the same processor, and the DSP library provides a wide range of unit generators, including the FFT and IFFT modules. Techniques for filtering, cross-synthesis, noise reduction, and dynamic spectral shaping have been explored, as well as control structures derived from real-time signal analyses via pitch-tracking and envelope following. These real-time musical applications offer composers an intuitive approach to timbral transformation in electro-acoustic music, and new possibilities in the domain of live signal processing that promise to be of general interest to musicians.

The FFT in Real Time.

Traditionally the FFT/IFFT has been widely used outside of real-time for various signal analysis/re-synthesis applications that modify the durations and spectra of pre-recorded sound [4]. With the ability to use the FFT/IFFT in real-time, live signal-processing in the spectral domain becomes possible, offering attractive alternatives to standard time-domain signal processing techniques. Some of these alternatives offer a great deal of power, run-time economy, and flexibility, as compared with standard time-domain techniques [5]. In addition, the FFT offers both a high degree of precision in the spectral domain, and straightforward means for exploitation of this information. Finally, since real-time use of the FFT has been prohibitive for musicians in the past due to computational limitations of computer music systems, this research offers some relatively new possibilities in the domain of real time.

Algorithms and Operations.

All of the signal processing applications discussed in this paper modify incoming signals and are based on the same general DSP configuration. Using an overlap-add technique, the DSP configuration includes the following steps: (1) windowing of the input signals, (2) transformation of the input signals into the spectral domain using the FFT, (3) operations on the signals' spectra, (4) resynthesis of the modified spectra using the IFFT, and (5) windowing of the output signal. Operations in the spectral domain include applying functions (often stored in tables), convolution (complex multiplication), addition, and taking the square root (used in obtaining an amplitude spectrum); data in this domain are in the form of rectangular coordinates. Due to the inherent delay introduced by the FFT/IFFT process, we use 512 point FFTs for live signal processing when responsiveness is important. Differences in the choice of spectral domain operations, kinds of input signals used, and signal routing determine the nature of a given application: small changes to the topology of the DSP configuration can result in significant changes to its functionality. Thus, we are able to reuse much of our code in diverse applications. For example, though functionally dissimilar, the following two applications for phase rotation and filtering differ only slightly in terms of their implementation. See following figure:
ration includes the following

[Figure: two signal-flow diagrams, "phase rotation" and "filtering", each ending "to IFFT"]
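
The five steps listed under "Algorithms and Operations" can be sketched as follows, with a per-bin gain table as the spectral operation. This is a minimal sketch under stated assumptions, not the authors' Max/ISPW patch: the sine window, the 50% overlap, the pure-Python radix-2 FFT and the function names are all choices made here for illustration.

```python
import cmath
import math


def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even, odd = fft(x[0::2]), fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def ifft(spectrum):
    """Inverse FFT via conjugation: ifft(X) = conj(fft(conj(X))) / n."""
    n = len(spectrum)
    return [v.conjugate() / n for v in fft([s.conjugate() for s in spectrum])]


def overlap_add_filter(signal, gain_table, frame=512):
    """Filter `signal` by multiplying each FFT bin by gain_table[bin]."""
    hop = frame // 2  # assumed 50% overlap
    # Sine window: applied at input and output, its squares sum to one.
    window = [math.sin(math.pi * (n + 0.5) / frame) for n in range(frame)]
    out = [0.0] * len(signal)
    for start in range(0, len(signal) - frame + 1, hop):
        seg = [signal[start + n] * window[n] for n in range(frame)]  # (1) window
        spec = fft(seg)                                              # (2) FFT
        spec = [s * g for s, g in zip(spec, gain_table)]             # (3) operate
        resynth = ifft(spec)                                         # (4) IFFT
        for n in range(frame):                                       # (5) window, add
            out[start + n] += resynth[n].real * window[n]
    return out
```

With a flat table (all gains equal to one) the fully overlapped part of the output reproduces the input, which is a convenient check before trying a shaped table such as a graphic EQ curve. For a real output, the gain table should be symmetric around the middle bin.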

Applications.

High-resolution filtering

Highly detailed time varying spectral envelopes can be produced and controlled by relatively simple means. A look-up table can be used to describe a spectral envelope in the implementation of a graphic EQ of up to 512 bands. The spectrum of the input signal is convolved, point by point, with the data in the look-up table, producing a filtered signal. Using a noise source as the input signal, it is also possible to do subtractive synthesis efficiently. Because we are able to alter the spectral envelope in real time at the control rate (up to 1 kHz), we may modify our spectral envelope graphically or algorithmically, hearing the results immediately. See following figure:

[Figure: signal A, spectral envelope, (convolution), result]

Low dimensional control of complex spectral envelopes

The spectral envelope used in the above filtering application can also be obtained through signal analysis, in which case a second input signal, signal B, is needed. Signal B is analyzed for its spectral envelope, or amplitude spectrum, that describes how signal A will be filtered. Obtaining a spectral envelope from an audio signal offers several interesting possibilities:

spectral envelopes can be "collected" instead of being specified, and can change at a very fast rate (audio rate), offering a powerful method of dynamic filtering. Also, audio signals produced by standard signal processing modules such as a frequency modulation (FM) pair (one oscillator modulating the frequency of another) are of particular interest because they can produce rich, easily modified, smoothly and nonlinearly varying spectra [6] which can yield complex time varying spectral envelopes. In place of FM, other standard signal processing modules can be used that offer rich varying spectral information using relatively simple means with few control parameters. One of the advantages of using standard modules is that electronic musicians are familiar with them, and have a certain degree of control and understanding of their spectra. See following figure:

[Figure: signal A; signal B; (broadband FM); (amplitude spectrum); result]

Cross synthesis

In this application two input signals are required: signal A's spectrum is convolved with the amplitude spectrum of signal B. Thus, the pitch/phase information of signal A and the time varying spectral envelope of signal B are combined to form the output signal. Favorable results are produced when signal A has a relatively constant energy level and broadband spectrum, and when signal B has a well defined time varying spectral envelope. In the following example of a vocoder, text can be decoupled from the speaker or singer's "voice quality", allowing one to modify attributes of the voice such as noise content, inharmonicity, and inflection, independently of the text material. See following figure:

[Figure: signal A (pulse train); signal B (sung or spoken text); (amplitude spectrum); result]

Mapping qualities of one signal to another

A simple FM pair may be used to provide an easily controlled, constant-energy broadband spectrum for use in cross synthesis as signal A. Musically, we have found that in some cases, the relationship between signal A and signal B can become much

more unified if certain parameters of signal B are used to control signal A. In other words, real-time continuous control parameters can be derived from signal B and used to control signal A. Pitch and envelope following of signal B can yield expressive information which can be used to control the pitch and the intensity of frequency modulation of signal A. See following figure:

[Figure: signal A (FM); (control); pitch tracker and envelope follower; signal B (sung or spoken text); result]

Frequency dependent spatialization

In the spectral domain, the phases of a given signal's frequency components can be independently rotated in order to change the component's energy distribution in the real and imaginary part of the output signal. Since the real and imaginary parts of the IFFT's output can be assigned to separate output channels, which are in turn connected to different loud-speakers, it is possible to control a given frequency's energy level in each loud-speaker using phase rotation. The user interface of this application permits users to graphically or algorithmically specify the "panning" (phase offset) for up to 512 frequency components. See following figure:

[Figure: signal A; phase offset table (x = frequency, y = phase offset); real and imaginary outputs; to left loudspeaker, to right loudspeaker]

Future Directions

The authors are currently working on alternative methods of sampling that operate in the spectral domain. Many interesting techniques for sound manipulation in this domain are proposed by the phase vocoder [7, 8]. Along with the possibility of modifying a sound's spectrum and duration independently, we would like to perform transposition independent of the spectral envelope (formant structure), thus allowing one to change the pitch of a sound without seriously altering its timbral quality.
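
The phase-rotation panning described under "Frequency dependent spatialization" can be illustrated on a single frame. This is a hedged sketch, not the authors' implementation: it assumes that a bin and its mirror-image bin receive the same phase offset, so that a real input component at that frequency is scaled by cos(offset) in the real output and by sin(offset) in the imaginary output. The function names, and which output feeds which loudspeaker, are assumptions.

```python
import cmath
import math


def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even, odd = fft(x[0::2]), fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def ifft(spectrum):
    """Inverse FFT via conjugation: ifft(X) = conj(fft(conj(X))) / n."""
    n = len(spectrum)
    return [v.conjugate() / n for v in fft([s.conjugate() for s in spectrum])]


def pan_by_phase(frame_samples, phase_offsets):
    """Split one real frame into two channels by per-frequency phase rotation.

    phase_offsets[f] is the rotation (radians) for frequency index f,
    with f running from 0 to len(frame_samples) // 2 inclusive.
    """
    spec = fft(frame_samples)
    n = len(spec)
    rotated = []
    for k, value in enumerate(spec):
        f = min(k, n - k)  # bin k and its mirror n - k share one offset
        rotated.append(value * cmath.exp(1j * phase_offsets[f]))
    resynth = ifft(rotated)
    real_channel = [v.real for v in resynth]  # e.g. one loudspeaker
    imag_channel = [v.imag for v in resynth]  # e.g. the other loudspeaker
    return real_channel, imag_channel
```

An offset of 0 sends a component entirely to the real output, an offset of pi/2 entirely to the imaginary output, and intermediate values split its energy as cos^2 and sin^2 of the offset.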

Conclusion

With the arrival of the real-time FFT/IFFT in flexible, relatively general, and easily programmable DSP/control environments such as Max, non-engineers may begin to explore new possibilities in signal processing. Real-time convolution can be quite straightforward and is a powerful tool for transforming sounds. The flexibility with which spectral transformations can be done is appealing. Our DSP configuration is fairly simple, and changes to its topology and parameters can be made quickly. Control signals resulting from detection and tracking of musical parameters offer composers and performers a rich palette of possibilities lending themselves equally well to studio and live performance applications.

Acknowledgements.

The authors would like to thank Miller Puckette, Stefan Bilbao, and Philippe Depalle for their invaluable technical and musical insights.

References.

[1] Lindemann, E., Starkier, M., and Dechelle, F. 1991. "The Architecture of the IRCAM Music Workstation." Computer Music Journal 15(3), pp. 41-49.
[2] Puckette, M. 1988. "The Patcher." In C. Lischka and J. Fritsch, eds. Proceedings of the 1988 International Computer Music Conference. San Francisco: International Computer Music Association.
[3] Puckette, M. 1991. "FTS: A Real-time Monitor for Multiprocessor Music Synthesis." Music Conference. San Francisco: Computer Music Association, pp. 420-429.
[4] Haddad, R. and Parsons, T. 1991. "Digital Signal Processing, Theory, Applications and Hardware", Computer Science Press (ISBN 0-7167-8206-5).
[5] Gordon, J. and Strawn, J. 1987. "An introduction to the phase vocoder", Proceedings, CCRMA, Department of Music, Stanford University, February 1987.
[6] Chowning, J. 1973. "The Synthesis of Complex Audio Spectra by means of Frequency Modulation", Journal of the Audio Engineering Society 21(7), pp. 526-534.
[7] Dolson, M. 1986. "The phase vocoder: a tutorial", Computer Music Journal 10(4), Winter 1986.
[8] Nieberle, R. and Warstat, M. 1992. "Implementation of an analysis/synthesis system on a DSP56001 for general purpose sound processing", Proceedings of the 1992 International Computer Music Conference. San Jose: International Computer Music Association.

Chapter 6

MUSICAL WORKSTATIONS

Section 6a

Intelligent Musical
Workstation
The cognitive level of the Intelligent Musical Workstation
Antonio Camurri
DIST - University of Genova
Computer Music and AI Labs
Via Opera Pia 11/A - 16145 Genova
e-mail: music@dist.dist.unige.it

The Intelligent Musical Workstation (IMW) is a computer environment for computer music production and scientific research. It has been developed by the laboratories LIM-DSI of the University of Milan and DIST Computer Music and AI Labs of the University of Genoa in the framework of the project LRC C4: MUSIC, Stazione di lavoro musicale intelligente - Intelligent Musical Workstation (IMW), Progetto C: Sistemi avanzati di produttivita' individuale, Sottoprogetto 7: Sistemi di supporto al lavoro intellettuale, Progetto Finalizzato Sistemi Informatici e Calcolo Parallelo of the Consiglio Nazionale delle Ricerche (CNR).

The global architecture and the main tools are described in [3]. Here we focus on the cognitive level of the IMW, that is, on the high-level modules for sound and music processing based on Artificial Intelligence (AI), developed at DIST Computer Music and AI labs.

DIST-IMW Software Modules

The main DIST-IMW modules are the following:

• XpetreX Development Tool (X-windows PETRi nets EXecutor): a High-Level Petri nets-based tool for music and multimedia modeling and execution [1];
• WinProcne/HARP (WINdows PROlog tool Combining logic and semantic NEts for Hybrid Action Representation and Planning): a hybrid AI tool for music and multimedia composition [2,4,6];
• Tools based on artificial neural networks:
  • ENA (Experimental Neural Accompanist): real-time chord accompaniment of a single melody based on back-propagation [7];
  • SOUL (Self-Organizing Universal Listener): sound classification based on Kohonen neural networks [8].

XpetreX Development Tool

It is a Petri net-based software tool that allows users to model and execute music and multimedia objects by means of High-Level (Colored) Timed Petri nets. Its main features include:

• Colored Timed Petri net model;
• Dynamic binding of external C code associated to transitions;

• Object oriented extensions to Petri nets;
• Efficient Petri nets execution kernel, derived from the Petrex system.

The XpetreX Development Tool is characterized by an advanced graphical user interface which allows easy editing and browsing of Petri net models, as sketched in figures 1 and 2. The system has been implemented in the C language under Unix/X-Windows/Motif.
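
To make the idea of executing a timed Petri net with code bound to transitions more concrete, here is a small, purely illustrative sketch. It is not XpetreX code and does not model colored tokens or object-oriented extensions; the class `Transition`, the function `run`, and the two-note example net are hypothetical names invented for this illustration.

```python
import heapq


class Transition:
    """A timed transition: consumes one token per input place, then,
    after `duration`, puts one token in each output place and calls `action`."""

    def __init__(self, name, inputs, outputs, duration, action=None):
        self.name = name
        self.inputs = inputs
        self.outputs = outputs
        self.duration = duration
        self.action = action


def run(marking, transitions, max_time=100.0):
    """Execute the net from the initial marking; return (final marking, firing log).

    Every transition is assumed to have at least one input place.
    """
    marking = dict(marking)
    pending = []  # heap of (finish_time, sequence_number, transition)
    log = []
    now, seq = 0.0, 0
    while True:
        fired = True
        while fired:  # start every transition that is currently enabled
            fired = False
            for t in transitions:
                if all(marking.get(p, 0) >= 1 for p in t.inputs):
                    for p in t.inputs:
                        marking[p] -= 1
                    heapq.heappush(pending, (now + t.duration, seq, t))
                    seq += 1
                    fired = True
        if not pending:
            break
        finish, _, t = heapq.heappop(pending)
        if finish > max_time:
            break
        now = finish
        for p in t.outputs:
            marking[p] = marking.get(p, 0) + 1
        if t.action is not None:
            t.action(now)  # code bound to the transition
        log.append((now, t.name))
    return marking, log


# Hypothetical example: two "notes" played in sequence.
played = []
net = [
    Transition("note_a", ["start"], ["middle"], 1.0, lambda t: played.append(("A", t))),
    Transition("note_b", ["middle"], ["end"], 0.5, lambda t: played.append(("B", t))),
]
final_marking, firing_log = run({"start": 1}, net)
```

In this toy net the token moves from `start` to `end`, and the callbacks attached to the two transitions are invoked at times 1.0 and 1.5.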
characterized by an advanced

Figure 1: The XpetreX Editor/Browser main menu

Figure 2: Introducing an instance of a Petri net class (small window on the right) as a refinement of a transition (main window). I/O places in the instance are assigned by the user to suitable places in the net in the main window.

WinProcne/HARP

WinProcne/HARP is a system for music and multimedia composition, based on a hybrid knowledge representation and reasoning architecture. It is based on the integration of different formalisms, able to manage the different nature and levels of music and multimedia objects, from the symbolic, knowledge level, dealing with multi-level, abstract representations and reasoning mechanisms, to the physical, analogical level, dealing with low-level processes and

signals (including metaphors like force fields, abstract potentials).

The knowledge level is realized by integrating a semantic network language of the family of KL-ONE and Krypton with a temporal logic language and production rules. The analogical level is modeled as an active memory based on metaphors (the AIM - active isomorphic memory), implemented as a concurrent object-oriented system, similar to Hewitt's Actors system.

Our system has been designed for computer assisted composition and analysis, and for a particular, generalized 'multimedia' domain: a theatrical automation project, where the system is delegated to manage and integrate sound, music, and three-dimensional computer animations of humanoid figures interacting with real actors on stage.

A detailed description of this system can be found in [2, 4, 6], and in a paper in preparation. An example application is shown elsewhere in these proceedings (see the paper by Massucco, Mercurio, and Palmieri).

Figure 3 shows the system architecture. Figure 4 gives an idea of the high-level graphical user interface of the system.

The system is implemented in C++ and Prolog. Two versions are available: the former runs under Windows 3.1 (Microsoft Visual C++ and Arity Prolog 6.x), the latter on Silicon Graphics Indigo R4000, under Unix/X-Windows/Motif (native C++ and Sicstus Prolog compilers).
computer animations of humanoid
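
The force field metaphor mentioned above for the analogical level can be illustrated with a small sketch. It does not reproduce the AIM implementation: the potential shapes (a quadratic attractor and a Gaussian repellor), the gains, the step size and all function names are assumptions chosen only to show how independent fields compose by superposition and how a moving point settles in a local minimum of energy.

```python
import math


def attractor_force(pos, goal, gain=1.0):
    """Force pulling `pos` toward an attractive state (quadratic potential)."""
    return tuple(-gain * (p - g) for p, g in zip(pos, goal))


def repellor_force(pos, centre, gain=0.5, width=0.3):
    """Force pushing `pos` away from a forbidden state (Gaussian bump)."""
    d2 = sum((p - c) ** 2 for p, c in zip(pos, centre))
    scale = gain / width ** 2 * math.exp(-d2 / (2 * width ** 2))
    return tuple(scale * (p - c) for p, c in zip(pos, centre))


def total_force(pos, attractors, repellors):
    """Independent fields compose by summing their forces."""
    forces = [attractor_force(pos, a) for a in attractors]
    forces += [repellor_force(pos, r) for r in repellors]
    return tuple(sum(axis) for axis in zip(*forces))


def navigate(start, attractors, repellors, step=0.1, tol=1e-6, max_steps=1000):
    """Follow the force until it vanishes, i.e. until a local energy minimum."""
    pos = tuple(start)
    for _ in range(max_steps):
        force = total_force(pos, attractors, repellors)
        if math.hypot(*force) < tol:
            break
        pos = tuple(p + step * f for p, f in zip(pos, force))
    return pos


# One attractive and one repelling state in a two-dimensional space.
rest = navigate((0.0, 0.0), attractors=[(1.0, 0.0)], repellors=[(0.5, 0.5)])
```

In this configuration the point comes to rest close to the attractive state, slightly displaced by the repellor. Adding a further field only requires appending one more entry to `attractors` or `repellors`.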

Figure 3: WinProcne/HARP system architecture (diagram; legible labels: Symbolic Level with WinProcne, Symbolic DB (T-Box, A-Box) and Prolog Engine; Analogical Level with Experts and Analogical KB)
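
The division of the symbolic database into a T-Box and an A-Box, together with production rules, can be sketched as follows. This is an illustration only and not the KL-ONE-style language or the Prolog engine used in WinProcne/HARP: the concept names, the individuals and the single rule are hypothetical.

```python
# T-Box (terminology): concept -> parent concepts; multiple inheritance allowed.
TBOX = {
    "music_object": [],
    "temporal_object": [],
    "score": ["music_object"],
    "voice": ["music_object", "temporal_object"],
    "canon_voice": ["voice"],
}

# A-Box (assertions): individual -> concept it is an instance of.
ABOX = {"v1": "canon_voice", "v2": "canon_voice", "s1": "score"}


def ancestors(concept):
    """All concepts from which `concept` inherits, directly or indirectly."""
    seen, stack = set(), [concept]
    while stack:
        for parent in TBOX[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def is_a(individual, concept):
    """Classification: does the individual belong to `concept`?"""
    own = ABOX[individual]
    return own == concept or concept in ancestors(own)


# Production rule (hypothetical): IF x is a temporal_object THEN schedule x.
to_schedule = [x for x in ABOX if is_a(x, "temporal_object")]
```

Here the two canon voices inherit from both `music_object` and `temporal_object`, so the rule selects them, while the score, which is not a temporal object in this toy terminology, is left out.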

Figure 4: WinProcne/HARP graphical interface

DIST IMW Users

XpetreX has been extended for experimental use also in industrial set ups (FMS).

WinProcne/HARP is currently used by composers. The system has been utilized by Giuliano Palmieri (composer) and Mario Iorio (director) in a theatrical/musical event at Palazzo Ducale (Genova), and in a concert at Villa Arconati (Milano). A multimedia theatrical event at Theatre H.O.? Altrove, Genova, directed by Iorio and Palmieri, in which the system generates and integrates in real-time humanoid figure animations and computer music, is in preparation.

WinProcne/HARP is also being tested, in collaboration with Marc Leman (University of Ghent), on a tonal center recognition research problem [5].

Acknowledgments

The XpetreX graphical editor/browser has been implemented by Paolo Franchi, Cesare Mastroianni and Roberto Sagoleo.

The WinProcne/HARP version for Windows 3.1 has been implemented by Alessandro Catorcini, Carlo Innocenti, Alberto Massari, Alex Massucco, Marco Mercurio. The SGI implementation is by Stefano Andomo and Nicola Serini. Important contributions to the cognitive model and the design are due to Renato Zaccaria and Marcello Frixione.

DIST-IMW Bibliography

[1] A.Camurri, G.Haus, and R.Zaccaria, "Describing and performing musical processes", Interface, Vol.15, No.1, 1986, pp.1-23, Swets & Zeitlinger, Lisse, The Netherlands. A revised version is included in P.Morasso and V.Tagliasco (Eds.), Human Movement Understanding, North Holland, 1986.

[2] A.Camurri, C.Canepa, M.Frixione, and R.Zaccaria, "HARP: A System for Intelligent Composer's Assistance", IEEE COMPUTER, Vol.24, No.7, July 1991, pp.64-67. An extended version is included in D.Baggi (Ed.), Readings in Computer Generated Music, IEEE Computer Society Press, 1992.

[3] A.Camurri, and G.Haus, "Architettura e ambienti operativi della Stazione di Lavoro Musicale Intelligente", Proc. IX Colloquium on Musical Informatics, Genoa, 13-16 November 1991, AIMI, and DIST - University of Genova.

[4] A.Camurri, M.Frixione, C.Innocenti, and R.Zaccaria, "A model of representation and communication of music and multimedia knowledge", Proc. ECAI-92, Wien, 1992.

[5] A.Camurri, and M.Leman, "Hybrid Representation of Music Knowledge - A case study on the automatic recognition of tone centers", Proc. International Workshop on Models and Representations of Musical Signals, Capri, 5-7 October 1992, University of Napoli, and AIMI.

[6] A.Camurri, C.Innocenti, and C.Massucco, "A Multi-Paradigm Software Environment for the Real-Time Processing of Sound, Music, and Multimedia", Knowledge-Based Systems, Butterworth-Heinemann, 1993 (Forthcoming).

[7] A.Camurri, M.Capocaccia, and R.Zaccaria, "Experimental Neural Accompanist (ENA)", Proceedings International Neural Networks Conference INNC-90, Paris, Kluwer Academic Publisher, 1990.

[8] U.Bertelli, C.Bima, A.Camurri, L.Cattaneo, P.Jacono, P.Podesta', R.Zaccaria, "Progetto SOUL: Un sensore acustico intelligente adattivo per il riconoscimento di sorgenti sonore", Proceedings IX Colloquio Informatica Musicale, AIMI and DIST Universita' di Genova, 1991.

Toward a cognitive model for the representation and
reasoning on music and multimedia knowledge
A.Camurri, A.Catorcini, M.Frixione, C.Innocenti, A.Massari, R.Zaccaria
DIST - Università di Genova
Laboratorio di Informatica Musicale
Via Opera Pia 11/A - 16145 Genova
e-mail: music@dist.dist.unige.it

Abstract: This paper introduces a cognitive model for the representation and real-time processing of music and multimedia, based on artificial intelligence (AI) techniques. The paper discusses some basic issues on the requirements of AI-based computer music systems. Then, our proposal of a representation scheme, at the basis of an implemented system called WinProcne/HARP, is introduced.

1. Requirements of music and multimedia AI-based systems

Systems for sound and music processing are generally based on tools and programming languages designed for low-level manipulation of music scores and composition algorithms (Loy and Abbott, 1985). Music V (Mathews, 1969) and cmusic (Moore, 1990) are two well-known examples of such a class of traditional systems. More recent languages, such as Fugue (Dannenberg, 1991), provide more elegant formalizations, e.g., aiming at the unification of the score and the orchestra languages into one language, and supporting some abstraction mechanisms and functional programming to make easier the design process of composition algorithms. Other important steps in this direction are the introduction of object-oriented techniques (Pope, 1991), and other extensions to allow visual programming capabilities, as in MAX and Edit20/MARS, or to integrate hypermedia and multimedia objects in the music language, as in SmOKe (Pope, 1992). However, these are programming languages, including in their design more or less special purpose features for music processing. They generally are low-level tools, since often the composer is forced to work (program) at the level of MIDI messages: these languages have a very poor idea of music, and, above all, it is very difficult to abstract to higher level representations since these languages do not support high-level tasks. From a cognitive science viewpoint, these tools do not make any assumption, for example, on the problems of modeling human cognitive and perceptual skills. These problems are the main focus of the related research field on cognitive science and music (see for example Howell, West, Cross, 1991;

Krumhansl, 1990; Leman, 1989, 1992; Todd, 1992). In this scenario, research in AI and music plays a significant role: on one hand, research efforts are directed toward the definition of higher-level models for the manipulation of music in multi-media information systems [footnote 1]; on the other hand, the research focuses on the definition of 'intelligent' systems for music processing, with the aim both of a deeper understanding of the perceptual and cognitive aspects of the 'musical mind', and of providing composers, musicologists, and researchers in general with more powerful, higher-level computer languages and tools. For example, significant studies are currently in progress in several areas: real-time performance and interpretation (De Poli, Irone, and Vidolin, 1990; Bresin, De Poli, and Vidolin, 1991), computer assisted music composition (Baird et al., 1990; Courtot, 1992; Ames, 1992), listening (Leman, 1991), tutoring (Dannenberg et al., 1990), music analysis and musicological applications (Howell, West, and Cross, 1991).

Footnote 1: In these last years, both researchers and industries have shown a growing interest in such kind of systems: in fields such as user-interface design, data and knowledge base management systems, the integration of capabilities encompassing graphic, animation, voice, sound and music into a flexible, high-level interactive multi-media computing system is one of the major goals. In the human-computer interaction field, important aspects regard the spatial and temporal management and processing of integrated multi-media objects.

The aim of our work is both to develop and experiment with a high-level representation scheme for music and multimedia knowledge, based on AI techniques, and to implement a system for the "intelligent" assistance of users (including composers and musicologists).

In this paper, we focus on the cognitive model at the basis of our AI-based system. Before introducing our model, we discuss some basic issues at the basis of this new generation of systems.

The first issue deals with 'how' to approach AI for music research and applications: our conviction is that research should be grounded on a bottom-up approach to AI, starting from the psycho-physical, perceptual and cognitive findings. In other words, we deem that realistic, and therefore useful, AI models for music processing cannot be simply grounded upon symbolic, abstract entities, such as, for example, the starting notions of a textbook on the standard music notation. It is important to start from the psycho-physical, signal levels at which the human auditory system operates: many fundamental aspects of music understanding have their roots in its structure and behaviour. This implies that many music concepts often given as starting axioms in several logic-based music languages should be considered, as a matter of fact, as the emerging result of the natural

processing of the auditory system. For example, the concept of tonality and the theory of the circle of fifths can be defined symbolically as a series of axioms in a sort of 'abstract' music knowledge base (KB); in an alternative approach, they have been demonstrated as an emergent structure derived from models of the human ear (Leman, 1992). In (Camurri and Leman, 1992) such an approach is faced in the framework of our representation architecture. This bottom-up approach is uncommon in AI-based systems, and this is one of the basic sources of their failure or scarce utility, even from a theoretical point of view, since their reasoning models lack (part of) the semantics of the objects represented. Other examples of bottom-up approaches starting from the signal, perceptual level of representation can be found in (De Poli, Piccialli, and Roads, 1991), (Leman, 1992), and (McAngus Todd, 1992).

The problem discussed above can be put in strict relation, from an AI viewpoint, with the well-known problems of symbol-grounding and situatedness (Harnad, 1990; Kirsh, 1991; Steels, 1991).

Other important issues are the following: we need formalisms able to manage the different levels of abstraction of music and multimedia objects, from the symbolic, abstract representations (e.g., the voices of a canon) to the physical, analogical representations, like the signal perceived by the human ear. Furthermore, such formalisms should allow composers and musicologists to define and reason on plans and strategies on the basis of given requirements, providing both formal and informal analysis capabilities for inspecting the objects represented.

Next, given the complexity of the problem domain, such systems should be characterized by integrated representations, each dealing with particular aspects: combining different, complementary representations is therefore a key issue in this kind of representation and reasoning architectures (a multi-paradigm approach, from the point of view of the computational mechanism). For example, taxonomic representations in terms of semantic networks (Brachman and Schmolze, 1985) are appropriate for inheritance and classification inference mechanisms; production rules are appropriate for representing logical implications; reasoning on actions and plans requires still further mechanisms, such as analogical reasoning, i.e., reasoning on the domain by means of "metaphorical" representations. In fact, metaphors are widely used in music, e.g., referring to the real world dynamics or kinematics (see for example Camurri, Morasso, Tagliasco, and Zaccaria, 1986): the metaphor of navigation in a space of force or potential fields is particularly useful, as it is discussed in section 2. From a cognitive point of view, this is the problem of representing music imagery and mental models of music knowledge. For example, the research of

McAngus Todd (1992) and Leman (1992) is based on the allusion of musical expression and perception to physical motion, including concepts of energy and mass: musical phrasing has its origin in the kinematic and dynamic variations involved in single motor actions. This is grounded on the psycho-physical structure of the human auditory system.

In this short overview of some of the crucial issues on music representation, it is interesting to recall the three constraints discussed in (Courtot, 1992) in the framework of the definition of the role of composition assisted environments: "1) ... assisting the composer to precisely formalize the syntax of musical structure. 2) ... handle the conceptual level of composers. 3) ... allow a composer to add new musical composition programs, without any skill in the particular programming language chosen for the system" (p.191). The last point tackles a fundamental issue, regarding the learning capabilities, i.e., how to automatically update the system's knowledge (e.g., new analysis data, new composition strategies), for example by means of generalization processes starting from examples presented by the user. The solutions proposed in literature, such as the purely symbolic approaches (Courtot, 1992; Widmer, 1992), and the learning systems based on neural networks, are interesting early attempts in this direction.

All the previous issues have been discussed in the context of music knowledge. One of our main goals is to integrate both music and multimedia knowledge in a single representation architecture. Let us discuss shortly what we mean by multimedia knowledge. Let us consider the two following scenarios: (i) a multimedia theatrical automation machine: a system delegated to manage and integrate sound, music, and either three-dimensional computer animations of humanoid figures (e.g., dance movements) or the movement of a real autonomous robot on a theatre stage (e.g., a real vehicle on wheels, equipped with on-board acoustic, proximity, infrared, and possibly other sensors, and an on-board computer possibly connected with a more powerful remote computer via a radio link): such an 'agent' should be able to move, navigate, react to events happening on stage (e.g., actions performed by the actors), hear, and possibly execute musical tasks; (ii) a museal framework: an autonomous robot, very similar in its architecture to the theatrical machine, operates in real time in a museum exhibition, and is able to welcome, entertain, guide, and instruct visitors.

These two examples give an idea of what kind of "multimedia" knowledge we refer to. The basic objectives here are: (i) integration of different modalities and competences; (ii) high level, multi-modal capabilities of interaction

with the real world: the system should build up and reason on a realistic model of what is happening on stage, e.g., for deciding how to interpret or generate a music object, far more than a simple "triggering" mechanism. Interaction therefore means a deeper mechanism than a simple temporal synchronization of chunks of music and graphics/animation data.

2. The Cognitive Architecture

Our cognitive model finds its roots and motivations in the crucial issues discussed in the previous section. It is a general, abstract cognitive model for an autonomous agent exhibiting integrated music and multimedia competence. In this context, we refer to "multimedia knowledge" as the knowledge necessary to an autonomous agent for interacting with the real world, and exhibiting skills like those required in the theatrical and museal projects described in the previous section.

In figure 1, our cognitive model architecture is depicted. The three boxes are different "active storage" elements for knowledge representation and reasoning. Arrows indicate information exchanges during planning/acting.

The active isomorphic memory (AIM) is a component based on the analogical kind of representation mentioned above. The symbolic memory contains the symbolic representations that constitute the "high level" knowledge of the system. The AIM can either drive (plan and execute) external, real actions on the world (e.g., drive a sound device, or a robot movement), or carry out actions in a simulative model of the world, or both. Actions and simulations can be triggered by the content of the symbolic memory, and, in turn, can generate new symbolic information. Moreover, actions and simulations in the AIM have a corresponding symbolic representation in the symbolic memory; the AIM executes performances/simulations and/or actions, described in the symbolic memory, and returns to it the relevant aspects discovered during the execution. The three components of figure 1 can be sketched as follows:

1. LTM (Long Term Memory): it is the permanent, "encyclopedic" storage of general music and multimedia knowledge;
2. STM (Short Term Memory): it is the actual "context", or "reasoning horizon", regarding the state of the affairs of the world and the problems actually faced by the agent;
3. AIM (Active Isomorphic Memory): it is an "active" (capable of generating and perceiving changes) world model, where a double mapping exists between perceptual data / motor commands and internal data, and between some perceptual / motor processes and internal "analogical models" of them.

Arrows LTM ↔ STM in figure 1 indicate the information flow that keeps the context updated. As long as the horizon of the action evolves, new long term information is instantiated in the STM to build a local description of the state of the

affairs. For example, in the process of composing a piece in a classic sonata form, only the context related to the knowledge on the exposition part of the sonata will be initially instantiated by the system. This instantiation of the symbolic description of the exposition (e.g., the definition of the two contrasting themes and their evolution) is considered short term because its existence is bound to the composer's current task. The subsequent parts of the sonata form will be considered for instantiation into the context only when necessary. Once instantiated, the context is completed with the corresponding information in the AIM (e.g., the low-level, operational semantics of the instanced symbols, as it will be described in the following subsection). The mechanism of context definition is very important, and is one of the key issues of our architecture; planning/reasoning systems which do not ground the representation on some kind of context mechanism fail to be realistic (McCarthy, personal communication).

The STM includes the information concerning the specific events represented in the AIM. In other words, the STM represents a single context, the bounded horizon of the action represented in the AIM: it is a one-to-one symbolic representation of the entities and of the events in the AIM. The arrows in figure 1 schematise the relations between the various components: a specific context is created starting from the knowledge in the LTM (arrow LTM → STM); starting from this context, a model of the action is generated (arrow STM → AIM); the AIM feeds back its "discoveries" to the STM (by simulations, performance or analogical reasoning based on metaphors like navigation in force fields), and asks for new information (arrow STM ← AIM). The STM in its turn asks the LTM for relevant information and possibly updates it if new long-term knowledge is discovered (arrow LTM ← STM).

The arrow STM ← AIM indicates the sensing path. Apart from low level sensing, managed directly in the AIM, it carries signals which do not update, but rather complete the representations in the two upper memories, or are requested by reasoning processes. On the contrary, the arrow LTM ← STM is an information path which provides learning from acting.

Our model stands on the assumption that every action execution always involves a certain amount of planning: it may range from pure adaptation up to a complete planning process (e.g., discovering a sequence of sub-actions; designing postures for a redundant complex robot like a humanoid model). We expect the agent to learn from experience, to re-use conclusions and results from previous experiments or reasoning processes. Learning has basically two components: extracting relevant information, and classifying it. The former requires an "attention"

mechanism, for selecting the pieces of knowledge which are relevant. The latter requires a suitable architecture necessarily centered around the "storage" of knowledge. Unlike other architectures, our model is suited for upgrading its knowledge. The AIM → STM arrow indicates the information flow due to experience, a cognitive feedback from the state of the affairs of the action towards the two symbolic memory boxes, to store information valid in the actual context, or valid universally.

The possibility of learning is mainly due to the choice of the cognitive scheme for representing actions, which is as uniform as possible, and constructed around the metaphor of "showing by examples" actions, skills and goals. The semantics of the action representation scheme is the way in which the AIM works.

2.1 The Active Isomorphic Memory

The AIM is basically a complete, isomorphic model of the world. It is isomorphic because, in a sense, it has "the same shape" as possible or actual situations of the external world: changing the value of a variable in the AIM is equivalent to changing the value of the corresponding variable in the (real or hypothetical) represented situation. The AIM is analogous to a mental model in the sense of Johnson-Laird (1983): it can be a model of the real world used to figure out the possible outcomes of a given action. This idea is also inspired by the regions of the brain which are mapped onto relevant sensor-motor parts of the body (see for example Sparks, 1987). The importance of the metaphor of mental models in music has been already pointed out (Krumhansl, 1990; Leman, 1991). The mental model we refer to is more general, since our goal is to represent both music and multimedia: it is a common representation framework for both models of actions in a three-dimensional (simulated or real) world (e.g., choreographic situations represented by the system), and music imagery, based on psychoacoustics and analogical reasoning (see for example McAngus Todd, 1992).

One important role of the AIM is that of a motion generator based on reference "pictures", or "snapshots" of parts of the world. We call icons these multi-dimensional snapshots stored in the AIM. According to the representation context, icons can refer to different kinds of knowledge:

(i) geometrical descriptions of situations in a model of the world (they can be analogical representations, i.e., geometrical metaphors of a different domain, e.g., music): simple measures on a picture can reveal that, say, object A is "on the left" with respect to object B. This is useful for spatial and geometrical reasoning purposes;

(ii) landscapes of energy, force fields, as a powerful metaphor for

reasoning on actions and plans. Here we have an analogical reasoning mechanism, based on measurements and simulation.

Both previous reasoning mechanisms are complementary to typical symbolic deductive systems.

The AIM is therefore active, since it generates the suitable motor activities to make icons become "true" (or, better, to change the world to fit the icon descriptions).

Icons can be "forbidden snapshots" as well, that is to say, examples of states that must not be reached. We call them informally repelling icons (or repellors), while attractive icons are 'good' icons, representing states of the world to be reached. This framework recalls the basic mechanism at the basis of the AIM: navigation in force fields.

Another role of the AIM is purely simulative: in this case, icons are used as energy shapes which can be followed by means of algorithms simulating attraction forces on a moving point, e.g., to reach a minimum of energy or a given goal position.

Two important properties of the AIM are to be taken into account: (i) the AIM can only represent the evolution of a single state of affairs at a time. That is, no general information can be represented in it; (ii) the AIM always contains complete knowledge with respect to the relevant aspects of the represented state of affairs. No incomplete piece of information can be represented in it. For example, with respect to disjunction, it is impossible to represent in the AIM that things are so or so, without saying exactly which is the case. These two properties are consequences of the "iconic" nature of its representations. On the contrary, the symbolic memory is more expressive from this point of view. The possibility of explicitly expressing generalizations, and of representing incomplete pieces of knowledge, and of reasoning with them, is one of the main reasons to introduce this symbolic level of representation.

An important advantage of an iconic scheme is that motion processes (which "interpolate" from one icon to another, like graphic interpolation between the so called key postures in the cartoons terminology) can use composable generation processes. The impossibility of composing actions in time and space (by adding, superimposing partially or totally the effects of different actions, without consistency problems) is one of the weak points when using purely symbolic, logic-based approaches. In our architecture, two or more icons can be simultaneously active, and the resulting action comes from the superimposition of the different, independent generation processes into the AIM, which integrates the effects of the different processes in the same way the world does. This also has a good evidence and expressive strength: a complex action can be captured, at any

instant, as a set of active icons in the AIM to be satisfied, and a set of repulsive situations to keep away from.

We have introduced icons as the driving mechanism for motion processes: this constitutes only one half of their role in the AIM. In a dual way, icons can be generated by perceptual processes. See for example the iconic representation of the tonal circle of fifths in (Leman, 1991), where the iconic representation is implemented using Kohonen's self-organizing neural networks (Kohonen, 1984).

Such perceived snapshots are the basis for decisions about what to do next in a plan; in other words, they allow the action representation to be "situated" (Brooks, 1991).

In the same way, more complex recognition processes, involving explicit reasoning using symbolic knowledge, can be carried out in the LTM and STM; the reasoning process, in this case, uses the situations fed back by the AIM.

Planning carried out in the AIM is a particular form of metaphorical problem solving using simulation, since it is based on the abstract force fields metaphor.

Summing up, the generating activity of the AIM ranges over three distinct roles:

(i) execution (no generative activity): the AIM is an interface between the symbolic entities in the context (STM) and the real world;

(ii) simulation (low generative activity): to "substitute" the world with a mental model, disconnected from the real world, for reasoning, or to anticipate and foresee the behavior of (some portion of) the world;

(iii) planning (high generative activity): to discover how to get to a desired snapshot and/or stay away from an unwanted situation, at different levels of abstraction.

Local minima of energy are the main sources of information returned to the STM and the LTM. At every local minimum, a repelling icon, and a corresponding symbol in the symbolic memory, can be generated. After such a new symbol is generated, the STM can be updated, and the action starts again. This cycle can be iterated while upgrading both the LTM and the STM. For example, after the first iteration, the representation of the action is composed of an initial situation and a final situation, and is now enriched by an intermediate "bad" situation, with additional space/time information. Examples of such a process are discussed in later sections.

A generative scheme of motion based on force fields has good properties for being the basic reasoning mechanism in the AIM, since:

• force fields can be composed;
• they can integrate different levels of representation: kinematics, force, internal and real values;

• they use local representations;
• algorithms for exploring force fields are known; local minima (as well as other energy shapes) are a source of information;
• force fields are a metaphor which is sufficiently general: it is used in several domains.

Finally, an interesting feature regarding the AIM is its internal simplicity. It has basically four components, depicted in figure 2: a set of icons (the current context), a generative process, a minima detector, and a clock. The AIM inputs icons (in the actual context), and gives back new icons, corresponding to the discovered relevant situations; output icons carry proper ordering information, a subset of complete timing information, for which an elementary clock is necessary.

Concluding, we have introduced a representation and reasoning framework based on the metaphor of force fields. These descriptions let the user think of and perform a set of actions in terms of the intuitive natural dynamics of navigation in attractor fields, as a promising alternative to production rules: given their composable generation processes, force fields can be used to easily model complex behaviours, otherwise very difficult to model as rules. Force fields also give a different viewpoint to the composer or the musicologist, and provide powerful manipulation primitives.

3. A computational model

From the point of view of the representation and reasoning architecture, the formal realization of the cognitive model described so far consists of a hybrid scheme, combining different formalisms, as it is shown in figure 3.

The symbolic memory is realized by the two following schemes, for representing different kinds of knowledge: (i) multiple-inheritance semantic networks, a formalism appropriate for defining terms and for describing objects and the taxonomic class/membership relations among them. This formalism has been extended to represent and reason on time, actions and plans; (ii) production rules, appropriate for representing logical implication. This integration of taxonomic representations with rules is similar to the CLASP system (Yen, Neches, and MacGregor, 1991).

The AIM is mainly based on a situated action model and on reasoning on analogies: the force field metaphor described in the previous section is its main inference engine.

From the point of view of the computational mechanisms, the system integrates three paradigms: (i) a representation language of the family of KL-ONE (Brachman and Schmolze, 1985), extended with time management primitives for reasoning on time and actions; (ii) a rule-based system, operating on

terms defined in the terminological language, and with full access to the Prolog language; (iii) the AIM component, implemented as a concurrent, object-oriented system, similar to Hewitt's actor system (Agha and Hewitt, 1990), implemented in C++.

The low-level sound representation and data on real-time performance, as well as particular recognition and synthesis algorithms, are therefore represented in the system as classes (actors) in an object-oriented concurrent environment: they correspond (are "hooked") to terms in the Prolog KB. The AIM is thus implemented as a network of actors: each actor is a class hooked to an object in the symbolic level, which is the repository of high-level entities, scores, composition rules, high-level descriptions and definitions in general. The activation of such a network of actors produces a simulation/execution in the AIM.

More details on the system, a new version of WinProcne/HARP, and on musical and multimedia applications can be found in (Camurri et al., in preparation) and elsewhere in these proceedings (Camurri, Giuffrida, Vercelli and Zaccaria, 1993; Massucco, Mercurio, and Palmieri, 1993).

References

G. Agha, C. Hewitt, "Actors: A Conceptual Foundation for Concurrent Object-Oriented Programming", in Research Directions in Object-Oriented Programming, B. Shriver and P. Wegner (Eds.), Cambridge, MIT Press, 1987.
C. Ames, "AI and Music", Encyclopedia of Artificial Intelligence, 2nd ed., 1992, New York: John Wiley and Sons.
C. Ames, "Quantifying Musical Merit", Interface, Vol. 21, No. 1, pp. 53-93, 1992, Swets & Zeitlinger.
R.J. Brachman and J.G. Schmolze, "An overview of the KL-ONE knowledge representation system", Cognitive Science, 9, pp. 171-216, 1985.
R.A. Brooks, "Intelligence without representation", Artificial Intelligence, Vol. 47, No. 1-3, 1991.
A. Camurri, C. Canepa, M. Frixione, and R. Zaccaria, "HARP: A framework and a system for intelligent composer's assistance", IEEE COMPUTER, Vol. 24, No. 7, pp. 64-67, July 1991.
A. Camurri, M. Frixione, C. Innocenti, R. Zaccaria, "Representing and reasoning on music and multimedia knowledge", DIST Techn. Report, University of Genova, 1993.
A. Camurri, F. Giuffrida, G. Vercelli, and R. Zaccaria, "A system for real-time control of human models on stage", this volume.
A. Camurri, P. Morasso, V. Tagliasco, and R. Zaccaria, "Dance and movement notation", in P. Morasso and V. Tagliasco (Eds.), Human Movement Understanding, North Holland, 1986.
F. Courtot, "CARLA: Knowledge Acquisition and Induction for Computer Assisted Composition", Interface, Vol. 21, pp. 191-217, Swets & Zeitlinger.
R. Dannenberg, C. Lee Fraley, P. Velikonja, "Fugue: A Functional Language for Sound Synthesis", IEEE COMPUTER, Vol. 24, No. 7, July 1991, pp. 36-41.
G. De Poli, L. Irone, A. Vidolin, "Music score interpretation using a multi-level knowledge base", Interface, Vol. 19, No. 2-3, pp. 137-146, 1990, Swets & Zeitlinger.
G. De Poli, A. Piccialli, and C. Roads (Eds.), Representations of Musical Signals, MIT Press, 1991.
S. Harnad, "The symbol grounding problem", Physica D, Vol. 42, No. 1-3, 1990.
P. Howell, R. West, I. Cross (Eds.), Representing Musical Structure, Cognitive Science Series, Academic Press, 1991.
P.N. Johnson-Laird, Mental models, Cambridge University Press, Cambridge, 1983.
T. Kohonen, Self-organization and associative memory, Berlin: Springer-Verlag, 1984.
C.L. Krumhansl, Cognitive foundations of musical pitch, Oxford University Press, New York, 1990.
M. Leman, "The ontogenesis of tonal semantics: results of a computer study", in P. Todd & G. Loy (Eds.), Music and connectionism, Cambridge, MA: The MIT Press, 1991.
M. Leman, "Tone context by pattern-integration over time", in D. Baggi (Ed.), Readings in computer generated music, pp. 117-137, IEEE Computer Society Press, 1992.
G. Loy and C. Abbott, "Programming Languages for Computer Music Synthesis, Performance and Composition", ACM Computing Surveys, Vol. 17, No. 2, June 1985, pp. 235-265.
C. Massucco, M. Mercurio, and G. Palmieri, "Real-time processing and performance using WinProcne/HARP", this volume.
M.V. Mathews, The Technology of Computer Music, MIT Press, Boston, 1969.
N.P. McAngus Todd, "The dynamics of dynamics: A model of musical expression", J. Acoust. Soc. Am., Vol. 91, No. 6, June 1992, pp. 3540-3550.
F.R. Moore, Elements of Computer Music, Prentice Hall, Englewood Cliffs, N.J., 1990.
S.T. Pope (Ed.), The Well-Tempered Object, 1991, MIT Press.
S.T. Pope, "The Interim DynaPiano: An integrated computer tool and instrument for composers", Computer Music Journal, Vol. 16, No. 3, Fall 1992, MIT Press.
S.T. Pope, "The SmOKe Music Representation, Description Language, and Interchange Format", Proc. Intl. Workshop on Models and Representations of Musical Signals, Capri 5-7 October 1992, AIMI (Italian Association on Musical Informatics), and Dipartimento di Scienze Fisiche, University of Napoli.
J. Yen, R. Neches, and R. MacGregor, "CLASP: Integrating Term Subsumption Systems and Production Systems", IEEE Transactions on Knowledge and Data Engineering, Vol. 3, No. 1, pp. 25-32, March 1991.

Figure 1. The general architecture of the cognitive model at the basis of the WinProcne/HARP system.
[Diagram labels: Long Term Memory, Short Term Memory, Symbolic Memory, Active Isomorphic Memory, WORLD.]

Figure 2. The structure of the Active Isomorphic Memory.
[Diagram labels: ICONS, GENERATIVE PROCESS, CLOCK, MINIMA DETECTOR, AIM.]

Figure 3. WinProcne/HARP system's architecture.
[Diagram labels: firing conditions, symbolic constructors, symbolic destructors, LTM, T-Box, production rules, simulative engine, A-Box, STM, expert and icon actor system, AIM.]

TWO EXPERIMENTAL ENVIRONMENTS
DEDICATED TO LASY SYNTHESIS
Jacques Chareyron, Daniele Rizzi
L.I.M. - D.S.I., Università di Milano
Via Comelico, 39
I-20135 Milano (Italy)
fax +39 2 55006373
e-mail: music@imiucca.csi.unimi.it

Abstract

Sound synthesis with cellular automata (LASy) is an application of the cellular automata model to the field of digital signal processing.
Two working environments for the LASy algorithm have been implemented at LIM on two hardware set-ups.
The LASy application uses a Macintosh computer equipped with an expansion card including a DSP and a DAC. It may be seen as a MIDI expander for LASy synthesis, with facilities for programming and building sounds.
The application allows the building and the grouping of the basic elements of the synthesis. The generated tone may be heard immediately by using either the computer's keyboard or a MIDI-connected musical keyboard; the evolving waveform is displayed in real time.
The performance section of the LASy application allows four-voice polyphonic synthesis under MIDI control.
The CellAut software runs on a NeXT computer, which is well suited to digital signal processing. It allows parametric and graphical editing of the transition rule; changes take effect immediately, since the synthesis algorithm and the user interface handler are two concurrent processes.
The user has control over a vast array of other minor parameters, such as the kind and the frequency of the waveform; these parameters are useful in refining a single timbre or for exotic uses of the application.
Another feature of the software is the "pitch tracking" mode: the signal coming from a microphone is sampled using the ADC and analyzed by a program running on the DSP. Pitch and power characteristics are extracted from the input, so the user may control the automaton's evolution and the corresponding timbre synthesis with changes in the pitch and the tone of his voice.
This research has been supported by the Italian National Research Council in the frame
of the MUSIC Topic (LRC C4): "INTELLIGENT MUSIC WORKSTATION",
Subproject 7: SISTEMI DI SUPPORTO AL LAVORO INTELLETTUALE, Finalized
Project SISTEMI INFORMATICI E CALCOLO PARALLELO.

The LASy technique for timbre synthesis with cellular automata applies the mathematical model of cellular automata, which are defined by an initial state and a transition function, to the field of digital signal processing.
Filling the initial state with the values of a starting waveform and applying the transition function recursively yields a succession of states. The result can be interpreted as the description of a digital signal.
Two working environments for studying and using the LASy synthesis algorithm have been developed at LIM on two different hardware platforms.

The LASy application uses a Macintosh computer fitted with an expansion card carrying a DSP and a digital-to-analog converter. It is designed both for experimentation and for musical performance. The results of research on sound can be exploited for performance within the same application, even by a musician who is not familiar with the synthesis technique. In effect, the combination of computer, DSP card and LASy application can be seen as a MIDI expander dedicated to LASy synthesis, with extensive facilities for programming and building sounds and a library of preprogrammed sounds.
LASy has modules for building and modifying the basic elements of the synthesis (waveforms and transition rules). These elements can be combined in various ways and completed with envelopes and modulations to form a LASy "instrument". At any moment during construction the resulting sound can be heard by using the computer's alphanumeric keyboard and the mouse, or a musical keyboard connected via MIDI. The evolution of the corresponding waveform is displayed in real time on the computer screen.
A LASy "instrument" can be built at different levels, depending on the user's familiarity with the synthesis mechanism. Level "zero" consists simply of choosing one of the instruments supplied with the application. At the first level the user builds new sounds by combining the basic elements of the synthesis (waveforms and transition rules) supplied for this purpose. At the second level the user builds the basic elements themselves; to this end the LASy application provides a set of specialized algorithms for each type of element. At the third level the user has absolute control over the construction or modification, "bit by bit", of the elements of the synthesis through two dedicated modules: a "text" mode editor, for entering the chosen values directly, and a graphical editor with which one can "draw" the curve that graphically represents the table being edited. The LASy application also provides a set of operators that apply a transformation to a selected region of the synthesis elements.

The section of the LASy application dedicated to musical performance offers several synthesis modes, up to a maximum of four-voice polyphony. Synthesis uses the current instrument, even while it is being built or modified. In the absence of MIDI connections, the computer's alphanumeric keyboard can be used as an elementary musical keyboard. Connecting an external MIDI keyboard (or a sequencer) gives more complete control over the elements of the synthesis. The LASy application is compatible with the MIDI Manager of the Apple Macintosh system. LASy can run in the background while responding to MIDI messages sent by another application (such as a sequencer) running in the foreground on the same computer. LASy responds to the MIDI messages Note On, Note Off, Program Change, After Touch, Pitch Bend and two generic Control messages of the user's choice.
As a research environment, the current version of the LASy application is complete and operational. It may be further extended with other construction algorithms. As a musical instrument, the application is fully functional, but the quality of the sound produced still shows defects compared with professional-grade equipment. Some of these defects are due to the limits of the hardware platform adopted. Work is under way to eliminate the others.

The CellAut application was created as a test bench for studying the LASy synthesis technique. It was developed on a NeXT computer, chosen both for its hardware and for the development tools available. The computer's motherboard carries both the DACs and a dedicated DSP coprocessor, to which an external sampler has been added; all the hardware needed for the analysis and synthesis of digital signals is therefore available. In addition, the operating system offers a set of tools that simplify the development of the application and make effective use of the graphical user interface.
The result of a LASy synthesis depends on three main parameters: the initial waveform, the neighbourhood function s() and the function f(), related by

$$y_n = f\bigl(s(y_{n-p},\, y_{n-p+1},\, y_{n-p+2})\bigr),$$

where $y_0, \ldots, y_{p-1}$ is the initial waveform and $y_p, \ldots, y_k$, with $k > p$, is the signal produced by the synthesis. CellAut lets the user act on the three main parameters, listen to and graphically display the result of the synthesis, and transfer it automatically to other applications, for example to produce a sonogram.
The initial waveform $y_0, \ldots, y_{p-1}$ can be chosen by the user from a set of classic waveforms, such as the sine or the square wave, or taken as a fragment of a sampled signal; in the latter case LASy is used not as a synthesis algorithm but as a digital filter.
The neighbourhood function s() has the form

$$s(y_{n-p},\, y_{n-p+1},\, y_{n-p+2}) = a_0\, y_{n-p} + a_1\, y_{n-p+1} + a_2\, y_{n-p+2},$$

with $a_0, a_1, a_2 > 0$ and $a_0 + a_1 + a_2 < 1$. s() defines a low-pass digital filter whose characteristics depend on the three parameters $a_0$, $a_1$ and $a_2$, which can be modified through a panel.
f() is a function, in general non-linear, that operates on the signal filtered by s(). For efficiency it is advisable to precompute f() in a lookup table, so as to reduce the computation time of each evaluation of the function and make it uniform. A side effect is that the user has at hand both a parametric function and a table of values that can be viewed and modified graphically. f() can be modified either by changing its coefficients or by acting on its graph with the mouse. Since only a few operations are needed to update the table, this can be done during synthesis, so that its effects on the generated sound can be heard directly. Because of its generality, the effect of f() is usually difficult to predict; the best results are obtained with the identity function (f(x) = x) and any function "close" to it. Moving away from this set, the evolution becomes too fast to be appreciated and leads very quickly either to a null signal or to a signal composed of all the harmonics at maximum energy.
To obtain more precise control of the sound produced, the output of the LASy algorithm has been connected to a pitch detector. It monitors the input picked up by a microphone and determines its pitch, its energy and whether it should be considered vocal or not; the program then decides whether to produce the synthesized sound (in the case of a vocal input), and changes its frequency to match the one detected from the microphone. The pitch of the sound produced is therefore set by the user's own voice, in a more intuitive and immediate way than the usual interaction with the keyboard.

Bibliography
J. Chareyron, "Digital Synthesis of Self-Modifying Waveforms by Means of Linear Automata", Computer Music Journal, Vol. 14, No. 4, 1990, pp. 25-41.
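The recurrence described in this paper, a three-tap neighbourhood function s() followed by a transition function f() read from a lookup table, can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the CellAut or LASy implementation: the names `make_lookup_table`, `lookup` and `lasy`, the table size, the [-1, 1] signal range, the nearest-entry table lookup and the coefficient values are all hypothetical choices made for the example.

```python
import math


def make_lookup_table(f, size=257):
    """Precompute f on a uniform grid over [-1, 1] (assumed signal range)."""
    return [f(-1.0 + 2.0 * i / (size - 1)) for i in range(size)]


def lookup(table, x):
    """Return the table entry nearest to x, with x clamped to [-1, 1]."""
    x = max(-1.0, min(1.0, x))
    index = round((x + 1.0) * (len(table) - 1) / 2.0)
    return table[index]


def lasy(initial, coeffs, table, n_samples):
    """Extend the initial waveform y[0..p-1] by n_samples synthesized values."""
    p = len(initial)
    if p < 3:
        raise ValueError("the initial waveform needs at least 3 samples")
    a0, a1, a2 = coeffs
    y = list(initial)
    for n in range(p, p + n_samples):
        # Neighbourhood function s(): weighted sum of three past samples.
        s = a0 * y[n - p] + a1 * y[n - p + 1] + a2 * y[n - p + 2]
        # Transition function f(), read from the precomputed table.
        y.append(lookup(table, s))
    return y


# Hypothetical settings: one sine period as initial waveform, f = identity,
# positive coefficients whose sum (0.99) stays below 1.
p = 64
sine = [math.sin(2.0 * math.pi * i / p) for i in range(p)]
identity = make_lookup_table(lambda x: x)
signal = lasy(sine, (0.25, 0.5, 0.24), identity, 1000)
```

Replacing the identity with another function passed to `make_lookup_table` changes only the table, which mirrors the point made in the text that f() can be edited cheaply while the synthesis loop itself stays the same.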

PC-MUSIC: AN EVOLUTION OF THE CMUSIC LANGUAGE
FOR THE MS-DOS ENVIRONMENT
Pietro Fischetti

DIST - Università di Genova
Laboratorio di Informatica Musicale
Via Opera Pia 11/A, 16145 Genova

Abstract

This paper describes a set of programs for music and digital signal processing called "PC-MUSIC". The software is written in C, runs on MS-DOS systems, and currently consists of: CMUSIC for MS-DOS, ENS (programs and library for Digital Signal Processing), GED (Graphic Editor), and VMS (Virtual Memory System). The system is modular and easily portable.

1. Introduction

This paper presents a collection of programs for music composition and sound processing developed at the Laboratorio di Informatica Musicale of DIST, University of Genova, as part of the DIST.Music.Tool (DMTOOL) module of the "Intelligent Music Workstation" project [1]. It describes the port of the CMUSIC language from UNIX to MS-DOS and a collection of standard programs for sound signal processing; finally, it describes a library for paged virtual memory management on an IBM PC.
The hardware used is an ordinary IBM PC running MS-DOS (version 3.30 or later). The software was developed with Microsoft C version 5.10. The remaining parts of the DMTOOL module are drivers for communication between the PC, the AKAI S900 sampler and the Roland MPU-401 MIDI interface [6].

2. System description

PC-MUSIC consists of the hardware and software modules shown in Figure 1. In particular, the hardware comprises an ordinary IBM PC, AD/DA boards and, optionally, DSP boards.

The software consists of:
HDMUSIC: a configurable driver for interfacing the PC with AD/DA boards via DMA.
GED: a graphic editor for sound samples.
ENS: a collection of programs for digital signal processing.
VMS: a virtual memory manager.
CMUSIC: the compiler.

3. Sound samples

The sound sample formats used (baseband) are the following:
Floatsam: binary single-precision floating-point data (6 digits), 32 bits.
Shortsam: binary 16-bit integer data in two's complement.
A sound sample file may or may not be preceded by a header, which consists of a sequence of information mainly concerning the file name, the number of channels (mono, stereo or quadraphonic) and the sampling rate. A library of functions is provided for low-level and high-level data input/output and for handling the header.

4. The HDMUSIC driver

HDMUSIC transfers samples (shortsam) between the PC's DMA and the AD/DA board [5]. The transfer rate can be set from 8 to 84 kHz (mono) and from 4 to 42 kHz (stereo).

5. The graphic editor GED

GED displays sound files on a high-resolution graphics display (VGA). It is used much like an ordinary oscilloscope (see Figure 2). Commands are available for changing the amplitude and time scales, and for CUT & PASTE, LOAD and SAVE.

6. The ENS utilities

ENS is a collection of programs and libraries for the digital processing of sound signals (see Figure 3):
Digital filters: FIR, IIR, median, LPC analysis, adaptive LMS, noise reduction, comb.
Fast Fourier Transform.
Signal generators: for Fourier synthesis, envelopes, distortion functions and noise.
Digital reverberation.
Signal viewers: three-dimensional, histogram, spectrum.

MIX and DEMIX of sound files.
Phase vocoder.
Windowing.
Converters between any formats (floatsam/shortsam/ASCII/USER).
Support for multiprocessing and multichannel processing.
These programs process files in floatsam format, with or without a header.
An example interface to DSP boards (Texas TMS320C30) is available for generating signals by Fourier synthesis.
The libraries contain utility functions for digital signal processing [3].

7. The virtual memory manager V.M.S. (Virtual Memory System)

V.M.S. is a library of functions written in C for demand-paged virtual memory management. It is configurable: one can specify the page size (max 64 KB), the maximum number of pages in memory, the swap areas to be used, if any (extended memory (XMS), expanded memory (EMS), mass storage), and the desired replacement algorithm. V.M.S. accesses a memory block through 32-bit virtual addresses, thus bringing the size of main memory to 4 GB of virtual space. V.M.S. has been used with CMUSIC, and detailed examples of its use can be found in [7].

8. The CMUSIC language for MS-DOS

The current version (2.1) of CMUSIC offers the same features as the original UNIX version [2], with the addition of the unit generator for granular synthesis [4]. In particular, CMUSIC translates a score (a plain text file) into a file of sound samples. The score contains the definition of the instruments (combinations of unit generators), the function generators and the note list.
Since the MS-DOS operating system limits main memory to a maximum of 640 KB, which prevents writing scores with a large number of instruments or with particularly complex instruments [7], the V.M.S. manager has been used. However, since V.M.S., like every virtual memory manager, inevitably slows the system down, three versions of CMUSIC have been produced: one for simple scores, one for scores with many instruments, and one for particularly complex instruments. In the first case V.M.S. is not used; in the second case the whole list of instruments in use is placed in virtual memory by V.M.S.; in the third case only those unit generators that need many buffers for temporary storage of processed samples are placed in virtual memory.

9. Conclusions

The software presented here is under continuous development. Its simplicity and flexibility of use leave it open to every possible way of improving performance. A detailed manual of the system described is available from LIM-DIST, University of Genova [7].

REFERENCES

[1] A. Camurri, G. Haus: "Architettura e Ambienti operativi della stazione di lavoro intelligente", IX Colloquio di Informatica Musicale, Genova, November 1991.
[2] Computer Audio Research Laboratory: "CARL Startup Kit", University of California, San Diego, 1985.
[3] "Programs for Digital Signal Processing", edited by the Digital Signal Processing Committee, IEEE Acoustics, Speech and Signal Processing Society, IEEE Press, 1979.
[4] D.L. Jones, T.L. Parks: "Generation and Combination of Grains for Music Synthesis", Computer Music Journal, Vol. 12, No. 2, Summer 1988.
[5] "Hdmusic: Interfaccia PC scheda PITDAD", Internal Report, DIST - Laboratorio di Informatica Musicale, Università di Genova.
[6] A. Camurri, F. Giuffrida, P. Podestà: "DMTOOL: un Ambiente Software per l'elaborazione di campioni", Atti VIII CIM, October 1989, Cagliari.
[7] P. Fischetti: "PC-MUSIC: Una stazione di lavoro per l'elaborazione di suono e musica", Internal Report, DIST - Laboratorio di Informatica Musicale, Università di Genova.
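The demand-paging scheme of Section 7 (configurable page size up to 64 KB, a bounded number of resident pages, a swap area and a replacement algorithm) can be illustrated with a short Python sketch. It is not the V.M.S. library, which is written in C and whose interface is not given in this paper: the class name `PagedMemory`, its methods, the choice of least-recently-used replacement and the use of a dictionary as a stand-in for the XMS/EMS/disk swap areas are all assumptions made for the example.

```python
from collections import OrderedDict


class PagedMemory:
    """Toy demand-paged byte store addressed by 32-bit virtual addresses."""

    def __init__(self, page_size=4096, max_resident=4):
        if not 0 < page_size <= 64 * 1024:
            raise ValueError("page size must be between 1 byte and 64 KB")
        self.page_size = page_size
        self.max_resident = max_resident
        self.resident = OrderedDict()  # page number -> bytearray, oldest first
        self.swap = {}                 # stand-in for XMS/EMS/disk swap areas
        self.faults = 0

    def _page(self, number):
        """Return the page, loading it on demand and evicting if needed."""
        if number in self.resident:
            self.resident.move_to_end(number)  # mark as most recently used
            return self.resident[number]
        self.faults += 1
        if len(self.resident) >= self.max_resident:
            # Assumed policy: evict the least recently used page to swap.
            victim, data = self.resident.popitem(last=False)
            self.swap[victim] = data
        page = self.swap.pop(number, None)
        if page is None:
            page = bytearray(self.page_size)  # first touch: zero-filled page
        self.resident[number] = page
        return page

    def _split(self, address):
        if not 0 <= address < 2 ** 32:
            raise ValueError("address outside the 32-bit virtual space")
        return divmod(address, self.page_size)

    def read(self, address):
        number, offset = self._split(address)
        return self._page(number)[offset]

    def write(self, address, value):
        number, offset = self._split(address)
        self._page(number)[offset] = value


# Usage with deliberately tiny pages so that evictions are easy to follow.
mem = PagedMemory(page_size=16, max_resident=2)
mem.write(0, 7)    # page 0
mem.write(16, 8)   # page 1
mem.write(32, 9)   # page 2, evicts page 0 to swap
value = mem.read(0)  # page 0 comes back from swap, evicting page 1
```

The fault counter makes the cost mentioned in Section 8 visible: every access to a non-resident page triggers an eviction and a reload, which is why the paper keeps V.M.S. out of the CMUSIC version intended for simple scores.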

Figure 1 - The PC-MUSIC system.
[Diagram labels: other environments (UNIX, Mac, MS-DOS), Ethernet, ASCII keyboards, PC-1, MIDI devices (synthesizers, samplers, etc.).]

Figure 2 - The GED program.

Figure 3 - Some of the programs contained in the ENS module.
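The two sample formats of Section 3, and the kind of conversion performed by the ENS format converters, can be illustrated with Python's standard `struct` module. This is a sketch for headerless data only. The function names, the little-endian byte order (assumed because the system runs on a PC) and the scaling of floats in [-1, 1] to the 16-bit range by a factor of 32767 are assumptions; the paper does not specify them.

```python
import struct

FULL_SCALE = 32767  # assumed mapping of 1.0 to the largest 16-bit value


def floats_to_floatsam(samples):
    """Pack samples as 32-bit single-precision floats (floatsam body)."""
    return struct.pack("<%df" % len(samples), *samples)


def floats_to_shortsam(samples):
    """Pack samples as 16-bit two's-complement integers (shortsam body)."""
    clipped = [max(-1.0, min(1.0, s)) for s in samples]
    ints = [int(round(s * FULL_SCALE)) for s in clipped]
    return struct.pack("<%dh" % len(ints), *ints)


def shortsam_to_floats(data):
    """Unpack a shortsam body back into floats in [-1, 1]."""
    count = len(data) // 2
    ints = struct.unpack("<%dh" % count, data[:count * 2])
    return [i / float(FULL_SCALE) for i in ints]


short_bytes = floats_to_shortsam([0.0, 1.0, -1.0])
float_bytes = floats_to_floatsam([0.5, -0.25])
```

A shortsam file takes half the space of the corresponding floatsam file, at the price of clipping and quantization, which is consistent with the paper's use of shortsam for DMA transfers and floatsam for processing.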

"INTELLIGENT MUSIC WORKSTATION":
THE MACINTOSH-NEXT INTEGRATED ENVIRONMENT

Goffredo Haus, Isabella Pighi

Laboratorio di Informatica Musicale
Dipartimento di Scienze dell'Informazione
Università degli Studi di Milano
via Comelico 39
I-20135 Milano (Italia)
fax +39 2 55006373
e-mail: music@imiucca.csi.unimi.it

Abstract

The Macintosh/NeXT integrated environment of the IMW is a set of hardware and software modules, hierarchically structured and communicating with each other by means of protocols and standard formats.
The Symbolic-Structural Environment consists of six software modules for Macintosh: ScoreSegmenter, ModelSynth, ScoreSynth, Functional Performer, MusSer and Sequencer MIDI/DSP. The main developments of the last two years concern the ModelSynth, ScoreSynth and Functional Performer modules.
The Operating-Executing Environment consists of six software modules for either Macintosh or NeXT: Driver MIDI/DSP, LASy Workbench, Waver, TimbreLab, CASP (Cellular Automata Sound Processor) and VoiceLab. The main developments of the last two years concern the Driver MIDI/DSP, LASy, TimbreLab, CASP and VoiceLab modules.
In this paper we give an architectural overview of the current state of the project and a survey of all the software modules we have developed. General architectural characteristics and high-level specifications of the modules are described in more detail in the MUSIC Tech. Report Series published by the Italian National Research Council.

This research has been supported by the Italian National Research Council in the frame
of the MUSIC Topic (LRC C4): "INTELLIGENT MUSIC WORKSTATION",
Subproject 7: SISTEMI DI SUPPORTO AL LAVORO INTELLETTUALE, Finalized
Project SISTEMI INFORMATICI E CALCOLO PARALLELO.

Architecture of the SLMI

The Intelligent Music Workstation (Stazione di Lavoro Musicale Intelligente, SLMI) is an integrated Macintosh-NeXT environment made up of a set of hardware and software modules, organized hierarchically and communicating with one another through a set of protocols and standard formats [1]. The first two diagrams below show the functional connections within the SLMI at the symbolic/structural level and at the executing/operating level.
Apart from MIDI commands, which are currently the only kind of communication between the different hardware/software environments of the SLMI (Macintosh and NeXT), the modules communicate through files whose formats are listed below:
• SMF: Standard MIDI File, formats 0 and 1;
• PNF: files used by ScoreSynth, containing Petri net models;
• TCF: files in Tab Converter format;
• PCF: files in Professional Composer format;
• SSF: files produced by ScoreSegmenter during the analysis phase;
• SDF: files in Sound Designer format;
• LTF: files produced by LASy Workbench;
• SND: sound files in NeXT format.
The third diagram shows the real-time functions of the SLMI [2]. The messages carried by the communication channels are:
• MIDI: MIDI messages;
• TAB: synthesis parameters;
• DS/AS: digital/analog signal;
• MMT: virtual communication via the MIDI Management Tools.
The connections shown are in fact virtual, because within the SLMI some modules compete with one another, especially for the use of the DSP boards. These constraints lead to the following conclusions:
• Waver, LASy and the Driver MIDI/DSP require the same DSP and are therefore mutually exclusive [3];
• ScoreSynth, Temper, MusSer, Traslitteratore and Functional Performer cannot be used at the same time on the same computer, because they require continuous interaction with the user and hence continuous use of the display;
• the Sequencer MIDI/DSP can communicate via MMT with all applications that have this kind of interface;
• the Driver MIDI/DSP can be used concurrently with the Sequencer MIDI/DSP, because the Sequencer MIDI/DSP can optionally release its ownership of the board;

Diagram 1. S/W Level 1: low-level modules for sound analysis and synthesis, for building and processing samples and timbre models, and for producing digital audio masters.
Diagram 2. S/W Level 2: high-level modules for composition, analysis/synthesis of musical texts and of structures of musical texts, musical and audio-visual performance, orchestration, and music DTP.
Diagram 3. Real-time functions of the SLMI.
in addition, as many instances of the Driver MIDI/DSP can run concurrently, each with its own independent microprogram, as there are DSP boards available;
• the Driver MIDI/DSP communicates via MMT with all applications that can be interfaced;
• FilterAut, CellAut and VoiceLab require the only DSP currently available in the NeXT environment, which rules out running them concurrently on the same computer;
• the MIDI interface connects applications running on different computers; communication between the two environments is thus possible, for example by controlling the VoiceLab module from the Sequencer MIDI/DSP.

The modules of the SLMI

Interfaccia SLMI
Interfaccia SLMI is a hypertext-style user interface for the Intelligent Music Workstation. It is characterized by a high degree of hypermediality and interaction and by intuitive iconic graphics. The interface lets users run the commercial programs they are interested in together with all the modules of the SLMI, or with a subset of them, thereby defining suitable specialized working environments (for composition, musicology, publishing, production, etc.).

ScoreSynth 3.0
ScoreSynth 3.0 [4] is an editor/executor/debugger of Music Petri Nets (Reti di Petri Musicali, RPM). Starting from the definition of timed PT (place/transition) Petri nets with capacities on places, multiplicities on arcs and refinement morphisms, RPMs are obtained by associating the concept of Music Object with place nodes and that of Algorithm transforming the Music Objects with transition nodes. A Music Object is a sequence of MIDI messages; an Algorithm is a sequence of Music Operators that can be applied both to the parameters that specify a note (pitch, intensity, duration, timbre (MIDI channel)) and to the order in which the notes are arranged within a Music Object.

Score Segmenter
The aim of the application is the segmentation of musical pieces as a first step towards a variety of musicological applications, in particular automatic computer-based instrumentation. In other words, it searches for and identifies the various music objects of which the piece is composed and the relationships among them.

ModelSynth
The ModelSynth module is under development. It automatically derives a Petri net model from the analysis carried out by ScoreSegmenter. The Petri net models produced are in turn used for further analysis, and executed and/or manipulated by the ScoreSynth module. ModelSynth can therefore be regarded as a link between the analysis and the synthesis of musical scores.

MusSer
MusSer is a module for generating numeric/musical strings of a serial nature on equal-tempered scales, and for finding those series that satisfy particular musical properties.

Functional Performer
The Functional Performer module [5] is a tool for the real-time functional performance of music objects.
The musical structures that make up the primitives to operate on are defined as melodic fragments, that is, sequences of notes characterized by pitch, duration and intensity. The melodic fragments are associated with graphic objects that let the user manipulate them through the transformation operators implemented. The transformation operators modify the selected fragments and deliver the result of their application in real time.
The module accepts as input a series of musical fragments encoded as Standard MIDI Files. For this purpose it uses the routines and the SMF read loop of the module "Library of routines for reading/writing Standard MIDI Files 1.0".

Temper
Temper [6] stands for TEssellating Music PERformer, that is, a performer of music generated automatically from graphic animations of tessellations. A tessellation is to be understood as a theoretically infinite combination (in practice limited by the size of the drawing) of figures of one or more types on the plane, such that their outlines fit together perfectly. Through the assignment of suitable parameters, Temper generates and produces musical sequences rigidly determined by the tessellations that generate them, without any further human intervention; the correspondence is such as to justify the term "tessellating" for this kind of music.

Traslitteratore
The application makes it possible to carry out automatically,

258
according to a set of rules, the transformation of a literary text into a musical score.
Traslitteratore also lets the user listen to the pieces obtained from the transliteration, either through the sounds produced by the internal synthesizer of the Macintosh or through MIDI peripherals. The translator can also render the accelerando and ritardando symbols that may be inserted in the text, which removes, at least in part, the impression of expressive flatness.

Sequencer MIDI/DSP
The Sequencer MIDI/DSP is a MIDI sequencer built to control a DSP board, which the user sees as a synthesizer to which sequences of MIDI data can be sent as to any external MIDI device.
The application supports recording, editing and playback of MIDI sequences distributed over multiple tracks (max 32). The tracks can be individually assigned to drive external MIDI devices and/or the internal DSP board; MIDI mixing operations can also be performed on the tracks. It is also possible to call the Waver module and to set up the different timbre models in charge of controlling the DSP board.

Driver MIDI/DSP
The DMD program is a MIDI driver built to control a DSP board (Sound Accelerator, Audiomedia).
This application can be seen as a natural development of the Sequencer MIDI/DSP: its real-time MIDI communication capabilities have been extended and more powerful timbre control facilities have been made available. Its main feature, however, is undoubtedly that any application compatible with the MIDI Manager can send MIDI messages to control the DSP board.

Waver
The Waver application is a 'workbench' for experimenting with synthesis techniques; the following are currently implemented: synthesis by functions of two variables, waveshaping, wavelet transform.

LASy Workbench
The LASy Workbench is an application developed to implement the LASy (Linear Automata Synthesis) synthesis algorithm on a Macintosh equipped with a
DSP board (Sound Accelerator or Audiomedia).
The LW serves a twofold purpose: it is a LASy musical performance instrument that can be controlled in real time by MIDI commands or by input peripherals (ASCII keyboard and mouse), and it is also an environment for research and experimentation on LASy synthesis.

TimbreLab
The application is an open-architecture system for timbre programming with an iconic interface. Timbre models can be organized hierarchically: a model can always be treated as a submodel represented by a node in a more complex model.
TimbreLab supports the implementation of the various synthesis techniques available in the literature.

VoiceLab
VoiceLab is the result of an investigation into voice synthesis in general and speech synthesis in particular, with control of the intonation contour of the synthetic speech.
VoiceLab runs in the NeXT environment and consists of a portion of code executed on the main processor and a speech-signal synthesis module written in the microcode of the Motorola 56001 DSP.

Cellular Automata Sound Processor
CellAut (which runs in the NeXT environment) synthesizes a signal with the LASy algorithm, starting from a waveform and a transition function chosen by the user. The user has twofold control over the evolution of the synthesis: the parameters of the transition function can be modified through graphical objects, and the pitch of the sound produced is controlled by the pitch of an analog signal picked up by a microphone.
Alongside CellAut, a second application, FilterAut, has been developed; it filters an analog signal in real time while the filtering parameters are varied. FilterAut also computes the pitch of the signal and sends the corresponding note as a sequence of commands on the MIDI channel connected to the computer.

The library of synthesis microprograms
This module consists of a set of sound-synthesis routines used by four programs of the workstation: Driver MIDI/DSP, Sequencer MIDI/DSP, Waver, LASy Workbench.
These routines are made available in the form of a
resource in the "DSP Code Resources" file.

Standard MIDI files Library
This library was developed for Macintosh computers and provides a set of routines for reading and writing Standard MIDI Files of format '0' and '1', following the syntax defined in the Standard MIDI File V. 1.0 document published by the International MIDI Association.

SampleBuster
The application provides reasoned, guided management of a collection of audio samples coming from different libraries and stored in different physical formats, usable in both professional and non-professional music production environments, together with some utilities for quick handling of the external equipment connected to the system.
Samples from any library can of course be used, since the chosen formats are entirely general.

Bibliography

[1] AA.VV. (A. Stiglitz Ed.): "Specifiche funzionali di alto livello dei moduli della Stazione di Lavoro Musicale Intelligente - Unità Operativa della Università degli Studi di Milano", Tech. Report 7/53, CNR-PFI2, 1991.

[2] I. Pighi & AA.VV.: "Integrazione dell'architettura e delle specifiche funzionali di alto livello dei moduli della Stazione di Lavoro Musicale Intelligente - Unità Operativa dell'Università degli Studi di Milano", Tech. Report 7/117, CNR-PFI2, 1993.

[3] A. Ballista, E. Casali, J. Chareyron, G. Haus: "A MIDI/DSP Sound Processing Environment for a Computer Music Workstation", Computer Music Journal, Vol.16, N.3, pp.57-72, MIT Press, 1992.

[4] G. Haus, A. Sametti: "SCORESYNTH: a System for the Synthesis of Music Scores based on Petri Nets and a Music Algebra", in "Readings in Computer Generated Music", D. Baggi Ed., pp.53-78, IEEE Computer Society Press, 1992.

[5] G. Haus, A. Stiglitz: "The Functional Performer System", Interface, Vol.23, N.1, pp.53-75, Swets & Zeitlinger B.V., 1993.

[6] G. Haus, P. Morini: "TEMPER: a System for Music Synthesis from Animated Tessellations", Leonardo, Vol.25, N.3/4, pp.355-360, Pergamon Journals, 1992.

THE SCORE ANALYSIS/RE-SYNTHESIS ENVIRONMENT OF THE
"INTELLIGENT MUSIC WORKSTATION"
Goffredo Haus, Alberto Sametti

Laboratorio di Informatica Musicale
Dipartimento di Scienze dell'Informazione
Università degli Studi di Milano
via Comelico 39
I-20135 Milano (Italy)
fax +39 255006373
e-mail: music@imiucca.csi.unimi.it

Abstract

In this work we describe the analysis/re-synthesis environment of the
"Intelligent Music Workstation". It consists of three software modules:
a) ScoreSegmenter, which is able to decompose a score into a set of basic
music objects and a number of transformation relationships among
various occurrences of basic music objects within the score; it tries to
recognize the main theme of the piece, if any exists; then it finds any
instance of the main theme, also considering any possible transformations
of it; both entire and partial instances are considered; this automatic
process is controlled through many configuration parameters the user can
set;
b) ModelSynth, which synthesizes a Petri Net model representing the
generative structures of the original score; it starts from information
extracted by ScoreSegmenter such as the number of voices within the
score, the presence of loops, the application of music transformations
(transposition, retrogradation, mirror inversion, ...), etc.;
c) ScoreSynth, which executes that Petri Net model, either resynthesizing
the original score or synthesizing new scores, depending on any
editing of the formal model.
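
As a small illustration of the three transformations named in (b), the following sketch applies them to a theme given as a list of pitches in semitones. It is only a minimal example under stated assumptions: the function names, the MIDI-style pitch numbers and the choice of the first note as the inversion axis are hypothetical and are not taken from the modules described here.

```python
from typing import List

Pitches = List[int]  # assumption: pitches as MIDI-style semitone numbers


def transpose(theme: Pitches, interval: int) -> Pitches:
    """Shift every pitch by the same number of semitones."""
    return [p + interval for p in theme]


def retrograde(theme: Pitches) -> Pitches:
    """Return the theme read backwards."""
    return list(reversed(theme))


def mirror_inversion(theme: Pitches) -> Pitches:
    """Reflect every pitch around the first note (assumed axis)."""
    if not theme:
        return []
    axis = theme[0]
    return [2 * axis - p for p in theme]


theme = [60, 62, 64, 65]           # C4 D4 E4 F4
print(transpose(theme, 7))         # [67, 69, 71, 72]
print(retrograde(theme))           # [65, 64, 62, 60]
print(mirror_inversion(theme))     # [60, 58, 56, 55]
```

Because each function returns a new list, the operators can be combined freely, for example `retrograde(transpose(theme, 7))`.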

This research has been supported by the Italian National Research Council in the frame of
the MUSIC Topic (LRC C4): "INTELLIGENT MUSIC WORKSTATION", Subproject
7: SISTEMI DI SUPPORTO AL LAVORO INTELLETTUALE, Finalized Project
SISTEMI INFORMATICI E CALCOLO PARALLELO.

Introduction

The score analysis/re-synthesis environment is the most abstract (structural/symbolic) level of the "Intelligent Music Workstation" [1]. It provides interactive tools for the automatic decomposition of scores, for the synthesis of generative models based on the results of the decomposition, and for the execution of those models.
The environment aims at increasing the individual productivity of both the musicologist and the composer. It makes it possible to discover hidden structures in scores and to use them either to give a more abstract and structured representation of the scores themselves or to generate a variety of scores whose structural characteristics are more or less close (as the musician wishes) to those of a given score.
The environment consists of three software modules: ScoreSegmenter (for decomposition), ModelSynth (for the synthesis of generative models) and ScoreSynth (for executing the models, i.e. for synthesizing new scores).
It is worth stressing that executing with ScoreSynth a model generated by ModelSynth, without any modification by the musician, regenerates the original score that had been decomposed with ScoreSegmenter.

Software architecture

This section examines the overall software architecture, i.e. the context of the central module, ModelSynth. As Fig. 1 shows, it is directly related to the other two modules, ScoreSegmenter [2] [3] and ScoreSynth [4]. We briefly describe the distinctive features of the three modules.

[Fig. 1 diagram; boxes from top to bottom: ScoreSynth, ModelSynth, ScoreSegmenter, joined by vertical links.]
Fig. 1: architecture of the s/w modules.

ScoreSegmenter aims at segmenting musical pieces as a first step towards their future automatic orchestration proposed by the computer. In other words, it searches for the musical objects of which the piece is composed, where by objects we mean the musical fragments that the author stated and then took up again and transformed, according to the various musical canons, in relation to the historical period and to the
compositional form. It is a prototype software tool that lets one approach musical texts not as a mere sequence of notes but as an expressive edifice made of a few basic elements functionally structured among themselves in various ways. ScoreSegmenter is therefore essentially an analysis tool.
ScoreSynth, by contrast, is a synthesis tool. It provides an integrated environment for creating, debugging and executing models of musical scores based on the Petri net formalism. The primary goal of the ScoreSynth module is to give the 'musician' a very powerful and effective tool for viewing compositional activity from an architectural standpoint: the 'musician' manipulates sequences, transformation functions and structures.
Finally, ModelSynth translates the results of the analysis performed by the ScoreSegmenter module into Petri net models executable by ScoreSynth.

[Fig. 2 diagram; recoverable labels: Partiture (scores), ScoreSegmenter, Partiture SMF (SMF scores), Tabelle (tables), ModelSynth.]

Fig. 2: Input/Output flows.
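
The tables exchanged between the modules hold one row per recognized theme or fragment. The following sketch shows one way such rows might be held in memory, split by voice and checked for overlaps. It is only an illustration: the record layout mirrors the fields listed in the text (voice, first and last note, operator, first recognized note of the theme), while the class and function names and the use of integer note indices are assumptions of this example.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Atom:
    voice: int                # voice in which the occurrence was found
    first_note: int           # index of the first note within the score (assumed integer)
    last_note: int            # index of the last note within the score (assumed integer)
    operator: Optional[str]   # operator applied, if any (e.g. "T", "R", "I")
    theme_first_note: int     # first recognized note of the theme


def group_by_voice(atoms: List[Atom]) -> Dict[int, List[Atom]]:
    """Split the atoms by voice and order each voice by position in the score."""
    voices: Dict[int, List[Atom]] = defaultdict(list)
    for atom in atoms:
        voices[atom.voice].append(atom)
    return {v: sorted(rows, key=lambda a: a.first_note) for v, rows in voices.items()}


def overlaps(atoms: List[Atom]) -> bool:
    """True if two atoms of the same voice share a note, i.e. the searches were not exclusive."""
    for rows in group_by_voice(atoms).values():
        for prev, nxt in zip(rows, rows[1:]):
            if nxt.first_note <= prev.last_note:
                return True
    return False


atoms = [
    Atom(voice=1, first_note=9, last_note=16, operator="T", theme_first_note=1),
    Atom(voice=0, first_note=1, last_note=8, operator=None, theme_first_note=1),
    Atom(voice=1, first_note=1, last_note=8, operator="R", theme_first_note=1),
]
by_voice = group_by_voice(atoms)
print([a.operator for a in by_voice[1]])   # ['R', 'T']
print(overlaps(atoms))                     # False
```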

Functional specifications and input/output

Considering the three modules from a functional point of view, we now look at which transformations they define and what constitutes the domain and the codomain of each.

ScoreSegmenter

ScoreSegmenter can read a score expressed in traditional music notation and generate: a support file (called the 'work file'), corresponding to the input file, on which it carries out its analysis; the recognized themes in SMF (Standard MIDI File 1.0) format; and, most importantly, a text table containing the results of the searches.
For each theme or fragment found, this table reports the following information: the voice, the initial note and the
final note within the score, the operator applied (if any) and the first recognized note of the theme. From now on we refer to each row of the table as an 'atom'.
For the ScoreSegmenter tables to be meaningful, the segmentation parameters must be set so that the searches on fragments and themes are exclusive, i.e. so that a sequence of notes at a given position in the score cannot belong to more than one theme or fragment.

ModelSynth

ModelSynth reads the recognized fragments stored in SMF format and the text table produced by the ScoreSegmenter analysis; it analyses them iteratively in order to find structures that can be described by Petri nets; finally, it synthesizes a hierarchical generative Petri net model, executable by ScoreSynth, in which the SMF fragments are associated with suitable nodes of the model.

ScoreSynth

As already mentioned, ScoreSynth supports the editing, debugging and execution of Petri net models oriented to the synthesis of MIDI musical scores. For a detailed description of its functions, see the article cited in the bibliography.

Analysis criteria of the ScoreSegmenter module

A first fundamental aspect of the module is the implementation of the algorithms that search for occurrences of the musical objects, or of parts of them, within the piece (objects that may be identified by the computer itself or supplied by the user). As is well known, two elements must be considered here: the attributes of the single note, and the musical transformation that the note has undergone together with the notes that precede and/or follow it. The attributes are duration, accent, note name, pitch in semitones and the direction of the interval. As for the transformations, this work considers those applied to the position (or degree) attributes, both in the tonal scale and in the diatonic scale. These transformations are implemented by three different types of algebraic operators and their combinations, namely transposition, mirror inversion and retrogradation. Applied to note names, these operators yield the corresponding tonal musical transformations; applied to the pitch in semitones, they yield the real transformations. To provide
good flexibility to this analysis tool, the user can intervene interactively to vary considerably the 'style' in which the searches are carried out, choosing which attributes to consider, which transformations, and what degree of variability to apply to the tonal analyses.
A second aspect is the segmentation proper. For simplicity we started from the musical form of the fugue and then generalized it to the other forms, although the algorithm should be extended in order to identify the actual theme of the piece in the other forms more precisely and specifically. The identification of the objects is not based on their repetitions alone, which are clearly a necessary but not a sufficient aspect for musical purposes. A theme also has a tonal framework (except in modern music) and a metrical one, neither of which can be ignored. To take these aspects into account as well, requirements are placed on the notes of the candidate themes, concerning membership in and affirmation of the key according to the type of object (thetic, acephalous or anacrusic), and concerning the exhaustiveness and completeness of the musical thought as an effect of the metre. For this theme identification phase, too, a set of parameters has been provided, to be specified interactively, concerning both the elements to be evaluated in defining the theme (key, metre, phrase length, minimum number of repetitions, etc.) and the repetition of the objects itself.

Analysis/re-synthesis criteria of the ModelSynth module

In essence, ModelSynth performs the inverse of the ScoreSegmenter operation: it tries to reconstitute the structure of the piece that has been analysed and atomized. Clearly, if this operation is not to be a redundant step, it must give the final structure of the piece some significant characteristics. These can be summed up in a single one: representing with flexible models the information content of the piece, where by 'information' we mean the relationships among the constituent atoms and their transformations as the piece unfolds. What is carried out is therefore the analysis of an analysis, with the aim of providing an alternative representation on the one hand, and of extracting and encoding the information for re-synthesis on the other.
The first analysis performed by ModelSynth is the parsing of the table produced by ScoreSegmenter, generated following a request to segment a piece, using the corresponding work file as support. This phase allows the representation, in
an intermediate format, of the results contained in the tables. That is, it saves all the recognized themes, assigning each a reference code; splits all the atoms by voice; reorders their occurrences by a temporal key; and recovers from the work file all those parts (sequences of notes and rests), which we may call 'discards', that the segmentation algorithm judged not significant but that form the glue between the atoms.
Next, the operators applied to the atoms are considered. Since the operators recognized by ScoreSegmenter are a strict subset of those available in ScoreSynth, this operation is immediate.
At this point ModelSynth has all the information needed to carry out its analysis and hence the automatic construction of a Petri net model, in ScoreSynth format, that corresponds to the original piece and highlights, where present, the relational constructs and the transformation functions used. Fundamental in this phase is the ability to exploit the mechanisms of parametric calls to hierarchical nets provided by ScoreSynth. These separate the structure of the piece from its themes. The structure is identified by the relationships among the themes, their repetitions and their transformations, and is therefore encoded in a model. The themes play only the role of data.
The analysis proceeds by considering the voices one at a time. The structures it tries to recognize are simple loops, loops with selection, and the repetition of specific patterns.
By simple loops we mean successions of a theme or of one of its transformations. For example, given the theme A and an operator T, a succession such as

T(A) - T(A) - T(A) - T(A)

can be represented by a macro net that implements a loop and receives as parameters the theme, the operator and the number of repetitions.
A loop with selection, by contrast, occurs when the operator applied to one and the same theme varies, as in the succession:

T(A) - R(A) - I(A)

This type of construct is still based on a loop net, with the difference that the object being looped is not a single element but a macro subnet that receives as parameters the operators to apply at each cycle.
Besides resolving loops, patterns must also be recognized, and patterns can be searched for at different levels: at the level of the operators, for
example in successions relating to one and the same theme such as:

T(A) - R(A) - T(A) - R(A) - T(A) - R(A)

at the level of the themes, as in

T(A) - R(B) - I(C) - T(A) - R(B) - I(C)

or, finally, at the level of entire nets.
The most significant phases of the ModelSynth analysis can therefore be summarized in the following steps:
i) creation of an intermediate representation;
ii) recognition of patterns on the operators;
iii) recognition of simple loops or loops with selection;
iv) recognition of patterns on the themes;
v) recognition of patterns on entire nets;
vi) go back to v).
The analysis process continues only if step v) is able to perform at least one recognition.

The Petri nets in the following figures show what the result of the analysis performed by ModelSynth on a piece named "PieceX" might look like, at least in its most general aspects. In particular, figure 3 shows the top-level net of the model; figure 4 the subnet for the parts involved (in this example, three); and figure 5 the net that 'collects' all the complete themes recognized by ScoreSegmenter. Figure 6, on the other hand, shows how the application of an operator to a theme can be represented in the model: the operator in the line underneath indicates that the transformation it performs refers to the theme "Theme2". This simple net could be invoked several times in a model. It would therefore be [...]

[Fig. 3 diagram; recoverable labels: Start, PieceX.]
Fig. 3: the generated top-level net.

[Fig. 4 diagram.]
Fig. 4: the "PieceX" net.

[Fig. 5 diagram; recoverable labels: Theme1, Theme2, Theme3, Theme4.]
Fig. 5: the net with all the recognized themes.
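
The loop and pattern constructs discussed in this section can be illustrated on a single voice written as a list of (operator, theme) pairs, so that T(A) becomes ("T", "A"). The sketch below is a minimal example under that assumption; the function names are hypothetical and it makes no claim to reproduce the recognition algorithm actually used by ModelSynth.

```python
from itertools import groupby
from typing import List, Optional, Sequence, Tuple

Occurrence = Tuple[str, str]   # (operator, theme), e.g. ("T", "A") for T(A)


def simple_loops(voice: Sequence[Occurrence]) -> List[Tuple[str, str, int]]:
    """Collapse runs of identical occurrences into (operator, theme, repetitions)."""
    return [(op, theme, len(list(run))) for (op, theme), run in groupby(voice)]


def selection_loops(voice: Sequence[Occurrence]) -> List[Tuple[str, List[str]]]:
    """Collapse runs on the same theme into (theme, operators applied at each cycle)."""
    return [(theme, [op for op, _ in run])
            for theme, run in groupby(voice, key=lambda occ: occ[1])]


def repeating_pattern(items: Sequence) -> Optional[Tuple[list, int]]:
    """Return (pattern, repetitions) for the shortest pattern that tiles the
    whole sequence at least twice, or None if there is no such pattern."""
    n = len(items)
    for period in range(1, n // 2 + 1):
        if n % period == 0 and all(items[i] == items[i % period] for i in range(n)):
            return list(items[:period]), n // period
    return None


print(simple_loops([("T", "A")] * 4))
# [('T', 'A', 4)]
print(selection_loops([("T", "A"), ("R", "A"), ("I", "A")]))
# [('A', ['T', 'R', 'I'])]
print(repeating_pattern(["T", "R", "T", "R", "T", "R"]))
# (['T', 'R'], 3)
```

The same `repeating_pattern` function works at the theme level when it is given whole occurrences, for example `[("T", "A"), ("R", "B"), ("I", "C")] * 2`.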

[Fig. 6 diagram; operator label, partly garbled in extraction: "OpX = P: I, $, [Tema2, Il, ? + 7".]
Fig. 6: an example of the application of a transposition operator to a theme.

Future developments

The score analysis/re-synthesis environment briefly described here is to be regarded as a prototype for experimental research; the more systematically we can experiment with it, the more the refinement of its analysis phases can grow. The primary aims are the following:
I) generalization of the segmentation capability of ScoreSegmenter to the widest variety of musical forms; at present, although it can decompose any score, it performs better, i.e. produces fewer discards, on pieces in fugue or sonata form;
II) extension of the analysis/re-synthesis capabilities of ModelSynth to identify ever deeper structures; in particular, the identification of related structures can be further strengthened by extending the current notion of macro net so that a macro net can be treated as a parameter of another macro net; this possibility requires a consistent extension of the ScoreSynth module.
In still more general terms, an approach such as the one followed here for musical scores could be applied to multimedia processes, where processes made of sounds, images and texts could be decomposed, organized into generative models and synthesized.

Bibliography

[1] I. Pighi & AA.VV.: "Integrazione dell'architettura e delle specifiche funzionali di alto livello dei moduli della Stazione di Lavoro Musicale Intelligente - Unità Operativa dell'Università degli Studi di Milano", Tech. Report M/42, CNR-PFI2 MUSIC Series, 1993.

[2] F. Lonati: Guida operativa del modulo "ScoreSegmenter", Tech. Report M/18, CNR-PFI2 MUSIC Series, 1991.

[3] F. Lonati: Note tecniche del modulo "ScoreSegmenter", Tech. Report M/19, CNR-PFI2 MUSIC Series, 1991.

[4] G. Haus, A. Sametti: "SCORESYNTH: a System for the Synthesis of Music Scores based on Petri Nets and a Music Algebra", in "Readings in Computer Generated Music", D. Baggi Ed., pp.53-78, IEEE Computer Society Press, 1992.

Real Time Processing and Performance using WinProcne/HARP
Claudio Massucco, Marco Mercurio and Giuliano Palmieri
DIST - University of Genova
Laboratorio di Informatica Musicale
Via Opera Pia 11/A - 16145 Genova

This paper describes a composition project based on the WinProcne/HARP system, a tool able to represent and process music knowledge in real time. The goal of the project is twofold: on the one hand it aims at putting the new version of the system to work in a real compositional framework; on the other hand it pursues the artistic goals of the composer. The system has been used in a concert at Palazzo Ducale (Genova) and at Villa Arconati (Milano). It will be used in a multimedia theatrical event for the joint management of humanoid animations and music.

WinProcne/HARP in brief
WinProcne/HARP is a software environment for creating, updating and querying knowledge bases through a high-level graphical environment implemented under Microsoft Windows 3.1 and available on 386/486 PCs [3,4]. The system is based on a hybrid formalism that integrates an object-oriented concurrent environment dealing with sound signals (samples, MIDI messages) and compositional algorithms with a symbolic level based on a multiple-inheritance semantic network language derived from KL-ONE [2] and extended with temporal primitives. The system allows the definition of a formal, symbolic data base integrated with a set of analogical experts, a kind of "actor" system [1]. The experts can be activated by a special actor (the Simulative Engine), guided by inferences on the network. The analogical and symbolic levels are connected by an interface that defines the links between analogical experts and their activation order, and manages the interactions between the two levels. In other words, once the symbolic data base containing the knowledge of the problem domain has been defined, this knowledge can be 'grounded' onto a set of analogical experts able to execute different kinds of transformation, and the experts can be activated by a scheduling algorithm that uses the information of the symbolic level and updates the context according to the events occurring in real time.
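
The division of labour described above, in which requests asserted at the symbolic level activate experts that emit low-level messages, can be pictured with a very small dispatch loop. Everything in the sketch is an assumption made for illustration: the `Engine` class, the request name and the volume-ramp expert are hypothetical and do not reproduce the WinProcne/HARP interfaces. The only external fact relied on is that MIDI Control Change number 7 is the channel volume controller.

```python
from collections import deque
from typing import Callable, Deque, Dict, List, Tuple

MidiMessage = Tuple[int, int, int]             # (status byte, data 1, data 2)
Expert = Callable[[dict], List[MidiMessage]]   # an expert turns a context into messages


class Engine:
    """Hypothetical stand-in for an engine that activates experts in request order."""

    def __init__(self) -> None:
        self.experts: Dict[str, Expert] = {}
        self.requests: Deque[Tuple[str, dict]] = deque()

    def register(self, request_type: str, expert: Expert) -> None:
        self.experts[request_type] = expert

    def assert_request(self, request_type: str, **context) -> None:
        """Queue a request, standing in for a new assertion at the symbolic level."""
        self.requests.append((request_type, context))

    def step(self) -> List[MidiMessage]:
        """Activate the expert for each pending request and collect its output."""
        out: List[MidiMessage] = []
        while self.requests:
            kind, context = self.requests.popleft()
            out.extend(self.experts[kind](context))
        return out


def raise_volume(context: dict) -> List[MidiMessage]:
    """Emit a ramp of Control Change 7 (volume) messages on one channel."""
    ch, start, end = context["channel"], context["start"], context["end"]
    return [(0xB0 | ch, 7, value) for value in range(start, end + 1)]


engine = Engine()
engine.register("raise_volume", raise_volume)
engine.assert_request("raise_volume", channel=2, start=100, end=102)
print(engine.step())   # [(178, 7, 100), (178, 7, 101), (178, 7, 102)]
```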

The environment
WinProcne has been used in this project to control three Proteus synthesizers and a MARS workstation (developed by Iris Bontempi Farfisa) through a MIDI interface. Notes and MIDI controls are generated for each instrument; some controls, for example "main volume", are used with their "obvious" meaning, while others are used to control the synthesis process, both in the envelope and during the execution of a single note. The controls associated with the algorithms have been designed to produce timbral evolution.
One of the main problems the composer met in producing music with this architecture was the control of a large number of parameters: for every instrument, four keyboards (one for each MIDI channel) and forty cursors (ten controls for each channel) have to be controlled at the same time. WinProcne/HARP allows the real-time control of all these parameters.

The sonic fascia

The musical aim of the composer is to build a musical situation, the fascia, in which a continuum of situations is combined according to a logic or to metaphors such as the movement of particles in force fields. The concept of fascia can be classified according to different features. At one extreme there are notes with long durations (even in the range of one minute) and a small amount of pitch "verticality", where variations are generated by controlling the synthesis algorithm; at the opposite extreme there are short, overlapping notes, with less work on the timbre.
The continuous evolution between the two extremes and within the different musical moments is controlled by WinProcne/HARP. The most cryptic phase of the work was the formalization of the logic that guides the different moments and their evolution: for the generation of new notes and the control of the evolving notes, the composer uses rules that he himself is not able to describe clearly. The knowledge base describing this logic, together with the experts that generate the corresponding controls, was built during long test sessions in which the composer described his needs, which were formalized by trial and error. In this way we developed a compositional model.
The final result is an environment in which the composer can manipulate, interactively during a performance, a kind of "knob", i.e. active cursors that work on two-dimensional force fields (see figure 2). The term "active" means that the cursor, once placed, does not stay in the chosen position but moves in a force field following a given movement law (e.g., toward a
minimum, the darkened area in the figure). The cursors are used to control high-level parameters (musical moments, note density, quantity of timbre controls); if the cursors move freely, some predefined behaviors occur (for example, the cursor that controls the density of notes and the quantity of distortion has been set to swing, over a few minutes, between high density/few controls and low density/many controls). In any case, the performer can move the cursor manually at any moment, thus accelerating or delaying the evolution of the system, or changing it completely.
According to the state of the active cursors and the state of the performance (quantity and pitch of the active notes on each channel, volumes, timbres in use), intermediate-level requests (such as "raise the volume on a given channel", "make the envelope more percussive for the next notes", "modify the timbre on a channel") are generated. These requests correspond to new assertions in the symbolic subsystem. The MIDI controls are managed by the analogical experts: a timbre modification, for example, requires an expert that generates a sequence of MIDI messages modifying the value of one or more controls continuously.

Conclusions
WinProcne/HARP proved to be a flexible tool for music modeling and real-time performance. This experiment produced significant feedback for identifying some weak points of our system model. A first public performance was held in April 1993 at Palazzo Ducale in Genova, followed by a concert at Villa Arconati in Milano in June. WinProcne will be used for a theatrical performance on the subject "Frankenstain e Pinocchio", directed by Mario Jorio, at the "Altrove" theater in Genova (early 1994).

Bibliography
[1] G. Agha, C. Hewitt, "Actors: A Conceptual Foundation for Concurrent Object-Oriented Programming", in Research Directions in Object-Oriented Programming, B. Shriver and P. Wegner (Eds.), Cambridge, MIT Press, 1987.
[2] R.J. Brachman and J.G. Schmolze, "An overview of the KL-ONE knowledge representation system", Cognitive Science, 9, pp.171-216, 1985.
[3] A. Camurri, C. Canepa, M. Frixione, and R. Zaccaria, "HARP: A framework and a system for intelligent composer's assistance", IEEE COMPUTER, Vol.24, No.7, pp.64-67, July 1991.
[4] A. Camurri, C. Canepa, M. Frixione, C. Innocenti, C. Massucco, R. Zaccaria, "HARP: un ambiente ad alto livello per l'ausilio alla composizione", Atti IX Colloquio di Informatica Musicale, Genova, 1991.
[5] F. Courtot, "CARLA: Knowledge Acquisition and Induction for Computer Assisted Composition", Interface, Vol.21, pp.191-217, Swets & Zeitlinger.

Figure 1: a fragment of the knowledge base.

Figure 2: A force field for the production of a fascia object

Section 6b

MARS

APPLI20:
A DEVELOPMENT TOOL FOR BUILDING
MARS APPLICATIONS WITH
AN EASY TO USE GRAPHICAL INTERFACE
P. Andrenacci, F. Armani, A. Prestigiacomo, C. Rosati
IRIS s.r.l.
Parco La Selva, 151
03018 Paliano (FR), Italy
fax: +39 (775) 533343 - tel: +39 (775) 533441
E-mail: mc2842@mclink.it

MARS is an integrated environment in which a graphical user interface, a realtime operating system and two general purpose digital signal processors are linked together to create a general sound processing system that is interactive and operates in realtime [1, 4].

EDIT20 provides a multipurpose interactive graphical approach to define any kind of audio object, with immediate sound feedback [1, 4]. However, an application programmer may want to extend the existing MARS development environment on the host with his own programs [2].

APPLI20 is a tool to support MARS application programming on the host computer. It offers a programming approach to the MARS system through a set of libraries and interfaces that facilitate the development of MARS applications.

Why APPLI20?

The high level approach provided by EDIT20 [4] does not satisfy all MARS's possible users. Application programmers have more particular needs than those which EDIT20 can satisfy. The MARS environment on the host may require extension by new programs developed by IRIS and by the users themselves.

A MARS application program should satisfy some general requirements:
• to be an autonomous program;
• its user interface should be easily and quickly produced;
• to program and control the sound generation board or to create libraries of sounds and MIDI environments, without using EDIT20;
• to exchange data and cooperate with EDIT20 or other applications;
• to optimize the use of MARS resources, e.g. X20 microprograms and memories, when EDIT20 is not able to do it;
• to exploit the combined performance and capabilities of MARS and of the host computer to cover any aspect that EDIT20 does not or cannot solve (for example, to realize a realtime 2048 points FFT analysis with a graphical data presentation);
• to be coded by means of high-level standard programming languages;
• to reuse code previously developed and tested;
• to have a uniform style in coding and presentation.

In order to satisfy all the above requirements, IRIS offers to the application programmer APPLI20, a set of libraries with a well defined Application Programmer Interface, available in the C Language.

In APPLI20 the programmer finds tools to fully program and control the sound generation board and a toolkit of graphics objects for building user interfaces in a uniform style, while hiding implementation details and reducing development time.

[Figure 1: Application framework. Blocks shown: APPLICATION; SSERVER (Data Structure Manager); MSERVER (microprogram manager); TSERVER (Communication & DSP Resource Manager); Graphic Toolkit; HOST OPERATING SYSTEM.]

APPLI20 is the kernel of the EDIT20 architecture. In this sense, EDIT20 is the first complex and sophisticated example of an application program based on APPLI20.

APPLI20 architecture

APPLI20 consists of a set of three libraries that are in a hierarchical relation (SSERVER, MSERVER and TSERVER), and of a graphical toolkit that offers graphical objects in the EDIT20 style. A typical MARS application environment is shown in figure 1.

The APPLI20 architecture reflects the MARS data structure (see fig. 2), which can be divided into three "abstraction layers":
a. the Performance Environment,
b. the Orchestra,
c. the DSP.

The lower level (c) is related to the microprogram (μP), registers (REGS), data memories (DM) and sample memories
(FUN) of the two X20 DSPs. It also contains the virtual memory that maps the data memories of the two X20 chips to a single virtual space.

[Figure 2: MARS data structure. Layers shown: a. Performance Environment (fed by the MIDI channels); b. Orchestra; c. DSP (X20).]

The Orchestra (b) offers a structured point of view of the DSP layer; it is a collection of algorithms described in terms of [1, 2]:
• number of voices (clones),
• entry points (envelopes, parameters, LFOs),
• audio routing.

The higher level (a) defines the MIDI Performance Environment, based on the Orchestra. It consists of tones defined over Orchestra's algorithms, and of a Channel Map that links tones to MIDI channels. This environment is supported by the embedded realtime MIDI management (voice and tones allocation policy, events triggering, ...).

A tone is a complete configuration of an algorithm's variables. It assigns specific values and MIDI controls to the parameters of the algorithm. These are described in terms of:
• envelope definitions,
• values of static parameters,
• dynamic parameters' control structures and tables,
• LFO definitions.

APPLI20 libraries

TSERVER. The Transmission Server implements access to the MARS DSP resources of the level (c), hiding from the user the details of data communication between the host and the board. The media for transmission can be chosen at run time between MIDI and parallel port. It is also possible to save on a file the "log" of a session for debugging or archiving purposes.

A summary of the functionalities offered by the TSERVER includes:
• initialization and test of the communication media;
• read and write of X20 registers (to perform ON, OFF, ...);
• installation of X20 microprograms;
• direct read and write of algorithm parameters from/to X20 data memory (DM) locations, using physical or virtual addresses;
• load and dump of samples;
• pack and unpack of MARS messages.

DM locations are shared with the MARS realtime operating system, which updates parameters in response to MIDI messages: it is the user's responsibility to avoid conflicts with his/her application.

MSERVER. The Microprogram Server offers the programmer a set of objects and functionalities to build an Orchestra, hiding the details about level (c).

The Orchestra working environment consists of algorithms, each one composed of a collection of cooperating and interconnected objects: the modules. A module is the "atomic" object, and implements a portion of the microprogram code, allocating the necessary DM locations for its entry points (used for module interconnection and data entry).

The implemented set of functions includes:
• realtime creation and editing of algorithm modules (Add, Delete, Connect, ...);
• creation and editing of the Orchestra's environment as algorithm clones and their audio routing;
• transmission to the board of the X20's microprograms and data memory configurations using TSERVER;
• access to the Orchestra's data files in the EDIT20 format.

SSERVER. The Structure Server extends the MSERVER objects and functionalities in order to allow the definition of a complete MIDI Performance Environment, level (a).

It offers an MSERVER-style approach to defining an Orchestra, adding special information for the MARS realtime control. All this hides the use of MSERVER, details of realtime requirements, and details of DSP aspects.

Moreover, it adds objects (dynamic and static parameters, LFOs, envelopes, data tables, tones) and functionalities (Create, Delete, Edit) to give values and MIDI control to an algorithm's entry points, to create libraries of tones and to link tones to MIDI channels.

SSERVER creates on the host an image of the three-level MARS data structure for
a complete MIDI Performance Environment.

This image can be transmitted to the MARS board by means of high level object-specific functions that hide the use of TSERVER and the details of data protocol and I/O access. Furthermore, SSERVER offers special interaction modes to automatically transmit the image when modified.

The image is in EDIT20 compatible format. It can be saved on the host and retrieved for subsequent reuse. This means that an image can be exchanged between EDIT20 and other SSERVER applications.

Graphics Toolkit

The APPLI20 graphics toolkit is a set of libraries originally created to develop a MARS application in a graphical EDIT20 style [1, 2, 4]. However, it has no links with the other libraries (TSERVER, MSERVER and SSERVER), and can be used to develop non-MARS graphics applications. Thus it was extensively used at IRIS to develop graphics simulators, editors and MIDI applications.

The toolkit allows the programmer to think in terms of graphics objects with callback mechanisms. The objects it manages are windows, list selectors, sliders, buttons, LEDs, editable text fields, and graphics areas (see fig. 3).

Figure 3: Graphic toolkit's objects.

The toolkit's current implementation is based on the Atari GEM Operating System [6], using its AES (Application Environment Services) and VDI (Virtual Device Interface) functions to manage the user interaction with the objects in an event-driven fashion.

In order to allow MARS applications to easily migrate to non-Atari platforms [5], a great effort is currently being made at IRIS to evolve the toolkit to a portable version. This is being accomplished using multi-platform libraries (Cicero C-Tools, XVT toolkit) to reimplement the low-level layer of the APPLI20 graphics toolkit.

Examples

The following IRIS applications show how the APPLI20 libraries are employed to build a user interface and to control and monitor in realtime the MARS board.

VOCODER. This application implements a 12 channel vocoder, based on two banks of second order filters, where the frequency, the bandwidth, and the amplitude of each filter are controlled in realtime. An incoming signal from one ADC channel (typically a voice) is analyzed by the first bank of filters. The analysis results are used by the second bank for the resynthesis of external signals coming from another ADC channel. It is also possible to use a noise generator and/or a MIDI controlled oscillator as excitation sources for the resynthesis.

Figure 4: VOCODER.

This application employs the graphics toolkit's callback mechanism to link user interactions with the activation of the other APPLI20 libraries.

First, the program initializes the MARS board, using MSERVER and SSERVER functions to install the MIDI performance environment previously created by means of EDIT20.

After that, the graphics toolkit is used to manage user actions (slider movements, data entry, ...) while TSERVER functions are continuously called to monitor the amount of energy in each of the VOCODER's channels.

CORDAM. This application implements the physical model of a string composed of 37 springs and 36 masses.

It is an experimental environment that lets the user configure the system, excite it, graphically visualize the results and hear them.

Figure 5: CORDAM.

The parameters of the model can be changed in realtime while the string is playing. It is possible to apply an extra instantaneous force at any point of the string, by simply clicking the mouse at that point. One can also choose the strength and the direction of the force.

The algorithm computation, at the MARS sampling rate, and the slower redrawing of the string, are asynchronous. To synchronize them, a step-by-step mode is also provided that makes it easier to analyze the behaviour of the system.

The following parameters are configurable: 36 independent masses, damping coefficient, elastic coefficient of the springs and gravitational acceleration. It is possible to free/lock independently the left and right edges of the string. The produced sound can be sampled and saved in a file to be used for pedagogical purposes.

The CORDAM's microprogram is written using the ASM20 assembler [3] to maximize the number of springs and masses in the string. EDIT20 is used to create an orchestra from Assembler programs. Only one X20 is used.

The orchestra is loaded using MSERVER; the updating of the parameters and the step-by-step mechanism are made through TSERVER calls. In order to edit the 36 masses, the user can specify his/her own formula that is translated and evaluated by an interpreter included in APPLI20.

FFT. This application performs a 2048 band realtime Fast Fourier analysis on an input signal and graphically shows the results on the screen.

Figure 6: FFT.

The input signal may come from the outside via the ADC converters, or may be synthesized by any algorithm running on MARS itself. In this case the FFT can be used to debug a previously defined algorithm.

The FFT is based on a special X20 microprogram, written using ASM20: it divides the input flow in 2048 sample slices, performs the analysis which gives a complex result, and computes the modulus and the logarithm of the modulus of this result; the output is saved at a fixed address in the X20 FUN memory.

The FFT program configures the DSP chips through TSERVER functions that load data and microprograms into the sound generation board. Then in realtime it reads data from the X20 FUN memory, displays them on the screen and manages the user interaction.

The graphics interface is built using the APPLI20 tools. The callback mechanism allows one to separate code for the user interface from that for graphics. FUN memory data is read continuously by the TSERVER functions.

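
The CORDAM string model (36 masses joined by 37 springs, with damping, elasticity, gravity, lockable edges and an instantaneous force) can be sketched in C as follows. This is a minimal host-side illustration, not the ASM20 microprogram that runs on the X20: the type `StringModel`, the functions `string_init`, `string_kick` and `string_step`, and the choice of semi-implicit Euler integration are all assumptions made for the example.

```c
#include <assert.h>

#define N_MASSES 36  /* 36 masses joined by 37 springs, as in the text */

/* Hypothetical host-side state of the string; all names are illustrative. */
typedef struct {
    double y[N_MASSES];     /* displacement of each mass */
    double v[N_MASSES];     /* velocity of each mass */
    double mass[N_MASSES];  /* the 36 independent masses */
    double k;               /* elastic coefficient of the springs */
    double damping;         /* damping coefficient */
    double gravity;         /* gravitational acceleration */
    int left_locked;        /* 1 = edge locked, 0 = edge free */
    int right_locked;
} StringModel;

/* Start from a string at rest, with equal masses and both edges locked. */
void string_init(StringModel *s, double mass, double k,
                 double damping, double gravity)
{
    assert(mass > 0.0);
    for (int i = 0; i < N_MASSES; i++) {
        s->y[i] = 0.0;
        s->v[i] = 0.0;
        s->mass[i] = mass;
    }
    s->k = k;
    s->damping = damping;
    s->gravity = gravity;
    s->left_locked = 1;
    s->right_locked = 1;
}

/* Instantaneous force: an impulse changes the velocity of one mass. */
void string_kick(StringModel *s, int index, double impulse)
{
    assert(index >= 0 && index < N_MASSES);
    s->v[index] += impulse / s->mass[index];
}

/* Advance the model by one time step dt (semi-implicit Euler). */
void string_step(StringModel *s, double dt)
{
    double force[N_MASSES];
    for (int i = 0; i < N_MASSES; i++) {
        /* A locked edge is an anchor at 0; a free edge follows the end
           mass, so its spring exerts no force. */
        double left = (i > 0) ? s->y[i - 1]
                              : (s->left_locked ? 0.0 : s->y[i]);
        double right = (i < N_MASSES - 1) ? s->y[i + 1]
                                          : (s->right_locked ? 0.0 : s->y[i]);
        force[i] = s->k * (left - 2.0 * s->y[i] + right)
                 - s->damping * s->v[i]
                 - s->mass[i] * s->gravity;
    }
    for (int i = 0; i < N_MASSES; i++) {
        s->v[i] += dt * force[i] / s->mass[i];  /* update velocity first */
        s->y[i] += dt * s->v[i];                /* then position */
    }
}
```

In such a sketch, calling `string_step` once per audio sample would correspond to the realtime computation, while calling it only on user request would mimic the step-by-step mode. The time step has to stay small relative to sqrt(mass / k) for this integration scheme to remain stable.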
Conclusion

MARS is an open system that can be explored and adapted to a wide range of user needs.

APPLI20 has proven to be a powerful tool to easily and quickly develop user applications.

At this time, IRIS has developed applications in many fields, such as:
• Music composition, live electronics, recording;
• Education - musical instruments' physical modelling, synthesis methods;
• Science - voice encoding/decoding with LPC methods, image processing.

IRIS is studying the portability of APPLI20 and, more generally, of the MARS system to other platforms [5].

Acknowledgements

Special thanks to Emmanuel Favreau who contributed to the APPLI20 design, and to Lorenza Bizzarri and Andrea Paladin who solved the DSP aspects of the CORDAM and FFT applications.

References
[1] P. Andrenacci, E. Favreau, N. Larosa, A. Prestigiacomo, C. Rosati, S. Sapir: "MARS: RT20M/EDIT20 - Development tools and graphical user interface for a sound generation board", ICMC '92 Proceedings, pp. 340-343, Oct. 1992.
[2] F. Armani, L. Bizzarri, E. Favreau, A. Paladin: "MARS-DSP environment and applications", ICMC '92 Proceedings, pp. 344-347, Oct. 1992.
[3] E. Favreau, A. Prestigiacomo: "ASM20 User's Guide", IRIS, 1990.
[4] E. Favreau, S. Sapir: "La Stazione MARS: dalla progettazione di algoritmi alla realizzazione di ambienti esecutivi dedicati", X CIM Proceedings, Milano 1993.
[5] E. Maggi, A. Prestigiacomo: "Portability of the MARS System", X CIM Proceedings, Milano 1993.
[6] D. Prochnow: "The GEM Operating System Handbook", Tab Books Inc., 1987.

THE MARS STATION: FROM ALGORITHM DESIGN TO THE
REALIZATION OF DEDICATED PERFORMANCE ENVIRONMENTS
E. Favreau, S. Sapir

I R I S S.r.l.
Parco La Selva 151
03018 Paliano (FR) ITALY
Fax +39 775 533343 - Tel +39 775 533441
E-mail: mc2842@mclink.it

Abstract:
The Musical Audio Research Station (MARS) is a specialized digital machine for real time audio applications which has been entirely developed by the Italian Bontempi-Farfisa research institute IRIS (Istituto di Ricerca per l'Industria dello Spettacolo), located close to Rome.

MARS has been conceived as an interactive and integrated environment for audio research, musical production and computer music pedagogy. It is dedicated to people who have hit the limits of the digital musical instruments available right now on the market and to people who like a programmable and flexible sound machine with real time performance. The easy and interactive user interface provides a means of graphic definition for audio objects and an immediate sound feedback.

MARS is a development system for every type of real time digital signal processing, such as analysis, synthesis, every type of filter, and sound effects. MARS is also a development system for sounds and MIDI environments that allows musicians to use it, once configured, as a musical instrument like any other MIDI equipment in a music studio.

1. INTRODUCTION
MARS consists of a sound generation board called SM1000 [4]. This board contains:
• a microcontroller (MC68302) for the real time management of the whole system;
• two X20 processors for real time sound processing.

The power of an X20 processor can be estimated from the number of DSP operations executed in parallel at a given sampling rate. An X20 can, for example, implement a bank of 256 independent oscillators controllable in frequency and amplitude, or a bank of 256 second-order filters. It can also implement a 2048-point FFT, or 16 four-operator FM voices, or 16 harmonizers, or a combination of any kind of algorithms.

The SM1000 board is connected to a personal computer through a MIDI line and/or a parallel line. The personal computer supports the user interface for configuring the whole system.

Once configured, the SM1000 board can be disconnected from the host personal computer, thus becoming a MIDI-controlled instrument.

The software [1], [6] that allows the configuration and use of the station comprises:
• the RT20M operating system, resident on the SM1000 board, in charge of managing the X20 processors, MIDI messages and other asynchronous events;
• the EDIT20 user interface, i.e. the integrated graphical development environment with interactive editors, used to design algorithms (sound processing procedures), tones (configurations of the algorithms' control parameters), orchestras (sets of algorithms) and MIDI performance environments (sets of tones and their MIDI map).

EDIT20 allows the immediate development of all the sound and musical objects that contribute to the definition of a performance environment. For each object there is a tool or an editor that allows it to be defined.

The following sections describe a working session with EDIT20 that illustrates its most significant aspects.

2. ALGORITHMS
We want to implement the Karplus-Strong algorithm [7], which simulates plucked strings. To create an algorithm, the ALGO editor must be activated. The screen of this editor comprises a menu, a drawing area and a palette of icons representing the modules predefined by the system for building algorithms.

EDIT20 provides about 100 modules that cover most needs in the DSP field [2]. These modules are ordered according to functional criteria:
• Elementary operators: arithmetic, logic, mechanical, and accesses to
external memory for samples or delay lines, ...;
• Signal generators: simple oscillators, noise generators, more complex algorithms such as additive synthesis, ...;
• Sub-sampled signal generators: envelopes, LFO, vibrato, ...;
• Modulators and distorters: FM, DNL, AM, ...;
• Logic transformers: trigger, Sample&Hold, noise gate, ...;
• Effects and transformations applied to the audio signal in time (delay, reverb, ...), on the spectrum (filters), on pitch (harmonizer, pitch follower, ...) and on amplitude (mixer, pan, envelope follower, ...).

To build the algorithm, the user assembles a series of interconnected modules in the graphics area. This is done interactively, with the mouse and key combinations.

The algorithm is implemented on the X20 processor at the moment it is drawn, with no waiting or compilation time.

Our example (Fig. 1) shows the KS algorithm. An envelope generator (env) controls the amplitude of the noise (RND) to create the excitation of the string. This excitation is then fed into a feedback chain consisting of a delay line followed by a first-order low-pass filter, which simulates the damped propagation of the wave along the string.

The modules of this algorithm are controlled by parameters whose values can be set very easily while the algorithm is being developed, in order to check that it works properly.

3. DEBUG
In the ALGO environment, EDIT20 provides a set of tools for testing the algorithm that make it possible to hear and display the signal. There are 4 kinds of probes that can be inserted at any point of the algorithm to follow the flow of the audio signal:
• 4 audio probes, represented by a loudspeaker;
• 4 numeric displays, represented by a small square window;
• 4 oscilloscope traces, represented by a probe;
• 4 inputs for injecting an analog signal, represented by a syringe.

There are also other tools common to all the other editors:
• VALUATOR is a tool for editing numeric values. Its distinctive feature is that it can represent
and manipulate data in the desired unit (Hz, dB, ms, ...).
• X-DEBUG is a tool that simulates a MIDI keyboard. It also allows the contents of the data memories of the X20 processors to be monitored.
• WAVE-LOADER is a tool for accessing the external memory of the X20 processors, to dump samples to a file or load them from a file.

4. TONES
The KS algorithm seen above is controlled statically, with constant values. To use it live in a musical context, it must be turned into a "musical instrument".

This is done by defining the tone interface of the algorithm, that is, by choosing the set of parameters that must be "played" live and by establishing the conversion rules to be applied to the data received from the controlling MIDI devices.

In this example (figure 3), we control the frequency of the sound, i.e. the parameter that sets the duration of the delay, or equivalently the length of the string.

A variable named pitch is therefore created to replace the delay constant, which is set to 4 ms in figure 1.

The pitch variable takes the DYNAMIC attribute, i.e. it depends on the values of MIDI devices according to a conversion rule chosen among the 17 presets.

In this case the conversion rule is:
pitch = M1
with M1 defined as follows:
M1 = KSPITCH(key) * 1.0

With this rule the user makes the delay value depend on the value of the key pressed. The key value (key) is used as an index into a table (KSPITCH) that contains the description of the desired pitch scale.

A tone of this algorithm is therefore characterized by the definition of the pitch variable, as regards its response to MIDI controls, and by the definition of the envelope env, as regards its evolution in time. In this way, different sounds can be produced simply by defining different tones over a single algorithm.

The PAR, ENV, LFO and TAB editors take part in the definition of the tone, setting respectively the DYNAMIC parameters, the envelopes, the LFOs and the conversion tables (such as KSPITCH) [8].
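
The KS structure and the key-to-delay rule described in sections 2 and 4 can be sketched in C as follows. This is an illustrative host-side model, not the X20 implementation: the names `KsVoice`, `ks_build_pitch_table`, `ks_init`, `ks_noise` and `ks_tick` are hypothetical, the two-point average stands in for the first-order low-pass filter, and the table is filled with 12-tone equal temperament (key 69 = 440 Hz) at a caller-supplied sample rate, whereas the text only says that KSPITCH describes the desired pitch scale.

```c
#include <assert.h>
#include <stdlib.h>

#define KS_MAX_DELAY 4096  /* illustrative upper bound on the delay, in samples */

/* One plucked-string voice: a delay line closed on itself through a low-pass. */
typedef struct {
    double line[KS_MAX_DELAY];  /* circular delay line (the "string") */
    int length;                 /* delay in samples (the string length) */
    int pos;                    /* current read/write position */
    double prev;                /* previous delay output, used by the filter */
    double feedback;            /* loop gain, normally slightly below 1 */
} KsVoice;

/* Fill a key-to-delay table in the spirit of KSPITCH (equal temperament
   is an assumption of this sketch). */
void ks_build_pitch_table(int table[128], double sample_rate)
{
    const double semitone = 1.0594630943592953;  /* 2^(1/12) */
    double freq = 8.175798915643707;  /* MIDI key 0 when key 69 is 440 Hz */
    for (int key = 0; key < 128; key++) {
        int delay = (int)(sample_rate / freq + 0.5);  /* round to nearest */
        if (delay < 2) delay = 2;
        if (delay > KS_MAX_DELAY) delay = KS_MAX_DELAY;
        table[key] = delay;
        freq *= semitone;
    }
}

/* Prepare a silent voice with the given delay length and loop gain. */
void ks_init(KsVoice *v, int delay_samples, double feedback)
{
    assert(delay_samples >= 2 && delay_samples <= KS_MAX_DELAY);
    for (int i = 0; i < KS_MAX_DELAY; i++)
        v->line[i] = 0.0;
    v->length = delay_samples;
    v->pos = 0;
    v->prev = 0.0;
    v->feedback = feedback;
}

/* White noise in [-1, 1], playing the role of the RND module. */
double ks_noise(void)
{
    return 2.0 * rand() / RAND_MAX - 1.0;
}

/* Compute one output sample; excitation is the env-scaled noise (or 0). */
double ks_tick(KsVoice *v, double excitation)
{
    double delayed = v->line[v->pos];               /* written length ticks ago */
    double filtered = 0.5 * (delayed + v->prev);    /* first-order low-pass */
    double out = excitation + v->feedback * filtered;
    v->prev = delayed;
    v->line[v->pos] = out;                          /* feed back into the line */
    v->pos = (v->pos + 1) % v->length;
    return out;
}
```

To play a key with such a sketch, one would call `ks_init(&voice, table[key], 0.99)`, feed `envelope * ks_noise()` as the excitation for the first samples and 0 afterwards. Because the averaging filter adds a fractional delay of its own, the resulting pitch is only approximately that of the table entry.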

5. ORCHESTRAS
The orchestra consists of a set of algorithms, each of which is connected to the audio bus system (8 outputs and 4 inputs). For each algorithm the degree of polyphony must be set, i.e. the number of duplications of the algorithm. In this way the families of algorithms of the orchestra are defined.

In this example (figure 4), the KS algorithm described above has been duplicated 8 times, and the resulting family has been connected to output bus 1. In addition, 2 effects have been added, one monophonic (EFFECT_1) and one stereophonic (EFFECT_2). These algorithms take the signal produced by the KS family on bus 1, transform it and send it to output bus 2 and to output buses 7 and 8, respectively.

The orchestra is built with the ORCH editor. In this environment the interactions with the SM1000 board no longer take place in real time: the orchestra must be compiled before it is saved and sent to the sound generation board.

The ORCH environment includes a mixer that, once the orchestra has been sent, allows the output levels of the audio buses to be balanced in real time.

6. PERFORMANCE ENVIRONMENTS
To complete the working session with EDIT20, the concept of TMAP must be introduced; it is the equivalent of the MIDI maps found in all commercial MIDI instruments. The TMAP is described with the editor of the same name. For each algorithm it collects the list of tones used in the performance and the default relation between MIDI channels and tones.

The MIDI map can contain up to 128 tones, which can be assigned dynamically to the MIDI channels during the performance by sending Program Change messages.

In this example 2 tones have been loaded for the KS algorithm: Plucked and Bowed. The first reproduces the sound of a plucked string, while the second reproduces the sound of a string played with a bow (the difference depends mainly on the shape of the envelope). These tones have been associated with MIDI channels #1 and #2 respectively.

The effects also have their own tones: Effect_a and Effect_b. These tones have been associated with MIDI channels #3 and #4.

Once the TMAP has been set, the user has a multi-algorithm orchestra ready to be used live via MIDI with a
master keyboard, a sequencer, or any program or device that generates MIDI data. In this example, the user can activate the effects with a NOTE ON on channels #3 and #4, and can play the KS algorithm in a polyphonic (8 voices) and multi-timbral (2 tones) context.

7. CONCLUSION
With EDIT20, algorithms, tones, orchestras and MIDI maps can be developed in a simple and intuitive way. However, EDIT20 remains a general-purpose tool that does not optimize the resources of the station. The MARS station is therefore also supplied with an assembler [5] and a "C" library [3] for those who want to build optimized orchestras and develop their own user interfaces or their own control applications.

MARS has been used in several research centres for the study of new sound processing techniques and for teaching computer music. The station has also proved very popular with musicians, especially for the creation of performance environments for the "Live Electronics" genre.

REFERENCES
1. P. Andrenacci, E. Favreau, N. Larosa, A. Prestigiacomo, C. Rosati, S. Sapir, "MARS: RT20M/EDIT20 - Development tools and graphical user interface for the sound generation board", ICMC proc., pp. 340-343, San Jose 1992.
2. F. Armani, L. Bizzarri, E. Favreau, A. Paladin, "MARS - DSP environment and applications", ICMC proc., pp. 344-347, San Jose 1992.
3. P. Andrenacci, F. Armani, A. Prestigiacomo, C. Rosati, "APPLI20: a development tool for building MARS applications with an easy to use graphical interface", Proceedings of the X CIM, Milano 1993.
4. S. Cavaliere, G. Di Giugno, E. Guarino, "MARS: X20 architecture and SM1000 sound generation board description", ICMC proc., pp. 348-351, San Jose 1992.
5. E. Favreau, A. Prestigiacomo, "ASM20 User's Guide", IRIS internal document, 1990.
6. IRIS, "Musical Audio Research Station - Guida per l'utente", internal document, 1992.
7. K. Karplus, A. Strong, "Digital Synthesis of Plucked String and Drum Timbres", Computer Music Journal, vol.7, n.2, pp. 43-55, Summer 1983.
8. G. Palmieri, S. Sapir, "MARS - Musical applications", ICMC proc., pp. 353-353, San Jose 1992.

Figure 1. Screen of the ALGO editor

Figure 2. VALUATOR

Figure 3. PAR: definition of the pitch variable

Figure 4. ORCH: DEMO orchestra

FUNCTION CELLS FOR THE REALIZATION OF
ELECTRONIC MUSICAL SYSTEMS
E. Guarino, R. Bessegato, E. Maggi

I R I S S.r.l.
Parco La Selva 151
03018 Paliano (FR) ITALY
Fax +39 775 533343 - Tel +39 775 533441
E-mail: mc2842@mclink.it

Abstract
As for any kind of product the designers of musical systems must focus on
the need for cost reduction, time to market, better performance, and
consistent quality. These conflicting exigencies call upon the designer's
ingenuity.
In the last decade silicon technology has moved through important and
major improvements. It has been helped through computer-based design
aids and libraries of pre-compiled functions.
But this is not enough: the trend toward more complex and complete
systems has justified the rising use of dedicated functions that are
eventually incorporated into the standard libraries.
At IRIS the need to implement families of musical systems has fostered the
study and design of a dedicated library containing macrocells particularly
suited for musical applications.
As a result IRIS has created a Microcontroller Softmacro, suitable for DSP
management and control, and a DSP-like Sound Generator that can both
manage a wide variety of sound-generation algorithms and implement a
polyphony that is adaptable to the system's needs.
In this paper, utilizing IRIS's accumulated knowledge, the pros and cons of
such an approach will be depicted, together with the description of a
particular device designed according to these methodologies.

1. Evolutionary aspects of ASIC design

The traditional design of integrated circuits is carried out at the transistor level [1][2]. With this method, today known as "full custom", the design of an integrated circuit requires considerable human resources and the support of very expensive systems that only the large semiconductor companies can afford.

The wide diffusion of ASICs (Application Specific Integrated Circuits), i.e. circuits designed "to measure", could only take place after the introduction of an alternative method, better known as "semi-custom", which raises the designer's view from the transistor to the functional cell. The ASIC approach, driven by the need to reduce design cost, was made possible by several circumstances [3]:
• the evolution of silicon technology and the ability to keep the behaviour of transistors in digital integrated circuits within sufficiently predictable limits;
• the possibility of representing the behaviour of such circuits with simple abstract models;
• the availability of CAE systems sophisticated enough to handle the description, placement and simulation of large sets of elementary cells.

Compared with full-custom design, the semi-custom methodology plays the same role as a high-level language compared with an assembly language: shorter design times and higher complexity, at the expense of efficiency and speed. For integrated circuits in particular, it is relatively easy to quantify the convenience of the different approaches: the design effort of a full-custom circuit is justified only if the number of parts to be produced makes it possible to amortize it. Conversely, limited production runs or tight development schedules favour the agile ASIC methodology, which in turn offers several alternatives:
• the Standard Cell (SC) approach optimizes area and lends itself to the integration of large cells with a repetitive structure (RAM, ROM, adders, multipliers, etc.), at a higher development cost;
• the Gate-Array and Sea-of-Gates (GA, SOG) approaches are preferred when macrocells (RAM, ROM) are not needed and/or because of their lower development cost;
• the more recent Embedded Array (EA) solution represents a compromise between SC and GA.

Silicon technology has developed at a dizzying pace in recent years. On the one hand this has produced a drastic reduction in transistor size (by a factor greater than 10), with a consequent increase in the density of circuit elements; on the other hand it has reduced the number of defects per unit area, which allows larger chips.

The combination of these factors has allowed a considerable increase in the complexity of integrated circuits in general and of ASICs in particular. To cope with this complexity, a logic and system-level approach, which can well be called "high level", has become necessary.

These circumstances require a massive use of software for the development and verification of designs: to mention a few examples, analog and logic simulators, behavioural models, logic synthesis packages and automatic Test-Vector generation [4][5].

Musical systems, in both their synthesis and control sections, are no exception to
this trend; at the same time, the goal of maximum integration makes it advisable to include "intelligent", that is programmable, sections: these simplify the circuit by eliminating complex combinational networks and state machines, and they make the design more controllable and more linear.

An obstacle to this trend is still the high cost of integrating any commercial processor into a system. It may then be convenient even to design an "ad hoc" processor: this solution is particularly advantageous for low-end systems where, despite the number of controls to be handled, the area occupied by the control unit must be kept to a bare minimum.

2. Distribution of tasks in a musical system

In electronic musical systems one can identify some characteristic functions that differ in the rate at which they must be processed. The implementations of these functions differ only in their architectural characteristics, which reflect the different constraints they are subject to. Sound synthesis, for example, must be strictly synchronous, repetitive and fast; low-frequency parameters (LFOs, timers) are also synchronous and repetitive, but are computed at lower rates; finally, the interactions of the system with the outside world (through keyboards, switches or potentiometers) are typically asynchronous and can be handled at still lower rates, given how slow human action is compared with the time scale of an electronic circuit.

From this analysis, and given the absence on the market of functional components with the desired characteristics, comes the decision to build a set of function-cells able to carry out the tasks described, and which are flexible, easy to interface with one another and scalable, that is, able to increase or decrease their architectural complexity according to the complexity of the musical system in which they are used.

From an implementation point of view, a function-cell is a circuit made of elementary cells that implements a given logic function at a higher level of abstraction.

The decision to build a library of this kind has the disadvantage of requiring a certain design and development time, but it has the advantage of drastically reducing the development time of the products that use the library, of making the cost of every single part of the system known in advance with good resolution and, above all, of allowing the structure of the system to be modified painlessly when the specifications change in the course of the project.

It is worth stressing that the integration of function-cells is a frontier of current hardware design methodology; it suffices to recall that the EEC, within the ESPRIT programme, has launched the OMI-DE project (Open Microprocessor Initiative - Deeply Embedded) for the integration of macrocells with complex (RISC) architecture. IRIS takes part in this programme with the MEDUSA project [6].

3. Description of an implemented device

One project for which this approach was adopted is a device intended for a musical toy with particular requirements of audio performance (such as a rhythm section and multitimbrality with various synthesis techniques) and of interactivity with the outside world (keyboard control, playback, etc.). The product had to consist of a single chip in order to minimize the cost of the components and of their assembly. From the functional specifications of the product it was possible to define a modular structure for the integrated circuit, made up of a synthesis unit (DSP), a control unit (Soft Macro Controller - SMC) and a conversion unit (PWM). Given the high degree of integration adopted in the project, it was decided to hard-wire the control software inside the system. This solution offers the advantages of low cost and less need for intervention during production, but it carries the risk of passing on to the whole chip the high probability of error present in software. To overcome this problem, as will be seen below, test procedures much more thorough than the usual circuit simulations are needed.

The DSP has computing power and word length suited to the medium-low profile required by the application. It is also assigned the generation of the low-frequency signals (LFOs, timers) and hence the macro-temporal synchronization of the whole system (metronomes, keyboard scanning, LED control and automatic power-off).

The tasks of the SMC include system configuration, management of playback and of interaction with the outside world, and transmission to the DSP of the parameters needed for synthesis (frequencies, durations, timbres).

The PWM cell is entrusted with D/A conversion.

The device occupies an area of 16 mm².

3.1. The DSP cell

The operation of the DSP is based on a synchronous circular flow of data rotating around a memory, controlled by a microprogram. The data memory contains the information essential for performing the synthesis.
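The split of tasks by update rate described in Section 2, which the device applies by giving the DSP both the sample-rate synthesis and the slower LFO and timer signals, can be illustrated with a small C sketch. It is not the chip's microprogram: the divider values LFO_DIV and SCAN_DIV and all names are assumptions chosen for illustration. It only shows how synthesis, low-frequency updates and keyboard scanning might all be derived from a single sample clock.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical dividers: these values are NOT taken from the paper. */
#define LFO_DIV   64u    /* LFO/timer update every 64 samples (assumed) */
#define SCAN_DIV  1024u  /* keyboard scan every 1024 samples (assumed)  */

typedef struct {
    uint32_t sample_count;   /* samples elapsed since reset         */
    uint32_t synth_runs;     /* executions of the synthesis task    */
    uint32_t lfo_runs;       /* executions of the LFO/timer task    */
    uint32_t scan_runs;      /* executions of the keyboard scan     */
} scheduler_t;

static void scheduler_reset(scheduler_t *s)
{
    assert(s != 0);
    s->sample_count = 0;
    s->synth_runs = 0;
    s->lfo_runs = 0;
    s->scan_runs = 0;
}

/* Call once per audio sample period. */
static void scheduler_tick(scheduler_t *s)
{
    assert(s != 0);
    s->synth_runs++;                       /* synthesis: every sample      */
    if (s->sample_count % LFO_DIV == 0)
        s->lfo_runs++;                     /* low-frequency parameters     */
    if (s->sample_count % SCAN_DIV == 0)
        s->scan_runs++;                    /* slow interaction with a user */
    s->sample_count++;
}
```

In a real device each counter increment would be replaced by the corresponding task; the counters are kept here only so that the relative rates can be checked.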
The simplicity of the structure does not prevent the implementation of a variety of sound generation algorithms: phase modulation, table-lookup synthesis with two interpolated oscillators, noise generation, filtering, etc.; other simple algorithms implement a programmable timer and an LFO. The 16-bit digital audio output is sent to the PWM cell.

From a functional point of view the DSP consists of the following basic blocks (fig. 1): a 32x16 data RAM (DM), a 16-bit arithmetic-logic unit (ALU) able to handle ADSR envelopes automatically, a 10x8 multiplier (MUL), a 512x8 ROM containing the timbre information and the waveform tables, and a second ROM (FREQ) for converting the key code into frequency; the data flow is governed by a microprogram ROM (128x20), while a simple interface allows communication with the SMC cell.

fig. 1

Envelope management, started by a NOTON message coming from the controller, automatically computes the first 3 segments of the ADSR; the NOTOFF message triggers the fourth.

The size of the microprogram is kept small by a suitable organization of the instructions into modules: combining the various modules makes it possible to implement all the algorithms mentioned above in a simple, effective and flexible way. The modularity of the microprogram proves extremely important: it allows the algorithmic characteristics of the sounds to be changed without affecting the architecture of the DSP.

In the implementation under consideration, the cell provides a polyphony of 8 voices, each of which can be associated with a different algorithm.

The cell can easily be scaled according to the polyphony and the arithmetic precision required.

3.2. The PWM cell

This cell performs 11-bit digital-to-analog conversion without penalizing the quality of the audio output too much; at the same time it reduces the cost of the conversion module to a minimum. Indeed, the listening tests carried out during development confirmed that PWM conversion is adequate already at 10 bits, a threshold easily reached with the operating frequency
of the device (32 MHz); by then handling the least significant bit appropriately, the threshold was raised to 11 bits.

3.3. The SMC cell

The structure of this cell (fig. 2) meets the requirements of complete controllability and of area minimization.

In particular, it must handle the scanning of the keyboard and of the command buttons, interpret their meaning and transmit the parameters corresponding to the commands received. Among these is the selection of the timing period of the whole system.

One part of this handling results in the immediate emission of messages to the DSP cell, another part starts the reading and decoding of a score, and yet another has the task of reconfiguring the system.

fig. 2

In order to save area, all the necessary memory elements have been gathered into a single physical block:
- the CPU registers;
- the stack;
- the pointers to the data area (useful for following the polyphony lines of the score individually).
The number of these elements was determined according to the particular needs of the application as:
- 32 registers (8 bit);
- 8 stack levels (16 bit);
- 8 pointers (16 bit);
all contained in a 64x8 RAM.

The chosen word width (8 bit) allows the memory space to be extended outside the device if needed.

The instruction set was chosen according to the characteristics of the system in which the cell is embedded. Each instruction is carried out by executing a variable-length series of nano-instructions: once these have been defined, different sequences produce different macro-instructions. The decoding of each nano-instruction takes place in two phases: the first identifies a homogeneous group of nano-instructions, that is, the subgroup of control signals involved, and the second actually changes their state.

Quick changes to the instruction set can therefore be made either by reorganizing a nano-routine or by acting on the second level of decoding.

In order to minimize power consumption and to synchronize the operation of the DSP and the SMC, a "sleep" mechanism was added, by which the cell can disable its own clock until an "awake" event arrives from a timer.

The whole cell, excluding program and data memory, consists of about 2000 gates and can execute an instruction in an average time of 160 ns.

3.4. The software for the SMC cell

The basic software consists of an assembly language [7], implemented with LEX and YACC on Unix systems, and a simulator written in C that implements a functional model of the SMC cell.

Both tools were ported to Atari machines, given how easily these systems can be interfaced with small hardware boards.

The environment met the need to verify the consistency of the defined instruction set and the correctness of the application program to be written into ROM.

Later, in order to approximate the real behaviour of the system, an emulator was built with FPGA circuits (ACTEL) which, together with the basic software, made it possible to test the program on a hardware platform. Another emulator allowed sound synthesis and the fine-tuning of the timbres.

An important part of the development environment is a highly user-friendly graphical software package for monitoring and debugging the emulator. This program was written so as to be portable to both the software environment and the emulator. Some tools for encoding the scores and for handling MIDI communication on the Atari completed the development environment.

The use of the C simulator made it possible to try out new instructions which, during testing, replaced others that were little used, saving memory in the object code.

The control program handles the keyboard, the allocation of voices in the DSP, the control of score playback and some auxiliary functions (LED management, metronome, volume and timbre controls). In addition, communication with the DSP makes it possible to adjust the speed of the synchronization signals and the metronome frequency.

A significant constraint on the control program was the maximum size of the ROM: 4 Kbytes, half of which are devoted to storing the scores supplied with the instrument.

The design of the cell progressed in step with the definition of the basic software and of the first software development environments needed to write the programs.
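The allocation of voices in the DSP, one of the tasks of the control program, can be sketched in C as follows. The paper does not describe the allocation policy: this sketch assumes the 8-voice polyphony mentioned in Section 3.1 and assumes that a new note takes the first free voice or, if none is free, steals the oldest sounding one. All names and the NO_KEY convention are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_VOICES 8       /* polyphony of the DSP cell (Section 3.1)    */
#define NO_KEY     0xFFu   /* marks a free voice (assumed convention)    */

typedef struct {
    uint8_t  key[NUM_VOICES];  /* key code per voice, NO_KEY if free  */
    uint32_t age[NUM_VOICES];  /* allocation timestamp per voice      */
    uint32_t clock;            /* incremented at every note-on        */
} voice_pool_t;

static void pool_init(voice_pool_t *p)
{
    int v;
    assert(p != 0);
    for (v = 0; v < NUM_VOICES; v++) {
        p->key[v] = NO_KEY;
        p->age[v] = 0;
    }
    p->clock = 0;
}

/* NOTON: returns the index of the voice assigned to `key`. */
static int pool_note_on(voice_pool_t *p, uint8_t key)
{
    int v, chosen = -1, oldest = 0;
    assert(p != 0 && key != NO_KEY);
    for (v = 0; v < NUM_VOICES; v++) {
        if (p->key[v] == NO_KEY) { chosen = v; break; }  /* first free voice */
        if (p->age[v] < p->age[oldest]) oldest = v;      /* track the oldest */
    }
    if (chosen < 0) chosen = oldest;   /* assumed policy: steal the oldest */
    p->key[chosen] = key;
    p->age[chosen] = ++p->clock;
    return chosen;
}

/* NOTOFF: frees the voice playing `key`; returns its index, or -1. */
static int pool_note_off(voice_pool_t *p, uint8_t key)
{
    int v;
    assert(p != 0);
    for (v = 0; v < NUM_VOICES; v++) {
        if (p->key[v] == key) { p->key[v] = NO_KEY; return v; }
    }
    return -1;
}
```

In the actual device a released voice would presumably stay busy while the fourth ADSR segment runs; the sketch frees it immediately to keep the example short.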
Developing hardware and software in parallel in this way meets three needs:
• having the software already available when the hardware design is finished;
• being able to simulate the design more accurately, through the virtualization provided by the development environment;
• having continuous feedback both in writing the software and in designing the hardware.

4. Conclusions

The approach based on function-cells and on the software associated with their development has the disadvantage of lengthening the design time, but it offers the benefits of flexibility, ease of intervention and reusability of the products.

Experience has shown the validity of the methodology and suggests, as future developments, the identification and creation of new cells and an increase in the flexibility and scalability of the existing ones. For example, the enhancement of the SMC cell was the subject of a degree thesis [8].

References

[1] C. Mead, L. Conway: "Introduction to VLSI systems", Addison Wesley Publishing Company, 1985.
[2] L. Glasser, D. Dobberpuhl: "The design and analysis of VLSI circuits", Addison Wesley Publishing Company, 1985.
[3] L. Abbondandolo, V. Liverini: "La progettazione di chip dedicati", Elettronica oggi, September 1986.
[4] R. Libeskind-Hadas, N. Hasan et al.: "Fault covering problems in reconfigurable VLSI systems", Kluwer Academic Publishers, 1992.
[5] D. Ku, G. De Micheli: "High level synthesis of ASIC under timing and synchronization constraints", Kluwer Academic Publishers, 1992.
[6] N. Larosa, C. Rosati: "MEDUSA: a powerful MIDI processor", X CIM Proc., 1993.
[7] A. Prestigiacomo: "AX User guide", internal IRIS documentation, 1992.
[8] P. Paravano: "Un processore integrato per applicazioni musicali", degree thesis in electronic engineering, Università dell'Aquila, 1993.

PORTABILITY OF THE MARS SYSTEM
E. Maggi, A. Prestigiacomo

IRIS S.r.l.
Parco La Selva 151, 03018 Paliano (FR) ITALY
Fax +39 775 533343 - Tel +39 775 533441 - E-mail: mc2842@mclink.it

MARS [1][2] is a development system for real-time audio applications. Its complex hardware and software architecture derives from the requirements of real-time processing, interactivity, graphic interfacing, archiving and communication.

More than 100,000 lines of C code were written for the ATARI host to implement VME and MIDI communication with the sound board and the GEM-based EDIT20 [1][2], and to support the development of MARS applications [3].

In order to adapt MARS to the most common hardware and software platforms (Mac, UNIX, DOS, Windows), a project to study and improve the portability of MARS was initiated at IRIS. It focused mainly on the MARS host environment. The sound board encapsulates the real-time aspects and depends upon the host only for its parallel connection.

This paper describes the results of that project, together with some reusability and maintenance aspects.

Portability Strategies

An application is portable across a class of environments to the degree that the effort required to transport and adapt it to a new environment in the class is less than the effort of redevelopment [4]. In MARS, portability concerns both the development environment and the applications.

Differences among the target platforms make binary portability impossible. In what follows, portability is applied to sources, data and user skills.

The key problem for the designer is to choose portable interfaces to the resources used in the project, such as languages, libraries, operating systems, file systems, I/O devices, GUIs and networks.

Different yet compatible strategies may consist of: a) the selection of resources that are portable, standard or in some way available on the target platforms; b) the adaptation (manual or automatic) of resources to the target environments. The first requires a deep analysis of the products and strategies on the market. The second may involve automatic translation, dynamic run-time adaptation, and designing for portability [4]. In this case, the designer has to identify and isolate the critical aspects that require adaptation, such as the MARS TSERVER [3] and file manager. This can be done through the
design of new portable interfaces that abstract and encapsulate (or hide) non-portable aspects, and through a good modular structure that isolates non-portable code.

Even if no object-oriented methodology or tool is adopted, the design of interfaces should refer to some basic principles upon which object-oriented and, usually, good design is founded (abstraction, encapsulation, modularity, hierarchy) [5]. It should consider metrics like coupling, cohesion, sufficiency, completeness and primitiveness [5], in order to deal also with resource reusability and maintenance.

The following describes typical critical aspects, also encountered in the development of MARS, together with IRIS's strategies and solutions.

Critical Aspects

• The Programming Language - It is the basic resource for portability and should hide the details of any usable resource.
MARS uses standard ANSI C, but no commercial compiler is portable to all target platforms, and compilers on different platforms behave differently. They offer different development environments but, even worse, they make different assumptions on issues not settled by the standard, or they offer non-portable directives. For example, compilers define the size of scalar types, like int, differently, with possible problems in bit-mask or size-dependent operations and in transferring data via a serial port or a binary file. Moreover, they have different rules for the size, alignment and padding of structures, with possible problems of program memory consumption and of incorrect access to structures and structure members. Directives, like pragmas and command-line flags, can be used to control these problems, but they may not be portable and are dangerous when applied to include files that interface to run-time libraries compiled with different directives. New abstract interfaces should hide these aspects. Code should use names rather than references that depend upon the memory order and the data size.
In any case, ANSI C, like most programming languages, omits or inadequately specifies interfaces (Application Programmer Interfaces) to the other resources. The designer is thus forced to look for many other portable interfaces.
• The Operating System (OS) - It concerns functionalities like multiprogramming and process synchronization. In MARS, real-time aspects are dealt with in the sound board. The host environment uses few functionalities, like I/O, event handling or the execution of an external program while waiting for its termination. No standard or portable API is available on all target platforms, even if UNIX-based OSs and Windows-NT are going in this direction. So, in MARS this aspect was abstracted and isolated.
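As a hedged illustration of the compiler issues raised under The Programming Language (differing int sizes, byte order and structure padding when data goes to a serial port or a binary file), the following C sketch stores 32-bit values in a fixed byte order, field by field. It is not MARS code: the function and type names are hypothetical, and it relies on the fixed-width types of stdint.h, which belong to later C standards (C99 onward) rather than to the ANSI C discussed here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Write v into buf[0..3], most significant byte first,
   whatever the byte order of the host CPU. */
static void put_u32_be(uint8_t *buf, uint32_t v)
{
    assert(buf != NULL);
    buf[0] = (uint8_t)(v >> 24);
    buf[1] = (uint8_t)(v >> 16);
    buf[2] = (uint8_t)(v >> 8);
    buf[3] = (uint8_t)v;
}

/* Read back a value written by put_u32_be. */
static uint32_t get_u32_be(const uint8_t *buf)
{
    assert(buf != NULL);
    return ((uint32_t)buf[0] << 24) | ((uint32_t)buf[1] << 16) |
           ((uint32_t)buf[2] << 8)  |  (uint32_t)buf[3];
}

/* Hypothetical record: it is serialized one field at a time, so the
   compiler's structure padding never reaches the file or the line. */
typedef struct {
    uint32_t id;
    uint32_t length;
} record_t;

#define RECORD_BYTES 8u   /* external size, independent of sizeof(record_t) */

static void record_write(uint8_t *out, const record_t *r)
{
    assert(out != NULL && r != NULL);
    put_u32_be(out, r->id);
    put_u32_be(out + 4, r->length);
}

static void record_read(record_t *r, const uint8_t *in)
{
    assert(r != NULL && in != NULL);
    r->id = get_u32_be(in);
    r->length = get_u32_be(in + 4);
}
```

The design choice is to define the external format in bytes and to keep every size- or order-dependent operation inside two small functions, so that application code refers to fields only by name.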
• The File System - It concerns operations like CreateDir, RemoveDir, MoveFile, DeleteFile. Some portable interfaces are available from the portable GUI packages mentioned below, but in MARS a new abstract platform-independent interface was defined that includes the required file system functionalities.
• The I/O System - It concerns the access to I/O devices and, in MARS particularly, the transfer of data via MIDI or parallel lines. A transmission server, TSERVER [3], was designed to be easily extensible and portable to different platforms and transmission lines, through the abstraction of low-level access to devices and of data protocols. TSERVER supports application portability because it offers a single API that hides I/O device access, data protocol, and even the currently selected transmission line. In APPLI20 [3], TSERVER supports MSERVER and SSERVER portability. The I/O problem was also solved in the board.
• Graphical User Interface - GUIs like MS-Windows, Macintosh and OSF/Motif are very popular, but no two GUIs are the same. Each has its own look and feel. Window styles, mouse actions and menu arrangement all differ. GUI APIs are still different, non-standard, and not portable across all target platforms.
The early adoption of the GEM/Atari platform for the MARS user interface is forcing IRIS to rewrite all graphical code to reach a high degree of portability. The designer could create a platform-independent API that abstracts the functionalities of the target GUIs, isolating and hiding non-portable GUI aspects from the application. IRIS made some efforts in this direction in the graphic toolkit of APPLI20 [3], but it is difficult to define an API able to satisfy all MARS graphical needs on all target platforms. The problem involves the creation of portable high-level toolkits, event handling, graphics, graphical resources, and interface building tools.
However, support for portability is available on the software market. Many portable user-interface libraries and portable APIs are flooding the market that support the portability of graphical applications to different GUIs across multiple platforms, such as XVT, Wndx, CommonView3 and Cicero. The choice depends upon the level of support and functionality one requires, but there is certainly a loss of efficiency. The real problem is that none of them is standardized, even if XVT has been chosen by the IEEE committee as the base for drafting a Layered API for GUIs. For tools and graphic resources, each vendor adopts its own products. IRIS has the further problem of providing portability for the ATARI/GEM platform, which is supported only by Cicero.
• Data - In MARS, this concerns data files describing algorithms,
timbres and orchestras that are exchanged among applications running on multiple platforms. The ASCII format helps portability, but disk space consumption and efficiency sometimes require a binary format. Rather than using conversion tools, it should be possible to read and write data across multiple platforms. This can be obtained by abstracting memory dependencies (for example, swapped bytes for an integer on Motorola and Intel) and compiler differences (for example, integer and structure sizes and padding). In any case, code portions that concern file accesses and data formats should be isolated and hidden from higher-level users.
• Internationalization - It concerns the portability of software across different languages. A programmer should not hard-code messages or language-dependent output text.
• Experience - This concerns the portability of user skills across the target platforms. It is compromised by the differences between operating systems, development environments, file systems, and GUIs. A developer has to know more than one platform, in order to recreate his/her tools (makefile, shell scripts) or even to reorganize the directories.
• Other aspects - Luckily for us, the problems of networking and parallel processing did not apply to the MARS project.

Conclusion

The lack of universal standards makes it difficult to improve the portability of a complex system. In MARS, this is true especially as far as the graphical user interface is concerned. A part of APPLI20 is already running on more than one platform. Reusability and maintenance take advantage of, and depend upon, the design choices made for portability. Portability goes toward the definition of a virtual (abstract) computer. More help should come from Software Engineering to automate portability.

References

[1] P. Andrenacci, E. Favreau, N. Larosa, A. Prestigiacomo, C. Rosati, S. Sapir, "MARS: RT20M/EDIT20 Development tools and graphical user interface for the sound generation board", pp. 340-343, ICMC Proc., 1992.
[2] E. Favreau, S. Sapir, "La Stazione MARS: dalla progettazione di algoritmi alla realizzazione di ambienti esecutivi dedicati", X CIM Proc., 1993.
[3] P. Andrenacci, F. Armani, A. Prestigiacomo, C. Rosati, "APPLI20: A development tool for building MARS applications with an easy to use graphical interface", X CIM Proc., 1993.
[4] J. D. Mooney, "Strategies for Supporting Application Portability", IEEE Computer, Vol. 23, N. 11, pp. 59-70, 1990.
[5] G. Booch, "Object Oriented Design with Applications", The Benjamin Cummings Publishing Company, 1991.

Section 6c

Other workstations

MUST C25
A MUSIC WORKSTATION WITH LEONARD'C25 DSP BOARDS

G. Bertini, D. Fabbri, M. Marani, L. Tarabella

Istituto Elaborazione Informazioni - CNR
Via S. Maria 46, I-56126 Pisa
fax +39 050/554342; e-mail: bertini@iei.pi.cnr.it

Abstract

MuSt C25 is a workstation for real-time processing and synthesis of musical audio signals, based on a low-cost, modular multi-DSP system (named MULTIC25). This system is based on an MS-DOS compatible personal computer and a number (up to eight) of LeonardC25 DSP boards (Leonardo Spa, Massa, Italy), built around the TMS320C25 microprocessor; the boards are connected in a daisy-chain configuration via the high-speed serial ports provided by the TMS; the first and the last board are then connected to external analog-to-digital and digital-to-analog circuitry, respectively.

The operating environment for MuSt C25 has been developed as an application for Windows 3.1. The main functions of MuSt C25 are the distribution of tasks among the various modules and the management of the flow of control parameters and of synchronisation. Tools for graphically editing and compiling synthesis and analysis algorithms, for hard-disk recording and for integrating MuSt C25 with DSP-MIDI environments are under development.

Introduction.

A programmable digital system for processing musical audio signals, in order to reach a degree of polyphony and timbral richness that meets the needs of teaching, research and real-time performance, must have computing power of the order of at least tens or hundreds of MIPS. To reach such performance levels with a good cost/performance ratio, we have proposed (IEI and CNUCE, the Pisa institutes of the CNR, have long been active in the field, building real-time synthesis systems [1]) a solution consisting of building parallel systems, suitably
configurable, based on commercial programmable DSP processors and on personal-class computers [2].

Architecture of MuSt

In order to experiment with the solutions proposed in [3] and to develop applications with low hardware costs, a system with a simplified architecture based on the AT bus was assembled [4]. The 'MuSt C25' music workstation therefore consists of the MULTI-C25 system [5], basic software and some application tools. MULTI-C25 is a modular system based on LeonardC25 boards [6] (up to a maximum of eight) compatible with the AT bus.

The LeonardC25 boards use the Texas Instruments TMS320C25 microprocessor and fast EPROM and RAM memory banks; they are also equipped with an analog interface, signal conditioning circuitry and a high-speed serial port (5 Mbit/s) that implements Texas Instruments' own DSP communication protocol.

In our system the boards are connected in a daisy-chain configuration, with no need for additional logic, through this serial port, and the first and the last module are connected to a suitable A/D and D/A conversion device. For the MIDI interface, a commercial module was used (the Roland MPU-401 PC).

[Figure: Architecture of the workstation. Labels: analog input channel, conversion module (A/D converter), analog output channel, serial link, Personal Computer (Host), MIDI interface, MIDI in, MIDI out]

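A rough budget for the link between two boards follows from the 5 Mbit/s figure quoted above. The sketch below is only back-of-the-envelope arithmetic under stated assumptions: 16-bit words, no framing or protocol overhead, and a sampling rate chosen by the reader (the text does not give one). The function name is hypothetical.

```c
#include <assert.h>

/* Number of whole words that fit through one serial link during one
   sampling period. bit_rate is in bit/s, word_bits in bits,
   sample_rate in Hz. Framing overhead is ignored (assumption). */
static unsigned long words_per_sample(unsigned long bit_rate,
                                      unsigned long word_bits,
                                      unsigned long sample_rate)
{
    assert(word_bits > 0 && sample_rate > 0);
    return bit_rate / (word_bits * sample_rate);
}
```

Under these assumptions, words_per_sample(5000000UL, 16UL, 44100UL) returns 7: at an illustrative 44.1 kHz sampling rate, at most seven 16-bit values could pass from one board to the next per sample.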
Sono in via di completamento dei La finestra principale del pro-
tools software per rendere gramma consente l'editing con-
disponibili suI sistema funzioni di temporaneo di piu file di testa
utilita come un editore/compila- che costituiscono il codice
tore grafico di algoritmi di specifico relativo agli algoritmi
sintesi, funzioni di hard-disk da caricare sulle schede. Al
recording e di integrazione di momenta questi algoritmi devono
ambienti DSP e MIDI, ecc.). II essere scritti in assembler
system essentially operates as follows: at intervals equal to the chosen
sampling period, all modules execute in parallel the algorithms assigned
to them and transfer the data from one module to the next through the
serial port of the TMS320C25. The main functions of the software are to
control the data transfer and to manage the passing of the parameters
sent by the host for the real-time control of the algorithms running on
the individual boards.

The working environment.

The working environment was developed on the host, an MS-DOS compatible
computer, as an application for the Microsoft Windows 3.1 operating
system. At start-up the program recognizes the installed hardware
configuration and loads the base software into the memories of the DSP
modules. The functions entrusted to the modules can be decided by the
user, who can specify the task of each computing module by choosing
between synthesis and processing operations.

TMS320 enriched with some new instructions and directives defined as
macros, collected in a system file. The user also has available a
further series of macros defined in a library, containing the
specifications of the most common synthesis algorithms.

To handle the digital data coming from the A/D converter, some tools are
available that allow the data to be stored temporarily in the host RAM.
From RAM they can then be loaded onto the modules or transferred as
files to the hard disk.

As application examples that allow control of the algorithms, two tools
have been developed: a 'Sequencer' and a 'Real-time controls' window.
The Sequencer is modelled on multi-track audio recorders (like the most
common commercial sequencers). The 'Real-time controls' window lets the
user configure a series of sliders that can be operated either with the
mouse or through external inputs (analog signals acquired directly by
the boards, outputs of MIDI devices, etc.).
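
The per-sample operation described at the start of this page (all modules compute in parallel, then each one hands its result to the next) can be illustrated with a short Python sketch. It is only a model under stated assumptions: the `Module` class, the `run_chain` helper and the `gain`/`bias` parameters are hypothetical names, not part of the MULTIC25 software, and the serial-port transfer is modelled as a one-sample hand-over between stages.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Module:
    """Hypothetical stand-in for one DSP board running one algorithm."""
    algorithm: Callable[[float, Dict[str, float]], float]
    params: Dict[str, float]   # values the host may change at any time
    inbox: float = 0.0         # sample received from the previous module


def run_chain(modules: List[Module], source: List[float]) -> List[float]:
    """Run the chain for one sampling period per input sample."""
    out = []
    for x in source:
        modules[0].inbox = x
        # Phase 1: every module computes on the sample it already holds.
        results = [m.algorithm(m.inbox, m.params) for m in modules]
        # Phase 2: each result moves one stage down the chain
        # (this models the serial link between boards).
        for nxt, y in zip(modules[1:], results[:-1]):
            nxt.inbox = y
        out.append(results[-1])
    return out


def scale(x: float, p: Dict[str, float]) -> float:
    return x * p["gain"]


def offset(x: float, p: Dict[str, float]) -> float:
    return x + p["bias"]


chain = [Module(scale, {"gain": 2.0}), Module(offset, {"bias": 1.0})]
samples = run_chain(chain, [1.0, 2.0, 3.0])
```

Because both stages compute in the same period, the second module always works on the sample produced one period earlier; this one-sample latency per stage is the price of the parallel scheme. A host-side parameter change is modelled simply by writing to `chain[0].params["gain"]` between calls.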

The design details and a user guide are contained in [7].

Conclusions

The workstation is an open, fully programmable system that can be used
in various applications, from the design of algorithms for processing
and synthesizing audio and musical signals to teaching and creative
music production. It comes with software tools for functions typical of
the field, such as the graphical handling of synthesis algorithms,
hard-disk recording, and interfacing and integration with the MIDI
protocol.

Given the low cost of the Leonard'C25 boards (0.5 million lire), the
workstation is in practice an easily affordable system for a first
approach to DSP computing platforms based on personal computers. It is
aimed at users such as music school and research laboratories,
composers, musicians, etc.

The capabilities of the workstation can of course be assessed only in a
demonstration session; the most significant user-interface windows have
also been reproduced in a poster.

This work was carried out with the support of the Progetto Finalizzato
"Sistemi Informatici e Calcolo Parallelo", Subproject 2, Dedicated
Processors.

References

[1] G. Bertini, M. Chimenti, F. Denoth, "TAU2: un terminale audio per
esperimenti di Computer Music", Alta Frequenza, Vol. 46, pp. 600-609,
1977.
[2] E. A. Lee, "Programmable DSP Architectures", IEEE ASSP Magazine,
Vol. 5, n. 4, pp. 4-19; Vol. 6, n. 1, pp. 4-14, 1988.
[3] L. Tarabella, G. Bertini, "A Digital Signal Processing System and
Graphic Editor For Synthesis Algorithms", ICMC Proceedings (Columbus,
Ohio, USA), pp. 312-315, 1989.
[4] G. Bertini, L.M. Del Duca, M. Marani, "The LeonardC25 System for
Real-Time Digital Signal Processing", Proc. Int'l Workshop in Man
Machine Interaction in Live Performance, Pisa, pp. 107-117, 1991.
[5] G. Bertini, D. Fabbri, "MULTIC25, un sistema Multi-DSP con schede
LeonardC25", Rapp. Tecn. Prog. Finalizzato "SICP"-CNR, R/2/108, 1993.
[6] G. Bertini, L. Tarabella, G. Bacchiocchi, M. Balestieri, "Tecniche
di gestione delle risorse in sistemi multiDSP per la sintesi e
l'elaborazione di segnali audio", Rapp. Tecn. Prog. Finalizzato
"SICP"-CNR, R/2/55, 1991.
[7] G. Bertini, D. Fabbri, L. Tarabella, "MuStC25, una stazione di
lavoro musicale con il sistema MULTIC25. Manuale operativo.", Rapp.
Tecn. Prog. Finalizzato "SICP"-CNR, R/2/109, 1993.

CONSTRAINTS SATISFACTION
PROGRAMMING
IN COMPUTER AIDED COMPOSITION
ON A HIGHLY GESTUAL DEVOTED SYSTEM,
BASED ON A VME-MULTI-PROCESSOR
JOINING TRUE UNIX AND REAL TIME
Philippe Prevot & Arnaud Debayeux

LIMCA (Lutherie Informatique et Musique Contemporaine à Auch)
Château de Saint-Cricq, Route de Toulouse
F-32000 AUCH - FRANCE
Tel +33 62.63.29.54
Fax +33 62.63.44.23

The objective of this work is not to reopen the dispute between real
time and deferred time but, on the contrary, to attempt a conciliation.
Note that, at the time of writing, the work is still fully in progress.

Since the advent of specialized processors and DSPs, signal processing
has become, often with a loss of quality, the prerogative of real time.
Indeed, the pleasure of being able to modify a sound in reaction to
listening, without any intermediate step, which also makes work on the
sound material easier, often prevails over the wish to define very
finely some subtle parts of the sound, such as the attack or the
conditions of polyphonic superposition. The transition processes, so
important for the perceived quality, are often neglected because they
are difficult to manage in real time. Reproducibility and stability,
which made the glory of digital over analog techniques, are never truly
guaranteed except in deferred time, even though nothing makes this
impossible in principle: there are only great practical difficulties.

In fact, from the user's point of view, real time is an illusion if one
forgets that the degrees of freedom at execution time are limited to a
class of sounds and musical schemes previously defined in deferred time.

From the designer's point of view, real time is far from just going as
fast as possible; it should:
- guarantee delays that are as deterministic as possible;
- be able to measure these delays, and to control them by controlling
  the number of processes;
- allow asynchronous task execution, through an essentially non-linear
  architecture.

Besides, real-time work in practice is nothing but a succession of
real-time work sessions which together constitute a deferred-time work
session with respect to some "final execution".

As far as building time structures is concerned, computer-aided
composition is still the privileged area of deferred time, for three
possible reasons:
- the reflection time needed for composing often makes real time
  unnecessary;
- the complexity and huge diversity of the problems, and of the ways of
  solving them, imply that the solutions can hardly be provided by a
  single piece of software, and they are often programmed in languages
  like Lisp or Prolog, which are not well suited to real-time execution.
  Their real-time processing may be difficult whatever the language
  used;
- some problems, among the most interesting ones, may imply non-monotone
  time processing: it is possible that $x(t_1) = f(t_1, y(t_2))$ where
  $t_2 > t_1$ and where $y(t_2)$ may not be known at $t_1$.

Such an indictment of real time should end the discussion.

And yet the last few years have been marked by an increase in the
computational capacity of DSPs, by an enhancement of the sound
possibilities, which increases the number of simultaneous voices and
parameters that a system has to handle at a given moment, and also by
the increasing power of general-purpose computers. A maker of computer
music instruments should feel guilty for not trying to merge real and
deferred time. The finest work of giving life to a sound can hardly
replace the pleasure of playing an instrument in live music.

Real time has never exempted the composer from composing, whether by a
"top-down" pyramidal genesis, by building a structure from the sound
material, by non-arborescent nets, or by any other method in which
conception time is neither monotone nor linear with respect to musical
time. Finally, real time, technical questions aside, could be defined by
three rules:
1. it is subject, more or less, to an unpredictable external world
   (environment rule);
2. it renounces the non-monotony of its execution time with respect to
   the time of musical thought (monotony rule);
3. it should guarantee the linearity, at a 1:1 rate, of execution time
   with respect to the desired musical time (linearity rule).

We develop the first rule daily. We have not (yet) tried to transgress
the second one. We do try to get round the third one.

System architecture

This is why we designed a VME-based multi-processor system, comprising:
- a "Quatron" DSP board featuring up to 64 wave-shaping generators, or
  128 table-lookup generators, with 128 ramp generators and 128 timers,
  at a sampling rate ranging from 27 kHz to 3.5 MHz,
- driven by a 68030 real-time board,
- and a 68040 UNIX board, under X11,
- various interface boards for gestural control devices, including MIDI,
- and a gestural console, the Pacom, featuring 32 potentiometers,
  8 incrementals, 76 push buttons, 176 LEDs and 14 8-character displays,
  communicating over MIDI.

None of the elements above is a brand-new design; they are, above all,
still commercial devices.

There are two reasons for a dual-processor design:
- being open to the Unix world, and able to use high-level libraries
  from any source;
- sharing the work between a relatively large time scale under Unix on
  the one hand, and real time on the other.

Unix manages the displays under X, as well as some events like mouse
clicks, for interaction with a low communication flow. The Real Time
Processor (RTP) manages the details of the sound process, as long as
time allows it, as well as the MIDI communications. The "Constraints
Satisfaction System" (CSS) is run by Unix. Both processors share a
common memory, allowing a dynamic software link without delay, but they
can also be used independently with their own disks and displays.

The fact that the DSP is mapped on the bus allows undelayed control of
all the audio parameters.

The MIDI interface board makes it possible to extend sound processing to
any external MIDI instrument.

How to handle it?

Such a system, depending on the user's specific patch, may typically
present more than a hundred different parameters to handle (for example,
up to 15 parameters for 10 independent instruments inside the Quatron).
Even some parameters classically considered as a single parameter, like
wave tables, should be replaced by a set of parameters which, if they do
not define the objects exhaustively, at least define one or several
functions allowing the objects to be classified in a set.

Faced with such a large amount of data to handle, the musician wants to
concentrate, in a given situation, on certain variables, and expects the
computer to temporarily handle the others in the "best way", that is by
satisfying, as well as
possible, a set of equations and inequalities, logical or arithmetic,
linking them to each other or to some parameters imposed from elsewhere.

These techniques, known under the general heading of "Operations
Research" methods, have long been of great interest to computer
researchers. But they generally lead to solving problems by or for
languages such as Lisp or Prolog, which are clearly not designed for
real time.

By contrast, real time implies that the constraints cannot, in a binary
way, be or not be satisfied, since the music has to proceed anyway
(monotony rule), but that the solutions must push towards an ideal
value. Thus we must introduce a compromise:
1. the result of each constraint is not a definitive value but a
   temporary target to reach;
2. a wrong or too slow direction is not fatal, but generates its own
   correction.

Definitions

At this point, let us define a few elements.

The external world consists of all sorts of devices giving unpredictable
values to predefined objects, thus creating unpredictable events
attached to predefined functions. They are typically potentiometers and
buttons from the Pacoms, MIDI events, widget events from X11, but also
interrupts and values read from the DSPs, or predefined values from a
score, instantiated by some external event, etc. The external elements
give the control values.

The parameters are elements whose value can change in time,
independently of the CSS, and are what the CSS sees of the external
world. They receive their values from the controls. Their mobility keeps
the constraints from being definitively satisfied. They are read-only
for the CSS and written by the RTP.

The variables are the elements whose value is fixed by the CSS after
satisfaction. They are read-only for the RTP and written by the CSS.

Where to start?

Starting from the patch definition, the user defines the parameters to
be communicated by the RTP to the CSS, both in terms of an algorithm
based upon different controls and in terms of a computation
instantiator, that is either a regular rate or an unpredictable external
reason to deliver a new value.

The user then defines the variables to be expected from the CSS and the
different points in the patch where their values are to be used. The
system state at a given time t is the set of the variables' values:

$X(t) = (x_1(t), x_2(t), \ldots, x_n(t))$

The set of constraints is then defined, in different ways,
according to their type.

Where to go?

For given values of the parameters at a given time t, a constraint $C_i$
defines a subset $\mathcal{C}_i(t)$ of "ideal" points. A pole is the
association of a constraint and a sign: repulsive or attractive. It
exerts a strength of that sign, in the direction of the shortest path
between $X(t)$ and $\mathcal{C}_i(t)$, with a magnitude depending on the
length of that path.

The poles are weighted to assign a relative priority to each constraint.
The weights vary with time. This is our first way of getting round the
linearity rule: according to the needs at a given time, we do not
allocate the same computational power to each variable.

The weighted sum of the strengths gives a global strength which
displaces the system to a new state, or target, $X(t+\Delta t)$, where
$\Delta t$ is the CSS sampling period. These successive targets are
taken into account by the RTP. As $\Delta t$ can be relatively large if
the constraints are complex, the RTP performs a series of linear
interpolations between the effective state and the present target.

Strength computation

The strength may differ according to the sign of the distance, that is,
the side of the pole. For an inequality, it is null on one side. The
most interesting strength seems to be one that increases with the
distance, ensuring the self-limitation of the system. Furthermore,
weighting the strength before application of the global strength makes
it possible to take into account the shifts accumulated by a constraint.
In this way, if a pole comes to trap the system, it rapidly loses its
relative strength and frees the system again. Depending on the way the
constraint is defined, the magnitude and direction of the strength are
computed by the gradient or by orthogonal projection.

A constraint may be defined:
- as a linear or non-linear function of n variables;
- as a set of definitions on discrete definition intervals;
- as a Hermite cubic, by a set of points and derivatives;
- or in any other way, through C code giving the signed distance to the
  hyperplane and the partial derivatives;
- (in progress) as a logical function, like belonging to a set, etc.

Different constraints may be gathered in a character. The user can
create a character library and build musical mechanisms as a succession
or superposition of characters.

Problems

Real time is still the first and last constraint to be satisfied. If
not, the constraints satisfaction
fineness should be cut. A hardware timer measures the load of the
evaluation process by its response delay. The step of any iterative
method, and the transition step to the target state, increase in
proportion to this delay. The transition and satisfaction processes are
asynchronous, and they share the processor time depending on the delay.
Their rate is variable and decided by the RTP. As long as the CSS does
not give a new target, the RTP can optionally either stay and wait at
the last target, or extrapolate a next target from the last ones,
independently for each variable. This is our second way of getting round
the linearity rule: it is a kind of "rubato" applied specifically to
each variable.

Traps and oscillations are partly opposed to each other. Traps may occur
if the strength is not an increasing function of the distance.
Oscillations may occur, with too large a satisfaction step, if the
strength increases with distance. Direction ambiguities may occur if two
strength directions are possible with the same magnitude. We have no way
of choosing.

Trace and data display are essential to any real-time music system, to
enable the user to do the same thing twice. Display takes time. Trace
takes memory. We chose to trace in real time and to display the trace in
deferred time. The user can trace, for each variable and each
satisfaction step, the distance, the computed strength direction,
magnitude and weight, the target, and the absolute (hard process) time
elapsed since the preceding target.

Examples

The four examples on the graphic page show:
- (top left) a Hermite cubic, called "cherm" and defined in the text
  example by 6 points and derivatives (only 5 are visible). The thicker
  line in the middle is $\mathcal{C}(t)$; the others are the
  equipotentials. The orthogonal lines show the path from different
  points to the curve;
- (top right) a polynomial called "cpoly" and defined by 3 coefficients;
- (bottom left) two intersecting circles with equal weight. One can see
  a trap in the middle between the two common points. As soon as the
  starting state is shifted or a weight is modified, the system leaves
  the trap. One can also see an ambiguity point if the system is dropped
  on the common diameter;
- (bottom right) two separate circles with different weights. One can
  see, by observing the equipotentials, that the strength of one circle
  decreases when the state is near the center of the other circle, and
  thus far from the first.
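
The two-circle example and the satisfaction step described in "Where to go?" can be illustrated with a minimal Python sketch. It is not the authors' implementation: the class `CirclePole`, the functions `next_target` and `interpolate`, and all numeric values are assumptions made for illustration. Each pole is attractive, its ideal set is a circle, and its pull grows linearly with the distance to that circle, which is one possible choice of a strength "increasing with the distance".

```python
import math
from dataclasses import dataclass
from typing import List, Tuple

Vec = Tuple[float, float]


@dataclass
class CirclePole:
    """Attractive pole whose ideal set is a circle (illustrative only)."""
    center: Vec
    radius: float
    weight: float = 1.0

    def strength(self, x: Vec) -> Vec:
        """Vector from x to the nearest point of the circle.

        Its length equals the distance to the circle, so the pull
        increases with the distance."""
        dx, dy = x[0] - self.center[0], x[1] - self.center[1]
        dist = math.hypot(dx, dy)
        if dist == 0.0:              # direction ambiguity at the center
            return (0.0, 0.0)
        signed = dist - self.radius  # > 0 outside, < 0 inside
        return (-signed * dx / dist, -signed * dy / dist)


def next_target(x: Vec, poles: List[CirclePole], step: float) -> Vec:
    """One CSS step: move along the weighted sum of the pole strengths."""
    fx = sum(p.weight * p.strength(x)[0] for p in poles)
    fy = sum(p.weight * p.strength(x)[1] for p in poles)
    return (x[0] + step * fx, x[1] + step * fy)


def interpolate(state: Vec, target: Vec, n: int) -> List[Vec]:
    """RTP side: n linearly interpolated points from state to target."""
    return [(state[0] + (target[0] - state[0]) * k / n,
             state[1] + (target[1] - state[1]) * k / n)
            for k in range(1, n + 1)]


# Two intersecting circles of equal weight, as in the bottom-left example.
poles = [CirclePole((-1.0, 0.0), 1.5), CirclePole((1.0, 0.0), 1.5)]
trapped = next_target((0.0, 0.0), poles, step=0.5)  # the two pulls cancel
state = (0.0, 0.1)                                  # slightly shifted start
for _ in range(100):
    state = next_target(state, poles, step=0.5)
```

Started exactly midway between the centers, the two pulls cancel and the state does not move, which corresponds to the trap described above. A slightly shifted start escapes and settles on one of the two common points of the circles. A time-varying `weight` would be the natural place to model a trapping pole losing its relative strength.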

It is important to understand that these examples give a static view of
an essentially dynamic system. The curves may be parameterized by the
values coming from the RTP.

The POLE language is an LALR grammar built with YACC.

This example begins with a classical header, defining two default
parameters, the step and period of the satisfaction process. These data
are decided, in real time, by the RTP.

After the header, some constraints are defined, in particular a Hermite
function defined by 6 points and derivatives, in a two-dimensional
space.

The signs < and > mean to one side or the other; = means on the
hyperplane.

library mylib

parameter step(2) [0] := 1, period := 10

procedure Circle(2,3) Sinus(2,3) Logarithm(2,1) Square(2,1)

strength pull, push

variable x(3) [0] := 100, y := 200.3, z

-- Two parameters have been declared in the header,
-- therefore the index begins at 2
parameter x0(3) [2] := 300, y0 := 300, my-parameter

constraint
-- in z[y], y is considered as a constant
cpoint :point x[100,200], y[20,my-parameter], z[y]
ccar :procedure Square(x,y;x0)
-- linear relationship x - 2y + z + y0 = 0
clin :linear 1*x + -2*y + 1*z + y0
-- polynom y = 0 - 2x + 0.5x^2
cpoly :polynom x, y, (0, -2, 0.5)
cherm :hermite x, y, (0,100,2), (200,400,0),
(300,300,0), (400,600,0), (600,300,0), (800,800,1)
pole
pole0 cpoint, =, pull, 1
pole1 ccar, <, push, my-parameter
pole2 clin, >, push, 1.2
pole3 cherm, =, pull, y0

character
car01, pole0, pole1
car1: x0, pole1, pole3, pole2
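
As an illustration of how a constraint such as `cherm` in the listing might be evaluated, the Python sketch below builds a piecewise cubic Hermite curve from the six triples. It rests on an assumption the listing does not state, namely that each triple reads (x, y, dy/dx); the function names `hermite` and `signed_distance` are hypothetical, and the vertical distance used here is a simplification of the shortest-path distance described in the paper.

```python
from bisect import bisect_right
from typing import List, Tuple

# Triples copied from the `cherm` line; reading them as (x, y, slope)
# is an assumption made for this sketch.
CHERM: List[Tuple[float, float, float]] = [
    (0, 100, 2), (200, 400, 0), (300, 300, 0),
    (400, 600, 0), (600, 300, 0), (800, 800, 1),
]


def hermite(points: List[Tuple[float, float, float]], x: float) -> float:
    """Evaluate the piecewise cubic Hermite interpolant at x."""
    xs = [p[0] for p in points]
    if not xs[0] <= x <= xs[-1]:
        raise ValueError("x outside the definition interval")
    # Index of the segment [x0, x1] that contains x.
    i = min(bisect_right(xs, x) - 1, len(points) - 2)
    x0, y0, m0 = points[i]
    x1, y1, m1 = points[i + 1]
    h = x1 - x0
    t = (x - x0) / h
    # Standard cubic Hermite basis functions on [0, 1].
    h00 = 2 * t**3 - 3 * t**2 + 1
    h10 = t**3 - 2 * t**2 + t
    h01 = -2 * t**3 + 3 * t**2
    h11 = t**3 - t**2
    return h00 * y0 + h10 * h * m0 + h01 * y1 + h11 * h * m1


def signed_distance(points: List[Tuple[float, float, float]],
                    x: float, y: float) -> float:
    """Vertical signed distance from (x, y) to the curve.

    Positive above the curve, negative below it."""
    return y - hermite(points, x)
```

The sign of `signed_distance` plays the role of the `<`, `>` and `=` markers of the pole definitions: it tells on which side of the constraint the current state lies.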

Chapter 7

SOFTWARE AND HARDWARE TOOLS

FOR COMPOSITION
AND PERFORMANCE

MUSIC 5 MAC
Simone Bettini

C.S.C, Padova University


Via Gradenigo, 6a,
35100 Padova (Italy)
home telephone 0429/86230
E-Mail (internet): space@maya.dei.unipd.it

Music V and Music 5 Mac.

Music V is, from a historical point of view, a very important sound
synthesis and music composition language.

Music V was the last of a series of Music languages developed, starting
from the end of the fifties, by Max V. Mathews, a researcher at Bell
Laboratories, Murray Hill [1]. It was the first language to allow the
realisation of a virtual digital synthesiser. It can be used to define,
starting from the algorithms that describe them, the instruments of an
"orchestra", to which a score is then passed for performance.

Music V, although now used mainly for didactic purposes, still stands
out for its versatility and power, and can be considered the first of a
whole generation of music synthesis languages based on the same
operating principles.

The first Music V compiler was implemented on the General Electric 650
computer of Bell Laboratories and then, with the increase of raw
computational power, on smaller systems and finally on personal
computers.

In the beginning Music 5 Mac (M5Mac) was conceived as the Macintosh port
of a PC Music V compiler written by Daniel Arfib, CNRS Marseille [2].

A preliminary straight port of the compiler resulted in an application
that was not Macintosh-like: in particular, it lacked a user-friendly
graphical interface to simplify and speed up score writing. Indeed, it
behaved like every other command-line version of the package:
• the composer has to draw on paper a directed acyclic graph (DAG)
  representing the connections between the components of an instrument;
• the composer then translates this graph into the Music V language and
  completes the score by adding the instructions for the execution of
  the motif. This sequence of instructions goes into a text file;
• this file is then passed to the Music V compiler, which generates the
  sample file, possibly requiring long computation times.

To simplify this lengthy and
quite error-prone procedure, while at the same time speeding up
instrument creation, it was decided to add to this version of Music V a
graphical user interface following the Macintosh Human Interface
Guidelines [3]. Thus a complete visual development environment
consisting of a text editor and a graphical editor was integrated into
M5Mac, alongside the language compiler itself and other general-purpose
tools.

Such an environment offers the composer powerful and flexible tools for
score writing, and makes it possible to generate the sample files in the
background, in several output formats, and to listen directly to the
result of the computation.

Score development can thus be carried out entirely in M5Mac, following
these steps:
• the DAG of the instrument is drawn on the computer using a dedicated
  graphic editor;
• the instrument is then translated automatically into its textual
  representation by a code-generating procedure;
• the score can be completed in an M5Mac text-editing window;
• the score is executed directly in M5Mac, choosing the output format.

At the end of sample generation it is possible to listen directly to the
result.

Music 5 Mac features.

M5Mac manages two kinds of data: scores (text) and instruments (DAGs).
In addition, it can generate sample files in three formats: AIFF,
Macintosh audio format ("snd " in an "sfil" file) and 16-bit integers.

Two conversion routines between instruments and scores are available.
The first produces the Music V textual definition of an instrument from
its DAG; the second performs the opposite operation: it produces an
equivalent DAG, neatly laid out on screen, for any instrument present in
a score.

Several algorithms have been developed for these routines:
• tree analysis and code generation are used to translate DAGs into
  scores;
• text parsing, tree weighing and arranging are used to derive a DAG for
  any instrument described in a score. This output can then be modified
  and reused.

This Music V version also allows, among other things, the reuse of
existing sample files as input data. 16-bit mono AIFF has been chosen as
the input format. Some conversion routines between the various formats
generated by M5Mac have been specially written to make this feature
easier to use.

M5Mac environment.

The global structure of the application has been developed following
Apple's Human
Interface Guidelines:
• standard File and Edit menus, as well as application-specific menus;
• dialogs for opening and closing files and for other miscellaneous
  functions, such as confirmation dialogs to avoid accidentally closing
  files, with consequent loss of data;
• full multitasking capability (the user can perform other tasks while
  M5Mac compiles in the background);
• multi-window environment with cut-copy-paste capability between
  windows;
• the possibility to print both scores and instrument diagrams through a
  consistent user interface.

The two different representations of instruments are handled by two
different types of windows (fig. 1).

New Script
COM --My Instrument
INS 0 1;
IOS P5 P7 B4 F2 P30 ;
IOS B4 P6 B3 F1 V1;
OUT B3 B1 ;
END;

Fig. 1: Music 5 Mac windows. The back window contains the textual
definition of the instrument shown in the graphic window. On the left of
the graphic window is the palette for module selection. The inputs menu
is open.

For the score, a normal text window is provided, with the usual cut,
copy and paste facilities and an additional search-and-replace function.

For the instrument DAGs a dedicated window has been designed. It
contains a palette with six pop-up menus from which the modules for
building an instrument can be chosen. A dynamic data type in which a
tree and a queue coexist was devised in order to manage the graphic
representation of instruments.
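
The DAG-to-score conversion mentioned among the M5Mac features can be sketched in Python as a post-order walk of the instrument graph, so that every module is emitted after the modules feeding it. This is only an illustration, not the M5Mac algorithm: the `Unit` class and the `generate` function are hypothetical, a plain list plays the role of the queue, and the emitted line syntax merely imitates the listing of fig. 1 without having been checked against a Music V compiler.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Union


@dataclass
class Unit:
    """One module of the instrument graph (names are illustrative)."""
    opcode: str                        # e.g. "IOS" or "OUT"
    inputs: List[Union["Unit", str]]   # other units, or fields like "P5"
    extra: List[str] = field(default_factory=list)  # trailing fields


def generate(root: Unit, number: int = 1) -> List[str]:
    """Emit one line per unit, inputs first (post-order walk of the DAG)."""
    lines: List[str] = []
    bus_of: Dict[int, str] = {}   # id(unit) -> output bus already assigned

    def visit(u: Unit) -> str:
        if id(u) in bus_of:       # shared node: emit it only once
            return bus_of[id(u)]
        args = [visit(i) if isinstance(i, Unit) else i for i in u.inputs]
        # B1 is kept for the final output; B2, B3, ... for intermediates.
        bus = "B1" if u.opcode == "OUT" else f"B{len(bus_of) + 2}"
        lines.append(" ".join([u.opcode, *args, bus, *u.extra]) + ";")
        bus_of[id(u)] = bus
        return bus

    visit(root)
    return [f"INS 0 {number};", *lines, "END;"]


# An envelope oscillator driving an audio oscillator, as in fig. 1.
env = Unit("IOS", ["P5", "P7"], ["F2", "P30"])
osc = Unit("IOS", [env, "P6"], ["F1", "V1"])
score = generate(Unit("OUT", [osc]))
```

Keeping a table of already emitted units is what makes the walk valid for a DAG rather than only for a tree: a module whose output feeds several inputs is written once and then referenced by its bus name.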

The use of the graphic window and of the palette is very intuitive.

The modules can be joined in an easy and intuitive way, simply by
clicking and holding the mouse button on an input [output] hot-spot,
dragging to an output [input] hot-spot and releasing the button, thus
forming arbitrarily complex pipelines. Unacceptable connections
(output-output, input-input) are not permitted.

The icons of the input pseudo-modules P, B, V and F show the value of
the corresponding field. This value can be assigned using a command in
the Edit menu.

The cursor shape changes according to the function performed (joining,
selecting, dragging) to provide the user with good visual feedback.

It is possible to select one or more modules and copy, move or delete
them. Connections between modules are copied along with the modules
themselves.

Dragging the selection over the window borders causes the window to
scroll.

During score generation and compilation, a movable modal dialog with a
progress bar is shown (fig. 2). The user can stop the current operation
by clicking a button labeled "Stop" on this dialog. Doing so does not
cause the loss of the samples already generated, which are saved in the
output file and can be used subsequently.

Fig. 2: The progress dialog

Acknowledgement.

Roberto Avanzi gave some technical assistance during M5Mac development
and proofread this document.

References.

[1] Max V. Mathews and others: "The Technology of Computer Music", The
M.I.T. Press, 1992.
[2] Daniel Arfib: "Music V pour PC manual", internal report, CNRS
Marseille, 1990.
[3] Apple Computer, Inc.: "Inside Macintosh", volumes I-VI, Addison
Wesley.

MELODIA: A PROGRAM FOR PERFORMANCE
RULES TESTING, TEACHING, AND PIANO
SCORES PERFORMING.
Roberto Bresin

C.S.C. - Centro di Sonologia Computazionale


via San Francesco 11 - 35121 Padova
phone: 049/8283757 - fax: 049/8283733
E-mail: rb@csc.unipd.it

Introduction

Much research has been carried out on musical performance in recent
years. One of the most interesting works is that developed by Sundberg
and his collaborators at KTH in Stockholm [1] [2]. They built a system
of symbolic rules using the analysis-by-synthesis method. The purpose of
these rules is to turn a music score into an acceptable performance. The
deviations produced by the rules are additive, because each note can be
processed by several rules: the deviations produced by each rule are
successively added to the parameters of that note. With this method,
expert musicians and teachers were asked to formulate some performance
rules. These rules, following the suggestions of an expert musician who
judged whether the added effect of each single rule was pleasant or not,
were applied in sequence during the performance. The performance is
obtained as the sum of the time, sound and timbre deviations due to the
simultaneous application of numerous symbolic rules.

In the present work, I started from Sundberg's results to build a
program that serves as an experimental environment in which teachers,
students and researchers can understand how performance rules can
"modify" a nominal score.

MELODIA: the program

MELODIA is a program implemented at the C.S.C. laboratories as part of
the research on automatic music performance [3] [4] [5]. In particular,
the program was first used to produce output data files suitable for the
neural networks I built to perform any music score in real time. In a
second
phase MELODIA was adapted for use with a Yamaha Disklavier Grand Piano
and to the needs of the professional pianist M° G.U. Battel. The program
can perform via MIDI any score, previously written in a simple language,
applying some symbolic rules. Many of these rules were chosen from those
proposed by Sundberg and co-workers [1] [2], other rules are modified
versions of Sundberg's rules, and there are also completely new rules.
Compared with Sundberg's rule system, MELODIA also considers rests,
grace notes, staccato and legato. The rules are applied interactively:
each time, the user can choose the rule to apply and its weight.

When the rule-selection session is finished, the user is asked whether
to listen to the results or to view them on the screen. The screen view
shows both the graph of time deviations (in milliseconds) and the graph
of loudness deviations (in decibels) resulting from applying the chosen
rules to the score.

After this phase the user can begin a new session with another score, or
view and listen to previously processed scores.

MELODIA, strictly speaking, can be used only on monophonic scores: in
the case of polyphonic music, the program is applied to the principal
melody and then the other parts are synchronized with it.

MELODIA: performance rules

The program contains 19 performance rules: most of them derive directly
from the rules developed at KTH, and I refer the reader to the related
papers for a detailed explanation of them. The rules are listed below:
1 - Durational contrast
2 - Double duration
3 - High loud
4 - Faster uphill
5 - Melodic charge
6 - Harmonic charge
7, 8 - Chromatic charge (with and without rests)
9, 10 - Leap tone duration (Sundberg and Bresin versions)
11 - Leap articulation
12 - Articulation of repetition
13 - The shorter the shorter
14 - The shorter the softer
15 - Social duration
16 - Inégales
17 - Accents
18 - Phrase
19 - Final retard

Some of these rules are modified versions of Sundberg's rules: the
chromatic charge with rests also considers a context of notes with some
rests between them; my version of the leap tone duration inverts the
signs of the
equations to convey the sensation of the physical movements of the
pianist's hands in a leap; the final retard rule shares only its name
with the corresponding KTH rule but acts in a different way.

Conclusions

Since the program allows a "micro" adjustment of all rules, it can be
considered a valid tool for experimenting with performance rules.
MELODIA can also be useful in the teaching of music performance, since
it can show how the emphasis given to some notes with respect to others
changes the performance of a particular composition. Furthermore,
MELODIA is a good tool for obtaining score performances that are
"warmer" than those resulting from a performance without any
micro-deviation or, as in some commercial programs, with some random
deviations introduced.

MELODIA also produces two output files, which contain the input patterns
for two neural networks: one for time deviations, and the other for
loudness deviations [5] [6].

The program runs on an IBM-compatible personal computer with a Roland
MPU-401 MIDI interface or compatible.

References

[1] Sundberg J. et al., "Performance Rules for Computer-Controlled
Contemporary Keyboard Music", Computer Music Journal, MIT Press,
vol. 15, No. 2, pp. 49-55, Summer 1991.
[2] Friberg A., "Generative Rules for Music Performance: A Formal
Description of a Rule System", Computer Music Journal, vol. 15, No. 2,
pp. 56-71, Summer 1991.
[3] Bresin R., De Poli G., Vidolin A., "Un approccio connessionistico
per il controllo dei parametri nell'esecuzione musicale", Atti IX
Colloquio di Informatica Musicale, pp. 88-102, Genova, 1991.
[4] Bresin R., De Poli G., Vidolin A., "Symbolic and sub-symbolic rules
system for real time score performance", Proceedings of the 1992 ICMC,
International Computer Music Association, San Francisco, 1992.
[5] Battel G.U., Bresin R., De Poli G., Vidolin A., "Automatic
performance of musical scores by mean of neural networks: evaluation
with listening tests", elsewhere in these proceedings, 1993.
[6] Rumelhart D.E., McClelland J.L., "Parallel Distributed Processing",
vol. 1, Cambridge: MIT Press, chapter 8, 1988.
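
The additive rule scheme described in the Introduction of this paper (each rule contributes a deviation, and the weighted deviations of all chosen rules are summed on each note) can be illustrated with a short Python sketch. It is not MELODIA's code: the `Note` class, the `perform` function, the two toy rules and all their constants are invented for illustration and do not reproduce the KTH rule definitions.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Note:
    pitch: int              # MIDI note number
    duration_ms: float      # nominal duration from the score
    level_db: float = 0.0   # loudness relative to the nominal level


# A rule maps (notes, index) to a pair: (duration deviation in ms,
# loudness deviation in dB) for the note at that index.
Rule = Callable[[List[Note], int], Tuple[float, float]]


def high_loud(notes: List[Note], i: int) -> Tuple[float, float]:
    """Toy 'high loud' rule: higher notes are played louder.
    The 0.1 dB per semitone above middle C is an invented constant."""
    return (0.0, 0.1 * (notes[i].pitch - 60))


def shorter_softer(notes: List[Note], i: int) -> Tuple[float, float]:
    """Toy rule: notes shorter than 250 ms lose 1 dB (invented values)."""
    return (0.0, -1.0 if notes[i].duration_ms < 250 else 0.0)


def perform(notes: List[Note], rules: List[Tuple[Rule, float]]) -> List[Note]:
    """Apply weighted rules to a nominal score.

    Deviations are computed on the nominal notes and summed, so the
    order in which the rules are listed does not matter."""
    result = []
    for i, n in enumerate(notes):
        d_ms = sum(w * rule(notes, i)[0] for rule, w in rules)
        d_db = sum(w * rule(notes, i)[1] for rule, w in rules)
        result.append(Note(n.pitch, n.duration_ms + d_ms, n.level_db + d_db))
    return result


nominal = [Note(72, 200.0), Note(60, 500.0)]
played = perform(nominal, [(high_loud, 1.0), (shorter_softer, 2.0)])
```

The per-rule weight corresponds to the weight the user chooses interactively for each rule; setting it to zero switches the rule off without changing the others.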

MEDUSA: A POWERFUL MIDI PROCESSOR
Nicola Larosa, Claudio Rosati
IRIS s.r.l.
Parco La Selva, 151
03018 Paliano (FR), Italy
fax: +39 (775) 533343 - tel: +39 (775) 533441
E-mail: mc2842@mclink.it

MEDUSA is a MIDI device that works as a patcher, processor (filter, splitter, transposer, ...), and fader box.
The system core consists of a single-chip ASIC (Application Specific Integrated Circuit) developed using semicustom technology (standard cells).
It was designed within the European project OMI/DE-ARM (part of the ESPRIT programme, n. 6909).
The system, in its current prototype configuration, consists of two distinct parts: a rack module and a fader box, which are connected by a dedicated cable.

Internal structure

The system uses two or more ARM-60 RISC CPUs, MIDI serial interfaces, and fader box communication circuits.
The software architecture manages up to 32 MIDI inputs and 32 outputs. The multiprocessor hardware architecture allows one to assign the computational load to a single CPU or to distribute it among a number of CPUs, according to the required processing power. In the latter case, each CPU handles a subset of the MIDI inputs and/or outputs.
The first CPU (the controller) manages the fader box, the display controller, the LEDs, the buttons, the wheels, and the system parameters, sending them to the other CPUs (the processors) through a high speed parallel link.
The other CPUs work exclusively on MIDI data, fetching it from the inputs, processing it, and then sending it to the outputs.
Each processor performs two fundamental tasks: the routing and the processing of MIDI data.

Routing

Each processor can direct a message from its inputs to any output, independently of and simultaneously with the traffic on all the other connections.
This makes it possible to copy the messages from any input to any number of outputs, and also to merge the messages coming from any number of inputs to any output.

[Diagram labels: input processing, connection processing, processing zone, OUTPUT FIFOs, OUTPUTS]
Figure 1: Processing flow diagram.

The routing of those messages which have a predefined length is carried out in four steps (see fig. 1):

1. the message is stored in an input buffer, regenerating the status byte when "running status" is in use;

2. the message may undergo processing by the zone (see below) associated with that input (input processing);

3. for each connection of that input, the message may undergo processing by up to 128 zones associated with that connection (connection processing);

4. the message is inserted into the FIFO of all those outputs which are connected to that input, thus achieving the merge.

System Exclusive messages require a different treatment, in that they can have arbitrary length and cannot be interrupted by other messages, apart from System Realtime. When several inputs are being merged, the arrival of a SysEx message on one input necessarily delays the processing of the messages on the other inputs until the end of the SysEx message. If the latter is considerably long (even 100 to 150 Kbytes), the delay can reach a few dozen seconds.
To avoid data loss as far as possible, the SysEx messages are inserted into dedicated buffers (not shown in fig. 1), which are dynamically assigned.
When merging n inputs, the average flow of each input must not exceed

$$\frac{3125}{n}\ \text{byte/s}$$

where 3125 byte/s is the full MIDI bandwidth. Heavier flow peaks are allowed within a time frame that depends on the output FIFO's length.
As an example, for a 256 byte FIFO, the merge of 4 full bandwidth inputs is allowed inside a time frame of about 20 ms.

Processing

The processing of messages is based on the concept of zone.
A zone can be pictured as a set of selection operators and a set of transformation operators.

[Diagram labels: selection operators, transformation operators, SysRT & SysEx, (a) (b) (c) (d) (e) (f)]
Figure 2: Zone flow diagram.

The message to be "transformed" is selected by three filters (see fig. 2):

(a) selectively allows the passage of those messages belonging to one or more classes,

(b) selectively allows the passage of those messages belonging to one or more MIDI channels,

(d) selectively allows the passage of those messages whose values fall inside a predefined set.

The (b) operator, besides acting as a channel filter, can "bump" a message from any channel to any other.
The selection (d) operator uses two filtering masks of 128 bits each, whose meaning is defined by the assignment (c) operator (see fig. 3) according to the message class.

message         MSB          LSB
Note On         key          velocity
Note Off        key          velocity
Poly Press      key          pressure
Chan Press      pressure     0
Prog Change     program      0
Ctrl Change     controller   value
Pitch Bend      value MSB    value LSB
MTC             data         0
Song Position   value MSB    value LSB
Song Select     song         0
F4-F6           0            0

Figure 3: MSB & LSB assignment.

The messages thus selected go through two transformation operators:

(e) independent remapping of the values assigned by the (c) operator, using two tables, as follows:

    MSB = tabM[MSB_old + offsetM],
    LSB = tabL[LSB_old + offsetL];

(f) delay or composition of the message, global or independent for each one of the 128 values of MSB or LSB, which are selectable.

The composition process sends byte strings in place of the triggering message. The strings are user definable and stored in dedicated buffers.
A few parameters can be derived from the message and used in the composition. Among them are the MIDI channel, the MSB and LSB values, and their complements.
In the transformation (f) operator it is possible to set the delay time (from 0 ms to 10 s), or else to set the index of the composition buffer to be triggered by the messages, independently of their values.
As an alternative, it is possible to choose a table (addressed by MSB or LSB) where the delay times and/or the indexes of the composition buffers are stored.
For example, selecting the Note messages, one could assign greater delay times to notes with greater key values. At the same time, a few unused notes could be turned, by composition, into System Realtime Start and Stop messages.

Fader Box

By means of sliders, buttons and a joystick, the fader box may generate any MIDI message or group of messages, using specific buffers which are definable by the user.
More simply, a group of instantly recallable modes is also available, in which the controls send the most common messages by default.

Conclusion

MEDUSA was designed focusing on two main aspects:

1. power and flexibility;

2. ease of use.

The system includes many of the features of MIDI patchbays, master keyboards and processors on the market, plus a few not currently available.
This has been achieved by means of a configurable architecture, carefully designed software, and user friendly devices such as a large LCD, wheels and joystick.

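
The selection and remapping stages of a MEDUSA zone, operators (a) to (e) in the Processing section, can be sketched as follows for channel voice messages. This is not the device firmware: the `Zone` class, its field names, and the choice to wrap the table index modulo 128 are assumptions made for illustration; only the MSB/LSB assignment follows Figure 3.

```python
from dataclasses import dataclass, field

ALL_VALUES = (1 << 128) - 1  # 128-bit mask with every value enabled


def assign(msg: bytes):
    """(c) Map a channel voice message to (MSB, LSB) following Figure 3."""
    kind = msg[0] & 0xF0
    if kind == 0xE0:              # Pitch Bend: data bytes arrive LSB first
        return msg[2], msg[1]
    if kind in (0xC0, 0xD0):      # Prog Change, Chan Press: one data byte
        return msg[1], 0
    return msg[1], msg[2]         # Note On/Off, Poly Press, Ctrl Change


@dataclass
class Zone:
    classes: frozenset            # (a) accepted status high nibbles, e.g. {0x90}
    channels: frozenset           # (b) accepted MIDI channels, 0..15
    mask_m: int = ALL_VALUES      # (d) 128-bit mask on the MSB value
    mask_l: int = ALL_VALUES      # (d) 128-bit mask on the LSB value
    tab_m: list = field(default_factory=lambda: list(range(128)))
    tab_l: list = field(default_factory=lambda: list(range(128)))
    offset_m: int = 0
    offset_l: int = 0

    def selects(self, msg: bytes) -> bool:
        """Apply the three selection filters (a), (b) and (d)."""
        msb, lsb = assign(msg)
        return ((msg[0] & 0xF0) in self.classes
                and (msg[0] & 0x0F) in self.channels
                and bool((self.mask_m >> msb) & 1)
                and bool((self.mask_l >> lsb) & 1))

    def remap(self, msg: bytes):
        """(e) Return the remapped (MSB, LSB) pair of a selected message."""
        msb, lsb = assign(msg)
        # Wrapping the index modulo 128 is an assumption of this sketch.
        return (self.tab_m[(msb + self.offset_m) % 128],
                self.tab_l[(lsb + self.offset_l) % 128])


# Usage: select Note On messages on channel 0 with keys 60..71, transpose up an octave.
octave_keys = sum(1 << k for k in range(60, 72))
zone = Zone(classes=frozenset({0x90}), channels=frozenset({0}),
            mask_m=octave_keys, offset_m=12)
note_on = bytes([0x90, 60, 100])
if zone.selects(note_on):
    print(zone.remap(note_on))
```

Storing each mask as a single integer keeps the selection test to one shift and one AND per value, which mirrors the two 128-bit masks described in the text.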
PWCONSTRAINTS

Mikael Laurson

Tietokonemusiikistudio
Sibelius Akatemia
PL86
00251 Helsinki (Finland)
E-mail Mikael@nextl.siba.fi

Abstract: PWConstraints is a rule-based constraint language working under PatchWork (PW) (Duthen, Laurson, Rueda). It can be used to fill in the pitch information of musical textures. Currently the system can write classical counterpoint and solve problems using musical set theory to generate rows, matrices, chains, harmonic structures, etc.

PWConstraints provides a backtrack search engine, a genetic search engine and a traditional musical score representation.
If the user has a specific musical problem, she/he can write rules as Common Lisp functions without having to worry about backtracking, creating musical structures from scratch, other rules, etc.
Rules are modular in the sense that they can be in any order and that you can insert or take away a rule without affecting the rest of the system.
The user specifies the time structure of the problem in a PW rhythm editor module as a traditional score, either by hand or by algorithm. During the search the score provides the rules with information about the musical context of each note.
A part of a problem can be constrained in advance, and the task of the search is then to fill the unconstrained parts. This makes the system much easier and more natural to control, because you do not have to invent rules for those parts that are constrained in advance. Examples where this kind of constraining might be useful are cadences in classical counterpoint, cantus firmus techniques, etc.
Because the user works directly with a traditional score, making those constraints is as straightforward as writing notes on paper.
The result of the calculation is written directly to the score, so reading, listening to and printing the result is easy.
To evaluate the result the user can use the analysis tools that are available in PatchWork, like the set-theoretical analysis package PWMacSet (Castren, Laurson).
PWConstraints can also be used as an analysis tool where a score is given as an argument to the rule system. In this case all pitches of the given score are constrained and the system loops through the score. If a rule fails, a message is printed giving information about the rule and the exact location in the score.

[Diagram labels: Score, context of current note; Backtrack search, vector of sorted notes, current note; candidates of current note before rules; musical rules (rule 1, rule 2, ..., rule N); candidates of current note filtered by rules; sort; candidates sorted after rules; update current note with first item of candidates after rules and sort]
Figure 1. Overview of the system

PWConstraints consists of three parts: score, search and rules (figure 1).
The search space is defined at the preparation phase by taking all the ordered note objects from a score and assigning to each of them a range of possible pitches.
For example, the task might be to write a diatonic five-note melody for a soprano voice with a range of 1.5 octaves (in MIDI values, from 60 to 77).
This means that we assign to each of the five notes a list of MIDI values; in this example the list would be (60 62 64 65 67 69 71 72 74 76 77). All solutions for this problem can be found by taking all possible five-note paths that select one MIDI value from each range list, or more formally by taking the Cartesian product of all range lists (or candidates).
The search engine works on a vector of ordered note objects. The search can proceed in both directions. A current note is selected from the beginning or the end of the vector. All current rules are called with the same argument list, which contains information about the current note, its list of candidates, its surroundings in the score, etc.
Each rule acts as a chained filter, so the list of candidates is filtered repeatedly. After all rules have been applied, the filtered list represents all the MIDI values that a note object can have in this specific musical context. This list is sorted according to a user definable function. The first item of the filtered list is written as the pitch value of the note and the rest of the list is stored in the note object. The new pitch value changes the state of the score, and the following note objects can refer to this change when the musical status is analysed later in the search. The rest of the list acts as a kind of buffer in case the search later has to backtrack. If backtracking occurs, the first item of the list is popped and written as the new pitch value of the current note without having to run the rules again. The status of the score has changed and the search engine can go on.
If the filtered list is empty, the search has to backtrack and go to the previous note object (default behaviour). Otherwise the search engine can select the next note object as current and continue the search. A rule can control the backtrack mechanism by jumping directly to any previous location in the score. For instance, if a melodic rule fails it can jump directly to the note that caused the failure, instead of having to backtrack to that note step by step.
Figure 2 shows a short example using set theory to create a hexachord chain with three embedded tetrachords.
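
To give a feeling for the size of the search space in the five-note example, here is a small sketch of the Cartesian product of the range lists. It assumes the candidates are the white-key pitches between MIDI 60 and 77; the variable names are illustrative and are not part of PWConstraints.

```python
from itertools import islice, product

# Assumed candidate list: C-major diatonic pitches from MIDI 60 to 77.
candidates = [60, 62, 64, 65, 67, 69, 71, 72, 74, 76, 77]
n_notes = 5

# Every five-note path picks one value from each note's range list.
search_space = product(candidates, repeat=n_notes)
size = len(candidates) ** n_notes

# The product is generated lazily, so only the first few paths are materialised.
first_paths = list(islice(search_space, 3))
print(size)
print(first_paths)
```

Even this toy problem has more than a hundred thousand paths, which is why the rules are used to prune candidates note by note instead of enumerating the whole product.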

Figure 2. A set-theoretical example. Above a PW patch and below the result.
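
The backtrack search described in this paper (rules as chained filters, a sorted candidate list, and a per-note buffer of untried candidates) can be sketched in a few lines. This is a simplified illustration, not the PWConstraints implementation: rules here are plain Python predicates over the already written pitches, the three example rules are invented for the demonstration, and the rule-controlled jump to an arbitrary earlier note is not modelled.

```python
def search(n_notes, domain, rules, sort_key=None):
    """Fill n_notes pitches left to right; return them, or None if no solution."""
    pitches = [None] * n_notes      # the "score"
    remaining = [None] * n_notes    # per-note buffer of untried candidates
    i = 0
    while 0 <= i < n_notes:
        if remaining[i] is None:
            # Forward step: every rule filters the candidate list in turn.
            cands = list(domain)
            for rule in rules:
                cands = [p for p in cands if rule(pitches[:i], p)]
            cands.sort(key=sort_key)
            remaining[i] = cands
        if remaining[i]:
            # Write the first candidate; keep the rest for later backtracking.
            pitches[i] = remaining[i].pop(0)
            i += 1
        else:
            # Dead end: forget this note and go back to the previous one,
            # whose stored candidates are reused without rerunning the rules.
            remaining[i] = None
            pitches[i] = None
            i -= 1
    return pitches if i == n_notes else None


# Hypothetical rules for a five-note diatonic melody.
def no_repeated_pitch(prev, p):
    return p not in prev

def step_of_a_fourth_at_most(prev, p):
    return not prev or abs(p - prev[-1]) <= 5

def last_note_is_c5(prev, p):
    return len(prev) != 4 or p == 72


domain = [60, 62, 64, 65, 67, 69, 71, 72, 74, 76, 77]
melody = search(5, domain,
                [no_repeated_pitch, step_of_a_fourth_at_most, last_note_is_c5])
print(melody)
```

In this run the first attempt at the fifth note finds no candidate, so the search returns to the fourth note, takes its next stored candidate and moves forward again.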

SoundLib 2.0
A C++ class library for processing sampled audio signals
Andrea Provaglio

c.c.A., Università di Padova

via San Francesco, 11
I-35121 Padova (Italy)
fax +39 498283733
E-mail musicOl@mlipad.unipd.it

Abstract
This paper describes the main design strategies adopted in the development of SoundLib 2.0, a C++ class library for the processing of sampled audio signals. The design intentionally resembles the standard iostream library, but there are no inheritance relationships between the two libraries. A Parent-Child mechanism among instances of some SoundLib classes was used to combine different abstractions.
Operator overloading and friend functions were used extensively to achieve a simple syntax for using the classes.

Introduction
SoundLib 2.0 is a C++ class library aimed specifically at the development of software for processing sampled audio signals. As far as possible, the design has not been tied to a given hardware and/or software platform, and the library currently exists in versions for DOS and for Windows.
During the design phase, particular attention was paid to creating modular and highly reusable abstractions, and to providing a syntax for using them that is as simple as possible. The design of SoundLib was also inspired by the standard iostream library, because of the evident analogy of some functionality, the elegance of its syntax and the proven robustness of its architecture. On the other hand, since there are clearly cases in which the analogy breaks down completely or the designs are incompatible, there are no inheritance relationships between the two libraries; the resemblance to iostream is mainly syntactic, as in the use of the insertion and extraction operators.

Applications
The development of the library was guided by an analysis of the classic problems encountered when building applications that handle sampled signals; the operations SoundLib was designed for have been divided into the functional categories described below.

Transfer of samples between different areas of the system
These are low-level operations, such as the dialogue between drivers of AD/DA conversion devices and memory buffers, the writing and reading of sample buffers to and from mass storage, and the exchange of samples between memory buffers.

Marking and selection operations
These are the operations that make it possible to indicate particular points or areas of the signal.
Implementing this kind of functionality, apparently simple, actually had a considerable influence on the design of the library, leading to the Parent-Child architecture described further on.
Marking operations indicate a particular point of the signal. These markers can be tied either to an instant relative to the beginning of the signal or to an event in the content of the signal; markers used in mixing are an example of the first case, while markers used in speech segmentation are an example of the second. The difference between the two kinds of marker is that the latter are automatically moved to a different instant when editing operations upstream of the marker change the duration of the signal.
Selection operations indicate the area of the signal to operate on, for example the region to display on the screen, to filter or to analyse. Two kinds of selection have been identified, temporal and spatial: a temporal selection specifies the area of the signal as a time interval, while a spatial selection indicates which channels to operate on. The two kinds of selection can be combined, and they have the same properties as the markers described above.

Sample processing
This covers all the operations that modify the signal, the simplest of which are cut and paste. Other processing operations are the various kinds of mixing, which among other things extend the insert and overwrite operations typical of cut and paste; these include cross-fade, fade-in and fade-out. Finally there is filtering, understood here in a broad sense to include equalization, reverberation, time stretching and so on: all the algorithmic operations that alter the spectral content of the signal.

Signal analysis
These are the operations that return information about the content of the signal, from its duration or maximum peak up to its spectral content, pitch contour and so on.

Algorithmic signal generation
The algorithmic generation modules include white noise, silence and waveform generators.

Description
Conceptually, the core of the library is an abstract class named Sound, from which descend the two abstract classes ISound and OSound; these define how samples are inserted and extracted, with a syntax analogous to that of iostream, as well as functions for querying the state of the object.
The whole family of algorithmic signal generators descends from the ISound class; the library currently implements only the simplest of them. One application of the ISound descendants, currently being evaluated, is the implementation of classes that support sample generation from a score written in a sound synthesis language (csound, Music5).
By analogy with iostream, the abstract class IOSound descends from ISound and OSound, and most of the non-abstract classes of the library descend from it in turn, including SoundChannel, SoundSelection and SoundMarker. These abstractions provide the functionality needed for the marking and selection operations described above, and they were decisive in the choice of the Parent-Child architecture among the instances of these classes (note that the Parent-Child relationship is not to be understood here as inheritance). We call child an object of a class descending from Sound when it handles samples indirectly, the samples being supplied by a parent object specified when the child is created. The parent keeps track of all its children, so that it can notify them of any changes in the signal that may concern them. Classes that handle samples directly can therefore only be parents, SoundChannel and SoundSelection can be both parent and child, while SoundMarker can only be a child. Combining the inheritance relationships of most classes with the Parent-Child architecture described above makes it possible to combine the various abstractions offered by SoundLib in an extremely simple, yet flexible and powerful, way (see Example 1).
The same syntax and the same conceptual approach have been kept in the handling of AD/DA conversion devices, soundfiles and memory buffers, by deriving the corresponding higher-level abstractions from the Sound family. This, together with particular care in implementing the dialogue mechanism between class instances, makes these abstractions extremely convenient to use.
Lower-level classes have nevertheless been developed to handle the soundfile format and the use of system resources.
Suitable manipulators and friend functions of the Sound family make it possible, with an iostream-like syntax, to set the parameters for signal acquisition and playback, to apply filtering, and so on.

Conclusions
SoundLib has already been used successfully in the development of complete, working applications, considerably simplifying their implementation. The approach chosen for the design has proved particularly solid and effective.

Acknowledgements
The author wishes to thank Nicola Bernardini for the brainstorming, and Graziano Tisato and Roberto Cavazzana for their collaboration.

Example 1
// This example opens an existing stereo soundfile,
// zeroes samples 1000 to 2000 of the left channel
// and closes the file ("silence" is provided by the library).
int main()
{
    DiskSound aStereoSound( "DEMO.WAV" );
    SoundChannel leftChannel( aStereoSound, CHAN_LEFT );
    SoundSelection noisy( leftChannel, 1000, 2000 );
    silence >> noisy;
    return 0;
}
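
The Parent-Child notification described above (a parent that keeps track of its children and tells them when the signal changes) can be illustrated with a small sketch. It is written in Python for brevity and does not reproduce SoundLib's C++ interface: `SoundBuffer`, `Marker`, `attach`, `cut` and `on_cut` are hypothetical names, and only the behaviour of the two marker kinds follows the text.

```python
class SoundBuffer:
    """Parent: owns the samples and keeps track of its children."""

    def __init__(self, samples):
        self.samples = list(samples)
        self._children = []

    def attach(self, child):
        self._children.append(child)

    def cut(self, start, end):
        """Remove samples[start:end] and notify every child of the edit."""
        del self.samples[start:end]
        for child in self._children:
            child.on_cut(start, end)


class Marker:
    """Child: refers to a position in the signal held by its parent."""

    def __init__(self, parent, position, follows_content):
        self.position = position
        self.follows_content = follows_content
        parent.attach(self)  # the parent is specified when the child is created

    def on_cut(self, start, end):
        if not self.follows_content:
            return                           # tied to an instant: never moves
        if self.position >= end:
            self.position -= end - start     # edit happened upstream: shift back
        elif self.position > start:
            self.position = start            # the marked sample itself was removed


# Usage: two markers on sample 50, then an edit upstream of both.
buf = SoundBuffer(range(100))
time_marker = Marker(buf, 50, follows_content=False)
event_marker = Marker(buf, 50, follows_content=True)
buf.cut(10, 20)
print(time_marker.position, event_marker.position)
```

After the cut, the content-bound marker still points at the sample it originally marked, while the time-bound marker stays at the same offset from the start of the signal.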

"HIPPOPOTAMUS"
An interactive performance system

Giovanni Ramello

(A.M.EI'. - Conservatorio G. Verdi di Torino)


via Torricelli, 5 - 10128 Torino
fax ++39.11.507141

Abstract

This work was produced at the Banff Centre for the Arts as a musical installation for an art exhibition in a former zoo.
The original idea was to create a dynamic musical process with chosen degrees of interaction between the player and the machines.
Two controllers are used to input new parameters and musical events. The first is a standard keyboard "re-interpreted" by Max (the IRCAM object-oriented language running on a Macintosh platform), so that it responds in a different way to the gestures of traditional instrumental technique; the second is the monitor, with its own mouse-controlled virtual sliders, keys and buttons.
The system is essentially controlled by two different algorithms written in Max, which realize the two movements of the piece: "The hippopotamus steps", with an onomatopoeic flavour, and "The jungle sounds", the more extensively interactive one. The keyword of "Hippopotamus" is the interplay between the human being and the machine.

Introduction

This work, produced in the summer of 1992 at the Banff Centre for the Arts, was conceived as an audio installation for the premises of a former zoo and was later modified for concert performance.
The initial intention was to create a real-time sound flow in continuous evolution, and the tools of algorithmic composition proved particularly suited to this purpose.
Starting from Reich's definition of music as a "gradual process" [1], we therefore tried to build a dynamic process rather than a score for computer; at the same time we felt the need to include an active human contribution, through a few chosen degrees of interaction between a performer/user and the mechanism itself. The use of random variables within the algorithms and of an "open" score links this work, at least in terms of inspiration, to the many studies and experiments on indeterminacy in music carried out by the schools of stochastic music and of intuitive and improvised music [2], with a particular tribute to John Cage's teachings on chance in music.
The main guideline was to design a system that combines the human and the artificial according to a logic of non-pre-eminence, in a sort of strategic game in which man and machine converse and act as mutually interdependent partners. The keystone of the process is the "interplay": the computer generates some musical events, the performer can modify them and add new ones, the machine can in turn alter them or simply "learn" them and propose them again, more or less varied, and so on. The resulting musical product therefore bears the mark of the combined choices of performer and computer; the latter, of course, moves in turn within the limits set by the composer/programmer.
From the performance point of view, the system was built so as to free the performer from the traditional gesture/sound correspondence, through the "reinterpretation" of an ordinary electronic keyboard and the creation of a specific virtual controller on the monitor, accessible via the mouse.

Materials

"Hippopotamus" uses a Yamaha TG77 expander for sound synthesis, a Yamaha SY55 keyboard as master keyboard, 2 Yamaha SPX 900 DSPs, an 8-into-4 mixer with its amplification and diffusion system, a Macintosh with MIDI ports and at least 2 megabytes of memory, and software written in IRCAM's object-oriented language Max.
The configuration of the MIDI system is the following: SY55 out --> in Macintosh out --> in TG77.
The audio part, by contrast, was adapted each time to the needs and resources at hand, although the optimal solution is quadraphonic (double stereo).
The controllers available to the performer are the master keyboard, with two possible configurations, and the computer monitor (also with two alternative algorithms), on which the user can steer the process with the mouse by acting on virtual sliders and buttons.

Methods

The "Hippopotamus" system consists of two algorithms. The first and simpler one turns the modulation wheel and data slider of the SY55 into musical instruments in their own right, through which the note codes and the velocity of 2 different timbres of the TG77 are controlled simultaneously, giving life to the first movement, "I passi dell'ippopotamo" ("The hippopotamus steps"), with a vaguely onomatopoeic flavour.
The second algorithm, used for the following movement, "I suoni della giungla" ("The jungle sounds"), contains three random event generators on which the performer can intervene and which, in turn, also process the data entered by the user through the controllers.
Two of these generators are substantially identical, offer a more limited level of interactivity and serve to create a double flow of sound events that represents, also in timbre, a "rain forest". The third generates the interventions of the forest's "inhabitants" and therefore selectively produces the musical appearances of fourteen different timbres. To obtain this alternation of MIDI channels without running into note-off problems, especially after particularly fast rhythmic figures between different timbres (which caused considerable difficulties in practice), we built an object [3] called "filtro" that switches 14 sub-generators (one per timbre) on and off, thus keeping the MIDI code streams stable, without "channel changes".
The performer's interaction with the first two generators is limited to switching the process on and off and to optionally inserting notes that the computer processes. The human presence is more pervasive in the third, where the user can, as usual, insert note events and switch the process on and off as a whole, but also in its individual parts, and so plays a decisive role in the orchestration, the sound density and the presence of rests in what we might call the solo part. In addition, a virtual slider sets the speed at which the generator processes and responds to the "live" inputs.
The random variables used are those built into the language (instructions of the "random" type). To avoid too marked a character in the musical results, several "random" instructions were connected in parallel or in series, depending on whether or not the extraction range had to be bounded.

Considerations

Working with interactive systems opens up interesting creative prospects, at the price, however, of new problems.
This kind of production leads, in effect, to a rethinking of the position of the composer, as traditionally understood, in the creative process. On the one hand the composer loses the almost total power of determination over the musical product that the written score provides, but gains in the number of parameters of choice and control. Moreover, if the definition of music as "noise given form" [4] according to changing syntaxes holds, the computer (thanks to the flexibility of tools such as Max and the MIDI environment) offers almost unlimited freedom in ordering the domain of audible frequencies.
The same openness is likewise left to the performer, who is in a less constrained and more reactive position than with traditional instruments (of course, none of this implies absolute value judgements).
In this regard, trials of "Hippopotamus" by artists, including non-musicians, showed an almost liberating satisfaction (especially among instrumentalists) in using the system to produce and manipulate sound material without the anxieties and inhibitions that occur in traditional musical practice.
Finally, careful and prolonged use of the system made it possible to discover its idiosyncrasies and qualities, encouraging applications in different contexts and compositional and performance developments not considered at the design stage.

Bibliography

[1] S. Reich, "Écrits et entretiens sur la musique", C. Bourgois Ed., pp. 49-51, 1981.
[2] C. Roads, "Introduction", in "Composers and the Computer", C. Roads ed., pp. XI-XXI, W. Kaufmann Inc., 1985.
[3] D. Zicarelli, "Writing External Objects for Max", Opcode Systems, 1991.
[4] J. Attali, "Rumori. Un'economia politica della musica", p. 38, Mazzotta, 1978.

THE TWIN TOWERS
A DEVICE
FOR INTERACTIVE
COMPUTER MUSIC PERFORMANCE

L. Tarabella, G. Bertini, M. Romboli

Computer Music Department of CNUCE/C.N.R.


Via S.Maria 36, 1-56126 Pisa
tel. +39-50-593276 fax +39-50-904052
e.mail:music5@icnucevm.cnuce.cnr.it

Abstract
The Twin Towers described here is a device for controlling interactive computer music. It detects information from the movements of the hands, yet requires no physical connection with them.
It consists of two sets of four sensing devices which create two zones of space (bounded by the vertical edges of two square-based parallelepipeds, or towers) where an object can be detected in terms of distance and of front and side rotations with respect to a reference frame.
Each sensing device is an infrared-based ranging system composed of a transmitting diode and a receiving diode.
From the four heights measured in each tower, it is possible to compute the heights and the rotations of the hands.
The whole system thus implements a sort of pair of aerial three-dimensional joysticks.

Introduction.
In recent years many devices have been built and/or used that detect information from the movement of the hands (and fingers) in order to control interactive computer music systems, for example the Data Glove, The Hands, etc. [1] [2]
The Twin Towers described here are a device that falls into this category because it detects the movement of the hands, yet it requires no physical contact with electromechanical parts.
The Twin Towers were built in collaboration with the Signal and Image Processing Department of IEI/CNR, Pisa, as an evolution of the Infrared Pyramid built at the ARTS (Advanced Robotics Technology and Systems) Lab, Pisa. [3]

The Twin Towers.
The device consists of two sets of four sensors that create two zones of space where an object can be detected in terms of distance and orientation with respect to a reference frame. Each of the two sets consists of four sensors placed at the vertices of a square with sides of about 5 centimetres and pointing upwards; the beams are therefore ideally positioned along the vertical edges of two square-based towers, which are in turn placed about 30 centimetres apart.
From the beam heights measured in each tower, the height and rotations of each of the two hands are computed: the system thus implements a sort of double aerial joystick.

Fig. 1 - The Twin Towers

Implementation.
Each sensor consists of two photodiodes, the first transmitting and the second receiving the beam reflected by the obstacle it meets.

Fig. 2 - Photodiodes (Tx, Rx)

The control electronics consist of a multiplexed transmission system that drives the 8 infrared transmitting diodes, and of 8 receiving systems. The transmission system is made up of a pulse generator and a power amplifier. The control signal is a square wave at ≈130 Hz with a 100.01 % duty cycle. [4][5]
The signal supplied by each receiving diode is passed through a high-pass filter, which removes the interference caused by light from artificial sources, and sent to a driver amplifier, which thus supplies a voltage proportional to the distance between the detected object and the infrared source itself.

Fig. 3 - Block diagram of the control electronics (Tx diodes 1-8; Rx diodes 1-8, voltage followers and high-pass filters, amplifier drivers, AKAI ME35T analog-to-MIDI converter, MIDI)

The Twin Towers control circuitry sends the analog values of the 8 transducers to the 8 inputs of an analog-to-MIDI trigger (AKAI ME35T); a set of 8 values is sent every tenth of a second, and the analog-to-MIDI trigger is programmed to generate MIDI NoteOn messages in which the Key Number identifies the transducer (1 to 8) and the Velocity carries the height value.
The height resolution of each sensor is about 2 millimetres and the useful operating range is about 25 centimetres.
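The NoteOn encoding just described can be decoded with a few lines of code. The following is a minimal Python sketch, not the authors' program: the assignment of keys 1-4 to one tower and 5-8 to the other, the linear velocity-to-centimetre scaling and all function names are assumptions made for illustration.

```python
from typing import Iterable, List, Tuple

NOTE_ON = 0x90      # status nibble of a MIDI NoteOn message
RANGE_CM = 25.0     # useful operating range quoted in the text
MAX_VELOCITY = 127  # largest 7-bit MIDI data value


def velocity_to_cm(velocity: int) -> float:
    """Map a 7-bit velocity linearly onto the 25 cm range (assumed scaling)."""
    return RANGE_CM * velocity / MAX_VELOCITY


def decode_frame(messages: Iterable[Tuple[int, int, int]]) -> List[List[float]]:
    """Turn (status, key, velocity) triples into two towers of four heights.

    Keys 1-4 are assumed to belong to the first tower, keys 5-8 to the second.
    """
    towers = [[0.0] * 4, [0.0] * 4]
    for status, key, velocity in messages:
        if status & 0xF0 != NOTE_ON or not 1 <= key <= 8:
            continue  # ignore anything that is not a sensor NoteOn
        tower, sensor = divmod(key - 1, 4)
        towers[tower][sensor] = velocity_to_cm(velocity)
    return towers
```

Under this assumed linear scaling, one velocity step corresponds to 25 cm / 127, roughly 2 mm, which is consistent with the resolution quoted above.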

Fig. 4 - Three different possible situations of a single tower (corner heights h1, h2, h3, h4; base side 5 cm)

The NoteOn messages generated by the analog-to-MIDI trigger are used by a program running on a personal computer which, from the 4 heights of each tower, computes the mean height and the side and front rotations of the two hands:

\text{Mean height} = \frac{h_1 + h_2 + h_3 + h_4}{k}

\text{Side rotation} = \operatorname{atan}\left(\frac{h_1 - h_2 + h_3 - h_4}{k}\right)

\text{Front rotation} = \operatorname{atan}\left(\frac{h_1 - h_3 + h_2 - h_4}{k}\right)
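The three quantities can be computed directly from the four heights of a tower. The sketch below is only an illustration: the text does not define the constant k, so the values used here (4 for the mean, and twice the 5 cm side of the square for the rotations, so that the arctangent argument is an average slope) are assumptions, as are the function and parameter names.

```python
import math
from typing import Sequence, Tuple


def hand_pose(
    heights: Sequence[float],
    k_mean: float = 4.0,   # assumed: plain average of the four heights
    k_rot: float = 10.0,   # assumed: 2 x the 5 cm side of the square
) -> Tuple[float, float, float]:
    """Return (mean height, side rotation, front rotation) for one tower.

    Heights are in centimetres; rotations are returned in radians.
    """
    h1, h2, h3, h4 = heights
    mean_height = (h1 + h2 + h3 + h4) / k_mean
    side_rotation = math.atan((h1 - h2 + h3 - h4) / k_rot)
    front_rotation = math.atan((h1 - h3 + h2 - h4) / k_rot)
    return mean_height, side_rotation, front_rotation
```

A flat hand held 10 cm above the sensors gives a mean height of 10 and zero rotations; raising the side above sensors 1 and 3 by 2 cm gives a side rotation of atan(4/10) and leaves the front rotation at zero.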

The program then generates a series of MIDI Control Change messages according to the following reference table:

Bn 5A vv   height, right hand
Bn 5B vv   side rotation, right hand
Bn 5C vv   front rotation, right hand
Bn 5D vv   height, left hand
Bn 5E vv   side rotation, left hand
Bn 5F vv   front rotation, left hand

Conclusion.
The use of an analog-to-MIDI trigger and a personal computer should be regarded as a temporary solution: the final form of the apparatus will also incorporate a module that generates the MIDI messages.
Given its characteristics, this device can be classed as a MIDI controller, since the messages it generates can be used to control in real time any sound-generating equipment that follows the MIDI standard. It should be stressed, however, that the main purpose for which the Twin Towers were designed and built is to control interactive musical compositions created with systems and languages such as MAX, PascalMusic and others. [6] [7] [8]

Acknowledgements.
We thank Ing. V. Genovese of the ARTS Lab in Pisa for the technical support he gave us on the sensor technology typical of robotics.

Bibliography
[1] M. Waisvisz: "The Hands, a set of remote MIDI controllers", ICMC 84 Proceedings, pp. 313-318, 1984.
[2] R.B. Knapp, H.S. Lusted: "A Bioelectric Controller for Computer Music Applications", CMJ Vol. 14, n. 1, pp. 42-47, 1990.
[3] V. Genovese et al.: "Infrared-Based MIDI Event Generator", Proceedings of the International Workshop on Man-Machine Interaction in Live Performance, Servizio Tecnografico, Area di Ricerca, Pisa, pp. 1-8, 1991.
[4] M. Idesawa: "Optical Range Finding Methods for Robotics", Proceedings of the IEEE International Conference on Robotics and Automation, pp. 207-213, 1986.
[5] B. Espiau and J.Y. Catros: "Use of Optical Reflectance Sensors in Robotics Applications", IEEE Transactions on Systems, Man and Cybernetics, Vol. SMC-10, No. 12, pp. 903-912, 1980.
[6] L. Tarabella: "Informatica e Musica", Jackson Libri, 1992.
[7] M. Puckette: "Combining Events and Signal in the MAX Graphical Programming Environment", Computer Music Journal, Vol. 15, n. 3, pp. 68-77, 1991.
[8] L. Tarabella: "Real-Time Concurrent PascalMusic", Interface, Vol. 22, pp. 229-241, 1993.

Chapter 8

MULTIMEDIA SYSTEMS,
VIRTUAL REALITY,
SPATIALIZATION
A MAX APPLICATION FOR SIMULATING
MOVING SOUND SOURCES WITH
LOW-COST COMMERCIAL DEVICES
Andrea Belladonna, Alvise Vidolin

Conservatorio di Venezia "B. Marcello"

C.S.C. Università degli Studi di Padova


via Gradenigo 6/a - 35129 Padova - Italy
phone: +39 49 8287631- fax: +39498287699
E-mail: vidolin@maya.dei.unipid.it

Abstract

The system we have developed provides:
- simulation of moving sound sources on systems with 2 to 8 independent audio channels;
- sound movements in every direction;
- adaptability to different kinds of performance spaces;
- simulation of the dimensions of a virtual space and of the "near/far" effect of a sound source, even beyond the real space;
- full real-time control of the simulation models through a sophisticated graphical interactive user interface.
The system is based on a program developed in Max on Apple Macintosh computers; it controls a group of MIDI-driven audio potentiometers, and these automated potentiometers produce the changes in audio level that are perceived as interaural amplitude variations. Digital signal processors produce the delayed and reverberated signals.
The system can be used in any live performance where Live Electronics calls for simulations of sound source movement, even very complex ones.
The structure of the program gives full control over the parameters that define the dimensions of the virtual space and the movements of the sound source within it.
The user can control these parameters in two different ways: in real-time mode, and with "playlists" of sound movement events written in a simple internal language.

Introduction

The system described here simulates moving sound sources on installations of 2 to 8 independent loudspeakers, with paths possible in every direction. It can be used in concert performances of Live Electronics works, running in real time some simulation models that were originally developed for deferred time [1] or implemented on hard-to-find prototypes such as the 4i System [2], the Halaphon [3] or Trails [4].
The system is based on a program written in the Max environment on the Apple Macintosh platform, which drives a bank of audio potentiometers remotely controllable via MIDI; these produce the changes that are perceived as interaural amplitude variations. Digital signal processors generate the delayed signal and the reverberated signal.
Using MIDI controllers as the control core of the system also allows remote control, and therefore independence from the position of the mixing console; this is very convenient in practice and does not rule out use of the program by the instrumentalists themselves during a concert.
The system can be used in all performance applications in which the Live Electronics part requires spatializations, even complex ones.
Compared with dedicated spatialization systems such as the SP-1 processor [5], the Halaphon, the Mini Trails and others, this system uses discrete components that are readily available, and it can adapt to the technological innovations that the MIDI musical instrument industry continually offers; in the current version, presented in the demonstration, it is designed to manage two Niche Audio Control Modules (ACM) and two Yamaha SPX1000 digital signal processors.
In some tests the Niche units were replaced by the Sound Engineer system from GeneralMusic [6], with equivalent results.
Since the core of the system is a set of MAX patches, each performing a different task, it is not tied to any particular hardware and can adapt to the different devices employed.

Description of the system

The system, shown schematically in Figure 1, is made up of:
1) a bank of 8 MIDI-controllable potentiometers (ACM 1) that produce the amplitude envelopes associated with each loudspeaker;
2) a digital signal processor dedicated to reverberation;
3) a digital signal processor dedicated to generating delays, with delay times variable from 1 to 30 ms;
4) a bank of 8 MIDI-controllable potentiometers (ACM 2) that controls the direct signal, the delayed signal and the reverberated signal, if necessary for several lines as well.
segnale diretto, il segnale

FIGURE 1 - Block diagram of the system (audio source, control faders, delay, reverb, ACM 1, audio mixer)

Description of the movement generator

The movement generator consists of two amplitude-envelope generators, synchronized so as to produce a crossfade between the amplitude level of one loudspeaker and that of the next loudspeaker along the chosen path. The envelope curve can be defined by the user in several ways (drawn by hand or generated algorithmically), includes a threshold control (minimum amplitude level) and can be displayed graphically. This makes it possible to adapt the movement generator to any performance space and to make the movement more or less evident (see Figure 2).

FIGURE 2

The envelope curve is defined in a table of 128 points and is read at a variable clock rate whose maximum value is 200 Hz; this keeps the amount of MIDI data to be processed and transmitted small while still giving sufficient resolution. To obtain envelopes shorter than 1 s without changing the amount of MIDI data generated, the table reading step is increased.
All the experimental tests carried out showed that envelope durations down to 0.1 s are acceptable, well below those in current use.
Paths are built by associating a different MIDI Control Change with each envelope.
Because of the crossfade technique, only two controllers are involved at any one time, so it is easy to build varied and elaborate paths. The MIDI Control Change is switched when a table reading begins, in order to avoid disturbances due to the switching.
Figures 3 and 4 show two details of the movement generator.

FIGURE 3

FIGURE 4 - Detail of the movement generator: Max patch with send objects (s Run_1_A, s Run_1_B, s iter_1), two gate 8 objects and sends s F_1_1 to s F_1_8
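The table-driven crossfade of the movement generator can be illustrated with a short Python sketch. It is not the Max patch itself: the raised-cosine shape of the curve, the rule used to choose the reading step, the controller numbers and all names are assumptions; only the table size (128 points) and the 200 Hz clock limit come from the text.

```python
import math
from typing import Iterator, List, Tuple

TABLE_SIZE = 128      # points in the envelope table (from the text)
MAX_CLOCK_HZ = 200.0  # maximum table reading rate (from the text)


def make_table(threshold: int = 0) -> List[int]:
    """Fade-in curve from `threshold` up to 127 (raised-cosine shape assumed)."""
    span = 127 - threshold
    return [
        threshold + round(span * 0.5 * (1 - math.cos(math.pi * i / (TABLE_SIZE - 1))))
        for i in range(TABLE_SIZE)
    ]


def reading_plan(duration_s: float) -> Tuple[int, float]:
    """Pick a table step and a clock rate (Hz) for the requested duration."""
    step = 1
    while TABLE_SIZE / (step * MAX_CLOCK_HZ) > duration_s:
        step += 1  # skip table points once the clock limit is reached
    ticks = math.ceil(TABLE_SIZE / step)
    return step, ticks / duration_s


def crossfade(
    table: List[int], cc_from: int, cc_to: int, step: int = 1
) -> Iterator[Tuple[Tuple[int, int], Tuple[int, int]]]:
    """Yield one pair of (controller, value) messages per clock tick."""
    for i in range(0, TABLE_SIZE, step):
        rising = table[i]
        falling = table[TABLE_SIZE - 1 - i]  # mirror image of the same table
        yield (cc_from, falling), (cc_to, rising)
```

In this sketch only two controllers change during a transition, as in the text, and raising `threshold` leaves part of the signal on the loudspeaker that is fading out.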

The fact that only two loudspeakers are involved at a time causes a drop in power across the amplification system as a whole and an extreme polarization of the signal on one pair of loudspeakers. To compensate for this, a delay line has been introduced, with delay times variable from 1 to 30 ms depending on the size of the hall, which sends the delayed signal to all the loudspeakers at a suitable level. Because of the precedence effect [7], the directionality of the sound coming from the active pair of loudspeakers is preserved, but the total energy of the signal is given by the sum of the signal coming from all the loudspeakers.
With rapid movements, the abrupt amplitude variations between one loudspeaker and the next can become a source of disturbance; to reduce this to acceptable levels, a dedicated algorithm modifies the amplitude envelope of each loudspeaker, flattening the slope of the amplitude rise/fall curve and at the same time raising the mean threshold level, that is, the amount of signal that remains on all the loudspeakers. In practice, the faster the rotation, the smaller the amplitude variation between loudspeakers during the transition of the signal. The path of the source can also extend beyond the real space, since an algorithm determines a suitable proportion between the direct signal and the reverberated signal.
Figure 5 shows a 4-channel spatialization scheme.

FIGURE 5 - 4-channel spatialization scheme (signal, delay)

Execution modes

Control of the sound source movements can be handled in two operating modes: "Realtime" and "Playlist".
Realtime mode provides gestural control of most of the space-management parameters; Playlist mode makes it possible to define a list of spatialization events that follow one another either automatically or on a command given by the user.
- Realtime mode: in this mode the program provides real-time control, via MIDI devices of any kind and/or the computer's alphanumeric keyboard, of the following parameters: switching the spatialization line on and off, level, speed, direction of movement, and moving the sound source further away and/or closer. The assignment of external MIDI controllers to the various program parameters is fully user-configurable.
- Playlist mode: in this mode the sequence of space events is stored in an ASCII file using a symbolic language. The language comprises several types of instruction, including: moving the signal towards one (or more) loudspeakers while leaving unchanged the level of the loudspeakers from which the movement starts; moving the signal towards one (or more) loudspeakers with a crossfade from the loudspeakers from which the movement starts; circular rotation, clockwise or anticlockwise; random path; iterative complex path (that is, repeated until a new command); non-iterative complex path (that is, movement of the sound source from one point to another, or back to the same point, passing through one or more loudspeakers). The syntax used by the language is as follows:

comment, opcode par1 par2 ... parn;

The comment is what is displayed during execution and tells the user what the program is doing; the opcode defines the type of event to perform and is followed by the parameters specific to that instruction.

User interface

Particular attention has been paid to the user interface, with the aim of making system configuration and use of the program as simple as possible; all of the many windows can be called up with buttons and pull-down menus.
The main setup of the program, in which the parameters of the performance space (loudspeaker positions and hall dimensions) are set, is carried out graphically; while the system is in use, a series of windows makes it possible to monitor every aspect of the performance with both graphical and numerical displays.

Conclusions

In developing this work we tried to keep the program as independent as possible of equipment of any particular type or make; the modular structure of the project also makes it possible to add further patches that perform new functions without having to modify the essential lines of the software. The various subpatches dedicated to the different tasks are in fact linked to the program kernel autonomously. The Doppler effect during rapid movements of the sound source was not taken into account, because the resulting frequency variations, although they add considerable realism, become a disturbance on a purely musical level.
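The Playlist syntax given under the execution modes (a comment, a comma, an opcode, its parameters and a closing semicolon) is simple enough to parse in a few lines. The Python sketch below is only an illustration of that syntax: the paper does not list the actual opcode names, so the opcodes in the usage example ("rotate", "xfade") are hypothetical, and the sketch assumes that comments contain neither commas nor semicolons.

```python
from typing import List, NamedTuple


class Event(NamedTuple):
    comment: str       # text shown to the user during execution
    opcode: str        # type of spatialization event
    params: List[str]  # parameters specific to the instruction


def parse_playlist(text: str) -> List[Event]:
    """Split a playlist into events of the form 'comment, opcode p1 ... pn;'."""
    events = []
    for statement in text.split(";"):
        statement = statement.strip()
        if not statement:
            continue  # skip blank lines and the text after the last ';'
        comment, _, rest = statement.partition(",")
        fields = rest.split()
        if not fields:
            raise ValueError(f"missing opcode in {statement!r}")
        events.append(Event(comment.strip(), fields[0], fields[1:]))
    return events
```

For example, `parse_playlist("slow circle, rotate cw 4.0;")` yields a single event whose comment is "slow circle", whose opcode is "rotate" and whose parameters are "cw" and "4.0".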

References

[1] J. M. Chowning: "The Simulation of Moving Sound Sources", Journal of the Audio Engineering Society, 19: 2-6, 1971.
[2] M. Graziani: "Riverbero e Spazializzazione nel processore 4i", Quaderno LIMB 4, pp. 41-48, 1984.
[3] H. P. Haller: "Live-Elektronik", Teilton, Schriftenreihe der Heinrich Strobel Stiftung des Südwestfunks, pp. 41-46, Bärenreiter-Verlag, Kassel, 1980.
[4] N. Bernardini, P. Otto: "TRAILS: An Interactive System for Sound Location", Proceedings of the International Computer Music Conference 1989 Columbus, Computer Music Association, San Francisco CA, 1989.
[5] "A Real-Time MIDI Processor for Sound Spatialization", Operating Manual, Spatial Sound Inc., Fairfax, CA, 1990.
[6] "Sound Engineer", GeneralMusic S.p.A., S. Giovanni in Marignano (FO), 1993.
[7] B. C. J. Moore: "An Introduction to the Psychology of Hearing", Academic Press, New York, 1982.
A SYSTEM FOR REAL-TIME CONTROL OF
HUMAN MODELS ON STAGE
A.Camurri F.Giuffrida G.Vercelli and R.Zaccaria

DIST - Univ. di Genova


via Opera Pia 11a, I-16145 Genova, Italy
e-mail: sand@dist.unige.it

Abstract

Human models that interact with actors are often used in science fiction and fantasy movies, produced with sophisticated three-dimensional graphic tools. Using these techniques in a theatrical environment is a difficult task. In a theatrical performance it is essential to show a real interaction between the actor and the model, so the system must be able to react in real time. Such a system should be multimedia in the full sense: it should be able to generate and modify music in real time, influenced by the movements of the actors on stage and by the directions of both the choreographer and the composer. This paper describes a prototype of such a "theatrical machine" for the control of a theatrical (including musical) event.

System Architecture

A human model is animated by tracking the actor's movements, extracting the relevant points of the human figure (the positions of the joints of the human skeleton) and calculating the kinematic structure that moves the model. The real-time tracking of the actor's movements is performed by an advanced acquisition system named CosTel [7] [1]. CosTel (Space Coordinates by means of Linear Electrical Transducers) is an acquisition and processing system for three-dimensional kinematic data designed for use in biomechanics, neurology, robotics and sports medicine. The main feature of CosTel is a high-resolution recording capability which makes it possible to track the movement of several infrared spot lights simultaneously, in both space and time.

Figure 1: Structure of the acquisition system

We can record the spatial trajectory of each independent zone by placing the markers in selected zones of the human body. A special vision device formed by three anamorphic cameras generates three streams of one-dimensional data. A reconstruction algorithm extracts the spatial coordinates from these streams. The markers are infrared light-emitting diodes, fired sequentially and scanned simultaneously by the three anamorphic cameras, which allows the markers to be identified automatically. Marker data are transmitted via a special radio link to the host computer, in order to synchronize and merge the data streams coming from the cameras. The total absence of any cable connection between the marker driving unit and the CosTel main unit leaves the actor completely free in his movements.

Figure 2: Example of reconstruction

Of course, the actor must perform within a restricted area and, as far as possible, in front of the cameras, since every camera has to acquire all the markers.

Preprocessing and Extrapolation

Good real-time tracking requires that the system see each marker from all three cameras; this precondition is often not met because of the actor's movements during a performance.
When an actor moves on a stage, some markers are often not visible from all the cameras, so an algorithm is needed that can extrapolate the missing data from the previous measurements.
This module (PE) simply keeps the measurements in a queue and, when a new incomplete measurement arrives (with the data for some markers missing), extrapolates the missing data using a standard least-squares (minimum quadratic distance) approximation algorithm.
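The behaviour of the PE module can be illustrated with a small Python sketch. It is an assumption-laden illustration rather than the authors' code: the paper only says that a queue and a standard least-squares approximation are used, so the straight-line model fitted per coordinate, the queue depth, the decision to store extrapolated points in the queue and the marker names in the usage note are all hypothetical.

```python
from collections import deque
from typing import Deque, Dict, List, Optional, Tuple

Point = Tuple[float, float, float]


def fit_line(ts: List[float], vs: List[float]) -> Tuple[float, float]:
    """Least-squares line v = a + b * t through the samples; returns (a, b)."""
    n = len(ts)
    mean_t = sum(ts) / n
    mean_v = sum(vs) / n
    var_t = sum((t - mean_t) ** 2 for t in ts)
    if var_t == 0.0:
        return mean_v, 0.0  # a single sample: hold the last value
    slope = sum((t - mean_t) * (v - mean_v) for t, v in zip(ts, vs)) / var_t
    return mean_v - slope * mean_t, slope


class Extrapolator:
    """Keeps a short queue of measurements per marker and fills in gaps."""

    def __init__(self, depth: int = 5) -> None:
        self.depth = depth
        self.queues: Dict[str, Deque[Tuple[float, Point]]] = {}

    def push(self, t: float, frame: Dict[str, Optional[Point]]) -> Dict[str, Point]:
        """Store a frame taken at time t; None marks a marker that was not seen."""
        complete: Dict[str, Point] = {}
        for name, point in frame.items():
            queue = self.queues.setdefault(name, deque(maxlen=self.depth))
            if point is None:
                if not queue:
                    continue  # marker never seen: nothing to extrapolate from
                ts = [sample_t for sample_t, _ in queue]
                coords = []
                for axis in range(3):
                    a, b = fit_line(ts, [p[axis] for _, p in queue])
                    coords.append(a + b * t)
                point = (coords[0], coords[1], coords[2])
            queue.append((t, point))
            complete[name] = point
        return complete
```

A marker that moved along a straight line in the last few frames and then disappears is placed at the next point of that line; a marker that has never been seen is simply left out of the returned frame.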
3-D Reconstruction of the human figure

From the PE module we obtain a stream of frames (normally 15-20 frames per second). Each stream of frames represents the measurements calculated by CosTel and processed by the PE module, transformed into smooth trajectories of the markers. Starting from these trajectories, the complete human kinematic structure must be computed at each frame produced by the PE module.
The problem, then, is how to move from a rough geometric description (a cloud of independently moving 3-D points, see the central image of figure 2) to a structured kinematic description (constrained to the structure of the human body, see the right-hand image of figure 2). To reconstruct the human body it is necessary to:

• link the 3-D points to form the skeleton that represents the man
• compute the spatial-geometrical relations.

To perform these two steps, a representation of the body is needed: in our system the body structure is based on kinematic chains of geometrical frames, where each frame is a 4x4 homogeneous matrix describing the position and the orientation of the point. A 3x3 matrix represents the orientation of the target, and a vector represents the position.

Figure 3: Kinematic human structure

Each matrix represents the roto-translation between the frame considered and its reference frame (called the ancestor). The frames are linked in a chain. Movements, rotations and positions are relative to the ancestors and are not absolute. This is a traditional approach often used for the real-time control of robots [5].
As shown in figure 3, a certain number of frames are attached to every joint of the human body. One frame defines the position of the joint; the others (3 at most) represent the degrees of freedom (one for each degree of freedom). In this way we can simplify the
calculation of complex rotations by splitting them into partial, sequential rotations. The partial rotations are always computed and executed in the same sequence; otherwise the position obtained is not the one required. The rotational frames of a joint all share the same position (the position of the ancestor) and are linked in the sequence in which the rotations are computed and executed. The base frame of the human figure is MAN. This reference frame, attached at the pelvis position, defines the position of the human body on the stage. Pelvis-up and Pelvis-down are the first two frames derived from MAN. Pelvis-up has 3 degrees of freedom (x-y-z order); Pelvis-down is rigidly linked to l-ilium and r-ilium, the frames that are linked and used when only one leg is needed rather than the whole human structure. l-ilium is linked to l-hip, a joint that has two degrees of freedom, on the x axis and on the y axis. The complete body is built in this way. It is also possible to use only a part of it, if only a leg or an arm has to be animated. In that case the ancestor of the structure is no longer the MAN frame but the ancestor of the chain used: for example, the father of the left leg is l-ilium.
After modelling the human body as a tree of kinematic chains starting from the pelvis, each geometric frame has to be fitted to the position of the markers on the actor. Once the problem of the positions of the markers on the actor has been solved, we recompute the kinematic transformation. Starting from the root frame MAN, and considering the direction of the motion, we calculate the rototranslational matrices going in the two directions of the chain, as shown in figure 4.

Figure 4: Order of generation of matrices

The final reconstructed human kinematic structure is used to control the graphic model. The graphic model, and also the virtual world, are based on the Iris Inventor Toolkit, a software toolkit running on Silicon Graphics machines.

The Human Kinematic Algorithm

In this section we refer to the frames defined in figure 3. The Human Kinematic Algorithm builds up the kinematic structure
of the human joints [6] starting from the measurements extracted from the markers.
To explain the algorithm we must focus on three base functions:

o Znim - starting from three points A, B, C, identifies their plane and calculates the vector V normal to the plane, starting from A
o Xnim - starting from three points A, B, C, calculates the vector (B-A) and normalizes it
o Ynim - completes the orthonormal triple built with Xnim and Znim.

The algorithm is the following:
1. Init the frame MAN using the markers on the pelvis, the l-ilium and the r-ilium: it calculates with Znim the Z axis orthogonal to the plane identified by the pelvis, the r-ilium and the l-ilium, then calculates a second vector parallel to the Z axis of the environment, and from these it calculates the rototranslational matrix of the orthonormal triple.

2. Init the positions of Pelvis-up and Pelvis-down as the middle point between l-hip and r-hip at a known distance, then calculate the matrices as translations of MAN.

3. Init dorsum using the coordinates of the marker on the thorax, then calculate the matrix for dorsum using those coordinates and the frames just computed.

4. Using the markers on the thorax, left horn, right horn and head, calculate the position of the head, and then compute the matrix, keeping the Z axis in the same direction as that of MAN.

The system in a theatrical environment

The system we are developing allows the real-time tracking of a human figure onto a virtual replica. Starting from the current software prototype, we plan to use our virtual "puppet" to perform further transformations useful on stage, including the following:

o specular reconstruction
o from the kaos to the man
o modifying the shape of the body parts in real-time
o mime on stage, real world on the screen

Specular Reconstruction
By changing the regular tracking of the movement into a specular tracking (mirroring the left side with the right side), the actor can compete with his puppet.

From the Kaos to the Man
The animation software is organized so that the graphical objects can be mounted on and unmounted from the kinematic structure in real time. So we can assemble the puppet by building the different objects starting from the surfaces and bringing the objects
to the place where the puppet is going to be created, at a given speed. Since the surfaces of the objects are tracked by the kinematic structure even when they are not on the puppet, it is possible to manage a transformation from a situation of complete kaos to the image of the moving puppet; it is also possible to have explosions of the puppet, or of part of it, during the motion.

Modifying the Shape of the Body Parts in Real-Time
This is under implementation; it will be possible to transform any part of the body. At present it is only possible to change the colors, materials and dimensions of the body parts during the execution of an action.

Mime on Stage, Real World in the Screen
The actor is on an empty stage, but he/she can interact with a puppet that lives in a world full of objects that he can play with, move and pass through. We shall therefore integrate our system with devices such as gloves, goggles and other units used in virtual reality applications, to improve the interaction between the actor and the "virtual world" that the puppet lives in.

Music Generation

The data extracted from the CosTel sensing system and pre-processed by the reconstruction software are embedded in WinProcne/HARP [2] as analogical modules. Their outputs are also used to control computer music synthesis tasks.
From a knowledge representation viewpoint, WinProcne/HARP is grounded on a twofold formalism: analogical and symbolic. The low-level sound representation and the data on real-time performance, as well as the CosTel recognition algorithms, are managed by classes included in an object-oriented concurrent environment (the analogical level of representation). The analogical level is implemented as a set of actors [3]: each actor is hooked to an object in the symbolic level, which is the repository of high-level entities, scores, composition rules, high-level descriptions and definitions in general. The symbolic level of representation consists of a declarative symbolic environment, including a multiple-inheritance semantic network formalism derived from KL-ONE [4], extended with temporal primitives and with a typing mechanism.

References

[1] A. Cappozzo, "Three-dimensional analysis of human walking: Experimental methods and associated artifacts", Human Movement Science, 10:589-602, 1991.
[2] A. Camurri, C. Canepa, M. Frixione, and R. Zaccaria, "Harp: A system for intelligent composer's assistance", IEEE Computer, 24, July 1991.

[3] G. Agha, "ACTORS: A Model of Concurrent Computation in Distributed Systems", The MIT Press, Cambridge, MA, 1986.

[4] R.J. Brachman and J.G. Schmolze, "An overview of the KL-ONE knowledge representation system", Cognitive Science, 9:171-216, 1985.

[5] G. Vercelli, F. Giuffrida, P. Morasso and R. Zaccaria, "Hybrid architectures for intelligent robotic systems", Int. IEEE Workshop on Robot and Human Communication, Tokyo, Japan, 1992.

[6] P. Morasso, M. Solari, and G. Vercelli, "Software tools for analysis and representation of movements", Proc. Symposium on Biolocomotion, Formia, Italy, 1989.

[7] V. Macellari, "CoSTEL: a computer peripheral remote sensing device for 3-dimensional monitoring of human motion", Medical and Biological Engineering and Computing, 21:311-318, 1983.
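As an illustration of the kinematic part of the paper above, the three base functions (Znim, Xnim, Ynim) and the 4x4 homogeneous frames linked in a chain can be sketched in plain Python. This is one possible reading of the description, not the authors' implementation: the sign convention of the normal (it depends on the order of the three markers), the right-handed completion of the triple and all helper names are assumptions.

```python
import math
from typing import List, Sequence

Vec = List[float]
Matrix = List[List[float]]


def sub(a: Sequence[float], b: Sequence[float]) -> Vec:
    return [a[i] - b[i] for i in range(3)]


def cross(a: Sequence[float], b: Sequence[float]) -> Vec:
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def unit(v: Sequence[float]) -> Vec:
    norm = math.sqrt(sum(c * c for c in v))
    if norm == 0.0:
        raise ValueError("degenerate marker configuration")
    return [c / norm for c in v]


def xnim(a: Vec, b: Vec, c: Vec) -> Vec:
    """Normalized vector (B - A)."""
    return unit(sub(b, a))


def znim(a: Vec, b: Vec, c: Vec) -> Vec:
    """Unit normal of the plane through A, B and C."""
    return unit(cross(sub(b, a), sub(c, a)))


def ynim(a: Vec, b: Vec, c: Vec) -> Vec:
    """Third axis completing a right-handed orthonormal triple."""
    return cross(znim(a, b, c), xnim(a, b, c))


def frame(a: Vec, b: Vec, c: Vec) -> Matrix:
    """4x4 homogeneous matrix with origin A and the X, Y, Z axes as columns."""
    x, y, z = xnim(a, b, c), ynim(a, b, c), znim(a, b, c)
    return [[x[0], y[0], z[0], a[0]],
            [x[1], y[1], z[1], a[1]],
            [x[2], y[2], z[2], a[2]],
            [0.0, 0.0, 0.0, 1.0]]


def compose(ancestor: Matrix, local: Matrix) -> Matrix:
    """Absolute frame of a child, given its ancestor and its relative matrix."""
    return [[sum(ancestor[i][k] * local[k][j] for k in range(4))
             for j in range(4)] for i in range(4)]
```

Because every frame is stored relative to its ancestor, the absolute pose of a joint is obtained by calling `compose` repeatedly while walking down the chain from the root frame.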
THE USE OF 3-D AUDIO IN A SYNTHETIC
ENVIRONMENT: AN AURAL RENDERER FOR
A DISTRIBUTED VIRTUAL REALITY SYSTEM

Stephen Travis Pope and Lennart E. Fahlen

DSLab - Swedish Institute for Computer Science (SICS)


Isafjordsgatan 26, Kista, S-16428 Sweden
Fax: +46/8nSl-7230; E-mail: stp@sics.se, lef@sics.se

Abstract

We have investigated the addition of spatially-localized sound to an existing graphics-oriented synthetic environment (virtual reality system). To build "3-D audio" systems that are robust, listener-independent, real-time, multi-source, and able to give stable sound localization is beyond the current state of the art, even using expensive special-purpose hardware. The "auralizer" or "aural renderer" described here was built as a test-bed for experimenting with the known techniques for generating sound localization cues based on the geometrical models available in a synthetic 3-D world. This paper introduces the psychoacoustical background of sound localization, and then describes the design and usage of the DIVE auralizer. We close by evaluating the system's implementation and performance.

Introduction

Sound plays a very important role in the way humans communicate with each other and how they perceive their environment.

The SICS Distributed Interactive Virtual Environment (DIVE) [1] [2] is a virtual reality (VR) framework for building applications and user interfaces based on the metaphor of presence and interaction in 3-D space. It uses graphical renderers, called visualizers, to display objects in simulated worlds. As an extension of this, we developed a package for the generation of multiple audio signals that have spatial localization cues derived from the geometry of virtual worlds: the so-called "auralizer" [3].

The DIVE is a multi-user virtual reality system that is implemented within a loosely-coupled heterogeneous multiprocessing environment (i.e., a network of different UNIX workstations). Independent DIVE applications (called "AIs" after W. Gibson) run on nodes distributed within a local- or wide-area network, and update a shared (node-wise replicated) object database. This architecture enables the distribution of object animation and interaction processes for load-

sharing between the user input device handling and graphical rendering tasks.

The auralizer models sounds that take place in the simulated world using a "source/filter/listener" model of sound processing. It can also be used as a stand-alone system for 3-D rendering of complex interactive sound environments. The package allows DIVE AIs to use sound as another output medium, and allows researchers to experiment with the use of 3-D sound environments, allowing, for instance, new kinds of user feedback using different "aural perspectives" on objects as a way of heightening the degree of illusion.

There are several researchers in VR technology who are developing sophisticated models and hardware/software systems for listener-specific or generic real-time 3-D sound spatialization (e.g., [4] or [5]). Also, a number of expensive hardware systems aimed at high-end music studio and film post-production work have been announced recently by major manufacturers. Neither of these domains is the main purpose of our work on the DIVE auralizer; rather, we are interested in achieving a "pretty good" 3-D spatialization over headphones or stereo loudspeakers using known techniques and widely available hardware, with the added requirement that the system run in real time with multiple (on the order of ten or more) active sound sources. We want to provide a framework in which one can experiment with models of the perceptual cues involved in sound source localization, and test these in virtual worlds in combination with visual cues and 3-D interaction devices such as magnetic trackers.

This document discusses the design of the DIVE auralizer. We first outline the basis for spatialization of sounds in the hearing system and its implications for system design and performance. We focus on the "cheap" psychoacoustical models it uses (derived from the literature of recording engineering and electroacoustic music) for doing model-based sound spatialization. We then discuss the auralizer as a real-time, distributed DSP application. The work described in this report was done as part of the MultiG project, a Swedish national effort at developing very-high-speed wide area networks and distributed software tools and applications.

Synthesis of Localization Cues

There exists a significant literature in psychoacoustics that is devoted to understanding how humans determine the characteristics of spaces using aural cues and how we localize sound sources in a 3-D space. As in several other areas of perception psychology, there are several simple, well-understood mechanisms at play, and other, quite complex and still poorly-under-

stood, ones that are equally important. It is obvious that it is central to our ability to localize sound sources that we have two ears, that they are rather directional in their reception pattern, are separated by a small (and constant) distance, and face in different directions. Deeper reflection leads to the realization that our ears are asymmetrical in both the horizontal and vertical (cutting through the head) planes, and that there are other cues (e.g., Doppler shift, or inter-ear delays) and higher-level functions in the brain (see, e.g., [6] or [7]) that we use to localize sound sources.

The basic model of sound localization uses direction and distance to characterize the relative geometrical orientation of the sound source and the listener, and provides cues for these, such as relative and absolute amplitude, spectrum envelope and inter-ear delays. Simple reverberation as a cue for the characteristics of the space is also widely used, whereby reverberation time can be keyed to the volume of the space, and the direct-to-reverberated signal ratio is mapped onto the source-listener distance. The more complex relationship between the spectrum of the sound, its location, and the characteristics of the space is still a topic of research.

Looking at the basic perceptual features of a sound, we can relate each of them to one or more aspects of a spatial model as shown in Table 1.

There are several approaches to simulating the cues we use to localize sounds. Some of them are based on direct interpretations and implementations of the physics of sound propagation and the anatomy of our ears (so-called sound ray-tracing techniques that use the "pinna transform," also called the "head-related transfer function" or HRTF), and others are based on simple psychoacoustical insights. We will present an example of the latter kind of model, the type that we are trying to explore and improve upon in the DIVE auralizer.

- Amplitude (loudness): distance to source
- Position (in 3-D space): direction to source
- Spectrum (many dimensions): distance, direction, space
- Reverberation (ratio/characteristic): distance, space, environment
- Inter-aural delay time: direction to source (Haas effect)

Table 1: Possible mappings between sound features and spatial cues

Figure 1 below shows the geometrical model and the basic mapping equations used in current psychoacoustical models of localization. The distances (shown as $a$ and $b$ in the figure) between the source and the listener's two ears, which are assumed to be separated by the fixed distance $e$,

determine the relative amplitude of the signal that the listener hears. The left-right stereo volume balance is related to the ratio of the difference $a - b$ to $e$ (shown by the angle $\theta$ in the horizontal [x/z] plane). The amplitude scale factor and direct-to-reverberated signal ratio are proportional to the square of the average distance $d = (a + b)/2$. The absorption of high frequencies by the air between the source and the listener can be modeled as a low-pass filter whose slope (Q) and cutoff frequency vary in proportion to $d^2$.

There are several ways to model the spectral changes in a signal according to its direction in the horizontal plane. Making sound sources behind the listener "less bright" than those in front of, or to the side of, the listener is perhaps the most trivial model. This can be simulated simply by a low-pass filter whose Q and cutoff frequency vary (according to some non-linear look-up table) with $\theta$. The height of a sound source, measured as angle $\psi$ in the x/y plane, affects the inter-aural time delay, the sound's spectrum (via the HRTF and possibly the listener's shoulder), and the characteristics of the reverberation. The spectral effects are complex and still poorly understood, but are simulated in our "cheap" model by two filters: a fairly broad band-reject or "notch" filter in the upper speech range of frequencies (2-4 kHz) combined with a mild boost in the 7 kHz region.

[Figure 1a: Auralizer spatial model. Geometry of the source P and the listener, seen from the top (x/z plane) and seen from the back (x/y plane).]

The change in the reverberation signal can be approximated by the addition of more early reflections to the signal, simulating echoes off the walls and ceiling of the simulated environment.

This simple geometrical model has all the necessary information to map geometrical properties onto more sophisticated or higher-level cues, such as more complex filters for the front-back and height cues, and more location-dependent reverberation techniques.

There are many other details of localization, such as what cues we use to differentiate between sounds that have the same inter-aural time delay (those that lie on the so-called "cones of confusion" that surround each ear and are symmetrical with respect to the medial plane, i.e., the y/z plane passing through the listener's head between the ears), and how we determine the difference

between the characteristics of the space and of localized sound sources based on the nature of the signal's reverberation. These are, however, beyond the scope of our present model. The reader is referred to [9] or [10] for more detailed discussions.

Spatial Cues
Distance: $d = (a + b)/2$
Loudness $\propto d^2$
L/R ratio $\propto (b - a)/e$
Low-pass filter $\propto d^2$
Direct/reverb ratio $\propto d^2$
Spatial low-pass filter ratio $\propto \theta$
Spatial band-reject filter $\propto \psi$
Inter-aural time delay $\propto (a - b)/e$
Initial/decay reverb ratio $\propto \psi$
(many more possible)

Figure 1b: Spatial localization cues

Building an Auralizer for DIVE

We will now describe the environment (the SICS MultiG DIVE) in enough detail to provide the background for our introduction of the software architecture of the DIVE auralizer. For a more in-depth discussion, see [1] and [2].

The DIVE Architecture

Most VR systems are mainly concerned with a 3-D rendering of a simulated virtual world, and with user navigation among, and interaction with, objects in this world [8]. In these systems, some "database" of objects exists, along with some data about the "user's" position in the world (the position and perspective from which the world will be rendered). This data is used by the rendering component, which generates a visual display. The user interaction, database management, and rendering systems can be relatively independent of each other (as they are in the DIVE), with the interface asynchronously driven by changes in the user's position and the state of the objects in the world each time the renderer generates a new frame of video information for display. The renderer can thus be said to "poll" the world and user state for each frame it displays.

The basic unit of activity in the DIVE is the AI, or application: a UNIX process running on some node in the network that broadcasts change messages via the system's event distribution mechanism. An AI might, for example, represent a clock that updates itself every second, distributing messages throughout the network as to its new visual appearance. Visualization AIs that are in the same world as the clock would receive this message and update their renderings of it (in case it is in the user's field of vision).

The DIVE architecture allows multiple "virtual worlds" (which may or may not be connected by gateways) to be active on the same network, with one or more users in each of them at any one time. This distributed, multi-world, multi-user, non-server (peer-to-peer) architecture is perhaps the most interesting feature of DIVE. Figure 2 shows

a schematic overview of the architecture. Visualization and interaction nodes manage the graphical output and gestural input for the user. The visualization nodes can potentially support stereo-optic output systems such as head-mounted displays.

[Figure 2: DIVE architecture. VIN = Visualization/Interaction Node; CN = Computation Node; AN = Auralization Node]

Computation nodes can execute AI processes such as ticking clocks, or "supporting" processes such as a collision detector or gravity simulator, which update the state of other objects in the world database without introducing any of their own. The degenerate case is that all of these processes are running on the same workstation, though we use between three and six in common practice.

The Auralizer node shown in the lower-right part of Figure 2 is the component that we have added in this project. Like a visualizer, it will "render" the state of the world object database, but using stereophonic audio instead of video as its output medium.

The DIVE Auralizer

AI programs "register" sounds with the auralizer by sending out messages that include a unique identifier, a sound file name, a sound index, and a "perspective" (which can be thought of as a sample name and channel and key numbers in MIDI synthesizer terminology). The AI can later play this sound by sending out a message that includes the id and indices, and the relative amplitude of the sound.

The auralizer runs as a separate process, probably on a network node with high-quality stereo audio output hardware. It maintains a table of the sounds that have been registered by AIs in the current world and responds to sound output messages by playing the chosen sounds, possibly mixing them with other active sounds and spatializing the result to provide for an aural model of the virtual space. To trigger a sound, the auralizer needs to know which sound is being requested, the position of the sound source (possibly in motion), the position and orientation of the listener, and the characteristics of the virtual space. It will use this information to select and process the stored sample. There can be several types or levels of auralizers depending on the amount of computation power that can be dedicated to the auralization (i.e., if it is running on the same workstation as the visual ren-

derer, or if special DSP hardware is used).

The architecture of the auralizer was debated for some time, and several approaches were prototyped. The final design was factored into several components: (1) the interface to the network distribution mechanism; (2) sample structure storage and management; (3) the geometry functions for determining the relative locations of sound sources and mapping the geometry onto the parameters of the mixer, filter and reverberator; (4) the sound mixer, filter and reverberator; and (5) the real-time output handlers for the DACs. The rough architecture of the APE and its interconnection with the rest of DIVE is illustrated in Figure 3. The system runs as three "threads" (lightweight processes): the event, the geometry, and output loops.

[Figure 3: Auralizer Architecture. Legible labels: distr. event; register/trigger sounds; mixer/reverberator/filter; VR object data; auralizer.]

When an AI registers a sound, the sample manager stores the sample data of the sound, its duration, rate, format, and other data. The second component is the geometry engine. Each time a client plays a sound, it is necessary to get the coordinates and orientation of the user (defined as the user's "ear" objects), and of the AI in order to determine the relative position and amplitude of the source. This data is updated while the sound is playing by the geometry thread, as both the source and the listener may be in motion.

The sound mixer functions can manage multiple active clients at different locations, and perform the summation of up to 32 monophonic sources into a stereo output stream. The reverberation model adds a small amount of reverberation (mixed according to the geometry engine's data) to the mixed sound array. The DIVE's distribution mechanism communicates between AIs and visualizers. It registers handlers for the collection of callback functions that AIs can call with sound output or other auralizer-related messages. Finally, the real-time drivers and interrupt handlers for the sound output via the DACs are implemented so that different DAC hardware can be substituted with relative ease.
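
As a rough illustration of what the geometry functions compute, the following Python sketch evaluates a few of the relations listed in Figure 1b for a single source. It is not the auralizer's code (which is written in C and not reproduced in this paper): the names `Cues`, `spatial_cues` and `stereo_gains`, the ear spacing of 0.18 m, the reading of the distance cue as a clamped 1/d^2 attenuation, and the linear left/right pan are all assumptions made for the example.

```python
import math
from dataclasses import dataclass

EAR_SPACING = 0.18  # assumed distance e between the ears, in metres


@dataclass
class Cues:
    distance: float  # d = (a + b) / 2
    gain: float      # amplitude scale factor (assumed 1 / d^2, clamped)
    balance: float   # (b - a) / e, always within [-1, 1]
    itd: float       # (a - b) / e, unscaled inter-aural delay cue


def spatial_cues(source, left_ear, right_ear, min_distance=1.0):
    """Map a source position onto distance, gain, balance and delay cues.

    Positions are (x, y, z) tuples. Here a is the distance from the
    source to the left ear and b the distance to the right ear; this
    labelling is our own choice.
    """
    a = math.dist(source, left_ear)
    b = math.dist(source, right_ear)
    e = math.dist(left_ear, right_ear)
    d = (a + b) / 2.0
    # Assumption: attenuate with the square of the distance, clamped so
    # that very close sources do not produce unbounded gain.
    gain = 1.0 / max(d, min_distance) ** 2
    balance = (b - a) / e  # positive when the source is nearer the left ear
    itd = (a - b) / e
    return Cues(d, gain, balance, itd)


def stereo_gains(cues):
    """Turn the balance cue into (left, right) gains with a linear pan."""
    left = cues.gain * (1.0 + cues.balance) / 2.0
    right = cues.gain * (1.0 - cues.balance) / 2.0
    return left, right
```

With the ears placed at x = -0.09 and x = +0.09, a source two metres to the listener's left gives a balance close to 1 (all signal in the left channel), while a source straight ahead gives a balance of 0 and equal gains. A geometry thread would re-evaluate such a function while the sound plays, since both source and listener may move.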

Applications

A number of application areas have been targeted for our testing and further development of the DIVE auralizer. The aim in general-purpose VR-based user interfaces is to increase the naturalness and reduce the cognitive load of the interface. The initial applications were various ticking clocks and bouncing balls. These were used to debug both the system and its initial spatial models. More complex applications to date include an integrated system using the ANIMA choreography system [11] together with DIVE visualizers and auralizers for dance.

The system will also be used in teleconferencing systems to give cues about activities that go on "off camera", such as participants entering and leaving the conference. We hope to be able to build more fully featured environments for artistic-aesthetic expression, i.e., sound sculptures, music compositions, virtual instruments and interactive experiences such as films and games. A related project aims to make it possible for sight-impaired persons to participate in activities in synthetic environments.

Evaluation

The DIVE auralizer was developed through several phases during the summer of 1992. The initial platform was a Sun SPARCstation-2 with Ariel Corp. digital-to-analog converters. It was tested in a DIVE network running multiple sound-generating AIs on various nodes. We implemented the spatial cues of loudness, inter-aural balance, directional filtering in the horizontal plane, and simple reverberation (fixed reverberator parameters with mixing of direct and reverberated signals). The system was found to support sustained real-time output with between five and ten AIs making noises and to provide a "reasonably good" spatial model, with the expected problems related to the "cones of confusion" and missing height cues. The system was ported to the Silicon Graphics Indigo and Sun SPARCstation-10 platforms in April of 1993, where it ran significantly faster and needed no additional hardware for sound output (on the Indigo).

The first auralizer that was built mixed 16-bit 8 kHz monophonic sound samples into a stereo aural image. The current generation runs at 16 kHz and incorporates better stereo reverberation and inter-aural time delay. More sophisticated filters (modeling the HRTF more closely) and dynamic reverberation algorithms are planned.

We have also avoided the issue of AIs that generate their sounds in real time by requiring them to pre-register sound samples. We do this because we do not want to have to address the issues of network bandwidth with multiple real-time sound sample streams, or of abstract timbre description

languages, in the present system, which is intended for experimenting with localization models. The experimental ANIMA system uses disk-based samples that may be too large to fit in memory, but still cannot handle real-time sample streams passed over the network.

Due to delays in the network distribution system, and the relatively slow (10 Hz) frame rate of visualizers, exact synchronization of visual and aural events is difficult. We are currently investigating ways to improve this.

It remains to be seen just how good we can make our spatial models and maintain real-time performance on "reasonable" hardware. The auralizer is written entirely in C and does not use the DSP coprocessors of the platforms it runs on. More complex reverberators and spatial filters can be expected to "push the envelope" of this restriction.

Future Work and Conclusion

As mentioned above, because the auralizer is being built in the context of a project whose goals are understanding distributed programming on very-high-speed networks, and the development of new techniques and tools for supporting this, it was not our intention to place a great deal of emphasis on the exact quality of the spatial model. The auralizer was designed for maximal "pluggability" so that more sophisticated modules can be substituted after this initial implementation. Another design criterion was to have loose coupling between the auralizer's components, so that they can themselves be distributed over a network in various fashions. Examples of the "pluggability" are the ports of the output stage mentioned above, moving the sample storage to an object-oriented database management system, and running the mixing, reverberation and spatial filtering components on a special distributed real-time signal processing framework.

We hope to build better geometry subsystems, fancier spatial models, and distributed mixing components in the future while retaining the basic auralizer design. We also think it is important to build, as soon as possible, non-trivial applications and have users experiment with them in order for us to gain experience both from the "systems" standpoint and from a "user interface" standpoint.
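
The "pluggability" goal described above can be pictured as a chain of interchangeable stages behind one common interface. The Python sketch below is purely illustrative: the auralizer is written in C, and the names `Frame`, `Stage`, `mix`, `scale` and `run_chain` are hypothetical and do not come from it. Each stage is a callable on a block of stereo frames, so a simple stage can later be swapped for a more sophisticated one without changing the rest of the chain.

```python
from typing import Callable, List, Sequence, Tuple

Frame = Tuple[float, float]                  # one (left, right) sample pair
Stage = Callable[[List[Frame]], List[Frame]]


def mix(sources: Sequence[Sequence[float]],
        gains: Sequence[Tuple[float, float]]) -> List[Frame]:
    """Sum monophonic sources into one stereo block.

    sources[i] is a block of mono samples; gains[i] is its
    (left, right) gain pair, e.g. as supplied by a geometry stage.
    """
    length = max((len(s) for s in sources), default=0)
    out = [(0.0, 0.0)] * length
    for samples, (gl, gr) in zip(sources, gains):
        for n, x in enumerate(samples):
            left, right = out[n]
            out[n] = (left + gl * x, right + gr * x)
    return out


def scale(factor: float) -> Stage:
    """A trivial stage that stands in for a filter or reverberator."""
    def stage(block: List[Frame]) -> List[Frame]:
        return [(l * factor, r * factor) for l, r in block]
    return stage


def run_chain(block: List[Frame], stages: Sequence[Stage]) -> List[Frame]:
    """Apply each pluggable stage in order and return the result."""
    for stage in stages:
        block = stage(block)
    return block
```

A caller would build the block with `mix(...)` and pass it through `run_chain(block, [scale(0.5)])`; replacing `scale` with a reverberator or spatial filter of the same shape leaves the mixer and the output code untouched.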

References

[1] L. E. Fahlen: "The MultiG Telepresence System", in Proceedings of the Third MultiG Workshop. Stockholm: Royal Institute of Technology, 1991.

[2] C. Carlsson, O. Hagsand: "The Architecture of the MultiG Distributed Interactive Virtual Environment", in Proceedings of the Fifth MultiG Workshop. Stockholm: Royal Institute of Technology, 1992.

[3] S. T. Pope, L. E. Fahlen: "Building Sound into a Virtual Environment", in Proceedings of the Fifth MultiG Workshop. Stockholm: Royal Institute of Technology, 1992.

[4] D. R. Begault: "Challenges to the Successful Implementation of 3-D Sound", Journal of Audio Engineering Society 39(2): 864-870, 1991.

[5] E. M. Wenzel: "Localization in Virtual Acoustic Displays", Presence: Telepresence and Virtual Environments 1(1): 80-107, 1992.

[6] J. Blauert: "Spatial Hearing", MIT Press, 1983.

[7] Durlach, et al.: "On the Externalization of Auditory Images", Presence: Telepresence and Virtual Environments 1(2): 251-257, 1992.

[8] S. Helsel, J. Roth, eds.: "Virtual Reality: Theory, Practice, and Promise", Addison-Wesley, 1991.

[9] J. Chowning: "The Simulation of Moving Sound Sources", Journal of Audio Engineering Society 19: 2-6, 1971.

[10] F. R. Moore: "Spatialization of Sounds over Loudspeakers", in M. V. Mathews and J. R. Pierce, eds., Current Directions in Computer Music Research, MIT Press: 65-87, 1989.

[11] T. Ungvary, S. Waters, P. Rajka: "Nuntius: A Computer System for the Interactive Composition and Analysis of Music and Dance", Leonardo, 25(1): 59-68, 1992.

Chapter 9

STUDIO REPORT

CENTER FOR ART AND
MEDIA TECHNOLOGY KARLSRUHE
THE INSTITUTE FOR MUSIC AND ACOUSTICS
P. Dutilleux
Ritterstr. 42
D-76137 Karlsruhe, Germany
Tel: +49 (0) 721-9340 300
Fax: +49 (0) 721-934039
E-mail: music@zkm.de

The Center for Art and Media Technology in Karlsruhe is dedicated to art and its relationship to new media. The center is comprised of the Museum for Contemporary Art, the Media Museum, the Institute for Image Media and the Institute for Music and Acoustics. Though the center is in the process of building up its art collection, it already offers studio space. The center will be fully operational by the end of 1997, when the new building is completed.

Aside from the collection and presentation of art works, the center places its main emphasis on the production of artistic works in the various institutes. Though the main tool for artistic production is the computer, digital technology does not necessarily have to be part of all works.

The Institute for Music and Acoustics

The Institute for Music and Acoustics is open for artistic and scientific work in the general areas of:

- The production of musical works (tape, live-electronics, sound installations with or without other media);
- Open processes for learning and discussion (reflection of esthetic conditions);
- The development of computer music environments and systems;
- Scientific research (psychoacoustics, basic musical research).

Multimedia projects are also possible through cooperation with the Institute for Image Media. The two institutes are located in different buildings until the opening of the new center. The Institute for Image Media is situated in a former storage building and offers shop space and, within limits, a performance space.

Under the direction of Jeffrey Shaw, the Image Institute focuses its efforts on computer animation, the construction of interactive installations and architectural simulations. It also offers a video postproduction studio.

Facilities and Equipment of the Institute for Music

Until the new building is opened, the Institute for Music and Acoustics occupies two stories of an apartment building. Space is available for work studios (no recording studio), offices, seminars, a small electronics shop and a MIDI studio. Room for scientific research is limited. A disk-based sound editing system (Studer Dyaxis) is available for general editing and postproduction purposes, including radio play productions, final mixes of material generated with MIDI equipment or NeXT computers, and for audio postproduction of video. Individual CDs can be produced with the Sonic Solution CD Printer.

When the new building opens, the studios of the music institute will include a big control room, one large recording studio (250 square meters), four smaller studios for musical work and scientific set-ups, five rehearsal studios for composers and instrumentalists, as well as offices for guests.

Activities and Projects

The Institute for Music and Acoustics contributes to ongoing activities of the center, such as competitions and awards, symposia and exhibitions, concerts, performances and the MultiMediale, a biannual festival. The music group also offers regular workshops for composers and musicians.

The Open Music System

The Institute for Music and Acoustics is emphasizing the development of an open computer music environment for composition and sound synthesis. This environment is centered around Common Lisp Music by Bill Schottstaedt for sound synthesis and processing and Common Music by Rick Taube for score generation and algorithmic composition. This environment is being developed as a joint venture between ZKM and the Computer Music Center at Stanford (CCRMA). The software is optimized to run on the NeXT computers used at Stanford and ZKM, but is basically machine independent.

Live-Electronics

With the ISPW (IRCAM Musical Work Station) and the NeXT Computer, work on live-electronics is emphasized as well. Composers and performers get support for developing computer instruments that can be played either during the composition process or on stage. We aim at enhancing the ability of the computer to behave musically (score following, user-friendly filters).

Bali Project

In collaboration with musicologists, we are developing a system for the analysis of Balinese music (acquisition of gender playing).

IDEAMA

An important project at the Institute for Music and Acoustics and the Mediathek of the ZKM is the establishment of an archive for electro-acoustic music (IDEAMA). This international archive will digitally store electronic music from its very beginnings, along with documenting materials. IDEAMA is set up in cooperation with CCRMA, Stanford University, and with the help of internationally renowned experts.

Visitation

Our studio space is currently able to support three to four guest artists working in parallel. This number will increase drastically in the new building. Working visits are usually between two weeks and three months; longer periods for learning and working can be supported with stipends. The center offers two Siemens stipends each year, one for music and one for image media. In addition, we are able to finance individual projects with grants. Third party funds raised by the artists are also welcome, especially if a proposed project needs extensive financial support for special instruments or performances. Students in natural sciences can get involved in art-related technical projects during training periods (typically six months).

The Work Environment

The working procedure in our studios is not predetermined. We expect visiting musicians and composers to be willing to work with the machines directly. There are no technicians who will translate or implement artistic descriptions. Our staff will assist with technical introductions and supply assistance when problems arise. Collaborative working arrangements are possible: e.g., a composer and a software specialist may apply as a team to realize a project at the center, or people with different qualifications may join together to work on a project.

The Staff of the Institute for Music and Acoustics

The Institute currently employs nine people, six full time and three part time. The director of the Institute is Johannes Goebel. The staff consists of Pierre Dutilleux (signal processing and acoustics), Rick Taube (software development and workshops), Heike Staff (project selection and concerts), Caroline Mossner (secretariat), Frank Schweizer and Gerhard Wolfstieg (introduction to the studios and project accompaniment), and Sukandar Kartadinata (hardware maintenance); Thomas Gerwin (Mediathek) is responsible for the IDEAMA.

SOME OBSERVATIONS ABOUT S.V.E.M.
ACTIVITIES AT THE CENTRAL
ENGINEERING LIBRARY OF GENOA
UNIVERSITY
Leopoldo Gamberini - Stefano Mosca
c/o Biblioteca Centrale Facoltà di Ingegneria Univ. Genova
Via Montallegro 1
16145 Genova (ITALY)
Tel. +39 10 3532545 - Fax +39 10 318709

From a library-science and publishing point of view, a computerized system can give any musical work better graphics than handwriting, as well as many opportunities for analysis and musicology.

Through a collaboration between Prof. L. Gamberini (teacher of History of Music at the Faculty of Literature and Philosophy), the Central Engineering Library of Genoa University and S. Mosca, a research program related to the use of personal computers has been developed; its aim is musical creation and writing.

The Central Library, already active in using computers for its activities, has set up S.V.E.M., the System for Music desktop publishing, for teachers and students.

Biblioteca Centrale Facolta di Ingegneria

Via Monlallegro 5/4 16145 Genova - Tel. 0101308616

The system is structured as below:

- 1 APPLE MACINTOSH II ci
- 1 PORTRAIT DISPLAY (A4)
- 1 APPLE LASER WRITER IIf
- 1 APPLE SCANNER
- 1 APPLE CD Rom
- 1 DYNAFILE DRIVE for 5.25", 1.2 MB
- 1 Syquest 44 MB drive
- 1 PROTEUS/1 E-mu Systems
- 1 PROTEUS/2 E-mu Systems
- 1 Digital Sound Processor YAMAHA SPX50D
- 1 MIDI YAMAHA CLAVINOVA CLP 260
- 2 APPLE MIDI INTERFACE
- 1 Amplification and Mixing system

Professor Gamberini, mostly concerned with the musical aspects of the matter, offered his work "Cristoforo Colombo 12/10/1492" and his experience as composer, musicologist, and conductor.

The work is a stage cantata for baritone, chorus and orchestra, proposed by Genoa University on the occasion of the Columbus Celebration of the discovery of America, and already performed with good success during the symphonic season of the City Theater in Genoa. The text is taken from the "Giornale di bordo" by C. Columbus and was prepared by L. Gamberini and C. Cormagi.

Considering the capabilities of S.V.E.M. and its potential users, the first problem to solve was to set up a specific configuration for handling, without difficulty, a score as large as the one examined.

The necessary requirements are:
1) Video layout of the complete orchestral score (26 staves in this case), with as complete a codification as possible of the Western music writing system.
2) A good user interface for writing and reading the score in a simple way, also for users who are not "computer competent".
3) The possibility of making various modifications to the edited score, to satisfy all needs for personalizing the work. This is not negligible, considering the great evolution of musical notation from the past to contemporary experience.
4) A good final product, using high-resolution laser printers.
5) Automatic extraction of single parts from the complete orchestral score.

The score can also be performed by electronic musical instruments connected to the computer via MIDI. Sound generation is carried out by multitimbral modules using traditional synthesis methods or multisamples of real instruments.

In this type of realization it is necessary to make some important points clear:
1) We do not want to attach any "absolute artistic value" to the electronic performance of the piece. This realization is in no way meant to replace traditional performers or conductors. A computer cannot (at this level) emulate a traditional performance without undermining the final result, even if there is a lot of room for correction. A computer has no human feeling, so emotional presence during the performance is missing.
2) An electronic performance cannot realize 100% of the score. "Cristoforo Colombo 12/10/1492" was not written as an electronic composition, so the baritone and chorus parts are not performed as sung text but as melodic vowels by digital instruments.
3) The electronic performance has a principally educational purpose for music students; they can listen to and quickly modify orchestral exercises.


The whole orchestral score has been rewritten with "Finale 2.6" (Coda Music Software). From this edition a graphic edition and an executable version in Standard MIDI File form were derived. This S.M.F. was subsequently improved from the musical point of view with "Vision 1.4" (Opcode Systems Inc.), obtaining a large document (about 500 Kb on 26 MIDI channels, for a duration of 42 minutes).

These kinds of experiments have already been illustrated in some demonstrations. During the Concert Lessons of History of Music organized by the Literature and Philosophy Faculty of Genoa in May 1992, the capabilities offered by the computerized SVEM system were demonstrated by performing pieces taken from the same opera. On the occasion of the inauguration of the Academic Year 1992, a concert-conference was held in the "Aula Magna" of Genoa University. The mass media were very interested in the use of the personal computer in music and promoted a constructive debate between technicians and musicians. Afterwards, some performances of the complete opera (as a symphonic poem) were given on the occasion of the recent Columbus Celebration, at the Scientific Community Stand inside the International EXPO of Genoa in July 1992.

For the composer, the computer becomes a good instrument for writing and publishing, and a simulator of sound models. These possibilities, which a few years ago belonged only to phonology studios and informatics laboratories, are now available at low cost thanks to the improvement of personal computer capabilities. So a large number of musicians, professional and amateur, have become interested in electronic equipment because of the simplification of graphical software interfaces. Even traditional musicians without computing skills can simply use the computer.

It is perhaps in this context that the precise technical competences of the Genoa Engineering Faculty (already working on multimedia images and digital sound) are complementary to artistic interests, realizing an interdisciplinarity seldom achieved between branches of learning as different as music and informatics.

The "LIBRARY" structure, as well as being a place for the conservation and consultation of books and magazines, tends to become a provider of library services in a broad sense, and once again a creative centre of a cultural, technical and artistic world, as it has been from time immemorial. The activation of new computer services such as book list management and bibliographic databases (on CD-ROM and on-line consultation) is an example of this tendency. Another example is the recent inauguration of a new "Informatic Classroom" with a Macintosh network freely available to University students.

The SVEM project, on the other hand, is evolving in a multimedia direction; in fact, we can have graphic reproductions of some original manuscripts kept at the Conservatorio N. Paganini in Genoa, and hypertext development in collaboration with CNR.

S.V.E.M. System Configuration

- Laser Printer, 300/600 dpi
- Syquest 44 MB
- Apple CD-ROM
- Scanner, 300 dpi
- MIDI Interface / Patchbay
- E-mu Proteus/1
- E-mu Proteus/2
- Yamaha SPX 50D
- Mixer
- Amplification
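
As a rough, illustrative check of the figures quoted for the score (a Standard MIDI File of about 500 Kb, 26 MIDI channels, 42 minutes), the following Python sketch estimates the average data rate and the minimum number of MIDI ports. It assumes that "Kb" means kilobytes of 1024 bytes, and it relies on the fact that a single MIDI port carries 16 channels. The function names are hypothetical and are not part of the S.V.E.M. software.

```python
import math

CHANNELS_PER_MIDI_PORT = 16  # fixed by the MIDI 1.0 specification


def average_bytes_per_second(size_kb: float, minutes: float) -> float:
    """Average data rate of a file, assuming 1 Kb = 1024 bytes."""
    return size_kb * 1024 / (minutes * 60)


def ports_needed(channels: int) -> int:
    """Minimum number of MIDI ports able to carry the given channel count."""
    return math.ceil(channels / CHANNELS_PER_MIDI_PORT)


if __name__ == "__main__":
    rate = average_bytes_per_second(500, 42)
    print(f"average rate: {rate:.0f} bytes/s")
    print(f"MIDI ports needed for 26 channels: {ports_needed(26)}")
```

Under these assumptions the 26 channels cannot fit on a single port, which is consistent with the two MIDI interfaces listed in the system configuration.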

THE ACTIVITY OF THE ASSOCIATION
RICERCARE AND ITS STUDIO FOR
MUSICAL AND ARTISTIC RESEARCH
Lucio Garau, Giorgio Tedde

Associazione RICERCARE
Via Ninasuni, 48
I-09048 Sinnai (CA)
tel-fax 070/765967

Abstract

"Ricercare" is a cultural association founded to offer a technical and cultural basis for the production, circulation, and enjoyment of contemporary art and music in Sardinia. The initial fundamental goal of the association was to maintain the values of traditional culture in the contemporary artistic language. To fulfil this goal we proceeded in several directions: workshops, concerts, and commissions for musical research and compositions.

After its first year of activity, the association has now focused its goals. To support this activity, a laboratory has also been created for musical and artistic research. The lab is equipped with state-of-the-art machinery for digital sound and image processing, and for the editing and pre-printing of musical scores. The laboratory is currently open only to members of the association; however, it will soon be moved to a new centre where it will be possible to host visiting researchers.

The activity of the association over the last year is as follows:

* Research stages and workshops.
* Research commissions for:
  * analysis of contemporary music, such as mainstream, popular, and traditional Sardinian music;
  * analysis of problems related to musical listening;
  * composition;
  * recordings of musical and acoustic material.
* Collaborations with other European musical research associations.
* Shows and concerts to support and promote its work.

The Ricercare association was founded to offer, through its research activity, theoretical and cultural support for the production, circulation and enjoyment of contemporary music in Sardinia. After one year of activity the association has been able to focus its objectives and thus to create a studio for musical and artistic research in general.

Having started with the aim of recovering the values of traditional culture by means of the contemporary artistic language, it has developed its activity both through laboratory-seminars, concerts, and research and composition commissions, and by setting up facilities for the digital processing of sounds and images and for the editing and printing of scores.

Several of the experiences carried out are directly connected to the development of computer music, such as the laboratory-seminar on score editing with the computer, or the production of computer music for concert and for dance theatre.

The activity of the studio is for now reserved to members, but a move to premises suitable for hosting residential collaborations with external scholars and researchers is imminent.

The activity of this association has centred on the organization of:

* Training stages and study-research seminars designed to involve the participants in different phases, some devoted to simple information and the acquisition of new techniques, others devoted to elaborating and reflecting on them.
* Research commissions for:
  * analysis of works of contemporary art music, popular music and the ethnic tradition of Sardinia;
  * studies on the problems of reception, with particular reference to the circulation of contemporary music among the public;
  * composition of musical works connected with the research work mentioned above;
  * recordings of various acoustic and musical materials and study of the related techniques;
  * organization of study seminars in order to circulate and publish the results of the commissioned research and to integrate them with contributions from scholars and researchers, in collaboration with other cultural bodies.
* Inclusion of the association in an inter-association programme with some important European research centres active in the field.
* Events to promote and publicize the activity carried out, by means of concert-seminars and multimedia installations.

In 1991 a seminar was organized on the techniques of music writing and publishing with the computer. The basic notions of the Macintosh system were presented, followed by the features of the program Finale. In collaboration with the teacher, a programme of work was started, aimed at building up skills in computer music notation in Sardinia and at making the program easier for users to approach. Some participants were able to see a few pages of their scores realized in print.

Six research commissions have been entrusted to as many musicologists and composers, and three composition commissions to three musicians of renown: Mario Bertoncini, Enrico Correggia, Thomas Kessler. This research work will conclude with the publication and circulation of the essays produced, and with the performance of the music written, within an inter-association programme.

The RICERCARE association has already experienced a dense network of relations and exchanges with several similar institutions and European research centres.

PARTICIPATIONS
Lyon, European acousmatic conference, 1991
Basel, Musikakademie, 1992
Cagliari, Spaziomusica, 1992
Cagliari, Ente lirico, 1992
Basel, Musikakademie, 1993
Tokyo, ICMC, 1993
Boston, Alea 3, 1993
Cagliari, Spaziomusica, 1993

COLLABORATIONS
Elektronische Studio, Musikakademie Basel

WORKS PRODUCED
M. Bertoncini: Chanson pour instrument à vent.
L. Garau: Canoni. Preludio.
G. Tedde: ATM. Cellovoce. Vox.

EQUIPMENT
Quadra 700 20/80
Apple 21" monitor
Centris 650 32/120
Apple 17" monitor
Digital Film
Blizzard 1 Gb hard disk
StreamerDat PLI
Powerbook Duo 230 8/120
Digidesign Pro-tools
Sample Cell 8 Mb
Technics turntable
Teac double cassette deck
2 Shure SM 81 microphones
Sanken COS11BP microphone
Sony DAT
Teac DAP20 DAT
Fatar keyboard
MIDI Opcode Studio 4
Apple CD-ROM
Alesis 16-channel mixer
Soundcraft Spirit Folio 12-channel mixer
4 Yamaha NS 10M loudspeakers
Abaton Interfax 24/96
Macintosh Classic
HP Laserjet 4M printer
Style Writer printer

SOFTWARE
Finale 2.6.3
Alchemy 2.23
Max 2.5
Studio Vision 1.4
Turbosynth 2.0

Chapter 10

MUSICAL COMPOSITIONS
THE EFFECT OF DIGITAL SYNTHESIS LANGUAGE ON
THE CONCEPTION AND PROCESS OF COMPOSITION
Ludger Brümmer
Visiting Composer at CCRMA Stanford University USA
New Address: Hohenzollernstr. 66
45128 Essen / Germany
E-mail: ffh004@vm.hrz.uni-essen.de
1. Creating tape music with electronic synthesis

The most obvious difference between composing for traditional instruments and for digital technology lies in the limitations brought about by the process of performance. The computer has limitations too, but on the whole it seems to be more powerful than the instrumental player, as it does exactly what the user tells it to do. And this is the problem! Presented with a couple of notes to play, an instrumentalist would start to add the usual interpretational habits of vibrato, phrasing, dynamic changes, note timbre variations and so on. This adds a complex set of parameters to the composed material and, with this, a language of convention in the process of musical communication. Their absence in computer music is one of the main problems that the listener encounters.

The appearance of the players on stage, the use of well known instruments and a common expectation of their sound quality (or range of sound qualities), as well as the use of a compositional language, help us to perceive and evaluate musical structure. Music as a kind of communicative language needs well defined expressions if even the surface of its structure is to be perceived. Considering this lack of performer interpretation, the problem for the composer of computer music is therefore to create a composition with enough new and traditional information. One way of dealing with this problem is to define or redefine the musical material by applying conventions, like instrumental timbres, classical form schemes, equal tempered scales, melodies, rhythms and so on. Another way is to use a new abstract element and expose it in the piece like a motif.

2. The theme of the piece "La cloche sans vallées"

One element exposed in the piece "La cloche sans vallées" is the source piece "La vallée des cloches" for piano, from the cycle "Miroirs" composed by Maurice Ravel. The whole piece is generated from this sound. Another idea can be described as an "identical or modified repetition of a section of a sound": a loop. Changing the duration of a single loop (pulse) leads to the following types of perception: i) The shortest loop is perceived as a pulse or a click without any timbre or, therefore, pitch. ii) A longer loop enables the auditory system to, in effect, apply a Fourier Transform to the signal and evaluate at least a part of the timbre of the source signal. iii) If the loop is longer still, the ear can perceive timbre and pitch completely.

These phenomena also apply to continuous sounds containing repetitive portions of signal and silences.

Human perception structures sound information in the sub-audio domain into time units of different size, e.g. events, sequences, and sections, while sound information in the audio domain is perceived as pitch and timbre. Once placed in one of these categories, the sound information is perceived. A process of decreasing or increasing speed causes a modification in the perception of a signal as the category crosses the threshold between sub-audio and audio perception, and with this the whole set of categories moves.

Nearly all the sounds used in this piece create a ritardando through different algorithmic procedures. Even the form itself is an inverted ritardando (or accelerando) followed by another ritardando. The whole structure of the first 10 minutes is transposed or accelerated successively 7 times until it shrinks into a click, which again starts to expand, discovering its new contents. At the point where time is compressed to its limit (the click) the melodic structure of the piano sound is revealed and mirrored, making reference to the symmetrical structure of the source piece.

This technique of discovering material by decelerating the speed is used to connect Ravel's piece with the new composition.

3. The looping program

The looping program, which was created for the above mentioned ideas, basically executes pointer operations. Four main functions are specified by envelopes: 1. loop length, 2. reading position in the source file, 3. silent time in between the loops and 4. glissando.

To have access to smaller events as well, some functions determine the individual loop. Each loop can be read forwards or backwards as specified in a list, e.g.: (r f f f r r f ...). The pattern of the list is then repeated. Each loop is separately transposable with a pattern given in the loop pitch list: (1 12 132 1 ...).

The idea of the parameter determination is to specify parameters on different time levels, so that a heterogeneous global structure can be generated. The pitch parameter, for example, is determined 1. globally with a pitch constant, 2. flexibly with the pitch envelope and an envelope scaler, 3. specifically in the loop pitch list and 4. by the envelope which applies silences between the loops.

A zero crossing detector with settable limit values and two ramping mechanisms, which fade the sound in at the beginning and out at the end of each loop, help to avoid clicks. An envelope controls the intensity of this process.

The loop and the deceleration of note patterns represent the central idea of the piece. A software synthesis language can now provide a totally different conception of composition. The instrument is not only the source of the sound and carrier of the musical structure (like a conventional instrument); as a sound synthesis program it can represent the musical concept of the piece every bit as much as the musical material itself. The creation of the synthesis program as a creative part of the composition process influences all the following work and determines the composition. In instrumental music the composition is determined by the technical abilities of an instrument and of its player. But while the abilities of player and instrument could be seen as inflexible constants, the program created for the sound synthesis represents a variable which could be an interactive part in the process of composing.

4. Conclusion

The tradition of instrumental music provides a notation language: the notation of a new sound is derived from the traditional context of an existing set of sound characteristics and playing techniques. Referring to the process of creation makes it easier to categorize the sound. The amount of useful sounds which the instrumentalist is able to create with his or her instrument is huge, but all of these sounds are shaped by the instrument itself (material, resonance body), the technique of sound creation (bowed, plucked, blown etc.), the acoustics of the room and the distance of the player from the listener. The limitation of the sounds by the way and technique of production, and perhaps even the optical process of creating the sound, enables the listener to build categories of perception. These categories help us to use a specific sound as an expressive element of a composition, not so much as an effect, but more as an integral part of the composition that can then be varied to a greater or lesser degree.

The way a sound is created in a more or less complicated synthesis process inside the computer makes it difficult to control and categorize the results. There exists no optical representation of the process of production which could help in the perception of the music. Sounds cannot be categorized by the technique used to create them. The same technique applied to different sources can result in widely varying sounds, so that it may not be possible for the listener to perceive these technically linked sounds as linked in an aural sense. Hence the process of structuring is the responsibility of the composer. He or she has to select the techniques and results and apply limits to the material as a whole, so that some elements are restricted while others are exposed, to create the potential of expression for the otherwise unordered timbres.

The lack of tradition in computer music, and the explosion of techniques made possible by the computer, require of the composer an additional control and clarity of intention in order to create an expressive musical language. They also require technical, psychoacoustic and programming knowledge and skills, to create a piece of music and not merely a piece of sound.

No established and elaborated technique of composition exists as yet in the field of computer music. The topic of "musical language" is the weak point of this genre, especially of music created by newly defined processes of synthesis. The freedom and infinite possibilities of computer music are at the same time both its weakness and its attraction.

The piece was created on the NeXT computers at CCRMA, Stanford University, USA, using William Schottstaedt's Common Lisp Music synthesis language and Richard Taube's Common Music composition language, as well as Paul Lansky's RT mixing program and a special "Emergency Filter" designed in just a couple of minutes by Julius O. Smith.

Picture: formal overview of "La cloche sans vallées" with references (algorithms shown in the score) and transpositions in half-tone steps.
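
The looping procedure described in section 3 can be illustrated with a minimal Python sketch. It is not the composer's program (which was written in Common Lisp Music): all names are hypothetical, the envelopes are modelled as plain functions of normalized time, "r" is assumed to mean a reversed loop and "f" a forward one, transposition is done by crude nearest-neighbour resampling, and the fades are simple linear ramps. The glissando envelope and the zero crossing detector are left out for brevity.

```python
import math
from itertools import cycle
from typing import Callable, List, Sequence

Envelope = Callable[[float], float]  # maps normalized time 0..1 to a value


def read_loop(source: Sequence[float], start: int, length: int,
              ratio: float, backwards: bool) -> List[float]:
    """Read up to `length` samples from `source`, beginning at `start`.

    `ratio` is the transposition factor (2.0 = one octave up); samples are
    fetched by nearest-neighbour lookup to keep the sketch short.
    """
    out = []
    for n in range(length):
        idx = start + int(n * ratio)
        if idx >= len(source):
            break
        out.append(source[idx])
    return out[::-1] if backwards else out


def fade(loop: List[float], ramp: int) -> List[float]:
    """Linear fade-in and fade-out over `ramp` samples, to avoid clicks."""
    ramp = min(ramp, len(loop) // 2)
    out = list(loop)
    for n in range(ramp):
        gain = n / ramp
        out[n] *= gain
        out[-1 - n] *= gain
    return out


def render(source: Sequence[float], n_loops: int, loop_length: Envelope,
           read_position: Envelope, silence: Envelope,
           directions: Sequence[str], pitch_ratios: Sequence[float],
           ramp: int = 32) -> List[float]:
    """Concatenate loops; direction and pitch patterns repeat cyclically."""
    out: List[float] = []
    dirs, ratios = cycle(directions), cycle(pitch_ratios)
    for k in range(n_loops):
        t = k / max(n_loops - 1, 1)  # normalized time of this loop
        loop = read_loop(source, int(read_position(t)), int(loop_length(t)),
                         next(ratios), next(dirs) == "r")
        out.extend(fade(loop, ramp))
        out.extend([0.0] * int(silence(t)))  # silent time between loops
    return out


if __name__ == "__main__":
    # One second of a 220 Hz sine at 8 kHz stands in for the piano recording.
    source = [math.sin(2 * math.pi * 220 * n / 8000) for n in range(8000)]
    result = render(source, n_loops=8,
                    loop_length=lambda t: 100 + 400 * t,  # loops grow longer
                    read_position=lambda t: 4000 * t,
                    silence=lambda t: 50 * t,
                    directions=["r", "f", "f", "f", "r", "r", "f"],
                    pitch_ratios=[1.0, 2.0])
    print(len(result), "samples rendered")
```

Letting the loop length grow over time, as in the usage example, is one simple way to obtain the kind of ritardando the text describes; the actual envelopes used in the piece are not given in the paper.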

DOPPIO SOLO

Luigi Ceccarelli
Musica Verticale
via Tevere, 15 - 00198 Roma
tel/fax +3968411034

Abstract
Doppio Solo (Double Solo) is a composition for amplified alto
saxophone and sampled bass saxophone and alto saxophone sounds. It
was realized in 1993 in the composer's private studio with the
collaboration of the saxophonist, Federico Mondelci.
The basic idea behind the work is to be found in considering the various
creative possibilities which can occur when a direct confrontation is set
up between a live performer and a computer-controlled performance.
The piece, decidedly virtuosic in character, demonstrates both the
soloist's bravura together with the highest possible level of studio editing
techniques. Rather than compromising these two different worlds of
techniques the end result yields a true expansion of sonic events. The live
and synthetic parts are almost always present together and often strictly
intertwined, whether in melodic and rhythmic figures or in timbre and
range. Hence the title Doppio Solo.
The main technical challenge in the studio was to reconstruct a synthetic
part which would match the saxophone's natural phrase characteristics
and at the same time create a part which amply surpasses the human limits
of possible saxophone technique. Almost all of the individual sounds were
sampled separately in order to allow for complete control in freely
reconstructing, using digital editing, authentic sounding melodic figures
and rhythmic articulation. Only in a few instances were the almost 400
original samples altered electronically. The samples were grouped into
five timbral categories: key clicks, slap tongues, staccatos, tremolos,
repeated notes and multiphonics. These five groups were in turn further
subdivided into 32 different programs and corresponding MIDI channels.
Each of the programs consists of a unique timbre for the full range of the
instrument. The score and sequence of sampled sounds was produced
using a commercial sequencing program on an Atari computer.

...in any case, the meaning of "Doppio Solo" is not to be sought in the sounds themselves, but in the relationship the sounds have with the personal history of each listener.

"Doppio Solo" is a composition for amplified alto sax and sampled sounds of bass sax and alto sax. It was realized in 1993 in the composer's studio, after a period of research on saxophone performance technique carried out in collaboration with Federico Mondelci.

The fundamental idea behind this piece is to set up a direct confrontation, in the realization of one and the same musical idea, between a live performer and a performance realized by computer, and to treat the possible differences between the two as a creative element. The composer's intention was to realize a work in which the two components (the virtuosity of the performer and the, in some respects equally virtuosic, studio editing) could coexist at the highest technical and expressive level, obtaining a result that would not be a compromise between different techniques but a true expansion of the possible sonic universes.

The composition is conceived as a piece that exploits the soloist's skill to the highest degree. The live part and the synthetic part are both always present and are often closely united (each like a shadow of the other), in the character of the melodic and rhythmic figures as well as in timbre and range. In reality we therefore have a real soloist and a virtual one performing the same solo. Hence the title.

In "Doppio Solo" it is not only the techniques that differ: the stylistic elements, too, can hardly be traced back to a single historically established model. While maintaining an undeniable unity, the aim was to obtain a musical result above the classification of genres and of the various currents, using stylistic features which, despite more than fifty years of coexistence, still belong to conceptions far removed from one another.

Description of the form.

The piece is divided into five sections according to the scheme A, B, A', C, A". The A sections consist of several very closely spaced melodic lines (one live and the others synthetic, varying in number from one to five) which follow exactly the same sinuous course. The three sections are differentiated above all by the register in which the alto sax plays: the first is written mainly in the low register, the second in the middle register, while in the finale of the piece the extreme high register is reached.

In sections B and C the relationship between live sound and synthetic sound is less close. Section B has the character of a continuous band of sound and is realized solely by superimposing tremolos, breathy repeated notes and key clicks, which form a dense, shifting harmonic agglomerate. In this section the soloist has a very free part made of tremolos.

Section C, by contrast, is realized only with very short sounds. The synthetic part consists solely of a rhythmic texture of bass sax slaps (slap tongue): up to eight simple rhythmic lines, slightly out of phase with one another because of their different metronome speeds, were superimposed so as to form a very complex and gradually varying rhythmic pattern.

The total duration of the piece is 13 minutes.

Technical realization.

The main problem faced in the studio realization of the synthetic part was that of reconstructing a saxophone-like character in the phrases which would be credible and at the same time amply exceed the technical possibilities of the instrumentalist.

This result was obtained in the first place thanks to work which, in each of its phases (recording, organization of the samples and realization of the sequences), always maintained a high degree of sound quality.

For the digital recording of the sounds of Federico Mondelci's saxophones, two Neumann U87 microphones were used, placed very close to the instruments so as to record also the noises of the mechanism and the performer's breath, components which play a significant role in this piece. From the recording, about 400 different sounds were subsequently selected and resampled, and then stored at a sampling frequency of 44.1 kHz in two Akai S1000 samplers with a total capacity of about 18 Mbyte. The use of two samplers is due not so much to a problem of memory capacity as to the possibility of having 32 sounds simultaneously instead of 16, and of organizing 32 programs that are completely independent from the point of view of MIDI communication.

In order to reconstruct the melodic and rhythmic articulation freely by digital editing, the sounds were all sampled individually (with the exception of some tremolos, some repeated notes and some sustained multiphonic sounds which last a few seconds). This choice was made in order to control the reconstruction of the musical phrases completely and, while always trying to maintain a close analogy with the expressive quality of live performance, to overcome the technical limits of mechanical instruments, above all in the speed and clarity of articulation.

The samples were organized according to five different timbral types: key sounds, slaps, staccatos, tremolos, repeated notes and multiphonics. In addition, the samples were subdivided into 32 different programs assigned to as many MIDI channels. Each program contains a different timbre which reproduces the whole range of the real instrument.

In general, the aim was to keep the original sound as recognizable as possible, without electronically altering the digital signal. Only some tremolos were lowered by one or more octaves to obtain a drone function. Some slaps, too, in a short part of the piece, were doubled in speed and thus raised by an octave, to obtain an unrealistic sound.

The design of the score and of the sequence of samples was realized with an ordinary commercial sequencer on an Atari computer.

In the organization of the sound sequences, some of the solutions adopted are worth noting:

- For the three A sections an independent sequence, parallel to the melodic sequences, was created for the key noise. Besides making the overall effect more realistic, this solution made it possible to accentuate the precision of the rhythmic articulation to a greater or lesser degree, independently of the other sounds, simply by varying the intensity of the key clicks;
- The metronome speed continually undergoes small and almost imperceptible variations which serve to accentuate the character of the phrases, as really happens in a live performance. In general there is a slight acceleration in dynamic crescendos and a slowing down in decrescendos and at the end of each phrase.
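
The arrangement described in the technical section, 32 programs spread over two samplers so that each program has its own MIDI channel, can be sketched as follows. This is only an illustration in Python: it assumes that each sampler receives the 16 channels of its own MIDI port and that programs are numbered from 0 to 31. Neither detail, nor any of the names used, comes from the composer's actual setup.

```python
from typing import NamedTuple

CHANNELS_PER_SAMPLER = 16  # one MIDI port carries 16 channels
N_SAMPLERS = 2


class Route(NamedTuple):
    sampler: int  # 0 or 1 (hypothetical numbering of the two samplers)
    channel: int  # MIDI channel, 1..16


def route_program(program: int) -> Route:
    """Map a program number (0..31) to a sampler and a MIDI channel."""
    if not 0 <= program < CHANNELS_PER_SAMPLER * N_SAMPLERS:
        raise ValueError("program must be in 0..31")
    sampler, offset = divmod(program, CHANNELS_PER_SAMPLER)
    return Route(sampler, offset + 1)


if __name__ == "__main__":
    for p in (0, 15, 16, 31):
        print(p, route_program(p))
```

With a single 16-channel sampler only the first half of this table would exist, which is the limitation the second sampler removes.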

FINZIONI
FOR VIOLIN AND QUADRAPHONIC TAPE

Fabio Cifariello Ciardi

Conservatorio "U.Giordano"-Foggia
via Pietro Giannone, 28
00195 Rome - ITALY

Abstract

In "Finzioni", for violin and quadraphonic tape [1], I considered listeners' common long-term memory as a pole in the exploration of a sonic universe entirely derived from violin samples.

In music cognition, memory is usually divided into short-term (episodic) and long-term (semantic) memory. The first refers to the occurrences of specific, concrete, temporally dated events; the latter concerns the organized, abstract, timeless knowledge a person possesses. The use of short-term memory is illustrated by the recognition of repetitions and variations of an unknown event presented immediately before. The use of long-term memory is illustrated by the correlation of an unknown event with a personal and social experience. This task is always performed during music perception in order to store the overall meaning, to grasp the pattern invariants of pieces and styles [2],[3].

Compared to short-term memory, long-term memory is rarely considered in compositional processes. It has often been treated as a personal, unique wisdom that loses its consistency when applied to a large group of people. Furthermore, long-term memory refers to knowledge that is already achieved, while composers ideally tend to explore original, unknown territories. On the other hand, common auditory experiences today are much more frequent than in the past, and therefore we might expect to store the same experiences in a much more tangible common long-term memory. This common knowledge plays an essential role in the timbral perception of natural and artificial sounds. Long-term memory knowledge is retrieved in order to synthesize the multidimensionality of timbre into a more invariable "mental image", no matter whether the sound comes directly from real instruments or from loudspeakers.

According to contemporary research in music cognition, these structural constancies underlying surface changes (like spectral variations in the time domain) could be related to archetypical physical processes we associate with sound, such as plucking, striking, bowing, blowing [4],[5].

These reflections have been taken into account in the course of the timbral organization of "Finzioni". All sounds have been derived from real violin samples only, but their modifications have been carried out to create a dialectic interplay between musical events and the common memory of the sonic universe the violin is able to evoke.

I considered our common mental "image" of the violin as the result of a complex interaction among three different memories: a timbral memory, a memory of instrumental gestures and a memory of very well-known musical fragments. In this sense the exploration of the violin's sonic universe in the tape part is intended as the exploration of an interplay between the music and these three memories.

In the tape part, timbral memory is explored through a slow and irregular interpolation going from the deeply transformed sound of the beginning, which prolongs the tone ending the solo introduction like a frozen resonance, to the pure violin tones at the end, which recall some figures previously played by the soloist.

The interplay with our memory of instrumental gesture is stimulated by artificial granular articulations of transformed violin samples, but also by "untouched" samples of unorthodox violin articulations (like the smooth passage between normal and harmonic tones in the central part of the piece) embedded between natural violin samples articulated in an artificial way.

Finally, fragments from a very well known violin repertoire have been used to interact with a common musical memory. These fragments are:

1) the first theme from the "Danse Macabre" by Camille Saint-Saens;
2) the first theme from "The Sorcerer's Apprentice" by Paul Dukas;
3) the first theme of the third movement of Brahms's violin concerto;
4) the first theme of the third movement of Brahms's violin sonata in D minor op. 108;
5) the evergreen "The man I love" by George Gershwin.

All fragments share the same melodic micro-structure, which is used as basic material for the whole pitch structure of the piece.

"Finzioni" is the title of an anthology of short stories by Jorge Luis Borges. Usually Borges astonishes the reader with his associations between real and imaginary worlds, between things already stored in our common memory and things

402
which are toally new. In this parvenza, che un altro stava
same sense I imagined this music sognando.
travelling quickly, sharply
through the virtual space of our "Finzioni" has been released at
memory, a virtual space IRCAM during the Cursus
simulated by the quadraphonic d'Informatique Musicale 1991
reproduction system. The soloist and is dedicated to my son
should be placed at center of this Francesco.
space, not on the stage: the entire
sonic universe is dreamed of,
created and explored by him. A
soloist who, at the end of the [1] F.Cifariello Ciardi:
"Fiction", will pass, like Alice, ... "Finzioni" per violino e nastro
through the looking glass. qudrifonico", EDIPAN ed.
Roma, 1992.
Borges described all that in "The [2] W.J.Dowling &
circular ruins", one of "Finzioni" D.L.Harwood: "Music
novels: cognition", Acdemic Press, New
"Nessuno 10 vide sbarcare nella York, 1986.
notte unanime... [3]A.Baddeley: "La memoria
umana" II Mulino ed.,Bologna,
II proposito che 10 guidava non 1990.
era impossibile anche se [4] C.Cado:"Timbre et causalite"
soprannaturale. Voleva sognare in "Le timbre, metaphore pour la
un altro uomo: voleva sognarlo composition", IRCAM-
con minuziosa interezza e C.Bourgois ed., Parigi, 1991.
imporIo alla realta' ... [5] G.J.Balzano: "What are
musical pitch and timbre", Music
Lo sogno attivo, caldo, segreto... Perception, Vo1.3,N.3, pp.297-
314, 1986.
Gradualmente 10 venne
avvezzando alIa realta' ...

In un alba senza ucelli il mage


vide avventarsi contro Ie sue
mura l'incendio concentrico...

Ando' incontro ai gironi di


fuoco: che non morsero la sua
came, che 10 accarezzarono e
inondarono senza calore e senza
combustione. Con sollievo, con
umiliazione, con terrore,
comprese che era anche lui una

VISIBILI
FOR TWO VIOLINS AND TAPE
Alessandro Cipriani
Via Voghera, 7 00182 Roma Tel. +39 6 7010310

"Visibili" was composed in 1992, a period of the history of music when the
passage from the techniques of written tradition to those of electronic
tradition becomes more and more evident: if we think about the influence
of new technologies in recording studios on the performance and the
recording of classical and contemporary music; if we notice the fact that
listening to music in our society occurs mostly via technology (CD, tape
recorders, media, etc.) and not directly; if we think about the impact
that the proliferation of music everywhere has on the value and the
meaning of the performance and of the listening ritual, we can become
aware of the importance of that passage.

This piece was born from research on the possibility of using the same
basic instrumental materials (written or played and recorded) and two
different elaboration techniques: the first, bound to the written
tradition, for the instrumental parts; the second, bound to the electronic
tradition, for the elaboration of the tape part. Exchanges among
techniques of different traditions constantly took place in the history of
music (for example the survival of numerous signs of oral tradition
techniques within the written tradition) [1]. Therefore I tried not to
establish theoretically strict boundaries between strategies and
behaviours definable as "electronic" and those definable as
"instrumental". I tried instead, within the possibilities at hand, to make
the best of the specific quality of the two traditions and of the
possibilities that the respective techniques offer. The differences
between those two "worlds" find their unity, besides that of the basic
materials, also in the construction of the form. There I tried to identify
some basic formal functions that could be dealt with by one or the other
medium: the formal functions and the sound of the instruments as unifying
elements; the diversity of the traditions and of the techniques as a
subject to reflect on.

The three techniques used here (reiteration, multi-track recording and
related techniques, granular synthesis) are indicated in the score by
three different symbols. I didn't try to 'describe the sound' since that
description would be extremely reductive. This is, for example, a specific
quality of the electronic tradition: the separation of the sound
description from its performance.

The first section of the tape of "Visibili" is made by accelerating the
reiteration of pizzicato sounds up to a speed of 100 impulses per second,
so that two textures are produced: one has the characteristic taken from
the pizzicato (same pitches as the pizzicato sounds performed at the
beginning of the piece by the violinists); the other texture is generated
by the frequency of the impulses, up to 100 Hz. The higher or lower
frequency of the reiteration is used as a means to a higher or lower
tension on the perceptive level. The idea of the reiteration technique
comes from Stockhausen's proposal [2] of a sound generated by the
frequency of impulses of another sound. The difference from that technique
is that in "Visibili" I used instrumental sounds that remain recognizable
by their timbre even when they become a texture, notwithstanding the
intervention of the second sound, used here with a sort of 'subharmonic'
function. This is another of the many examples taken from the electronic
tradition: to repeat the same sound (and not to perform the same note) is
one of the specific possibilities of those means that, for many years,
have allowed a sound not to exist only in one moment and in one place,
'disconnecting' in a way the sound from its source.

In the second section of the tape of "Visibili" I tried to bring out some
aspects of the violin timbre, by first recording an instrumental cell
performed by one violinist in different versions (each beginning from a
different pitch, keeping the intervals unchanged), then "transposing" each
of those versions with electronic means to the same beginning pitch, some
octaves under the original. Comparing the various versions it was easy to
notice the diversities within the timbre of the different
"transpositions", and yet 'the same notes were played' and all of them
were originally played on the same instrument. That is another specific
technique of the electronic tradition: not to play a lower pitch note,
which will necessarily have a different harmonic spectrum configuration
from a higher pitch note played on the same instrument, but 'to transpose'
the same note (keeping the same relation among the harmonics) to a lower
register. In the latter way the production of a certain type of harmonic
spectrum is 'disconnected' from the typical pitches to which that harmonic
spectrum is connected. To use a type of 'electronic' technique means in
this case to conduct research at the same time into the sound of the
instruments, the same ones that will be played live.

The third technique is granular synthesis and, in particular, the
technique developed by Barry Truax to 'time-stretch' a sound without
changing its pitch. In this case, instead of using a real-time system such
as Truax's PODX, I used in the penultimate and more extensively in the
last section of the piece a CSound orchestra written by Eugenio Giordani.

"Visibili" was also born from reflection on the concreteness, the physical
quality, the visibility of the performer, on his relation with the
instrument (with all the historical connotations implied) on one hand, and
the 'invisibility' and the sense of abstractness of the 'resounding' of
the tape, notwithstanding the concrete original source (violin), on the
other. I will try to describe the scene of the performance: the two
violinists play live, one in front of the other with the music stands
between them. In the piece there are two sections for tape only: during
the first the performers stay motionless, one in front of the other, seen
sideways by the public; before the second section they leave the stage
walking towards the wings and playing the last notes prescribed in the
score. At this point on the stage, behind the music stands, appears a
motionless life-size image of the two violinists wearing the same clothes,
in the same position where the real performers were formerly playing. The
projection takes place using a slide projector, but the image has been
previously photographed from a video so that the texture of the human
figure refers more to an electronic image than to a photo. At the end of
that last section for tape only, a two-note chord can be heard from afar:
it's the sound of the two violinists who are behind the wings...

References

[1] L. Treitler: "From ritual through language to music", Schweizer
Jahrbuch für Musikwissenschaft, Neue Folge 2, pp. 109-123, 1982.

[2] K. Stockhausen: "L'unità del tempo musicale", in "La musica
elettronica", edited by H. Pousseur, Feltrinelli, pp. 150-160, 1976 (also
in German in "Zeugnisse", Europäische Verlagsanstalt, Frankfurt, 1963).

[3] A. Cipriani: "Verso una tradizione elettronica?", unpublished, 1993.
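
The reiteration technique described for the first tape section of "Visibili" can be illustrated with a short Python sketch. It is an illustration only, not a reconstruction of how the tape was made: the exponential acceleration curve, the starting rate of 2 repetitions per second, the 10-second duration and both function names are assumptions; only the final rate of 100 impulses per second comes from the text. The sketch shows that, as the interval between repetitions shrinks towards 10 ms, the repetition rate itself approaches 100 Hz, while every repetition still carries the spectrum of the repeated sound.

```python
def accelerating_onsets(start_rate_hz=2.0, end_rate_hz=100.0, duration_s=10.0):
    """Return onset times (seconds) of a sound repeated at a rising rate."""
    onsets = []
    t = 0.0
    while t < duration_s:
        onsets.append(t)
        # Repetition rate at time t (assumed exponential glide between rates).
        rate = start_rate_hz * (end_rate_hz / start_rate_hz) ** (t / duration_s)
        t += 1.0 / rate
    return onsets


def reiterate(sample, onsets, sample_rate):
    """Mix one copy of `sample` (a list of floats) into a buffer at each onset."""
    length = int(onsets[-1] * sample_rate) + len(sample)
    out = [0.0] * length
    for onset in onsets:
        start = int(onset * sample_rate)
        for k, value in enumerate(sample):
            out[start + k] += value  # overlapping copies simply add up
    return out


onsets = accelerating_onsets()
intervals = [b - a for a, b in zip(onsets, onsets[1:])]
first_rate_hz = 1.0 / intervals[0]   # slow: heard as separate pizzicato notes
last_rate_hz = 1.0 / intervals[-1]   # close to 100 Hz: heard as a texture
```

In practice `sample` would hold a recorded pizzicato; the second function only shows the overlap-add step that turns the onset list into a signal.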

RECONSTRUCTIONS
FOR HARP AND COMPUTER
James Dashow

02030 Poggio S. Lorenzo (RI)


FAX: (+39) - 765 - 80.294
or (+39) - 6 - 58.19.904

The organizers of this Colloquium have asked us composers to say a few
words about our non-wordy work. They have requested us, kindly, to attempt
to provide a few clues about the more significant musical or technical or,
if all else fails, methodological characteristics of our efforts, in a
manner (hopefully) comprehensible (emphasized by italics) to readers
outside our specialty (just about everybody). So, dear reader, I will
herein endeavor to do all as asked by the hard-working organizers of the X
Colloquium in the interests of comprehensibility, and in accordance with
the "Instructions for Composers" provided by them to help us non-verbal
types navigate the difficulties of expressing ourselves in an unfamiliar
medium.

Of course, all the words in the world won't help you get through
RECONSTRUCTIONS if you don't have a chance to hear it. So those readers
who were not at the 4 December concert in Milano (or even those who were
but would like a few more goes at it), should head for their favorite
local CD store and ask about a ProViva CD, made in Germany and coming out
in early 1994, with 4 of my pieces on it. RECONSTRUCTIONS is one of them,
and the harpist for whom it was written, Lucia Bova, is the soloist. She
plays it extremely well. Then you should listen to it, ears wide open, and
first of all just enjoy (or get accustomed to) the sound.

After you have listened once, maybe twice, you will have undoubtedly
noticed that it is asymmetrically in two parts: the first part energetic,
multi-dimensional linear and full of complexities, the second much slower
and suspended, more concerned with developing sound colorings rather than
any other aspect. But the first part has a slow suspended pause in it that
is a forward reference to what is coming, while the second part has a
highly charged energetic part in it which is a backward reference to what
has been. The combinations of intervals and the specific pitches that make
up each section and provide motivations for the forward/backward
references are Significant Musical Characteristics. But if you don't
happen to have perfect pitch or are unable to memorize interval patterns
the first few times around, don't worry. The form of the piece is also,
and especially, concerned with kinds and degrees of energy and motion, and
this is all right out there in the open for you to hear. All those pitch
and interval relationships support (and are supported by) the flux and
transformations of energy of the piece. My job is, has been, to convince
you that it all goes together and makes musical sense, as it does to me.

Would my telling you a few of my workshop secrets be of any value to you,
dear reader (and now listener, I hope)? Well, maybe. You might have
noticed that the harp pitches are often there in the electronic sounds.
Well, actually the harp notes are always (italic that one) there in the
electronic sounds, sometimes in the foreground, sometimes in the
background. Yes, dear reader-listener, there are indeed some specific
sound synthesis algorithms involved, all written by yours truly. I can
make the computer generate all sorts of sounds built around whatever pairs
of notes (intervals) I choose. All the sounds are structurally derived
from the pitches played by the harp, and vice versa, all the pitches are
organically related to the electronic sounds, or timbres. Pitch structure
and timbre structure are completely integrated. Or more simply, the harp
and the electronic sounds are one and the same structure - each part works
out, within, around the structure in its own way. Kind of like
simultaneous (rather than successive) variations on the same musical
stuff.

So, I think up ideas based on some of those rich electronic sounds, or on
some combination (successive or simultaneous) of those intervals, or
interactions between both, and then I elaborate them with all kinds of
Significant Methodologies. I even have a method for the development of the
underlying pitch structure which in turn is made to generate the sounds.
If you want to know more about the nuts and bolts of these methods you can
look at some of my other attempts at prose, printed, more or less
comprehensibly, elsewhere: [1] [2] [3] [4] [5].

But to complete the picture with some Technique: the electronic sounds
were entirely generated using the MUSIC30 software system for digital
sound synthesis. MUSIC30 was written by myself for the PC-compatible
plug-in accelerator board known as the SPIRIT-30, made by some very
cooperative electronic engineers at Sonitech International [6] utilizing
the Texas Instruments DSP chip, the TMS320C30. For those of you who are
concerned: the system is so fast that many things can be done in real
time, while deferred time performance of complex music is close to
mainframe standards. All with a humble PC. You can work at home.

Where does the title of this piece come from? A nice quote from Jean
Piaget, which goes like this: "Reconstruction involves increasingly varied
recombinations and a greater freedom in the kinds of combinations."
In fact, Piaget has an interesting, different, (musically applicable) way
of thinking about structure to which I subscribe, and which I think is
worthwhile mentioning here too: "...structure is a system of
transformations. Inasmuch as it is a system and not a mere collection of
elements and their properties, these transformations involve laws: the
structure is preserved or enriched by the interplay of its transformation
laws ...the notion of structure is comprised of three key ideas: the idea
of wholeness, the idea of transformation and the idea of self-regulation."
That pretty well describes the way I composed this piece, and others too.
But then there is the usual question, what does it all mean. Well, all you
have to do is listen and find out. Buon ascolto.

References

[1] James Dashow: interview with S. Momo in booklet accompanying CD
recording of 3 works (ARCHIMEDES, Act I, scene ii; MNEMONICS for violin
and computer; ORO, ARGENTO & LEGNO for flute and computer), WERGO CD WER
2018-50, 1989.

[2] James Dashow: "Spectra as Chords", Computer Music Journal, Vol. 4 N. 1
(Spring), pp. 43-52, 1980.

[3] James Dashow: "Three Methods for the Digital Synthesis of Chordal
Structures with Non-Harmonic Partials", Interface, N. 7, pp. 69-94, 1978.

[4] James Dashow: "New Approaches to Digital Sound Synthesis and
Transformation", Computer Music Journal, Vol. 10, N. 4 (Winter), 1986.

[5] James Dashow: "The Dyad System", Perspectives of New Music,
forthcoming.

[6] Brewster LaMacchia, Yogendra Jain, Sonitech International, 14 Mica
Lane, Wellesley, MA 02181, USA; tell 'em I sent you.
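
As a purely illustrative aside on what "sounds built around a pair of notes" can mean, the following Python fragment uses textbook frequency modulation, whose partials lie at the carrier frequency plus or minus integer multiples of the modulator frequency. Setting the carrier to the lower note and the modulator to the difference between the two notes places both notes of the interval among the partials of a generally inharmonic spectrum. This choice is an assumption made for the example: it is not the MUSIC30 algorithms nor the methods of [2]-[5], and the function name and the two pitches (equal-tempered D4 and A4) are hypothetical.

```python
def dyad_fm_partials(f_low_hz, f_high_hz, order=4):
    """Partial frequencies |c + n*m| of simple FM with c = f_low, m = f_high - f_low.

    Negative frequencies are reflected to their absolute value.
    """
    carrier = f_low_hz
    modulator = f_high_hz - f_low_hz
    partials = {round(abs(carrier + n * modulator), 6)
                for n in range(-order, order + 1)}
    return sorted(partials)


d4, a4 = 293.66, 440.0  # assumed example pitches, not taken from the score
partials = dyad_fm_partials(d4, a4)
```

Both notes of the chosen interval appear in `partials`, surrounded by components that are not harmonics of either note.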
question, what does it all mean.

On the composition of Zeitwerk (l'orizzonte delle cose)
4-track computer-generated tape

Agostino Di Scipio

Abstract

The realization of Zeitwerk (l'orizzonte delle cose) involved 4*4 processes
of timbral exploration (four processes replicated on 4 channels). Each
process utilized a unique set of 8 purely sinusoidal grains of fixed frequency
and amplitude. Therefore the strategy determined exclusively how the
grains repeat and overlap through time (in this sense, the entire composition
was realized using design processes operating in the time domain only).
Criteria of patterning of grains included iterated difference equations
showing global properties of self-organisation, ranging from chaotic
behaviors to hierarchically structured patterns. Also iterated time-shifting
and -folding of sound were extensively used.

In this short paper, Zeitwerk is described as an example of algorithmic
composition that brings forth relevant morphological properties of sound
materials. In terms of composition-theory, the approach is one that merges
models of materials and models of musical design, and that fosters a
peculiar attitude towards indeterministic models of detailed sonic design.

Composed at the University of Padova with recently implemented granular
synthesis options in G.Tisato's ICMS, the work was premiered in Rome
(September 1992). The version presented at the 10th CIM was realized later
and initially presented at City University, London (April 1993).
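
The grain patterning outlined in the abstract can be sketched in Python. This is a minimal sketch under stated assumptions, not the ICMS implementation: the text only speaks of iterated first-order difference equations, so the logistic map, the parameter r = 3.9, the mapping from map values to onset gaps and grain choice, the 10 ms Hann-windowed grain and the 32 kHz sampling rate are all assumptions, and the function names are illustrative. The eight grain frequencies are the ones listed in the paper's Introduction.

```python
import math

# Fixed grain frequencies in Hz, as listed in the paper's Introduction.
GRAIN_FREQS_HZ = [48, 105, 232, 511, 1124, 2473, 5442, 11972]


def logistic_orbit(x0, r, n):
    """Iterate x -> r * x * (1 - x) n times and return the visited values."""
    values = []
    x = x0
    for _ in range(n):
        x = r * x * (1.0 - x)
        values.append(x)
    return values


def schedule_grains(x0=0.3, r=3.9, n=200, max_gap_s=0.02):
    """Turn each orbit value into an (onset_seconds, grain_index) pair."""
    events = []
    t = 0.0
    for x in logistic_orbit(x0, r, n):
        t += x * max_gap_s  # the map value sets the gap to the next grain
        index = min(int(x * len(GRAIN_FREQS_HZ)), len(GRAIN_FREQS_HZ) - 1)
        events.append((t, index))
    return events


def render(events, sample_rate=32000, grain_dur_s=0.01):
    """Overlap-add Hann-windowed sinusoidal grains into a mono buffer."""
    grain_len = int(grain_dur_s * sample_rate)
    total = int((events[-1][0] + grain_dur_s) * sample_rate) + 1
    out = [0.0] * total
    for onset, index in events:
        start = int(onset * sample_rate)
        freq = GRAIN_FREQS_HZ[index]
        for k in range(grain_len):
            window = 0.5 - 0.5 * math.cos(2 * math.pi * k / (grain_len - 1))
            out[start + k] += window * math.sin(2 * math.pi * freq * k / sample_rate)
    return out


events = schedule_grains()
signal = render(events)
```

Only the timing and ordering of the grains change from one run to another; frequency and amplitude stay fixed, which is the sense in which the design operates in the time domain only. Changing `x0` by a tiny amount yields a different grain pattern after a few dozen iterations.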

1. Introduction

The piece consists of four sections, each characterized by specific
timbral forms. Each section is realized through four simultaneous
processes of timbral transformation (one per channel). Underlying the
composition is a single strategy of non-deterministic sound design, which
uses nothing but 8 tiny sound grains of fixed frequency and amplitude; for
these grains, only the duration and the modes of succession and overlap in
time are determined. The entire piece was therefore constructed by working
in the time domain only (as the title hints, although in German it also
carries a strongly self-ironic connotation, since it means "work of
epochal significance"...). Although fixed throughout the piece, the
frequencies of the 8 grains have strategically important values, namely
48, 105, 232, 511, 1124, 2473, 5442 and 11972 Hz. Spectra with such
frequency content induce illusions of pitch perception. I was thus able to
double or halve the sampling rate in order to use some sound structures
more than once, the change giving rise to the perception of a difference
of a semitone rather than of an octave. (Naturally, over-sampling also
gave rise to welcome foldover phenomena.)

2. Microtemporal processes

The controls that determine the temporal relations among the sound grains
were derived from models of nonlinear dynamical systems (iteration of
first-order difference equations), whose use as a control structure for
granular synthesis I have discussed elsewhere [1,2]. In the evolution of a
nonlinear system the prediction error grows exponentially, so that it
becomes impossible to anticipate its behaviour starting from almost
identical initial states (sensitivity to initial conditions). The
self-organizing properties of such nonlinear processes, projected into the
microstructure of sound, brought forth timbral morphologies of various
kinds, some very different from one another.

To some of the sound structures thus generated I also applied
time-shifting of the sound (time dilation without change of frequency) and
temporal foldover, with two time-folding strategies: [linear] once a
maximum duration threshold is set, grains that exceed it are folded back
and remixed with the samples already generated; [nonlinear] resampling of
an existing sound file by means of a pointer driven by a Brownian walk or
by a nonlinear iteration. All operations were applied recursively,
granulating sounds that themselves consisted of countless grains. If
carried out appropriately, regranulation widens the frequency bands of the
sound being processed, causing temporary discontinuities in the signal and
hence extremely rich and complex sounds, up to noise.

3. Theory. Algorithmic composition and timbre composition

As with other recent works of mine, Zeitwerk is an example of algorithmic
composition carried out at the level of the structure of the sound
materials and, ultimately, of timbre. In other words, thinking about
materials and thinking about form become inseparable; that is, models for
generating the sound material merge with models for articulating the
musical form [3]. From this fusion, which becomes practicable through
microstructural representations of sound (granular synthesis in this case,
though others are possible), arise indeterministic models for the design
of sound material: composing timbre as a strategy of exploration and
discovery.

Each of the four sections of the work unfolds for as long as the
sound-synthesis process and the resulting evolution of the sonic matter
translate into timbral mutations and emergences of perceptual interest.
Beyond the time horizon within which the microstructure regenerates itself
into further possible forms of sound, the sonorities crystallize and
stabilize; at that point the next section of the piece begins. The closing
moments of each section are not to be regarded as points of arrival, goals
of a targeted, directed discourse, but as hypotheses of a path, points of
temporary equilibrium reached by way of morphological states that are
individually characterized, at times centrifugal, and at odds with the
formal consistency of the whole. (I believe that composing timbre
potentially leads to an experience of composing and listening that has
little of the syntactic, logical and linear articulation which much
electroacoustic music today still borrows from the paradigms of verbal
language and from the evolved forms of the Western instrumental tradition.
That the formal clarity of the whole is thus sacrificed to the
morphological richness of the sound material, that is, to the forms of the
material, cannot constitute a real problem: making possible new ways of
composing and listening, i.e. unexplored cognitive coherences, means
suspending known cognitive principles; here lies, in essence, the real
contribution of originality within reach of electroacoustic and computer
music; cf. [3, 4].)

Despite the purely sinusoidal grains used, the microtemporal processes of
Zeitwerk result in sonorities that are at times very harsh and concrete,
and that now and then turn into textures and noises of an almost
environmental character (the spatialization of the sound contributes to
this, in terms of delay and simultaneity and of the density of similar but
not identical sound events across the 4 loudspeakers). At other times, as
in the second section, the sonorities are almost instrumental, but as if
from unrecognizable, never-heard instruments.

4. Epilogue

The variety of timbral solutions responds to the attempt to free granular
synthesis and processing of sound from the technical and perceptual
solutions with which it is often associated. Dedicated sound-synthesis
procedures were therefore developed (in strictly non-real time...); these
programs, which I initially wrote on a 486 PC, were later ported and made
more efficient by G. Tisato on his ICMS at the Centro di Calcolo in Padova
(see Di Scipio & Tisato elsewhere in this volume, and [5]).

Zeitwerk was presented at the University "La Sapienza" in Rome (September
1992). The present version was completed later and was presented at City
University (London, April 1993).

References

[1] A. Di Scipio: "Composition by exploration of non-linear dynamical
systems", Proc. of the ICMC90, ICMA, pp. 324-329, 1990.
[2] A. Di Scipio: "Caos deterministico, composizione e sintesi del suono",
Atti del IX CIM, AIMI/DIST, pp. 337-350, 1991.
[3] A. Di Scipio: "Models of materials and of musical design become
inseparable. A study in composition-theory", Proc. of the International
Conference on Cognitive Musicology, Univ. of Jyväskylä, pp. 300-316, 1993.
[4] A. Di Scipio: "La musica di due culture. Tracce di una mutazione", in
C. Boschi (ed.), Musica/Scienza. Il margine sottile, ISMEZ, pp. 17-70,
1991.
[5] A. Di Scipio: "Composing with granular synthesis and the Interactive
Computer Music System", Proc. of the 4th Symposium on Arts & Technology,
Connecticut College, pp. 38-52, 1993.

Agostino Di Scipio graduated in Composition and Electronic Music from the
Conservatory of L'Aquila, where he studied with G. Bizzi, M. Cardi and
M. Lupone. He also studied with J. Dashow at the CSC of the University of
Padova. He collaborates with the CSC, Padova, and the CRM, Rome, and is a
founding member of ESCOM (European Society for the Cognitive Sciences of
Music). His music has been performed at various festivals in Italy and
abroad (Rome, Padova, Cagliari, Bologna, Amsterdam, Ghent, Lyon, Bourges,
Vancouver, London, Nuremberg). He has also presented research at many
international conferences (Glasgow, Montreal, London, Trieste, Rome, New
London, Cambridge, Jyväskylä, S. Jose, Genoa, etc.). He received an
honourable mention at Bourges 1991; in 1993 he worked at Simon Fraser
University (Burnaby, Canada) on a grant from the International Council for
Canadian Studies.
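
The remark in section 1 about doubling the sampling rate can be checked with a little arithmetic. The Python fragment below is a simplified reading, assuming that only the frequency ratios matter and ignoring foldover and the psychoacoustics of pitch; the variable names are illustrative. Consecutive grain frequencies stand in a ratio of about 2.2, so a spectrum whose components are all doubled nearly coincides with the original spectrum moved down by a small interval, which is why the change is heard as a small step rather than as an octave.

```python
import math

# Grain frequencies in Hz, as listed in section 1.
GRAIN_FREQS_HZ = [48, 105, 232, 511, 1124, 2473, 5442, 11972]

# Ratio between each grain frequency and the one below it.
ratios = [hi / lo for lo, hi in zip(GRAIN_FREQS_HZ, GRAIN_FREQS_HZ[1:])]
mean_ratio = sum(ratios) / len(ratios)

# Doubling the sampling rate on playback doubles every frequency. Each doubled
# component then lies just below the next component of the original spectrum.
doubled = [2 * f for f in GRAIN_FREQS_HZ]
offsets_semitones = [12 * math.log2(original / shifted)
                     for shifted, original in zip(doubled, GRAIN_FREQS_HZ[1:])]
mean_offset = sum(offsets_semitones) / len(offsets_semitones)
```

Under these assumptions `mean_offset` comes out between one and two equal-tempered semitones, of the order of the small step described in the text and far from the twelve semitones of an octave.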

ANIMALI IN SOFFITTA

Amedeo Gaggiolo, Silvia Dini

Jour d' atelier


Salita Brasile 13/37 Dx
16162 Genova (Italy)
fax +39 10 310466
phone. +39 10 404419
E-mail musica@ice.ge.cnr.it

The composition "Animali in Soffitta" (Animals in the Attic) consists of
four musical episodes intersected by a vocal 'parenthesis' deriving from
the processing of a sampled female voice.

The piece draws its inspiration from recesses of the mind that swarm with
fantastic creatures and which only our centuries-old imagination can
digitize in the memory and bring to life.

It is a musical portrait of unreal animals, beasts that for generations
have fuelled people's fears, anguish and curiosity.

These animals represent an ancestral presence that roams the mind without
respite, taking on multiform shapes; they are incomplete beings denied the
possibility of death, a privilege that only the living are granted.

Such is the tale of "A Bao A Qu", from an oriental legend [1], whose
unquenchable thirst for ascent and purification is embodied by a hopeless
dynamic loop.

Another creature is the "Mirmicoleone", defined by Flaubert [2] as "a lion
at the front and an ant at the back": a biological impossibility and as
such destined to perish.

Misery and sadness also mark the story of the "Squonk" [3], an
ever-weeping twilight creature thought to live in Pennsylvania; its fear
of hunters is so overwhelming that when the animal feels cornered it
resorts to "dissolving into tears".

The "Odradek", a small sinewy beast springing from the imagination of
Kafka [4], is an incredibly fast creature which wanders about people's
homes, agile and uncatchable. It is described as being "able to live in
attics, under stairs, in corridors, entrance-halls .... It may move into
the neighbouring houses but it always returns to ours".

Musical characteristics

Our composition falls within a tradition of "animals in music" which has
intrigued musicians of all periods.

What relationship may arise between music and animals? Musical literature
provides all kinds of approaches to animal representation, and "Animali in
soffitta" takes into account the descriptive techniques forming the
traditional legacy; the animals, as introduced in Borges's "Manual of
Fantastic Zoology" [5], have been musically interpreted in a
'phonosymbolic' way (onomatopoeia, kinetic illusions, echoic symbolism,
etc.) [6].

In "A Bao A Qu" the dominant feature is rhythm (fig. 1-2) or, rather, a
kind of percussive stammering in which a gong evokes the Orient and marks
a ritual stage within the search for purification leading to the summit of
the Victory Tower.

The Mirmicoleone's double nature is conveyed by a melody writhing between
low and high pitches, immersed in a hazy landscape of sand and dust
produced by pseudo-maracas (fig. 3).

A synthetic polyphonic wailing and a trickle of percussion describe the
Squonk's wretched sobbing and wandering (fig. 4).

The Odradek's frantic groove recalls a labyrinthine video game, or a tap
dance à la Fred Astaire, interposed with puffs, breathing, exclamations
and spurts which, considering the animal's nature, can only be metallic in
timbre (fig. 5).

"Rafel mai amech zabi almi" [7] is Dante's linguistic enigma, which
somehow enwraps all four fantastic animals. The indecipherable voice is
the attic - it does not reveal anything of the creatures it houses but
echoes the darkness and abyss of being (fig. 6).

References

[1] A. Cesaro, C. Pansera, U. Rizzitano, V. Vacca ed.: "Le mille e una
notte", Einaudi, 1975.

[2] G. Flaubert: "La tentazione di Sant'Antonio", in "Romanzi e racconti",
Mursia, 1961.

[3] W. T. Cox: "Fearsome creatures of the lumberwoods", Washington Press,
1910.

[4] F. Kafka: "Tutti i racconti", Arnoldo Mondadori, 1979.

[5] J. L. Borges: "Manuale di zoologia fantastica", Einaudi, 1962.

[6] C. Cano: "Simboli sonori", Franco Angeli, 1985.

[7] D. Alighieri: "La Divina Commedia", Inferno, XXXI, 67, La Nuova
Italia, 1968.

Fig. 1 A Bao A Qu's steps beat to the rhythm of the percussion.

Fig. 2 Representation of A Bao A Qu's swaying motion (axis label: Tempo).

Fig. 3 The Mirmicoleone's dual melodic anatomy.

Fig. 4 A few strokes on the synthesizer describe the Squonk's tears.

Fig. 5 The sampled tap-dance groove adds ironic colour to the Odradek's
existence (score parts: Tip-tap groove, Metal Percussions, Breath
(sample)).

Fig. 6 Melodic frame confining Dante's phrase.

METAFONIE
Francesco Galante

V. Taranto, 178
00182 Roma
Tel. +39 6 7027370

The title of the piece may suggest several kinds of meaning. In this case
the term Metafonie expresses my interest in various kinds of perceptual
ambiguity, which I have explored in my recent musical works.

The aim is to lead the listener through changing conditions, starting from
a collection of simple sound objects.

Consequently the central point of the piece is to obtain different and
contrasting perceptual phenomena, colours and ambiguous soundscapes.

I used heavy and smooth sounds, simple intervals and sound "clots",
dramatic and ecstatic characters, purely electronic sounds and electronic
sounds that simulate instrumental timbres, and so on.

In particular these virtual instruments are synthesized using FM and then
processed with classic techniques of electroacoustic music such as
amplitude and ring modulation, delay, reverberation, etc.

Metafonie is a tape-only piece. It was realized at the author's studio in
Rome in the first quarter of 1993, using two Yamaha TX81Z FM expanders
controlled by an Atari 1040 computer, a TMS320 DSP audio board controlled
by a 286 PC, and a Yamaha SPX90-II DSP unit.

The software is based on a set of customized routines that generate
pseudo-random micro-structures, which I used as objects to be manipulated
in the Cubase MIDI software. For the TMS320 DSP board I used customized
TI320 assembler programs.

Metafonie is dedicated to Luigi Pestalozza.

Il titolo può sottoindicare una molteplice quantità di significati,
suggeriti dalle sue implicazioni sia in campo linguistico che scientifico.
In questo caso il termine Metafonie ben rappresenta un problema che ha
caratterizzato la mia scrittura musicale di questi ultimi anni, vale a
dire condurre

418
l'ascolto verso stati di diversa tra- fonici; carattere drammatico ed
sformazione della materia acu- estatico, sintesi del suono elet-
stica; considerare il movimento tronico puro e sintesi del suono
all'interno del suono, come aspet- che tenta una mimesi del suono
to centrale del progetto musicale strumentale.
e sonoro, e quindi della sua for- Tutti questi elementi sono sta-
ma. ti presi in considerazione duran-
Di conseguenza Metafonie si te la stesura e la realizzazione del
avvale di quelIe tensioni ed insta- progetto.
bilita che un materiale, opportu- Gli "strumenti virtuali", suoni
namente progettato, eingrado di di pseudo-ottoni ed archi, pro-
produrre nel muoversi da uno dotti dai sintetizzatori in FM ven-
status spettrale e percettivo ad gono utilizzati con una forte ca-
un altro, per fissare l'attenzione ratterizzazione ambigua, co-
su istanti di suono, ovvero sulle struendo un intreccio dialogico
distanze tra i diversi punti di tra- in cui coesistono figurazioni or-
sformazione del suono. chestrali che appartengono a ten-
II brano si articola su 3 percorsi denze linguistiche avanzate, COS!
che si intrecciano, iquali progres- come procedimenti di elabora-
sivamente sviluppano Ie parti zione massiva degli strumenti vir-
"metafoniche" appunto dei ma- tuali attraverso processamenti
teriali utilizzati. elettronici classici (ringmodula-
In un movimento sempre pili zione, reverberazione).
teso ed evidente che conduce Un insieme di metafore sono-
l'ascolto da un riconoscibile am- re quindi che in altra maniera
biente sonoro fortemente "stru- traduce ilrapporto strumenti con-
mentale", non soltanto nel senso creti/elaborazione elettroacusti-
puramente timbrico rna anche ca tipico di una certa prassi musi-
della scrittura, ad un progressivo cale, in quello di strumenti digi-
tentativo di raggiungere spazi tali midi!elaborazione elettroa-
acustici inusuali, opposti, contra- custica.
statio Nella costruzione del materia-
METAFONIE e costruito sui Ie sono stati adottati procedimenti
contrasti, suI passaggio continuo di generazione pseudo-random
tra i caratteri contrastanti od di microsequenze, in cui sono sta-
ambigui dei materiali. ti definiti gli ambiti parametrici
Suoni grezzi, metallici; uso di re- di accadimento degli eventi.
gistri gravi, acuti ed acutissimi; Le microsequenze sono state
contrapposizione di semplici re- memorizzate corne tracce midi
lazioni intervallari con "grumi" all'interno di un ambiente di la-

419
voro CUBASE per Atari, in que- La tessitura frequenziale esta-
sto modo esse sono state succes- ta centrata su poche altezze, spes-
sivamente trattate ed organizza- so in distanza di banda critica, ed
te come oggetti, rendendo possi- un certo numero di loro microva-
bile la. costruzione nello spazio rianti (fino a 32 cent dal suono
musicale di tessiture sonore va- base), distribuite su tutte Ie otta-
riabili in velocita e collocazione ve disponibili per ciascun sinte-
frequenziale, e nella verticalita tizzatore digitale e, durante la
possibile data dalle potenzialita fase di assemblaggio, su differen-
degli strumenti MIDI impiegati. ti spostamenti di registro.

Una metodica che ha contri- L'algoritmo di sintesi utilizza-


buito fortemente alIa sperimen- to eun modello di strumento FM
tazione di differenti complessita a 4 operatori, di tipo 4, presente
timbriche e dinamiche. nel TX81Z, opportunamente
In particolare, si e adottato un modificato nei rapporti di fre-
criterio di generazione continua quenza tra gli operatori e nei ri-
a "burst" delle note, in modo tale spettivi inviluppi.
da ottenere una mutazione dina- I materiali di sintesi, cos1 otte-
mica del materiale prodotto dai 2 nuti vengono successivamente
sintetizzatori digitali, sfruttando elaborati attraverso una scheda
sempre il massimo delle risorse DSP non commerciale, basata suI
disponibili. TMS320, sulla quale si e imple-
Questa tecnica produce all'a- mentato un programma di simu-
scolto dei microistanti di instabi- lazione di ringmodulazione e
lita percettiva, rumori dovuti alIa modulazione di ampiezza, con
continua sostituzione di un even- oscillatori ed inviluppi digitali di
to con l'altro. Controllando la controllo..
velocita con cui attivare questo La ricostruzione in chiave
tipo di generazione e il tempo di audionumerica di questi strumen-
sostituzione degli eventi ed inte- ti di elaborazione tipici del sinte-
grando mediante complessi fil- tizzatore analogico, hanno favo-
traggi digitali disponili in alcuni rito un comportamento ed un
algoritmi del DSP SPX90, si e rapporto con il suono sicuramen-
potuto costruire degli spettri acu- te molto stimolante, rna ha signi-
stici con diversa distribuzione ficato anche progettare un rap-
dell'energia formantica, partico- porto non convenzionale con gli
larmente utili nella progettazio- strumenti midi, che spesso ricon-
ne del suono pseudo-strumenta- ducono il compositore al proble-
Ie. ma di come stravolgere tali mezzi

420
tecnologici, apparentemente merciale, basata suI TMS320
"aperti", COS! come e gi~l awenuto (conversione DjA e AjD 16 bit)
per gli strumenti tradizionali. controllata da un 286PC, un
METAFONIE e stato compo- processore digitale di segnali
sto nel periodo Gennaio-Aprile SPX90-2 Yamaha.
1993 nella studio dell'autore, La sua esecuzione e esc1usiva-
mediante 2 FM expanderTX81Z mente per nastro magnetico.
Yamaha controllati da Atari 1040 METAFONIE e dedicato a
pc, una scheda DSP non com- Luigi Pestalozza.
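The pseudo-random generation of micro-sequences within defined parameter ranges, with micro-variants of up to 32 cents from a base pitch, can be illustrated with a minimal sketch. This is not the author's customized routine, which is not documented in the text. The function names, the duration range, the octave range and the symmetric detuning around the base pitch are all assumptions made for illustration.

```python
import random


def cents_to_ratio(cents):
    """Convert an interval in cents to a frequency ratio."""
    return 2.0 ** (cents / 1200.0)


def micro_sequence(base_hz, n_events, max_detune_cents=32.0,
                   dur_range=(0.02, 0.12), octave_range=(-2, 2), seed=None):
    """Generate a list of short events around one base pitch.

    Every parameter is drawn from a fixed range (hypothetical values):
    detuning within +/- max_detune_cents, an octave shift, and a duration.
    """
    rng = random.Random(seed)
    events = []
    onset = 0.0
    for _ in range(n_events):
        detune = rng.uniform(-max_detune_cents, max_detune_cents)
        octave = rng.randint(octave_range[0], octave_range[1])
        freq = base_hz * (2.0 ** octave) * cents_to_ratio(detune)
        dur = rng.uniform(dur_range[0], dur_range[1])
        events.append({"onset": onset, "dur": dur, "freq": freq,
                       "detune_cents": detune, "octave": octave})
        onset += dur  # events follow one another without gaps
    return events
```

A call such as `micro_sequence(220.0, 16, seed=1)` yields sixteen consecutive events that could then be written out as MIDI data and handled as an object in a sequencer.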

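The ring modulation and amplitude modulation applied to the synthesized material in Metafonie can be sketched in a few lines. This is only an illustration of the two classic techniques, not the TMS320 assembler program described by the author. The function names and the normalization used for amplitude modulation are assumptions.

```python
import math


def sine(freq_hz, n_samples, sample_rate):
    """Return n_samples of a unit-amplitude sine wave."""
    return [math.sin(2.0 * math.pi * freq_hz * i / sample_rate)
            for i in range(n_samples)]


def ring_modulate(carrier, modulator):
    """Ring modulation: sample-by-sample product of the two signals."""
    return [c * m for c, m in zip(carrier, modulator)]


def amplitude_modulate(carrier, modulator, depth=0.5):
    """Amplitude modulation: the modulator scales the carrier around a
    constant level; the result is normalized to stay within [-1, 1]."""
    return [c * (1.0 + depth * m) / (1.0 + depth)
            for c, m in zip(carrier, modulator)]
```

For two sine waves, ring modulation leaves only the sum and difference frequencies, whereas amplitude modulation also keeps the carrier itself.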
SOME REFLECTIONS ON THE
ELECTROACOUSTIC PIECE CHROMATISM
Francesco Giomi
Divisione Musicologica CNUCE/C.N.R.
Conservatorio di Musica di Firenze
P.zza Belle Arti, 2
I-50122 Firenze (Italy)
fax: +39-55-2396785
E-mail: art@ifiidg.bitnet

Abstract

The computer music piece Chromatism takes as its starting point the author's research on the analysis of electroacoustic music, based on the concept of the sound object. It also integrates some experience in the field of music and image interaction.
Chromatism comprises six studies. At first the studies were meant to represent six of the twelve colors of the chromatic disk, in order to recreate, at an auditory level, a sort of sound colors. During composition, however, the piece partially lost this component in favour of the creation of complete narrative structures developed inside each of the six fragments. The author tried to link the individual narrative paths through an overall structure comprising the six studies.
As far as the sound material is concerned, each fragment considers particular aspects of the electronic sound world, such as the alternation between sound and noise, the rhythm caused by partially random parameters and the environmental characteristics of certain timbral/harmonic textures. Many of the musical elements are repeated and amplified from one piece to another in order to create timbral, as well as structural, bonds between the six fragments. The work contains both electronic sounds and sampled acoustic instruments. The former are used to emphasize the timbral aspects, introducing compound objects characterized by a tonic and/or complex mass. Sampled acoustic instruments were used to create composite events, formed by a rhythmic assemblage of simple objects or by sound "groups" with a rhythmic function.
Chromatism was realized at the Musicological Department of CNUCE/C.N.R. (Conservatory of Music of Florence) with the automatic composition software Teletau, Yamaha TX81Z synthesizers and the Roland S550 sampler.

L'art des sons fixés

A recent book by Michel Chion [1] provides, besides the title of this section, the occasion for some brief preliminary reflections on electroacoustic music for tape, to which the piece belongs. The French composer stresses the validity of a music designed, composed and fixed exclusively on magnetic tape. Chion defines it as the "art of fixed sounds", meaning a music in which the composer works as a craftsman with sounds rather than with signs, in a constant alternation between making and listening. The sounds gradually change from material into the very object of the composition, and even the operation of fixing, the recording, becomes an artistic moment.
The frequent objections about the lack of a performer's role in electroacoustic music can be overcome, first of all, by noting that forms of interpretation are also present in the perceptual processes tied to listening [2]. Secondly, they can be overcome by developing, for concert performance, new forms of musical diffusion tied above all to the design of dedicated listening halls, a path already taken by some centres in both Europe and North America.
Starting from these and other concepts, such as those set out by Pierre Schaeffer as early as 1966 in his treatise [3], a good deal of research is currently being carried out on the analysis of electroacoustic music. This research seeks to formulate criteria for identifying and classifying the sound objects present in a piece, as well as for discovering how they are formally organized and what their esthesic functions are.
The musical work Chromatism originates from some of this research carried out by the author [4], and also tries to integrate some experience in the field of sound-image interaction [5]. The piece is not meant to be a catalogue of sounds, still less an exhaustive summa of this research: it only gathers some ideas and suggestions that arose during the work on analysis.

Six studies for magnetic tape

Chromatism consists of six distinct studies for magnetic tape of about two minutes each. Initially the studies were meant to "represent" musically six of the twelve fundamental colours of the chromatic disk, so as to recreate, at the level of sound, some of the impressions that the relationship between sounds and colours had suggested to the author.
During its making the piece partially lost these connotations in favour of the creation of simple narrative structures, developed within each of its constituent elements and linked to one another by a unitary musical path that runs through the entire structure of the work. There is in fact a sort of "chromatic continuum" among the six pieces that make up the whole work; although each has its own autonomy, they should be listened to consecutively. The aim was to bring out, at the perceptual level, the presence of an overall formal unity, which care was also taken to echo within each single sound fragment.
Each of the six studies addresses particular aspects of the sound universe: the relationship between sound and noise (the first and the last); certain rhythmic or arrhythmic structures (the fourth and the fifth, and in part the sixth) obtained through the use of random parameters; the harmonic "spatiality" of certain timbral textures (the second and the third); the phenomenon of beats (again the fourth); and so on. Still in the interest of a certain continuity of the sound fabric, many musical elements (timbral first of all, but also rhythmic and frequency-related) are repeated, merged and amplified from one study to the next.
As for the basic sound material, the studies contain both electronic sounds and sampled acoustic instruments. For the former, interest centred on the timbral aspect, with a search for objects that are often compound and characterized mainly by tonic and/or complex mass [3][6]. The sampled acoustic instruments were used to create composite sound events, formed by the rhythmic assemblage of several simple objects or of sound "groups" with a rhythmic function.
Chromatism was composed in 1992 in the studio of the Divisione Musicologica of CNUCE/C.N.R. at the Conservatorio di Musica di Firenze. The automatic composition software Teletau was used for the formal organization, generation and management of the sound events, while the timbral programming was carried out on two Yamaha TX81Z synthesizers and a Roland S550 sampler.
The electroacoustic piece had its first Italian performance in December 1992 during the G.A.M.O. Festival, at the Conservatorio di Musica "L. Cherubini" in Florence. It was subsequently performed during the 23rd Festival International de Musique Expérimentale Synthèse 93 in Bourges, in June 1993.

Bibliography

[1] M. Chion: "L'art des sons fixés", Ed. Metamkine/Nota-Bene/Sono-Concept, 1991.
[2] F. Giomi, M. Ligabue: "Modalities of Signification in Contemporary Music: A Proposal for an Analytical System", Contemporary Music Review, forthcoming.
[3] P. Schaeffer: "Traité des objets musicaux", Ed. du Seuil, 1966.
[4] F. Giomi, M. Ligabue: "Un approccio estesico-cognitivo alla descrizione dell'objet sonore", Atti del II Convegno Europeo di Analisi Musicale (R. Dalmonte, M. Baroni eds.), Università di Trento, 1992.
[5] M. Aitiani, F. Giomi: "The artwork Nave di Luce: A Journey into Telematics, Art and Music", Leonardo, Vol. 24, N. 2, 1991.
[6] M. Chion: "Guide des objets sonores", INA-GRM/Buchet-Chastel, 1983.

WERVELWIND
David Keane

School of Music
Queen's University
Kingston, Canada K7L 3N6
FAX: 1-416-265-6823
E-mail: keane@qucdn.queensu.ca

Introduction.

Wervelwind (1991) is a work for solo trombone and prepared tape in which the trombonist performs while continuously turning. The turning creates both visual and aural sensations of whirling while it creates an effective dramatic tension through the focused energy of the whirling gesture. Wervelwind is the Flemish word for "whirlwind."

Origin of the work.

Leo Verheyen commissioned Wervelwind with the assistance of a Canada Council Commissioning Grant. Verheyen, one of the best known former students of Vinko Globokar, is himself a well-known proponent of experimental music and is responsible for contemporary music at the Antwerp (Belgium) Conservatory.
Leo requested a work for tape and trombone with a substantial amount of improvisation for the soloist. Consequently, I created an accompaniment generated through emulation of the rich variety of timbres that Leo could produce on his instrument. The structure of the piece is based on a slow evolution from quiet, air sounds through inharmonic timbres to the more conventional harmonic timbres. The timbre is specified in the score and the player is asked to interact improvisationally with the gestures that are on the tape. At certain points, however, the solo part is fully notated.

Spatialization.

The conspicuously distinguishing feature of Wervelwind is the incorporation of spatial location as a major component of the solo instrument's portion of the performance. Throughout the work, the trombonist rotates so that the bell of the instrument radiates out over 360 degrees while the player is the axis of the rotation.
The use of spatialization as an important parameter in composition has been a major interest of mine for several decades. In addition to a number of works that make use of antiphonal forces, I have written quite a few works locating performers in a circle around the audience [for example, Round Dance (1974) for wind ensemble, Tondo (1976) and Orbis (1981) for string orchestra, Corona (1978) for SATB chorus and orchestra, and Henge (1978) for trombone choir]; called for specialized treatments in solo works [for example, Hornbeam (1979) for horn equipped with tubing to direct sound to various locations and Carmina Tenebrarum (1983), an opera for soprano in which the performer rotates in a manner similar to that of the trombonist in Wervelwind]; and have exploited space in the larger "soundscape" of the out-of-doors [for example, Sea-Nauta (1984) for 8 ships in St. John's, Nfld. harbour and Sound Lodge, a computer-based installation for parks].
In Wervelwind, I conceived the rotation as a dominant feature of the musical structure. Leo and I considered motorizing the rotation, but we deemed it safer and more practical for the performer to rotate on his/her feet. This would both offer greater ease of control in altering rotation speed and would provide a more secure basis for the performer to maintain his/her balance, a critical consideration in this case.
Given that it is not practical to read music on a music stand when rotating, the improvisational aspect of the work proved to be especially advantageous. However, the performer makes use of a condensed summary of the instructions for the work on a small card placed on a lyre. The summary contains an outline of the sequence of events along with timing and tape-cue information. The summary card not only provides an aid to memory, but offers a constant place for the performer to focus his/her eyes as an aid to avoiding dizziness and disorientation while turning.
The sonic result of the rotation is what might be thought of as a slow Doppler effect, although the slow velocity and small distance covered create very little, if any, actual pitch shift. However, because the bell moves laterally and at the same time alternately aims away from and directly at the audience, there are marked cycling amplitude and timbral shifts that strongly reinforce the rotation in the auditory domain to create an effect like a human "Leslie speaker".
It is worth mentioning here, also, that the rotation of the trombone provides an effective visual reinforcement for the auditory cycle. Because little light is necessary for the performer to see music, the performance can be lit with a single spot light placed immediately over the centre point of the rotation. When the performer stands relatively close to the back wall of the stage, the reflections of the light off the rotating trombone offer flowing reflections dancing around the player as an effective visual counterpart to the music.
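The remark that the rotation produces very little actual pitch shift can be checked with a rough estimate. The sketch below assumes a distant listener, a bell travelling on a circle of hypothetical radius 1 m, a hypothetical rotation period of 4 s, and a speed of sound of 343 m/s. None of these values comes from the text.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value


def peak_doppler_cents(radius_m, period_s, c=SPEED_OF_SOUND):
    """Largest upward pitch shift, in cents, heard by a distant listener
    when a source moves on a circle of the given radius and period."""
    v = 2.0 * math.pi * radius_m / period_s  # tangential speed of the bell
    ratio = c / (c - v)                      # source moving straight at the listener
    return 1200.0 * math.log2(ratio)


print(peak_doppler_cents(1.0, 4.0))
```

Under these assumptions the peak shift is well under a tenth of a semitone, which is consistent with the author's observation that the amplitude and timbral cycling, rather than pitch, carries the effect.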

The Accompaniment.

The tape portion of Wervelwind was assembled using C-Lab Notator 3.1 software running on an Atari Mega ST. All sounds were generated using frequency modulation synthesis on Yamaha tone modules (1-TX802, 1-TX81Z and 3-TX7). These instruments were combined and processed using a Yamaha DMP7 digital mixer, and on-board signal processing supplemented by a Roland DEP5.
Plausible trombone timbres were generated through (1) careful tailoring of voices, (2) layering of a variety of trombone patches and carefully varying the velocity of each, (3) elaborate control of tuning, and (4) use of a great variety of multiple-voice groups.

Integration of solo and accompaniment.

To avoid overwhelming the live performer, I avoided creating rotation effects in the accompaniment, although I make extensive use of the two speaker locations as distinct polarities. Ideally, the speakers are placed just safely beyond the circumference of the trombonist's rotation and about a metre further from the audience than the lateral diameter of the rotation. This effectively melds the solo and tape sounds and, at the same time, offers the performer excellent conditions for monitoring the accompaniment and making judgements about balance with the accompaniment.
Therefore, there are effectively four important spatial foci: the LEFT and RIGHT speakers (points C and D in Figure 1), the most FORWARD position of the trombone (point A), and the most rearward position (point B). Position B creates the darkest, most subdued timbre, while position A has the greatest presence. The faster the performer's rotation, the more marked is this timbral contrast. Rotations vary (at the discretion of the performer) from about 20 rotations-per-second to a single rotation-per-second.

Figure 1 Performance Configuration (diagram labels: speakers C and D, trajectory of trombone bell, point B, audience).

NOTE: a recorded performance of WERVELWIND by Leo Verheyen is available on the compact disc DAVID KEANE: DIALOGICS [QUM 9301], from the Canadian Music Centre Distribution Service, 20 St. Joseph St., Toronto, Canada, M4Y 1J9.

A COMPOSITION FOR CLARINET AND
REAL-TIME SIGNAL PROCESSING:
USING MAX ON THE IRCAM SIGNAL
PROCESSING WORKSTATION
Cort Lippe

IRCAM, 31 rue St-Merri, Paris, 75004, France


email: lippe@ircam.fr

Introduction.

The composition Music for Clarinet and ISPW, by the author, was created using the IRCAM Signal Processing Workstation (ISPW) and the software Max. The piece was commissioned by the Center for Computer Music & Music Technology, Kunitachi College of Music, Tokyo and realized at IRCAM during 1991, and at the Kunitachi College of Music during a composer-in-residency, 1991-92.
From 1988-1991, IRCAM developed a real-time digital processing system, the IRCAM Signal Processing Workstation (ISPW) [1]. Miller Puckette has developed a version of Max for the ISPW that includes signal processing objects, in addition to many of the standard objects found in the Macintosh version of Max [2] [3]. Currently, there are over 40 signal processing objects in Max. Objects exist for most standard signal processing tasks, including: filtering, sampling, pitch tracking, threshold detection, direct-to-disk, delay lines, FFTs, etc. With the ISPW version of Max, the flexibility with which one creates control patches in the original Macintosh version of Max is carried over into the domain of signal processing.

Prototyping Environment.

The ability to test and develop ideas interactively plays an important role in musical applications. Because of its single architecture, the ISPW is a powerful prototyping and production environment for musical composition [4]. Prototyping in a computer music environment often combines musical elements which traditionally have fallen into the categories of "orchestra" (sound generators) or "score" (control of sound generators). Mainly due to computational limitations, real-time computer music environments have traditionally placed hardware boundaries between "orchestra" and "score": the sound generation is done on one machine while the control is done remotely from another. When developing a synthesis algorithm which makes extensive use of real-time control, it is extremely helpful, if not essential, to develop the synthesis algorithm and the control software together. This is greatly facilitated when sound generation and control run on the same machine and in the same environment.

Control and Signal Processing.

Real-time signal analysis of instruments for the extraction of musical parameters gives composers useful information about what an instrumentalist is doing. One of the signal processing objects found in Max offers rapid and accurate pitch-detection. In Music for Clarinet and ISPW, the incoming clarinet signal is converted via an analog-to-digital converter and analyzed by this pitch-detection algorithm. The pitch tracker outputs MIDI-style pitches which are sent to a score follower [5] (using the explode object [6]). As the score follower advances, it triggers the "electronic score" which is stored in event lists. The event lists directly control the signal processing modules. In parallel, compositional algorithms also control the signal processing. These compositional algorithms are themselves controlled by the information extracted from the clarinet input. Thus, the raw clarinet signal, its envelope, continuous pitch information from the pitch detector, the direct output of the score follower, and the electronic score all contribute to control of the compositional algorithms employed in the piece (see Figure 1).

Figure 1. Control and signal processing flow (diagram labels: analog-to-digital converter, compositional algorithms, signal processing modules, digital-to-analog converters, sound distribution).

The signal processing used in Music for Clarinet and ISPW includes several standard signal processing modules: reverb, delay, harmonizing, flanging, frequency shifting, spatializing, and frequency/amplitude modulation. Several non-standard sampling techniques are used also, including a time-stretching algorithm, developed by Puckette, which allows for the separation of sample transposition and sample duration. Thus, one can slow down a sample playback while maintaining the original pitch, or change the pitch of a sample playback without changing its duration. Another sampling technique, a kind of "granular" sampling developed from techniques described by Xenakis [7] and Roads [8] for sound synthesis, is also used. Ten-second sound samples can be played back in a variety of ways and orderings, taking approximately 20-millisecond "sound grains" of the sample at a time. (All of the samples are made up of clarinet phrases sampled in real-time during the performance of the piece.) Finally, using an automated signal crossbar (similar to a studio patch-bay) to connect modules to each other, signals can be sent from the output of practically every module to the input of every other module. This signal crossbar maximizes the number of possible signal paths and allows for greater flexibility when using a limited number of signal processing modules [9] (see Figure 2).

Figure 2. Crossbar of interconnections among signal processing modules (rows and columns: reverb, frequency shifter, harmonizer, noise modulation, samplers, filters, spatializer).

Real-time Continuous Control Parameters

Real-time audio signal analysis of acoustic instruments, for the extraction of continuous control signals that carry musically expressive information, can be used to drive signal processing and sound generating modules, and can ultimately provide an instrumentalist with a high degree of expressive control over an electronic score [10]. In the frequency domain, pitch tracking can be used to determine the stability of pitch on a continuous basis for recognition of pitch-bend, portamento, glissando, trill, tremolo, etc. In the amplitude domain, envelope following of the continuous dynamic envelope for articulation
detection enables one to determine flutter-tongue, staccato, legato, sforzando, crescendo, etc. In the spectral domain, FFTs, pitch tracking, and filtering can be used to track continuous changes in the spectral content of sounds for detection of multiphonics, inharmonic/harmonic ratios, timbral brightness, etc. High-level event detection combining the analyses of frequency, amplitude, and spectral domains can provide rich control signals that reflect subtle changes found in the input signal.

The Musician's Role.

The dynamic relationship between performer and musical material, as expressed in the musical interpretation, can become an important aspect of the man/machine interface for the composer and performer, as well as for the listener, in an environment where musical expression is used to control an electronic score. The richness of compositional information useful to the composer is obvious in this domain, but other important aspects exist: compositions can be fine-tuned to individual performing characteristics of different musicians, intimacy between performer and machine can become a factor, and performers can readily sense consequences of their performance and their musical interpretation.

References.

[1] E. Lindemann, M. Starkier, and F. Dechelle. "The IRCAM Musical Workstation: Hardware Overview and Signal Processing Features." In S. Arnold and G. Hair, eds. Proceedings of the 1990 International Computer Music Conference. San Francisco: International Computer Music Association, 1990.
[2] M. Puckette. "The Patcher." In C. Lischka and J. Fritsch, eds. Proceedings of the 1988 International Computer Music Conference. San Francisco: International Computer Music Association, 1988.
[3] M. Puckette. "Combining Event and Signal Processing in the Max Graphical Programming Environment." Computer Music Journal 15(3):68-77, 1991.
[4] C. Lippe et al. "The IRCAM Musical Workstation: A Prototyping and Production Tool for Real-Time Computer Music." Proceedings, 9th Italian Colloquium of Computer Music, 1991, Genoa.
[5] M. Puckette. "EXPLODE: A User Interface for Sequencing and Score Following." In S. Arnold and G. Hair, eds. Proceedings of the 1990 International Computer Music Conference. San Francisco: International Computer Music Association, 1990.
[6] M. Puckette and C. Lippe. "Score Following in Practice." In Proceedings of the 1992 International Computer Music Conference. San Francisco: International Computer Music Association, 1992.
[7] I. Xenakis. Formalized Music. Bloomington: Indiana University Press, 1971 (Pendragon, 1991).
[8] C. Roads. "Automated Granular Synthesis of Sound." Computer Music Journal 2(2):61-62, 1978.
[9] C. Lippe and M. Puckette. "Musical Performance Using the IRCAM Workstation." In B. Alphonce and B. Pennycook, eds. Proceedings of the 1991 International Computer Music Conference. San Francisco: International Computer Music Association, 1991.
[10] D. Wessel, D. Bristow and Z. Settel. "Control of Phrasing and Articulation in Synthesis." Proceedings of the 1987 International Computer Music Conference. San Francisco: International Computer Music Association, 1987.
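The "granular" sampling described in the paper, in which roughly 20-millisecond grains of a stored sample are played back in various orderings, can be illustrated with a minimal offline sketch. This is not the ISPW implementation. The function names, the three ordering modes and the triangular window are assumptions chosen for brevity.

```python
import random


def grain_starts(n_samples, grain_len, n_grains, order="random", seed=None):
    """Choose the start index of each grain inside the source sample."""
    n_slots = n_samples // grain_len
    if n_slots == 0:
        raise ValueError("sample is shorter than one grain")
    if order == "forward":
        slots = [i % n_slots for i in range(n_grains)]
    elif order == "reverse":
        slots = [(n_slots - 1 - i) % n_slots for i in range(n_grains)]
    elif order == "random":
        rng = random.Random(seed)
        slots = [rng.randrange(n_slots) for _ in range(n_grains)]
    else:
        raise ValueError("unknown order: " + order)
    return [slot * grain_len for slot in slots]


def granular_playback(sample, sample_rate, grain_ms=20.0, n_grains=50,
                      order="random", seed=None):
    """Concatenate windowed grains taken from `sample` in the chosen order."""
    grain_len = int(sample_rate * grain_ms / 1000.0)
    starts = grain_starts(len(sample), grain_len, n_grains, order, seed)
    out = []
    for start in starts:
        for i in range(grain_len):
            # Triangular window: zero at both edges to avoid clicks.
            window = 1.0 - abs(2.0 * i / (grain_len - 1) - 1.0)
            out.append(sample[start + i] * window)
    return out
```

In a real-time setting the grains would normally overlap and be scheduled sample by sample. The sketch only shows how grain position and ordering are separated from the stored material.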

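The chain described in the paper, in which a pitch tracker outputs MIDI-style pitches that drive a score follower, which in turn triggers the stored electronic score, can be sketched in a very reduced form. The code below is not the Max pitch tracker or the explode-based follower. It assumes equal temperament with A4 = 440 Hz and uses a deliberately naive follower that only advances on an exact match with the next expected note. All names are hypothetical.

```python
import math


def freq_to_midi(freq_hz, a4_hz=440.0):
    """Convert a frequency in Hz to a (fractional) MIDI note number."""
    return 69.0 + 12.0 * math.log2(freq_hz / a4_hz)


def follow(score, detected_notes, events):
    """Naive score follower.

    score          -- expected MIDI notes, in order
    detected_notes -- notes reported by the pitch tracker
    events         -- one event per score note (the "electronic score")
    Returns the events triggered, in the order they were reached.
    """
    position = 0
    triggered = []
    for note in detected_notes:
        if position < len(score) and note == score[position]:
            triggered.append(events[position])
            position += 1  # wait for the next expected note
    return triggered
```

A practical follower has to tolerate wrong, missing and extra notes. The sketch shows only the basic idea that matching the performance against the score advances a pointer into an event list.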
IHADA
FOR TENOR SAXOPHONE AND
SYNTHETIC SOUNDS
Matteo Pennese

AGON <Acustica Informatica Musica>


Piazzale Egeo, 5 - 20126 Milano
Tel. 02-64429289
Fax 02-64422724

Abstract

Through numerical treatment, the composer can create sound itself.
But can timbre be considered, to use a term dear to serialism, a parameter?
In other words, does timbre have the capacity to take part in musical form and, if so, to what degree?
Starting from McAdams's classification of the attributes of form-bearing elements, ihada represents an effort to understand the limits of timbre as a compositional element. The aim was to create a timbre model, regarded as the "heart" of the material, and then to create states of tension or relaxation through its distortion.

Premise

Can timbre be considered, to use a term dear to serial music, a parameter?
In other words, does timbre have the capacity to contribute to musical form, understood as what is perceived and not necessarily as the structure imagined by the composer?
And if the answer proves to be affirmative, to what degree?
Stephen McAdams [1] proposes identifying the so-called "form-bearing elements" on the basis of six criteria, summarized below:
1. differentiation into discrete perceptual categories;
2. organization of the perceptual categories such that the relations among them are functional in nature;
3. the capacity of differences in nature and strength among the functional relations to create states of tension and relaxation;
4. the distinctive qualities of a category and the qualities of the relations among categories, that is, the way the latter connect with one another, as attractors of attention;
5. the ability of the categories, the functional relations and the ordering of a classification system to be learned by a listener;
6. mantenimento di un certo
grado di invarianza delle rela-

433
zioni fra categorie attraverso Ie Prima dei modelli, delle gerar-
differenti classi di trasforma- chie, delle strutture, delle forme
ZlOne. - in altre parole prima della
Ma se ci rivolgiamo al timbro, cultura - esistono delle diffe-
gia al primo punto della sopra renze. Vi sono realta acustiche
esposta classificazione incon- ineludibili: i suoni con spettro
triamo grossi ostacoli. armonico sono piu distensivi di
Se altezza e timbro sono attributi quelli a spettro inarmonico, cosi
di un medesimo segnale acustico, come un' ottava sara sempre di-
tuttavia vi sono fra loro sostan- versa da una terza minore e, an-
ziali differenze. cora, un' ottava non accordata
Da un lato il "continuum" delle produce il fenomeno dei batti-
altezze ci porta ad una categoriz- menti, venendo percepito come
zazione discreta; d'altro lato al- intervallo piu "rugoso" rispetto
trettanto non puo dirsi del tim- ad un' ottava accordata (i batti-
bro. Ulteriore fattore che impe- menti e la rugosita sono dovuti
disce la discretizzazione del tim- ad un' interferenza, nell' orec-
bro e costituito dalla sua multi- chio interno, d' una medesima
dimensionalita, ovvero l' impos- banda critica, Pomp [3]).
sibilita di definire il timbro sulla In campo strumentale, un suono
base di categorie riconoscibili di violino risulta piu "brillante"
con precisione nei piu svariati di quello di una viola suonata
contesti. Cio comporta l' assenza ana medesima altezza ed intensita
della relazione di ottava, fattore (la "brillantezza" riproduce, a li-
che facilita l' apprendimento e la vello di percezione, la riparti-
memorizzazione. zione dell' energia sullo spettro,
Insomma, se altezze, intensita e Wessel [4]).
durate sono misurabili rispetti- Possiamo individuare, quindi,
vamente in Hertz, millisecondi e due livelli: nel primo parliamo
livello d' ampiezza, il timbro 10 di grado di tensione, nel se-
si puo, tutt' al piu, identificare condo, invece, prendiamo in
nel complesso delle tre compo- considerazione la dissonanza e la
nenti citate. II timbro non e un consonanza, e da qui inizia la
componente ma un composto. cultura.
E la molteplicita degli elementi
complica considerevolmente il ihada : obbiettivo e forma
trattamento dei fenomeni percet-
tivi (Miller 1952 [2]). In base a queste premesse, dun-
Vi e, tuttavia, una chiave di que, il mio primo obbiettivo
volta. L' universo timbrico non compositivo dei suoni di sintesi,
puo, nonostante tutto, definirsi e state quello di poter lavorare
come totalmente continuo. con uno strumento in grade di
Esistono delle differenze . creare un nudeo timbrico fisso
dal quale, tramite la modifica di

434
determinati parametri ci si po- talmente inarmonici esplodendo,
tesse allontanare 0 riavvicinare, alla fine, in un suono percussivo
senza perdere contatto con il in fortissimo, che a sua volta si
"cuore" timbrico. disintegra. Sulla coda di questa
In ihada si puo dunque indivi- suono, in pianissimo, compare
duare un timbro-modello, una un dolce rintocco di campana,
sorta di prototipo, udibile come che segna l' inizio della sezione
un suono di campana. seguente.
Distinguiamo cinque sezioni: IV sezione: si presenta il timbro-
I sezione: reiterazione di un ac- modello. La sezione, inoltre, e
cordo, la cui struttura rimane segnata da una pulsazione rego-
identica venendo modificate, per lare, che contribuisce alla mas-
contro, Ie zone di risonanza, che sima distensione.
vanno a rivelare una melodia na- V sezione: dagli accordi della
scosta fra Ie pieghe timbriche. sezione precedente si espande
Progressivamente l' accordo ac- una fascia lontana, che richiama
quista, da un lato, un transitorio il fantasma di uno scampan'io, su
d' attacco sempre piu breve, av- cui ricama il suo canto 10 stru-
vicinandosi al timbro-modello. mento solista. Per un attimo ri-
II sezione: compaiono delle li- compaiono forme variate del
nee, dei grumi sonori", come se
11 timbro-modello, risucchiate dalla
gli accordi iniziali si fossero sia fasciache lentamente si estingue.
contratti che disintegrati. Questa struttura si puo raffigu-
III sezione: cresce la tensione rare graficamente nella figura 1,
timbrica. I suoni divengono to-

sez.II

~
sez.I
sez.V

o sez.IV
timbro-modello
fig. 1

da cui si evince chiaramente la Credo che il timbro, ora come


maggiore 0 minore distorsione ora, rivesta una posizione piut-
rispetto al timbro-modello delle tosto ibrida nel vocabolario del
vane sezioni timbriche. compositore. Se da un lato e una
categoria percettiva, come l' al-
Conclusione tezza 0 la durata, d' altro lato ne

435
viene impedita la sua categoriz- ture", Computer Music Journal,
zazione concettuale. vol.3, nO 2.
Appare come elemento di in-
dubbio fascino per il composi-
tore, poiche rappresenta la su-
perficie, il colore, la "pelle" del
suono.
E, d' altra parte, si dimostra
strumento incapace di reggere da
solo un cammino formale con-
vincente, imponendo una re-
gressione verso il puro materiale
sonoro nella sua varieta fenome-
nologica, accatastato in succes-
sioni spesso banali.
II trattamento numerico del
suono, dunque, apre immensi
spazi al compositore di oggi;
tuttavia non si garantisce la
qualita musicale in virtu di un
materiale acustico sempre piu
variegato e ricco.

Riferimenti bibliografici

[1] S. McAdams, aa.vv.:


"Le timbre, metaphore pour la
composition", I.R.C.A.M.,
Christian Bourgois editeur, pp.
164-165-166-167, 1991.
[2] G.A. Miller, "The ma-
gical number seven, plus or mi-
nus two: ome limits on our ca-
pacity for processing if1;forma-
tion", Psychological Review, nO
63, pp. 81-97.
[3] R. Plomp, W.J.M.
Levelt, "Tonal consonance and
critical bandwidht", Journal of
the acoustical Society of
America, n° 46, 1965.
[4] D.L. Wessel, "Timbre
space as a musical control struc-
