Sei sulla pagina 1di 28

UNIT

2 Sound/Audio

Formatted on September 5, 2000

Objectives Contents

 To understand how computers process 1 The Nature of Sound 2

sound
2 Computer Representation of Sound 7
 To understand how computers
synthesize sound 3 Computer Music — MIDI 15

 To understand the differences between 4 Summary — MIDI versus digital audio 26


two major kinds of audio, namely
digitised sound and MIDI music 5 Exercises 27

COMP3600/SCI2600 Multimedia Systems


2. Sound/Audio
Department of Computer Science
(200009) Slide: 1
1 The Nature of Sound

Sound is a physical phenomenon produced by the vibration of matter and transmitted as waves.
However, the perception of sound by human beings is a very complex process. It involves three
systems:

 the source which emits sound;


 the medium through which the sound propagates;
 the detector which receives and interprets the sound.

Amplitude

Sounds we heard everyday are very complex. Every Time

sound is comprised of waves of many different


frequencies and shapes. But the simplest sound we can
Period

hear is a sine wave.

Sound waves can be characterised by the following attributes:

Period Frequency Amplitude Bandwidth


Pitch Loudness Dynamic
COMP3600/SCI2600 Multimedia Systems
2. Sound/Audio
Department of Computer Science
(200009) Slide: 2
1.1 Pitch and Frequency

Period is the interval at which a periodic signal


repeats regularly.
Pitch is a perception of sound by human beings It
measures how ‘high’ is the sound as it is perceived
by a listener. Infra-sound 0 – 20 Hz
Frequency measures a physical property of a Human hearing range 20 – 20 kHz
wave. It is the reciprocal value of period f = P1 . Ultrasound 20 kHz – 1 GHz
The unit is Herts (Hz) or kiloHertz (kHz). Hypersound 1 GHz – 10 THz

Musical instruments are tuned to produce a set of Note Ratio Frequencies


fixed pitches. C 1:1 264
D 9:8 297
E 5:4 330
F 4:3 352
G 3:2 396
A 5:3 440
B 15:8 495
C 2:1 528

COMP3600/SCI2600 Multimedia Systems


2. Sound/Audio
Department of Computer Science
(200009) Slide: 3
1.2 Loudness and Amplitude
The other important perceptual quality is This is known as the threshold of feeling. If the
loudness or volume. intensity is 10 12watt=m2, we may just be able
to hear it. This is know as the threshold of
Amplitude is the measure of sound levels. For a hearing.
digital sound, amplitude is the sample value.
The relative intensity of two different sounds is
The reason that sounds have different loudness measured using the unit Bel or more commonly
is that they carry different amount of power. deciBel (dB). It is defined by
The unit of power is watt. The intensity of
I2
sound is the amount of power transmitted relative intensity in dB = 10 log
through an area of 1m2 oriented perpendicular I1

to the propagation direction of the sound. Very often, we will compare a sound with the
threshold of hearing.
If the intensity of a sound is 1watt=m 2, we may
start feel the sound. The ear may be damaged.

COMP3600/SCI2600 Multimedia Systems


2. Sound/Audio
Department of Computer Science
(200009) Slide: 4
160 dB Jet engine
130 dB Large orchestra at fortissimo
100 dB Car on highway
70 dB Voice conversation
50 dB Quiet residential areas
30 dB Very soft whisper
Typical sound levels generated by various sources 20 dB Sound studio

Intensity Sound Level Loudness


(watt=m2) dB

1 120 Threshold of feeling


3
10 90 fff
4
10 80 ff
5
10 70 f
6
10 60 mf
7
10 50 p
8
10 40 pp
9
10 30 ppp
12
Typical sound levels in music 10 0 Threshold of hearing

COMP3600/SCI2600 Multimedia Systems


2. Sound/Audio
Department of Computer Science
(200009) Slide: 5
1.3 Dynamic and Bandwidth

 Dynamic range means the change in sound levels.


For example, a large orchestra can reach 130dB at its climax and drop to as low as 30dB at its
softest, giving a range of 100dB.
 Bandwidth is the range of frequencies a device can produce or a human can hear.
FM radio 50Hz – 15kHz
AM radio 80Hz – 5kHz
CD player 20Hz – 20kHz
Sound Blaster 16 sound card 30Hz – 20kHz
Inexpensive microphone 80Hz – 12kHz
Telephone 300Hz – 3kHz
Children’s ears 20Hz – 20kHz
Older ears 50Hz – 10kHz
Male voice 120Hz – 7kHz
Female voice 200Hz – 9kHz

COMP3600/SCI2600 Multimedia Systems


2. Sound/Audio
Department of Computer Science
(200009) Slide: 6
2 Computer Representation of Sound
 Sound waves are continuous while computers are good at handling discrete numbers.
 In order to store a sound wave in a computer, samples of the wave are taken.
 Each sample is represented by a number, the ‘code’.
 This process is known as digitisation.
 This method of digitising sound is know as pulse code modulation (PCM).
Refer to Unit 1 for more information on digitisation.
 According to Nyquist sampling theorem, in order to capture all audible frequency components
of a sound, i.e., up to 20kH z , we need to set the sampling to at least twice of this.
This is why one of the most popular sampling rate for high quality sound is 4410H z .
 Another aspect we need to consider is the resolution, i.e., the number of bits used to represent
a sample.
Often, 16 bits are used for each sample in high quality sound. This gives the SNR of 96dB .

COMP3600/SCI2600 Multimedia Systems


2. Sound/Audio
Department of Computer Science
(200009) Slide: 7
2.1 Quality versus File Size

The size of a digital recording depends on


the sampling rate, resolution and number of S file size bytes
channels. R sampling rate samples per second
S = R ( b=8)  
C D
b

C
resolution
channels
bits
1 - mono, 2 - stereo
Higher sampling rate, higher resolution gives D recording duration seconds
higher quality but bigger file size.
For example, if we record 10 seconds of stereo music at
44.1kHz, 16 bits, the size will be:
S = 44100  (16 8)  2  10
=

= 1; 764; 000bytes

= 1722:7Kbytes
1Kbytes = 1024bytes
= 1:68Mbytes Note:
1Mbytes = 1024Kbytes

High quality sound files are very big, however, the file size can be reduced by compression.

COMP3600/SCI2600 Multimedia Systems


2. Sound/Audio
Department of Computer Science
(200009) Slide: 8
File size for some common sampling rates and resolutions
Sampling Stereo Size for
Rate Resolution /Mono for 1 Min. Comments
44.1KHz 16-bit Stereo 10.5MB CD-quality recording
44.1KHz 16-bit Mono 5.25MB A good trade-off for high-quality recordings of
mono sources such as voice-overs
44.1KHz 8-bit Stereo 5.25MB Achieves highest playback quality on low-end
devices such as most of the sound cards
44.1KHz 8-bit Mono 2.6MB An appropriate trade-off for recording a mono
source
22.05KHz 16-bit Stereo 5.25MB Darker sounding than CD-quality recording
because of the lower sampling rate
22.05KHz 16-bit Mono 2.5MB Not a bad choice for speech, but better to trade
some fidelity for a lot of disk space by dropping
down to 8-bit
22.05KHz 8-bit Stereo 2.6MB A very popular choice for reasonable stereo
recording where full bandwidth playback is not
possible
22.05KHz 8-bit Mono 1.3MB A thinner sound than the choice just above, but
very usable
11KHz 8-bit Stereo 1.3MB At this low a sampling rate, there are few
advantages to using stereo
11KHz 8-bit Mono 650K In practice, probably as low as you can go and still
get usable results
5.5KHz 8-bit Stereo 650K Stereo not effective
5.5KHz 8-bit Mono 325K About as good as a bad telephone connection

COMP3600/SCI2600 Multimedia Systems


2. Sound/Audio
Department of Computer Science
(200009) Slide: 9
2.2 Audio File Formats

The most commonly used digital sound format in Windows systems is .wav files.

 Sound is stored in .wav as digital samples known as Pulse Code Modulation(PCM).


 Each .wav file has a header containing information of the file.
 type of format, e.g., PCM or other modulations
 size of the data
 number of channels
 samples per second
 bytes per sample
 There is usually no compression in .wav files.
Other format may use different compression technique to reduce file size.

 .vox use Adaptive Delta Pulse Code Modulation (ADPCM).


 .mp3 MPEG-1 layer 3 audio.
 RealAudio file is a proprietary format.
COMP3600/SCI2600 Multimedia Systems
2. Sound/Audio
Department of Computer Science
(200009) Slide: 10
Some common audio files formats

Extension MIME Type Platform Use


aif Audio/x-aiff Mac, SGI Audio
aifc Audio/x-aiff Mac, SGI Audio (compressed)
AIFF Audio/x-aiff Mac, SGI Audio
aiff Audio/x-aiff Mac, SGI Audio
au Audio/basic Sun, NeXT ULAW audio data
mov Video/QuickTime Mac, Win QuickTime video
mpe Video/mpeg All MPEG video
mpeg Video/mpeg All MPEG video
mpg Video/mpeg All MPEG video
mp3 Audio/x-mpeg All MPEG audio
qt Video/QuickTime Mac, Win QuickTime video
ra,ram Audio/x-pn-realaudio All RealAudio Sound
snd Audio/basic Sun, NeXT ULAW Audio Data
vox Audio/ All VoxWare Voice
wav Audio/x-wav Win WAV Audio

COMP3600/SCI2600 Multimedia Systems


2. Sound/Audio
Department of Computer Science
(200009) Slide: 11
2.3 Audio Hardware
 Recording and Digitising sound:  Different sound card have different
 An analog-to-digital converter(ADC) capability of processing digital sounds.
converts the analog sound signal into When buying a sound card, you should
digital samples. look at:
 A digital signal processor(DSP)  maximum sampling rate
processes the sample, e.g. filtering,
 stereo or mono
modulation, compression, and so on.
 Play back sound:  duplex or simplex

 A digital signal processor processes the


sample, e.g. decompression,
demodulation, and so on ADC
 An digital-to-analog converter(DAC)
Digital
converts the digital samples into sound DSP
Sound
signal
 All these hardware devices are integrated DAC

into a few chips on a sound card

COMP3600/SCI2600 Multimedia Systems


2. Sound/Audio
Department of Computer Science
(200009) Slide: 12
2.4 Audio Software

 Windows device driver — controls the hardware


device.
Many popular sound cards are Plus and Play.
Windows has drivers for them and can recognise
them automatically. For cards that Windows does
not have drivers, you need to get the driver from the
manufacturer and install it with the card.
 If you do not hear sound, you should check the settings,
such as interrupt, DMA channels, and so on.

 Device manager — the user interface to the hardware


for configuring the devices.
 You can choose which audio device you want to use
 You can set the audio volume

COMP3600/SCI2600 Multimedia Systems


2. Sound/Audio
Department of Computer Science
(200009) Slide: 13
Mixer — its functions are:
 to combine sound from different sources
 to adjust the play back volume of sound sources
 to adjust the recording volume of sound sources
Recording — Windows has a simple Sound Recorder.
Editing — The Windows Sound Recorder has a limiting editing function, such as changing
volume and speed, deleting part of the sound.

There are many freeware and shareware programs for sound recording, editing and processing.

COMP3600/SCI2600 Multimedia Systems


2. Sound/Audio
Department of Computer Science
(200009) Slide: 14
3 Computer Music | MIDI

Sound waves, whether occurred natural or man-made, are often very complex, i.e., they consist of
many frequencies. Digital sound is relatively straight forward to record complex sound. However,
it is quite difficult to generate (or synthesize) complex sound.

There is a better way to generate high quality music. This is known as MIDI — Musical
Instrument Digital Interface.

It is a communication standard developed in the early 1980s for electronic instruments and
computers. It specifies the hardware connection between equipments as well as the format in
which the data are transfered between the equipments.

Common MIDI devices include electronic music synthesisers, modules, and MIDI devices in
common sound cards.

General MIDI is a standard specified by MIDI Manufacturers Association. To be GM compatible,


a sound generating device must meet the General MIDI system level 1 performance requirement.
 minimum of 24 fully voices
 16 channels, percussion on channel 10
 minimum 16 simultaneous and different timbre instruments
This sign indicated that
 minimum 128 preset instruments the device is a general
 Support certain controllers MIDI device.

COMP3600/SCI2600 Multimedia Systems


2. Sound/Audio
Department of Computer Science
(200009) Slide: 15
3.1 MIDI Hardware

An electronic musical instrument or a computer which has MIDI interface should has one or more
MIDI ports. The MIDI ports on musical instruments are usually labelled with:

IN — for receiving MIDI data;


OUT — for outputting MIDI data that are generated by the instrument;
THRU — for passing MIDI data to the next instrument.

MIDI devices can be daisy-chained together.

MIDI device

IN OUT

OUT IN THRU IN THRU IN THRU


MIDI device MIDI device MIDI device

COMP3600/SCI2600 Multimedia Systems


2. Sound/Audio
Department of Computer Science
(200009) Slide: 16
3.2 MIDI Data

Unlike digital sound, MIDI data does not encode individual samples. MIDI data
encode musical events and commands to control instruments.
MIDI data are grouped into MIDI messages. Each MIDI message represents a Key C4 ON
musical event, e.g., pressing a key, setting a switch or adjusting foot pedals. A Key G3 OFF
sequence of MIDI messages is grouped into a track. Volume 105
An instrument or a computer satisfies both the hardware interface and the data
format is known as a MIDI device.

COMP3600/SCI2600 Multimedia Systems


2. Sound/Audio
Department of Computer Science
(200009) Slide: 17
3.3 MIDI Channels and Modes

MIDI devices communicate with each other through channels. The


MIDI standard specifies each MIDI connection has 16 channels. Each
instrument can be mapped to a single channel (omni Off), or it can use
all 16 channels (Omni On).
Some instruments are capable of playing more than one note at the
same time, e.g., organs and piano. This is known as polyphony. Other
instruments, such as flute, is monophony since they can only play one
note at a time.
Each MIDI device must be set to one of the modes for receiving MIDI
data: Omni On/Poly
Omni On/Mono
Omni Off/Poly
Omni Off/Mono

COMP3600/SCI2600 Multimedia Systems


2. Sound/Audio
Department of Computer Science
(200009) Slide: 18
3.4 Instrument Patch

Each MIDI device is usually capable of producing sound resembling several real instruments
and/or noise effects (e.g., telephone, aircraft). Each instrument or noise effect is known as a patch,
or preset.

The general MIDI standard specifies 128 patches(ranges from 0 to 127).

ID Sound ID Sound
0 Acoustic grand piano 35 Acoustic bass drum
7 Clarinet 45 Low tom
10 Music box 76 High wood block
19 Church organ 80 Mute triangle
40 Violin
48 String ensemble I
124 Telephone ring
125 Helicopter
126 Applause

COMP3600/SCI2600 Multimedia Systems


2. Sound/Audio
Department of Computer Science
(200009) Slide: 19
3.5 MIDI les

When using computers to play MIDI music, the MIDI data are often stored in MIDI files. Each
MIDI files contains a number of chunks. There are two types of chunks:

 Header chunk — contains information about the entire file: the type of MIDI file, number of
tracks and the timing.
 Track chunk — the actual data of MIDI track.
Chunk 1 Chunk 2 Chunk 3 Chunk 4 Chunk 5 Chunk 6

There three types of MIDI file:


0 single multichannel track
Header chunk
1 one or more simultaneous track of a
sequence
Track chunk
2 one or more sequentially independent
single-track patterns MIDI event

COMP3600/SCI2600 Multimedia Systems


2. Sound/Audio
Department of Computer Science
(200009) Slide: 20
Tracks, channels and patches

 Multiple tracks can be played at the same time.


 Each track can be assigned to a different channel.
 Each channel can accept more than one track.
 Each channel is assigned a patch, therefore generates sound of a particular instrument
Track 1 Chn 2
Track 2 Chn 3
Track 3 Chn 4
Track 4 Chn10

Device 1 Device 2 Device 3

Omni On
Chn 2<--> Piano Chn 10
Chn 2<--> Vialin
Chn3 <--> Guitar
Chn 4 <--> String ensemble

COMP3600/SCI2600 Multimedia Systems


2. Sound/Audio
Department of Computer Science
(200009) Slide: 21
3.6 How MIDI Sounds Are Synthesized

A simplistic view is that:

 the MIDI device stores the characteristics of sounds produced by different sound sources;
 the MIDI messages tell the device which kind of sound, at which pitch is to be generated, how
long the sound is played and other attributes the note should have.

There are two ways of synthesizing sounds:

 FM Synthesis (Frequency Modulation) — Using one sine wave to modulate another sine wave,
thus generating a new wave which is rich in timbre. It consists of the two original waves, their
sum and difference and harmonics.
The drawbacks of FM synthesis are: the generated sound is not real; there is no exact formula
for generating a particular sound.
 Wave-table synthesis — It stores representative digital sound samples. It manipulates these
sample, e.g., by changing the pitch, to create the complete range of notes.

COMP3600/SCI2600 Multimedia Systems


2. Sound/Audio
Department of Computer Science
(200009) Slide: 22
MIDI Sound Attributes

The shape of the amplitude envelop has great influence on the resulting character of sound. There
are two different types of envelop:

 Diminishing sound — gradually die out;


 Continuing sound — sustain until turned off.
On
Key
Off
Amplitude

Amplitude

Time Time
Deminishing sound Continuing sound

COMP3600/SCI2600 Multimedia Systems


2. Sound/Audio
Department of Computer Science
(200009) Slide: 23
The Amplitude Envelop

 Delay — the time between when a key is


played and when the attack phase begins
 Attack— the time from no sound to
maximum amplitude
 Hold — the time envelop will stay at the peak
level beform starting the decay phase

Amplitude
 Decay — the time it takes the envelop to go
from the peak level to the sustain level
 Sustain — the level at which the envelop
remains as long as a key is held down
 Release — the time is takes for the sound to Attack Decay Sustain Time
Release
fade to nothing Delay Hold

COMP3600/SCI2600 Multimedia Systems


2. Sound/Audio
Department of Computer Science
(200009) Slide: 24
3.7 MIDI software

MIDI player for playing MIDI music. This includes:


 Windows media player can play MIDI files
 Player come with sound card — Creative Midi player
 Freeware and shareware players and plug-ins— Midigate, Yamaha Midplug, etc.
MIDI sequencer for recording, editing and playing MIDI
 Cakewalk Express, Home Studio, Professional
 Cubasis
 Encore
 Voyetra MIDI Orchestrator Plus
Configuration — Like audio devices, MIDI devices require a driver. Select and configure MIDI
devices from the control panel.

COMP3600/SCI2600 Multimedia Systems


2. Sound/Audio
Department of Computer Science
(200009) Slide: 25
4 Summary | MIDI versus digital audio
Digital Audio MIDI

 Digital representation of physical sound  Abstract representation of musical sounds


waves and sound effects
 File size is large if without compression  MIDI files are much more compact
 Quality is in proportion to file size  File size is independent to the quality
 More software available
 Much better sound if the sound source is of
 Play back quality less dependent on the high quality
sound sources
 Can record and play back any sound  Need some music theory
including speech  Cannot generate speech

COMP3600/SCI2600 Multimedia Systems


2. Sound/Audio
Department of Computer Science
(200009) Slide: 26
5 Exercises
1. Your hard disk has 256Mbytes of free space. You keyboard is a general MIDI device which is set to
are going to record a speech with a sampling rate of ‘Omni On’ mode, Synthesiser 1 is a drum machine
11KHz, 8-bit resolution and a single channel. What which accepts signals on the drum channel only, and
is the length of the recording that can be stored in Synthesiser 2 is a guitar accepting signals from all
the hard disk? (Answer in seconds) channels but can only synthesise guitar sound. A
MIDI sequencer is playing a MIDI file on the
2. A multimedia presentation has 30 minutes of
computer. The table below lists tracks with their
CD-quality digital audio in .wav files. What is the
patch and channel settings. What kind of sound will
storage required for these files?
be heard from each device?
3. In a teleconferencing system, a network connection Track Patch Channel Track Patch Channel
having bandwidth of 100Kbytes/sec is allocated to 1 2 3 2 8 4
duplex audio link. What is the maximum sampling 3 28 5 4 49 6
rate and resolution in which uncompressed audio 5 57 7 6 73 2
data can be transmitted in real-time?
4. You are developing a network voice communication
program. It uses the Internet to connect two remote
users and allows them to talk to each other in real
IN OUT THRU IN THRU IN THRU
time. What is the most appropriate sampling rate for
recording their voice? Computer
Music Keyboard Synthesiser 1 Synthesiser 2

5. In the configuration shown below, the music

COMP3600/SCI2600 Multimedia Systems


2. Sound/Audio
Department of Computer Science
(200009) Slide: 27
5.1 Solution to Exercises

1. The size of one second of recording is 11,050 bytes. 4. Because the data we want to handle are voice, a
the length of recording that can be stored in sampling rate of 11KHz and mono will be enough.
256MBytes is This creates a data rate of 11K/sec. If the users are
(256  1024  1024) =11; 025 = 24; 347(seconds)
connected to the Internet via a LAN, this data rate
should be handled by the system adequately.
2. CD-quality digital audio means 44,100Hz Sampling However, if they are connected via modems, the
rate, 16-bit resolution and stereo. data rate will be too high. Compression technique
44100  2  2  (30  60) has to be used.
= 317; 510; 000 bytes 5. Since the music keyboard can accept signal from all
= 302:8 MBytes channels, the general MIDI patch map, we will hear
the following sounds:
3. Since the link is duplex, we will allocate half the
bandwidth to each direction, i.e., 50Kbytes/sec. Track Patch Channel Sound
Using the formula below 1 2 3 Bright Acoustic Piano
S = R ( b=8)  
C D;
2 8 4 Clavinet Chromatic
3 28 5 Electric Guitar
and assuming we record in mono, we have 4 49 6 String Ensemble 1
50; 000 = R  (b=8). Since it is a teleconferencing 5 57 7 Trumpet
application, the sound will mostly be speech. We 6 73 2 Piccolo
can use lower sampling rate and higher resolution.
Therefore, we can use 22.05KHz sampling rate and Synthesiser 2 will produce no sound because no
16-bit resolution. The most important limit is that signal is sent to Channel 10. Synthesiser 3 will
the data can not be larger than the bandwidth, produce sound for signals from all channels but they
otherwise, we will not be able to transmit them. all sound like guitar.
COMP3600/SCI2600 Multimedia Systems
2. Sound/Audio
Department of Computer Science
(200009) Slide: 28

Potrebbero piacerti anche