Sound / Audio: Multimedia Fundamentals
dB    Example
30    Very soft whisper
70    Voice conversation
130   75-piece orchestra
170   Jet engine
Human Perception
[Figure: a periodic waveform plotted against time, showing max/min amplitude and one period (T)]
Sound Waves
Sound waves can be characterized by the following attributes:
Period
The interval at which a periodic signal repeats regularly; the time for one
cycle to occur:
T = 1/f
Sound Waves
Frequency
Frequency is the speed of the vibration, and it determines the pitch of the
sound. It is only meaningful for musical sounds, where there is a strongly
regular waveform.
It measures a physical property of a wave. The unit is Hertz (Hz) or
kilohertz (kHz): f = 1/T.
One hertz is one cycle per second.
Frequency ranges:
Ultrasound: 20 kHz – 1 GHz
Cosmic rays: ~10^22 Hz
Human hearing: up to 20 kHz
Amplitude
The measure of sound level; the intensity of the sound.
For a digital sound, amplitude is the sample value.
Sound Waves
Wavelength
The distance sound travels in one cycle.
The higher the frequency of the signal, the shorter the wavelength:
20 Hz: 56 feet
20 kHz: 0.7 inch
Bandwidth
Frequency range: the range of frequencies a device can produce or a human can hear.
Examples:
FM radio: 50 Hz – 15 kHz
AM radio: 80 Hz – 5 kHz
CD player: 20 Hz – 20 kHz
Telephone: 300 Hz – 3 kHz
Older human ears: 50 Hz – 10 kHz
Wavelength and Speed
Sound travels at approximately 344 m/sec (1130 ft/sec) in air.
Audible sound covers the frequency range from about 20 Hz to 20 kHz.
The wavelength of sound of a given frequency is the distance between
successive repetitions of the waveform as the sound travels through air. It
is given by the following equation:
wavelength = speed / frequency
or, using the common abbreviations c for speed, f for frequency, and l for
wavelength:
l = c/f
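As a quick check, here is a minimal Python sketch of l = c/f using the speed of sound in feet quoted above; it reproduces the 20 Hz and 20 kHz wavelength figures (the constant and function names are illustrative, not from the slides):

    SPEED_FT_PER_SEC = 1130.0   # speed of sound in air, as quoted above

    def wavelength_ft(frequency_hz):
        """Wavelength in feet: l = c / f."""
        return SPEED_FT_PER_SEC / frequency_hz

    print(wavelength_ft(20))            # ~56.5 ft  (the "20 Hz is 56 feet" figure)
    print(wavelength_ft(20_000) * 12)   # ~0.68 in  (the "20 kHz is 0.7 inch" figure)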
Sound
The faster the vibration, the higher the pitch of the note; the bigger the
vibration, the louder the note.
Decibel
A dimensionless unit with no specifically defined physical quantity; it
measures sound pressure or electrical (voltage) levels.
It is a logarithmic unit that describes a ratio of two intensities, such as two
different sound pressures, two different voltages, and so on.
A bel is the base-ten logarithm of the ratio between two signals, which
means that each additional bel on the scale represents a signal ten times
stronger.
Example: two loudspeakers, the first playing a sound with power P1, and
another playing a louder version of the same sound with power P0.
The difference in decibels between the two is defined to be:
dB = 10 log10(P0/P1)
If the second produces twice as much power as the first, the difference in dB
is
10 log10(P0/P1) = 10 log10 2 ≈ 3 dB
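A minimal Python sketch of this ratio calculation (the function name is ours, chosen for illustration):

    import math

    def level_difference_db(p_louder, p_softer):
        """Difference in decibels between two signal powers."""
        return 10 * math.log10(p_louder / p_softer)

    print(level_difference_db(2.0, 1.0))    # doubling the power -> ~3.01 dB
    print(level_difference_db(10.0, 1.0))   # ten times the power (1 bel) -> 10 dB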
Common Sounds
Dynamic Range and Bandwidth
Types of digital audio
2 main formats:
1. Sampled/digitized sounds
o digitally converted analog sounds.
o the analog sound waves are captured, converted, and transcribed into a
digital format.
o usually saved as WAV or AIFF files.
2. Synthesized sounds (MIDI)
o recreations of many events that together form one sound output.
o mostly recreated through the MIDI (Musical Instrument Digital
Interface) format.
o pitch, frequency, period, and speed data are recorded in numeric
format.
o when replayed, this numeric data is recreated into actual sound.
Synthesized Sounds -MIDI
Advantages:
o Smaller file size.
o Allows the composition of many different instruments without actually
having the "real" instruments there to do the recording.
o The tempo (speed) of a MIDI file can be changed at will, without
changing the actual pitch of the piece.
o Instrumentation can be changed instantaneously.
Disadvantages:
o Since it is recreated sound, the quality of the output depends on the
quality of the sound card or MIDI instrument.
o Some instruments and more expensive sound cards sound
indistinguishable from analog sounds, while other inexpensive sound
cards make the MIDI output sound "tinny" or fake.
Sound in Multimedia Applications
Types of audio in multimedia applications:
Music – sets the mood of the presentation, enhances the emotion,
illustrates points
Sound effects – make specific points, e.g., squeaky doors,
explosions, wind, ...
Narration – the most direct message, often effective
Sound in Multimedia Applications
Analog sound needs to be converted into digital data.
The conversion process is sampling, in which every fraction
of a second a sample of the sound is recorded in digital
bits.
Two factors affect the quality of the digitized sound:
(1) the number of times the sample is taken, which is
called the sample rate, and
(2) the amount of information stored about the sample,
which is called the sample size.
Digitization
Conversion to a stream of numbers; preferably these
numbers should be integers, for efficiency.
Sampling means measuring the quantity we are interested in,
usually at evenly spaced intervals.
The rate at which it is performed is called the sampling
frequency. For audio, typical sampling rates are from 8 kHz
(8,000 samples per second) to 48 kHz. This range is determined
by the Nyquist theorem.
Sampling in the amplitude or voltage dimension is called
quantization.
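To make sampling and quantization concrete, here is a minimal Python sketch; the sample rate, bit depth, tone, and function name are illustrative choices, not prescribed by the slides:

    import math

    SAMPLE_RATE = 8_000   # samples per second (the 8 kHz low end quoted above)
    BIT_DEPTH = 8         # quantize each sample to 8 bits (256 levels)

    def sample_tone(freq_hz, duration_s):
        """Sample a sine tone, quantizing each sample to a small integer."""
        n_samples = int(SAMPLE_RATE * duration_s)
        max_level = 2 ** (BIT_DEPTH - 1) - 1   # 127 for signed 8-bit samples
        return [
            round(max_level * math.sin(2 * math.pi * freq_hz * n / SAMPLE_RATE))
            for n in range(n_samples)
        ]

    samples = sample_tone(440.0, 0.01)   # 10 ms of a 440 Hz tone -> 80 samples
    print(len(samples), samples[:5])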
Nyquist Theorem
The Nyquist theorem states how frequently we must sample in
time to be able to recover the original sound. For correct
sampling we must use a sampling rate of at least twice the
maximum frequency content in the signal. This rate is called the
Nyquist rate.
Nyquist theorem: if a signal is band-limited, i.e., there is a
lower limit f1 and an upper limit f2 on the frequency components in
the signal, then the sampling rate should be at least 2(f2 − f1).
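The theorem can be seen numerically. In this illustrative Python sketch, a 5 kHz tone sampled at 8 kHz (above the 4 kHz Nyquist limit) produces exactly the same samples as a 3 kHz tone, so the two are indistinguishable after sampling:

    import math

    fs = 8_000   # sampling rate; the Nyquist limit is fs/2 = 4 kHz

    def sampled(freq_hz, n=6):
        """First n samples of a cosine tone at the given frequency."""
        return [round(math.cos(2 * math.pi * freq_hz * k / fs), 4) for k in range(n)]

    # A 5 kHz tone lies above the 4 kHz Nyquist limit, so its samples match
    # those of a 3 kHz (= fs - 5 kHz) tone: this is aliasing.
    print(sampled(5_000))   # the same numbers...
    print(sampled(3_000))   # ...as these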
Signal-to-Noise Ratio (SNR)
A measurement of the quality of the signal.
The ratio of the power of the correct signal to the power of the noise is
called the signal-to-noise ratio (SNR).
It is usually measured in decibels (dB): SNR = 10 log10(P_signal / P_noise).
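A short Python sketch of the SNR formula (the function name is illustrative):

    import math

    def snr_db(signal_power, noise_power):
        """Signal-to-noise ratio in decibels."""
        return 10 * math.log10(signal_power / noise_power)

    print(snr_db(1.0, 0.001))   # signal 1000x the noise power -> 30 dB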
Digital Audio
[Figure: digital samples of a waveform taken along the time axis]
Sample Rate & Sample Size
The most common sample rates are 11.025 kHz, 22.05 kHz, 44.1 kHz
(CD audio quality), and 48 kHz (DVD audio quality).
The higher the sample rate, the more samples are taken and thus the
better the quality of the digitized sound.
The two most common sample sizes are 8-bit and 16-bit.
An 8-bit sample allows 256 values to describe the sound;
a 16-bit sample allows 65,536 values.
The higher the sample size, the better the quality of the digitized sound and
the larger the file size.
Some highs and lows will be lost; however, this is difficult to detect,
especially with the quality of speakers found on most computers today.
Bit Depth
One more important factor when sampling sounds is bit depth.
When sampling music for web format, a 16-bit depth is adequate. In
many instances an 8-bit depth may work as well and decreases the file size.
Quality versus File size
Sampling Rate | Resolution | Stereo/Mono | Bytes for 1 Minute | Note
44.1 kHz      | 16-bit     | Stereo      | 10.5 MB            | CD-quality recording
44.1 kHz      | 8-bit      | Mono        | 2.6 MB             | Recording a mono source
22.05 kHz     | 16-bit     | Stereo      | 5.25 MB            | CD-ROM projects
5.5 kHz       | 8-bit      | Mono        | 325 KB             | A bad telephone connection
Quality vs File size
• Sampling at higher rates, such as 44.1 kHz, more
accurately captures the high-frequency content of
the sound.
• Audio resolution, such as 8-bit or 16-bit, determines
the accuracy with which a sound can be digitized.
• More bits for the sample size yields a recording that
sounds more like its original.
Quality vs File size
The size of a digital recording depends on the
sampling rate, the number of bits, and the number of
channels:
S = R x (b/8) x C x D
where
S = file size (bytes)
R = sampling rate (samples per second)
b = sample size (bits)
C = channels (1 = mono, 2 = stereo)
D = recording duration (seconds)
Example
If we record 10 seconds of stereo music at 44.1 kHz with 16-bit
resolution, the file size is
S = 44,100 x (16/8) x 2 x 10 = 1,764,000 bytes ≈ 1.76 MB
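The formula is easy to check in Python; this sketch reproduces the worked example above and two rows of the "Quality versus File size" table (the function name is ours):

    def audio_file_size(rate_hz, bits, channels, seconds):
        """File size in bytes: S = R x (b/8) x C x D."""
        return rate_hz * (bits / 8) * channels * seconds

    # The worked example above: 10 s of 16-bit stereo at 44.1 kHz.
    print(audio_file_size(44_100, 16, 2, 10))   # 1,764,000 bytes (~1.76 MB)

    # Rows from the table (1 minute each):
    print(audio_file_size(44_100, 16, 2, 60))   # ~10.5 MB, CD quality
    print(audio_file_size(44_100, 8, 1, 60))    # ~2.6 MB, mono source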
MP3 compression can reduce an audio file from about 15-20 megabytes to
2-5 megabytes.
MP3 compression works by removing the highs and lows that are
inaudible to the human ear, as well as sounds that are "hidden" (masked)
behind other sounds.
For example, if an audio file contained the simultaneous sounds
of a loud train and a single bell, the sound of the bell would obviously
be inaudible. MP3 compression removes the sound of the bell,
thereby reducing the file size.
Audio File Format Standards
AIFF (Audio Interchange File Format)
Developed by Apple Inc.
Uncompressed audio – a lossless format – bigger file size.
e.g.: 1 min of audio data (stereo, 44.1 kHz sampling rate at 16-bit): ~10 MB
WAV (Waveform Audio File Format)
Developed by Microsoft and IBM.
The main format used on Windows for raw and typically uncompressed audio.
Uncompressed – large file size – uncommon for audio sharing over the Internet.
MP3
Designed by the Moving Picture Experts Group (MPEG).
An audio coding scheme with lossy data compression.
The compression works by reducing the accuracy of certain parts of the sound that are considered to be beyond
the auditory resolution ability of most people.
MP3 can provide about 12:1 compression from a 44.1 kHz 16-bit stereo WAV file without noticeable
degradation of sound quality.
MP3 playback is not recommended on machines slower than a Pentium or equivalent.
The de facto standard of digital audio compression for the transfer and playback of music on most digital
audio players.
MP3
Trade-off when creating an MP3 file: the amount of space used versus the sound quality of the result.
MP3 file quality depends on:
1. The bit rate that is set (it specifies how many kilobits the file may use per second of audio;
a rough size estimate is sketched after this list).
The higher the bit rate, the larger the compressed file and the closer it will sound to the original.
With too low a bit rate, compression artifacts may be audible in the reproduction.
2. The quality of the encoder; different encoders produce different quality.
3. The difficulty of the signal being encoded.
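Since the bit rate is kilobits per second of audio, a rough file-size estimate follows directly; this Python sketch uses a hypothetical 4-minute track and a common 128 kbps setting (actual files vary with headers and variable bit rates):

    def mp3_size_bytes(bitrate_kbps, duration_s):
        """Rough MP3 size: bit rate (kilobits/s) x duration / 8 bits per byte."""
        return bitrate_kbps * 1_000 * duration_s / 8

    print(mp3_size_bytes(128, 240))   # 4-minute track at 128 kbps -> ~3.8 MB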
Winamp is a free digital audio player for Windows that can play both
MP3 files and internet radio stations.
Most popular (commercial): Adobe Audition, Adobe Soundbooth
Best free: Audacity
MIDI (Musical Instrument Digital Interface)
It is a communication standard developed in the early 1980s for
electronic instruments and computers.
It specifies the hardware connection between pieces of equipment as well as the
format in which data are transferred between them.
In some ways it is the sound equivalent of vector graphics.
Common MIDI devices include electronic music synthesizers, modules,
and the MIDI devices in common sound cards.
MIDI is not sound. MIDI works by sending data signals between
equipment, not audio signals. These signals are also known as "events"
(a series of commands).
It codes "events" that stand for the production of sounds, e.g., values for the
pitch of a single note, its duration, and its volume (a byte-level sketch
follows this list).
File size is compact, but the sound generated depends on the playback device;
different machines give different quality.
MIDI files are only suitable for recording music; they cannot be used to store dialogue.
They are more difficult to edit and manipulate.
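As an illustration of how compact MIDI events are, here is a minimal Python sketch of a raw three-byte Note On message, following the standard MIDI status/data byte layout (the helper name note_on is ours):

    def note_on(channel, note, velocity):
        """Encode a MIDI Note On event: status byte (0x90 | channel),
        then note number and velocity, each 0-127 (data bytes)."""
        return bytes([0x90 | (channel & 0x0F), note & 0x7F, velocity & 0x7F])

    # Middle C (note 60) at moderate velocity on channel 0 -> just 3 bytes:
    print(note_on(0, 60, 64).hex())   # '903c40'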
MIDI Devices
4 basic logical groups of MIDI devices:
1. Synthesizers – components that generate sounds based on the input
of MIDI software messages; they create pitched and/or percussion sounds.
2. Controllers – devices that generate MIDI software messages. MIDI
controllers can take the form of almost any acoustic or electronic
instrument, such as keyboards, guitars, drum sets, drum pads, and
even woodwind-like instruments. Keep in mind that a controller does
NOT synthesize or generate audible music; MIDI controllers generate
MIDI software messages that are routed through one or more MIDI
ports.
3. Sequencers – devices incorporating both MIDI software and hardware,
used for storing and replaying MIDI software message
sequences. In effect, the sequencer is the electronic version of the
musician in the MIDI world.
4. Networks
MIDI Synthesizers
Three types:
(i) Integrated Keyboard and Synthesizer,
(ii) Rack-mounted Synthesizer, and
(iii) Drum Machine.
MIDI CONTROLLER
MIDI Sequencer
Studio Setup
MIDI Data
MIDI data does not encode individual samples; it encodes musical
events and commands to control instruments.
MIDI data are grouped into MIDI messages. Each MIDI message
represents a musical event, e.g., pressing a key, setting a
switch, or adjusting foot pedals.
A sequence of MIDI messages is grouped into a track.
An instrument or a computer that satisfies both the hardware
interface and the data format is known as a MIDI device.
MIDI Channels and Modes
MIDI devices communicate with each other through channels.
The MIDI standard specifies that each MIDI connection has 16 channels.
Each instrument can be mapped to a single channel (Omni Off), or it can
use all 16 channels (Omni On).
Some instruments are capable of playing more than one note at the same
time, e.g., organs and pianos. This is known as polyphony.
Other instruments, such as the flute, are monophonic, since they can only play
one note at a time.
Each MIDI device must be set to one of these modes for receiving MIDI data.
Instrument Patch
Digital sound:
https://archive.org/details/DisneysFrozenOST
Why use Audio?
Obvious advantage: accessibility for disabled users.
For other users:
An extra channel of information: if the meaning is unclear from visual
information alone, the audio may clarify it.
Provides additional information to support different learning styles.
Can add a sense of realism, and can convey emotion, time period, geographic
location, etc.
Directs attention to important events. Non-speech audio may be readily
identified by users, for example the sound of breaking glass to signify an
error. Since audio can grab the user's attention so successfully, it must be
used carefully so as not to unduly distract from other media.
It can add interest to a presentation or program.
However…
Like most media, audio files can be large. However, file sizes can be
reduced by various methods, and streamed audio can be delivered
over the Web.
Audio can easily be overused. When used in a complex environment
it can increase the likelihood of cognitive overload. Studies have
shown that while congruent use of audio and video can enhance
comprehension and learning, incongruent material can significantly
reduce it. That is, where multiple media are used they should be
highly related to each other to be most effective.
For most people, audio is not as memorable as visual media.
Good-quality audio can be difficult to produce, and like other media,
most commercial audio, particularly music, is copyrighted.
Users must have appropriate hardware and software. In an open-plan
environment this must include headphones.