
COMPUTER GRAPHICS AND MULTIMEDIA SYSTEMS

NOTES
UNIT I

INTRODUCTION
LEARNING OBJECTIVES

 To understand the different primitives like line, circle and their construction using
different methods.
 To learn different clipping algorithms and their importance in visualization

1.1 OVERVIEW OF GRAPHICS SYSTEMS

The construction and manipulation of pictures and drawings using computers is known as
computer graphics. An image is a two-dimensional array of pixels arranged in the
form of a matrix. The comparison between image and graphics is given in Table 1.1 below.
Tools and techniques like rotation, scaling and translation are common to both graphics
and image processing. Graphics finds applications in building drawing, machine drawing,
aircraft design, simulation and modeling, presentation, advertisement, entertainment and
games. Apart from that, it has applications in critical domains like accident
analysis, earthquake analysis, weather forecasting and atomic radiation pattern analysis.

Table 1.1 Image Vs Graphics


Image                                      Graphics
Real picture                               Artificially created
Captured through a camera / scanner        Drawn using a computer
Editable only at the sub-picture level     Individual elements are editable
Two dimensional                            Any number of dimensions

Raster Algorithms

The goal of these raster algorithms is to convert graphical primitives into a set of
pixels as efficiently as possible. This process of converting from primitives to pixels is
known as scan conversion. These algorithms typically use incremental methods in order to
minimise the number of calculations (particularly multiplications and divisions) that are


performed during each iteration. These techniques use integer arithmetic rather than
floating-point arithmetic.

1.1.1 Video Display Devices

1.1.1.1 CRT monitors

A beam of electrons emitted by an electron gun, passes through focusing and deflection
systems that direct the beam toward specified positions on the phosphor-coated screen.
The phosphor then emits a small spot of light at each position contacted by the electron
beam. Because the light emitted by the phosphor fades very rapidly, some method is
needed for maintaining the screen picture. One way to keep the phosphor glowing is to
redraw the picture repeatedly by quickly directing the electron beam back over the same
points. This type of display is called a refresh CRT.

The maximum number of points that can be displayed without overlap on a CRT is
referred to as the resolution. Typical resolution on high-quality systems is 1280 by 1024,
with higher resolutions available on many systems. High resolution systems are often referred
to as high-definition systems. Another property of video monitors is aspect ratio. This
number gives the ratio of vertical points to horizontal points necessary to produce equal-
length lines in both directions on the screen. An aspect ratio of 3/4 means that a vertical line
plotted with three points has the same length as a horizontal line plotted with four points.

1.1.1.2 Raster Scan Displays

The most common type of graphics monitor employing a CRT is the raster-scan
display, based on television technology. In a raster-scan system, the electron beam is
swept across the screen, one row at a time from top to bottom. As the electron beam
moves across each row, the beam intensity is turned on and off to create a pattern of
illuminated spots. Picture definition is stored in a memory area called the refresh buffer or
frame buffer. This memory area holds the set of intensity values for all the screen points.
Stored intensity values are then retrieved from the refresh buffer and “painted” on the
screen one row (scan line) at a time. Each screen point is referred to as a pixel or pel


(shortened form of picture element). The capability of a raster-scan system to store intensity
information for each screen point makes it well suited for the realistic display of scenes
containing subtle shading and color patterns. Home television sets and printers are examples
of other systems using raster-scan methods.

A system with 24 bits per pixel and a screen resolution of 1024 by 1024 requires 3
megabytes of storage for the frame buffer. On a black-and-white system with one bit per
pixel, the frame buffer is commonly called a bitmap. For systems with multiple bits per
pixel, the frame buffer is often referred to as a pixmap.
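The storage figure quoted above follows directly from width x height x bits-per-pixel. A
minimal C sketch of the calculation (the function name frameBufferBytes is illustrative,
not from the text):

#include <stdio.h>

/* Frame-buffer storage in bytes for a given resolution and colour depth.
   For 1024 x 1024 at 24 bits per pixel this gives 3 145 728 bytes (3 MB). */
long frameBufferBytes(long width, long height, long bitsPerPixel)
{
    return (width * height * bitsPerPixel) / 8;   /* 8 bits per byte */
}

int main(void)
{
    printf("%ld bytes\n", frameBufferBytes(1024, 1024, 24));  /* full-colour system */
    printf("%ld bytes\n", frameBufferBytes(1024, 1024, 1));   /* 1-bit bitmap       */
    return 0;
}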

Refreshing on raster-scan displays is carried out at the rate of 60 to 80 frames per


second, although some systems are designed for higher refresh rates. Sometimes, refresh
rates are described in units of cycles per second, or Hertz (Hz), where a cycle corresponds
to one frame. Using these units, we would describe a refresh rate of 60 frames per second
as simply 60 Hz. At the end of each scan line, the electron beam returns to the left side of
the screen to begin displaying the next scan line. The return to the left of the screen, after
refreshing each scan line, is called the horizontal retrace of the electron beam. And at the
end of each frame (displayed in 1/80th to 1/60th of a second), the electron beam returns
(vertical retrace) to the top left corner of the screen to begin the next frame. On some
raster-scan systems, each frame is displayed in two passes using an interlaced refresh
procedure. In the first pass, the beam sweeps across every other scan line from top to
bottom. Then after the vertical retrace, the beam sweeps out the remaining scan lines.
Interlacing of the scan lines in this way allows us to see the entire screen displayed in
one-half the time it would have taken to sweep across all the lines at once from top to
bottom. Interlacing is primarily used with slower refreshing rates. On an older, 30 frame
per-second, non-interlaced display, for instance, some flicker is noticeable. But with
interlacing, each of the two passes can be accomplished in 1/60th of a second, which
brings the refresh rate nearer to 60 frames per second. This is an effective technique for
avoiding flicker, providing that adjacent scan lines contain similar display information.

Color CRTs in graphics systems are designed as RGB monitors. These monitors use
shadow-mask methods and take the intensity level for each electron gun (red, green, and
blue) directly from the computer system without any intermediate processing. High-quality


raster-graphics systems have 24 bits per pixel in the frame buffer, allowing 256 voltage
settings for each electron gun and nearly 17 million color choices for each pixel. An RGB
color system with 24 bits of storage per pixel is generally referred to as a full-color system
or a true-color system.

1.1.1.3 Random Scan Displays

Random scan monitors draw a picture one line at a time and for this reason are also
referred to as vector displays. The component lines of a picture can be drawn and refreshed
by a random-scan system in any specified order. A pen plotter operates in a similar way
and is an example of a random-scan, hard-copy device. Refresh rate on a random-scan
system depends on the number of lines to be displayed. Picture definition is now stored as
a set of line drawing commands in an area of memory referred to as the refresh display file.
Sometimes the refresh display file is called the display list, display program, or simply the
refresh buffer. To display a specified picture, the system cycles through the set of commands
in the display file, drawing each component line in turn. After all line drawing commands
have been processed, the system cycles back to the first line command in the list.

Random-scan displays are designed to draw all the component lines of a picture 30
to 60 times each second. Random-scan systems are designed for line drawing applications
and cannot display realistic shaded scenes. Since picture definition is stored as a set of line
drawing instructions and not as a set of intensity values for all screen points, vector displays
generally have higher resolution than raster systems. Also, vector displays produce smooth
line drawings because the CRT beam directly follows the line path. A raster system, in
contrast, produces jagged lines that are plotted as discrete point sets.

1.1.1.4 Flat panel displays

Flat-panel displays are found in small TV monitors, calculators, pocket video games, laptop
computers, etc. We can separate flat-panel displays into two categories: emissive displays
and non-emissive displays. The emissive displays (or emitters) are devices that convert
electrical energy into light. Plasma panels, thin-film electroluminescent displays, and Light-
emitting diodes are examples of emissive displays. Non-emissive displays (or non-emitters)
use optical effects to convert sunlight or light from some other source into graphics patterns.
The most important example of a non-emissive flat-panel display is a liquid-crystal device.

Plasma panels, also called gas-discharge displays, are constructed by filling the
region between two glass plates with a mixture of gases that usually includes neon. Another
type of emissive device is the light-emitting diode (LED). A matrix of diodes is arranged
to form the pixel positions in the display, and picture definition is stored in a refresh buffer.


Liquid crystal displays are used in small systems, such as calculators and portable
laptop computers. These non-emissive devices produce a picture by passing polarized
light from the surroundings or from an internal light source through a liquid-crystal material
that can be aligned to either block or transmit the light. Rows of horizontal transparent
conductors are built into one glass plate, and columns of vertical conductors are put into
the other plate. The intersection of two conductors defines a pixel position. Polarized light
passing through the material is twisted so that it will pass through the opposite polarizer.
The light is then reflected back to the viewer. To turn off the pixel, we apply a voltage to the
two intersecting conductors to align the molecules so that the light is not twisted. This type
of flat-panel device is referred to as a passive-matrix LCD. Picture definitions are stored
in a refresh buffer, and the screen is refreshed at the rate of 60 frames per second, as in the
emissive devices. Another method for constructing LCD’s is to place a transistor at each
pixel location, using thin-film transistor technology. The transistors are used to control the
voltage at pixel locations and to prevent charge from gradually leaking out of the liquid-
crystal cells. These devices are called active-matrix displays.


1.1.1.5 Other display devices


Graphics monitors for the display of three-dimensional scenes have been devised
using a technique that reflects a CRT image from a vibrating, flexible mirror. Stereoscopic
and virtual reality systems are more advanced in the way they provide interaction. Another
technique for representing three dimensional objects is displaying stereoscopic views. This
method does not produce true three-dimensional images, but it does provide a three-
dimensional effect by presenting a different view to each eye of an observer so that scenes
do appear to have depth.

1.1.2 Raster Scan Systems

Interactive raster graphics systems typically employ several processing units. In addition
to the central processing unit, or CPU, a special-purpose processor, called the video
controller or display controller, is used to control the operation of the display device. Here,
the frame buffer can be anywhere in the system memory, and the video controller accesses
the frame buffer to refresh the screen.

Raster scan systems also use a separate processor, sometimes referred to as a graphics
controller or a display coprocessor. The purpose of the display processor is to free the
CPU from the graphics chores. In addition to the system memory, a separate display
processor memory area can also be provided. A major task of the display processor is
digitizing a picture definition given in an application program into a set of pixel-intensity
values for storage in the frame buffer. This digitization process is called scan conversion.
Graphics commands specifying straight lines and other geometric objects are scan converted.

1.1.3 Random Scan systems

An application program is input and stored in the system memory along with a graphics
package. Graphics commands in the application program are translated by the graphics
package into a display file stored in the system memory. This display file is then accessed
by the display processor to refresh the screen. The display processor cycles through each
command in the display file program once during every refresh cycle. Sometimes the display
processor in a random-scan system is referred to as a display processing unit or a graphics
controller.

1.1.4 Input Devices


Keyboard : An alphanumeric keyboard on a graphics system is used primarily as a device
for entering text strings.

Mouse : A mouse is a small hand-held box used to position the screen cursor.

Trackball and Spaceball: As the name implies, a trackball is a ball that can be rotated
with the fingers or palm of the hand to produce screen-cursor movement. While a trackball
is a two-dimensional positioning device, a spaceball provides six degrees of freedom.

Joysticks : A joystick consists of a small, vertical lever (called the stick) mounted on a
base that is used to steer the screen cursor around.

Data Glove: A data glove can be used to grasp a “virtual” object. The glove is
constructed with a series of sensors that detect hand and finger motions.

Digitizers: A common device for drawing, painting, or interactively selecting coordinate


positions on an object is a digitizer. These devices can be used to input coordinate values
in either a two-dimensional or a three-dimensional space.

Image Scanners: Drawings, graphs, color and black-and-white photos, or text can be
stored for computer processing with an image scanner by passing an optical scanning
mechanism over the information to be stored. The gradations of gray scale or color are
then recorded and stored in an array.

Touch Panels: As the name implies, touch panels allow displayed objects or screen
positions to be selected with the touch of a finger. A typical application of touch panels is
for the selection of processing options that are represented with graphical icons.

Light Pens: These pencil-shaped devices are used to select screen positions by detecting
the light coming from points on the CRT screen.

Voice Systems: Speech recognizers are used in some graphics workstations as input
devices to accept voice commands

1.1.5 Print Devices

The quality of the pictures obtained from a device depends on dot size and the number
of dots per inch, or lines per inch, that can be displayed. Printers produce output by either
impact or non-impact methods. Impact printers press formed character faces against an
inked ribbon onto the paper. A line printer is an example of an impact device, with the
typefaces mounted on bands, chains, drums, or wheels. Non-impact printers and plotters
use laser techniques, ink-jet sprays, xerographic processes (as used in photocopying
machines), electrostatic methods, and electrothermal methods to get images onto paper.
In a laser device, a laser beam creates a charge distribution on a rotating drum coated with


a photoelectric material, such as selenium. Toner is applied to the drum and then transferred
to paper. Ink-jet methods produce output by squirting ink in horizontal rows across a roll
of paper wrapped on a drum. The electrically charged ink stream is deflected by an electric
field to produce dot-matrix patterns.

1.1.6 Graphics Software

A general graphics programming package provides an extensive set of graphics


functions that can be used in a high-level programming language, such as C or FORTRAN.
An example of a general graphics programming package is the GL (Graphics Library)
system on Silicon Graphics equipment. Basic functions in a general package include those
for generating picture components (straight lines, polygons, circles, and other figures),
setting color and intensity values, selecting views, and applying transformations.

Coordinate Systems

Any picture definition must be converted to Cartesian coordinates before it can be
input to the graphics package. Several different Cartesian reference frames are used to
construct and display a scene. We can construct the shape of individual objects, such as
trees or furniture, in a scene within separate coordinate reference frames called modeling
coordinates, or sometimes local coordinates or master coordinates. Once individual object
shapes have been specified, we can place the objects into appropriate positions within the
scene using a reference frame called world coordinates. Finally, the world-coordinate
description of the scene is transferred to one or more output-device reference frames for
display. These display coordinate systems are referred to as device coordinates or screen
coordinates in the case of a video monitor.

Graphics Functions

A general-purpose graphics package provides users with a variety of functions for


creating and manipulating pictures. These routines can be categorized according to whether
they deal with output, input, attributes, transformations, viewing, or general control. The
basic building blocks for pictures are referred to as output primitives. They include character
strings and geometric entities, such as points, straight lines, curved lines, filled areas
(polygons, circles, etc.), and shapes defined with arrays of color points. Routines for
generating output primitives provide the basic tools for constructing pictures. Attributes
are the properties of the output primitives; that is, an attribute describes how a particular
primitive is to be displayed. They include intensity and color specifications, line styles, text
styles, and area-filling patterns. Functions within this category can be used to set attributes
for an individual primitive class or for groups of output primitives. We can change the size,
position, or orientation of an object within a scene using geometric transformations. Similar
modeling transformations are used to construct a scene using object descriptions given in
modeling coordinates. Given the primitive and attribute definition of a picture in world


coordinates, a graphics package projects a selected view of the picture on an output


device. Viewing transformations are used to specify the view that is to be presented and
the portion of the output display area that is to be used. Pictures can be subdivided into
component parts, called structures or segments or objects, depending on the software
package in use. Each structure defines one logical unit of the picture. Interactive graphics
applications use various kinds of input devices, such as a mouse, a tablet, or a joystick.
Input functions are used to control and process the data flow from these interactive devices.
Finally, a graphics package contains a number of housekeeping tasks, such as clearing a
display screen and initializing parameters. We can lump the functions for carrying out these
chores under the heading of control operations.

Software standards

Graphical Kernel System (GKS): This system was adopted as the first graphics software
standard by the International Standards Organization.

PHIGS: The second software standard to be developed and approved by the standards
organizations was PHIGS (Programmer’s Hierarchical Interactive Graphics standard),
which is an extension of GKS. Increased capabilities for object modeling, color specifications,
surface rendering, and picture manipulations are provided.

Standard graphics functions are defined as a set of specifications that is independent


of any programming language. A language binding is then defined for a particular high-level
programming language. This binding gives the syntax for accessing the various standard
graphics functions from this language.

Although PHIGS presents a specification for basic graphics functions, it does not
provide a standard methodology for a graphics interface to output devices. Nor does it
specify methods for storing and transmitting pictures. Separate standards have been
developed for these areas. Standardization for device interface methods is given in the
Computer Graphics Interface (CGI) system. And the Computer Graphics Metafile (CGM)
system specifies standards for archiving and transporting pictures.

1.1.7 Interactive Input Devices


Graphics applications use several kinds of input data. To make graphics packages
independent of the particular hardware devices used, input functions can be structured
according to the data description to be handled by each function. This leads to logical input
device classification in terms of data to be input by the device.

1.1.7.1 Logical Classification of Input Devices:

Locator: a Device for specifying a coordinate position (x,y)

Stroke: a device for specifying a series of coordinate positions

String: a device for specifying text (character string) input

Valuator: a device for specifying scalar values

Choice: a device for selecting menu options

Pick: a device for selecting picture components


Locator devices: A standard method for interactive selection of a coordinate point is by
positioning the screen cursor. We can do this with a mouse, joystick, trackball, space ball,
thumbwheels, dials, a digitizer stylus or hand cursor, etc.,
A general purpose keyboard usually has four cursor control keys that move the screen
cursor accordingly. Light pens have also been used to input coordinate positions, but some
special implementation considerations are necessary.
Stroke devices: This class of logical devices is used to input a sequence of coordinate
positions. Stroke-device input is equivalent to multiple calls to a locator device. This set of
input points is often used to display line sections. Many of the physical devices used for
generating locator input can be used as stroke devices. Continuous movement of a mouse,
trackball, joystick, or tablet hand cursor is translated into a series of input coordinate
values. The graphics tablet is one of the more common stroke devices.
String devices: The standard device used for string input is the keyboard. Input character
strings are typically used for picture or graph labels.
Valuator devices: This logical class of devices is employed in graphics systems to input
scalar values. Valuators are used for setting various graphics parameters such as rotation
angle and scale factors, and for setting physical parameters associated with a particular
application.
A typical physical device used to provide valuator input is a set of control dials. Any
keyboard with a set of numeric keys can be used as a valuator device. Joysticks, trackballs,
tablets and other interactive devices can be adapted for valuator input by interpreting
pressure or movement of the device relative to a scalar range.


Choice devices: Graphics packages use menus to select programming options, parameter
values and object shapes to be used in constructing a picture. A choice device is defined as
one that enters a selection from a list of alternatives. Commonly used choice devices are a
set of buttons, a cursor positioning device, such as a mouse, trackball, or keyboard cursor
keys, and a touch panel.

For screen selection of listed menu options, we can use cursor-control devices. When
a coordinate position (x,y) is selected, it is compared to the coordinate extents of each
listed menu item. A menu item with vertical and horizontal boundaries at the coordinate
values xmin, xmax, ymin, and ymax is selected if the input coordinates (x,y) satisfy the inequalities

xmin <= x <= xmax ,  ymin <= y <= ymax
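A minimal C sketch of this hit test (the MenuItem struct and menuHit name are illustrative,
not from the text):

/* Bounding box of one menu item in screen coordinates. */
typedef struct { float xmin, xmax, ymin, ymax; } MenuItem;

/* Returns 1 if the selected position (x, y) lies within the item's extents,
   i.e. xmin <= x <= xmax and ymin <= y <= ymax; returns 0 otherwise. */
int menuHit (const MenuItem *item, float x, float y)
{
    return x >= item->xmin && x <= item->xmax &&
           y >= item->ymin && y <= item->ymax;
}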

Pick devices: These are used to select parts of a scene that are to be transformed or
edited in some way. Typical devices used for object selection are the same as those for
menu selection: the cursor-positioning devices. With a mouse or joystick, we can position
the cursor over the primitives in a displayed structure and press the selection button. The
position of the cursor is then recorded, and several levels of search may be necessary to
locate the particular object that is to be selected. First the cursor position is compared to
the coordinate extents of the various structures in the scene. If the bounding rectangle of a
structure contains the cursor coordinates, the picked structure has been identified. But if
two or more structure areas contain the cursor coordinates, further checks are necessary.

1.1.7.2 Input functions

Graphical input functions can be set up to allow users to specify the following options:
 Which physical devices are to provide input within a particular logical classification
(for example, a tablet used as a stroke device)
 How the graphics program and devices are to interact (input mode).
 When the data are to be input and which device is to be used at that time to deliver
a particular input type to the specified data variables.

Input modes
Functions to provide input can be structured to operate in various input modes which
specify how the program and input devices interact. Input could be initiated by the program,
or the program and input devices both could be operating simultaneously, or data input
could be initiated by the devices. These three input modes are referred to as request
mode, sample mode, and event mode.
In request mode, the application program initiates data entry. Input values are requested
and processing is suspended until the required values are received.
In sample mode, the application program and input devices operate independently.
Input devices now may be operating at the same time that the program is processing other


data. New input values from the input devices are stored, replacing previously input data
values. When the program requires new data, it samples the current values from the input
devices.

In event mode, the input devices initiate data input to the application program. The
program and the input devices again operate concurrently, but now the input devices deliver
data to an input queue. All input data are saved. When the program requires new data, it
goes to the data queue.

1.2 DDA TECHNIQUE

There are two popular methods, namely the DDA and Bresenham's methods, through
which a line can be scan converted. Let us first discuss the Digital Differential Analyzer algorithm.

1.2.1 Digital Differential Analyzer algorithm

Normally two end points are given as input to the algorithm, and the slope is computed
from them. The slope can take any positive or negative value and is, in general, a floating-point
quantity, whereas the display system can address only discrete, integer pixel positions. The
ideal line is the one whose x and y coordinates take exact floating-point values. The goal is
to identify the pixels that lie nearest to this ideal line imposed on the 2D raster grid; these
pixels form the scan-converted line. The scan-converted line should have constant, even
brightness and thickness, independent of orientation and length, and should be drawn as
rapidly as possible. The scenario is pictured below, where the shaded part indicates the
potential displayable candidate points and the dark line is the ideal line. The question is:
how do we choose which of these pixels to select for proper display of the line?


Figure 1.1 A Line display with 2D raster grid

DDA Algorithm

The digital differential analyser (DDA) algorithm is a basic incremental algorithm.

Let xi+1 = xi + Δx. Then, from the equation of a line:

yi = m.xi + B

=> yi+1 = m.xi+1 + B

=> yi+1 = m.(xi + Δx) + B

=> yi+1 = yi + m.Δx

Assuming Δx = 1 and |m| <= 1, then

yi+1 = yi + m

When |m| > 1, we take Δy = 1 and

xi+1 = xi + (1/m)

Since program variables have limited precision, the cumulative build-up of rounding
error can cause problems for long lines; this is the main disadvantage of the method.

Pseudo code for the DDA algorithm

void ddaLine (int x0, int y0, int x1, int y1)
{
    /* assumes x0 <= x1 and y0 <= y1; needs <math.h> for fabs() and round() */
    double dx = x1 - x0;
    double dy = y1 - y0;
    double m  = dy / dx;

    if (fabs(m) <= 1.0)                /* gentle slope: step in x, add m to y  */
    {
        double y = y0;
        int x;
        for (x = x0; x <= x1; x++)
        {
            writePixel (x, (int) round(y));
            y = y + m;
        }
    }
    else                               /* steep slope: step in y, add 1/m to x */
    {
        double x = x0;
        int y;
        for (y = y0; y <= y1; y++)
        {
            writePixel ((int) round(x), y);
            x = x + 1.0 / m;
        }
    }
}

However, it does not deal with the special case of vertical lines, where dx = 0 and the
slope is infinite; that case must be handled separately.


1.3 BRESENHAM’S LINE ALGORITHM


Bresenham developed an incremental algorithm that uses only integer arithmetic. The
technique can also be applied to circles, but not to arbitrary curves and conics (ellipses and
hyperbolae). The midpoint algorithm reduces to Bresenham's algorithm for lines and circles.
We will concentrate on lines with slope from 0 to 1 and later indicate how this can be
generalized. The algorithm gives a rule for choosing the best pixel to display among the
candidate pixels. The scenario is as shown below.

Figure 1.2 Line in a Grid with Mid point notation

The black dot is the current pixel position on the line. Since the slope is less than one
(by assumption), x is simply incremented for the next pixel, and the y coordinate of the next
pixel (NE or E) is under consideration for the next pixel display. Let Q be the intersection
of the line with the grid line x = xp + 1. The choice of pixel to use for the line is between E and
NE. The obvious choice is to pick the pixel closest to Q. Another way to ask the same
question is: is Q above or below M (the midpoint)? The error in the choice is always
<= 1/2. To determine which side of M Q lies on, we use both forms of the equation of a line.

f(x,y) = ax + by + c = 0

y = (dy/dx).x + B,   where dy = y1 - y0 and dx = x1 - x0

(The B in the slope-intercept form and the b in the implicit form are unrelated.) Therefore,

f(x,y) = dy.x - dx.y + B.dx = 0

so a = dy, b = -dx and c = B.dx

The sign of f at the midpoint, f(xp+1, yp+1/2), determines whether M is above or below the line.

Hence the decision variable is

d = f(xp+1, yp+1/2)

 if d > 0, we choose NE
 if d < 0, we choose E
 if d = 0, pick either


Note that if a point is:

– on the line, f(x,y) is zero;
– below the line, f(x,y) is positive;
– above the line, f(x,y) is negative.

Once we have chosen either e or ne we can calculate the value of d for the next grid
line

(xp +2).

If e is chosen, then:

Dnew = f(xp+2,yp+1/2)=a(xp+2)+b(yp+1/2)+c

But

Dold = a(xp+1)+b(yp+1/2)+c

Subtracting yields dnew = dold +a

If ne is chosen, then:

Dnew = f(xp+2,yp+ 3/2)=a(xp+2)+b(yp+3/2)+c

Subtracting dold yields:

Dnew = dold + a + b

The change in the decision variable


 if E is chosen, ΔE = a = dy; and
 if NE is chosen, ΔNE = a + b = dy - dx

The initial value of d is determined directly from the first midpoint.

f(x0+1, y0+1/2) = a(x0+1) + b(y0+1/2) + c = a.x0 + b.y0 + c + a + b/2

= f(x0, y0) + a + b/2

Since f(x0, y0) is zero by definition (the starting point is on the line), then

dstart = a + b/2 = dy - dx/2


 To get rid of the division in dstart we multiply everything by 2. Since this does not
change the sign of the decision variable d, it does not affect the correctness of the
algorithm, only its efficiency.
 The following code generates lines with slopes from 0 to 1.


For compatibility with the circle and ellipse algorithms we will increment x and y after
updating the decision variable.

Pseudo code for midpoint line algorithm,


void midpointLine (int x0, int y0, int x1, int y1)
{
    int dx = x1 - x0;
    int dy = y1 - y0;
    int d = 2*dy - dx;           /* start value for the decision variable */
    int incrE  = 2*dy;           /* increment applied when E is chosen    */
    int incrNE = 2*(dy - dx);    /* increment applied when NE is chosen   */
    int x = x0;
    int y = y0;
    writePixel (x, y);
    while (x < x1)
    {
        if (d <= 0)
        {                        /* choose E  */
            d = d + incrE;
            x++;
        }
        else
        {                        /* choose NE */
            d = d + incrNE;
            x++; y++;
        }
        writePixel (x, y);
    }
}

extending the code to cover the other 7 octants can be done by looking at the
relationships between the octants.


Figure 1.3 Circle and its 8 octants with X and Y polarity

 by swapping the x and y values the algorithm can be applied in octant 2.


– can you figure out the rest?
– endpoint order

 when scanning a line from p0 to p1 and the decision variable was 0 we choose the e
pixel.

 when scanning from p1 to p0 and the decision variable is 0, picking the W pixel would
be a mistake!
– it would produce different pixels when travelling in different directions; we need to
choose the SW pixel.
– variable intensity

 a line with slope 0 will have the same number of pixels as a line with slope 1, even
though the second line is √2 times longer.

– can be dealt with if pixels have multiple intensity levels.

1.4 CIRCLE DRAWING ALGORITHMS

The simple approach is to use the equation of the circle:

x2 + y2 = r2

y = ± √(r2 - x2)

 we can increment x from 0 to r and solve for y. This has 2 major problems:
– inefficient: multiply and square root operations.
– as the slope becomes large, gaps appear between the plotted points.
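A minimal C sketch of this naive approach (assuming the writePixel routine used elsewhere
in these notes); it exhibits both problems, needing a square root per step and leaving gaps
where the slope is steep:

/* Naive circle scan conversion: step x from 0 to r and solve for y.
   Plots the upper half of a circle of radius r centred at the origin.
   Needs <math.h> for sqrt() and round(). */
void naiveCircle (int r)
{
    int x;
    for (x = 0; x <= r; x++)
    {
        int y = (int) round(sqrt((double)(r*r - x*x)));
        writePixel ( x, y);
        writePixel (-x, y);        /* mirror into the second quadrant */
    }
}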


Figure 1.4 Circle plotted with pixels from y axis

1.4.1 Midpoint Circle Algorithm

 We focus on one octant of the circle and then we can map the results to the other
7 octants.
 Similar to the midpoint line algorithm we choose between 2 candidate “next” pixels
by using a midpoint criterion.

Figure 1.5 Circle with midpoint M notation

Location of current pixel is at p=( Xp, Yp)

 let f(x,y) = x2 + y2 - r2
– f is zero for points on the circle.
– f is positive for points outside the circle.
– f is negative for points inside the circle.


The decision variable, d, is

dold = f(xp+1, yp-1/2) = (xp+1)2 + (yp-1/2)2 - r2

If dold < 0, then E is chosen, and the new decision variable is:

dnew = f(xp+2, yp-1/2) = (xp+2)2 + (yp-1/2)2 - r2

Evaluating and subtracting yields:

dnew = dold + 2xp + 3,  so  ΔE = 2xp + 3

If dold >= 0, then SE is chosen, and the new decision variable is:

dnew = f(xp+2, yp-3/2) = (xp+2)2 + (yp-3/2)2 - r2

dnew = dold + 2xp - 2yp + 5,  so  ΔSE = 2xp - 2yp + 5

The start of the circle is at (0, r) and the first midpoint is at (1, r-1/2). Hence the initial
value of the decision variable is:

dstart = f(1, r-1/2) = 1 + (r2 - r + 1/4) - r2

dstart = 5/4 - r

To remove the fraction from the calculation, substitute h = d - 1/4. The initial value is
h = 1 - r, and since h will be incremented by integer values (ΔE, ΔSE) we can change the
test to h < 0.

 We can improve the algorithm further by realising that ΔE and ΔSE are linear functions
of xp and yp. By applying the differencing approach again we can update them incrementally.

 If we choose E then the evaluation point moves to (xp+1, yp).

ΔEold = ΔE(xp, yp) = 2xp + 3
ΔEnew = ΔE(xp+1, yp) = 2(xp+1) + 3
ΔEnew - ΔEold = 2

ΔSEold = ΔSE(xp, yp) = 2xp - 2yp + 5
ΔSEnew = ΔSE(xp+1, yp) = 2(xp+1) - 2yp + 5
ΔSEnew - ΔSEold = 2

 Similarly, if we choose SE then the evaluation point moves to (xp+1, yp-1).

ΔEold = ΔE(xp, yp) = 2xp + 3
ΔEnew = ΔE(xp+1, yp-1) = 2(xp+1) + 3
ΔEnew - ΔEold = 2

ΔSEold = ΔSE(xp, yp) = 2xp - 2yp + 5
ΔSEnew = ΔSE(xp+1, yp-1) = 2(xp+1) - 2(yp-1) + 5
ΔSEnew - ΔSEold = 4

 The midpoint circle algorithm with second-order differences is therefore as given below.


Pseudo code for the midpoint circle algorithm with second-order differences:

void midpointCircle (int radius)
{
    int x = 0;
    int y = radius;
    int d = 1 - radius;               /* decision variable after the d - 1/4 substitution */
    int deltaE  = 3;                  /* initial ΔE  at (0, r) */
    int deltaSE = (-2 * radius) + 5;  /* initial ΔSE at (0, r) */
    writePixel (x, y);
    while (y > x)
    {
        if (d < 0)
        {                             /* choose E  */
            d = d + deltaE;
            deltaE  = deltaE  + 2;
            deltaSE = deltaSE + 2;
        }
        else
        {                             /* choose SE */
            d = d + deltaSE;
            deltaE  = deltaE  + 2;
            deltaSE = deltaSE + 4;
            y--;
        }
        x++;
        writePixel (x, y);
    }
}
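The routine above plots pixels in a single octant only. A small companion helper (not given
in the notes; the name circlePoints is illustrative) can mirror each computed point into all
8 octants of a circle centred at (cx, cy):

/* Plot the 8 symmetric points of (x, y) for a circle centred at (cx, cy). */
void circlePoints (int cx, int cy, int x, int y)
{
    writePixel (cx + x, cy + y);   writePixel (cx - x, cy + y);
    writePixel (cx + x, cy - y);   writePixel (cx - x, cy - y);
    writePixel (cx + y, cy + x);   writePixel (cx - y, cy + x);
    writePixel (cx + y, cy - x);   writePixel (cx - y, cy - x);
}

In midpointCircle above, each call to writePixel (x, y) would then be replaced by a call to
circlePoints (cx, cy, x, y).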
1.5 LINE CLIPPING
Cohen-Sutherland 2d clipping Method
 The cohen-sutherland line clipping algorithm clips lines to rectangular clipping regions.
 The clipping region is defined by 4 clipping planes.
 The 2d plane is divided into regions, each with its own outcode.
 The endpoints of each line to be clipped are assigned a region depending on their
location.


Figure 1.6 Region code representation for cohen-sutherland clipping

Cohen-sutherland line clipping algorithm


 If both endpoints’ outcodes are 0000 then the line is trivially accepted and rendered
as is.
 If a bitwise and of both endpoints is not zero then the line can be trivially rejected
and not rendered at all.
 If the line cannot be trivially accepted or rejected it may enter the clipping region.
– divide the line in 2 based on where it intersects with one of the clipping planes.
 The segment outside the clipping plane can be discarded.
 The remaining segment’s new endpoint is assigned an outcode and it is tested.
– the clipping planes can be tested in any order, but the order must be the same
each time the algorithm is used.
 We will use the order of our outcode bits: top, bottom, right, left.

Pseudo code for the Cohen-Sutherland line clipping algorithm


class CohenSutherland
{
public static final byte top = 0x1;
public static final byte bottom = 0x2;
public static final byte right = 0x4;
public static final byte left = 0x8;
void clip2d (float x0, float y0, float x1, float y1, float xmin, float xmax,
float ymin, float ymax)
{
byte outcode0, outcode1, outcodeout;
boolean accept = false, done = false;


float x, y;
NOTES outcode0 = computeoutcode (x0, y0, xmin, xmax, ymin, ymax);
outcode1 = computeoutcode (x1, y1, xmin, xmax, ymin, ymax);
do
{
if ((outcode0 | outcode1) == 0)
{
accept = true; done = true; // trivial accept
}
else if ((outcode0 & outcode1) != 0)
{
done = true; // trivial reject
}
else
{
// Calculate intersection with a clipping plane using
// y = y0 +slope*(x-x0) and x = x0 + (1/slope)*(y-y0)
outcodeout = outcode0 ? outcode0 : outcode1;
if (outcodeout & top)
{
x = x0 + (x1-x0)*(ymax-y0)/(y1-y0);
y = ymax;
}
else if (outcodeout & bottom)
{
x = x0 + (x1-x0)*(ymin-y0)/(y1-y0);
y = ymin;
}
else if (outcodeout & right)
{
y = y0 + (y1-y0)*(xmax-x0)/(x1-x0);
x = xmax;
}
else if (outcodeout & left)
{


y = y0 + (y1-y0)*(xmin-x0)/(x1-x0);
x = xmin;
}
if (outcodeout == outcode0)
{
x0 = x; y0 = y;
outcode0 = computeoutcode (x0, y0, xmin, xmax, ymin, ymax);
}
else
{
x1 = x; y1 = y;
outcode1 = computeoutcode (x1, y1, xmin, xmax, ymin, ymax);
}
}
} while (done == false);
if (accept)
{
drawline (x0, y0, x1, y1);
}
}
byte computeoutcode (float x, float y, float xmin, float xmax, float ymin, float ymax)
{
byte outcode = 0;
if (y > ymax)
outcode = outcode | top;
else if (y < ymin)
outcode = outcode | bottom;
if (x > xmax)
outcode = outcode | right;
else if (x < xmin)
outcode = outcode | left;
return outcode;
}
}


1.6 TEXT CLIPPING


The task here is to clip text against the window boundary. There are 3 types of text
clipping, namely all-or-none text clipping, all-or-none character clipping, and clipping of
individual characters.

All or none text clipping : Since STRING 1 lies only partially inside the clip
window, STRING 1 is completely excluded (clipped). Here STRING 2 alone is displayed
after clipping.

Figure 1.7a Text Clipping Scenario

 All or none character clipping : Here only the characters that lie completely inside
the window are displayed after clipping, as shown below.

Figure 1.7b Text Clipping Scenario


 Clipping individual character : Here a character that partially overlaps the window
boundary is displayed partially (i.e. the visible part alone), which produces a more realistic
result.



Figure 1.7c Text Clipping Scenario
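A minimal C sketch of the first strategy, all-or-none text clipping (the Box struct and
keepString name are illustrative; the string's bounding box is assumed to be known):

/* Axis-aligned bounding box, e.g. of a text string or of the clip window. */
typedef struct { float xmin, ymin, xmax, ymax; } Box;

/* All-or-none text clipping: keep the string only if its whole bounding
   box lies inside the clip window; otherwise discard it completely. */
int keepString (const Box *text, const Box *window)
{
    return text->xmin >= window->xmin && text->xmax <= window->xmax &&
           text->ymin >= window->ymin && text->ymax <= window->ymax;
}

All-or-none character clipping applies the same test per character box, while
individual-character clipping instead clips each character's pixels or outline against the window.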

QUESTIONS

1) Discuss the pros and cons of DDA method.


2) List out the applications of graphics.
3) Design and develop an algorithm for ellipse using Mid-Point method.
4) Modify the Cohen – Sutherland algorithm for a polygonal object
5) Suggest a suitable method of text clipping for animated character display with
justification.

UNIT II

2D TRANSFORMATIONS
LEARNING OBJECTIVES

 To understand the different types of transformations namely scaling, rotation,


translation, reflection and shear.
 To learn different types of curves and mapping transformation.

2.1 2D TRANSFORMATIONS

A transformation is a change applied to graphical objects, shapes or other output
primitives, such as a change in position, size or orientation.

 There are 5 types of transformations, namely translation, scaling and rotation, together
with two special transformations, reflection and shear.
– Translation

 Moving an object to a new position by adding offsets to the object's x and y coordinates.


– Scaling

 Changing the size of the object. This can be uniform where both dimensions are
resized by the same factor, or non-uniform.
– Rotation

 Object is rotated around the origin by a specified angle θ.

2.1.1 2D Translation – Approach A

 We want to move a point p at (x,y) to a new position p’ at (x’, y’) where


X’ = X + DX and Y’ = Y + DY
 we can represent the points and the translation as column vectors.

x x d x 
p   , p'   , T   
 y  y d y 


 the translation can be expressed as;


P' = P + T
 a line or shape can be translated by translating its vertices and redrawing the line or
shape.
2.1.2 2D Scaling – Approach A
 scaling refers to rescaling along an axis. A point can be scaled along the x-axis or
the y-axis, or both. It provides zoom-in and zoom- out effects on the graphical
objects.

X’ = SX.X Y’ = SY.Y
SX= Scaling coefficient in X direction
SY = Scaling coefficient in Y direction

 in matrix form this is:

 x1   sx 0  x
y    0 s y   y 
 1 

 or by symbol notation p’ = s.p ; where s is the scaling matrix

2.1.3 2D Rotation – Approach A

 Points are rotated around the origin by an angle θ. For complete objects like
squares, rectangles or any polygons, point-wise processing is done to rotate every vertex,
and then lines are drawn to complete the shape.

x1  x. cos   y. sin  , y1  x. sin   y. cos

 In matrix form, rotation by an angle θ about the origin for a given point (x,y)
is as follows:

 x1  cos   sin   x


 y    sin  cos    y
 1    

by symbol notation P’ = R.P


2.1.4 Homogeneous Coordinates


The need for homogeneous coordinates is as follows. The problem with Approach A is
that different transformations are handled differently, some by matrix addition and some by
matrix multiplication.
– P’ = P + T
– P’ = S.P
– P’ = R.P
 the solution is to use homogeneous coordinates.
 each point (x, y) is represented as a triple (x, y, w).
– two triples represent the same point if one is a multiple of the other.
 (1,2,3) and (3,6,9) represent the same point.
– each 2d point is now a line in a 3d space.

The w = 1 plane is our 2d space and the intersection of the line with this plane gives
us the point.

in a homogeneous coordinate system translation, scaling and rotation are matrix
multiplications.

Homogeneous transformations

 TRANSLATION

 x1  1 0 d x   x 
 1   
 y   0 1 d y   y 
 1   0 0 1   1 
 

 SCALING

 x1  s x 0 0   x 
 y1  0 s 0  y 
   y  
 1   0 0 1   1 

 ROTATION

 x1  cos  sin  0  x 
 1   
 y    sin  cos  0   y 
 1   0 0 1   1 
 


THE GENERAL FORM OF THE 2D TRANSFORMATION MATRIX IS:


r11 r12 tx
r21 r22 ty
0 0 1

 multiplying a set of transformation matrices is equivalent to applying a sequence of


transformations one after another.
– the order is very important.
– if you want to apply transformation a then b and finally c, you can pre-calculate
the combined effect as c.b.a
– The resulting transformation is an affine transformation.
 preserves parallel lines, but not lengths or angles
An example for transformation is as follows;

 suppose we wish to reduce the square in the following image to half its size and
rotate it by 45° about point p.

 we need to remember that scaling and rotation take place with respect to the
origin, so the composite transformation is built as follows (see the sketch after this list):
- translate square so that the point around which the rotation is to occur is at the
origin.
- Scale
- Rotate
- translate origin back to position p.
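A minimal C sketch of building that composite matrix with 3x3 homogeneous matrices (the
Mat3 type and the helper names matMul, translate, scale, rotate are illustrative, not from
the text). The combined matrix is M = T(p) . R(45°) . S(1/2) . T(-p), applied right-to-left
to each vertex of the square:

#include <math.h>

typedef struct { double m[3][3]; } Mat3;     /* 3x3 homogeneous matrix */

/* c = a.b  (standard matrix product) */
Mat3 matMul (Mat3 a, Mat3 b)
{
    Mat3 c;
    for (int i = 0; i < 3; i++)
        for (int j = 0; j < 3; j++) {
            c.m[i][j] = 0.0;
            for (int k = 0; k < 3; k++)
                c.m[i][j] += a.m[i][k] * b.m[k][j];
        }
    return c;
}

Mat3 translate (double dx, double dy)
{
    Mat3 t = {{{1, 0, dx}, {0, 1, dy}, {0, 0, 1}}};
    return t;
}

Mat3 scale (double sx, double sy)
{
    Mat3 s = {{{sx, 0, 0}, {0, sy, 0}, {0, 0, 1}}};
    return s;
}

Mat3 rotate (double theta)                   /* theta in radians */
{
    Mat3 r = {{{cos(theta), -sin(theta), 0},
               {sin(theta),  cos(theta), 0},
               {0,           0,          1}}};
    return r;
}

/* Halve the size and rotate by 45 degrees about the point p = (px, py). */
Mat3 halfSizeAndRotateAbout (double px, double py)
{
    double a = 45.0 * 3.14159265358979323846 / 180.0;
    return matMul (translate (px, py),
           matMul (rotate (a),
           matMul (scale (0.5, 0.5), translate (-px, -py))));
}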


ANOTHER EXAMPLE

 because of the particular structure of the last row of the 2d homogeneous


transformation matrix the number of operations can be reduced from 9
multiplications and 6 additions to 4 multiplications and 4 additions.
X’ = X.R11 + Y.R12 + TX
Y’ = X.R21 + Y.R22 + TY
 if a model is being incrementally rotated by a small angle θ, then by noting that cosθ
is very close to 1 for small values of θ, then
X' = X - Y.sinθ
Y' = X.sinθ + Y

just 2 multiplications and 2 additions.

– this is just an approximation and as the errors accumulate the image will become
unrecognizable.
– a better approximation is
X' = X - Y.sinθ
Y' = X'.sinθ + Y = (X - Y.sinθ).sinθ + Y
   = X.sinθ + Y(1 - sin²θ)

 the corresponding 2x2 matrix has a determinant of 1 and hence preserves areas.
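A minimal C sketch of this incremental rotation step (illustrative only). Using the
already-updated x' when computing y is exactly what gives the step matrix a determinant
of 1, so repeated application does not make the model grow or shrink:

/* One incremental rotation step by a small angle whose sine is sinTheta.
   x' = x - y.sin(theta);  y' = x'.sin(theta) + y = x.sin(theta) + y(1 - sin^2(theta)).
   The 2x2 step matrix has determinant 1, so areas are preserved. */
void rotateStep (double *x, double *y, double sinTheta)
{
    double xNew = *x - (*y) * sinTheta;
    *y = xNew * sinTheta + *y;
    *x = xNew;
}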


2.1.5 Reflection
Reflection generates mirror-image effects about the x axis, the y axis, or both. The
general form of the reflection matrix is as follows, where RX and RY are the corresponding
reflection coefficients.

RX 0 0

0 RY 0

0 0 1

The matrix for reflection about the x axis is:

1 0 0

0 -1 0

0 0 1

The matrix for reflection about the y axis is:

-1 0 0

0 1 0

0 0 1

The matrix for reflection about both the x and y axes is:

-1 0 0

0 -1 0

0 0 1

2.1.6 Shear Transformation

Shear transformation creates a 'slide over' effect on the object it operates on. Assume
that books are stacked one above the other; if a force is applied in one direction it causes
a distorted, slanted effect on the stack. There are x-direction and y-direction shears, whose
corresponding transformation matrices are as follows:

1 Shx 0

0 1 0

0 0 1


1 0 0

Shy 1 0

0 0 1

Where Shx and Shy are the shear factors in the x and y directions respectively.

2.2 INTERACTIVE INPUT METHODS

Functions to provide input can be structured to operate in various input modes.

Request Mode : The application program initiates data entry. Input values are
requested and processing is suspended until the required values are received. Program
waits until the data are delivered.(only one input device will be operating.)

Sample mode: The application program and input devices operate independently.
Input devices may be operating at the same time that the program is processing other
data. New input values are stored , replacing previous values. When the program requires
new data it samples the current values.

Event Mode: Program and the input devices again operate concurrently, but now
the input devices deliver data to an input queue. All input data are saved. When the program
requires new data, it goes to the data queue.

2.3 POLYGON MESHES


 A polygon mesh is a set of bounded polygons such that each edge is shared by at
most 2 polygons.
 polygon meshes can be represented in several ways.
– explicit representation
 surface is a set/list of polygons.
 each polygon is represented by a list of vertices.
 the order in which the vertices are stored defines the orientation of the polygon.
 inefficient use of memory as shared vertices are duplicated. Also no record of
shared vertices or edges. If a vertex is modified, the complete set of polygons needs to be
searched to see which polygons are affected.
– pointers to a vertex list
 each polygon is defined by a list of indices into a list of vertices.
 more efficient memory usage as shared vertices are not duplicated.
 a change to a vertex is immediately reflected in each polygon that uses that vertex.
 edges may still be duplicated, hence an edge that is shared between 2 polygons will
be drawn twice.
 finding the 2 polygons that share an edge still requires an n.log2(n) search (at best).


– Pointers to an edge list


– The polygon mesh is represented by a list of vertices, a list of edges and a set/
list of polygons.
 each polygon is defined by its edges.
 each edge is defined by two indices in to the vertex list and the two polygons that
share the edge, i.e. (v1,v2, p1, p2). If the edge is used by only one polygon then
either p1 or p2 is null.
 determining which edge is incident on a given vertex still requires all edges to be
inspected.

V = (V1,V2,V3,V4,V5)

E1 = (V1, V2, P2, null)

E2 = (V2, V3, P3, null)

E6 = (V5, V1, P1, null)

E7 = (V1,V6,P1, P2)

E8 = (V6,V2,P2,P3)

P1 = (E7,E5,E6), P2=(E1,E8,E7)

P3=(E8,E2,E3,E4,E5)
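A minimal C sketch of the edge-list representation just described (struct and field names
are illustrative): each edge records its two vertex indices and the one or two polygons that
share it, and each polygon is simply a list of edge indices.

#define NO_POLYGON (-1)        /* used when an edge borders only one polygon   */

typedef struct { float x, y, z; } Vertex;

typedef struct {
    int v1, v2;                /* indices into the vertex list                 */
    int p1, p2;                /* the (at most two) polygons sharing this edge */
} Edge;

typedef struct {
    int  numEdges;
    int *edges;                /* indices into the edge list                   */
} Polygon;

typedef struct {
    Vertex  *vertices;  int numVertices;
    Edge    *edges;     int numEdges;
    Polygon *polygons;  int numPolygons;
} Mesh;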
 When working with polygon meshes, it is important to ensure that the mesh
representation is consistent.
– All polygons are closed.


– each edge is used at least once and not more than twice.
– each vertex is referenced by at least two edges.
 plane equations
– if a polygon is defined by more than 3 vertices, the polygon may be non-planar.
– if the polygon is defined by 3 vertices it is planar and defined by:

ax + by + cz + d = 0

Where [a, b, c]^T is the normal to the plane.

– the normal is also given by the cross product p1p2 × p2p3 (or p2p3 × p2p1, or p3p1 × p3p2).


Once you have a, b and c you can use any vertex to calculate d.
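A minimal C sketch of that calculation (assuming the Vertex type from the sketch above;
the function name is illustrative): the normal [a, b, c] is the cross product of two edge
vectors, and d follows by substituting one vertex into the plane equation.

/* Plane ax + by + cz + d = 0 through the vertices p1, p2, p3.
   The normal (a, b, c) is the cross product (p2 - p1) x (p3 - p2). */
void planeFromTriangle (Vertex p1, Vertex p2, Vertex p3,
                        float *a, float *b, float *c, float *d)
{
    float ux = p2.x - p1.x, uy = p2.y - p1.y, uz = p2.z - p1.z;
    float vx = p3.x - p2.x, vy = p3.y - p2.y, vz = p3.z - p2.z;

    *a = uy * vz - uz * vy;            /* cross product u x v */
    *b = uz * vx - ux * vz;
    *c = ux * vy - uy * vx;

    *d = -(*a * p1.x + *b * p1.y + *c * p1.z);   /* any vertex fixes d */
}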

2.4 SPLINES AND PARAMETRIC CUBIC CURVES

 A spline curve is defined by a set of control points. These control points are either:
– interpolating control points: the curve passes through these control points.
– approximating control points: these control points exert an influence on the curve.
 the parametric equation for a line is:
X = X0(1-T) + X1.T ,   0 <= T <= 1
Y = Y0(1-T) + Y1.T ,   0 <= T <= 1

X = X0 + (X1 - X0).T
Y = Y0 + (Y1 - Y0).T

X = A0 + A1.T
Y = B0 + B1.T

 Extending this idea to cubic curves we have:

X = A0 + A1.T + A2.T2 + A3.T3 ,   0 <= T <= 1

Y = B0 + B1.T + B2.T2 + B3.T3 ,   0 <= T <= 1

 Substituting t=0 at (x,y)=(x0,y0) and t=1 at

(x,y) = (x1,y1) we get 4 equations with 8 unknowns.

X0=A0

Y0=B0

X1=A0+A1+A2+A3

Y1=B0+B1+B2+B3


 We need more information to calculate the coefficients. When we join 2 curves,


we usually want to control the slope of the curve at the join.

Curves that join have geometric continuity, G0. Curves whose tangent vectors have
equal magnitude and direction at the join have parametric continuity, C1.

2.5 BEZIER CURVES

 A 4 point bezier curve is defined by 4 control points; p0, p1, p2, p3.
– p0 and p3 are interpolating control points through which the curve must pass.
– p1 and p2 are approximating control points that control the slope of the curve at
p0 and p3 respectively.
 the slope of p0p1 defines the tangent gradient at p0 and the slope of p2p3 defines
the tangent gradient at p3.

 The parametric equations for a 4-point Bezier curve are:

X = X0(1-T)3 + 3X1(1-T)2T + 3X2(1-T)T2 + X3T3
Y = Y0(1-T)3 + 3Y1(1-T)2T + 3Y2(1-T)T2 + Y3T3

 Differentiating yields:

dX/dT = -3X0(1-T)2 + 3X1(1-T)2 - 6X1(1-T)T + 6X2(1-T)T - 3X2T2 + 3X3T2
dY/dT = -3Y0(1-T)2 + 3Y1(1-T)2 - 6Y1(1-T)T + 6Y2(1-T)T - 3Y2T2 + 3Y3T2

 At T = 0:

dX/dT = -3X0 + 3X1
dY/dT = -3Y0 + 3Y1

So dY/dX = (-3Y0 + 3Y1) / (-3X0 + 3X1) = (Y1 - Y0) / (X1 - X0)

 For N control points the curve generalises to:

        N-1
X(T) =   Σ   BR.XR.(1-T)^(N-R-1).T^R
        R=0

        N-1
Y(T) =   Σ   BR.YR.(1-T)^(N-R-1).T^R
        R=0

where BR = ((N-R)/R).BR-1 and B0 = 1


 PSEUDO CODE TO IMPLEMENT A BEZIER CURVE.


void drawbezier (int n, float[] x, float[] y)
{
int[] b = new int[n];
int i, r;
float[] xcoeff = new float[n];
float[] ycoeff = new float[n];
float t, xt, yt, ttopowerr, tcomptopower;
b[0] = 1;
xcoeff[0] = b[0] * x[0];
ycoeff[0] = b[0] * y[0];
for (r = 1; r < n; r++)
{
b[r] = b[r-1] * (n - r) / r;   // binomial coefficients C(n-1, r), exact in integer arithmetic
xcoeff[r] = b[r] * x[r];
ycoeff[r] = b[r] * y[r];
}
for (i = 1; i < 100; i++) // from p0 to pn in 100 steps
{
t = 0.01 * i;
ttopowerr = 1;
tcomptopower = 1 - t;
for (r = 2; r < n; r++)
{
tcomptopower = tcomptopower * (1 - t);
}
xt = 0.0;
yt = 0.0;
for (r = 0; r < n; r++)
{
xt = xt + xcoeff[r] * ttopowerr * tcomptopower;
yt = yt + ycoeff[r] * ttopowerr * tcomptopower;
ttopowerr = ttopowerr * t;
tcomptopower = tcomptopower / (1 - t);
}
// plot (xt, yt)
}
}


2.6 WINDOW AND VIEWPORT TRANSFORMATION


Some facts about our “universe”
 objects can be placed anywhere in our “universe”. They can be described in terms
of a world coordinate system. There can be several sources of lighting.
 we are viewing our “universe” from a particular position in the universe.
 we are viewing our universe through a particular “window” which we call a view
port. In other words window is all about “what is to be viewed”
 and view port is all about “where the captured image is to be displayed”.

An example of a window is the square or rectangular dotted outline seen in most
cameras' viewfinders. There can be any number of viewports present or opened on the
viewing device. Multiple viewports are justified because different viewing devices, such as
CRTs and LCD panels, vary widely in their dimensions in practice.

Figure 2.1 The Big Picture

A point at position (Xw, Yw) in the window is mapped into position (Xv, Yv) in the
associated viewport. To maintain the same relative placement in the viewport as in the
window, we need that,

(Xv-Xvmin) / (Xvmax- Xvmin) = (Xw-Xwmin) / (Xwmax-Xwmin)

Similarly for y coordinate also.

Where Xvmin and Xvmax are the viewport minimum and maximum coordinates, and
similarly Xwmin and Xwmax are the window coordinates.

Solving these expressions yields,

Xv= Xvmin + (Xw-Xwmin)Sx ; Yv = Yvmin +(Yw-Ywmin) Sy


Where Sx and Sy are scaling factors with values as below,


Sx= (Xvmax- Xvmin) / (Xwmax-Xwmin)

Sy= (Yvmax- Yvmin) / (Ywmax-Ywmin)
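A minimal C sketch of this mapping (function and parameter names are illustrative): it
computes Sx and Sy as above and maps a window point into the viewport.

/* Map a point (xw, yw) in window coordinates into viewport coordinates,
   using xv = xvmin + (xw - xwmin).Sx and yv = yvmin + (yw - ywmin).Sy. */
void windowToViewport (double xw, double yw,
                       double xwmin, double xwmax, double ywmin, double ywmax,
                       double xvmin, double xvmax, double yvmin, double yvmax,
                       double *xv, double *yv)
{
    double sx = (xvmax - xvmin) / (xwmax - xwmin);
    double sy = (yvmax - yvmin) / (ywmax - ywmin);
    *xv = xvmin + (xw - xwmin) * sx;
    *yv = yvmin + (yw - ywmin) * sy;
}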

QUESTIONS
1. Perform a 2d scaling(sx=5, sy = 5) for a point (10,10), with respect to a reference
axis located about 30 degrees from x axis.
2. What is composite transformation matrix? State the need.
3. State the examples for different types of input modes?
4. Define an efficient polygon representation for a cylinder.
5. Write a routine to display any specified conic in the xy plane using a rational Bezier
spline representation.

UNIT III

3D TRANSFORMATIONS
LEARNING OBJECTIVES

 To understand the different types of transformations, projections, visible surface


detection procedures and polygon rendering.
 To learn different color models, animation and morphing.

3.1 THREE DIMENSIONAL CONCEPTS


 Some facts about our “universe”
 Objects can be placed anywhere in our “universe”. They can be described in
terms of a world coordinate system.
 There can be several sources of lighting.
 We are viewing our “universe” from a particular position in the universe.
 We are viewing our universe through a particular “window” which we call a
view port. In other words window is all about “what is to be viewed”

And view port is all about “where the captured image is to be displayed”.

An example of a window is the square/rectangular dotted outline present in most
cameras' viewing glasses. There can be any number of view ports present or opened in the
viewing device. View ports are needed because different viewing devices, such as crt and
lcd panels, vary widely in their dimensions in practice.


Figure 3.1 The Big picture with view port

Creating the image


 In order to figure out what appears in the view port we need to orient the universe
 So that the view point is at the origin and
 The viewer is looking down an axis.
 project the universe on the view port.
 clip the projection to the view port.
 remove hidden surfaces.
 calculate the colour and intensity based on:
 The relative positions of the lighting sources and viewer
 The types of lighting sources and
 The properties of the surfaces of the objects in our universe.
The graphics pipeline is described as follows,
 A scene consists of a set of models placed in the scene.
 trees in a forest.
 models are specified in a model coordinate system (mcs).
 Models are built of primitive shapes (usually triangles).
 Models are transformed to place them in the world/universe.
 The world/universe is specified in the world coordinate system (wcs).


 To view the world we transform the world to place the viewer at the origin
looking down an axis. Now the world is specified in the viewer coordinate
system (vcs).
 This 3D world from the viewer’s point of view is projected on to and clipped to
a viewing window. now we have a 2D device scene in the normalised device
coordinate system.
 Last step is to rasterise the 2D image in the device coordinate system (dcs).

Figure 3.2 The graphics pipeline

Viewing devices:
 Viewing devices fall into 2 categories.
 Calligraphic (vector graphic) devices.
   – draw line segments and polygons directly.
   – examples: plotters, laser light projection systems.
 Raster devices.
   – an image is represented as a regular grid of pixels (picture elements).
   – examples: crt/lcd/plasma monitors.


 Rasterisation is the process of converting a graphical primitive (point, line,
curve, etc.) into a set of pixels.

3.1.1 3D Homogeneous Transformations

3D homogeneous transformation matrices are of size 4x4, where the (4,4) position
holds the homogeneous coordinate and its value is always 1. With homogeneous
transformation matrices all the basic transformations become uniform and can be executed
by matrix multiplication. They are also essential for composing any number of successive
transformations in any sequence.

 Translation

 x1  1 0 0 tx   x
 1 
 y   0 1 0 t y   y 
  z 1   0 0 1 tz   z 
     
 1   0 0 0 1 1

 Scaling

 x1   s x 0 0 0  x 
 1 
 y   0 s y 0 0  y 
  z 1   0 0 sz 0  z 
    
 1   0 0 0 1  1 

 Rotation about z axis

 x1   s x 0 0 0  x 
 1 
 y   0 s y 0 0  y 
 z1  0 0 sz 0  z 
    
 1   0 0 0 1  1 

 Rotation about x axis

 x1  1 0 0 0 x
 y  0 cos   sin  0  y
 1    
 z1  0 sin  cos  0 0
     
 1  0 0 0 1 1 


 Rotation about y axis


[x']   [ cos θ  0  sin θ  0] [x]
[y'] = [ 0      1  0      0] [y]
[z']   [-sin θ  0  cos θ  0] [z]
[1 ]   [ 0      0  0      1] [1]

 General form of a composition of transformations

[r11 r12 r13 tx]
[r21 r22 r23 ty]
[r31 r32 r33 tz]
[0   0   0   1 ]

Composing 3D homogeneous transforms

 same approach as when composing 2D homogeneous transformations.


 rather than composing each rotation around the x, y, and z axis separately we can
sometimes make use of the vector cross product and the following feature of a
composite matrix.
 the composite rotation matrix applied to the unit vectors along the x, y and z axes
gives:

 r11 r12 r13 0 1  r11   r11 r12 r13 0 1  r12 
r r22 r23 0 0 r  r r22 r23 0 0 r 
 21     21 ,  21     22 
 r31 r32 r33 1 0 r31   r31 r32 r33 1 0  r32 
           
0 0 0 1 0  0   0 0 0 1 0  0 
 r11 r12 r13 0 1  r13 
r r22 r23 0 0 r 
 21     23 
 r31 r32 r33 1 0 r33 
     
0 0 0 1 0  0 

3D rotations
 the first, second and third columns of the upper-left 3x3 submatrix are the rotated
x-axis, rotated y-axis and rotated z-axis respectively.
R = [Rx Ry Rz 0]
    [0  0  0  1]


 two non-collinear vectors define a plane.
 the vector cross product v1 x v2 is a vector at right angles to the plane defined
by v1 and v2.
 if either rx, ry or rz is known, together with some relationship between an element of the
rotated object and the x, y or z axis, then the computation of the composed rotation
matrix can be simplified.
Example:
 We want to position the following plane at location p with a specified direction of
flight (dof) and no bank angle (wings parallel to the x-z plane).

Figure 3.3 Plane at location p with specified direction of flight


 zp is transformed to the dof.
 xp = y x zp - no bank angle
 yp = zp x xp = dof x (y x dof)

 0
 
y x DOF DOF x ( y x DOF) DOF 0
R
 0 0 0 0
 
 1

 This approach has limitations: if the dof is collinear with the y axis, then
(y x dof) is zero.
 This general approach can be used when instantiating models (specified in a model
coordinate system) in our world coordinate system.
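
The cross products used in this example are easy to code; the following C sketch (the helper names are our own) builds the rotated axes for the plane, assuming the dof is not collinear with the y axis:

#include <math.h>

typedef struct { double x, y, z; } Vec3;

Vec3 cross(Vec3 a, Vec3 b)           /* a x b, perpendicular to both */
{
    Vec3 c = { a.y*b.z - a.z*b.y,
               a.z*b.x - a.x*b.z,
               a.x*b.y - a.y*b.x };
    return c;
}

Vec3 normalize(Vec3 v)               /* unit-length copy of v */
{
    double len = sqrt(v.x*v.x + v.y*v.y + v.z*v.z);
    Vec3 n = { v.x/len, v.y/len, v.z/len };
    return n;
}

/* Rotated axes for the plane: zp = DOF, xp = y x DOF, yp = DOF x (y x DOF).
   The caller must ensure DOF is not collinear with the world y axis. */
void plane_axes(Vec3 dof, Vec3 *xp, Vec3 *yp, Vec3 *zp)
{
    Vec3 y = { 0.0, 1.0, 0.0 };
    *zp = normalize(dof);
    *xp = normalize(cross(y, *zp));
    *yp = cross(*zp, *xp);           /* already unit length */
}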
3.2 PROJECTIONS
 Conceptually the 3D viewing process is:

Figure 3.4 3D viewing process


 The projection is defined by projection rays (projectors) that come from a centre
of projection and pass through each point of the object, and intersect with a
projection plane.
– if the centre of projection is at infinity we have parallel projection.
– if the centre of projection is at a point we have perspective projection.

Figure 3.5 Perspective and parallel projection scenario with projection lines
 Since the direction of projection is a vector, and a vector is the difference between
two points, it can be calculated as (x,y,z,1)^T - (x',y',z',1)^T = (a,b,c,0)^T.
 The above diagram shows an effect of perspective projection, namely perspective
foreshortening; the size of an object is inversely proportional to the distance to the
centre of projection.

3.3 PERSPECTIVE PROJECTION


 Perspective projection does not preserve parallel lines. Parallel lines in a scene
converge on a vanishing point. If the parallel lines are parallel to one of the principal
axes, the vanishing point is called an axis vanishing point.

Figure 3.6 3 Dimensional view of perspective projection


3.4 PARALLEL PROJECTION


NOTES  There are different types of parallel projection.
 orthographic
 Direction of projection and projection plane normal are the same.
 projection plane normal coincident with a principal axis.
 top (plan)
 front elevation
 Side elevation
 projection plane normal not coincident with a principal axis. (axonometric)
  Isometric
 all principal axes equally foreshortened.
 oblique
 Cabinet
 angle between direction of projection and projection plane normal is arctan(2) =
63.4°. this make lines perpendicular to the projection plane project to half their
length.more natural look.
 Cavalier
 angle between direction of projection and projection plane normal is 45°.

Specifying a 3D view
 The projection plane, also called the view plane, is defined by a view reference
point (vrp) and the view plane normal (vpn).
 The 3D viewing reference coordinate (vrc) system is formed from the vpn and the
view up vector (vup). The vpn defines the n-axis, the projection of the vup on the
view plane defines the v-axis, and the u-axis completes the right-handed coordinate
system.
 The viewing window in the view plane is defined by a centre of window (cw)
and minimum and maximum u and v values.

Figure 3.7 3D viewing reference coordinate system


 For a parallel projection the prp and the direction of projection (dop) define the
view. The dop is a vector from the prp to the cw.
 The projectors from the prp through the minimum and maximum window coordinate
define a semi-infinite pyramid view volume (perspective projection) or semi-infinite
parallelepiped view volume (parallel projection).
 We usually define a front and back clipping plane.
 This is then mapped to normalised projection coordinates (npc) with the prp at
the origin and the back clipping plane mapped to 1.

Figure 3.8 Viewing plane and Normalized projection coordinate system

Implementing parallel projection

 We can create a transformation matrix that maps a scene into npc.


1. Translate vrp to the origin.
2. Rotate vrc so that the n-axis (vpn) becomes the z-axis, the u-axis becomes
   the x-axis and the v-axis becomes the y-axis.
3. Shear such that the dop is parallel to the z-axis.
4. Translate and scale to npc.
 step 1 is a simple translation t(- vrp).

            [1 0 0 -vrpx]
T(-VRP)  =  [0 1 0 -vrpy]
            [0 0 1 -vrpz]
            [0 0 0   1  ]

 step 2 is the rotation to align axes.


– remember that we can think of a rotation matrix as:

R = [Rx Ry Rz 0]
    [0  0  0  1]

– the vpn is rotated onto the z-axis:

Rz = VPN / |VPN|

– the u-axis, which is perpendicular to VUP and Rz, is rotated onto the x-axis:

Rx = (VUP x Rz) / |VUP x Rz|

– the v-axis, which is perpendicular to Rz and Rx, is rotated onto the y-axis:

Ry = Rz x Rx

 step 3 is to shear along the z-axis to align the dop with z-axis while maintaining the
vpn.

Figure 3.9 Direction Of Projection alignment

 shear matrix is defined as:

                               [1  0  shxpar  0]
SHpar = SH(shxpar, shypar)  =  [0  1  shypar  0]
                               [0  0    1     0]
                               [0  0    0     1]

 dop is defined as:

DOP = CW - PRP
    = [(umax+umin)/2  (vmax+vmin)/2  0  1]^T - [prpx  prpy  prpz  1]^T


 After the shearing transform,

DOP' = SHpar . DOP = [0  0  dopz  0]^T

with shxpar = -dopx / dopz  and  shypar = -dopy / dopz

 Now the bounds of the view volume are:

umin <= x <= umax,   vmin <= y <= vmax,   b <= z <= f


 We want translate and scale this volume to npc so that the centre of the front
clipping plane is at the origin, the npc x and y values range over [-1,1] and the z
value ranges over [0,1].
tpar = t(-(umax+umin)/2, -(vmax+vmin)/2, -f)
spar = s(2/(umax-umin), 2/(vmax-vmin), 1/(f-b))
 The resulting transform when all the steps are composed will take an arbitrary
view volume and transform it into a canonical parallel perspective view volume.
– this canonical view is easier to clip against and map to 2D device coordinates.
npar = spar.tpar.shpar.r.t(-vrp)

Implementing perspective projection


 We can create a transformation matrix that maps a scene into npc using a perspective
projection.
1. Translate vrp to the origin.
2. Rotate vrc so that the n-axis (vpn) becomes the z-axis, the u-axis becomes the x-axis
   and the v-axis becomes the y-axis.
3. Translate the centre of projection, given by the prp, to the origin.
4. Shear such that the centre line of the view volume becomes the z-axis.
5. Scale to npc so that the back clipping plane is at z = -1 and the x and y values
   range over [-1,1].
 This is similar to the parallel projection implementation.
– steps 1 and 2 are identical.
 Step 3 translates the centre of projection (cop) to the origin. The cop is defined
with respect to the vrc by the prp. since steps 1 and 2 effectively converted the
viewing-coordinates into world coordinates, the required translation is simply t(-
prp).



Figure 3.10 View volume centre line alignment with the z-axis (step 4)
 Step 4 shears the view volume centre line onto the z-axis. The direction of the
shear is the origin to the centre of the window. since prp is mapped to the origin,
the direction of the shear is cw – prp, the exact same as for parallel projection.

Figure 3.11 Scaling operation to preserve slope


 Since the shear operation did not affect the z-axis, vrp'z, the z-value of the vrp after
applying steps 1 to 4, must be equal to -prpn.
 Scaling the x and y axes so that the slope of the view volume is unity, and then
scaling all axes equally (to preserve that slope) so that the back clipping plane is at
z = -1, yields:

Sx1 = 2 vrp'z / (umax - umin),   Sy1 = 2 vrp'z / (vmax - vmin)

Sper = S( 2 vrp'z / ((umax - umin)(vrp'z + B)),
          2 vrp'z / ((vmax - vmin)(vrp'z + B)),
          -1 / (vrp'z + B) )

 Hence the normalising perspective projection transform is:

nper = sper . shpar . t(-prp) . r . t(-vrp)

 Now that the view volume has been transformed to a canonical space we can clip
to the boundaries of the canonical space.


3.5 VISIBLE SURFACE DETECTION METHODS


3.5.1 Ray tracing
 Ray tracing, or ray casting, can be used:
 to determine visible surfaces; and
 to determine the colour and shading of these visible surfaces.
 In the real world, we can see an object because a light ray, from a source, has
been reflected/emitted from an object and entered our eye.
 Ray tracing just reverses this.
 an eye ray is projected from the centre of projection, through the view plane and
into the scene to determine the closest object in the scene that the eye ray intersects.

Figure 3.12 Simple ray tracer

select centre of projection (cop) and window on viewplane
for (each scan line in image)
{
    for (each pixel in scan line)
    {
        determine ray from cop through pixel
        for (each object in the scene)
        {
            if (object is intersected and is closest so far)
            {
                record intersection and object
            }
        }
        set pixel's colour to that of closest object
    }
}

Computing intersections
 This task is at the heart of any ray tracing algorithm.
 where does the eye ray intersect an object?


 The eye ray is defined by the cop (x0,y0,z0) and the centre of the pixel on the
window in the view plane (x1,y1,z1).
 Using a parametric representation we have:
x = x0 + t(x1-x0),   y = y0 + t(y1-y0),   z = z0 + t(z1-z0)
 Let us define Δx = x1-x0, and Δy and Δz similarly. Then,
x = x0 + tΔx,   y = y0 + tΔy,   z = z0 + tΔz
 t ranges from 0 to 1 between the cop and the viewplane. Values of t greater than 1
correspond to points on the other side of the viewplane.
 Calculating the intersection of an eye ray with a sphere is relatively easy.
 explains why most ray casting examples are full of spheres!
 The equation of a sphere centred on (a,b,c) with radius r is:
(x-a)^2 + (y-b)^2 + (z-c)^2 = r^2
 Expanding gives:
x^2 - 2ax + a^2 + y^2 - 2by + b^2 + z^2 - 2cz + c^2 = r^2
 Substituting x = x0 + tΔx etc. yields:
(x0+tΔx)^2 - 2a(x0+tΔx) + a^2 + (y0+tΔy)^2 - 2b(y0+tΔy) + b^2 + (z0+tΔz)^2 - 2c(z0+tΔz) + c^2 = r^2
 Multiplying out and collecting terms in t:
(Δx^2+Δy^2+Δz^2)t^2 + 2t(Δx(x0-a) + Δy(y0-b) + Δz(z0-c)) + (x0-a)^2 + (y0-b)^2 + (z0-c)^2 - r^2 = 0
 This gives a quadratic in t that can be solved using the quadratic formula.
 if there are no real roots the eye ray does not intersect the sphere.
 if there is one real root, the eye ray grazes the sphere at a point.
 if there are two real roots the eye ray intersects the sphere.
The smaller t value corresponds to the closest intersection point.
 The surface normal of a sphere centred on (a,b,c) with radius r at the point of
intersection (x,y,z) is:

( (x-a)/r,  (y-b)/r,  (z-c)/r )
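
A C sketch of the ray-sphere test just derived (the function name and the choice to return only the nearest positive t are our own; a full ray tracer would also return the hit point and normal):

#include <math.h>

/* Intersect the ray P(t) = (x0,y0,z0) + t*(dx,dy,dz) with the sphere of
   centre (a,b,c) and radius r, using the quadratic derived above.
   Returns 1 and the smallest positive t on a hit, 0 otherwise. */
int ray_sphere(double x0, double y0, double z0,
               double dx, double dy, double dz,
               double a, double b, double c, double r, double *t_hit)
{
    double A = dx*dx + dy*dy + dz*dz;
    double B = 2.0 * (dx*(x0 - a) + dy*(y0 - b) + dz*(z0 - c));
    double C = (x0-a)*(x0-a) + (y0-b)*(y0-b) + (z0-c)*(z0-c) - r*r;
    double disc = B*B - 4.0*A*C;

    if (disc < 0.0)                     /* no real roots: ray misses the sphere */
        return 0;

    double sq = sqrt(disc);
    double t0 = (-B - sq) / (2.0*A);    /* nearer root   */
    double t1 = (-B + sq) / (2.0*A);    /* farther root  */

    if (t0 > 0.0)      *t_hit = t0;
    else if (t1 > 0.0) *t_hit = t1;     /* ray origin is inside the sphere      */
    else               return 0;        /* sphere is behind the ray origin      */
    return 1;
}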
 Finding the intersection of an eye ray and a polygon is more complicated.
 first we must calculate the point of intersection with the polygon's plane.
 then we must determine if that point is within the polygon.
 To determine the point of intersection we use the equation of the plane:
ax + by + cz + d = 0
a(x0+tΔx) + b(y0+tΔy) + c(z0+tΔz) + d = 0
t = -(ax0 + by0 + cz0 + d) / (aΔx + bΔy + cΔz)
 If the denominator is zero then the eye ray and plane do not intersect.
 Using t we can calculate the point of intersection (x,y,z).


 To determine if this point is inside the polygon, we orthographically project the
point and polygon onto one of the 3 planes defined by the coordinate system.
 we pick the axis that yields the largest projection. This corresponds to the coefficient
of the plane equation with the largest absolute value.
 Projection simply drops this coordinate from the point and the polygon's vertices.
– the inclusion test can then be done in 2D!
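
A corresponding C sketch for the ray-plane step and the choice of coordinate to drop (helper names are illustrative assumptions):

#include <math.h>

/* Intersect the ray P(t) = (x0,y0,z0) + t*(dx,dy,dz) with the plane
   ax + by + cz + d = 0.  Returns 1 and t on success, 0 when the ray
   is parallel to the plane (zero denominator). */
int ray_plane(double x0, double y0, double z0,
              double dx, double dy, double dz,
              double a, double b, double c, double d, double *t)
{
    double denom = a*dx + b*dy + c*dz;
    if (fabs(denom) < 1e-12)
        return 0;
    *t = -(a*x0 + b*y0 + c*z0 + d) / denom;
    return 1;
}

/* Pick the coordinate to drop before the 2D point-in-polygon test:
   the one whose plane coefficient has the largest absolute value. */
int axis_to_drop(double a, double b, double c)   /* 0 = x, 1 = y, 2 = z */
{
    double fa = fabs(a), fb = fabs(b), fc = fabs(c);
    if (fa >= fb && fa >= fc) return 0;
    if (fb >= fc)             return 1;
    return 2;
}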
Efficiency

 The key to efficiency is to avoid unnecessary intersection calculations.


 One approach is to use bounding volumes.
 a set of objects is enclosed in a bounding volume. If the ray does not intersect
the bounding volume, it doesn't intersect any object in the bounding volume.
 the enclosed objects can also be enclosed in bounding volumes, creating a
hierarchy of bounding volumes.
 bounding volumes are defined by objects whose intersections are relatively
easy to calculate, for example bounding slabs.

Figure 3.13 Bounding slabs


 each pair of slabs is defined by their plane equations in which a, b and c are
constant and d has the values dmin and dmax.
 calculating the intersection with these planes will result in a near t value and a far t
value for each pair of slabs.
 null intersections will occur when the maximum near t value exceeds the minimum
far t value. test group first, and then subgroups if necessary.

Recursive ray tracing


 simple ray tracing can perform visible surface determination.
 recursive ray tracing can add realism into the image by considering the following.
 the reflected ray
 when a ray is incident on a specularly reflective surface the light ray is reflected.


 the transmitted ray
 when a ray is incident on a transmissive surface, and total internal reflection does
not occur, then the light ray is refracted according to Snell's law:
sin θ1 / sin θ2 = η2 / η1
 the shadow ray
 when a ray is incident on a surface then a ray from that point to the light sources
can determine possible shadows.
 the reflected, transmitted and shadow rays are called secondary rays.

Figure 3.14 Recursive ray tracing

Reflection and refraction

 reflection: θ1 = θ2

Figure 3.15 Reflected and incident ray


 refraction
– Snell's law uses indices of refraction η.

Figure 3.16 Refraction scenario


Recursive ray tracing algorithm

select cop and window on view plane
for (each scan line in image)
{
    for (each pixel in scan line)
    {
        determine ray from cop through pixel;
        pixel = rt_trace (ray, 1);
    }
}

rt_colour rt_trace (rt_ray ray, int depth)
{
    determine closest intersection of ray with an object;
    if (object hit)
    {
        compute normal at intersection;
        return rt_shade (object hit, ray, intersection, normal, depth);
    }
    else
    {
        return background_value;
    }
}

rt_colour rt_shade (rt_object object, rt_ray ray,
                    rt_point point, rt_normal normal, int depth)
{
    rt_colour colour;
    rt_ray rray, tray;      /* reflected & transmitted ray */
    rt_ray sray;            /* shadow ray */
    rt_colour rcolour, tcolour;

    colour = ambient colour;
    for (each light)
    {
        sray = ray to light from point;
        if (dot product of normal and direction to light is positive)
        {
            compute how much light is blocked by opaque and
            transparent surfaces, and use it to scale the diffuse and
            specular terms before adding them to colour;
        }
    }
    if (depth < maxdepth)
    {
        if (object is reflective)
        {
            rray = ray in reflection direction from point;
            rcolour = rt_trace (rray, depth+1);
            scale rcolour by specular coefficient and add to colour;
        }
        if (object is transparent)
        {
            tray = ray in refraction direction from point;
            if (total internal reflection does not occur)
            {
                tcolour = rt_trace (tray, depth+1);
                scale tcolour by transmission coefficient and add to colour;
            }
        }
    }
    return colour;
}

Comments on algorithm
 Opaque objects totally block light whereas transparent objects scale the light’s
contribution.
 Ray tracing suffers from problems caused by limited numerical precision.

When calculating secondary rays from an intersection point, t is reset to zero. If the
secondary ray then re-intersects the same object, it will do so with a small non-zero t value.
This typically happens with shadow rays, resulting in incorrect "self shadows".
 Dealt with by treating abs(t) < ε as equivalent to zero.
 Efficiency

Efficiency is even more important than before. With m light sources and a tree depth of n there
are 2^n - 1 reflection and refraction rays in the ray tree, and m(2^n - 1) shadow rays!

Since rays can come from any direction, view volume clipping and back-surface culling
cannot be used.

 Adaptive tree-depth control

– pruning the size of the ray tree can have significant results.

– If the maximum contribution from a refracted or reflected ray is below a

Minimum threshold we can ignore it.


 suppose r is reflected to produce r1 and r1 is reflected to produce r2. If the
reflection coefficients for r1 and r2 are 0.1 and 0.05, then the maximum contribution
from r2 is 0.005. If this is below the threshold we can decide not to cast r2.
 this does not always work well: the light at an individual pixel may be made from
many "insignificant" contributions.
 should shadow rays be refracted?
 shadow rays can be handled using special techniques.
– light buffers
 by a bit of preprocessing we can reduce the number of objects we need to test
for

intersection with a shadow ray.
 a light buffer is centred on a light source and aligned with the coordinate system. It is tiled
with a regular grid of squares and each square contains a depth-sorted list of
surfaces that can be seen through that square.
 if we know which square of the light buffer a shadow ray passes through, we can
minimise the number of surfaces that could contribute to the shading at the shadow
ray's origin.

For the square depicted, the depth-sorted list for that square would contain A, D, E and H.

Figure 3.17 Light buffer


3.5.2 Z-buffer algorithm
 This algorithm (also known as the depth buffer algorithm) is an image precision
algorithm.
 In addition to the frame buffer that stores the color values for each pixel, a z-buffer
is also maintained.
this buffer stores the z value for each displayed pixel.
– it is initialized to the back-clipping plane.
 Polygons are scan converted into the frame buffer in any order.
 If a point on the polygon being scan converted is no further from the viewer than
the point currently in the frame buffer, then replace the old values with the values
for the point being scan converted.
 Pseudo code
void zbuffer (void)
{
    int x, y;
    for (y = 0; y < ymax; y++)
        for (x = 0; x < xmax; x++)
        {
            writepixel (x, y, background_value);
            writez (x, y, back_clip_depth);
        }
    for (each polygon)
        for (each pixel in polygon's projection)
        {
            double pz = polygon's z value at pixel (x,y);
            if (pz >= readz (x, y))
            {
                writez (x, y, pz);
                writepixel (x, y, polygon's colour at (x,y));
            }
        }
    return;
}
 Since a polygon is planar, the value of z at (x,y) can be obtained by solving
ax + by + cz + d = 0.
 Depth coherence implies that the depth at (x,y) is related to the depth at (x+Δx, y).
If the depth at (x,y) is z1, then the depth across the scan line at (x+Δx, y) is
z1 - (a/c)Δx.
 Similarly, from scan line y to scan line y+Δy the depth is decremented by (b/c)Δy.
 The depth value for any point in the polygon can be determined from its vertices:

za = z1 - (z1-z2)(y1-ys)/(y1-y2)
zb = z1 - (z1-z3)(y1-ys)/(y1-y3)
zp = zb - (zb-za)(xb-xp)/(xb-xa)
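
A small C sketch (with hypothetical helper names) of how this depth coherence is exploited while scan converting one span:

/* Depth of the polygon ax + by + cz + d = 0 at pixel (x, y); assumes c != 0. */
double depth_at(double a, double b, double c, double d, double x, double y)
{
    return -(a*x + b*y + d) / c;
}

/* Walking one pixel to the right along a scan line only needs one addition:
   z(x+1, y) = z(x, y) - a/c.  Stepping to the next scan line similarly uses
   z(x, y+1) = z(x, y) - b/c. */
void scan_span(int xleft, int xright, int y,
               double a, double b, double c, double d)
{
    double z = depth_at(a, b, c, d, xleft, y);
    double dzdx = -a / c;
    for (int x = xleft; x <= xright; x++) {
        /* compare z with the z-buffer entry and write the pixel here */
        z += dzdx;
    }
}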


Figure 3.19 Image merging by overlaying


 disadvantages
– the z-buffer requires a large amount of memory.
 the image can be scan converted in strips, at the cost of multiple passes through the
objects.
– it can suffer from aliasing effects, particularly at a shared edge between polygon
faces.
 advantages
– simple and general.
– the z-buffer can be stored with the image.
 multiple images can be merged.
– by combining multiple overlays and appropriate masking to avoid updating the
z-buffer, a 3D cursor object can move around the scene.
 supports a picking operation.

3.5.3 Warnock’s algorithm (Hidden Surface Removal)


 this is an area subdivision algorithm that applies a divide and conquer strategy by
partitioning the
projection plane.
– it exploits area coherence.
– it is a hybrid; it uses both object and image precision.
 the algorithm divides the image area into four equal regions.
 each polygon has one of the following relations to an area of interest (aoi).
– surrounding polygons complete contain the aoi.
– intersecting polygons intersect the aoi.


– contained polygons are completely inside the aoi.
– disjoint polygons are completely outside the aoi.
 the following 4 cases are easy to deal with:

Figure 3.20 Area of interest(Squared area)


and 4 cases of surrounding polygons:
1. All polygons are disjoint from the aoi. Display the aoi in the background colour.
2. There is only 1 intersecting or contained polygon. Fill the aoi with the background colour
   and then scan-convert the part of the polygon inside the aoi.
3. There is a single surrounding polygon. The aoi is filled with the colour of the
   surrounding polygon.
4. More than 1 polygon intersects, is contained in or surrounds the aoi, but one
   surrounding polygon is in front of all the other polygons: its z values at the projected
   aoi vertices are greater than those of any other object. Fill with the surrounding
   polygon's colour.

Figure 3.21 More than one polygon intersecting: covered by case 4 (left) and not covered by case 4 (right)

 If the aoi is one of the previous 4 cases apply the rule, otherwise sub-divide the aoi.
– can stop when aoi is equivalent to a pixel.
– can go to sub-pixel resolution and average sub-pixels to provide antialiasing.


Figure 3.22 Sub-dividing the AOI for visible surface detection


3.6 VISUALIZATION AND RENDERING
Visualization is concerned with analysing and viewing data sets in science and engineering
applications through graphical methods and techniques. Examples include mission-critical
applications, weather forecasting, simulation effects, medical applications, bio-informatics
and space applications. It requires not only graphics but also other techniques such as digital
image processing and statistical methods. It has to process a wide range of data sets,
namely scalar, vector and multivariate data. Various graph plotting techniques for
2D (surface analysis) as well as 3D (spherical, cubical and other polygonal) analysis are
used.

3.6.1 Volume rendering


 Used for visualizing 3D data sets.
 Interior data is plotted using the ray-casting technique.
 Each voxel is displayed as a dot in the pixel plane by averaging over the
voxel area; however, the exact method may differ based on the application.

Natural objects, such as trees and clouds, can be modeled using fractals, shape
descriptors and particle systems. Special data structures may be required for data
representation in the case of multivariate data.


3.7 COLOUR MODELS


 The colour of an object depends on:
– the object itself.
– the colour of the light illuminating the object.
– the colour of the surrounding area.
– the human visual system.
 Light comes in 2 forms.
– achromatic light
– chromatic light
Achromatic light

 Achromatic light is characterized only by the quantity of the light.


– intensity or luminance refers to the energy of the light.
– brightness refers to the perceived intensity.
– these 2 concepts are not the same.
 The human eye is sensitive to ratios of intensity levels rather than absolute intensity
levels.
– if intensity is measured on a scale of 0 to 1, then the perceived difference between
0.1 and 0.11 is the same as between 0.5 and 0.55

3.7.1 Gamma correction


 If we have 256 different intensity levels and we want equal perceived steps in
intensity between them, the levels should be spread logarithmically and not linearly.
 With each level r times higher than the previous level we want:
i0 = i0,  i1 = r i0,  i2 = r i1 = r^2 i0,  i3 = r i2 = r^3 i0, ...
i255 = r^255 i0 = 1
hence r = (1/i0)^(1/255)  and  ij = r^j i0 = i0^((255-j)/255)
for n+1 intensity levels:
r = (1/i0)^(1/n),   ij = i0^((n-j)/n)
 The ratio between the maximum and minimum intensities is called the dynamic
range.
 Displaying intensities on a display medium, such as a crt or film, is complicated by
the inherent nonlinearities of the display medium.
 For example, for a crt the intensity of the light output from a phosphor is related to
the number of electrons in the beam hitting the phosphor:
I = k N^γ


where k and γ are constants.
 The number of electrons N is proportional to the control-grid voltage, which is
proportional to the pixel value V:
I = k V^γ,   so   V = (I/k)^(1/γ)
 The nearest displayable intensity to a desired intensity I is ij, where:
j = round( log_r (I/i0) )
ij = r^j i0
Vj = round( (ij/k)^(1/γ) )
 k, γ and i0 are display specific. A look-up table can be built using these values,
and this is called gamma correction.
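
A C sketch of building such a gamma-correction look-up table for 256 levels, following the formulas above (k, gamma and i0 are display-specific constants that must be supplied; the function name is our own):

#include <math.h>

#define LEVELS 256

/* table[j] becomes the pixel value to write so that intensity level j
   is actually produced on the display. */
void build_gamma_table(double k, double gamma, double i0, int table[LEVELS])
{
    double r = pow(1.0 / i0, 1.0 / (LEVELS - 1));    /* ratio between levels */
    for (int j = 0; j < LEVELS; j++) {
        double ij = pow(r, j) * i0;                  /* desired intensity ij      */
        table[j] = (int)(pow(ij / k, 1.0 / gamma) + 0.5);  /* Vj = (ij/k)^(1/gamma) */
    }
}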
3.7.2 Halftone approximation

 Some display devices are bilevel, that is, they produce just two intensity levels.
 To increase the range of perceived intensity levels we can make use of the spatial
integration that our visual system performs.
– when viewing a small area from a distance we perceive the overall intensity of
the area.
 Halftoning (also known as cluster-dot ordered dither) exploits this phenomenon.
 The following 2x2 pixel area can produce 5 different intensity levels.

Figure 3.23 2x2 Square for displaying 5 intensity levels


 In general an nxn pixel area can generate n^2 + 1 intensities, at the cost of reducing
the spatial resolution by a factor of n.
 Dither patterns should exhibit a growth sequence, such that a pixel used for intensity
level j should be used for all levels > j. This minimises contouring effects.
 Dither patterns should not introduce visual artifacts, such as lines.
Why is the following a bad dither pattern?

Figure 3.24 A line like dither pattern


 a dither matrix can express a set of dither patterns with a growth sequence. To
display an intensity i we turn on all pixels whose value is less than i. For the previous
2x2 patterns the dither matrix is:

 Dither patterns should avoid isolated pixels, as some devices (laser printers and
printing presses) have difficulty reproducing isolated pixels consistently.
 Error diffusion can be used to improve the image quality. In the Floyd-Steinberg
algorithm the difference between the exact intensity and the displayed value is
spread across the pixels to the right of and below the current pixel: 7/16 of the error
to the east pixel, 3/16 to the south-west pixel, 5/16 to the south pixel and 1/16 to the
south-east pixel.
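
A C sketch of Floyd-Steinberg error diffusion for a bilevel output, using the error weights just listed (the in-place greyscale-image layout is an assumption for illustration):

/* img holds greyscale values in [0,1], width w, height h, row-major.
   Each pixel is quantised to 0 or 1 and its error is diffused to the
   east (7/16), south-west (3/16), south (5/16) and south-east (1/16)
   neighbours. */
void floyd_steinberg(double *img, int w, int h)
{
    for (int y = 0; y < h; y++)
        for (int x = 0; x < w; x++) {
            double old = img[y*w + x];
            double quant = (old >= 0.5) ? 1.0 : 0.0;   /* bilevel output */
            double err = old - quant;
            img[y*w + x] = quant;
            if (x + 1 < w)
                img[y*w + (x+1)] += err * 7.0/16.0;     /* east        */
            if (y + 1 < h) {
                if (x > 0)
                    img[(y+1)*w + (x-1)] += err * 3.0/16.0;  /* south-west */
                img[(y+1)*w + x] += err * 5.0/16.0;          /* south      */
                if (x + 1 < w)
                    img[(y+1)*w + (x+1)] += err * 1.0/16.0;  /* south-east */
            }
        }
}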

Chromatic light
 Colour perception involves:
– hue: the colour we see.
– saturation: how far the colour is from a grey of equal intensity.
 Red is highly saturated whereas pink is relatively unsaturated. unsaturated colours
include more white light than saturated colours.
– lightness: perceived intensity of a reflecting object.
– brightness: perceived intensity of a self-luminous object.
Colourimetry
 Colourimetry is the branch of physics that quantifies colours.
- the dominant wavelength is the wavelength of the light from
an object. it corresponds to hue.
- the excitation purity measures the proportion of pure light of the
dominant wavelength and of white light needed to define the colour. it corresponds
to saturation.
- Luminance is the amount of light intensity. it corresponds to lightness for reflecting
objects and brightness for self-luminous objects.
Tristimulus theory
 The tristimulus theory is based on the hypothesis that the eye has 3 types of cones
(colour sensors). these cones’ peak sensitivity corresponds to red, green and
blue light.
CIE chromaticity diagram
 The cie (commision internationale del’Eclairage) defined colours using 3 primaries
x, y and z to replace red, green and blue. the primaries require no negative values
in the matching process.


 The y primary was chosen to exactly match the luminous-efficiency function.
 xλ, yλ and zλ are the amounts of each of the x, y and z primaries required to
match the colour of each wavelength λ.
 to match a colour c:
c = xX + yY + zZ
 if x + y + z = 1 then we have the cie chromaticity diagram.

Figure 3.25 CIE chromaticity diagram

The point c defines standard white light (approximating sunlight). It is very near
x = y = z = 1/3.
 100% spectrally pure colours are on the curved part of the diagram.
 the chromaticity diagram factors out luminance, so not all colours are represented.
– brown (orange-red at low luminance) is not in the diagram.
– there are an infinite number of planes in the (x,y,z) space.
 complementary colours are those that mix to produce white light.

Colour gamuts
 a typical use of the cie chromaticity diagram is to define colour gamuts (or colour
ranges).


Figure 3.26 Color gamut

This is a typical crt gamut. By mixing r, g and b we can match any colour in the
triangle.

RGB colour model


 The rgb model is used in displays.
 It uses a cartesian coordinate system.
 The rgb primaries are additive primaries. Their individual contributions are
added together to produce a result.
– Models luminous display systems.

Figure 3.27 RGB cube

CMY colour model

 Used in reflective displays such as hardcopy devices that deposit
pigment on to paper.
 The CMY primaries are subtractive primaries.
– the cyan primary subtracts red from the reflected white light (cyan = minus red,
magenta = minus green, yellow = minus blue).
 Another variant is the CMYK model, which adds black (K) as a fourth colour.


Figure 3.28 CMY color model

k’ = min (c,m,y)

c’ = c – k

m’ = m – k

y’ = y – k

HSV colour model


 Both the rgb and cmy models are hardware-oriented.
 The hsv model is user-oriented, being based on the artist's concepts of tint,
shade and tone.
 The hsv model specifies a colour by:
– hue
 H is measured by an angle around a vertical axis (0 degree =red, 120 degree =
green, 240 degree=blue)
– saturation
 s is a ratio ranging from 0 on the vertical axis to 1 at the side of the hsv cone.
– value (brightness)
 Brightness is measured along the vertical axis.
 However colours with v=1 are not all of the same perceived brightness.

Figure 3.29 HSV hex color cone


Figure 3.30 HSV Hex color cone with angles

3.8 ANIMATION

Animation is the rapid display of a sequence of images of 2-d artwork or model


positions in order to create an illusion of movement. It is an optical illusion of motion due
to the phenomenon of persistence of vision. This could be anything from a flip book to a
motion picture film.

ANIMATION TECHNIQUES

Animation techniques are incredibly varied and difficult to categorize. Techniques are
often related or combined. The following is a brief on common types of animation. Again,
this list is by no means comprehensive.

 TRADITIONAL ANIMATION

Also called cel animation, the frames of a traditionally animated movie are hand-
drawn. The drawings are traced or copied onto transparent acetate sheets called cels,
which are then placed over a painted background and photographed one by one on a
rostrum camera. Nowadays, the use of cels (and cameras) is mostly obsolete, since the
drawings are scanned into computers, and digitally transferred directly to 35 mm film. The
“look” of traditional cel animation is still preserved, and the character animator’s work has
remained essentially the same over the past 70 years. Because of the digital influence over
modern cel animation, it is also known as tradigital animation. Examples: the lion king,
spirited away.

 FULL ANIMATION

The most common style in animation, known for its realistic and often very detailed
art.

Examples: all disney feature length animated films, the secret of nimh, the iron giant


 LIMITED ANIMATION
A cheaper process of making animated cartoons that does not follow a “realistic”
approach.

Examples: the flintstones, yellow submarine


 RUBBER HOSE

The characters are usually cartoon, and the animators have a lot of artistic freedom as
rubber hose animations don’t have to follow the laws of physics and anatomy in the same
degree as the other main styles in animation.

Examples: early mickey mouse cartoons, ren and stimpy, popeye


 ROTOSCOPING

A technique where animators trace live action movement, frame by frame, for use in
animated films.

Examples: gulliver’s travels, american pop


 STOP MOTION

Stop-motion animation is any type of animation which requires the animator to physically
alter the scene, shoot a frame, again alter the scene and shoot a frame and so on, to create
the animation. They are many different types of stop-motion animation some notable
examples are listed below.

 CLAY ANIMATION

Figure 3.31 A clay animation scene from a TV commercial.

Often abbreviated to claymation, this is a type of stop-motion animation using figures


made of clay or a similar malleable material. The figures may have an armature or wire
frame inside of them, similar to puppet animation (below). Alternatively, the figures may be
made entirely of clay, such as in the films of bruce bickford where clay creatures morph
into a variety of different shapes.


Examples: wallace and gromit


Cutout animation

In this type of stop-motion animation, the animation is formed by moving 2-dimensional


pieces of material such as paper or cloth.

Examples: the animated sequences of monty python's flying circus (often referred to
as dada animation, named after the dada art movement); tale of tales;
early episodes of south park

 Silhouette animation

A type of cutout animation where the viewer only sees black silhouettes.

Example: the adventures of prince achmed, the world’s oldest surviving animated
feature film, from 1926.

 Graphic animation

 Model animation

In this form of animation, model animated characters interact with, and are a part of,
the live-action world.

Examples: the films of ray harryhausen (jason and the argonauts) and willis o’brien
(king kong)
 Go motion
 Object animation
 Pixilation

examples: neighbors

 PUPPET ANIMATION

Puppet animation typically involves puppet figures interacting with each other in a
constructed environment, in contrast to the real-world interaction in model animation (above).
The puppets generally have an armature inside of them to keep them still and steady as
well as constraining them to move at particular joints.

Examples: the nightmare before christmas, robot chicken, the tale of the fox

 Computer animation:

Like stop motion, computer animation encompasses a variety of techniques, the unifying
idea being that the animation is created digitally on a computer.


 2D animation
Figures are created and/or edited on the computer using 2D bitmap graphics or created
and edited using 2D vector graphics. This includes automated computerized versions of
traditional animation techniques such as tweening, morphing, onion skinning and
interpolated rotoscoping.

Examples: a scanner darkly, jib jab, analog computer animation, flash animation,
powerpoint animation

Figure 3.32 A completely synthetic, computer-generated scene.

 3D ANIMATION

Figures are created in the computer using polygons. To allow these meshes to move
they are given a digital armature (sculpture). This process is called rigging. Various other
techniques can be applied, such as mathematical functions (gravity), simulated fur or hair,
effects such as fire and water and the use of motion capture to name but a few.

Examples: the incredibles, shrek.


TERMINOLOGIES USED
 RESOLUTION

Resolution in computer graphics either refers to the number of pixels per inch or other
unit of measure (centimeter for example) on a monitor or printer. It is also sometimes used
to describe the total number of pixels on a monitor. Resolution is usually measured in pixels
per inch or dots per inch (dpi).
 KEY FRAMES

When someone creates a 3D animation on a computer, they usually don’t specify the
exact position of any given object on every single frame. They create key frames. Key
frames are important frames during which an object changes its size, direction, shape or
other properties. The computer then figures out all the in between frames and saves an
extreme amount of time for the animator.


Figure 3.33a 2 frames drawn by user

Figure 3.33b In-between frames generated by computer

 Modeling

Modeling is the process of creating 3D objects from simple 2D objects by lofting, or


from other simple 3D objects called primitives. Primitives can be combined using a variety
of boolean operations. They can also be distorted by dragging around their vertices or
distorted in different ways.

 Morphing & warping

Morphing is a very cool looking transition. It is also one of the most complicated
ones. A morph looks as if two images melt into each other with a very fluid motion. In
technical terms what happens is, two images are distorted and a fade occurs between
them. This is pretty complicated to understand, but believe me, it looks very cool. (you
might have seen morphing in the gillette© and shell© gasoline commercials).

Warping is the same as morphing, except only one image is distorted and no fade
occurs.

Figure 3.34a The original graphics.  Figure 3.34b A warped version of this image.

 Onion Skinning
Onion skinning is a term that commonly refers to a graphic process in which an image
or an animation is composed of a couple of different layers. For example, if you have ever
used adobe photoshop 3.0 or higher you are probably familiar with the layers window.
That is exactly what onion skinning is all about. Imagine it as a series of totally transparent
pieces of plastic with different drawings on them. When they are all stacked on top of
another, a composite is formed. This is widely used in traditional animation when the
background is a separate layer and each character is a separate layer. This way, only
layers have to be redrawn or repositioned for a new frame. Onion skinning is also found in
computer software where different effects can be placed on different layers and later
composited into a final image or animation.

 Rendering

Rendering is the process a computer uses to create an image from a data file. Most
3D graphics programs are not capable of drawing the whole scene on the run with all the
colors, textures, lights, and shading. Instead, the user handles a mesh which is a rough
representation of objects. When the user is satisfied with the mesh, he then renders the
image.

Figure 3.35a A mesh Figure 3.35b A rendered image

 Tweening

The process of generation of the intermediate frames given the key frames is called
tweening.

 Scene description

This includes the positioning of objects and light sources, defining the photometric
parameters, and setting the camera parameters (position, orientation, and lens
characteristics).

 Action description

This involves the layout of motion paths for objects and camera.

3.9 KEY FRAME SYSTEMS


We generate each set of in-betweens from the specification of two (or more) key
frames. Motion paths can be given with a kinematical description as a set of spline curves
or the motion can be physically based by specifying the forces acting on the objects to be
animated.

 Morphing

Transformation of object shapes from one form to another form is called as morphing,
which is a shortened form of metamorphosis. Morphing methods can be applied to any
motion or transition involving a change in shape.

Given two key frames for an object transformation, we first adjust the object
specification in one of the frames so that the number of polygon edges (or the number of
vertices) is the same for the two frames. This preprocessing step is illustrated below. A
straight line segment in key frame k is transformed into two line segments in key
frame k+1. Since key frame k+1 has an extra vertex, we add a vertex between vertices 1 and 2
in key frame k to balance the number of vertices in the two key frames.

Using linear interpolation to generate the in-betweens, we transition the added
vertex in key frame k into vertex 3 along a straight line path.

We can state general preprocessing rules for equalizing key frames in terms of either
the number of edges or the number of vertices to be added to a key frame. Suppose we
equalize the edge count, and parameters Lk and Lk+1 denote the number of line segments in
two consecutive frames. We then define
Lmax = max (Lk, Lk+1)
Lmin = min (Lk, Lk+1)
Ne = Lmax mod Lmin
Ns = int (Lmax / Lmin)
Then the preprocessing is accomplished by:
 Dividing Ne edges of keyframe_min into Ns + 1 sections.
 Dividing the remaining lines of keyframe_min into Ns sections.

As an example, if Lk = 15 and Lk+1 = 11, we should divide 4 lines of key frame k+1 into
two sections each. The remaining lines of key frame k+1 are left intact.

If we equalize the vertex count, we can use parameters vk and vk+1 to denote the
number of vertices in two consecutive frames. In this case we define:

vmax = max (vk, vk+1)

vmin = min (vk, vk+1)

nls = (vmax - 1) mod (vmin - 1)

np = int ((vmax - 1) / (vmin - 1))

Thus the preprocessing using the vertex count is performed by:

 Adding np points to nls line sections of keyframe_min.

 Adding np - 1 points to the remaining sections of keyframe_min.

Figure 3.36 Kth frame and (K+1)th frame

Figure 3.37 Kth frame, intermediate frame and (K+1)th frame

For the triangle to quadrilateral example, vk = 3 and vk+1 = 4. Both nls and np are 1, so we
would add one point to one edge of key frame k. No points would be added to the remaining
lines of key frame k.
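
A small C sketch (an illustrative helper only) that computes the preprocessing parameters for both the edge-count and vertex-count rules; calling it with the values from the two worked examples above reproduces the results quoted:

#include <stdio.h>

/* Assumes lmin >= 1 and vmin >= 2 so the divisions are well defined. */
void equalise_counts(int lk, int lk1, int vk, int vk1)
{
    int lmax = lk > lk1 ? lk : lk1, lmin = lk < lk1 ? lk : lk1;
    int ne = lmax % lmin;               /* edges of the smaller frame to split      */
    int ns = lmax / lmin;               /* sections for the remaining edges         */

    int vmax = vk > vk1 ? vk : vk1, vmin = vk < vk1 ? vk : vk1;
    int nls = (vmax - 1) % (vmin - 1);  /* line sections receiving np points        */
    int np  = (vmax - 1) / (vmin - 1);  /* points added to each such section        */

    printf("edge rule:   split %d edges into %d sections, the rest into %d\n",
           ne, ns + 1, ns);
    printf("vertex rule: add %d points to %d sections, %d points to the rest\n",
           np, nls, np - 1);
}

/* equalise_counts(15, 11, 3, 4) reproduces the worked examples above. */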

 Simulating Accelerations
For constant speed (zero acceleration) we use equal-interval time spacing: the
time interval between the key frames is divided into n+1 subintervals, yielding an in-
between spacing of

δt = (t2 - t1) / (n+1)

We then calculate the time for the in-betweens as

tBj = t1 + j δt

Nonzero accelerations are used to produce realistic displays, which are being modeled
using animation path with a spline or trigonometric functions.

To model increasing speed (positive acceleration), we want the time spacing
between frames to increase so that greater changes in position occur as the object moves
faster. We can obtain an increasing interval size with the function

1 - cos θ,   0 < θ < π/2

For n in-betweens, the time for the jth in-between is then calculated as

tBj = t1 + δt [1 - cos (jπ / 2(n+1))],   j = 1, 2, ..., n

where δt is the time difference between the two key frames.

To model decreasing speed (deceleration), we use

tBj = t1 + δt [sin (jπ / 2(n+1))],   j = 1, 2, ..., n

where δt is again the time difference between the two key frames.
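
A C sketch of the three in-between timing rules (constant speed, acceleration and deceleration); the function names are our own:

#include <math.h>

#define PI 3.14159265358979323846

/* Time of the j-th in-between frame (j = 1..n) between key frames. */
double tb_constant(double t1, double t2, int n, int j)    /* zero acceleration */
{
    double dt = (t2 - t1) / (n + 1);
    return t1 + j * dt;
}

double tb_accelerate(double t1, double dt, int n, int j)  /* speeding up */
{
    return t1 + dt * (1.0 - cos(j * PI / (2.0 * (n + 1))));
}

double tb_decelerate(double t1, double dt, int n, int j)  /* slowing down */
{
    return t1 + dt * sin(j * PI / (2.0 * (n + 1)));
}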

MOTION SPECIFICATION

 Direct Motion Specification

The most straight forward method for defining a motion sequence is direct specification
of the motion parameters. Here we explicitly give the rotation angles and the translation
vectors. Then the geometric transformation matrices are applied to transform coordinate
positions. Alternatively, we could use an approximating equation to specify certain kinds of
motions. We can approximate the path of a bouncing ball, for instance, with a damped,
rectified, sine curve

y(x) = A |sin (ωx + θ)| e^(-kx)

where A is the initial amplitude, ω is the angular frequency, θ is the phase angle, and
k is the damping constant.
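
A one-line C sketch evaluating this damped, rectified sine path (parameter names follow the equation above):

#include <math.h>

/* Height of the bouncing ball at horizontal position x:
   y(x) = A * |sin(w*x + theta)| * exp(-k*x). */
double bounce_y(double x, double A, double w, double theta, double k)
{
    return A * fabs(sin(w * x + theta)) * exp(-k * x);
}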


Figure 3.38 Approximating the motion of a bouncing ball with a damped sine
function

 Goal Directed Systems

As the opposite extreme we can specify the motions that are to take place in general
terms that abstractly describe the actions. These systems are referred to as the goal directed
systems because they determine the specific motion parameters given the goals of the
animation. For example, we could specify that we want to “walk” or to “run” to a particular
destination, and so on. The input directives are then interpreted in terms of component
motions that accomplish the selected task. Human motions for instance, can be defined as
a hierarchical structure of sub motions of the torso, limbs, and so forth.

 Kinematics and Dynamics

We can also construct animation sequences using kinematics or dynamic descriptions.


With a kinematics description, we specify the animation by giving the motion parameters
(position, velocity, and acceleration) without references to the forces that cause that motion.
For constant velocity (zero acceleration), we designate the motions of rigid bodies in a
scene by giving an initial position and velocity vector for each object.

For example, v (3, 0,-4) km/sec

Kinematics specification of a motion can also be given by simply describing the motion
path. This is often done using spline curves.

An alternate approach to this is to use inverse kinematics. Here, we specify the initial
and final positions of the objects at specified times and the motion parameters are computed
by the system.

The Future

One open challenge in computer animation is a photorealistic animation of humans.


Currently, most computer-animated movies show animal characters (finding nemo), fantasy
characters, or cartoon-like humans (the incredibles). The movie final fantasy: the spirits
within is often cited as the first computer-generated movie to attempt to show realistic-
looking humans. However, due to the enormous complexity of the human body, human
motion, and human biomechanics, realistic simulation of humans remains largely an open
problem. It is one of the “holy grails” of computer animation. Eventually, the goal is to

create software where the animator can generate a movie sequence showing a photorealistic
human character, undergoing physically-plausible motion, together with clothes,
photorealistic hair, a complicated natural background, and possibly interacting with other
simulated human characters. This should be done in a way that the viewer is no longer able
to tell if a particular movie sequence is computer-generated, or created using real actors in
front of movie cameras. Achieving such a goal would mean that conventional flesh-and-
bone human actors are no longer necessary for this kind of movie creation, and computer
animation would become the standard way of making every kind of a movie, not just
animated movies. However, living actors will be needed for voice-over acting and motion
capture body movements. Complete human realism is not likely to happen very soon,
however such concepts obviously bear certain philosophical implications for the future of
the film industry.

Then we have the animation studios who are not interested in photorealistic cgi features,
or to be more precise, they want some alternatives to choose from and may prefer one
style over another, depending on the movie.

For the moment it looks like three dimensional computer animation can be divided
into two main directions: photorealistic and non-photorealistic rendering. Photorealistic
computer animation can itself be divided into two subcategories: real photorealism (where
performance capture is used in the creation of the virtual human characters) and stylized
photorealism. Real photorealism is what final fantasy tried to achieve, and in the future it will
most likely give us live-action fantasy features such as the dark crystal without
having to use advanced puppetry and animatronics, while antz is an example of stylized
photorealism (in the future, stylized photorealism may be able to replace traditional stop-
motion animation such as corpse bride). As mentioned, neither is perfected yet, but the
progress continues. The non-photorealistic/cartoonish direction is more like an extension
and improvement of traditional animation, an attempt to make the animation look like a
three dimensional version of a cartoon, still using and perfecting the main principles of
animation articulated by the nine old men, such as squash and stretch. While a single frame
from a photorealistic computer animated feature will look like a photo if done right, a single
frame from a cartoonish computer animated feature will look like a painting (not to
be confused with cel shading, which produces an even simpler look).

QUESTIONS
1. Prove that three successive 3D matrix transformations are commutative for scaling
and for rotation.
2. Scale a cubical object with sides of two cm each, using scaling parameters
(3,3,3) for the x, y and z axes respectively.
3. Devise a procedure for rotating an object based on an OCTREE structure.
4. Write a procedure for performing a two point perspective projection of a cubical
object.
5. Extend the Liang-Barsky clipping algorithm for 3D objects against a specified regular
parallelepiped.
6. Write a program to display the visible surfaces of a convex polyhedron using the BSP
tree method.
7. Devise an algorithm for viewing a single sphere using the ray casting method.
8. Discuss in detail about wire frame models and its applications.
9. Discuss the types of animation and tools required for performing the same.
10. Compare different color models and the specific applications of each.

UNIT IV

OVERVIEW OF MULTIMEDIA
LEARNING OBJECTIVES

 To understand the logical components of a multimedia system, peripherals, storage
schemes, image, video and graphics formats and standards, authoring systems and
various design issues.

4.1 LOGICAL COMPONENTS OF MULTIMEDIA SYSTEM

A multimedia computer system is a computer system that can create, import, integrate,
store, retrieve, edit, and delete two or more types of media materials in digital form, such
as audio, image, full-motion video, and text information. Multimedia computer systems
also may have the ability to analyze media materials (e.g., counting the number of occurrences
of a word in a text file).

Figure 4.1 Layout of a multimedia system with components

4.2 PERIPHERALS
Input devices

 Cameras, Eye Tracking, Infrared Remote Control, Joystick, Keyboards, Mice,


Microphones, OCR, Scanners, Speech Recognition, Tablets, Touch Screens
 Trackballs , Whiteboard.

Output devices

 Aroma Dispenser, Headphone, Monitor, Speaker, Speech Synthesizer,


Whiteboard.

Storage units
 Erasable Optical: Ex., 3.5 Inch
 Jukebox: Disk, Tape
 Magnetic: Diskette, Disk
 Read-Only Optical: CD-ROM

Components

Higher resolution screens with at least 640x480 video. SVGA at 640x480 uses 256 colours
via a CLUT; 15-bit colour handles 32,768 colours; 24-bit colour handles 16,777,216 colours.

Fast graphics
 24-to-8-bit color conversion with dithering (e.g., 24->8)
 e.g., Heckbert's median cut algorithm (a minimal sketch follows below)
 N.B. If you only have 4 bits, it is better to switch to gray scale.
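A minimal sketch of the median-cut idea mentioned above is given below (in Python). It is a
simplified illustration, not Heckbert's full published algorithm; the function name and the
sample colours are invented for illustration only.

# Simplified median-cut colour quantisation (illustrative sketch only).
# Input: a list of (r, g, b) tuples; output: a small palette of representative colours.
def median_cut(pixels, depth=3):
    if depth == 0 or len(pixels) <= 1:
        # Represent this box of colours by its mean colour.
        n = len(pixels)
        return [tuple(sum(p[i] for p in pixels) // n for i in range(3))]
    # Find the channel (0=R, 1=G, 2=B) with the widest range and sort along it.
    widest = max(range(3), key=lambda i: max(p[i] for p in pixels) - min(p[i] for p in pixels))
    pixels = sorted(pixels, key=lambda p: p[widest])
    mid = len(pixels) // 2
    # Split at the median and recurse on both halves.
    return median_cut(pixels[:mid], depth - 1) + median_cut(pixels[mid:], depth - 1)

sample = [(250, 10, 10), (240, 20, 15), (10, 200, 30), (12, 190, 25), (5, 5, 250), (8, 3, 240)]
print(median_cut(sample, depth=2))     # at most 4 palette entries for these 6 colours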

Sound boards
 Sampling rates to 44.1 kHz for 16 bit samples
 Sound Blaster is one of the original “standards” (in MPC).
 MIDI is cheaper with an FM synthesizer that generates sounds algorithmically.
 Wavetable synthesizers cost more, but sound more realistic, and use stored
waveforms from actual instruments. DSPs are desirable because of their flexibility. 48
kHz audio is slightly better, but it is hard to tell the difference.

Computer-based

Game/Home Systems

Flops

CD-I, CDTV, VIS. By mid-1993, only about 25K players had shipped in the US. In contrast,
1992 Sega CD-ROM add-on sales were 200K!

CD-I (Philips)
Introduced in 1986, shipped in 1992, MPEG added in 1994. A 68000-based target machine
with numerous constraints:
 Real-time OS
 Slow evolution of authoring environments

CDTV(Commodore)
 Out to market before CD-I
 Built upon successful Amiga 500
 Real-time OS, rich multimedia support
 Investigated fractal compression
 Problems: computer too slow,
 CD-ROM drive too slow

Cathode Ray Tube


1. Monochrome CRT
2. Raster Scanning.
3. Frames and refresh rate:
The electron beam produces a complete frame of the picture by starting from the
top-left corner, moving over all the pixels, and returning to the starting point.
4. Color CRT.
5. Interlacing.
6. Monitor Specifications:
1) Refresh rate
2) Horizontal scan rate
3) Dot pitch
4) Pixel Addressability
5) Aspect Ratio
6) Monitor Size
7) Resolution
8) Color Depth

Video Adapter Card

The video adapter card is an expansion card which usually sits on the Motherboard.
It acts as an interface between the processor of the computer and the monitor.

Early Adapters
1) Mono Display Adapter
2) Color Graphics Adapter

3) Hercules Graphics Adapter


4) Enhanced Graphics Adapter

Video Graphics Array

VGA was a standard introduced by IBM in 1987 which was capable of displaying
text and graphics in 16 colors.

Extended Graphics Array

Technically the VGA standard was replaced by XGA from IBM in 1990.

Essential Components of an adapter card


1) Display Memory
2) Graphics Controller
3) Accelerated Graphics Port
Video Adapter Cable

The video adapter card is connected to the monitor with the video adapter cable.

Liquid Crystal Display


1) LCD is a suitable component for portable devices like laptops, calculators, mobile
phones, etc.
2) Another advantage is its low power consumption.

Types of LCD devices


1. Passive matrix
2. Active Matrix

Plasma Display Panel

About plasma

Plasma is an energetic gas state of matter often referred to as the fourth state of
matter.

Merits
1) Color reproduction and contrast is good.
2) Large screen size of 70’’ is possible.

Demerits
1) The PDP is quite fragile and may create shipping problems.
2) Detail in dark scenes is not as good as CRTs.

4.3 MULTIMEDIA – STORAGE SCHEMES


Introduction

Multimedia data is often very large. A single image may take up several megabytes,
while a single 90 minute video might occupy several gigabytes of disk space, even when it
is highly compressed. Hence, it is imperative that media data be stored in secondary and
tertiary storage devices such as disks, CD-ROMs, and tapes. Unlike traditional database
applications, the retrieval of dynamic multimedia data such as audio and video requires
that it be continuous - the segments of video must be retrieved and presented to the user in
a manner that has no jitter or delay.

4.3.1 Raid Arrays and Placement Methods

One of the most successful disk-based storage methods has been the RAID (redundant
array of inexpensive disks) paradigm. RAID provides several architectures by which many
relatively inexpensive disks can be used together to provide scalable performance. A “block”
is used to denote the smallest chunk of data that will be read or written. Data may be
divided up into several contiguous blocks. For example, a video being shown at 30 frames
per second may be broken up into blocks composed of 1800 frames (1 minute). Some of
the well known RAID architectures are:

Raid 0 Architecture

Figure 4.2 Raid 0 architecture

This is the simplest form of RAID architecture. In this architecture there are a set of n
disks, labeled 0,1, ..., (n - 1), that are accessed through a single disk controller.

Disk Striping
Disk striping stores data across multiple physical storage devices. In a stripe set,
multiple partitions of equal size on separate devices are combined into a single logical
device. A k-stripe is a set of k drives, for some integer k <= n that divides n. Intuitively,
once n and k are fixed, we can, in effect, logically (not necessarily physically) split the set
of disk drives into n/k clusters, consisting of k drives each.

When storing a set b0, b1, ..., br-1 of contiguous blocks in terms of a k-striped
layout, we store block b0 on disk 0, block b1 on disk 1, block b2 on disk 2, and so on. In
general, we store block bi on disk i mod k. Furthermore, a stripe could start at disk j rather
than at disk 0, in which case block bi would be stored on disk ((i+j) mod k).

The picture provides a simple layout of two movies. The blocks of the first movie are
denoted by b0, b1, b2,b3,b4. These are striped with k = 3 starting at disk 0. Thus, block
b0 is placed on disk 0, block b1 is placed on disk 1, and block b2 is placed on disk 2.
Block b3 is now placed back on disk 0, and block b4 is placed on disk 1.

The second movie has six blocks, denoted by c0, ...,c5, and these are striped with k
= 4 and starting at disk 1. (i.e., j = 1). Thus, block c0 is placed on disk 1, block c1 is
placed on disk 2, and so on.
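A small sketch (in Python) of the placement rule just described may help; the function name
is illustrative, and it simply applies the (i + j) mod k rule to the first movie of the example.

# Sketch of k-striped placement: block i of a stripe starting at disk j
# is stored on disk (i + j) mod k.
def disk_for_block(i, k, j=0):
    return (i + j) % k

# First movie from the example: blocks b0..b4, k = 3, stripe starting at disk 0.
for i in range(5):
    print(f"block b{i} -> disk {disk_for_block(i, k=3, j=0)}")
# Prints b0->0, b1->1, b2->2, b3->0, b4->1, matching the layout described above.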

When we stripe a movie across k disks, it is as though the transfer rate of the disk has
increased k-fold. The reason for this is that data can be read, in parallel, from each of the
disks. For example, the controller can directly read blocks b0,b1 and b2 in parallel from
the three disks containing those blocks. Thus, the larger k is, the better the transfer rate of
the disk. However, we should not increase k arbitrarily because the actual transfer rate is
limited by the buffer size as well as the output bandwidth of the channel/bus to which the
disk array is connected. Furthermore, in practice k-fold increases are almost never obtained
due to various performance attenuations.

A major disadvantage with the RAID-0 architecture is that of reliability. If a disk
crashes, then the system as a whole crashes.

Raid - 1 Architecture

The RAID-1 architecture basically uses only half the available disks. In other words, if
there are N disks available altogether, then n = N/2 disks are utilized. For each disk, there is
a mirror disk. Striping is done across the n disks as before. RAID-1 is predicated on the
assumption that there is a very low probability that a disk and its mirror will fail simultaneously.
When we wish to read from a disk, we read from the disk (if it is active) or we read from
its mirror. The obvious disadvantage of the RAID-1 architecture is that only 50% storage
utilization is achieved - this is the price paid for the desired reliability.

RAID -5 Architecture
The RAID-5 architecture shown in figure 4.3 is perhaps the best suited for database
applications.

Figure 4.3 RAID 5 Architecture

It reflects a simple but elegant trade-off between efficient utilization of available storage
and excellent reliability. In the RAID-5 architecture, each cluster of k disks has one disk
reserved as a parity disk, which enables data to be recreated should one of the drives
in the parity set fail.
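The parity idea can be illustrated with a small sketch (Python): the parity block is the bitwise
XOR of the data blocks in a stripe, so any single lost block can be rebuilt by XOR-ing the
surviving blocks with the parity. This is a toy illustration of the principle only, not an
implementation of a real RAID controller.

# Toy illustration of RAID-5 style parity: parity = XOR of all data blocks in a stripe,
# so one missing block can be reconstructed from the remaining blocks plus parity.
from functools import reduce

def xor_blocks(blocks):
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

data = [b"AAAA", b"BBBB", b"CCCC"]        # data blocks of one stripe
parity = xor_blocks(data)                 # stored on the parity disk of the cluster

# Suppose the disk holding data[1] fails; rebuild it from the survivors and the parity.
rebuilt = xor_blocks([data[0], data[2], parity])
assert rebuilt == data[1]
print("rebuilt block:", rebuilt)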

4.3.2 An Algorithm to Support VCR Functions

In order to design a server architecture that provides access to a set of clustered
RAID arrays, we must write a program that takes a user's request for data, described by the
4-tuple data(i,t) = (m, b, len, step) defined earlier (where m is the movie, b is a block in
the movie, len > 0 is an integer and step is an integer), and that returns an assignment of
retrieval tasks to the different disk servers being accessed by the movie-on-demand server.

For example, consider the situation shown in figure 4.4. Here, our movie-
on-demand (MOD) server is providing access to three disks. For the sake of simplicity,
these disks contain only one movie, composed of 300 blocks. Disk 1 contains blocks 1-
150; disk 2 contains blocks 150-250; and disk 3 contains blocks 200-300.


Figure 4.4 MOD server with 3 disks

At time t, suppose user u1 is watching blocks at the rate of 2 blocks per time unit, and
suppose he is watching block 140. His current request is being served by disk 1. Suppose:
1) The user continues watching the movie in "ordinary" mode. That is, data(u1, t) =
(m, 140, 2, 1). In this case, the blocks to be retrieved are blocks 140 and 141, which
can only be served by disk 1. The MOD server therefore ships a request to disk 1's
server, requesting these blocks.
2) The user pauses: in this case, data(u1, t) = (m, 140, 1, 0). In this case, the user is
shown no new blocks, and the block that was previously on the screen stays on.
3) The user fast-forwards at 6 blocks per time unit: in this case, data(u1, t) = (m, 140,
2, 6). That is, the blocks to be retrieved are blocks 146 and 152. Block 146
exists only on disk 1, and block 152 exists only on disk 2. Hence, to satisfy this
request by the user, the MOD server must dispatch two requests - one each to the
servers controlling disks 1 and 2.
4) The user rewinds at 6 blocks per time unit: in this case, data(u1, t) = (m, 140, 2, -6).
That is, the blocks to be retrieved are blocks 134 and 128, both of which can
be retrieved from disk 1 only.
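A minimal sketch (Python) of this request-expansion step is given below. It assumes the
4-tuple (m, b, len, step) described earlier and the three-disk layout of this example (disk 1:
blocks 1-150, disk 2: 150-250, disk 3: 200-300); the function names are invented, and the
block-indexing convention follows the fast-forward case above.

# Sketch: expand a request (start block b, length, step) into the blocks to retrieve,
# then find which disks of the example layout can serve each block.
DISK_RANGES = {1: (1, 150), 2: (150, 250), 3: (200, 300)}   # example layout only

def blocks_for_request(b, length, step):
    # Follows the fast-forward/rewind convention: b + step, b + 2*step, ...
    return [b + (i + 1) * step for i in range(length)] if step != 0 else []

def disks_for_block(block):
    return [d for d, (lo, hi) in DISK_RANGES.items() if lo <= block <= hi]

# Fast-forward case from the example: data(u1, t) = (m, 140, 2, 6).
for blk in blocks_for_request(140, 2, 6):
    print(blk, "-> disks", disks_for_block(blk))
# Block 146 is only on disk 1 and block 152 only on disk 2, so the MOD server
# must dispatch one request to each of the two disk servers.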

Suppose users u1 and u2 are both accessing movies at the same time, at the rate of 2
blocks per time unit, and users u1 and u2 are watching blocks 140 and 199, respectively.
Suppose now that user u1 fast-forwards at the rate of 5 blocks per time unit, while user u2
continues normal viewing.
1) At time t, user u1 wants blocks 140 and 145, while user u2 wants blocks 199 and
200. User u1's request should be satisfied by disk 1, while user u2's request is
satisfied by disk 2.
2) At time t + 1, user u1, continuing the same transaction as above, wants blocks 150
and 155, while user u2 wants blocks 201 and 202. Here are some constraints
that need to be taken into account when attempting to satisfy this:

 Block 150 is available on both disks 1 and 2.


 Block 155 is available only on disk 2.
 Block 201 and 202 are available on both disks 2 and 3.

Either of the following assignments can be used to handle the requests of the two
users:
 The user u1's transaction data(u1, t+1) = (m, 150, 2, 5) is split into two sub
transactions, denoted by the 4-tuples (m, 150, 1, 5) and (m, 155, 1, 5). The first sub
transaction is served by disk 1, the second by disk 2. In the same way, user u2's
transaction data(u2, t+1) = (m, 200, 2, 1) is split into two sub transactions,
denoted by the 4-tuples (m, 201, 1, 1) and (m, 202, 1, 1). The first sub transaction
is served by disk 2, the second by disk 3. (Notice that the second sub transactions
of both users' requests could be satisfied by disk 2, but in the first user's case, only
disk 2 can satisfy it, while in the second user's case, either disk 2 or disk 3 can
satisfy it.)
 An alternative possibility is that we split user u1’s transaction into two as above
and have disks 1 and 2 satisfy the sub transactions; however, instead of splitting
user u2’s request, we switch his entire transaction to disk 3.
 Looking at the above example, two new operations have been introduced: splitting,
which causes a user’s transaction to be split into two or more pieces (called twins),
and switching, which causes a user’s transaction (or its descendant sub transactions)
to be switched from the server that was originally handling the request to another
server.

4.4 MULTIMEDIA SOFTWARE

The basic tool set for building multimedia projects contains one or more authoring systems
and various editing applications for text, images, sounds, and motion video.

4.4.1 Text and word processing tools

Word processors such as Microsoft Word, Word Perfect, are powerful applications
that include spell checkers, table formatters, thesauruses, and prebuilt templates for letters,
resumes, purchase orders, and other common documents. In many word processors, you
can actually embed multimedia elements such as sounds, images, and video.

OCR software: With Optical Character Recognition software, a flat-bed scanner, and
your computer, you can save many hours of rekeying printed words, and get the job done
faster and more accurately than a roomful of typists. OCR software turns bitmapped
characters into electronically recognizable ASCII text. A scanner is typically used to create
the bitmap. Then the software breaks the bitmap into chunks according to whether it
contains text or graphics, by examining the texture and density of areas of the bitmap and
by detecting edges. The text areas of the image are then converted to ASCII characters
using probability and expert system algorithms. Most OCR applications claim about 99%

accuracy when reading 8 to 36 point printed characters at 300 dpi and can reach processing
speeds of about 150 characters per second.

4.4.2 Painting and Drawing Tools

Painting software such as Photoshop, Fireworks, and Painter, is dedicated to producing


crafted bitmap images. Drawing software such as CorelDraw, FreeHand, Illustrator,
Designer, and Canvas, is dedicated to producing vector-based line art easily printed to
paper at high resolution. Typically bitmapped images provide the greatest choice and power
to the artist for rendering fine detail and effects, and today bitmaps are used in multimedia
more often than drawn objects. Some vector based packages such as Macromedia’s
Flash are aimed at reducing file download times on the web, and may contain both bitmaps
and drawn art.

The following are the features common to a drawing or painting package:


 An intuitive graphical user interface with pull-down menus, status bars, palette
control, and dialog boxes for quick, logical selection.
 Scalable dimensions, so you can resize, stretch, and distort both large and small
bitmaps
 Paint tools to create geometric shapes, from squares to circles and from curves to
complex polygons.
 Ability to pour a color, pattern, or gradient into any area
 Ability to paint with patterns and clip art
 Customizable pen and brush shapes and sizes
 Eyedropper tool that samples colors
 Autotrace tool that turns bitmap shapes into vector-based outlines
 Support for scalable text fonts and drop shadows
 Multiple undo capabilities, to let you try again
 History function for redoing effects, drawings, and text
 Property inspector
 Screen capture facility
 Painting features such as smoothing coarse-edged objects into the background
with anti-aliasing: airbrushing in variable sizes, shapes, densities, and patterns: washing
colors in gradients: blending; and masking.
 Support for third-party special effect plug-ins
 Object and layering capabilities that allow you to treat separate elements
independently
 Zooming, for magnified pixel editing

 All common color depths: 1, 4, 8, 16, 24 or 32 bit color, and gray-scale.


 Good color management and dithering capability among color depths using various
color models such as RGB, HSB, and CMYK
 Good palette management when in 8-bit mode
 Good file importing and exporting capability for image formats such as PIC, GIF,
TGA, TIF, PNG, WMF, JPG, PCX, EPS, PTN and BMP.

4.4.3 3-D Modeling and animation tools

With 3-D modeling software, objects rendered in perspective appear more realistic.
Powerful modeling packages such as AutoDesk’s Discreet, Strata Vision’s 3D, Specular
LogoMotion and Infini-D, Alias’ Wavefront, Avid’s SoftImage, and Caligari’s trueSpace
are also bundled with assortments of prerendered 3-D clip art objects such as people,
furniture, buildings, cars, airplanes, trees, and plants. Important for multimedia developers,
many 3-D modeling applications also include export features enabling you to save a moving
view or journey through your scene as a QuickTime or AVI animation file. Each rendered
3-D image takes from a few seconds to a few hours to complete, depending upon the
complexity of the drawing and the number of drawn objects included in it.

A good 3-D modeling tool should include the following features:


 Multiple windows that allow you to view your model in each dimension, from the
camera’s perspective, and in a rendered preview.
 Ability to drag and drop primitive shapes into a scene
 Ability to create and sculpt organic objects from scratch
 Lathe and extrude features
 Color and texture mapping
 Ability to add realistic effects such as transparency, shadowing and fog.
 Ability to add spot, local, and global lights, to place them anywhere, and manipulate
them for special lighting effects
 Unlimited cameras with focal length control
 Ability to draw spline-based paths for animations.

4.4.4 Image-Editing tools

Image-editing applications are specialized and powerful tools for enhancing and
retouching existing bitmapped images, these applications also provide many of the features
and tools of painting and drawing programs and can be used to create images from scratch
as well as images digitized from scanners, video frame-grabbers, digital cameras, clip art
files, or original artwork files created with a painting or drawing package.

Here are some features typical of image-editing applications and of interest to
multimedia developers:
 Multiple windows that provide views of more than one image at a time
 Conversion of major image-data types and industry-standard file formats
 Direct inputs of images from scanner and video sources
 Employment of a virtual memory scheme that uses hard disk space as RAM for
images that require large amounts of memory.
 Capable selection tools, such as rectangles, lassos, and magic wands, to select
portions of a bitmap.
 Image and balance controls for brightness, contrast, and color balance
 Good masking features
 Multiple undo and restore features
 Anti-aliasing capability, and sharpening and smoothing controls
 Color-mapping controls for precise adjustment of color balance
 Tools for retouching, blurring, sharpening, lightening, darkening, smudging, and
tinting
 Geometric transformations such as flip, skew, rotate, distort and perspective
changes.
 Ability to resample and resize an image
 24-bit color, 8 or 4 bit index color, 8-bit gray-scale, black and white, and
customizable color palettes
 Ability to create images from scratch, using line, rectangle, square, circle, ellipse,
polygon, airbrush, paintbrush, pencil, and eraser tools, with customizable brush
shapes and user-definable bucket and gradient fills
 Multiple typefaces, styles, and sizes, and type manipulation and masking routines
 Filters for special effects, such as crystalline, dry brush, emboss, facet, fresco,
graphic pen, mosaic, pixelize, poster, ripple, smooth, splatter, stucco, twirl, water
color, wave and wind etc.,
 Support for third-party special-effect plug-ins
 Ability to design in layers that can be combined, hidden, and recorded

Plug-Ins: Image-editing programs usually support powerful plug-ins available from third-
party developers that allow you to warp, twist, shadow, cut, diffuse, and otherwise “filter”
your images for special visual effects.

4.4.5 Sound Editing tools


Sound editing tools provide features to cut, copy, and paste segments of sound, and
many more useful features. Apple's QuickTime Player Pro and Sonic Foundry's
Sound Forge for Windows are some popular sound editing tools. Some of the features are:

Multiple Tracks: Being able to edit and combine multiple tracks and then merge the
tracks and export them in a final mix to a single audio file is important.

Trimming: Removing “dead air” or blank space from the front of a recording and any
unnecessary extra time off the end is the first sound editing task. Trimming even a few
seconds might make a big difference in file size.

Splicing and Assembly: Using the same tools mentioned for trimming, you will probably
want to remove the extraneous noises that inevitably creep into a recording.

Volume adjustments: If you are trying to assemble ten different recordings into a single
sound track, there is little chance that all the segments will have the same volume. To
provide a consistent volume level, select all the data, in the file, and raise or lower the
overall volume by a certain amount. Best is to use a sound editor to normalize the assembled
audio file to a particular level.
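Peak normalization of the kind described here can be sketched as follows (Python, operating
on a list of raw 16-bit samples); this is only an illustrative fragment, not the behaviour of
any particular editor.

# Sketch of peak normalisation: scale every sample so that the loudest one
# reaches a chosen fraction of 16-bit full scale.
def normalize(samples, target_peak=0.9, full_scale=32767):
    peak = max(abs(s) for s in samples)
    if peak == 0:
        return list(samples)                 # silence: nothing to scale
    gain = (target_peak * full_scale) / peak
    return [int(round(s * gain)) for s in samples]

clip = [1200, -3400, 800, 2500, -1500]
print(normalize(clip))                       # the loudest sample becomes about 29490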

Format Conversion: In some cases your digital audio software might read a format
different from that read by you presentation or authoring program. Most sound editing
software will save files in your choice of many formats, most of which can be read and
imported by multimedia systems.

Resampling or Downsampling: If you have recorded and edited your sounds at 16-bit
sampling rates but are using lower rates and resolutions in your project, you must resample
or downsample the file. Your software will examine the existing digital recording, and work
through it to reduce the number of samples. This process may save considerable disk
space.

Fade-Ins and Fade-outs: Most programs offer enveloping capability, useful for long
sections that you wish to fade in or fade out gradually. This enveloping helps to smooth out
the very beginning and the very end of a sound file.

Equalization: Some programs offer digital equalization capabilities that allow you to modify
a recording’s frequency content so that it sounds brighter or darker.

Time Stretching: Advanced programs let you alter the length of a sound file without
changing its pitch. But most time-stretching algorithms severely degrade the audio quality
of the file if the length is altered more than a few percent in either direction.

Digital Signal Processing: Some programs allow you to process the signal with
reverberation, multitap delay, chorus, flange, and other special effects using digital signal
processing routines.

Reversing Sounds: Another simple manipulation is to reverse all or a portion of a digital


audio recording.

4.4.6 Animation, video, and Digital Movie tools

Animations and digital video movies are sequences of bitmapped graphic scenes,
rapidly played back. But animations can also be made within the authoring system by
rapidly changing the location of objects, or sprites, to generate an appearance of motion.
Most authoring tools adopt either a frame or object oriented approach to animation, but
rarely both.

Movie making tools typically take advantage of QuickTime for Mac and Windows
and Microsoft Video for Windows (AVI) and let you create, edit, and present digitized
motion video segments, usually in a small window in your project.

4.4.6.1 Video formats

Video formats and systems for storing and playing digitized video to and from disk
files are available with QuickTime and AVI. Both systems depend on special algorithms
that control the amount of information per video frame that is sent to the screen, as well as
the rate at which new frames are displayed. Both provide a methodology for interleaving,
or blending, audio data with video and other data so that sound remains synchronized with
the video. And both technologies allow data to stream from disk into memory in a buffered
and organized manner. DVD is a hardware format defining a very dense, two-layered disc
that uses laser light and, in the case of recordable discs, heat to store and read digital
information. The digital information or software on a DVD is typically multiplexed audio,
image, text, and video data optimized for motion picture display using MPEG encoding.

QuickTime is an organizer of time-related data in many forms. Classic video tape


involves a video track with two tracks of audio. QuickTime is a multitrack recorder in
which you can have an almost unlimited range of tracks. Digitized video, digitized sound,
computer animations, MIDI data, external devices such as CD-ROM players and hard
disks, and even the potential for interactive command systems are all supported by the
QuickTime format. With QuickTime, you can have a movie with five different available
languages, titles, MIDI cue tracks, or the potential for interactive commands.

QuickTime for windows and Macintosh


The heart of QuickTime is software based architecture for seamlessly integrating sound,
text, animation and video on MAC and windows platforms. QuickTime is also used to
deliver multimedia to the World Wide Web as a plug-in for Internet Explorer. On the web

QuickTime can deliver 3D animation, real-time special effects, virtual reality, and streaming
video and audio. Its role as a powerful cross platform integrator of multimedia objects and
formats make it a tool upon which multimedia developers depend.
QuickTime building blocks
Three elements make up QuickTime:
 QuickTime movie file format
 QuickTime Media Abstraction Layer
 QuickTime media services
The movie file format is a container that provides a standard method for storing video,
audio, and even text description about a media composition. The Media Abstraction Layer
describes how your computer should access the media that is included in the QuickTime
movie. The media services part of QuickTime not only has the built-in support for over 35
media file formats, including most major video, still image, audio, animation, and MIDI
formats, but also allows developers to plug in their own new or custom media formats.
QuickTime includes built-in support for ten different media types (video, audio, text,
timecode, music/MIDI, sprite/animation, tween, MPEG, VR, 3D) and offers a comprehensive
set of services such as
 Timing and synchronization
 Audio and image data compression and decompression
 Image blitting, format conversion, scaling, composition, and transcoding
 Audio mixing, sample rate conversion, and format conversion
 Audio and video effects and transitions
 Synchronized storage read and write
 Media capture
 Media import and export
 Standard user interface elements, such as movie controllers, media previewers,
and media capture dialogs.

QuickTime embedded commands for HTML

When delivering QuickTime projects on the World Wide Web, you can embed
powerful commands into your HTML documents that control and fine-tune the display of
your QuickTime file;
AUTOPLAY starts a movie playing automatically
BGCOLOR sets a background color for the movie display
CACHE indicates whether the movie should be cached
CONTROLLER specifies whether to display the QuickTime movie controller bar

HEIGHT and WIDTH specify size of the movie in web pages


HIDDEN allows sound-only movies to play in the background without affecting the look
of a web page
HREF indicates which URL to link to when the movie is clicked
LOOP loops movie playback automatically
SCALE scales the movie display automatically
TARGET provides a frame target for the URL specified in an HREF tag
VOLUME sets the default play back volume.
In addition QuickTime has the following VR commands
CORRECTION specifies an image correction mode
FOV sets the initial field-of-view angle.
NODE sets the initial node
PAN sets the initial pan angle
TILT sets the initial tilt angle

Microsoft Video for Windows

Audio Video Interleaved (AVI) is a Microsoft developed format for playing full-
motion interleaved video and audio sequences in windows, without specialized hardware.
Video data is interleaved with audio data within the file that contains the motion sequence,
so the audio portion of the movie remains synchronized to the video portion.

The AVI file format is not an extensible, 'open' environment and lacks features needed
for serious video editing environments. To improve this situation, a group of interested
companies recently created the OpenDML file format to make AVI more useful for the
professional market.

Like Apple’s QuickTime, AVI provides the following features:


 Playback from harddisk or CD-ROM
 Playback on computers with limited memory: data is streamed from the hard disk
or CD-ROM player without using great amounts of memory.
 Quick loading and playing, because only a few frames of video and a portion
of audio are accessed at a time
 Video compression to boost the quality of your video sequences and reduce their
size

98 ANNA UNIVERSITY CHENNAI


COMPUTER GRAPHICS AND MULTIMEDIA SYSTEMS

4.4.7 Movie editors


Specialized video editors have been designed for both the MAC and windows, for
example, AVID’s VideoShop, Media 100, Fast’s VideoMachine, Adobe’s Premiere, and
Apple’s Final Cut Pro etc. These applications let you mix video clips, audio recordings,
animation, still images, and graphics to create QuickTime or AVI movies.

4.4.8 Compressing Movie Files

Image compression algorithms are critical to the delivery of motion video and audio
on both the MAC and PC platforms. The three basic concepts are

Compression ratio: The compression ratio represents the size of the original image
divided by the size of the compressed image that is, how much the data is actually
compressed. Some compression schemes yield ratios that are dependent on the image
content: a busy image of a field of multicolored objects may yield a very small compression
ratio, and an image, of blue ocean and sky may yield a very high compression ratio. Video
compression typically manages only the part of an image that changes from image to image.
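As a worked illustration of the definition (with made-up numbers), the ratio is simply the
original size divided by the compressed size:

# Compression ratio = size of original image / size of compressed image.
def compression_ratio(original_bytes, compressed_bytes):
    return original_bytes / compressed_bytes

# Hypothetical example: a 900 KB frame compressed down to 60 KB.
print(compression_ratio(900_000, 60_000))    # 15.0, i.e. a 15:1 ratio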

4.4.8.1 Image Quality

Compression is either lossy or lossless. Lossy schemes ignore picture information the
viewer may not miss, but that means the picture information is in fact lost – even after
decompression. And as more and more information is removed during compression, image
quality decreases. Lossless schemes preserve the original data precisely. The compression
ratio typically affects picture quality because, usually, the higher the compression ratio, the
lower the quality of the decompressed image.

4.4.8.2 Compression / decompression

While developing projects, a faster compression time is preferred; on the other hand,
users will appreciate a fast decompression time to increase display performance.

For compressing video frames, the MPEG format used for DVD employs three types
of encoding: I-frames (Intra), P-Frames (predicted), and B-Frames (Bidirectional
Predicted). Sequences of these frame types are compiled into a GOP, and all the GOPs
are stitched into a stream of images. The result is an MPEG video file.

4.5 DIGITAL VIDEO STANDARDS

1. Enhanced Definition Television System(EDTV)

These are conventional systems modified to offer improved vertical and horizontal
resolutions.

2. CCIR Recommendations
The international body for television standards CCIR defined a standard for digitization
of video signals known as CCIR-601 Recommendations.

3. Common Intermediate format

It is a non-interlaced format. Its luminance resolution is 360X288 pixels/frame at 30
frames/second.

4. Source Input format

It has a luminance resolution of 360X240 pixels/frame at 30 frames/second or 360X288
pixels/frame at 25 frames/second.

High Definition video and TV

It is a new standard for digital video for improving picture quality compared to the
standard NTSC or PAL formats.

Video File formats and codec’s


1) AVI(Audio /video interleaved)
2) MOV(Quick Time Movie)
3) MPEG(Motion Picture Experts Group)
4) Real Video
5) H.261
6) H.263
7) Indeo Video Interactive
8) Cinepak
9) Sorenson video
10) VDO live
11) DivX
12) XviD
13) 3ivx
14) Nero Digital

4.6 REAL TIME PROTOCOLS

a) Real Time Transport Protocol

It is a protocol providing network transport functions suitable for applications


transmitting real time data such as audio, video and animation.

b) Resource Reservation protocol

It is a protocol that transfers reservations and keeps state at the intermediate nodes.

c) Real Time Streaming Protocol


It is a client server multimedia presentation control protocol designed to address the
need for efficient delivery of streamed multimedia over IP networks.

4.7 OBJECTS FOR MULTIMEDIA SYSTEMS

4.7.1 Text

Text looks like the easiest medium to create and the least expensive to transmit,
but there's more to text creation than meets the eye!

First, effective use of text requires good writing, striving for conciseness and accuracy.
 Advertising wordsmiths sell product lines with a logo or tag lines with just a few
words
 Similarly, multimedia developers are also presenting text in a media-rich context,
weaving words with sounds, images, and animations
 Design labels for multimedia title screens, menus and buttons using words with the
precise and powerful meanings
 Which feedback is more powerful: “That answer was correct.” or “Terrific!”
 When is “Terrific” more appropriate or effective?
 Why is "quit" more powerful than "close"? Why does UM use "out" instead?
 Why is the title of a piece especially important?
 It should clearly communicate the content.
 It should get the user interested in exploring the content.
 Let’s discuss some of your proposed project titles.

Some guidelines for writing effective script:


 Write for your audience, bearing in mind your audience’s background and interests
 Can you assume that your audience knows what the traveling salesman problem
is?
 Yes, if your audience is CS faculty; no, if it’s CS undergraduates.
 When should you use a casual, idiomatic style or a formal, business-like style?
 Again, it depends on your audience.
 Recommended reading for writers: The Elements of Style, by William Strunk, full
of pithy advice and rules of thumb:
 Say it in active voice:
"Genetic algorithms were invented by John Holland in the 1970's."
vs. "John Holland invented genetic algorithms in the 1970's."

 Avoid wordiness: “computer algorithm” vs. just “algorithm”


NOTES  Avoid high-faulting’ phrases: “appropriate incorporated” vs. “using”
 Write and rewrite, bearing in mind that user won’t read much on a screen.

4.7.2 Images

There are three types of images.

1. Visible images:

The images exist for some duration in complete bitmap form , which includes
every pixel captured by the input device.

Example
 Drawings,
 Documents
 Photographs
 Paintings

2. Non-Visible Images

Non-visible images are those that are not stored as images but are displayed as
images.

Example
 Temperature gauge
 Pressure gauge

3. Abstract Images

They are really not images that ever existed as real-world objects or representations.

Example
 Discrete
 Continuous

4.7.3 Audio and voice

Stored audio and video contain compressed information. This can consist of music,
speech, voice commands, telephone conversations and so on.

An audio object need to store information about the sound clip such as the length of
the sound clip, its compression algorithm, playback characteristics and any sound
annotations associated with the original clip that must be played at the same time as overlays.

Full Motion and Live Video


They are the most processing- and storage-intensive components. Full motion video
mostly refers to prestored video clips, while live video, by its very definition, is live and
must be processed while it is being captured by the camera.

Digital Audio
 Windows uses the WAV format, Apple uses AIFF
 Advantages:
 Reliable playback (“what you hear is what you get”)
 Required for speech playback
 Sound effects in \windows\media or the web
 Are any of you planning to use simple digitized sound effects in your projects?
 How so? Where are you getting your sound effects?

4.7.4 Taxonomy Of Audio Representations

Figure 4.5 Audio taxonomy

Analog to digital image conversion

Images are often received from a camera or scanner as a sequence of pixels for each
of the primary colors. Those are converted to a digital bit stream. Each pixel is represented
in a particular color space and encoding scheme. The figure 4.6 shows the RGB to digital
bit stream coding scenario.


Figure 4.6 RGB to digital bit stream coding

4.8 IMA ARCHITECTURE

Open standards must develop in a rich, flexible architecture for development,


interchange, and use. The Interactive Multimedia Association (IMA) represents the major
companies and groups working on multimedia technology. They have an architecture as
shown in figure 4.6 that guides work on the specific tools and systems under development.
MHEG, a standard for multimedia object interchange, is defined in the context of a suite of
related standards.

Sound, Animation And Video

Sound, animation, and video can add a great deal of appeal and sensuality to a piece
but there are always tradeoffs. David Ludwig of Interactive Learning Designs says,

“Let the content drive the selection of media for each chunk of information to be
presented. Use traditional text and graphics when appropriate; add animation when ‘still
life’ won’t get your message across; add audio when further explanation is required; resort
to video only when all other methods pale by comparison.”


Figure 4.6 IMA architecture

Sound is measured in decibels (dB), the pressure or volume:
 Typical voice conversation is ~70dB, a soft whisper is 30dB, a jackhammer is
120dB.
 Decibels measure the energy required to make sounds logarithmically.
 A jackhammer's noise is about 1 watt, but voice conversation is 1/100,000
watt.
 Two major ways to create and deliver sounds are MIDI and digital audio.
 MIDI (musical instrument digital interface) is to audio what vector-based graphics
are to images and PostScript is to text, while digital audio (such as WAV files) is
analogous to bitmaps.
 MIDI is a notation (similar to a musical score) and communications standard for
describing how electronic instruments and synthesizers play musical sound.
 Advantages: MIDI files tend to be much smaller than digitized waveform files;
can be stretched or edited more easily;
may sound better if the playback quality of the instruments is better.
 Disadvantages: doesn't provide reliable playback; depends on what's available.
 Can't play speech.
 Windows 3.1 introduced standards for MIDI playback

 Author ware and other multimedia authoring tools support midi


NOTES  Digital audio is more common for multimedia applications
 Advantages: reliable playback (“what you hear is what you get”)
 Required for speech playback
 Windows also introduced wav format
 Windows comes with some sound files in .wav format, usually in \windows\media
 Users can assign sounds to windows events, such as startup, minimizing window,
etc.
 These sound files may also be imported into your multimedia applications

Digitized sound is sampled sound :

 Every nth fraction of a second, a sample of analog sound is taken and stored in
binary form
 Sampling rate: Indicates how often the sound sample is taken
 Sampling size: Denotes how much information is stored for each sample
 The more often you sample and more data per sample, the higher your quality
and resolution
 The value of each sample is rounded off to the nearest integer (this is called quantization)
 An 8-bit sampling size provides 256 levels to describe the dynamic range of amplitude
 A 16-bit sampling size provides over 65 thousand levels for dynamic range,
 But significantly increases space requirements
 If amplitude is > intervals available, clipping at top and bottom of waves occurs:
 Produces background hissing noise or other distortions (I can't hear them too
good!)
 Three most common frequencies are 44.1 kHz (kilohertz, CD-quality), 22.05 and
11 kHz
Formula for determining the size (in bytes) of a digital recording

 Sampling rate * duration in seconds * (bit resolution / 8) * 1 (mono) or 2 (stereo)


E.g., for a 10 second recording at 22.05 kHz, 8-bit resolution, monophonic (good for
speech):
22050 * 10 * 8/8 * 1 = 220,500 bytes
For good music quality at 44.1 kHz, 16-bit resolution, stereo:
44100 * 10 * 16/8 * 2 = 1,764,000 bytes
This is the "red book" standard for CD-quality audio - but it's expensive for multimedia!
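The size formula above translates directly into a small helper (Python); this is just a sketch
of the arithmetic.

# Size in bytes = sampling rate * duration (s) * (bit resolution / 8) * channels.
def recording_size(rate_hz, seconds, bits, channels):
    return int(rate_hz * seconds * (bits / 8) * channels)

print(recording_size(22050, 10, 8, 1))     # 220,500 bytes: 10 s of mono speech quality
print(recording_size(44100, 10, 16, 2))    # 1,764,000 bytes: 10 s of CD-quality stereo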

Note that most but not all pcs have 16 bit sound cards (not all have sound cards!)

Tips for recording sound


 Make sure sound is at right minimum and maximum levels or volume, to avoid
distortion
 If recording human voice, make sure your narrator speaks clearly, without much
accent
 In the prototype, how many think Prof Pottenger’s was about right
 Some feedback we got is that he may be speaking too slowly
 May want to use tape for volume recording, either VCR tape or digital audio
tape (DAT)
 DAT is higher end, but it also precisely records background noises and glitches!
 Recording at a higher quality than can be used by target playback device may be
useful for archival purposes – playback devices may improve!

A good sound editor (such as cool edit pro) lets you manipulate your sound files in
useful ways
 Trimming: removing blank space from front or back of a sound file
 Splicing: removing extraneous sounds or inserting sounds from other sources
 Amplification (volume adjustments): making different sounds play at consistent
level
Why might amplification in a sound editor not be such a great idea?
What did you think of the reference librarian’s voice in the cimel prototype?
 Re-sampling or down-sampling, e.g., form 16-bit to 8-bit
 Digital signal processing effects: reverberation, flange, fades and other effects
 Format conversion: Windows standard is WAV, Mac is SND or AIF, Unix is AU
(mu-law)
 Cool edit 2000 also supports mp3

A little bug in Cool Edit: when saving from WAV to MP3, it sometimes truncates the start
of the file; you can avoid this bug by using copy and paste from a WAV file.

Animation adds motion to a piece, perhaps to draw attention to what you want the user
to notice. It can be as simple as a transition effect, such as a fade, dissolve or zoom, or as
elaborate and expensive as full cartoon-like cel animation or even 3D animation.

Animation is possible because of a biological phenomenon known as persistence of
vision: an object seen by the human eye remains mapped on the retina for a brief time after
viewing. This makes it possible for a series of images that change slightly and rapidly to blend,
giving the illusion of movement.

 Television video builds 30 entire frames or pictures per second;


Movies on film are typically shot at a camera shutter rate of 24 frames per second, but by
having the projector flash light through each image twice, the viewer sees 48 flashes per second.

 Acceptable multimedia animation can make do with fewer frames per second

Classical cartoon animation

Classical cartoon animation makes a different cel (celluloid sheet) for each frame. Cel
animation artwork begins with key frames for each action, such as a character about to
take a step, pitching its body weight forward. Tweening an action requires calculating the
number of frames between key frames, and then sketching onto cels the series of different
actions.

Computer animation can imitate the classical technique, with key frame, tweening and
layers.

E.g., Macromedia Director and Flash both support these concepts, letting the computer
automate the tweening process where possible.

But computer animation introduces new techniques, designed to conserve resources.
E.g., rather than reproduce an entire cel for each frame, individual objects (called
sprites in Director) move across a background image.

Authorware motions give this effect; Director animations provide finer control of
sprites. A morphing effect can be achieved by dissolving from one image to another,
e.g., from one face to another - many specialized morphing software products are
available.

Animations introduced more file formats: DIR for Director movies, FLI and FLC for
Autodesk Animator and Animator Pro, MAX for 3D Studio MAX, GIF89a for animations in
GIF (the most popular on the web). Director movies may be compressed for the web (yet
another format, DCR).

4.9 VIDEO RECORDING FORMATS

Ever since the first silent movie flickered to life, people have been fascinated with
"motion pictures". However, video places the highest performance demand on processor
speed and memory. Playing a full screen video uncompressed could require 1.8 GB per
minute! Special hardware and software enhancements are needed for better performance,
both on the production and playback side. On the production side, hardware/software
can get pretty expensive!

BROADCAST VIDEO STANDARDS VARY


1. NTSC (National Television Standards Committee, 1952) for USA, Japan, etc. 525
horizontal scan lines, 30 frames per second, with interlacing to prevent flicker.
2. PAL (Phase Alternate Line) for UK, Europe, Australia, South Africa:
adds color to a black-and-white TV signal, 625 lines, 25 per second, interlaced.
3. SECAM (Sequential Color and Memory) used in France, Russia, etc. Similar to PAL,
but uses different broadcast technology.
4. HDTV is just coming out, increases resolution and aspect (more movie-like).
However, broadcast video is still analog; digitizing full broadcast video is prohibitive!

4.9.1 Video Recording Formats

S-VHS keeps colour and luminance on two separate tracks. Sony
Betacam SP features 3 channels for video (red, blue, luminance) and 4 for audio, with higher
resolution; it is considered "the" choice of the broadcast industry and for archiving.

Shooting and editing video: Never underestimate the value of a steady shooting platform.
Shaky camera work is a sure sign of amateur home movies! Use a tripod or even a towel
on a table. Provide plenty of lighting, ideally bright sunlight.

Good lighting is especially needed with lower quality video equipment.

Chroma key or blue screen is a useful special effect (available in Premiere): e.g., to
show Captain Picard on the moon, shoot Picard in front of a screen or wall painted blue,
then shoot another video of a background moonscape, then mix the two together, deleting
the blue chroma key in the Picard shot. This is a popular technique used with 3-D modelling
and graphics software.

Windows AVI (Audio Video Interleaved) and Apple QuickTime MOV are popular formats
for movies. QuickTime 3 was released in 1998, with many new features.

Note: QuickTime movies produced on the Mac must be "flattened" for playback under
Windows (interleaving video and audio together).

Digital video is so data-intensive that video compression schemes are essential. A codec
(encoder/decoder) is a video compression algorithm; compression can take a lot of
processing power and time! MPEG, JPEG, Cinepak, Intel Indeo, and Real Video are some
popular codecs; Cinepak is optimized for CD-ROM playback. MPEG (Motion Picture
Experts Group) has many generations of standards. MPEG-1 was designed for adequate
playback via early CD-ROM technology, supported by hardware on motherboards and
software (QuickTime). MPEG-2 delivers higher resolution, picture quality, interlaced video,
etc., but requires special-purpose hardware for recording and playback. MPEG-4 offers
multiple views and soundtracks per scene, and 3D views for virtual reality. MPEG-7, on the
horizon, integrates multimedia information
in the video! Streaming technologies have made video somewhat more practical on the
web; Real Networks claims that 100,000 hours/week of live audio/video are streamed.

4.10 CREATING CHARACTERS AND OBJECTS ON A COMPUTER

Computer animation combines vector graphics with programmed movement. The
starting point is often a stick figure in which the position of each feature (limb, mouth, etc.) is
defined by avars (animation variables).

The character "Woody" in Toy Story, for example, uses 700 avars, with 100 avars in
his face alone. Successive sets of avars control all movement of the character from frame
to frame. Once the stick model is moving in the desired way, the avars are incorporated
into a full wire frame model or a model built of polygons. Finally, surfaces are added,
requiring a lengthy process of rendering to produce the final scene.

There are several ways of generating the avar values to obtain realistic motion. Motion
tracking uses lights or markers on a real person acting out the part, tracked by a video
camera. Or the avars may be set manually using a joystick or other form of input control.
Toy Story uses no motion tracking, probably because only manual control by a skilled
animator can produce effects not easily acted out by a real person.

4.10.1 Computer animation development equipment

Computer animation can be created with a computer and animation software. Some
examples of animation software are: Amorphium, Art of Illusion, Ray Dream Studio, Bryce,
Maya, Blender, trueSpace, LightWave, 3D Studio Max, Softimage XSI, Alice, and Adobe Flash
(2D). There are many more. Prices will vary greatly depending on target market. Some
impressive animation can be achieved even with basic programs; however, the rendering
can take a lot of time on an ordinary home computer. Because of this, video game animators
tend to use low resolution, low polygon count renders, such that the graphics can be
rendered in real time on a home computer. Photorealistic animation would be impractical
in this context.

Professional animators of movies, television, and video sequences on computer games


make photorealistic animation with high detail. This level of quality for movie animation
would take tens to hundreds of years to create on a home computer. Many powerful
workstation computers are used instead. Graphics workstation computers use two to four
processors, and thus are a lot more powerful than a home computer, and are specialized
for rendering.

A large number of workstations (known as a render farm) are networked together to


effectively act as a giant computer. The result is a computer animated movie that can be
completed in about one to five years (this process is not comprised solely of rendering,
however). A workstation typically costs $2,000 to $16,000, with the more expensive

stations being able to render much faster, due to the more technologically advanced hardware
that they contain.
Pixar's RenderMan is rendering software which is widely used as the movie animation
industry standard, in competition with Mental Ray. It can be bought at the official Pixar
website for about $5,000 to $8,000. It will work on Linux, Mac OS X, and Microsoft Windows
based graphics workstations along with an animation program such as Maya or Softimage
XSI. Professionals also use digital movie cameras, motion capture or performance capture,
blue screens, film editing software, props, and other tools for movie animation.

4.10.2 Hardware animation display technology

Figure 4.7 An example of computer rendered animation


When an image is rendered to the screen, it is normally rendered to something called
a back buffer. There the computer can draw the image, making any necessary changes to
it before it is done. While the computer is rendering, the screen is showing the contents of
what is called the primary or active buffer. Figure 4.7 shows an example of computer
rendered animation.
When the image is completed, the computer tells the screen to draw from the back
buffer. This can be done in one of two ways: the contents of the back buffer can be copied
to the primary buffer (or active buffer - the buffer which is currently being shown), or the
computer can switch where it is drawing from and make the back buffer the new primary
buffer. In this case, the primary buffer becomes the back buffer. This process is usually
called double buffering or, informally, "flipping," because the computer is flipping its use of
primary and back buffers.
This switching should be carried out when it is imperceptible to the user. Therefore it
needs to take place during what is called the "v-sync" or vertical retrace. The v-sync, in
CRTs, takes place when the electron guns reach the bottom right of the screen and need
to reposition the beam to the top left of the screen. This happens very quickly and the
image the guns had just projected remains on the screen as they are moving back to their
starting position. While the guns are repositioning themselves, the computer has enough
time to flip buffers and the new image will be rendered on the screen on the next pass of the
guns. The new image will continue to be displayed until the buffers are flipped once more.
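A very reduced sketch (Python) of the double-buffering idea described above follows: all
drawing goes to the back buffer, and the two buffers swap roles during the vertical retrace.
The class and method names are invented for illustration.

# Minimal sketch of double buffering ("flipping"): render into the back buffer,
# then swap it with the primary buffer once the frame is complete.
class DoubleBuffer:
    def __init__(self, width, height):
        self.primary = [[0] * width for _ in range(height)]   # what the screen shows
        self.back = [[0] * width for _ in range(height)]      # what we draw into

    def draw_pixel(self, x, y, value):
        self.back[y][x] = value            # all drawing goes to the back buffer

    def flip(self):
        # Done during the v-sync so the swap is imperceptible to the viewer.
        self.primary, self.back = self.back, self.primary

buf = DoubleBuffer(4, 3)
buf.draw_pixel(1, 2, 255)
buf.flip()                                  # the newly drawn frame becomes visible
print(buf.primary[2][1])                    # 255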

When the computer fails to wait for the v-sync, a condition called sprite breakup or
image breakup is perceptible. This is highly undesirable and should always be avoided
when possible to maintain the illusion of movement.

4.10.3 Design of animation sequences

 Storyboard layout

The storyboard is an outline of the action. It defines the motion sequence as a basic
set of events that are to take place. Depending on the type of the animation to be produced,
the storyboard could consist of a set of rough sketches or it could be a list of basic ideas
for the motion.

 Object definitions

An object definition is given for each participant in the action. Objects can be defined
in terms of basic shapes, such as polygons or splines. In addition, the associated movements
for each object are specified along with the shape.

 Key-frame specification

A key frame is a detailed drawing of the scene at a certain time in the animation sequence.
Within each key frame, each object is positioned according to the time for that frame.
Some key frames are chosen at extreme positions in the action; others are spaced so that the
time interval between key frames is not too great. More key frames are specified for
intricate motions than for simple, slowly varying motions.

 Generation of the in-between frames

In-betweens are the intermediate frames between the key frames. The number of in-
betweens needed is determined by the media to be used to display the animation. Typically,
time intervals for the motion are set up so that there are from three to five in-betweens for
each pair of key frames.
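Linear in-betweening of a single object position between two key frames can be sketched
as follows (Python); the key-frame positions and the function name are illustrative only.

# Sketch of generating in-between frames by linear interpolation ("tweening")
# between two key-frame positions.
def tween(start, end, n_inbetweens):
    frames = []
    for i in range(1, n_inbetweens + 1):
        t = i / (n_inbetweens + 1)          # fraction of the way from start to end
        frames.append((start[0] + t * (end[0] - start[0]),
                       start[1] + t * (end[1] - start[1])))
    return frames

# Three in-betweens between key frames at (0, 0) and (40, 20).
print(tween((0, 0), (40, 20), 3))           # [(10.0, 5.0), (20.0, 10.0), (30.0, 15.0)]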

Raster animations

In raster animations we do the following: we animate objects along 2D motion paths


using the color-table transformations. Here we predefine the object at successive positions
along the motion path, and set the successive block of pixels to the color-table entries. We
set the pixels at the first position of the object to “on” values, and we set the pixels at the
other object positions to the background color. The animation is then accomplished by
changing the color-table values so that the object is "on" at successive positions along
the animation path as the preceding position is set to the background intensity.
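The colour-table technique can be suggested with a tiny sketch (Python): the pixel values
along the motion path never change, only the colour-table entries do. The palette entries and
colours below are invented for illustration.

# Sketch of colour-table animation: the frame buffer holds fixed palette indices
# along the motion path; motion is produced by changing which palette entry
# holds the object colour while all other path entries hold the background colour.
BACKGROUND, OBJECT = (0, 0, 0), (255, 255, 0)
path_entries = [1, 2, 3, 4]                 # colour-table entries used along the path

for frame, current in enumerate(path_entries):
    palette = {e: (OBJECT if e == current else BACKGROUND) for e in path_entries}
    print(f"frame {frame}: {palette}")      # the object appears to move along the path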

4.11 MULTIMEDIA AUTHORING


Multimedia systems are different from other systems in two main respects:
1. The variety of information objects.
2. The level of integration achieved in using these objects in complex interconnected
applications.

The scope of an application is an important determinant of how it will impact other
applications and the organization as a whole. The design should take into account the scope
of the application.

Multimedia applications fall very broadly into the following categories:


 Mail
 Information repositories
 Business processes.

One fact to note is that all of these classes of application have some common
characteristics:
 They are shared applications used by a large number of users.
 Users share data objects as and when they need them.
 Some processes are carried out in a sequential manner, that is, data processed by one user or group of users goes to the next in a sequence.

Functionality

The functionality of the system is the primary determinant of the technologies required and how they will be used. Each technology utilized in a multimedia application supports a range of functionality options.

Multimedia Authoring System

Authoring systems for multimedia applications are designed with the following two primary target users in mind:
1. Professionals.
2. Average business users.

Multimedia objects must be compressed as they are being captured or immediately thereafter. Most authoring systems are managed by a control application. The control application must determine the storage location and compression format according to the type of the multimedia object, the capturing equipment in use, and the software drivers and compression standards supported by the authoring system.


4.11.1 Design Issues For Multimedia Authoring


 Display resolution
 Data formats for captured data
 Compression algorithm
 Network interfaces
 Storage formats

Display resolution
i. Level of standardization on display resolution
ii. Display protocol standardization.
iii. Corporate norms for service degradations.
iv. Corporate norms for network traffic degradation as they relate to resolution issues.

File Format And Data Compression Issues


i. Compression type
ii. Estimated time to decompress and display or play back the object (for audio and
full motion video)
iii. Size of the object
iv. Object orientation
v. Annotation markers and history
vi. Index markers
vii. Date and time of creation source file name
viii. Version number
ix. Required software

Service Degradation Policies


i. Decline further requests with a message to try later, giving the explanation for the decline as network overload.
ii. Provide the playback service but at a lower resolution.
iii. Provide the playback service at full resolution but, in the case of sound or full-motion video, drop intermediate frames.
iv. Provide service at full resolution and frame rate in blocks.

Design Approaches To Authoring


i. Hypermedia application design specifics.
ii. User interface aspects.
iii. Embedding /linking streams of object to a main document presentation.
iv. Storage of and access to multimedia objects.
v. Playing back combined streams in a synchronized manner.


Types Of Multimedia Authoring System


There are two types of multimedia authoring systems, namely dedicated authoring systems and timeline-based authoring systems.

Dedicated Authoring System

Dedicated authoring systems are the simplest type of system, designed for a single user and generally for single streams. Although a dedicated authoring system is very simple, designing an authoring system capable of combining even two object streams can be quite complex. A structured design approach is very useful in isolating the visual and procedural design components.

Timeline Based Authoring

In a timeline-based authoring system, objects are placed along a timeline. The timeline can be drawn on the screen in a window in a graphic manner, or it can be created using a script in a manner similar to a project plan.

4.11.2 Multimedia Authoring Tools

Multimedia authoring tools provide the important framework needed for organizing
and editing the elements of your multimedia project, including graphics, sounds, animations,
and video clips. Authoring tools are used for designing interactivity and the user interface,
for presenting your project on screen, and for assembling diverse multimedia elements into
a single, cohesive product.

Authoring software provides an integrated environment for binding together the content
and functions of your project, and typically includes everything you need to create, edit
and import specific types of data; assemble raw data into a playback sequence; and provide
a structured method or language for responding to user input. With multimedia authoring
software, we can make
 Video productions
 Animations
 Games
 Interactive web sites
 Demo disks and guided tours
 Presentations
 Kiosk applications
 Interactive Training
 Simulations, prototypes, and technical visualizations


4.11.2.1 Types of authoring tools


The various authoring tools are classified into groups based on the metaphor used for
sequencing or organizing multimedia elements and events

 Card or page based tools

 Icon based, event-driven tools

 Time-based tools

4.11.2.2 Card or page based tools:

In these authoring systems, elements are organized as pages of a book or a stack of cards. These tools are best used when the bulk of your content consists of elements that
can be viewed individually, like the pages of a book or cards in a card file. The authoring
system lets you link these pages or cards into organized sequences. You can jump, on
command, to any page you wish in the structured navigation pattern. Card or page based
authoring systems allow you to play sound elements and launch animations and digital
video.

4.11.2.3 Icon or Object based event-driven tools

In these authoring systems, multimedia elements and interaction elements are organized
as objects in a structural framework or process. Icon or object based, event-driven tools
simplify the organization of your project and typically display flow diagrams of activities
along branching paths. In complicated navigational structures, this charting is particularly
useful during development.

4.11.2.4 Time-based tools

In these authoring systems, elements and events are organized along a timeline, with
resolutions as high as 1/30 second. Time-based tools are best to use when you have a
message with a beginning and an end. Sequentially organized graphic frames are played
back at a speed that you can set. Other elements are triggered at a given time or location


in the sequence of events. The more powerful time-based tools let you program jumps to
any location in a sequence, thereby adding navigation and interactive control.

4.11.3 Features of authoring tools

4.11.3.1 Editing features

The elements of multimedia – images, animations, text, digital audio and MIDI music,
and video clips – need to be created, edited and converted to standard file formats and the
specialized software provide these capabilities. Also editing tools for these elements,
particularly text and still images, are often included in your authoring system. The more
editors your authoring system has, the fewer specialized tools you may need. In many
cases, however, the editors that come with an authoring system will offer only a subset of
the substantial features found in dedicated tools.

4.11.3.2 Organizing features

The organization, design, and production process for multimedia involves storyboarding
and flowcharting. Some authoring tools provide a visual flowcharting system or overview
facility for illustrating your projects structure at a macro level. Storyboards or navigation
diagrams, too, can help organize a project. Because designing the interactivity and navigation
flow of your project often requires a great deal of planning and programming effort, your
storyboard should describe not just the graphics of each screen, but the interactive elements
as well.

4.11.3.3 Programming features

Multimedia authoring systems offer one or more of the following approaches, which
are explained below.
 Visual programming with cues, icons, and objects
 Programming with a scripting language
 Programming with traditional languages, such as Basic or C
 Document development tools

Visual programming with icons or objects is perhaps the simplest and easiest authoring
process. To play a sound or to put pictures into your project just drag the elements icon
into the playlist. Visual authoring tools such as Authorware and Icon Author are particularly
useful for slide shows and presentations.


Authoring tools that offer a very high-level language or interpreted scripting environment for navigation control and for enabling user input, such as Macromedia Director and Macromedia Flash, are more powerful. With scripts you can perform computational tasks, sense and respond to user input, create characters, icons, and motion animations, launch other applications, and control external multimedia devices.

4.11.3.4 Interactivity features

Interactivity empowers the end users of your project by letting them control the content
and flow of information. Authoring tools should provide one or more levels of interactivity:

Simple branching

Conditional branching

Nested IF-THENs, subroutines, event tracking and message passing among objects
and elements

4.11.3.5 Performance Tuning Features

Complex multimedia projects require exact synchronization of events. One will need
to use the authoring tool's own scripting language or custom programming facility to specify
timing and sequence on systems with different processors.

4.11.3.6 Playback features

As you build your multimedia project, you will be continually assembling elements
and testing to see how the assembly looks and performs. Your authoring system should let
you build a segment or part of your project and then quickly test it as if the user were
actually using it.

4.11.3.7 Delivery features

Delivering your project may require building a run-time version of the project using
the multimedia authoring software. A run-time version or standalone allows your project to
play back without requiring the full authoring software and all its tools and editors. Often,
the run-time version does not allow users to access or change the content, structure, and
programming of the project.

4.11.3.8 Cross-Platform features

Choose the target platform. You might sometimes need to buy the software that suits
a particular target development and delivery platform.


4.11.3.9 Internet playability


Because the web has become a significant delivery medium for multimedia, authoring
systems typically provide a means to convert their output so that it can be delivered within
the context of HTML or DHTML, either with special plug-ins or by embedding Java,
JavaScript, or other code structures in the HTML document. Test your authoring software
for Internet delivery before you build your project.

4.11.4 Card and Page based authoring tools

Card and page based authoring systems provide a simple and easily understood
metaphor for organizing multimedia elements. Because graphic images typically form the
backbone of a project, both as navigation menus, and as content, many developers first
arrange their images into a logical sequences or groupings similar to the chapters and
pages of a book, or cards in a catalog. Navigation routines become then simply directives
to go to a page or card that contains appropriate images, and text, and associated sounds,
animations, and video clips.

Page based authoring systems contain media objects: the objects are the buttons, text
fields, graphic objects, backgrounds, pages or cards, and even the project itself. The
characteristics of objects are defined by properties. Each object may contain programming
script, usually a property of that object that is activated when an event related to that
object occurs. Events cause messages to pass along the hierarchy of objects in your project.
As the message travels, it looks for handlers in the script of each object; if it finds a matching
handler, the authoring system then executes the task specified by that handler.

Most page based authoring systems provide a facility for linking objects to pages or
cards. Examples are HyperCard and ToolBook; the scripting languages associated with them are HyperTalk and OpenScript, respectively.

To go to the next card or page when a button is clicked, you would place a message handler into the script of that button. Below is an example script in HyperTalk:

    on mouseUp
      go to next card
    end mouseUp

Here is an example in OpenScript:

    to handle buttonUp
      go to next page
    end buttonUp


Card- and page-based systems typically provide two separate layers on each card: a background layer that can be shared among many cards, and a foreground layer that is specific to a single card.

4.11.5 Icon and object-based authoring tools

Icon-based, event-driven tools provide a visual programming approach to organizing and presenting multimedia. First you build a structure or flowchart of events, tasks and
decisions by dragging appropriate icons from a library. These icons can include menu
choices, graphic images, sounds and computations. The flowchart graphically depicts the
project's logic. When the structure is built, you can add your content: text, graphics, animation,
sounds, and video movies. Then, to refine your project, you edit your logical structure by
rearranging and fine-tuning the icons and their properties.

With icon-based authoring tools, nontechnical multimedia authors can build sophisticated applications without scripting. In Authorware from Macromedia, by placing
icons on a flow line, you can quickly sequence events and activities, including decisions
and user interactions. These tools are useful for storyboarding, as you can change sequences,
add options, and restructure interactions by simply dragging and dropping icons.
Macromedia’s Authorware authoring interface allows for reuse of common multimedia
elements in a drag-and-drop, object-oriented visual environment.


4.11.6 Time-based authoring tools


Time-based systems are popular multimedia authoring tools. Each uses its own
distinctive approach and user interface for managing events over time. Many use a visual
timeline for sequencing the events of a multimedia presentation, often displaying layers of
various media elements or events alongside the scale in increments as precise as one second.
Others arrange long sequences of graphic frames and add the time component by adjusting
each frame’s duration of play.

Example: 1) Director

Macromedia’s Director is a powerful and complex multimedia authoring tool with a broad set of features to create multimedia presentations, animations, and interactive multimedia
applications. In Director you assemble and sequence the elements of your project using a
Cast and a Score.

Cast: The cast is a multimedia database containing still images, sound files, text,
palettes, QuickDraw shapes, programming scripts, QuickTime movies, Flash movies, and
even other Director files. You can import a wide range of data types and multimedia element
formats directly into this Cast, and also create multimedia elements from scratch using
Director’s own tools and editors.

Score: Cast members are tied together using the Score facility. Score is a sequence
for displaying, animating, and playing Cast members, and it is made up of frames that
contain Cast members, tempo, a palette, timing, and sound information. Each frame is
played back on a Stage at a rate specified in the tempo channel. The Score provides
elaborate and complex visual effects and transitions, adjustments of color palettes, and
tempo control. You can synchronize animations with sound effects by highlighting a range
of frames with sound effects by highlighting a range of frames and selecting the appropriate
sound from your cast.


Lingo: Director utilizes Lingo, a full-featured object-oriented scripting language, to enable interactivity and programmed control. A built-in script editor offers Lingo debugging facilities.
Because you can attach scripts to individual elements of the Cast, you can copy and paste
complete interactive sequences. Lingo also uses Xtras, which are special code segments
used to control external sound and video devices.

Using Lingo scripts, you can chain together separate Director Documents and call
other files as subroutines. You can also import elements into your cast using pointers to a
file.

Example 2)Flash

Macromedia’s Flash, like Director, is a time-based development environment. Flash is for developing rich multimedia content for the web. With the Flash player plug-in, Flash delivers far more than simple static HTML pages. ActionScript, the proprietary scripting language of Flash, is less powerful than Lingo.


QUESTIONS
1. Define sampling and encoding. Also discuss the various standards available for
video data.
2. What is a file system interface? State the need.
3. State the functionalities of a mixer.
4. What is a speech synthesizer?
5. Propose a suitable RAID architecture for a live video conferencing system.
6. Compare various real time protocols with other relevant protocols.
7. Discuss in detail about various data compression standards and issues.
8. Discuss the open issues and challenges in multimedia authoring system.

UNIT V

MULTIMEDIA SYSTEMS AND APPLICATIONS


LEARNING OBJECTIVES
 To Understand the various multimedia systems like multimedia communication,
Multimedia database systems, Distributed multimedia systems and their challenges.
 To apply the multimedia systems knowledge for the development of various
applications.

A multimedia computer system can be a single- or multiple-user system. Networked multimedia computer systems can transmit and receive digital multimedia materials over a
single computer network or over any number of interconnected computer networks. As
multimedia computer systems evolve, they may become intelligent systems by utilizing expert
system technology to assist users in selecting, retrieving, and manipulating multimedia
information.

5.1 MULTIMEDIA AND COMMUNICATION TECHNOLOGY

Multimedia finds numerous applications that vary from entertainment to simulation, such as graphics, sound and animation for creating children's games. For example, a flight simulator enables one to fly a plane without even getting off the ground.

5.1.1 Multimedia System Issues

The major multimedia system issues are as follows.

Synchronization (approaches / issues):
 Time code
 Interleaving
 Global clocks (NTP)

Quality of Service (QoS):
 Jitter
 Throwing away frames
 Lowering frame rate


Multimedia Server Issues


Multimedia servers, e.g., for video-on-demand — issues
1. Admission control algorithm
2. Performance enhancement: pre-fetch, buffering, caching
3. Hard vs. probabilistic guarantees

Quality of Service aspects


1. Providing at least minimum bandwidth required
2. Synchronization between audio and video
3. Number of frames played out of order
4. Number of frames lost
5. Number of frames in error
6. End-to-end latency
7. Mean delay and variance of connection
8. Frame rate
9. Image quality: resolution and no. of colors
10. Perceived image
11. Audio quality
12. Frequency response, no. of channels and perceived sound

Multimedia Applications

Public Access
1. Tourist Information System.
2. Public Information System for Railways.

Publishing Industry

A publication can be classified according to the market it caters to and the audience it aims at, such as the family, school children, professional persons or academics.

Edutainment
1. Many Educational games are available in the market.
2. Publishers such as Microsoft, Sierra and Knowledge Adventure have produced educational titles.

Business Communication
1. Employee – related communication
2. Product promotions
3. Customer information
4. Reports for innovation.
5. Multimedia Information Database

Natural language systems for making enquiries


5.1.2 Application Subsystem


5.1.2.1 Collaborative computing

There are many tools for collaborative computing, such as e-mail, bulletin boards,
screen sharing tools, text-based conferencing systems, telephone conference systems,
conference rooms and video conference systems.

Collaborative dimensions

Electronic collaboration can be categorized according to three main parameters:

Time, user scale and control.

Time: With respect to time, there are two modes of cooperative work: asynchronous and
synchronous. Asynchronous cooperative work specifies processing activities that do not
happen at the same time; synchronous cooperative work happens at the same time.

User Scale: The user scale parameter specifies whether a single user collaborates with
another user or a group of more than two users collaborate together.

- A group may be static or dynamic during its lifetime. A group is static if its participating
members are pre-determined and membership does not change during the activity. A group
is dynamic if the number of group members varies during the collaborative activity, i.e.
group members can join or leave the activity at any time.

 Group members may have different roles in the CSCW.

 Groups may consist of members who have homogeneous or heterogeneous characteristics and requirements of their collaborative environment.

Control: Control during collaboration can be centralized or distributed. Centralized control means that there is a chairman who controls the collaborative work and every group member
reports to him or her. Distributed control means that every group member has control over
his/her own tasks in the collaborative work and distributed control protocols are in place
to provide consistent collaboration.

Other partition parameters may include locality and collaboration awareness. Locality partition means that collaboration can occur either in the same place or among users located in different places through tele-collaboration. Collaboration awareness divides group communication systems into collaboration-transparent and collaboration-aware systems. A collaboration-transparent system is an existing application extended for collaboration; a collaboration-aware system is a dedicated software application for CSCW.


5.1.2.2 Group Communication Architecture


Group communication involves the communication of multiple users in a synchronous
or an asynchronous mode with centralized or distributed control.
Group communication architecture consists of a support model, system model, and
interface model. The GC support model includes group communication agents that
communicate via a multi-point multicast communication network as shown below.

5.1.2.3 Application Sharing Approach


Definition: Sharing applications is recognized as a vital mechanism for supporting
group communication activities. Example: when a shared application program (e.g., editor)
executes any input from a participant, all execution results on the shared object (e.g.,
document text) are distributed among all the participants. Shared objects are displayed in
shared windows.
An important issue in application sharing is shared control. The primary design
decision in sharing applications is to determine whether they should be centralized or
replicated.
 Centralized Architecture

In a centralized architecture, a single copy of the shared application runs at one site.
All participants' input to the application is forwarded to that site, and the
application’s output (shared object) is then distributed to all sites. Advantage: Easy
maintenance because there is only one copy of the application that updates the shared
object. Disadvantage: high network traffic because the output of the application needs
to be distributed every time.


 Replicated Architecture
In a replicated architecture, a copy of the shared application runs locally at each site.
Input events to each application are distributed to all sites and each copy of the
shared application is executed locally at each site. Advantages: low network traffic
because only input events are distributed among the sites, and low response times,
since all participants get their output from local copies of the application. Disadvantage:
requirement of same execution environment for the application at each site and difficulty
in maintaining consistency.

A variety of mechanisms exist to maintain data consistency among group members. Examples include centralized locks, floor passing schemes, and dependency detection.

Figure: Centralized architecture (a single copy of the shared application at one site; participants' input is forwarded over the network and the application's output is distributed to shared windows at all sites) and replicated architecture (a copy of the shared application and a shared window at each site; only input events are distributed over the network).

 Conferencing

Conferencing supports collaborative computing and is also called synchronous tele-collaboration. It is a management service that controls the communication among multiple users via multiple media, such as video and audio, to achieve simultaneous face-to-face communication.
 Video is used in technical discussions to display view-graphs and to indicate how
many users are physically present at a conference.
 Audio is an important component in tele-conferencing for describing and clarifying
visual information.


Conference control includes several functions:


NOTES  Establishing a conference.
 Closing a conference.
 Adding new users and removing users who leave the conference.

The control model may be either centralized or distributed.

 Centralized conference control

This provides the establishment of a conference. First the initiator (e.g., chairman)
starts a conference by selecting an initial group of invited conference members. The
knowledge of the conference state is obtained from a central directory server, which implies
that the client has registered his/her location.

Second, each invited client responds to the invitation so that the initiator is informed
of who will participate in the conference. After this step, a negotiation of conference policies
and an admission of resources is performed among the conference participants. During the negotiation, the shared conference state is distributed using a reliable messaging service to all participants. Advantage: this static control guarantees consistency of the conference state and works well for small conferences. Disadvantage: when a new participant wants to
join, explicit exchange of the conference state must be performed among all participants,
which causes large delays. In case of link failure, it is more difficult to re-establish the
conference state.

 Distributed conference control

Distributed conference control is based on a distributed conference state. This is achieved as follows: the initiator of the conference establishes a multicast space with multicast entries for distribution of information to the conference participants and the conference is established; the conference participants join the conference by tuning into a particular multicast entry (e.g., a multicast address), announced through rendezvous means (e.g., sd).

Each site distributes its own participation status to other conference participants, but there is no global notion of group membership, and no guarantee that all users will have the same view of the state space. Hence, this loose control is implemented by retransmitting the state periodically for eventual consistency. The periodic retransmission is done using an unreliable messaging service. The loose control works well for large conferences.

Advantages: inherent fault tolerance and scaling properties. Disadvantage is that the
conference participants may not have the same view of the state space.


SESSION MANAGEMENT
Session management is an important part of the multimedia communication architecture.
It is the core part which separates the control, needed during the transport, from the actual
transport.

Session Control Architecture

Figure: Session control architecture. The session control layer comprises conference control, membership control, floor control, configuration control and media control. Below it, the media agents (a whiteboard or shared workspace agent, a video agent, an audio agent and a sensory data agent) use a reliable transport protocol for the shared workspace and real-time transport protocols for the continuous media.

 Session Manager
1. Local functionalities:
a) Membership control management, such as participant authentication or
presentation of coordinated user interfaces.
b) Control management for shared workspace, such as floor control.
c) Media control management, such as intercommunication among media agents or synchronization.
d) Configuration management such as exchange of interrelated QoS.
e) Conference control management, such as an establishment, modification
and a closing of a conference.


2. Remote functionality:
The session manager communicates with other session managers to exchange
session state information which may include the floor information, configuration
information, etc.
 Media Agents
Media agents are responsible for decisions specific to each type of media. Each
agent performs its own control mechanism over the particular medium.
 Shared Workspace Agent
The shared workspace agent transmits shared objects among the shared applications.

Control

Each session is described through its session state. This state information is either
private or shared among all the session participants. Session management includes two steps to process the session state: (1) establishment and (2) modification of the session.

Several control mechanisms are embedded in session management:

 Floor Control

Within shared workspaces, the floor control is employed to provide access to the
shared workspace. It is often used to maintain data consistency. It uses a floor passing mechanism, which means that at any time only one participant has the floor. The floor is
handed off to another participant when requested.

With real-time audio, there is no notion of data consistency; instead, the floor control
is typically used in more formal settings to promote turn-taking. Floor control for real-time
video is frequently used to control bandwidth usage.

It is also used to implement floor policies. A floor policy describes how participants
request the floor and how the floor is assigned and released.
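
A minimal sketch of such a floor-passing policy is given below: only one participant holds the floor, further requests are queued, and the floor is handed off in first-come-first-served order when released. This illustrates the policy idea only; it is not a wire protocol, and the participant names are made up.

    # Sketch of floor passing: at most one participant holds the floor at a time.
    from collections import deque

    class FloorControl:
        def __init__(self, initial_holder):
            self.holder = initial_holder
            self.requests = deque()        # FIFO queue of pending floor requests

        def request(self, participant):
            if participant != self.holder:
                self.requests.append(participant)

        def release(self):
            # Hand the floor to the next waiting participant, if any.
            self.holder = self.requests.popleft() if self.requests else None
            return self.holder

    floor = FloorControl("alice")
    floor.request("bob")
    floor.request("carol")
    print(floor.release())     # bob now holds the floor; carol is still queued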

 Conference Control

For conferencing applications, conference control is employed.

 Media Control

Media control mainly includes functionality such as the synchronization of media streams.

 Configuration Control

Configuration control includes control of media quality, QoS handling, resource availability and other system components to provide a session according to the user


requirements. This control may embed services, such as the negotiation and
renegotiation of media quality.
 Membership Control

Membership control may include services such as invitation to a session, registration into a session, modification of the membership during the session, etc.

5.1.3 Transport Subsystem

5.1.3.1 User and Application Requirements


 Data Throughput
 Fast Data Forwarding
 Service Guarantees
 Multicasting

Processing and Protocol constraints

Communication protocols have some constraints which need to be satisfied.


 The requirements are satisfied best if they take the shortest possible path through
the system.
 Protocols involve a lot of data movement because of the layered structure of the
communication architecture. But copying data has become a bottleneck, hence
other mechanisms for buffer management must be found.
 Different layers may have different PDU sizes; therefore segmentation and reassembly occur. This phase has to be done fast and efficiently.

 Some parts of protocols may use a retransmission error-recovery mechanism, which imposes requirements on the buffer space for queues at the expense of larger end-to-end delays.

5.1.3.2 Transport Layer

The transport layer needs to provide the following functions for multimedia transmissions:
timing information, semi-reliability, multicasting, NAK based recovery mechanism and rate
control.
Internet Transport Protocols
Transmission control Protocol (TCP)
 TCP provides reliable, serial communication path, or virtual circuit, between
processes exchanging a full-duplex stream of bytes.
 To achieve reliable, sequenced delivery of a stream of bytes, it makes use of
timeouts and positive acknowledgements.
 Flow control in TCP makes use of sliding window technique.


 Further, TCP is not suitable for real-time video and audio transmission because its retransmission mechanism may cause a violation of deadlines, which disrupts the continuity of the continuous media streams.

User Datagram Protocol


 It offers only multiplexing and checksumming, nothing else.
 High level protocols using UDP must provide their own retransmission, packetization,
reassembly, flow control etc.
 Many multimedia applications use this protocol because it provides to some degree
the real-time transport property, although loss of PDUs may occur.
 In general UDP is not suitable for continuous media streams because different
service guarantees cannot be provided.

Real-time Transport Protocol (RTP)


 RTP is an end-to-end protocol providing network functions suitable for applications transmitting real-time data such as audio and video.
 RTP provides functions such as determination of media encoding, synchronization, framing, error detection, encryption, timing and source identification (a header-decoding sketch follows this list).
 RTP has a companion protocol, RTCP (RTP Control Protocol), which is used for monitoring QoS and for conveying information about the participants of the ongoing session.
 RTP does not address resource reservation and does not guarantee QoS for real-time services. Further, it does not guarantee delivery or prevent out-of-order delivery.
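
To make the timing and source-identification fields concrete, the sketch below decodes the fixed 12-byte RTP header defined in RFC 3550; the example packet bytes are made up.

    # Sketch: decoding the fixed 12-byte RTP header. The sequence number detects
    # loss and reordering; the timestamp drives playout; the SSRC names the source.
    import struct

    def parse_rtp_header(packet):
        if len(packet) < 12:
            raise ValueError("truncated RTP packet")
        b0, b1, seq, timestamp, ssrc = struct.unpack("!BBHII", packet[:12])
        return {
            "version":      b0 >> 6,
            "padding":      bool(b0 & 0x20),
            "extension":    bool(b0 & 0x10),
            "csrc_count":   b0 & 0x0F,
            "marker":       bool(b1 & 0x80),
            "payload_type": b1 & 0x7F,
            "sequence":     seq,
            "timestamp":    timestamp,
            "ssrc":         ssrc,
        }

    # Example: version 2, payload type 96, sequence 1234, timestamp 90000.
    pkt = bytes([0x80, 0x60]) + (1234).to_bytes(2, "big") \
          + (90000).to_bytes(4, "big") + (0xDEADBEEF).to_bytes(4, "big")
    print(parse_rtp_header(pkt))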

Xpress Transport Protocol


 It is designed to be an efficient protocol.
 It provides six service types: connection, transaction, unacknowledged datagram,
acknowledged datagram, isochronous stream and bulk data.
 For flow control, XTP uses sliding window, or rate-based flow control. Data packet
retransmissions are triggered by the arrival of status reports showing missing data.

Features of XTP that meet the requirements of multimedia:


 XTP provides a connection-oriented transport and network transmission.
 Different transport services are provided with fast connect-establishment for tele-
transaction service.
 Flexible error management.
 Rate-based flow control which allows a convenient mechanism for throughput and
bandwidth reservation when a QoS request is issued.


Problems with XTP:


 XTP was designed to be implemented in VLSI to achieve high performance because it is too complex to run fast in software.
 If the round trip time of the underlying network frequently fluctuates, XTP
constantly enters the synchronizing handshake which is very undesirable in high
speed networks for continuous media transmissions.
 XTP has a large header, which creates an overhead of 44 bytes regardless of mode.
 Source identification and discrimination are missing in XTP.
 Internetworking with other protocols is not worked out to provide QoS handling and resource reservation.

Internet protocol

The Internet Protocol properties which are relevant to multimedia transmission requirements are:

a) Type of Service

TOS specifies

1. Precedence relation and

2. Services such as minimize delay, maximize throughput, maximize reliability, minimize monetary cost and normal service.

For example, Multimedia conferencing would need service class which supports low
delay, high throughput and intermediate reliability. Precedence handling would support
real-time network traffic.

b) Addressing and Multicasting

The LAN technology brought the concept of convenient broadcasting to all end-
points on the LAN. LAN also introduces the concept of multicasting. This capability allows
an application to send a single message to the network and have it delivered to multiple
recipients. This service is attractive in a variety of distributed applications, including multi-
side audio/video conferencing and distributed database maintenance. Using class D
addresses multicast addressed packets are routed to all targets that are part of multicast
group.

To avoid duplicative replication of multicast packets by multiple routers, a spanning tree of routers is constructed as part of the multicasting algorithm to route and duplicate multicast packets.


The worldwide Internet has been providing an IP multicast routing service for some time now, through an Internet segment called the MBone (Multicast Backbone). The MBone is a collection of UNIX workstations running a routing daemon called “mrouted”, which is an implementation of the Distance Vector Multicast Routing Protocol. Using the MBone, conference sessions and other Internet technical meetings can be multicast, and remote users can listen to the technical talks and ask the speaker questions.

c) Interconnectivity between Internet Protocol and Underlying Networks

The router learns the binding of the IP address to the 48-bit LAN address through the Address Resolution Protocol (ARP). This protocol allows a router to broadcast a query containing an IP address and receive back the associated LAN address. A related protocol, Reverse ARP, can be used to ask which IP address is bound to a given LAN address.
For Multimedia, the best performance would be achieved if a fixed path (static route)
could be allocated. The problem with this extension is that the IP protocol would lose the
ability to bypass link-layer failures, which is a fundamental property of the Internet
architecture and should be retained for integrated services. Further, in the case of a static route, if no resource reservation were performed along the fixed path, the flexibility of changing a route on a packet basis would be lost, which would decrease the performance of the best-effort service.
Internet Group Management Protocol (IGMP)
Internet Group Management protocol (IGMP) is a protocol for managing Internet
multicasting groups. It is used by conferencing applications to join and leave particular
multicast group. The basic service permits a source to send datagrams to all members of a
multicast group. There are no guarantees of the delivery to any or all targets in the group.
Multicast routers send queries to refresh their knowledge of memberships present on
a particular network. If no reports are received for a particular group after some number
of queries, the routers assume the group has no local members, and that they need not
forward remotely originated multicasts for that group onto the local network.
Otherwise, hosts respond to a query by generating reports (host membership reports). Queries are normally sent infrequently, so as to keep the IGMP overhead on hosts and routers very low. However, when a multicast router starts up, it may issue several queries
to quickly build up its knowledge of local membership.
When a host joins a new group, it should immediately transmit a report for that group,
rather than waiting for a query, in case it is the first member of the group.
In a multimedia scenario, IGMP must loosely cooperate with an appropriate resource
management protocol, such as RSVP, to provide a resource reservation for a member
who wants to join a group during a conference.
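
The sketch below shows how a receiving host joins a multicast group: setting the IP_ADD_MEMBERSHIP socket option causes the host's IGMP implementation to report the membership so that routers forward the group's traffic onto the local network. The class D address and port are examples only.

    # Sketch: joining an IP multicast group and receiving one datagram.
    import socket
    import struct

    GROUP, PORT = "239.1.2.3", 5004       # example class D address and port

    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", PORT))

    # Ask the kernel (and, via IGMP, the local routers) for this group's traffic.
    mreq = struct.pack("4s4s", socket.inet_aton(GROUP), socket.inet_aton("0.0.0.0"))
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, mreq)

    data, sender = sock.recvfrom(2048)    # blocks until a multicast datagram arrives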


Resource reservation protocol (RSVP)


RSVP is a protocol which transfers reservations and keeps state at the intermediate nodes. It does not have a data transfer component. RSVP messages are sent as IP datagrams, and the router keeps “soft state”, which is refreshed by periodic reservation messages. In the absence of refresh messages, the routers delete the reservation after a certain timeout.
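
This soft-state behaviour can be sketched as a table of reservations stamped with the time of their last refresh; entries that are not refreshed within the timeout are silently deleted. The timeout value and flow identifiers below are illustrative, not RSVP constants.

    # Sketch of RSVP-style soft state: reservations survive only while refreshed.
    import time

    class SoftStateTable:
        def __init__(self, timeout=30.0):
            self.timeout = timeout
            self.reservations = {}                 # flow id -> time of last refresh

        def refresh(self, flow_id):
            self.reservations[flow_id] = time.monotonic()

        def expire(self):
            # Called periodically by the router: drop reservations not refreshed in time.
            now = time.monotonic()
            for flow_id, last in list(self.reservations.items()):
                if now - last > self.timeout:
                    del self.reservations[flow_id]

    table = SoftStateTable(timeout=30.0)
    table.refresh("flow-1")
    table.expire()            # nothing expires yet; refreshes must keep arriving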

The RSVP protocol was designed to satisfy requirements such as:


 It must accommodate heterogeneous service needs.
 It must give flexible control over the manner in which reservations can be shared
along the branches of the multicast delivery tree.
 It must accommodate elementary actions such as adding one sender and/or receiver
to an existing set, or deleting one.
 It must be robust enough to scale well to large multicasting groups.
 It must provide for advance reservation of resources, and for the preemption that
this implies.

Real-Time Internet Protocol (RTIP)

RTIP provides for connection-oriented, performance-guaranteed, unreliable delivery of packets. It occupies a place in the Tenet protocol stack analogous to that of IP in the Internet protocol suite. It communicates with RCAP for resource reservation; therefore it provides guaranteed service. The Tenet protocol suite was designed for real-time communication, with particular emphasis on multimedia transmission.

5.2 MULTIMEDIA DATABASES (MMDB)

Multimedia data typically means digital images, audio, video, animation and graphics
together with text data. The acquisition, generation, storage and processing of multimedia
data in computers and transmission over networks have grown tremendously in the recent
past.

Contents of MMDB

An MMDB needs to manage several different types of information pertaining to the actual multimedia data. They are:

 Media data

This is the actual data representing images, audio, video that are captured, digitized,
processed, compressed and stored.


 Media format data


This contains information pertaining to the format of the media data after it goes
through the acquisition, processing, and encoding phases. For instance, this consists of
information such as the sampling rate, resolution, frame rate, encoding scheme etc.

 Media keyword data

This contains the keyword descriptions, usually relating to the generation of the media
data. For example, for a video, this might include the date, time, and place of recording,
the person who recorded it, the scene that is recorded, etc. This is also called content-descriptive data.

 Media feature data

This contains the features derived from the media data. A feature characterizes the
media contents. For example, this could contain information about the distribution of colors,
the kinds of textures and the different shapes present in an image. This is also referred to as
content dependent data.

The last three types are called Meta data as they describe several different aspects of
the media data. The media keyword data and media feature data are used as indices for
searching purpose. The media format data is used to present the retrieved information.

Types of Databases

The four classes of the databases are the following.


1. Data processing database
2. Business process database
3. Mail database
4. Information repositories

Data processing database

Information about real-world objects is maintained as alphanumeric information in columns in a table in a relational database or as attributes in objects. A column or attribute can be used to reference objects such as image or video objects.

Business processing database

Documents are edited by a large number of users. The database may be very active and can include a large number of video objects. A large number of users in the group may read most documents during the tracking stage.


Mail database
The addressees and creator of a mail message may access the mail for a short duration before it falls into disuse and is ultimately relegated to archival status. The routing program delivers the message to the recipients' mail files.
Information repositories
Documents may be large and may have multiple linked sound clips and videos.
Database management system
Many inherent characteristics of multimedia data have direct and indirect impacts on
the design of MMDB’s. These include, the huge size of MMDB’s, temporal nature, richness
of content, complexity of representation and subjective interpretation. The major challenges
in designing multimedia databases arise from several requirements they need to satisfy
such as the following:
1. Manage different types of input, output, and storage devices. Data input can be
from a variety of devices such as scanners, digital camera for images, microphone,
MIDI devices for audio and video cameras. Typical output devices are high-
resolution monitors for images and video, and speakers for audio.
2. Handle a variety of data compression and storage formats. The data encoding has
a variety of formats even within a single application. For instance, in medical
applications, the MRI images of brain have lossless or very stringent quality of
lossy coding technique, while the X-ray images of bones can be less stringent.
Also, the radiological image data, the ECG data, other patient data, etc. have
widely varying formats.
3. Support different computing platforms and operating systems. Different users operate
computers and devices suited to their needs and tastes. But they need the same kind of
user-level view of the database.
4. Integrate different data models. Some data such as numeric and textual data are best
handled using a relational database model, while some others such as video documents are
better handled using an object-oriented database model. So these two models should
coexist together in MMDB’s.
5. A variety of user-friendly query systems suited to different kinds of media is needed.
From a user point of view, easy-to-use queries and fast and accurate retrieval of information
is highly desirable. The query for the same item can be in different forms. For example, a
portion of interest in a video can be queried by using either
1) A few sample video frames as an example,
2) A clip of the corresponding audio track or
3) A textual description using keywords.


6. Handle different kinds of indices. The inexact and subjective nature of multimedia data has rendered keyword-based indices, exact and range searches used in
traditional databases ineffective. For example, the retrieval of records of persons
based on social security number is precisely defined, but the retrieval of records of
persons having certain facial features from a database of facial images requires,
content-based queries and similarity-based retrievals. This requires indices that
are content dependent, in addition to key-word indices.
7. Develop measures of data similarity that correspond well with perceptual similarity (see the sketch after this list).
Measures of similarity for different media types need to be quantified to correspond
well with the perceptual similarity of objects of those data types. These need to be
incorporated into the search process
8. Provide a transparent view of geographically distributed data. MMDB’s are likely
to be in distributed nature. The media data resides in many different storage units
possibly spread out geographically. This is partly due to the changing nature of
computation and computing resources from centralized to networked and
distributed.
9. Adhere to real-time constraints for the transmission of media data. Video and
audio are inherently temporal in nature. For example, the frames of a video need
to be presented at the rate of at least 30 frames/sec. for the eye to perceive continuity
in the video.
10. Synchronize different media types while presenting to the user. It is likely that different
media types corresponding to a single multimedia object are stored in different
formats, on different devices, and have different rates of transfer. Thus they need
to be periodically synchronized for presentation.
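
As a sketch of the perceptual-similarity requirement listed above, histogram intersection between two normalized color histograms is one commonly used content-dependent measure; the bin counts and values below are made up.

    # Sketch: histogram intersection between two normalized color histograms.
    # A value near 1 means the images have similar color distributions.
    def histogram_intersection(h1, h2):
        if len(h1) != len(h2):
            raise ValueError("histograms must have the same number of bins")
        return sum(min(a, b) for a, b in zip(h1, h2))

    query  = [0.50, 0.25, 0.15, 0.10]    # 4-bin histogram of the query image
    stored = [0.45, 0.30, 0.10, 0.15]    # 4-bin histogram of a stored image
    print(histogram_intersection(query, stored))   # about 0.90, i.e. quite similar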
The recent growth in using multimedia data in applications has been phenomenal.
Multimedia databases are essential for efficient management and effective use of huge
amounts of data. The diversity of applications using multimedia data, the rapidly changing
technology, and the inherent complexities in the semantic representation, interpretation and
comparison for similarity pose many challenges. MMDB’s are still in their infancy. Today’s
MMDB’s are closely bound to narrow application areas. The experiences acquired from
developing and using novel multimedia applications will help advance the multimedia database
technology.

5.2.1 Generation of MMDB

The first generation

The first MMDBMSs relied mainly on the operating system for storing and querying files. These were ad-hoc systems that served mostly as repositories. The mid-90s saw a first wave of commercial, implemented-from-scratch, full-fledged MMDBMSs. Some of them were MediaDB (now MediaWay), JASMINE, and ITASCA (the commercial successor of ORION).


They were all able to handle diverse kinds of data and provided mechanisms for querying, retrieving, inserting, and updating
data. Most of these products disappeared from the market after some years of existence,
and only some of them continued and adapted themselves successfully to the hardware
and software advances as well as to application changes. For instance, MediaWay provided very specific support early on for a wide variety of different media types. Specifically, different media file formats, varying from images and video to PowerPoint documents, can be managed, segmented, linked and searched.

The second generation

In a second wave, commercial systems were proposed which handle multimedia content by providing complex object types for various kinds of media. The object-oriented style provides the facility to define new data types and operators appropriate for the new kinds of media, such as video, image and audio. Therefore, broadly used commercial MMDBMSs are extensible object-relational DBMSs (ORDBMSs). They have been shipped successfully since 1996-1998, starting with Informix. The current releases significantly improved the performance and integration into the core systems. Future work includes the extension of search services (mainly similarity) to video and audio, and advanced presentation and browsing facilities.

The most advanced solutions are marketed by Oracle 10g, IBM DB2 and IBM
Informix. They propose a similar approach for extending the basic system. As a sample, we
consider the IBM DB2 Universal Database Extenders. The IBM DB2 Universal Database
Extenders extend the ORDBMS management to images, video, audio, and spatial objects.
All these data types are modeled, accessed, and manipulated in a common framework.
Features of the multimedia extenders include importing and exporting multimedia objects
and their attributes in and out of a database, controlling access to non-traditional types of
data with the same level of protection as traditional data and browsing or playing objects
retrieved from the database. For instance, the DB2 Image Extender defines the distinct
data type DB2IMAGE with associated user-defined functions for storing and manipulating
image files

On top of MIRROR runs the ACOI system that is a platform for indexing and retrieval
of video and image data. The system provides a plug-in architecture to subsequently index
multimedia objects using various feature extraction algorithms. ACOI relies on the COBRA
(Content-Based Retrieval) video data model (only low level descriptors). COBRA
introduces a feature grammar to describe the low-level persistent meta-data and the
dependencies between the extraction mechanisms.

The MARS project includes the conception of a multimedia data model, for content
indexing and retrieval, and for database management. The presented multimedia data model
influenced the development of the MPEG-7standard. MARS is a from-the-scratch


management system, based on query refinement processing. Furthermore, a table of content (ToC) extraction mechanism for videos has been developed. A hybrid tree data
structure to support high-dimension feature indexing in multimedia databases has also been
realized. For the multimedia information retrieval, an adapted relevance feedback approach
that is able to learn user’s information need in image databases is proposed.

The MPEG-7 Multimedia Data Cartridge (MDC) is a system extension of the Oracle
9i DBMS providing a multimedia query language, access to media, processing and
optimization of queries, and indexing capacities relying on a multimedia database schema
derived from MPEG-7. The MDC builds on three main concepts. First, the Multimedia
Data Model is the database schema which is derived from MPEG-7 descriptions. It is
realized with the help of the extensible type system of the cartridge environment, i.e.,
descriptors in the MPEG-7 schema are mapped to object types and tables. Secondly, the
Multimedia Indexing Framework (MIF) which provides an extensible indexing environment
for multimedia retrieval. The indexing framework is integrated into the query language and
enables efficient multimedia retrieval. Finally a set of internal and external libraries allow
the access to the media and communicate with MDC (query, insert, update, etc.). The
Multimedia Schema of MDC relies on the one hand on the structural and semantic parts of
the MPEG-7 standard (high-level descriptions). On the other hand, object types for the
MPEG-7 low-level descriptors, like color, shape, texture are provided and linked to the
high-level descriptions. This enables one to retrieve multimedia data not only by low-level
features, but also by semantics in combination with low-level characteristics.

The third generation

The Multimedia Indexing Framework (MIF) offers advanced indexing services to the
MMDBMS. It is generic in a way that new index types may be added without changing
the interface definitions. The MIF is divided into three modules. Each module, especially the Gist Service and the Oracle Enhancement, may be used on its own and may be distributed over the network. The Gist Service is the main part and is realized in the external address space. It offers services for index management and relies on the Generalized Search Trees (GIST) framework. The original GIST framework was extended to cope with several index trees at a time and to be linked to the Oracle DBMS. Several index trees can be employed in the system, in the category of balanced trees (e.g., X-tree and SR-tree), and other access methods not relying on balanced trees (VA-files). This service is split into two
components: Gist Communicator and Gist Holder. The Gist Communicator is a COM-
object (Component Object Model) used for inter-process communication between the
database (by passing through the Gist Wrapper shared library) and the implemented access methods. Thus, the Gist
Communicator supplies the necessary functionality (e.g., creating, inserting, deleting) for
accessing the index structures. The Gist Holder manages all currently running index trees


and the accesses to them. Each index tree is identified through a global and unique ID,
which is forwarded to the accessing process. The integration of MIF into the MDC is
done via the index extension mechanisms of Oracle 9i. For each new index, a new Oracle
index type has to be defined, but the interface to the Gist remains unchanged.

5.3 SYNCHRONIZATION ISSUES

Synchronization is mainly required because of timing mismatches due to communication delays, processing delays and other memory-related delays.

5.3.1 Synchronization

Synchronization in multimedia systems refers to the temporal relations between media objects in the multimedia system, or it may refer to content, spatial or temporal relations between media objects.

A time-dependent media object is presented as a media stream. If the presentation durations of all units of a time-dependent media object are equal, it is called a continuous media object. A video consists of a number of ordered frames; each of these frames has a
media object. A video consists of a number of ordered frames; each of thee frames has a
fixed presentation duration. A time-independent media object is any kind of traditional
media like text and images. Synchronization between media objects comprises relations
between time-dependent media objects and time-independent media objects. An example
of synchronization between continuous media is the synchronization between the visual
and audio information in a television programme. An example of temporal relations between
time-dependent and time-independent media is a slide show. The presentation of slides is
synchronized with the commenting audio stream.

Content relations

Content relations define a dependency of media objects on some data. An example of a content relation is the dependency between a filled spreadsheet and a graphic that represents the data listed in the spreadsheet.

Spatial relations

The spatial relations, usually known as layout relationships, define the space used for the presentation of a media object on an output device at a certain point of time in a multimedia presentation.

Temporal relations

These relations define the temporal dependencies between media objects. They are of interest whenever time-dependent media objects exist.

Intra and Inter object synchronizations


Intra-object synchronization: This refers to the time relation between various presentation
units of one time-dependent media object. An example is the time relation between the
single frames of a video sequence. For a video with a rate of 25 frames per second, each
of the frames must be displayed for 40 ms. Figure below shows this for a video sequence
presenting a bouncing ball.

Frames of a video sequence that shows a jumping ball

Inter-object synchronization: This refers to the synchronization between media objects. The figure below shows an example of the time relations of a multimedia synchronization that starts with an audio/video sequence, followed by several pictures and an animation that is commented by an audio sequence.

Time dependent presentation units

Time-dependent media objects usually consist of a sequence of information units. Such information units are known as Logical Data Units (LDUs). In the case of CD quality with PCM coding without compression, a sample rate of 44,100 Hz with two channels and 16-bit resolution per channel is used. On a CD, these sample values are combined into blocks of 1/75 s duration each, which can be treated as an LDU. The level of granularity used is application dependent. The selection of LDUs depends on the operations that should be performed on a media object.
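As a quick check of these numbers, the following small Python sketch (the helper name is illustrative; the constants are the CD parameters given above) computes the size of one 1/75-second LDU:

    # Size of one CD-audio LDU (a block of 1/75 s) using the parameters above:
    # 44,100 samples/s, 2 channels, 16 bits (2 bytes) per channel sample.
    def ldu_size_bytes(sample_rate=44_100, channels=2, bytes_per_sample=2,
                       ldu_duration=1 / 75):
        samples_per_ldu = sample_rate * ldu_duration      # 588 samples per channel
        return int(samples_per_ldu * channels * bytes_per_sample)

    print(ldu_size_bytes())   # 2352 bytes of PCM data per 1/75 s block

Each such block thus carries 2352 bytes of PCM data, which is the granularity at which the stream can conveniently be scheduled and synchronized.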

Another example is an uncompressed video object that is divided into scenes and frames. The frames can be partitioned into areas of 16x16 pixels. Each pixel consists of luminance and chrominance values. All these units are candidates for LDUs.


In addition, LDUs can be classified into closed and open LDUs. Closed LDUs have a predictable duration. Examples are LDUs that are part of stored media objects of continuous media like audio and video, or stored media objects with a fixed duration. The duration of open LDUs is not predictable before the execution of the presentation. Open LDUs typically represent input from a live source, for example a camera or a microphone, or media objects that include a user interaction.

Classification of Logical Data Units

For digital video, often the frames are selected as LDUs. For example, for a video
with 30 pictures per second, each LDU is a closed LDU with a duration of 1/30 s. Figure
below shows the video LDUs.

In the case of the basic physical unit being too small to handle, LDUs are often selected that block the samples into units of a fixed duration. A typical example is an audio stream where the physical unit duration is too small; therefore LDUs are formed comprising 512 samples. Assuming one sample is coded with one byte, each block contains 512 bytes.
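The duration covered by such a block follows directly from the sample rate. The sketch below only restates this arithmetic; the 8 kHz rate is an assumed example value, since the text does not fix one here:

    # Duration covered by one LDU that blocks 512 one-byte samples together.
    # The sample rate is an assumed example value, not taken from the text.
    def ldu_duration_ms(samples_per_ldu=512, sample_rate_hz=8_000):
        return 1000 * samples_per_ldu / sample_rate_hz

    print(ldu_duration_ms())                        # 64.0 ms per block at 8 kHz
    print(ldu_duration_ms(sample_rate_hz=44_100))   # about 11.6 ms at CD rate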

In computer generated media objects, the duration of LDUs may be selected by the
user. An example of these user-defined LDU durations is the frames of an animation
sequence. For the presentation of a two-second animation sequence, 30 to 60 pictures
may be generated depending on the necessary quality. Thus the LDU duration depends on
the selected picture rate.


LDUs of an animation sequence of user selected size


Streams are more complex when the LDUs vary in duration. An example is the
recording of events at a graphical user interface to replay a user interaction. In this case, an
LDU is an event with a duration lasting until the next event. The duration of LDUs depends
on the user interaction and varies accordingly.

LDUs of varying duration

Open LDUs of unpredictable duration arise when the LDU has no inherent duration. An example of an open LDU is a user interaction in which the duration of the interaction is not known in advance.

An open LDU representing a user interaction

Timers can be regarded as streams of empty LDUs with a fixed duration.

LDUs of a timer

Synchronization examples

1. Lip synchronization demands tight coupling of audio and video streams. Synchronization can be specified by defining a maximal skew between the two media streams as shown below.


LDU view of lip synchronization


2. A slide show with audio commentary demands that the change of slides be
temporally related to the audio commentary.

LDU view of a slide show

3. A lip-synchronized audio/video sequence (Audio1 and Video) is followed by the replay of a recorded user interaction (R1), a slide show (P1-P3) and an animation (Animation), which is partially commented using an audio sequence (Audio2). At the start of the animation presentation, a multiple-choice question is presented to the user (Interaction). If the user has made a selection, a final picture (P4) is shown.

Synchronization example

5.3.2 Real-time synchronization

A further requirement of distributed multimedia applications is the need for a rich set of real-time synchronization mechanisms for continuous media transmission. Such real-time synchronization can be divided into two categories: intra-media synchronization and inter-media synchronization.

5.3.3 Intra-media synchronization

Intra-media synchronization refers to the maintenance of real-time constraints across a single continuous media connection. For example, in video transmission, this type of synchronization is used to ensure that the video is received with the required throughput, jitter (figure 5.1) and latency. The throughput of a continuous media transmission is determined by the average interval between frames, which indicates the frame transmission rate. Jitter refers to the difference between an individual interval and the average interval. Latency refers to the time between the sending and the receiving of a frame.
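All three quantities can be estimated directly from per-frame timestamps. The following Python sketch only illustrates these definitions; the function and variable names are my own and do not come from any particular toolkit:

    # Estimate intra-media synchronization metrics from per-frame timestamps.
    # send_times / recv_times: lists of timestamps in seconds, one per frame.
    def intra_media_metrics(send_times, recv_times):
        intervals = [t2 - t1 for t1, t2 in zip(recv_times, recv_times[1:])]
        avg_interval = sum(intervals) / len(intervals)   # throughput indicator
        # Jitter: worst deviation of an individual interval from the average.
        jitter = max(abs(iv - avg_interval) for iv in intervals)
        latency = sum(r - s for s, r in zip(send_times, recv_times)) / len(recv_times)
        return avg_interval, jitter, latency

    # Example: a 25 frames/s stream with 100 ms network delay; one frame is 5 ms late.
    send = [i * 0.040 for i in range(5)]
    recv = [t + 0.100 for t in send]
    recv[3] += 0.005
    print(intra_media_metrics(send, recv))

Here the average interval stays close to 40 ms, the late frame shows up as a 5 ms jitter, and the latency is roughly the 100 ms transfer delay.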


Figure 5.1 Timing relationships and jitter

5.3.4 Inter-Media Synchronization

Inter-media synchronization is more complex and is concerned with relations between arbitrary, different media types. Examples include the synchronization between an audio and a video channel and the synchronization between text subtitles and a video sequence. The first example illustrates inter-media synchronization between two continuous media types, and the second is between a continuous media type and a discrete media type.

5.3.5 Impact On Synchronization

Multiparty communication also adds complexity to synchronization in general. It is important to be able to support a variety of policies for the ordering of data delivery, for example real-time ordering, causal ordering, attribute ordering and partial ordering.

Figure 5.2 Multicast Communication

Figure 5.2 shows the multicast communication scenario, where the sender sends full-color video at 24-30 frames per second. The video signal goes without filtering to receiver A, i.e., as a full-color signal. Receivers B and C receive the signal after one level of filtering, for example a reduction in the number of frames. Receiver D receives the signal after two levels of filtering, i.e., fewer frames and possibly a reduction in colors or a monochrome image.

5.3.6 Presentation requirements

These comprise, for intra-object synchronization, the accuracy concerning delays in the presentation of LDUs and, for inter-object synchronization, the accuracy in the parallel
presentation of media objects. For intra-object synchronization, the jitter in consecutive LDUs may be avoided. Whereas processes can wait for each other, using the method of blocking, a data stream of time-independent LDUs can also be stopped. There is a different
situation when looking at sequences of audio and moving pictures. The blocking leads to
what is called a ‘gap’. This situation is called the ‘gap problem’. Therefore alternatives
must be available that are selected independently of the expected blocking time. If the gap
between a late video and audio exceeds a predefined threshold, the audio presentation is
switched to an alternative presentation. In the case that the gap is shorter, the audio
presentation may be stopped until the gap is closed.

In general, in the case of blocking, alternative single pictures, sequences of pictures or audio signals can be presented, or previously used presentation units can simply be repeated. This method of blocking processes, or streams of audio and video respectively, is known as “restricted blocking”.
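The gap handling just described can be written down as a small decision procedure. The sketch below is only an illustration of this policy; the threshold value and the function name are assumptions, not part of any particular system:

    # Restricted blocking: decide how to handle a gap between a late stream
    # and the one it is synchronized with. threshold_ms is an assumed policy value.
    def handle_gap(gap_ms, threshold_ms=150):
        if gap_ms <= 0:
            return "in sync: continue normal presentation"
        if gap_ms <= threshold_ms:
            return "short gap: stop the audio presentation until the gap is closed"
        # Gap too large: switch to an alternative presentation
        # (e.g., repeat previous units or present substitute pictures).
        return "long gap: switch to an alternative presentation"

    for gap in (0, 80, 400):
        print(gap, "ms ->", handle_gap(gap))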

Lip synchronization requirements: Lip synchronization refers to the temporal relationship between an audio and a video stream for the particular case of humans speaking. The time difference between related audio and video LDUs is known as the ‘skew’. Streams which are perfectly ‘in sync’ have no skew, i.e., 0 ms.

The following factors influence lip synchronization, and thus quality, when only the visuals (video) are considered:

• Content

• Continuous or discrete talking

• Background – shaky or non-shaky, and with or without distortion

• Resolution and quality – the difficulty of human perception in distinguishing any lip synchronization skew at a higher resolution, and the capability of multimedia software and hardware devices to refresh motion video data every 40 ms

• View mode (head view, shoulder view or body view)

The following factors influence synchronization when audio is taken into account:

• Content
• Background noise or music
• Language and articulation


Elementary media synchronization


Some studies show that the maximum allowable jitter for 16-bit quality audio in a sample period is 200 ps, which is the error equivalent to the magnitude of the LSB of a full-level maximum-frequency 20 kHz signal. The combination of audio and animation is usually not as stringent as lip synchronization.

Two audio tracks can be tightly or loosely coupled. For tightly coupled audio tracks a skew of a magnitude of 20 ms is allowable. For loosely coupled audio a skew of 500 ms is affordable.

The combination of audio with images has its initial application in slide shows. A skew of 1 ms is affordable.

The synchronized presentation of audio with some text is usually known as ‘audio annotation’. For this type of media synchronization, the affordable skew can be 240 ms.

The synchronization of video and text or video and image occurs in two distinct fashions. In the overlay mode, text is often an additional description to the displayed moving image sequence; for such situations a skew of 240 ms is affordable. In the second mode, no overlay occurs and skew is less important; for such types of applications a skew of 500 ms is affordable.
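These affordable skew values can be collected into a small lookup table and used to check whether two streams are still acceptably synchronized. The sketch below simply restates the figures listed above; the dictionary keys and the function name are illustrative:

    # Affordable skew (in ms) per media combination, as listed in the text above.
    AFFORDABLE_SKEW_MS = {
        "audio/audio tightly coupled": 20,
        "audio/audio loosely coupled": 500,
        "audio/image (slide show)": 1,
        "audio/text (audio annotation)": 240,
        "video/text overlay": 240,
        "video/text no overlay": 500,
    }

    def in_sync(media_pair, time_a_ms, time_b_ms):
        # Skew is the absolute time difference between the related LDUs.
        skew = abs(time_a_ms - time_b_ms)
        return skew <= AFFORDABLE_SKEW_MS[media_pair]

    print(in_sync("audio/text (audio annotation)", 10_000, 10_180))   # True, 180 ms skew
    print(in_sync("audio/audio tightly coupled", 10_000, 10_035))     # False, 35 ms skew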

5.4 ARCHITECTURE OF THE MULTIMEDIA PRESENTATION SYSTEM

Figure 5.3 A multimedia presentation system


The system shown in figure 5.3 uses a distributed set of computers to control six independent video properties for three wall screens and three monitors set into the navigation table. Simple multimedia systems are little more than extended computers that enable capture, storage and playback of several media types.

Modern systems for multimedia development, and some for playback, have extensive multimedia capabilities, with many components in the multimedia subsystem, often more powerful than those in the host system itself.

5.5 APPLICATIONS

5.5.1 Multimedia Message Systems

Multimedia message systems are extensions of contemporary electronic mail and conference systems which include multimedia data handling capabilities. Multimedia message
systems can create, transmit, receive, reply to, forward, save, retrieve, and delete multimedia
messages. As part of the message creation and editing processes, multimedia message
systems can import different media materials and integrate them. Since multimedia message
systems can incorporate sophisticated data handling capabilities, the distinction between
this type of system and multimedia database systems can sometimes appear hazy; however,
the primary purpose of these two kinds of systems is quite different. Multimedia database
systems are optimized for database functions, while multimedia message systems are
optimized for communication functions.

5.5.2 Virtual Reality Systems

It is a technology that provides one with sensations and the control of perspective so that one experiences the illusion of being in the presence of an object, within a situation or surrounded by a place. It relies on audio and video technologies augmented by a computer interface that reads the movement of the participant’s body. It is used in
1. Perambulation (walking through a building)
2. Synthetic experience (performing surgery, operating a plant control room)
3. Realization (foreign currency, inventory of items)

The preceding types of multimedia computer systems enrich the computing environment with a wider variety of visual and auditory data. Virtual reality systems transform the computing environment by immersing the user in a simulated world, which can also include movement and tactile control. When this is accomplished, the user enters a “virtual reality”. Virtual reality systems will permit users to interact with computer systems in a manner that more closely mimics how humans naturally operate in the real world.

5.5.2.1 Virtual Reality

The term Virtual Reality (VR) promises far more than our technology can currently deliver. It has been variously used to describe user interfaces ranging from synthesized physical environments presented on Head-Mounted Displays (HMDs), to ordinary graphics displayed on conventional CRTs, to text-based multi-user games.

Computer-based VR systems are three-dimensional, interactive as opposed to passive, and use one or more devices in an attempt to provide the user with a sense of presence, be it visual, auditory or tactile. Among these devices are head-tracked displays, stereo displays (both visual and audio), hand trackers and haptic displays (devices which provide force feedback).

The first VR systems appeared before computers were used for VR. Early flight
simulators also created virtual environments without the aid of computers. They used movies
or created live video by shooting model boards with TV cameras.

Currently, the hardware platform of virtual environments consists of color stereo HMDs,
haptic displays, spatial sound, data gloves and 3D graphics. The software architecture for
virtual environments has been developed to support a single hardware platform or a small
number of tightly coupled platforms. As a result, systems were originally modeled after
traditional interactive programs. The first virtual environment applications were simple event-
loop-based programs. There are several problems with this approach because the following
requirements need to be satisfied:
1. VR displays should respond to changes in tracked objects, especially the user’s
head, at least ten times per second for the virtual environment to be convincing.
2. VR systems should not have tightly coupled distributed processes because this approach does not scale towards new hardware and system software solutions. A solution is to structure the system as a large set of asynchronous, event-driven processes. Each process is independent of the others and communication is performed via a well-defined message protocol.
3. VR systems should scale up gracefully. Solutions for this requirement can be
achieved using adaptive algorithms, dynamic environments and adaptive protocols.
4. VR systems should have immersive, fully synthesized, photo-realistic graphical displays. The solution for this requirement is still far away because current technology does not yet provide such displays. Partial solutions might be a graphical display with (a) rendering of full scenes, or (b) rendering of images from the viewpoint of a given user.

There are many VR systems based on various types of implementation approaches. The simulation determines what is displayed and the user’s head position determines from
which angle it is displayed; the MR Toolkit, for example, distributes the VR system over multiple processes. Another implementation approach is taken by toolkits such as dVS, VR-DECK or DIVE.
5.5.3 Mobile Messaging

Mobile messaging is defined as the ability to send and receive e-mail messages from a mobile, hand-held or hand-carried device.

Conventional internet vs mobile computing

Convergence between current network technologies, the Internet and mobile telephony, is thus taking place, but the Internet's IP routing was designed to work with conventional static nodes, not mobile nodes. Efforts are therefore being made in wireless and Internet forums to enhance IP routing to support mobility, and many proposals have been made in this direction.

Mobile IP is a key proposal from the Internet Engineering Task Force (IETF) that specifies protocol enhancements to enable transparent routing of IP data packets to mobile nodes in the Internet. Mobile IP concepts are consolidated and summarized in the base RFC, as well as in numerous related RFCs.

E-mail and cell phones each provide obvious and significant advantages over traditional
land-line telephone communication, and this accounts for their extraordinary success. Among
other things, e-mail provides the ability to conduct interpersonal communications on an
asynchronous basis, and also provides the qualitative advantages of the written, over the
spoken, word. And cell phones provide the self-evident advantages of mobile, real-time
voice communications. Individually, the major advantage of each of these two technologies
is not shared by the other. The correct combination of the two technologies, however, can
provide the major advantages of both. We refer to this combined technology as mobile
messaging. Formally, we define mobile messaging as the ability to send and receive e-mail
messages from a mobile, hand-held or hand-carried device. This capability is also sometimes
referred to as wireless messaging, or mobile e-mail.
5.5.4 Mobile Telephone function access

During the 1990s, mobile telephones and the networks they were connected to began to offer a far broader range of functions than simply making and receiving calls. Users could divert calls, set up message boxes, change ringing options, and so on. To support this range of functionality, most mobile phones used, and still do, a hierarchical menu-based approach. That is, a user can view a series of options on the small screen of the phone and select one of them. They are then presented with a series of sub-options (an example display from a mobile phone is shown in figure 1). This navigation continues until the user finds the desired function or information (or gives up).


Figure 1: Mobile phone display showing sub-menu and navigation. The third sub-option of menu two is displayed. The next level in the tree can be reached by pressing Select and the previous by pressing Back.

1. Japanese Jumbo Jets


2. S. Africa Faces Cris
3. Buried Quake Couple
4. 33 Die in China Boa
5. World Women May Hur
6. Parents Demand Guat

Read Menu

In the mid-1990s, we were contacted by one of Europe's major mobile telecommunications providers (now known as Orange plc). They had noticed that their customer service centre was receiving many calls from customers who were finding it very hard to operate the range of functions and services. In response we carried out an analysis, using Orange's network, of the usability of one of the most popular mobile phones at the time, which could only display a few menu options at a time [18]. This work showed that the number of key presses needed to view all the menu options was 110, while the average number of key presses to access a function was 8.2. These figures seemed excessive, and they represented the theoretical optimal performance a user could achieve, making no mistakes and knowing exactly what they were doing. Of course, in reality, users make mistakes, need to press the 'go back' key, or even abort the navigation completely. We were not surprised, then, when we carried out a user-based experiment and found that with real users the average number of key strokes was 16.52, almost double the 'perfect' theoretical prediction, emphasising the significance of user errors and of doing actual experiments.

The mobility problem


These days more and more people enjoy the advantages of the portability and flexibility of carrying their workstations in the form of laptops, notebooks and PDA handsets. To meet the needs of this new set of users, existing computing environments based on fixed networks are being extended into the mobile domain.

For data access services and multimedia communication, it is seen as desirable to adapt the traditional applications and services people are accustomed to using in the fixed network, and to extend them so that they are available to the mobile user in a seamless manner.

The most dominant services in mobility are the Internet/intranet services, which run on top of the IP protocol. Internet host mobility poses a problem at the network layer (IP) when a mobile node moves from one sub-net to another. Routing tables have to be updated to route packets to the destination sub-net instead of the original sub-net. This procedure is highly inefficient and time consuming, in particular if a mobile host needs to retain its network address (IP address) while changing sub-nets. But if a mobile host changes its network address, all established transport layer connections (TCP) are broken.
EXAMPLE

Imagine a commuter downloading music while travelling by train. This user is using a laptop attached to a mobile handset. The mobile handset could be connected to the Internet using data services provided by GSM or CDMA networks. When the user registers for data services, i.e., initiates a data call, he/she will be assigned a unique IP address. Once connected, the user starts an FTP session to download music from the Internet. This FTP session is based on a transport layer connection that depends on the connection invariant. But as the train moves, the mobile station moves to another cell; the point of attachment for data services, and therefore the sub-net, may change. If the mobile station is now assigned a new IP address, all the transport layer connections will break down. The FTP session will therefore be aborted.

This is the problem that Mobile IP seeks to solve.

5.5.5 Video Conferencing

Teleconferencing systems allow the user to achieve most of the efficiency and
productivity of traditional meetings with one main difference: the user can stay at his/her
desk as can the remote conference participants. A multimedia conferencing system enables
people to work together across geographically distant locations without the need to meet
at one site. They communicate among each other in multi-party or face-to-face mode
using motion video, audio and textual information in each direction. The audio and video
quality heavily depends on the platform. Therefore a big factor in the success of a tele-
conferencing system is to achieve high media quality over any platform and interconnectivity
among various platform vendors. A possible set up is shown in the figure below.

Figure: a video conferencing setup with two workstations and monitors connected via a network


Video conferencing is used either in an office environment, where the video is displayed
on a PC or workstation screen, or in a conference room, where the video is displayed on
a video wall. For the office environment, desktop video conferencing systems have been
developed. For a conference room environment, large TV screens in conference rooms
are used for meetings of groups located at different geographical places.

Desktop video conferencing systems often include a dedicated shared whiteboard application (i.e., drawing and writing software support for multiple users). Application sharing denotes techniques which replicate the user interface of a particular piece of software so that the software can be used simultaneously by all participants of a conference. The concurrency in these activities relies on mechanisms of “floor passing” to determine which of the users may actually interact with the software at a given time, as sketched below.
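A minimal floor-passing sketch is given below. The class and its policy (an immediate hand-over on request) are purely illustrative; real conferencing systems use richer floor-control protocols:

    # Minimal floor-passing sketch: only the current floor holder may interact.
    class Floor:
        def __init__(self, participants):
            self.participants = list(participants)
            self.holder = self.participants[0]      # initial floor holder

        def request(self, user):
            # Naive policy: the floor is handed over immediately on request.
            if user in self.participants:
                self.holder = user
            return self.holder

        def may_interact(self, user):
            return user == self.holder

    floor = Floor(["alice", "bob", "carol"])
    print(floor.may_interact("bob"))    # False: alice still holds the floor
    floor.request("bob")
    print(floor.may_interact("bob"))    # True: bob now holds the floor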

Some examples of conferencing tools are BERKOM’s MMC (Multimedia Conferencing), ‘vat’ for audio conferencing and ‘nv’ for video conferencing.

5.5.6 Interactive Video

Interactive video research covers various problems in the areas of interactive TV and Video-On-Demand. Interactive TV research concentrates on cable television, whereas Video-On-Demand research is computer-oriented. Since both areas are merging, in the future we will see the results of both as an interactive video service.

5.5.6.1 Interactive TV

Interactive TV means that the TV viewer can become more active than is the case today. There are several types of interactivity. For instance, a viewer might select one out of several camera angles in a televised sport event, or ask for supplementary information about the teams or players. Another example could be an educational programme where one out of several educational tracks can be selected and/or extra tutorials can be requested.

Interactive TV is an application that may require different types of technological solutions because the interactive programs would be too specialized to be on ordinary channels. This means that one has to subscribe either to sport TV channels or to a telecommunication service. Both cases require the TV signal and equipment for communicating with the TV studio.

5.5.6.2 Video-On-Demand

Video-On-Demand (VOD) services represent a class of applications in which information is accessed from one or more video servers.
More generally, VOD systems include many more components that are needed for the provision of a complete service, such as video server(s), administration and maintenance systems, networking services, backbone networks for linking geographically distributed video servers,
and set-top units for receiving, demodulating, decoding and converting video for television playback.
VOD services need retrieval tele-services. Furthermore, the video services are an asymmetrically switched service in which the customer chooses among a wide selection of video material and receives, on demand, a real-time response. The service is asymmetric in the sense that the downstream (to the customer) channel has a much higher bandwidth than the upstream channel.
The best-known application of VOD is the video library, which uses interactive VOD. Interactive VOD allows a user to gain access to a movie (i.e., a digitized video sequence stored on a storage medium such as a hard disk) via a point-to-point connection. This connection allows the user individual and instantaneous control of the storage medium in terms of start, fast-forward, pause and rewind actions.
There are two types of interactive VOD service:
• Interactive VOD with instantaneous access, whereby the user can instantly retrieve and individually control program information from a library, with instant control response. The service is provided as follows: the customer selects a movie out of a large set of movies; the transmission starts within a few seconds; the user can stop and continue the transmission instantaneously; and the user gets an uninterrupted response. To provide the service, a video load buffer must be created at the start of the programme so that responses to the different functions can be performed immediately.

Figure: VOD system architecture with video servers and an archive connected over a backbone network to a switching office, a video dial tone gateway and a head end, which serve set-top boxes over the community network


• Interactive VOD with delayed access, whereby users retrieve and individually control program information from a library, but there is a waiting time depending on the available bandwidth resources of the network and/or the popularity index of the requested program. In this case, the user needs to wait a few minutes before the movie starts. Here, the video load buffer is only created when the pause function is performed, not at the start of the program. Therefore, this service consumes fewer video buffer resources.

QUESTIONS
1. Discuss the processor configuration required for multimedia live presentation system.
2. What are all the functions of a multimedia operating system?
3. Compare the first three generations of multimedia database systems.
4. Discuss the QoS architecture for a distributed multimedia system.
5. Define jitter and orchestration
6. Write short notes on the following:
a. Virtual reality b. Mobile messaging c. Video-on-demand
