Lec 3

This document covers the representation of real numbers in computing, focusing on sign-magnitude representation, radix complement, and fixed-point and floating-point numbering systems. It explains the complexities of arithmetic operations with sign-magnitude, the advantages of radix complement, and the conversion processes between decimal and floating-point formats. Additionally, it details the IEEE 754 standard for floating-point representation, including the structure of single and double precision formats.

Uploaded by

Deseres 4

Introduction to IT

Lecture 3:
Representation of real numbers; FP2 codes; IEEE 754 standard.
Sign-Magnitude Representation
• This code represents signed integers, both positive and negative. In the
sign-magnitude (SM) representation, all bits of the number except the most
significant one have the same meaning as in the plain integer binary code.
The most significant bit is the sign bit: 0 indicates a positive number,
1 indicates a negative number.

Bit layout (n bits):

b(n-1)   | b(n-2) b(n-3) ... b2 b1 b0
Sign bit | Number magnitude

1 | 011
0 | 011

• Examples:
• 1101(SM) = -5(10)
• 0101(SM) = +5(10)
Sign-Magnitude Representation
• 00110110(SM) = (-1)^0 × (0×2^6 + 1×2^5 + 1×2^4 + 0×2^3 + 1×2^2 + 1×2^1 + 0×2^0) = +54(10)
• 10110111(SM) = (-1)^1 × (2^5 + 2^4 + 2^2 + 2^1 + 2^0) = -(32 + 16 + 4 + 2 + 1) = -55(10)

Although simple to read, the SM representation complicates arithmetic
operations. In particular, the sign bit has to be handled separately from the
magnitude bits.
Example: addition of +18 (010010) and -19 (110011) using the SM representation.
The signs differ, so the result takes the sign of the operand with the larger
magnitude, here -19.
The 5-bit magnitudes are then subtracted, larger minus smaller: 10011 - 10010 = 00001.
With the sign bit restored, the result is 100001(SM) = -1(10).
The range of SM numbers:
4 bits: from -7 = 1111(SM) to +7 = 0111(SM)
8 bits: from -127 = 1 111 1111(SM) to +127 = 0 111 1111(SM)
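The separate handling of sign and magnitude described above can be sketched in Python. This is a minimal illustration, not part of the original lecture; the function names `sm_encode`, `sm_decode` and `sm_add` are hypothetical, and bit strings are used instead of machine words to keep the steps visible.

```python
def sm_encode(value: int, bits: int) -> str:
    """Return the sign-magnitude bit string of `value` using `bits` bits."""
    limit = 2 ** (bits - 1) - 1              # largest representable magnitude
    if abs(value) > limit:
        raise ValueError(f"{value} does not fit in {bits}-bit sign-magnitude")
    sign = "1" if value < 0 else "0"
    return sign + format(abs(value), f"0{bits - 1}b")


def sm_decode(code: str) -> int:
    """Return the integer represented by a sign-magnitude bit string."""
    magnitude = int(code[1:], 2)
    return -magnitude if code[0] == "1" else magnitude


def sm_add(a: str, b: str) -> str:
    """Add two sign-magnitude codes of equal width."""
    bits = len(a)
    sign_a, mag_a = a[0], int(a[1:], 2)
    sign_b, mag_b = b[0], int(b[1:], 2)
    if sign_a == sign_b:                     # same sign: add the magnitudes
        sign, mag = sign_a, mag_a + mag_b
    elif mag_a >= mag_b:                     # different signs: subtract the
        sign, mag = sign_a, mag_a - mag_b    # smaller magnitude from the larger
    else:                                    # and keep the sign of the larger
        sign, mag = sign_b, mag_b - mag_a
    if mag > 2 ** (bits - 1) - 1:
        raise OverflowError("magnitude does not fit in the available bits")
    return sign + format(mag, f"0{bits - 1}b")


print(sm_add("010010", "110011"))            # +18 + (-19)
print(sm_decode("10110111"))
```

Note that the comparison of magnitudes and the choice between addition and subtraction are exactly the extra work that the complement codes avoid.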
Radix Complement
• A positive number is represented the same way as in SM.
• A negative number is represented using the b's complement (for base-b numbers);
for binary numbers this is the 2's complement.
• The most significant bit has a weight of -2^(n-1), where n is the number of bits
in the number notation.
2's complement of (-19) on 6 bits:
1. Write +19 in binary: 010011
2. Complement (negate) each digit: 101100
3. Add 1 to the least significant bit: 101101
2's complement of (+18):
The number is positive, so the code is the same as in SM: 010010
Addition of these numbers:
  101101
+ 010010
= 111111
This is the 2's complement representation of (-1):
1×(-2^5) + 1×2^4 + 1×2^3 + 1×2^2 + 1×2^1 + 1×2^0 = -32 + 16 + 8 + 4 + 2 + 1 = -1(10)
Radix Complement
• Advantages:
• No special treatment is needed for the sign of the numbers.
• A carry out of the most significant bit during an arithmetic operation is
ignored without affecting the correctness of the result (provided the true
result fits in n bits).
Adding (-19) = 101101 and (+26) = 011010:
  101101
+ 011010
= 000111, with a carry bit (1) that is ignored
Result: 000111 = +7(10)
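A short Python sketch of the 2's complement rules above, added here as an illustration. The helper names `twos_encode`, `twos_decode` and `twos_add` are hypothetical. The sketch assumes both operands have the same width and does not detect overflow, that is, results outside the n-bit range.

```python
def twos_encode(value: int, bits: int) -> str:
    """Return the n-bit 2's complement code of `value`."""
    if not -(2 ** (bits - 1)) <= value <= 2 ** (bits - 1) - 1:
        raise ValueError(f"{value} does not fit in {bits} bits")
    # Reducing modulo 2^n is equivalent to "complement each digit and add 1"
    # for negative values, and leaves non-negative values unchanged.
    return format(value % (2 ** bits), f"0{bits}b")


def twos_decode(code: str) -> int:
    """Interpret a bit string as 2's complement (MSB has weight -2^(n-1))."""
    raw = int(code, 2)
    return raw - 2 ** len(code) if code[0] == "1" else raw


def twos_add(a: str, b: str) -> str:
    """Add two codes of equal width; the carry out of the MSB is dropped."""
    bits = len(a)
    total = (int(a, 2) + int(b, 2)) % (2 ** bits)
    return format(total, f"0{bits}b")


print(twos_encode(-19, 6))
print(twos_add("101101", "010010"), twos_decode(twos_add("101101", "010010")))
print(twos_add("101101", "011010"), twos_decode(twos_add("101101", "011010")))
```

The same `twos_add` routine serves every combination of signs, which is the advantage named on the slide.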

Diminished Radix Complement
• No "1" is added to the least significant bit after complementing (in binary
this is the 1's complement).
• (-19) = 101100
• (+18) = 010010
• The most significant bit has a weight of (-2^(n-1) + 1), where n is the number
of bits in the number notation.
Addition result: 111110 = 1×(-2^5 + 1) + 1×2^4 + 1×2^3 + 1×2^2 + 1×2^1 + 0×2^0
= -31 + 16 + 8 + 4 + 2 = -1(10)

Disadvantage:
A correction is needed whenever a carry comes out of the most significant bit
during an arithmetic operation: the carry must be added back to the least
significant bit.
Adding -3 (111100) to +18 (010010):
The raw result is (1) 001110. Adding the carry to the least significant bit
gives 001111, which is +15, the correct result.
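The carry correction described above (often called an end-around carry) can be sketched as follows. This is an illustrative Python example with hypothetical function names; it assumes equal-width operands and does not check for overflow.

```python
def ones_encode(value: int, bits: int) -> str:
    """Return the n-bit 1's complement code of `value`."""
    if abs(value) > 2 ** (bits - 1) - 1:
        raise ValueError(f"{value} does not fit in {bits} bits")
    mask = 2 ** bits - 1
    # A negative number is the bitwise complement of its magnitude.
    raw = value if value >= 0 else (~(-value)) & mask
    return format(raw, f"0{bits}b")


def ones_decode(code: str) -> int:
    """Interpret a bit string as 1's complement (MSB weight -2^(n-1) + 1)."""
    raw = int(code, 2)
    return raw - (2 ** len(code) - 1) if code[0] == "1" else raw


def ones_add(a: str, b: str) -> str:
    """Add two codes of equal width, adding any carry back to the LSB."""
    bits = len(a)
    mask = 2 ** bits - 1
    total = int(a, 2) + int(b, 2)
    if total > mask:                      # a carry left the most significant bit
        total = (total & mask) + 1        # end-around carry correction
    return format(total, f"0{bits}b")


print(ones_add("101100", "010010"))       # (-19) + (+18)
print(ones_add("111100", "010010"))       # (-3) + (+18)
```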
Fixed-point numbering system
• The radix point has a fixed position, separating the integer part from the
fractional part of a number.

Weight           10^3  10^2  10^1  10^0  ,  10^-1  10^-2  10^-3  10^-4  10^-5
Digit            2     5     6     8     ,  8      3      9      5      4
Position number  3     2     1     0        -1     -2     -3     -4     -5
                 Integer part               Fractional part

• Conversion of a fractional number from binary to decimal:

- Multiply each digit of the number by its power of two. In the fractional
part, the multipliers are the negative powers of 2.
- Computing the value of a fixed-point number this way requires working with
fractions. There is, however, a simplification: treat the fractional part as
an integer and multiply it by the weight of the last digit of the fixed-point
notation.
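Fixed-point numbering system
• 101011,01101(2)
• Integer part: 101011 = 1×2^5 + 0×2^4 + 1×2^3 + 0×2^2 + 1×2^1 + 1×2^0 = 32 + 8 + 2 + 1 = 43
• Fractional part read as an integer: 01101 = 1×2^3 + 1×2^2 + 1×2^0 = 8 + 4 + 1 = 13
• Scaled by the weight of the last digit: 13 × 2^-5 = 13 × 1/32 = 13/32
• 101011,01101(2) = 43 + 13/32 = 43,40625(10)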

Conversion of a fractional number from decimal to binary

To convert a fractional number from decimal to binary, multiply the fractional
part by 2 repeatedly. The integer digit (0 or 1) produced by each multiplication
is the next binary digit; the process continues with the remaining fractional
part until it becomes zero.
The integer part is converted separately, and the result is made up of the
integer and fractional parts.
Note, however, that the fractional part conversion may not terminate after a
finite number of multiplications, because the binary fraction may be infinite.
The process then has to be stopped after a number of steps, which yields an
acceptable approximation.
Fixed-point numbering system
• 4,1875(10)
Integer part: 4(10) = 100(2)
Fractional part:
Number   Multiplication   Result   Digit
0,1875   0,1875 × 2       0,375    0
0,375    0,375 × 2        0,75     0
0,75     0,75 × 2         1,5      1
0,5      0,5 × 2          1,0      1
0,0      (stop)

Result: 4,1875(10) = 100,0011(2)

• 7,575(10)
Integer part: 7(10) = 111(2)
Result: 7,575(10) = 111,10010011...(2)
Number   Multiplication   Result   Digit
0,575    0,575 × 2        1,150    1
0,150    0,150 × 2        0,300    0
0,300    0,300 × 2        0,600    0
0,600    0,600 × 2        1,200    1
0,200    0,200 × 2        0,400    0
0,400    0,400 × 2        0,800    0
0,800    0,800 × 2        1,600    1
0,600    0,600 × 2        1,200    1
…………………………….
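The repeated-multiplication procedure from the two tables above can be sketched in Python. This is an illustration under stated assumptions: the function name `decimal_to_binary` and the `max_digits` cut-off are hypothetical, the input is a non-negative decimal string, and `fractions.Fraction` is used so that no rounding error creeps into the multiplications.

```python
from fractions import Fraction


def decimal_to_binary(text: str, max_digits: int = 8) -> str:
    """Convert a non-negative decimal string to binary fixed-point notation.

    The fractional part is cut off after `max_digits` binary digits, because
    the conversion does not always terminate.
    """
    number = Fraction(text.replace(",", "."))
    if number < 0:
        raise ValueError("only non-negative numbers are handled in this sketch")
    integer_part = int(number)
    fraction = number - integer_part
    digits = []
    while fraction != 0 and len(digits) < max_digits:
        fraction *= 2                 # multiply the fractional part by 2
        bit = int(fraction)           # the integer digit is the next bit
        digits.append(str(bit))
        fraction -= bit               # continue with the remaining fraction
    result = format(integer_part, "b")
    return result + ("," + "".join(digits) if digits else "")


print(decimal_to_binary("4,1875"))
print(decimal_to_binary("7,575", max_digits=8))
```

With `max_digits=8` the second call stops where the table above stops; a larger limit simply produces more digits of the infinite expansion.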
Floating Point Numbers
• Floating-point representation essentially represents real numbers in scientific
notation. Scientific notation represents a number as a base number and an
exponent. For example, 123.456 can be written as 1.23456 × 10^2.
• Floating-point solves a number of representation problems. Fixed-point has a fixed
window of representation, which prevents it from representing very large or very
small numbers. Fixed-point is also prone to a loss of precision when two large
numbers are divided.
• Floating-point, on the other hand, employs a sort of "sliding window" of precision
appropriate to the scale of the number. This allows it to represent numbers from
1,000,000,000,000 to 0.0000000000000001 with ease.
• IEEE floating-point numbers have three basic components:
- the sign,
- the exponent, and
- the mantissa. The mantissa is composed of the fraction and an implicit leading
digit. The exponent base (2) is implicit and need not be stored.
Floating Point Numbers
• Layout (field width in bits, with bit positions in brackets):
                   Sign     Exponent     Fraction     Bias
Single Precision   1 [31]   8 [30-23]    23 [22-00]   127
Double Precision   1 [63]   11 [62-52]   52 [51-00]   1023

• The Sign Bit

The sign bit: 0 denotes a positive number; 1 denotes a negative number. Flipping the
value of this bit flips the sign of the number.
• The Exponent
The exponent field needs to represent both positive and negative exponents. To do
this, a bias is added to the actual exponent to get the stored exponent. For
IEEE single-precision floats, the bias is 127. Thus, an exponent of zero means that
127 is stored in the exponent field, and a stored value of 200 indicates an exponent
of (200 - 127), or 73. Exponents of -127 (all 0s) and +128 (all 1s) are reserved for
special numbers.
For double precision, the exponent field is 11 bits wide and has a bias of 1023.
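To see the three fields of a single-precision number on a real machine, the bit pattern can be unpacked with Python's standard `struct` module. This is a small illustrative sketch; the function name `float32_fields` is hypothetical, and the returned true exponent is only meaningful when the stored exponent is not one of the two reserved values.

```python
import struct

BIAS_SINGLE = 127


def float32_fields(x: float):
    """Return (sign, stored_exponent, true_exponent, fraction) of x as float32."""
    # Pack as a big-endian 32-bit float, then reread the same bytes as an integer.
    (raw,) = struct.unpack(">I", struct.pack(">f", x))
    sign = raw >> 31                         # bit 31
    stored_exponent = (raw >> 23) & 0xFF     # bits 30-23
    fraction = raw & 0x7FFFFF                # bits 22-0
    return sign, stored_exponent, stored_exponent - BIAS_SINGLE, fraction


print(float32_fields(1.0))      # exponent of zero: 127 is stored
print(float32_fields(-0.75))    # -1.1(2) × 2^-1
```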
Floating Point Numbers
• The Mantissa
• The mantissa, also known as the significand, represents the precision bits of
the number. It is composed of an implicit leading bit and the fraction bits.
• To find the value of the implicit leading bit, consider that any number can
be expressed in scientific notation in many different ways. For example, the
number five can be represented as any of these:
5.00 × 10^0
0.05 × 10^2
5000 × 10^-3
• To maximize the quantity of representable numbers, floating-point
numbers are typically stored in normalized form, which puts the radix
point after the first non-zero digit. In normalized form, five is represented as
5.0 × 10^0.
• A convenient optimization is available in base two, since the only possible
non-zero digit is 1. We can therefore assume a leading digit of 1 and need not
store it explicitly. As a result, the single-precision mantissa has effectively
24 bits of resolution while storing only 23 fraction bits.
The Conversion Procedure (Dec to FP)
The rules for converting a decimal number into floating point are as follows:
• Convert the absolute value of the number to binary, possibly with a fractional part
after the binary point. This can be done by converting the integral and fractional
parts separately. The integral part is converted with the techniques examined
previously. The fractional part is converted by multiplication, which is essentially
the inverse of the division method: repeatedly multiply by 2 and collect each
bit as it appears to the left of the binary point.
• Append × 2^0 to the end of the binary number (which does not change its value).
• Normalize the number: move the binary point so that exactly one bit (a 1) is to
its left, and adjust the exponent of two so that the value does not change.
• Place the mantissa into the mantissa field of the number. Omit the leading one,
and fill with zeros on the right.
• Add the bias to the exponent of two, and place the sum in the exponent field. For
IEEE 32-bit, the bias is 127.
• Set the sign bit, 1 for negative, 0 for positive, according to the sign of the original
number.
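The rules above can be followed step by step in Python. The sketch below is an illustration, not a full IEEE 754 encoder: the function name `to_float32_bits` is hypothetical, the number is normalized by repeated halving or doubling (equivalent to moving the binary point), extra fraction bits are truncated rather than rounded, and special values and denormalized numbers are not handled.

```python
from fractions import Fraction


def to_float32_bits(text: str) -> str:
    """Return the 32-bit pattern (as a bit string) for a decimal string."""
    value = Fraction(text)
    sign = "1" if value < 0 else "0"          # sign bit from the original number
    value = abs(value)
    if value == 0:
        return sign + "0" * 31
    exponent = 0
    while value >= 2:                         # normalize to 1.xxx × 2^exponent
        value /= 2
        exponent += 1
    while value < 1:
        value *= 2
        exponent -= 1
    value -= 1                                # omit the implicit leading one
    fraction_bits = []
    for _ in range(23):                       # generate 23 bits by doubling;
        value *= 2                            # trailing positions become zeros
        bit = int(value)
        fraction_bits.append(str(bit))
        value -= bit
    stored = exponent + 127                   # add the bias
    if not 1 <= stored <= 254:
        raise ValueError("exponent outside the normalized single-precision range")
    return sign + format(stored, "08b") + "".join(fraction_bits)


bits = to_float32_bits("-1313.3125")
print(bits, format(int(bits, 2), "08x"))
```

For values that need more than 23 fraction bits, a real encoder rounds to nearest, so the last bit may differ from this truncating sketch.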
Example 1

• Convert -1313.3125 to IEEE 32-bit floating point format.

a. The integral part is 1313(10) = 10100100001(2). The fractional part:

0.3125 × 2 = 0.625   0   Generate 0 and continue.
0.625 × 2 = 1.25     1   Generate 1 and continue with the rest.
0.25 × 2 = 0.5       0   Generate 0 and continue.
0.5 × 2 = 1.0        1   Generate 1 and nothing remains.
b. So 1313.3125(10) = 10100100001.0101(2).
c. Normalize: 10100100001.0101(2) = 1.01001000010101(2) × 2^10.
d. Mantissa is 01001000010101000000000, exponent is 10 + 127 = 137 =
10001001(2), sign bit is 1.

So -1313.3125 is 11000100101001000010101000000000 = c4a42a00(16)

Example 2
• Convert 0.1015625 to IEEE 32-bit floating point format.
a. Converting:

0.1015625 × 2 = 0.203125   0   Generate 0 and continue.
0.203125 × 2 = 0.40625     0   Generate 0 and continue.
0.40625 × 2 = 0.8125       0   Generate 0 and continue.
0.8125 × 2 = 1.625         1   Generate 1 and continue with the rest.
0.625 × 2 = 1.25           1   Generate 1 and continue with the rest.
0.25 × 2 = 0.5             0   Generate 0 and continue.
0.5 × 2 = 1.0              1   Generate 1 and nothing remains.
b. So 0.1015625(10) = 0.0001101(2).
c. Normalize: 0.0001101(2) = 1.101(2) × 2^-4.
d. Mantissa is 10100000000000000000000, exponent is -4 + 127 = 123 =
01111011(2), sign bit is 0.

So 0.1015625 is 00111101110100000000000000000000 = 3dd00000(16)

The Conversion Procedure (FP to Dec)
The rules for converting a floating point number into decimal simply reverse the
decimal to floating point conversion:
• If the original number is in hex, convert it to binary.
• Separate it into the sign, exponent, and mantissa fields.
• Extract the mantissa from the mantissa field, and restore the leading one. You may
also omit the trailing zeros.
• Extract the exponent from the exponent field, and subtract the bias to recover the
actual exponent of two. As before, the bias is 127 for the 32-bit format.
• De-normalize the number: move the binary point so that the exponent becomes 0 and
the value of the number remains unchanged.
• Convert the binary value to decimal. This is done just as with binary integers, but the
place values to the right of the binary point are fractions.
• Set the sign of the decimal number according to the sign bit of the original floating
point number: make it negative for 1; leave it positive for 0.
• If the binary exponent is very large or very small, you can convert the mantissa
directly to decimal without de-normalizing. Then use a calculator to raise two to the
exponent, and perform the multiplication. This gives an approximate answer, which is
sufficient in most cases.
Example 3

Convert the 32-bit floating point number 44361000 (in hex) to decimal.
1. Convert and separate: 44361000(16) = 0 10001000 01101100001000000000000(2)
2. Exponent: 10001000(2) = 136(10); 136 - 127 = 9.
3. Denormalize: 1.01101100001(2) × 2^9 = 1011011000.01(2).
4. Convert:
Exponents     2^9  2^8  2^7  2^6  2^5  2^4  2^3  2^2  2^1  2^0  .  2^-1  2^-2
Place Values  512  256  128  64   32   16   8    4    2    1    .  0.5   0.25
Bits          1    0    1    1    0    1    1    0    0    0    .  0     1
Value: 512 + 128 + 64 + 16 + 8 + 0.25 = 728.25
5. Sign: positive
Result: 44361000 is 728.25.
Example 4
Convert the 32-bit floating point number be580000 (in hex) to decimal.
1. Convert and separate: be580000(16) = 1 01111100 10110000000000000000000(2)
2. Exponent: 01111100(2) = 124(10); 124 - 127 = -3.
3. Denormalize: 1.1011(2) × 2^-3 = 0.0011011(2).
4. Convert:
Exponents     2^0  .  2^-1  2^-2  2^-3   2^-4    2^-5     2^-6      2^-7
Place Values  1    .  0.5   0.25  0.125  0.0625  0.03125  0.015625  0.0078125
Bits          0    .  0     0     1      1       0        1         1
Value: 0.125 + 0.0625 + 0.015625 + 0.0078125 = 0.2109375
5. Sign: negative
Result: be580000 is -0.2109375.
Example 5
Convert the 32-bit floating point number 76650000 (in hex) to decimal.
1. Convert and separate: 76650000(16) = 0 11101100 11001010000000000000000(2)
2. Exponent: 11101100(2) = 236(10); 236 - 127 = 109.
3. Since the exponent is far from zero, convert the original (normalized) mantissa:
Exponents     2^0  .  2^-1  2^-2  2^-3   2^-4    2^-5     2^-6      2^-7
Place Values  1    .  0.5   0.25  0.125  0.0625  0.03125  0.015625  0.0078125
Bits          1    .  1     1     0      0       1        0         1
Value: 1 + 0.5 + 0.25 + 0.03125 + 0.0078125 = 1.7890625
4. Use a calculator to find 1.7890625 × 2^109. You should get something like
1.16116794981 × 10^33.
5. Sign: positive
Result: 76650000 is about 1.16116794981 × 10^33.
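The reverse procedure used in Examples 3 to 5 can be checked with a short Python sketch. The function name `float32_hex_to_fraction` is hypothetical. The sketch handles normalized numbers only and rejects the reserved exponents; `fractions.Fraction` keeps the arithmetic exact.

```python
from fractions import Fraction


def float32_hex_to_fraction(hex_text: str) -> Fraction:
    """Decode a normalized IEEE 754 single-precision pattern given in hex."""
    raw = int(hex_text, 16)
    sign = raw >> 31                          # separate the three fields
    stored_exponent = (raw >> 23) & 0xFF
    fraction = raw & 0x7FFFFF
    if stored_exponent in (0, 255):
        raise ValueError("reserved exponent: special values are not handled here")
    mantissa = 1 + Fraction(fraction, 2 ** 23)    # restore the leading one
    exponent = stored_exponent - 127              # subtract the bias
    value = mantissa * Fraction(2) ** exponent
    return -value if sign else value              # apply the sign bit


print(float(float32_hex_to_fraction("44361000")))
print(float(float32_hex_to_fraction("be580000")))
print(float(float32_hex_to_fraction("76650000")))
```

Working with the mantissa and the power of two separately, as in Example 5, is what the function does for every input; no de-normalizing step is needed in code.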
Putting it All Together

• So, to sum up:

• The sign bit is 0 for positive, 1 for negative.
• The exponent's base is two.
• The exponent field contains 127 plus the true exponent for
single precision, or 1023 plus the true exponent for double
precision.
• The mantissa is typically assumed to have the form 1.f, where
f is the field of fraction bits and the leading 1 is implicit.
