1 Intro and Binary Operations

This document discusses numerical methods which are approximate computer methods used to solve mathematical problems without analytical solutions. It describes different types of numerical methods and their tradeoffs, such as accuracy vs computational cost. Examples of real-life disasters caused by poor numerical computing are also provided.

Uploaded by

pocketdroid01
Copyright
© All Rights Reserved

NUMERICAL METHODS

A numerical method is an approximate computer method for solving a
mathematical problem which often has no analytical solution. In an analytical
solution, the problem is framed in a well-understood form and the exact
solution is calculated. A numerical method makes guesses at the solution and then
tests whether the problem is solved well enough to stop.

The problems themselves can come from many fields of application, e.g.,
biology, chemistry, physics, engineering, medicine, education,
entertainment, the internet, forensics, financial markets, etc.

The stakes are often high, so the results must be accurate, reliable, and
robust.

Disasters Caused By Poor Numerical Computing

Here are some examples of real-life disasters that occurred because of poor
computing practices (and not because of programming errors per se).

• CSI: Round-off error (aka salami attacks)

Hackers diverted round-off errors from normal bank calculations to their
own accounts, netting them substantial sums of money over time.

• Patriot missile failure, February 25, 1991

A Scud missile killed 28 soldiers in Dhahran, Saudi Arabia, when the
Patriot missile meant to intercept it miscalculated the Scud’s position.
The error came from an integer to floating-point conversion that lost
9.5 × 10⁻⁸ seconds in every tenth of a second of clock time. Multiplied by
the up-time of the Patriot (100 hours) and the velocity of the Scud
(1676 metres per second), this gives a position error of about 573 metres.
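
The 573-metre figure can be checked with a few lines of arithmetic. This is only a sketch of the calculation quoted above, not a model of the Patriot software; the constant and function names are made up for illustration, and the values are the ones given in the text.

```python
# Values taken from the text above; names are illustrative only.
ERROR_PER_TICK = 9.5e-8   # seconds lost per 0.1 s clock tick
TICKS_PER_SECOND = 10     # the clock counts tenths of a second
SCUD_SPEED = 1676         # metres per second


def tracking_error(uptime_hours):
    """Return (accumulated clock drift in seconds, position error in metres)."""
    ticks = uptime_hours * 3600 * TICKS_PER_SECOND
    drift = ERROR_PER_TICK * ticks
    return drift, drift * SCUD_SPEED


drift, miss = tracking_error(100)
print(f"clock drift: {drift:.3f} s, position error: {miss:.0f} m")
```

A drift of roughly a third of a second looks harmless until it is multiplied by the speed of the target.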

Trade-offs in Choosing a Numerical Method

1. Accuracy vs. computational cost: Higher-order numerical methods often
provide greater accuracy but require more computational resources and
time.
2. Stability vs. convergence: Some numerical methods may be more stable
but converge more slowly, while others may converge quickly but be highly
sensitive to initial conditions.
3. Robustness vs. complexity: More complex numerical methods may
provide better results in a wider range of situations, but they may also be
more difficult to implement and require a deeper understanding of the
underlying mathematical principles.
4. Applicability vs. simplicity: Some numerical methods may be more
applicable to a wider range of problems but may also be more complex
and difficult to implement, while simpler methods may be limited in their
applicability.
5. Model complexity vs. computational efficiency: More complex numerical
methods may be able to handle more complex models, but may require
significantly more computational time and resources.

Other factors include:

• Type of mathematical problem
• Type, availability, precision, cost, and speed of computer
• Program development cost versus software cost versus run-time cost
• Characteristics of the numerical method
• Mathematical behavior of the function, equation, or data
• Ease of application
• Maintenance

1. BINARY SYSTEMS

Modern high-speed computers handle real numbers in the binary system, in
contrast to the decimal system that humans prefer to use. The binary system uses
only two digits, 0 and 1, to represent numbers. Each binary digit is called a
“bit”. E.g., 11100101 has 8 bits. 8 bits make one byte.

In computing, ASCII (American Standard Code for Information Interchange) is a
7-bit character code that is normally stored in one 8-bit byte per character.
That is, 8 bits are used to hold a letter. For example, the word ‘CAT’ in a
word processor can be stored as 01000011₂, 01000001₂, and 01010100₂.
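
The bit patterns for ‘CAT’ can be reproduced with Python's built-in `ord` and `format`. The helper name `to_ascii_bits` is just an illustrative choice.

```python
def to_ascii_bits(text):
    """Return the 8-bit binary pattern of each ASCII character in text."""
    return [format(ord(ch), "08b") for ch in text]


print(to_ascii_bits("CAT"))  # ['01000011', '01000001', '01010100']
```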

Computers carry out their operations in the binary number system. A computer
communicates with its human users in the decimal system but works internally in
the binary system. Hence, there must be a conversion process that is executed
internally by the computer.

Conversion of Binary to Decimal

Examples: a) Convert 10011011₂ to base 10.

b) Convert 1001.11101₂ to base 10.

Solution

a) 10011011₂ = 1 × 2⁷ + 0 × 2⁶ + 0 × 2⁵ + 1 × 2⁴ + 1 × 2³ + 0 × 2² +
1 × 2¹ + 1 × 2⁰
= 128 + 0 + 0 + 16 + 8 + 0 + 2 + 1
= 155₁₀

b) 1001.11101₂ = 1 × 2³ + 0 × 2² + 0 × 2¹ + 1 × 2⁰ + 1 × 2⁻¹ + 1 × 2⁻² +
1 × 2⁻³ + 0 × 2⁻⁴ + 1 × 2⁻⁵
= 8 + 0 + 0 + 1 + 1⁄2 + 1⁄4 + 1⁄8 + 0 + 1⁄32
= 9.90625₁₀
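
The positional expansion used in both examples can be written as a short function. This is a minimal sketch; `binary_to_decimal` is a hypothetical helper name, and the input is assumed to be a well-formed binary string with at most one point.

```python
def binary_to_decimal(bits):
    """Sum digit * 2**position over a binary string such as '1001.11101'."""
    whole, _, frac = bits.partition(".")
    value = 0.0
    # Integer part: the rightmost digit has weight 2**0.
    for power, digit in enumerate(reversed(whole)):
        value += int(digit) * 2 ** power
    # Fractional part: the first digit after the point has weight 2**-1.
    for power, digit in enumerate(frac, start=1):
        value += int(digit) * 2 ** -power
    return value


print(binary_to_decimal("10011011"))    # 155.0
print(binary_to_decimal("1001.11101"))  # 9.90625
```

A digit of 0 contributes nothing, whatever its position, so only the 1 digits add to the total.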

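Conversion from Decimal (Base 10) to Binary

Divide repeatedly by 2 and record each remainder (r). The binary number is the
list of remainders read from bottom to top.

a) Convert 15₁₀ to a binary number

2 | 15
2 |  7 r 1
2 |  3 r 1
2 |  1 r 1
  |  0 r 1

15₁₀ = 1111₂

b) Convert 25₁₀ to base 2

2 | 25
2 | 12 r 1
2 |  6 r 0
2 |  3 r 0
2 |  1 r 1
  |  0 r 1

25₁₀ = 11001₂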
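
The division ladder above translates directly into a loop. This is a sketch for non-negative integers only, and the name `decimal_to_binary` is illustrative.

```python
def decimal_to_binary(n):
    """Convert a non-negative integer to binary by repeated division by 2."""
    if n == 0:
        return "0"
    remainders = []
    while n > 0:
        n, r = divmod(n, 2)
        remainders.append(str(r))
    # The first remainder is the least significant bit, so read them in reverse.
    return "".join(reversed(remainders))


print(decimal_to_binary(15))  # 1111
print(decimal_to_binary(25))  # 11001
```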

BINARY CODED DECIMAL SYSTEM

The binary coded decimal system (BCDS) is a code used by microprocessors. The
computer expresses each decimal digit by four binary digits.

How does the computer process 155? Each decimal digit is converted separately:

2 | 1        2 | 5        2 | 5
  | 0 r 1    2 | 2 r 1    2 | 2 r 1
             2 | 1 r 0    2 | 1 r 0
               | 0 r 1      | 0 r 1

So 155 is:
1 = 0001
5 = 0101
5 = 0101
NOTE: Add zeros to the front of each group so that it has 4 digits.
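
A minimal sketch of the digit-by-digit encoding, with each group padded to four bits. The function name `to_bcd` is hypothetical.

```python
def to_bcd(n):
    """Encode each decimal digit of a non-negative integer as a 4-bit group."""
    return " ".join(format(int(digit), "04b") for digit in str(n))


print(to_bcd(155))  # 0001 0101 0101
```

Note that this is not the same bit pattern as the plain binary form of 155 (10011011₂), because each decimal digit is encoded on its own.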
Conversion from Binary to Octal System

STEP 1 - Convert to base 10
STEP 2 - Convert to base 8

a) Convert 10110₂ to octal

Solution
STEP 1 - Convert to base 10

10110₂ = 1 × 2⁴ + 0 × 2³ + 1 × 2² + 1 × 2¹ + 0 × 2⁰ = 22₁₀

STEP 2 - Convert to base 8

8 | 22
8 |  2 r 6
  |  0 r 2

10110₂ = 26₈

b) Convert 10000101₂ to octal

Solution
STEP 1 - Convert to base 10

10000101₂ = 1 × 2⁷ + 0 × 2⁶ + 0 × 2⁵ + 0 × 2⁴ + 0 × 2³ + 1 × 2² + 0
× 2¹ + 1 × 2⁰ = 133₁₀

STEP 2 - Convert to base 8

8 | 133
8 |  16 r 5
8 |   2 r 0
  |   0 r 2

10000101₂ = 205₈
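
The two-step route (binary to base 10, then base 10 to base 8) can be sketched as follows. `binary_to_octal` is an illustrative name, and the input is assumed to be a binary integer with no fractional part.

```python
def binary_to_octal(bits):
    """Convert a binary integer string to octal via base 10."""
    # STEP 1: binary -> base 10, using the positional weights 2**k.
    decimal = sum(int(d) * 2 ** k for k, d in enumerate(reversed(bits)))
    # STEP 2: base 10 -> base 8 by repeated division, remainders read in reverse.
    digits = []
    while decimal > 0:
        decimal, r = divmod(decimal, 8)
        digits.append(str(r))
    return "".join(reversed(digits)) or "0"


print(binary_to_octal("10110"))     # 26
print(binary_to_octal("10000101"))  # 205
```

Python's built-ins give the same answers and are handy for checking hand work: `format(int("10110", 2), "o")` also returns `'26'`.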

Conversion from Octal to Binary

a) Convert 307₈ to binary

Solution
STEP 1 - Convert to base 10
307₈ = 3 × 8² + 0 × 8¹ + 7 × 8⁰
= 192 + 0 + 7
= 199₁₀

STEP 2 - Convert to base 2

2 | 199
2 |  99 r 1
2 |  49 r 1
2 |  24 r 1
2 |  12 r 0
2 |   6 r 0
2 |   3 r 0
2 |   1 r 1
  |   0 r 1

199₁₀ = 11000111₂

EXERCISE
1) Convert 4762₈ to binary.

Converting a binary fraction to octal

a) Convert 11101.0111₂ to octal

Solution
STEP 1 - Convert to base 10
11101.0111₂
= 1 × 2⁴ + 1 × 2³ + 1 × 2² + 0 × 2¹ + 1 × 2⁰ + 0 × 2⁻¹
+ 1 × 2⁻² + 1 × 2⁻³ + 1 × 2⁻⁴
= 16 + 8 + 4 + 0 + 1 + 0 + 1⁄4 + 1⁄8 + 1⁄16
= 29.4375₁₀

STEP 2 - Convert to base 8

8 | 29
8 |  3 r 5
  |  0 r 3

29₁₀ = 35₈
The fraction part is converted as follows:
0.4375 × 8 = 3.5   carry 3
0.5 × 8 = 4.0      carry 4
Continue until there is no fractional part left or until the precision is
sufficient.

0.4375₁₀ = 0.34₈. Thus 11101.0111₂ = 35.34₈
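
The repeated multiplication by 8 can be sketched as a loop that peels off one octal digit per step. The name `fraction_to_octal` and the `max_digits` cut-off are assumptions added for illustration; the cut-off plays the role of "until the precision is sufficient".

```python
def fraction_to_octal(frac, max_digits=8):
    """Octal digits of a fraction 0 <= frac < 1 by repeated multiplication by 8."""
    digits = []
    while frac > 0 and len(digits) < max_digits:
        frac *= 8
        carry = int(frac)   # the integer part is the next octal digit
        digits.append(str(carry))
        frac -= carry       # keep only the fractional part
    return "".join(digits)


print("0." + fraction_to_octal(0.4375))  # 0.34
```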

Convert a decimal fraction to binary

a) Convert 5.8725 to its binary equivalent

2 | 5
2 | 2 r 1
2 | 1 r 0
  | 0 r 1

0.8725 × 2 = 1.745   carry 1
0.745 × 2 = 1.49     carry 1
0.49 × 2 = 0.98      carry 0
0.98 × 2 = 1.96      carry 1
A fractional part (0.96) still remains at this point. If we stop here, as if
there were no remainder, our answer to four binary places becomes:

5.8725₁₀ ≈ 101.1101₂

NOTE: Continue until there is no remainder, or until the precision is sufficient.
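
Some decimal fractions never reach a zero remainder in binary, so a practical routine needs a bit limit. The sketch below uses `fractions.Fraction` from the standard library so that the decimal input is held exactly; the function name and the `max_bits` parameter are illustrative assumptions.

```python
from fractions import Fraction


def decimal_to_binary_fraction(text, max_bits):
    """Return (binary string, exact?) for a decimal literal such as '5.8725'."""
    value = Fraction(text)          # exact value, no floating-point round-off
    whole, frac = divmod(value, 1)
    bits = []
    while frac != 0 and len(bits) < max_bits:
        frac *= 2
        bit = int(frac)             # the carry is the next binary digit
        bits.append(str(bit))
        frac -= bit
    return f"{int(whole):b}." + ("".join(bits) or "0"), frac == 0


print(decimal_to_binary_fraction("5.8725", 4))  # ('101.1101', False)
print(decimal_to_binary_fraction("5.75", 8))    # ('101.11', True)
```

The second element of the result reports whether the expansion terminated within the limit. For 5.8725 it does not, so 101.1101₂ is an approximation.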

Exercise
1) Convert 11011011.11011₂ to octal.

BINARY OPERATIONS (Arithmetic)

A microprocessor uses the following four binary arithmetic operations to carry
out its work: addition, subtraction, multiplication, and division.

1) Addition
a. Add 1010₂ and 1001₂
    1 0 1 0
+   1 0 0 1
  ---------
  1 0 0 1 1

b. Add 11001001₂ and 1001110₂

    1 1 0 0 1 0 0 1
+     1 0 0 1 1 1 0
  -----------------
  1 0 0 0 1 0 1 1 1
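
Column-by-column addition with a carry, as done by hand above, can be sketched like this. `add_binary` is an illustrative name; the inputs are assumed to be unsigned binary strings.

```python
def add_binary(a, b):
    """Add two binary strings column by column, right to left, with a carry."""
    width = max(len(a), len(b))
    a, b = a.zfill(width), b.zfill(width)
    carry, out = 0, []
    for x, y in zip(reversed(a), reversed(b)):
        total = int(x) + int(y) + carry
        out.append(str(total % 2))  # digit written in this column
        carry = total // 2          # carried into the next column
    if carry:
        out.append("1")
    return "".join(reversed(out))


print(add_binary("1010", "1001"))         # 10011
print(add_binary("11001001", "1001110"))  # 100010111
```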

2) Subtraction
a. Subtract 1001₂ from 1010₂
  1 0 1 0
- 1 0 0 1
  -------
  0 0 0 1

b. Subtract 001100₂ from 0011010₂

  0 0 1 1 0 1 0
-   0 0 1 1 0 0
  -------------
  0 0 0 1 1 1 0

c. Subtract 1010111₂ from 1110110₂

  1 1 1 0 1 1 0
- 1 0 1 0 1 1 1
  -------------
  0 0 1 1 1 1 1
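
Subtraction works the same way, with a borrow in place of a carry. A minimal sketch, assuming the first number is at least as large as the second; `subtract_binary` is a hypothetical helper name.

```python
def subtract_binary(a, b):
    """Compute a - b for binary strings with a >= b, borrowing column by column."""
    width = max(len(a), len(b))
    a, b = a.zfill(width), b.zfill(width)
    borrow, out = 0, []
    for x, y in zip(reversed(a), reversed(b)):
        diff = int(x) - int(y) - borrow
        borrow = 1 if diff < 0 else 0    # borrow from the next column if needed
        out.append(str(diff + 2 * borrow))
    return "".join(reversed(out))


print(subtract_binary("1110110", "1010111"))  # 0011111
print(subtract_binary("0011010", "001100"))   # 0001110
```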

3) Multiplication
a. Evaluate 1100101₂ × 10011₂
          1 1 0 0 1 0 1
×             1 0 0 1 1
  ---------------------
          1 1 0 0 1 0 1
        1 1 0 0 1 0 1
      0 0 0 0 0 0 0
    0 0 0 0 0 0 0
  1 1 0 0 1 0 1
  ---------------------
  1 1 1 0 1 1 1 1 1 1 1
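
Each row of the layout above is the multiplicand shifted left by the position of a 1 digit in the multiplier. A sketch of this shift-and-add idea follows, using Python integers for the running sum; the function name is illustrative.

```python
def multiply_binary(a, b):
    """Shift-and-add: one shifted copy of a for every 1 digit of b."""
    multiplicand = int(a, 2)
    product = 0
    for shift, bit in enumerate(reversed(b)):
        if bit == "1":
            product += multiplicand << shift  # partial product for this row
    return format(product, "b")


print(multiply_binary("1100101", "10011"))  # 11101111111
```

In decimal this is 101 × 19 = 1919.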

4) Division
To divide two numbers exactly, we follow four steps: divide, multiply,
subtract, and bring down the next digit.
Binary division follows the same process as long division with decimal numbers
but, at each step, we only need to decide whether the next quotient digit is
a 1 or a 0.

a. Divide 11001₂ by 101₂

          1 0 1
101 ) 1 1 0 0 1
    - 1 0 1
      0 0 1 0 1     101 is greater than 10, so add 0 to the quotient
        - 1 0 1
          0 0 0

ANS = 101₂

b. Divide 1011001₂ by 111₂

First, we take the digits of the dividend one at a time.
Is 111 less than or equal to 101? No, so we take another digit.
Is 111 less than or equal to 1011? Yes. In binary we do not need to work out
how many times 111 goes into 1011; we simply add a 1 to the quotient.
We continue in the usual way until there is nothing left to bring down. What is
left over is the remainder.

            1 1 0 0
111 ) 1 0 1 1 0 0 1
    -   1 1 1
        1 0 0 0
    -     1 1 1
          0 0 1 0 1     101 is less than 111, so the last two quotient digits are 0

ANS = 1100₂ remainder 101₂
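
The "bring down a digit, then decide 1 or 0" procedure can be sketched as a loop over the dividend. `divide_binary` is a hypothetical name; the divisor is assumed to be non-zero.

```python
def divide_binary(dividend, divisor):
    """Long division on binary strings; returns (quotient, remainder)."""
    d = int(divisor, 2)
    remainder = 0
    quotient = []
    for bit in dividend:
        remainder = remainder * 2 + int(bit)  # bring down the next digit
        if remainder >= d:
            quotient.append("1")              # the divisor fits once
            remainder -= d
        else:
            quotient.append("0")              # the divisor does not fit
    return "".join(quotient).lstrip("0") or "0", format(remainder, "b")


print(divide_binary("11001", "101"))    # ('101', '0')
print(divide_binary("1011001", "111"))  # ('1100', '101')
```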
Exercise
A. Add 111101111 and 101010111
B. What is the binary subtraction of 111101111 and 101010111
C. Multiply 111101111 and 101010111
D. Divide 111101111 by 101

HEXADECIMAL NUMBER SYSTEM

Hexadecimal is the name of the numbering system that is base 16. This system,
therefore, has sixteen numerals, with values 0 through 15. That means that the
two-digit decimal numbers 10, 11, 12, 13, 14, and 15 must each be represented
by a single numeral in this numbering system. To do so, the alphabetic
characters A, B, C, D, E, and F are used to represent these values in
hexadecimal and are treated as valid numerals.
The digits are represented as follows:

Decimal:  0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
Hex:      0 1 2 3 4 5 6 7 8 9  A  B  C  D  E  F

Convert 207₁₀ to hex

16 | 207
16 |  12 R 15 -> F
   |   0 R 12 -> C

ANS: 207₁₀ = CF₁₆


Convert 1A0₁₆ to a decimal number

1A0₁₆ = 1 × 16² + A × 16¹ + 0 × 16⁰
= 256 + 10 × 16 + 0
= 256 + 160
= 416₁₀

Exercise

a. Convert 1110011011₂ to hexadecimal.
b. Convert EE₁₆ to decimal.
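
Both hexadecimal conversions can be sketched with a lookup string for the sixteen numerals. The names `HEX_DIGITS`, `decimal_to_hex`, and `hex_to_decimal` are illustrative.

```python
HEX_DIGITS = "0123456789ABCDEF"  # position in the string = value of the numeral


def decimal_to_hex(n):
    """Repeated division by 16, remainders read in reverse."""
    if n == 0:
        return "0"
    digits = []
    while n > 0:
        n, r = divmod(n, 16)
        digits.append(HEX_DIGITS[r])
    return "".join(reversed(digits))


def hex_to_decimal(text):
    """Positional expansion in powers of 16, evaluated left to right."""
    value = 0
    for ch in text.upper():
        value = value * 16 + HEX_DIGITS.index(ch)
    return value


print(decimal_to_hex(207))    # CF
print(hex_to_decimal("1A0"))  # 416
```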

Miscellaneous

Two’s Complement

Two’s complement is an alternative way of representing negative binary
numbers. This coding system also has the useful property that
subtraction (or the addition of a negative number) can be performed using
addition hardware. Architects of early computers were thus able to build
arithmetic and logic units that performed addition and subtraction
using only adder hardware. (As it turns out, since multiplication is just
successive addition and division is just successive subtraction, it was possible
to use simple adder hardware to perform all of these operations.)

Example

Find the two’s complement form of -6

Step 1
Decide how many bits you are going to use for all your operations. For our
purposes we will always use 8 bits.

With 8 bits, the left-most bit contains the sign. This leaves
7 bits to hold the number.

XXXX XXXX
|
This is the sign bit

This sign bit is reserved and is no longer one of the digits that make up the
binary number. Using two’s complement, a one (1) in the leftmost bit tells the
machine that the number is negative and that, before it does any mathematics,
it needs to convert the number into its two’s complement equivalent.

0000 1110₂: the sign bit is 0, so the number is positive.
The binary number itself is 7 digits long.

1000 0110₂: the sign bit is 1, so the number is negative.
The binary number itself is only 7 digits long.

Step 2
Strip the sign bits off the numbers.

Step 3
Convert the negative number into its two’s complement form.

So, therefore,

-6₁₀ = 1000 0110₂ (sign bit followed by the 7-digit number)

Write down the number without the sign bit:      000 0110₂
a) Flip all the digits (1 -> 0, 0 -> 1):         111 1001₂
b) Add 1 to this number:                       +        1
c) This is now -6 in two’s complement format:    111 1010₂

With the sign bit (1) restored in front, the 8-bit pattern is 1111 1010₂.
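
The flip-and-add-one rule can be sketched for any word size. The sketch below flips all `bits` digits of the magnitude at once instead of stripping the sign bit first, which gives the same 8-bit pattern for -6. The function name and the `bits` parameter are illustrative assumptions.

```python
def twos_complement(n, bits=8):
    """Bit pattern of n in two's complement, using flip-and-add-one for negatives."""
    if not -(1 << (bits - 1)) <= n < (1 << (bits - 1)):
        raise ValueError("value does not fit in the given number of bits")
    mask = (1 << bits) - 1        # e.g. 1111 1111 for 8 bits
    if n >= 0:
        return format(n, f"0{bits}b")
    flipped = (-n) ^ mask         # flip every digit of the magnitude
    return format((flipped + 1) & mask, f"0{bits}b")


print(twos_complement(-6))  # 11111010
print(twos_complement(6))   # 00000110

# Adding the two patterns with plain 8-bit addition wraps around to zero,
# which is why adder hardware alone can perform subtraction.
total = (int(twos_complement(6), 2) + int(twos_complement(-6), 2)) & 0xFF
print(total)                # 0
```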

Exercise

a) Find the 2s complement of -14₁₀

b) Using the 2s complement form, perform the following arithmetic operations:

14₁₀ - 6₁₀ and 6₁₀ - 14₁₀
