Introduction(Updated)

Lecture notes of DSA

Uploaded by

Sake Anila

Mind Set

Algorithm

Welcome to CS202

In this course we will look at the theory behind data structures and
algorithms. It will also give us a glimpse into the theory of
computers and computer science.

Today, many established universities around the world have people
who work on quantum computers, even though most people do not
have access to such computers.

Many of them spend their entire lives working on quantum
computers without ever getting a chance to look at one.

They achieve this feat by understanding the theory behind quantum
computers, based solely on the laws of quantum physics.

Similarly, roughly 50 years ago, most scientists did not have access
to personal laptops or computers. They too studied the capabilities
of computers based on a few basic principles of how computers work.

Try to understand this course with the mind-set of those
computer scientists.
There will be coding-related assignments, and several students try to
learn algorithms and data structures by coding them. That may
work, but it is not the way one should go about learning this course.
Time
Mind Set RAM model of Computation
Algorithm Correctness
Efficiency

What is an algorithm? An algorithm is a procedure to accomplish
a specific task. An algorithm is the idea behind any reasonable
computer program. Don’t worry about the definition; try to focus
on the underlying idea.

This definition seems vague, so let me be a bit more precise.
Computational problems have an input or a set of inputs, and they
have a desired output.

For example, as an 8th standard student you were given a quadratic
equation as the input, and the desired output was a value or set of
values that satisfy that equation.
Similarly, algorithms are required to solve computational problems.
They have to solve them correctly and efficiently.

Barring a few fields in the study of algorithms, given a
computational problem we look for algorithms that always
give the right answer. Ideally, you do not want to design a piece of
code/algorithm that works 99% of the time but fails sometimes.

Once you are assured of the correctness of an algorithm, you are
required to focus on the efficiency of the algorithm.

You may wonder what one means by efficiency. In some cases
efficiency could mean the time taken to solve a problem. In some
cases it could mean the space (RAM) required to solve the
problem. There could be several other parameters by which
efficiency could be measured.

For the major part of this course we will focus on
efficiency in terms of time.

As far as this course is concerned, the word time has a different
meaning. It does not denote the actual time (in seconds) taken by
an algorithm to produce an output.

By time we mean clock cycles. A clock cycle, or simply a "cycle",
is a single electronic pulse of a CPU. During each cycle, a CPU can
perform a basic operation such as fetching an instruction, accessing
memory, or writing data. Since only simple commands can be
performed during each cycle, most CPU processes require multiple
clock cycles. This definition of time makes our life easy for the
theoretical understanding of the problems we will study.

In this course we will use the RAM model of
computation. Machine-independent algorithm design depends upon
a hypothetical computer called the Random Access Machine, or
RAM. Under this model of computation, we are confronted with a
computer where:
- Each simple operation (+, *, -, =, if, call) takes exactly one
  time step.
- Loops and subroutines are not considered simple operations.
  Instead, they are the composition of many single-step
  operations. The time it takes to run through a loop or execute
  a subprogram depends upon the number of loop iterations or
  the specific nature of the subprogram.
- Each memory access takes exactly one time step. Further, we
  have as much memory as we need. The RAM model takes no
  notice of whether an item is in cache or on the disk.

Under the RAM model, we measure run time by counting up the
number of steps an algorithm takes on a given problem instance. If
we assume that our RAM executes a given number of steps per
second, this operation count converts naturally to the actual
running time measured in seconds.
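
As a rough illustration of step counting, the sketch below tallies
RAM-model steps while summing a list. The function name
`sum_with_step_count` and the exact charging convention (one step per
assignment, per addition, and per memory read) are assumptions made
for this example; a different convention would change the count only
by a constant factor.

```python
def sum_with_step_count(values):
    """Sum a list while tallying RAM-model steps (illustrative convention)."""
    steps = 0
    total = 0
    steps += 1              # one assignment: total = 0
    for v in values:
        steps += 1          # one memory access to read the element
        total = total + v
        steps += 2          # one addition and one assignment
    return total, steps


if __name__ == "__main__":
    total, steps = sum_with_step_count([4, 8, 15, 16, 23, 42])
    print("sum =", total, "steps =", steps)
```

Under this convention a list of m items costs 3m + 1 steps, so the
count grows linearly with the number of items.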

Let’s come back to the correctness of an algorithm.
How does one ascertain that the logic behind the code will never fail?

Most amateur coders try to solve a computational problem by
coding it without verifying its correctness. They run the code on
several sample inputs and check whether the output is correct.

This is not the right way to go about algorithm design. One has to first
be absolutely sure that the logic of the code never fails and that there
are no black swan events. The task of ascertaining the correctness
of an algorithm is usually easier than the task of finding an
efficient algorithm.

For example, let’s consider the basic task of finding two factors of
an integer x with n digits. Please note that n denotes the size of
the input and not the value of the input.

Suppose the input integer is 10763. Its value is 10763, but
its size in decimal is 5 digits. By n, we are referring to 5 and not
10763.

The basic algorithm we learned in high school was to try
dividing x by every integer from 2 up to x − 1 (a range of size
roughly $10^n$) and see if any of them divides the input exactly.
This algorithm will correctly find a factor if the input is a
composite number, because it checks every candidate in that range.
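
A minimal sketch of this high-school procedure is given below. The
function name `find_factor_naive` is made up for this example, and the
loop stops at x − 1 because x always divides itself. Note how the
input size n is read off from the decimal representation, not from
the value.

```python
def find_factor_naive(x):
    """Return the smallest divisor d of x with 2 <= d < x, or None if none exists."""
    for d in range(2, x):
        if x % d == 0:      # d divides x exactly
            return d
    return None             # no divisor found: x is prime (for x >= 2)


if __name__ == "__main__":
    x = 10763
    n = len(str(x))         # size of the input in decimal digits
    print("value:", x, "size n:", n, "factor:", find_factor_naive(x))
```

For a prime input the loop runs through almost all x candidates,
which is where the rough $10^n$ count comes from.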

A minor modification would be to try only the integers in the range
2 to $\sqrt{x}$ (roughly $10^{n/2}$ of them) and see if any integer
divides the input.

This algorithm will also correctly find a factor if the input is a
composite number, because a composite number always has at least
one factor no larger than $\sqrt{x}$.

So now we are sure that our algorithm always gives the factors of
the input if it is a composite number; otherwise it reports that the
input is a prime number.
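
The modified procedure can be sketched as follows. The name
`find_factor_sqrt` is an assumption for this example; `math.isqrt`
is the standard-library integer square root. The search stops at the
square root because any composite x has a divisor no larger than it.

```python
import math


def find_factor_sqrt(x):
    """Return a pair (d, x // d) with 2 <= d <= sqrt(x), or None if x is prime."""
    for d in range(2, math.isqrt(x) + 1):
        if x % d == 0:
            return d, x // d    # both factors come out of a single division
    return None                 # no divisor up to sqrt(x): x is prime (for x >= 2)


if __name__ == "__main__":
    print(find_factor_sqrt(10763))
    print(find_factor_sqrt(97))
```

The return value also covers the last sentence of the slide: a pair
of factors for a composite input, and `None` when the input is prime.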

One might even go one step further and check only the primes in the
range 2 to $\sqrt{x}$ to see if they divide the input. But then we
encounter a whole new problem of finding prime numbers,
which makes matters more complicated, so we will ignore this
tweak for the time being.

Right now we are happy with the fact that we have a procedure to
find a factor of an input of n digits if it is composite, or to declare
that the integer given as the input is prime.
But how long will our procedure take to run and give the
output?

Even under the best-case assumption that a computer can check
divisibility of one integer by another in 1 clock cycle, it could
take roughly $10^{n/2}$ clock cycles to factorize an n-digit composite
integer. The number of cycles required is exponential in the input
size.

So now we know that the factorization algorithm discussed
above is probably not efficient, because increasing the input by k
digits multiplies the running time of the procedure by $10^{k/2}$.
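
To see where this factor comes from, compare the rough step counts
for inputs of n and n + k digits. This is only a restatement of the
estimate above, not a new result:

```latex
\frac{10^{(n+k)/2}}{10^{n/2}} = 10^{\frac{n+k}{2} - \frac{n}{2}} = 10^{k/2}
```

For instance, adding k = 2 digits multiplies the estimated number of
trial divisions by 10, and adding k = 6 digits multiplies it by 1000.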

FYI: The problem of factoring a large number has been studied for
several decades, and no one has yet found an efficient solution. It is
one of the most important unsolved problems of this century.
Solving this problem would bring instant worldwide fame and would
probably break several encryption techniques, including those that
protect internet banking.

So now we have a rough idea of what we are looking
for in an algorithm. Let’s look at another related problem that we
might have studied in high school.

Let’s study the problem of finding the GCD of two integers x and y.
As students in primary school, we were taught to factorize both
integers x and y and then list the common prime factors
along with their multiplicities. As we noticed earlier, this is probably
not an efficient way to find the GCD, because we do not know how to
factorize an integer efficiently.
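
For comparison, here is a sketch of the primary-school method. The
helper names `prime_factors` and `gcd_by_factoring` are assumptions
for this example, and the factorization uses plain trial division, so
it inherits the exponential cost discussed earlier.

```python
from collections import Counter


def prime_factors(x):
    """Factorize x >= 1 by trial division into a Counter {prime: multiplicity}."""
    factors = Counter()
    d = 2
    while d * d <= x:
        while x % d == 0:       # pull out every copy of the prime d
            factors[d] += 1
            x //= d
        d += 1
    if x > 1:                   # whatever remains is itself prime
        factors[x] += 1
    return factors


def gcd_by_factoring(x, y):
    """GCD as the product of common primes, each to its smaller multiplicity."""
    common = prime_factors(x) & prime_factors(y)   # Counter '&' keeps the minimum counts
    result = 1
    for prime, multiplicity in common.items():
        result *= prime ** multiplicity
    return result


if __name__ == "__main__":
    # 84 = 2^2 * 3 * 7 and 36 = 2^2 * 3^2 share 2^2 * 3
    print(gcd_by_factoring(84, 36))
```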

Some of you might have encountered the Euclidean algorithm for
finding the GCD. If we have two integers with 4 and 7 digits, the input
size in this case would be 11 digits.
The Euclidean algorithm is based on the facts that

GCD(x, y) = GCD(y, x mod y)

GCD(a, 0) = a

where x mod y denotes the remainder obtained when x is divided
by y. This process is repeated until we reach a point where
the second operand is zero, at which point the first
operand is the desired GCD.
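
The two identities translate almost directly into a loop. The sketch
below assumes non-negative integer inputs; the function name `gcd` is
chosen for this example.

```python
def gcd(x, y):
    """Euclidean algorithm for non-negative integers x and y."""
    while y != 0:
        x, y = y, x % y     # GCD(x, y) = GCD(y, x mod y)
    return x                # GCD(a, 0) = a


if __name__ == "__main__":
    print(gcd(1201, 88))
    print(gcd(84, 36))
```

If x is smaller than y, the first iteration simply swaps the two
operands, so no separate ordering check is needed.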

Let’s analyse the procedure we have defined.

1. Does the algorithm terminate or does it keep running in
loops forever? Ans: Since x mod y is always smaller than y, the
second operand strictly decreases at every step and is never
negative, so it will eventually reach zero, at which point the
algorithm terminates.
2. Does the algorithm always give the correct answer? Yes,
it always gives the correct answer due to the mathematical
logic behind the algorithm. At each step of the
GCD procedure the operands may change, but the
GCD of the operands stays the same. So the GCD of the
first pair of operands is also the GCD of the last pair of operands.
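
Both answers can be observed on a concrete run. The sketch below
records every pair of operands the procedure visits; the name
`euclid_trace` is an assumption for this example, and `math.gcd` from
the standard library is used only as an independent check of the
invariant. It is an illustration on one input, not a proof.

```python
import math


def euclid_trace(x, y):
    """Return the list of operand pairs visited by the Euclidean algorithm."""
    pairs = [(x, y)]
    while y != 0:
        x, y = y, x % y
        pairs.append((x, y))
    return pairs


if __name__ == "__main__":
    pairs = euclid_trace(1201, 88)
    for a, b in pairs:
        # the second operand shrinks on every line; the GCD column never changes
        print(a, b, math.gcd(a, b))
```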

3. Is it efficient? Let’s assume that finding the remainder
obtained when x is divided by y can be done in one clock cycle.
Assume x ≥ y. There are two cases.
- x has more digits than y.

In the next step of the procedure we will find the GCD of y and an
integer (x mod y) which is less than y. Thus the total number
of digits representing the two operands decreases by at
least 1.

For example,
if we are trying to find GCD(1201, 88), then the next step would
be GCD(88, 1201 mod 88) = GCD(88, 57), which reduces the
total number of digits in the operands from 6 to 4.

- x ≥ y, but x and y have the same number of digits.

In this situation, one step may not be sufficient to reduce the
number of digits in the operands.
For example, GCD(21, 11) = GCD(11, 10).

Suppose x and y have α digits each and their leading digits are
x' and y' respectively.

Since x and y have the same number of digits, (x mod y) could
possibly be x − y, which in turn might also have α digits.

The goal is to figure out the number of steps needed to ensure
that one of the operands loses at least one digit.
1. If x' = y' = 1, then one step is enough, and x mod y will
have fewer than α digits.
2. If x' > 1 and y' = 1, then in one step,
   - either x mod y has fewer than α digits,
   - or x mod y also has α digits, which implies that the
     leading digit of (x mod y) is also 1. So in the next step, we
     encounter Case 1.
...
In this way, one can analyze all possible pairs of leading digits of
x and y, and figure out the number of steps needed to reduce the
number of digits in one of the operands.

After checking all the possible pairs of leading digits, one would
come to the conclusion that at most 5 steps are needed. For the sake
of brevity, I have not included that analysis here.

However, in this situation, we can use Fibonacci numbers to make
some observations. Fibonacci numbers are useful in the analysis of
GCD-computing algorithms.
The first few Fibonacci numbers are 0, 1, 1, 2, 3, 5, 8.

The worst-case behaviour is showcased by the example

GCD(8, 5) = GCD(5, 3) = GCD(3, 2) = GCD(2, 1) = GCD(1, 0)
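
A small experiment makes the role of Fibonacci numbers visible. The
sketch below counts remainder operations, where each operation
replaces (x, y) by (y, x mod y) until the second operand is zero. The
names `euclid_steps` and `fibonacci_pairs` are assumptions for this
example.

```python
def euclid_steps(x, y):
    """Number of remainder operations the Euclidean algorithm performs on (x, y)."""
    steps = 0
    while y != 0:
        x, y = y, x % y
        steps += 1
    return steps


def fibonacci_pairs(count):
    """Yield consecutive Fibonacci pairs (larger, smaller), starting from (2, 1)."""
    a, b = 1, 2
    for _ in range(count):
        yield b, a
        a, b = b, a + b


if __name__ == "__main__":
    for x, y in fibonacci_pairs(5):
        print(x, y, euclid_steps(x, y))
```

Each consecutive Fibonacci pair needs one more step than the pair
before it, and among single-digit operands with x ≥ y no pair needs
more steps than (8, 5).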

Thus, in either case, after at most 5 iterations we would be trying to
find the GCD of two integers whose total size, in terms of number of
digits, is at least 1 less than what we started with.

Extending this logic, we can see that if x and y had n digits in
total, then after at most 5n iterations of the GCD procedure we will
have found the GCD of the two input integers and the algorithm
will terminate.

If the total size of the input is n digits, then we can find the GCD in
at most 5n clock cycles, under the earlier assumption that one
remainder takes one clock cycle. Here we can say that the algorithm
is efficient because it takes a linear number of clock cycles w.r.t.
the size of the input to solve the problem.
Whether this algorithm is the best algorithm or not is a topic for
future discussion, but this is an efficient algorithm.
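
The 5n bound can be spot-checked by brute force on small inputs. This
is a sanity check under the stated assumptions, not a proof; the names
`euclid_steps`, `total_digits` and `bound_holds_up_to` are made up for
this example.

```python
def euclid_steps(x, y):
    """Number of remainder operations the Euclidean algorithm performs on (x, y)."""
    steps = 0
    while y != 0:
        x, y = y, x % y
        steps += 1
    return steps


def total_digits(x, y):
    """Input size n: decimal digits of x plus decimal digits of y."""
    return len(str(x)) + len(str(y))


def bound_holds_up_to(limit):
    """Check steps <= 5n for every pair with 1 <= y <= x <= limit."""
    return all(
        euclid_steps(x, y) <= 5 * total_digits(x, y)
        for x in range(1, limit + 1)
        for y in range(1, x + 1)
    )


if __name__ == "__main__":
    print(euclid_steps(1201, 88), "steps for an input of", total_digits(1201, 88), "digits")
    print("bound holds for all pairs up to 200:", bound_holds_up_to(200))
```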

Thus we have concluded that the Euclidean algorithm is an efficient
algorithm which correctly finds the GCD of two integers.

These problems are easy to model as mathematical/computational
problems. But it is not so easy in most real world scenarios.

Perhaps the single most important design technique is modeling, the
art of abstracting a messy real-world application into a clean
problem suitable for algorithmic attack.
