Seminar 1 - Introduction (Handout)

FIT2004 is a unit focused on Algorithms and Data Structures, emphasizing problem-solving and algorithm design paradigms like Divide-and-Conquer. The course includes assessments such as quizzes and assignments, and stresses the importance of engagement in seminars and applied classes for success. Academic integrity is crucial, with strict policies against cheating, collusion, and plagiarism.


FIT2004 - Algorithms and Data Structures

Seminar 1 - Introduction

Rafael Dowsley
3 March 2025
Agenda

1 Unit Information

2 Divide-and-Conquer

3 Complexity Analysis

4 Solving Recurrence Relations


Unit Information
Teaching team

• Chief examiner/co-lecturer: Rafael Dowsley

• Co-lecturer: Anuja Dharmaratne

• Malaysia lecturer/coordinator: Lim (Ian) Wern Han

• Admin TAs: Harrison Sloan, Joshua Nung, Mubasshir Murshed, Sachinthana Pathiranage

• TAs: David Batonda, Elijah Lewis, Ethan Wills, Jackson Goerner, Klarissa Jutivannadevi, Luhan Cheng, Michael Xue, Nathan Companez, Saman Ahmadi, Satya Jhaveri, Shen-Kit Hia, Susilo Lebang, Thomas Hendrey, Yisong Yu
What is this unit about?

• Solving problems with computers - efficiently.

• Developing your algorithm toolbox.

• Training your problem solving skills.

• The unit is not really about programming:

◦ Python is used for assignments, but the subject is really language agnostic.

◦ Algorithms in this unit will be described in English, pseudocode, a procedural set of instructions or Python.
Is this unit important?

• Algorithms and Data Structures is a key unit in computer science degrees around the world.

• The subject is very important for careers in the area:

◦ Companies actively hunt for people good at algorithms and data structures.

◦ Many job interview questions are based on algorithms from this unit.

◦ Many applied class questions are in fact very similar to questions you could be asked in job interviews.

◦ You are the future of CS and will make great contributions to the field. What you learn in this unit will greatly help you throughout your career.
Overview of the contents

• Explore some of the most important algorithm design paradigms:

◦ Divide-and-Conquer

◦ Greedy algorithms

◦ Dynamic Programming

◦ Network Flow

• Analysis of algorithms.

• Learn important data structures for implementing algorithms efficiently.

• Algorithms for solving important computational problems.
Expectations

• This unit is challenging.

◦ If you don’t have a good understanding of the contents of prerequisite units (e.g., FIT1008), you need to catch up on them urgently to stay out of trouble.

◦ You have to be on top of it from Week 1. You will very likely not pass if you think “I can brush up on the material close to the assessment deadlines”.

• The minimum expected workload is 144 hours.

◦ The best way to spend “the first 60 hours” is by attending and engaging in the Seminars and Applied Classes.

◦ Missing Seminars or Applied Classes will require double the effort to recover.
Good news

• The unit wants you to succeed and understand. Lots of resources are available:

◦ Notes for all 12 weeks are available in a single PDF file on Moodle (click on “Learning”, then “Additional Information and Resources” and scroll down to “Unit Resources”).

◦ Solutions to the Applied Classes’ problems are released.

◦ Preparation sheets and solutions are provided.

◦ Support on the Ed discussion platform and in the consultations.

• The majority of students who regularly engage with the classes end up getting D/HD.
Support

• Contents and general questions should be clarified on the Ed forum or during a consultation.

• For questions involving sensitive matters that cannot be solved on the Ed forum, please email [email protected].

• Do not email individual staff members.
Additional references

• This subject has huge value for your professional development into careers in computer science. Some books you might want to refer to (from time to time, even beyond this unit) include:

◦ CLRS: Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, Clifford Stein. Introduction to Algorithms.

◦ KT: Jon Kleinberg, Éva Tardos. Algorithm Design.

◦ Rou: Tim Roughgarden. Algorithms Illuminated.

◦ Knu: Donald Knuth. The Art of Computer Programming. More advanced, pretty expensive; but iconic CS book!

• No required textbooks. Beware that all of those textbooks contain both less and (far) more than is required for the unit. That is why it is important to not rely solely on these books (i.e., you should watch the seminars, attend applied classes and read the course notes).
Assessment

• In-semester assessments (total of 50 marks):

◦ Weekly Quizzes (2 marks per week, 22 marks in total): held during each Applied Class; 1 or 2 questions per week; 10 minutes to solve. Bring your own Internet-connected device to answer on Moodle, and also blank working sheets if you plan to use them.

◦ Assignment 1 (10 marks), due on Wednesday of Week 7.

◦ Assignment 2 (18 marks), due on Wednesday of Week 12.

◦ In-semester hurdle: 45% of the in-semester marks (i.e., at least 22.5 out of the total 50 marks for Quizzes and Assignments).

• Final exam (50 marks):

◦ Exam hurdle: 45% (i.e., at least 22.5 out of the 50 Exam marks).
How to succeed in this unit?

• During Week x, engage with Seminar x.

• During Week x, read the corresponding parts of the unit notes and clarify any questions in the consultations or Ed forum.

• Attend the Applied Classes and engage with them:

◦ The Applied Class of Week x deepens the understanding of topics presented in Seminar x − 1.

◦ Do the Preparation x problems before the Applied Class of Week x. They help you practice and self-assess your understanding of the topics of Seminar x − 1. Solutions are provided.

◦ The Preparation problems will likely be helpful to get you ready for the Quiz, as the Quiz covers contents from classes up to Week x − 1.

◦ The Assignments and the Exam assess the concepts explored in the Applied Classes.
How to get into trouble in this unit?

• Don’t engage with classes and focus all effort on completing assignments.

◦ The assignments are designed to be done by people with a strong grasp of the contents and are heavily based on the content taught in the unit.

◦ If you spend time understanding the content taught in the classes, the assignments will be far easier.

◦ Historically, many students who focus all their effort on completing assignments fail the exam hurdle, as it covers all contents of the unit.

◦ Those students normally struggle a lot with exam questions that are taken almost “as is” from Applied Classes and Seminars.

• Constructive alignment is used in this unit.
Academic integrity

• Cheating: Seeking to obtain an unfair advantage in an examination or in other written or practical work required to be submitted or completed for assessment.

• Collusion: Unauthorised collaboration on assessable work with another person or persons.

• Plagiarism: To take and use another person’s ideas and/or manner of expressing them and to pass them off as one’s own by failing to give appropriate acknowledgement. This includes material from any source, staff, students or the Internet – published and unpublished works.

• Generative AI tools cannot be used for any assessment in this unit.
How to avoid academic integrity issues

• https://www.monash.edu/students/study-support/academic-integrity

• Do not discuss the assessment tasks with other students.

• High-risk game: every semester a considerable number of academic integrity cases are opened in this unit. After SCC investigations and decisions, most result in a “zero marks for assignment” penalty (which makes it hard to pass the hurdle) or a straight “zero marks for unit” penalty.

• What can you do? Share test cases! Feel free to post your test cases on the Ed post that will be created for that, and to use others’.
Give us feedback!

• We are continuously trying to improve the unit.

• Tell us about things you dislike/think should be done differently.

• A few examples of changes which were motivated by student feedback:

◦ Reduction in exam grade percentage to 50%.

◦ Reduction of the number of assignments from 4 to 2.

◦ Ed thread specifically for sharing test cases for assignments.

◦ Releasing Applied Classes’ solutions one week earlier.
Your first algorithm

• What was the first algorithm you learned?

◦ As a CS student you have already learned some algorithms at university: binary search, sorting algorithms, etc.

◦ But algorithms predate computers by millennia. Even the word ‘algorithm’ is derived from the name of a 9th century Persian mathematician.

◦ In fact, you learned in school an algorithm that is more than 2000 years old: Euclid’s algorithm for computing the greatest common divisor.

◦ Even in your first school years you already learned the “grade school” multiplication algorithm.
Grade school multiplication algorithm

• Grade school multiplication algorithm using partial products:

• Did your teacher talk about the efficiency of this algorithm? Or show a correctness proof?

• Well, back then you were only a user of the algorithm.

• In the future, understanding the efficiency and correctness of algorithms will be a central skill in your career.
Grade school multiplication algorithm

• Grade school multiplication algorithm using partial products:

• If we consider addition and multiplication of single-digit numbers as the basic operations, then for n-digit numbers this algorithm clearly has complexity that is quadratic in n.

• Fundamental question in algorithm design: Can we do it more efficiently?
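To make the quadratic cost concrete, here is a minimal Python sketch of grade school multiplication on digit lists. The function names, the least-significant-digit-first representation and the operation counter are illustrative assumptions, not part of the seminar material.

```python
def to_digits(n):
    """Decimal digits of a non-negative integer, least significant first."""
    return [int(ch) for ch in reversed(str(n))]


def from_digits(digits):
    """Inverse of to_digits."""
    return int("".join(str(d) for d in reversed(digits)))


def grade_school_multiply(a_digits, b_digits):
    """Multiply two digit lists using partial products.

    Returns the product digits and the number of single-digit
    multiplications performed (hypothetical counter for illustration).
    """
    result = [0] * (len(a_digits) + len(b_digits))
    single_digit_products = 0
    for i, da in enumerate(a_digits):
        carry = 0
        for j, db in enumerate(b_digits):
            # One single-digit multiplication plus a few single-digit additions.
            total = result[i + j] + da * db + carry
            single_digit_products += 1
            result[i + j] = total % 10
            carry = total // 10
        # Leftover carry of this partial product goes one position further.
        result[i + len(b_digits)] += carry
    return result, single_digit_products


digits, ops = grade_school_multiply(to_digits(123), to_digits(456))
print(from_digits(digits), ops)
```

For two n-digit inputs the inner statement runs n · n times, which is where the quadratic bound comes from.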

Divide-and-Conquer
Divide-and-Conquer paradigm

• The Divide-and-Conquer algorithm design paradigm works in 3 steps:

1. Divide the problem into smaller subproblems.

2. Conquer (i.e., solve) the smaller subproblems.

3. Combine the solutions of the smaller subproblems to obtain the solution of the bigger problem.

• Analysing the time complexity of a divide-and-conquer algorithm normally involves solving a recurrence relation.

• Normally a polynomial time solution to the problem is already known, and the divide-and-conquer strategy is used to reduce the time complexity to a lower polynomial.
First improvement idea

• Problem: multiply two n-digit numbers a and b in sub-quadratic time, given addition and multiplication of single-digit numbers as the basic operations.

• Improvement idea: split each number into its n/2 most significant digits and its n/2 least significant digits, and do recursive calls with them.

• From math we know that:

$$a \cdot b = (a_M \cdot 10^{n/2} + a_L)(b_M \cdot 10^{n/2} + b_L)$$
$$= a_M \cdot b_M \cdot 10^{n} + a_M \cdot b_L \cdot 10^{n/2} + a_L \cdot b_M \cdot 10^{n/2} + a_L \cdot b_L$$
Are we making progress?

$$a \cdot b = a_M \cdot b_M \cdot 10^{n} + a_M \cdot b_L \cdot 10^{n/2} + a_L \cdot b_M \cdot 10^{n/2} + a_L \cdot b_L$$

• Reduce 1 instance of the problem of size n to 4 instances of size n/2: $a_M \times b_M$, $a_M \times b_L$, $a_L \times b_M$ and $a_L \times b_L$.

• Are we making progress?

• Not really! Intuition: there are 4 instances that will each take about 1/4 of the time that was necessary to solve the original problem, so the overall time stays in the same order. You can later check that by solving the recurrence T(n) = 4 · T(n/2) + c · n.

• If we want to follow this approach to improve the efficiency, we should use at most 3 recursive calls.
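As a sketch of that check, the recurrence can be unrolled as follows, assuming n is a power of 2 and a base case T(1) = b (the constant b is an assumption borrowed from the notation used later in the seminar):

```latex
\begin{align*}
T(n) &= 4\,T(n/2) + c\,n \\
     &= 16\,T(n/4) + 2c\,n + c\,n \\
     &= 64\,T(n/8) + 4c\,n + 2c\,n + c\,n \\
     &= 4^{k}\,T(n/2^{k}) + c\,n\,(2^{k} - 1) \\
     &= b\,n^{2} + c\,n\,(n - 1) \qquad \text{(setting } k = \log_2 n\text{)} \\
     &= \Theta(n^{2})
\end{align*}
```

So four recursive calls on half-size inputs give no asymptotic gain over the grade school algorithm.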
Are improvements possible?

• Andrey Kolmogorov, one of the greatest mathematicians of the 20th century, discussed this question in his seminar.

• Anatoly Karatsuba, then a 23 y/o student, found an improvement within one week of hearing that in the seminar.
Karatsuba’s algorithm

$$a \cdot b = a_M \cdot b_M \cdot 10^{n} + (a_M \cdot b_L + a_L \cdot b_M) \cdot 10^{n/2} + a_L \cdot b_L$$

• Do only 3 recursive calls to compute:

(1) $a_M \times b_M$
(2) $a_L \times b_L$
(3) $(a_M + a_L) \times (b_M + b_L)$

• Given the results of (1), (2) and (3), if we can trivially obtain $(a_M \cdot b_L + a_L \cdot b_M)$ then we are done computing a · b with only 3 recursive calls.
The trick

• Problem: Obtain $c = (a_M \cdot b_L + a_L \cdot b_M)$ without further recursive calls when given:

(1) $a_M \times b_M$
(2) $a_L \times b_L$
(3) $(a_M + a_L) \times (b_M + b_L)$

• Note that

$$(a_M + a_L) \cdot (b_M + b_L) = a_M \cdot b_M + a_M \cdot b_L + a_L \cdot b_M + a_L \cdot b_L = c + a_M \cdot b_M + a_L \cdot b_L$$

• Solution: We obtain c by computing (3) − (1) − (2).

• Where does this trick come from?
Where does this trick come from?

• This trick traces back to Gauss’ method for multiplying complex numbers using 3 multiplications of real numbers instead of 4.

Johann Carl Friedrich Gauss

• Adapting previous ideas to solve your new problem can be very useful!
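A small Python sketch of the three-multiplication idea for complex numbers follows. The function name and the tuple representation are assumptions for illustration; the identity used is (a + bi)(c + di) = (ac − bd) + (ad + bc)i, with ad + bc recovered as (a + b)(c + d) − ac − bd.

```python
def complex_multiply_3(a, b, c, d):
    """Multiply (a + bi) by (c + di) using three real multiplications.

    Returns the pair (real part, imaginary part).
    """
    p1 = a * c                # first real multiplication
    p2 = b * d                # second real multiplication
    p3 = (a + b) * (c + d)    # third real multiplication
    real = p1 - p2
    imag = p3 - p1 - p2       # equals a*d + b*c, same trick as (3) - (1) - (2)
    return real, imag


print(complex_multiply_3(1, 2, 3, 4))  # (1 + 2i)(3 + 4i) = -5 + 10i
```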

Is that a big improvement?

• The time complexity of Karatsuba’s algorithm is $O(n^{1.59})$. To verify that, just solve the recurrence T(n) = 3 · T(n/2) + c · n and use the fact that $\log_2 3 < 1.59$.

• Don’t underestimate the difference between $n^2$ and $n^{1.59}$!

• This is the algorithm that Python uses for multiplying large numbers.
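The three-call scheme can be sketched in Python as below. This is an illustrative version under stated assumptions, not the routine used inside Python itself: the name `karatsuba`, the use of `str` to count digits, and the fallback to built-in multiplication once an operand has a single digit are all choices made for this example.

```python
def karatsuba(a: int, b: int) -> int:
    """Multiply two non-negative integers with three recursive calls per level."""
    # Base case (assumption): stop once either operand has a single digit.
    if a < 10 or b < 10:
        return a * b
    # Split around half the digit count of the longer operand.
    half = max(len(str(a)), len(str(b))) // 2
    base = 10 ** half
    a_m, a_l = divmod(a, base)   # most / least significant parts of a
    b_m, b_l = divmod(b, base)   # most / least significant parts of b
    p1 = karatsuba(a_m, b_m)                 # (1)
    p2 = karatsuba(a_l, b_l)                 # (2)
    p3 = karatsuba(a_m + a_l, b_m + b_l)     # (3)
    c = p3 - p1 - p2                         # a_m * b_l + a_l * b_m
    return p1 * base * base + c * base + p2


print(karatsuba(1234, 5678) == 1234 * 5678)
```

Multiplying by powers of 10 is a digit shift, so the work outside the three recursive calls is linear in the number of digits, matching the c · n term of the recurrence.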
Merge Sort

• O(n log n) sorting algorithm presented by John von Neumann.

• One of the first explicit uses of the Divide-and-Conquer paradigm.

• Python’s standard sort is Timsort (a hybrid stable sorting algorithm derived from Merge Sort and Insertion Sort).

John von Neumann, one of the greatest mathematicians of the 20th century.

Merge sort

function merge_sort(array[lo..hi])
    if hi > lo then
        mid = ⌊(lo + hi)/2⌋
        merge_sort(array[lo..mid])
        merge_sort(array[mid + 1..hi])
        array[lo..hi] = merge(array[lo..mid], array[mid + 1..hi])
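A runnable Python sketch of the same idea is shown below. It returns a new sorted list instead of overwriting array[lo..hi], and the `merge` helper is written out explicitly; both are choices made for this example rather than details fixed by the pseudocode.

```python
def merge(left, right):
    """Merge two sorted lists into one sorted list (stable)."""
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        # Taking from the left on ties keeps equal elements in original order.
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged


def merge_sort(array):
    """Return a sorted copy of array using Divide-and-Conquer."""
    if len(array) <= 1:
        return list(array)
    mid = len(array) // 2
    # Divide, conquer each half recursively, then combine by merging.
    return merge(merge_sort(array[:mid]), merge_sort(array[mid:]))


print(merge_sort([5, 2, 4, 7, 1, 3, 2, 6]))
```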
Quick Sort

• Sorting algorithm created by Tony Hoare that uses the Divide-and-Conquer paradigm, but the subproblems are not necessarily of the same size.

Tony Hoare (Turing Award 1980)

• Seminar 3 will cover it in detail.
Other Divide-and-Conquer examples

The Divide-and-Conquer paradigm can be applied to get efficient algorithms for a wide range of problems, such as:

• Finding the closest pair of points in a plane in O(n log n).

• Counting inversions in O(n log n) (Applied Class next week).

• Improving matrix multiplication (Strassen’s algorithm).

• Fast Fourier Transform: this algorithm, published by James Cooley and John Tukey in 1965, is one of the most influential algorithms, with a wide range of applications in engineering, music, science, mathematics, etc.

◦ In fact, it can be traced back to unpublished work by Gauss.
Complexity Analysis
Time complexity

• Time complexity is the amount of time taken by an algorithm to run as a function of the input size.

◦ Worst-case complexity (our main focus).

◦ Best-case complexity.

◦ Average-case complexity.
Asymptotic notation

Big-O Notation
It is said that f(n) = O(g(n)) if there are positive constants c and $n_0$ such that f(n) ≤ c · g(n) for all n ≥ $n_0$.

Big-Ω Notation
It is said that f(n) = Ω(g(n)) if there are positive constants c and $n_0$ such that f(n) ≥ c · g(n) for all n ≥ $n_0$.

Big-Θ Notation
It is said that f(n) = Θ(g(n)) if, and only if, f(n) = O(g(n)) and f(n) = Ω(g(n)).
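As a worked illustration of these definitions (the function f(n) = 3n² + 5n is chosen only for this example), suitable constants can be exhibited directly:

```latex
\begin{align*}
f(n) &= 3n^{2} + 5n \\
\text{For } n \ge 5:\quad 5n \le n^{2}
  &\;\Rightarrow\; f(n) \le 4n^{2}
  &&\text{so } f(n) = O(n^{2}) \text{ with } c = 4,\ n_0 = 5 \\
\text{For } n \ge 1:\quad 5n \ge 0
  &\;\Rightarrow\; f(n) \ge 3n^{2}
  &&\text{so } f(n) = \Omega(n^{2}) \text{ with } c = 3,\ n_0 = 1 \\
\text{Hence } f(n) &= \Theta(n^{2})
\end{align*}
```

The constants are not unique; any larger c or $n_0$ would also witness the Big-O bound.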

Space complexity

• Space complexity is the total amount of space taken by an algorithm as a function of input size.

• Auxiliary space complexity is the amount of space taken by an algorithm excluding the space taken by the input.

◦ Many textbooks and online resources do not distinguish between the above two terms and use the term “space complexity” when they are in fact referring to auxiliary space complexity.

◦ In this unit, we use these two terms to differentiate between them.
In-place algorithm

• An in-place algorithm has O(1) auxiliary space complexity.

◦ In other words, it only requires constant space in addition to the space taken by its input.

◦ Merging is not an in-place algorithm, as it needs to create the output list, which has size n.

◦ Be mindful that some books use a different definition (e.g., space taken by recursion may be ignored). For the sake of this unit, we will use the above definition.
Time complexity of Binary Search

Binary Search

function binary_search(array[1..n], key)
    lo = 1 and hi = n + 1
    while lo < hi − 1 do
        mid = ⌊(lo + hi)/2⌋
        if key ≥ array[mid] then lo = mid
        else hi = mid
    if array[lo] = key then return lo
    else return null

• Worst-case time complexity?

◦ Search space at start: n

◦ After 1st iteration: n/2

◦ After 2nd iteration: n/4

◦ ...

◦ After x iterations: 1

◦ How many iterations?

◦ O(log n)

• Best-case time complexity?

◦ O(1), for a variant that returns as soon as array[mid] = key. The version above has no early exit, so its loop always runs about log n iterations.
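The pseudocode translates to Python roughly as follows. This sketch uses 0-based indexing and an explicit empty-list check, which are adaptations for Python rather than part of the slide; the loop keeps the same idea that key, if present, lies in array[lo..hi − 1].

```python
def binary_search(array, key):
    """Return an index of key in the sorted list array, or None if absent."""
    if not array:
        return None
    lo, hi = 0, len(array)      # search space is array[lo..hi-1]
    while lo < hi - 1:
        mid = (lo + hi) // 2
        if key >= array[mid]:
            lo = mid            # key, if present, is at mid or to its right
        else:
            hi = mid            # key, if present, is strictly left of mid
    return lo if array[lo] == key else None


values = [1, 5, 8, 17, 22, 27, 31, 32, 36, 41]
print(binary_search(values, 27), binary_search(values, 2))
```

Only lo, hi and mid are stored besides the input, which is the O(1) auxiliary space discussed for this algorithm.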

Space complexity of Binary Search

For the Binary Search pseudocode from the previous slide:

• Space complexity?

◦ O(n)

• Auxiliary space complexity?

◦ O(1)

• It is an in-place algorithm!
What is the time complexity?

• Problem: Given a sorted array of n unique numbers and two values x and y, print all numbers that are in between x and y.

• Algorithm: Binary search to find the smallest number greater than x. Linear scan from that number until the next number is ≥ y.

• What is the time complexity?

• O(n) because all numbers may be between x and y.

• But it seems to really depend on the output ...
Output-sensitive time complexity

• Output-sensitive time complexity is the time complexity that also depends on the size of the output.

Example: x = 23, y = 35

1 5 8 17 22 27 31 32 36 41

• Algorithm: Binary search to find the smallest number greater than x. Linear scan from that number until the next number is ≥ y.

• Let w be the number of values in the range (i.e., in the output). What is the output-sensitive time complexity of the algorithm?

• O(w + log n). Note that w may be n in the worst case.

• Output-sensitive complexity is only relevant when the output size may vary, e.g., it is not relevant for sorting, finding the minimum value, etc.
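The range-reporting algorithm can be sketched in Python with the standard library `bisect` module. The function name `numbers_between` is an assumption for this example, and the result is returned as a list instead of being printed.

```python
from bisect import bisect_right


def numbers_between(array, x, y):
    """Return all values v of the sorted list array with x < v < y."""
    # Binary search: index of the smallest number greater than x, O(log n).
    i = bisect_right(array, x)
    result = []
    # Linear scan until the next number is >= y, O(w) for w reported values.
    while i < len(array) and array[i] < y:
        result.append(array[i])
        i += 1
    return result


print(numbers_between([1, 5, 8, 17, 22, 27, 31, 32, 36, 41], 23, 35))
```

On the example above the scan touches only 27, 31, 32 and stops at 36, so the running time tracks w + log n and not n.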

Solving Recurrence Relations
Recurrence relation

• A recurrence relation is an equation that recursively defines a sequence of values, with one or more base cases given, e.g.:

T(1) = b
T(n) = T(n − 1) + c

• The complexity of a recursive algorithm can be analysed by writing its recurrence relation and then solving it.
Solving a simple recurrence relation for time complexity

Power1

function power1(x, n)
    if n = 0 then return 1
    else if n = 1 then return x
    else return x · power1(x, n − 1)

Goal
Reduce general case to be in terms of the base case.

• Cost when n = 1:

◦ T(1) = b for constant b

• Cost for general case:

◦ T(n) = T(n − 1) + c for constant c

◦ T(n) = T(n − 2) + 2c

◦ T(n) = T(n − 3) + 3c

◦ Pattern?

◦ T(n) = T(n − k) + c · k

◦ Set k = n − 1 to get base case.

Solution
T(n) = b + c · (n − 1)
     = c · n + b − c
     = O(n)
Checking solution by substitution

Proposed solution for power1
T(n) = b + c · (n − 1)
     = c · n + b − c
     = O(n)

• Cost when n = 1:

◦ T(1) = b for constant b

• We have that:

T(1) = c · 1 + b − c
     = b

• Cost for general case:

◦ T(n) = T(n − 1) + c for constant c

• We have that:

T(n − 1) + c = c · (n − 1) + b − c + c
             = c · n + b − c
             = T(n)
Space complexity of this power function

Power1

function power1(x, n)
    if n = 0 then return 1
    else if n = 1 then return x
    else return x · power1(x, n − 1)

Power2

function power2(x, n)
    result = 1
    for i ← 1 to n do
        result = result · x
    return result

• Space complexity of power1?

◦ Total space usage = local space used by the function × maximum depth of recursion

◦ O(n)

• We will not discuss tail-recursion in this unit because it is language specific, e.g., Python does not optimise tail calls.

• Auxiliary space complexity?

◦ power1(x, n) is not in-place.

◦ Iterative version power2(x, n) is in-place.
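Both functions translate almost line by line into Python. The sketch below is for illustration; the function names follow the pseudocode.

```python
def power1(x, n):
    """Recursive power: one stack frame per unit of n, so O(n) auxiliary space."""
    if n == 0:
        return 1
    if n == 1:
        return x
    return x * power1(x, n - 1)


def power2(x, n):
    """Iterative power: a single accumulator besides the inputs."""
    result = 1
    for _ in range(n):
        result = result * x
    return result


print(power1(2, 10), power2(2, 10))
```

The O(n) recursion depth of power1 is visible in practice: CPython limits the depth of the call stack (1000 frames by default), so power1 raises RecursionError for large n while power2 does not.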

Yet another power function

Power3

function power3(x, n)
    if n = 0 then return 1
    else if n = 1 then return x
    y = power3(x · x, ⌊n/2⌋)
    if n even then return y
    else return x · y

Goal
Reduce general case to be in terms of the base case.

• Cost when n = 1:

◦ T(1) = b for constant b

• Cost for general case:

◦ T(n) = T(n/2) + c for constant c

◦ T(n) = T(n/4) + 2c

◦ T(n) = T(n/8) + 3c

◦ Pattern?

◦ $T(n) = T(n/2^k) + c \cdot k$

◦ Set k = log n to get base case.

Solution
T(n) = b + c · log n
     = O(log n)
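A Python sketch of power3 follows, together with a hypothetical helper `power3_calls` that mirrors the recursion to count how many calls are made; the helper is only an illustration aid and not part of the seminar material.

```python
def power3(x, n):
    """Power by repeated squaring: the exponent halves at every call."""
    if n == 0:
        return 1
    if n == 1:
        return x
    y = power3(x * x, n // 2)
    return y if n % 2 == 0 else x * y


def power3_calls(n):
    """Number of calls power3 makes for exponent n (illustrative helper)."""
    if n <= 1:
        return 1
    return 1 + power3_calls(n // 2)


print(power3(2, 10), power3(3, 5), power3_calls(1024))
```

For n = 1024 the exponent goes through 1024, 512, ..., 1, which is 11 calls, in line with the b + c · log n solution.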

Another check by substitution

Proposed solution for power3
T(n) = b + c · log n
     = O(log n)

• Cost when n = 1:

◦ T(1) = b for constant b

• We have that:

T(1) = b + c · log 1
     = b

• Cost for general case:

◦ T(n) = T(n/2) + c for constant c

• We have that (with logarithms in base 2):

T(n/2) + c = b + c · log(n/2) + c
           = b + c · (log n − log 2) + c
           = b + c · log n
           = T(n)
Recurrence and complexity

• Recurrence relation:

T(n) = T(n/2) + c
T(1) = b

• Algorithmic example?

◦ Binary search

• Asymptotic complexity?

◦ O(log n)
Recurrence and complexity

• Recurrence relation:

T(n) = T(n − 1) + c
T(1) = b

• Algorithmic example?

◦ Linear search

• Asymptotic complexity?

◦ O(n)
Recurrence and complexity

• Recurrence relation:

T(n) = 2 · T(n/2) + c · n
T(1) = b

• Algorithmic example?

◦ Merge Sort

• Asymptotic complexity?

◦ O(n log n)
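One way to see where the n log n comes from is to unroll the recurrence, assuming for simplicity that n is a power of 2:

```latex
\begin{align*}
T(n) &= 2\,T(n/2) + c\,n \\
     &= 4\,T(n/4) + 2c\,n \\
     &= 8\,T(n/8) + 3c\,n \\
     &= 2^{k}\,T(n/2^{k}) + k\,c\,n \\
     &= b\,n + c\,n\log_2 n \qquad \text{(setting } k = \log_2 n\text{)} \\
     &= O(n \log n)
\end{align*}
```

Each of the log₂ n levels of recursion contributes c · n of merging work, and the n base cases contribute b each.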

Recurrence and complexity

• Recurrence relation:

T(n) = T(n − 1) + c · n
T(1) = b

• Algorithmic example?

◦ Selection Sort

• Asymptotic complexity?

◦ $O(n^2)$
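Unrolling this recurrence down to the base case gives an arithmetic series, which is a quick way to confirm the quadratic bound:

```latex
\begin{align*}
T(n) &= T(n-1) + c\,n \\
     &= T(n-2) + c\,(n-1) + c\,n \\
     &= T(1) + c\,\bigl(2 + 3 + \dots + n\bigr) \\
     &= b + c\left(\frac{n(n+1)}{2} - 1\right) \\
     &= O(n^{2})
\end{align*}
```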

Recurrence and complexity

• Recurrence relation:

T(n) = 2 · T(n − 1) + c
T(0) = b

• Algorithmic example?

◦ Naive recursive Fibonacci

• Asymptotic complexity?

◦ $O(2^n)$
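The exponential growth can be observed with a small Python sketch. Note that naive Fibonacci recurses on n − 1 and n − 2, so the recurrence 2 · T(n − 1) + c over-counts its work and $O(2^n)$ is an upper bound for it. The call-counting helper `fib_calls` is a hypothetical addition for illustration.

```python
def fib(n):
    """Naive recursive Fibonacci with fib(0) = 0 and fib(1) = 1."""
    if n < 2:
        return n
    return fib(n - 1) + fib(n - 2)


def fib_calls(n):
    """Number of calls that fib(n) makes, including the initial one."""
    if n < 2:
        return 1
    return 1 + fib_calls(n - 1) + fib_calls(n - 2)


for n in (5, 10, 15, 20):
    # The number of calls stays below 2^(n+1) but grows exponentially.
    print(n, fib(n), fib_calls(n), 2 ** (n + 1))
```

Solving the slide's recurrence exactly gives T(n) = 2ⁿ · b + c · (2ⁿ − 1), which matches the stated bound.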

Reading

• Course Notes: Sections 1.2, 1.3 and 1.4, Chapter 2

• Additional resources (not required, and not necessarily covering all topics). Contents related to this seminar and a recap of previous units can be found in standard algorithms books such as:

◦ CLRS: Chapters 3 and 4

◦ KT: Chapters 2 and 5

◦ Rou: Chapters 1 to 4
Concluding remarks

• This unit demands your efforts from Week 1.

• The Divide-and-Conquer algorithm design paradigm can be useful for reducing time complexity.

• Coming up next:

◦ Analysis of algorithms

◦ Non-comparison based sorting (Counting Sort, Radix Sort)

• Preparation required before the next week:

◦ Revise computational complexity covered in earlier units.

◦ Complete Preparation 1 (in your own time to self-assess your prerequisite knowledge).

◦ Complete Preparation 2 before your Applied Class of Week 2.
