Week-3
Python
While this function works, it's difficult to understand at a glance. What does calc do? What do
x and y represent? This lack of clarity is the hallmark of poor design.
Essentially, the function is not readable, maintainable, or safe to use without careful
inspection of its internal code.
3. Improvement
Let's refactor the poorly designed function to follow good design principles.
First, we'll give the function and its parameters descriptive names. This immediately
improves readability.
Python
# Before (poor)
def calc(x, y):
    return x / y

# After (good)
def calculate_ratio(numerator, denominator):
    return numerator / denominator
Now, a reader can immediately guess that the function calculates a ratio and understands
the roles of numerator and denominator. The readability is dramatically improved without
even changing the logic.
A docstring is a string literal placed as the first statement in a function's body that
explains what the function does.
Python
def calculate_ratio(numerator, denominator):
    """
    Calculates the ratio of two numbers.

    Args:
        numerator (int or float): The number to be divided.
        denominator (int or float): The number to divide by.

    Returns:
        float: The result of the division, numerator / denominator.
    """
    # We still haven't made it robust, but it's now well-documented.
    return numerator / denominator
Python
def check(n):
    if n % 2 == 0:
        return True
    else:
        return False
Critique:
1. Vague Name: The function name check is meaningless. What is it checking for? We
have to read the code to learn that it checks if a number is even. A better name
would be is_even.
2. Verbose Logic: The structure if condition: return True else: return False is
redundant. The expression n % 2 == 0 itself evaluates to either True or False. You
can just return the result of that expression directly.
3. No Documentation: There is no docstring to explain its purpose.
4. Not Robust: It assumes n will always be an integer. check("hello") would raise a
TypeError, because % applied to a string is the string-formatting operator.
5. Further Improvements
Let's improve our friend's function using modern tools and techniques.
If we feed the check(n) function to a GenAI like ChatGPT with the prompt "Improve this
Python function," it would likely suggest:
Python
def is_even(n: int) -> bool:
    """
    Checks if an integer is even.

    Args:
        n (int): The integer to check.

    Returns:
        bool: True if n is even, False otherwise.
    """
    return n % 2 == 0
This is a huge improvement! It has a better name, a docstring, type hints (n: int), and
simplified logic. However, the AI made an assumption: it assumed the function was only for
integers and added a type hint to reflect that. It didn't add code to handle non-integer inputs.
To ensure a function works as expected, we write test cases. These are simple checks that
verify the output for a given input.
Python
If any of these assert statements fails, Python raises an AssertionError and the program
stops, telling us our function is buggy.
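As a minimal sketch of such test cases, assuming the improved function is named is_even as suggested in the critique above (the specific inputs are illustrative choices):

```python
def is_even(n: int) -> bool:
    """Return True if n is even, False otherwise."""
    return n % 2 == 0


# Each assert passes silently when its condition holds and raises
# AssertionError when it does not.
assert is_even(4) is True
assert is_even(7) is False
assert is_even(0) is True     # zero is even
assert is_even(-2) is True    # negative even numbers count too
print("All tests passed.")
```

Picking inputs that include zero and a negative number helps cover cases a quick manual check might skip.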
The technical prompt gives the AI specific constraints, leading to a more robust and
predictable result.
6. Simple Conditional Statements
Conditional statements allow a program to make decisions and execute different blocks of
code based on whether a condition is true or false. The basic structure is if, elif (else if), and
else.
Python
# Test cases
print(get_rating_category(9.1)) # Output: Excellent
print(get_rating_category(7.5)) # Output: Good
print(get_rating_category(4.2)) # Output: Average or below
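The function under test might look like the following sketch. The name get_rating_category comes from the test cases above; the thresholds 9.0 and 7.0 are assumptions, chosen only to be consistent with the three outputs shown.

```python
def get_rating_category(rating: float) -> str:
    """Map a numeric rating to a category using if / elif / else."""
    if rating >= 9.0:        # assumed threshold
        return "Excellent"
    elif rating >= 7.0:      # assumed threshold
        return "Good"
    else:
        return "Average or below"


print(get_rating_category(9.1))  # Excellent
print(get_rating_category(7.5))  # Good
print(get_rating_category(4.2))  # Average or below
```

Because the branches are checked from top to bottom, the elif is only reached when the first condition is false, so exactly one category is returned.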
A doctest is an example written directly inside a docstring that shows how a function should
be used.
Let's say we asked an AI to write a function to check if a number is positive, and it produced
this:
Python
def is_positive(n):
    """
    Checks if a number is positive.
    A number is positive if it is strictly greater than 0.

    >>> is_positive(10)
    True
    >>> is_positive(-5)
    False
    >>> is_positive(0)  # This doctest will fail!
    False
    """
    # The bug is here: >= includes 0, but the docstring says "strictly greater".
    return n >= 0
The Critique:
● The docstring correctly states that a positive number is "strictly greater than 0" and
that is_positive(0) should be False.
● The code, however, implements the logic as n >= 0, which means it considers 0 to be
positive and returns True for an input of 0.
● When we run the doctests, the test is_positive(0) will fail. The observed output
(True) does not match the expected output (False).
This mismatch is a clear signal that the code is wrong. Always trust the specification (the
docstring) over the implementation.
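One way to run such doctests programmatically is with Python's standard doctest module. The sketch below is illustrative: the _buggy and _fixed suffixes and the helper count_doctest_failures are hypothetical names introduced here, not part of the original example.

```python
import doctest


def is_positive_buggy(n):
    """
    >>> is_positive_buggy(10)
    True
    >>> is_positive_buggy(0)
    False
    """
    return n >= 0  # bug: 0 is treated as positive


def is_positive_fixed(n):
    """
    >>> is_positive_fixed(10)
    True
    >>> is_positive_fixed(-5)
    False
    >>> is_positive_fixed(0)
    False
    """
    return n > 0  # strictly greater than 0, as the docstring specifies


def count_doctest_failures(func):
    """Run the doctests in func's docstring and return how many failed."""
    finder = doctest.DocTestFinder()
    runner = doctest.DocTestRunner(verbose=False)
    globs = {func.__name__: func}  # names the doctest examples may use
    for test in finder.find(func, name=func.__name__, globs=globs):
        runner.run(test)
    return runner.summarize(verbose=False).failed


print(count_doctest_failures(is_positive_buggy))  # 1 (a failure report is printed first)
print(count_doctest_failures(is_positive_fixed))  # 0
```

In everyday use, calling doctest.testmod() at the bottom of a module, or running python -m doctest on the file, does the same job with less code.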
Word Problem: A theme park offers a discount to a visitor if they meet one of the following
criteria:
1. They are a child (age 12 or under) and are visiting on a weekday.
2. They are a senior (age 65 or over), regardless of the day.
Python
    Args:
        age (int): The visitor's age.
        is_weekday (bool): True if it is a weekday, False otherwise.

    Returns:
        bool: True if the visitor gets a discount, False otherwise.
    """
    # The conditions are grouped with parentheses for clarity.
    is_child_on_weekday = (age <= 12 and is_weekday)
    is_senior = (age >= 65)
    if is_child_on_weekday or is_senior:
        return True
    else:
        return False
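A complete, runnable version of this idea might look like the sketch below. The function name gets_discount is hypothetical, and the final if/else is collapsed into a direct return of the boolean expression, following the earlier advice about verbose logic.

```python
def gets_discount(age: int, is_weekday: bool) -> bool:
    """Return True if the visitor qualifies for the theme park discount."""
    is_child_on_weekday = age <= 12 and is_weekday
    is_senior = age >= 65
    # The combined condition is already a bool, so it can be returned directly.
    return is_child_on_weekday or is_senior


# Boundary ages from the word problem: 12 and 65 are both inclusive.
print(gets_discount(12, True))    # True: child on a weekday
print(gets_discount(12, False))   # False: child on a weekend
print(gets_discount(13, True))    # False: too old for the child discount
print(gets_discount(65, False))   # True: senior, any day
```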
To solve a refute problem, you need to manually trace the function's execution for different
inputs, especially edge cases.
Python
def is_multiple(m: int, n: int):
    """
    Returns True if m is a multiple of n.

    >>> is_multiple(10, 2)
    True
    >>> is_multiple(9, 2)
    False
    >>> is_multiple("10", "2")  # Doctest uses strings!
    True
    """
    return m % n == 0
The Mismatch:
● The function signature def is_multiple(m: int, n: int) clearly states that the function is
designed to work with integers.
● However, the third doctest, is_multiple("10", "2"), passes strings as arguments.
● This suggests a conflict: either the type hints are wrong, or the doctest is wrong, or
the code itself has a bug when handling these unexpected types.
After the line def is_multiple(...) has been executed, memory looks something like this:
● Global Frame:
○ is_multiple ---> [function object at memory address 0x...]
At this point, the expression m % n == 0 has not been evaluated. The function is simply
defined and waiting to be called.
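A small experiment can make this visible. In the sketch below, divide_by_zero is a hypothetical helper whose body would fail if it were evaluated at definition time; it is included only to show that nothing runs until the call.

```python
def is_multiple(m: int, n: int) -> bool:
    return m % n == 0


# Defining the function only binds the name to a function object.
print(callable(is_multiple))   # True
print(is_multiple.__name__)    # is_multiple


def divide_by_zero():
    return 1 / 0  # would raise ZeroDivisionError, but only when called


# No error so far: the body above has not been evaluated yet.
try:
    divide_by_zero()
except ZeroDivisionError:
    print("The body ran only when the function was called.")
```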
13. Visualizing a Function Call
When you call a function, like is_multiple(10, 2), a new, temporary workspace is created in
memory called a frame.
When the problematic doctest is_multiple("10", "2") is called, the same process happens.
The assignments are m = "10" and n = "2". The Python interpreter itself does not check the
type hints at runtime. It happily assigns strings to m and n.
This is why tools called static type checkers (like mypy) are important. They can analyze
your code before you run it and warn you about this type mismatch, saying, "You called
is_multiple with strings, but it expected integers."
The doctest is incorrect. It claims the function should return True, but in reality, the function
crashes. The bug is that the code cannot handle the string inputs that the doctest claims it
can.
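The crash can be reproduced directly. In this sketch, describe_call is a hypothetical helper that reports either the result or the exception; with string operands, % is the string-formatting operator, which is why a TypeError is raised instead of a remainder being computed.

```python
def is_multiple(m: int, n: int) -> bool:
    return m % n == 0


def describe_call(m, n):
    """Return the result of is_multiple(m, n) as text, or the error it raises."""
    try:
        return repr(is_multiple(m, n))
    except TypeError as exc:
        return f"TypeError: {exc}"


print(describe_call(10, 2))       # True
print(describe_call("10", "2"))   # a TypeError message; the hints were not enforced
```

The type hints did nothing to stop the second call; only the % operation itself rejected the strings.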
15. Fixing the code
To fix the code so that it can pass the problematic doctest, we must ensure that the values
are integers before we use the modulo operator. We can do this by explicitly converting the
parameters to integers using the int() function.
Python
def is_multiple(m, n):
    """
    >>> is_multiple(10, 2)
    True
    >>> is_multiple(9, 2)
    False
    >>> is_multiple("10", "2")
    True
    """
    # Fix: Convert parameters to integers before the operation.
    return int(m) % int(n) == 0
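A self-contained sketch of this fix is shown below, with the signature assumed (type hints omitted, since the function now accepts strings as well). It also shows one limitation worth knowing: int() raises a ValueError for strings that do not represent an integer, so the conversion fixes the doctest without making the function accept arbitrary input.

```python
def is_multiple(m, n) -> bool:
    """Return True if m is a multiple of n, converting both to int first."""
    return int(m) % int(n) == 0


print(is_multiple(10, 2))       # True
print(is_multiple(9, 2))        # False
print(is_multiple("10", "2"))   # True: the strings are converted first

try:
    is_multiple("ten", 2)
except ValueError:
    print("int() cannot convert 'ten', so a ValueError is raised.")
```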
Python
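def is_multiple_v2(m: int, n: int) -> bool:
    """
    Returns True if m is a multiple of n. Handles n=0 by returning False.

    >>> is_multiple_v2(10, 2)
    True
    >>> is_multiple_v2(10, 3)
    False
    >>> is_multiple_v2(10, 0)
    False
    >>> is_multiple_v2(0, 5)
    True
    """
    if n == 0:
        return False
    # This check for m==0 is redundant.
    if m == 0:
        return True
    return m % n == 0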
The doctests all pass, so if there is a bug, it must be in a case not covered by the tests.
18. Fixing the error, Simplification 1
The previous function's doctests all passed, suggesting it might be correct. However, good
code is not just correct, it's also simple and clear. The is_multiple_v2 function can be
simplified.
Consider the call is_multiple_v2(0, 5):
1. n == 0 is false.
2. m == 0 is true. The function returns True.
Now let's see what happens without that special check for m=0. The expression 0 % 5
evaluates to 0. So 0 % 5 == 0 is True. The final line of the original function would have
handled this case correctly!
In this specific case, we didn't find a counterexample that proved the function was wrong, but
we found a redundancy that made the code unnecessarily complex. Removing redundant
code is a key part of refactoring and maintenance.
Simplification 1
By removing the redundant check, we get a simpler, cleaner function that has the exact
same behavior.
Python
# Simplified version
def is_multiple_v3(m: int, n: int) -> bool:
    """
    Returns True if m is a multiple of n. Handles n=0 by returning False.
    """
    if n == 0:
        return False
    # The final line correctly handles the m=0 case.
    return m % n == 0
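A quick way to gain confidence that a simplification preserves behavior is to compare both versions over a range of inputs. The sketch below does this for small integers; the range of -20 to 20 is an arbitrary choice, and such a check supports the claim without proving it for every integer.

```python
def is_multiple_v2(m: int, n: int) -> bool:
    if n == 0:
        return False
    if m == 0:  # the redundant special case
        return True
    return m % n == 0


def is_multiple_v3(m: int, n: int) -> bool:
    if n == 0:
        return False
    return m % n == 0


# Collect every input pair on which the two versions disagree.
mismatches = [
    (m, n)
    for m in range(-20, 21)
    for n in range(-20, 21)
    if is_multiple_v2(m, n) != is_multiple_v3(m, n)
]
print(mismatches)  # []
```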
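To prove a function is buggy, you must provide a complete refutation, which consists of a
counterexample input, the observed (incorrect) output, and the expected output according to
the function's specification.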
Example:
● Function: A buggy grading function.
Python
def get_grade(score):
    """Returns 'A' for >= 90, 'B' for >= 80, 'C' for >= 70."""
    if score >= 70:
        return 'C'
    elif score >= 80:
        return 'B'
    elif score >= 90:
        return 'A'
● Answer:
○ Counterexample: 95
○ Observed Output: 'C'
○ Expected Output: 'A'
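The refutation above can be checked mechanically, and it also points to a repair: test the most restrictive condition first. In this sketch the names get_grade_buggy and get_grade_fixed are hypothetical, and scores below 70 are left returning None because the docstring does not say what they should produce.

```python
def get_grade_buggy(score):
    """Returns 'A' for >= 90, 'B' for >= 80, 'C' for >= 70."""
    if score >= 70:      # matches every passing score, so later branches never run
        return 'C'
    elif score >= 80:
        return 'B'
    elif score >= 90:
        return 'A'


def get_grade_fixed(score):
    """Returns 'A' for >= 90, 'B' for >= 80, 'C' for >= 70."""
    if score >= 90:      # most restrictive condition first
        return 'A'
    elif score >= 80:
        return 'B'
    elif score >= 70:
        return 'C'
    # Scores below 70 are unspecified; like the original, this returns None.


counterexample = 95
print(counterexample, get_grade_buggy(counterexample), get_grade_fixed(counterexample))
# 95 C A
```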
20. Refuting the median function
Let's analyze a buggy function that claims to find the median of three numbers, along with
the kind of feedback a testing tool like "CodeCheck" might provide.
Python
An automated testing tool would run the function against many inputs and report the first
failure it finds. The feedback would look like this:
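Trace: When called with a=10, b=1, c=5, the first if condition (1 < 10 < 5) is
false. The second if condition (10 < 1 < 5) is false. The function then falls through and
returns c, which is 5. The sorted list would be [1, 5, 10], so the median is 5, which means
this particular input does not expose the bug. Assuming the function has only these two
checks, a counterexample needs both conditions to be false while c is not the middle value:
for a=5, b=10, c=1 the function returns 1, but the median is 5.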
A much simpler, more declarative approach is to use Python's built-in sorted() function.
Python
This style of code is often easier to verify as correct. Instead of tracing complex logical
branches, you just need to trust that sorted() works correctly. It clearly states what you want
to achieve (the middle element of a sorted list) rather than getting bogged down in the how
(a complex series of comparisons).
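A minimal sketch of the sorted() approach follows; the function name median_of_three is an assumption made for illustration.

```python
def median_of_three(a, b, c):
    """Return the middle value of three numbers."""
    # After sorting, index 1 is the middle of the three elements.
    return sorted([a, b, c])[1]


print(median_of_three(10, 1, 5))   # 5
print(median_of_three(5, 10, 1))   # 5
print(median_of_three(2, 2, 3))    # 2 (duplicates are handled naturally)
```

There are no branches to trace here, so every ordering of the three arguments goes through the same single line.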
First, always trace the doctests provided in the function's docstring. These are the "easy"
cases the author thought of.
Python
By tracing the doctests, we confirm they pass according to the code's logic. This tells us two
things:
1. The bug is not immediately obvious.
2. The bug must lie in an edge case not covered by the existing tests.
The specification is "between 4 and 10 characters, inclusive." This means 4 and 10 are valid
lengths. The doctests cover length 4, but not length 10. This is a prime place to look for a
bug.
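Boundary testing of this kind can be sketched as follows. Both functions here are hypothetical illustrations of the "between 4 and 10 characters, inclusive" rule: the buggy one uses an invented off-by-one comparison to show what a missed upper boundary looks like.

```python
def is_valid_length_buggy(text: str) -> bool:
    # Hypothetical bug: < excludes length 10, although the rule is inclusive.
    return 4 <= len(text) < 10


def is_valid_length(text: str) -> bool:
    # Inclusive on both ends, as the specification states.
    return 4 <= len(text) <= 10


# Test just outside and exactly on each boundary: 3, 4, 10, 11.
for length in (3, 4, 10, 11):
    sample = "x" * length
    print(length, is_valid_length_buggy(sample), is_valid_length(sample))
```

Only the row for length 10 differs between the two versions, which is exactly the case a doctest suite covering only length 4 would miss.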
Python
The primary bug here is that the function doesn't validate its input. The specification implies
month will be from 1-12.
Even if the comment were correct, the logic would be flawed due to the final else being a
catch-all.
25. Critique our friend's function
Our friend has written a function to determine a person's status based on their age. It
contains a classic bug related to using a series of if statements instead of if/elif/else.
Python
def get_status(age):
    # This logic is flawed.
    if age > 18:
        status = "Adult"
    if age < 65:
        # This will overwrite the "Adult" status for anyone under 65.
        status = "Not Senior"
    if age < 18:
        status = "Minor"
    return status
Critique:
The problem is that for a single input, multiple if conditions can be true, and the last one to
be evaluated will win. For example, get_status(30) first sets status to "Adult", and the
second if then overwrites it with "Not Senior".
This is clearly not the intended behavior. The fix is to use an if/elif/else structure, which
guarantees that only one block of code will be executed. It's often best to handle special
cases or the most restrictive conditions first.
Corrected Version:
Python
def get_status_correct(age):
    if age < 18:
        return "Minor"
    elif age < 65:
        return "Adult"
    else:  # age must be >= 65
        return "Senior"
26. Boolean/logical operators
Complex boolean expressions can often be simplified to make code more readable and less
error-prone. De Morgan's Laws are a pair of rules that are very useful for simplifying
expressions involving not, and, and or.
● not (A or B) is equivalent to (not A) and (not B)
● not (A and B) is equivalent to (not A) or (not B)
Python
# Send email if it's NOT the case that (the user has unsubscribed OR their account is inactive)
if not (user_has_unsubscribed or is_account_inactive):
    send_promo_email()
This is logically correct but a bit hard to parse with the nested not. Let's apply De Morgan's
Law (not (A or B) == (not A) and (not B)):
● A is user_has_unsubscribed
● B is is_account_inactive
Python
This version, which can be read as "if the user is subscribed AND their account is active," is
much more direct and easier to understand.
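Since each variable has only two possible values, the equivalence can be checked exhaustively. The sketch below compares the original condition with its De Morgan rewrite over all four combinations; de_morgan_holds is a hypothetical helper name.

```python
from itertools import product


def de_morgan_holds() -> bool:
    """Check not (A or B) == (not A) and (not B) for every truth assignment."""
    for user_has_unsubscribed, is_account_inactive in product([False, True], repeat=2):
        original = not (user_has_unsubscribed or is_account_inactive)
        rewritten = (not user_has_unsubscribed) and (not is_account_inactive)
        if original != rewritten:
            return False
    return True


print(de_morgan_holds())  # True
```

Renaming the flags positively (for example, is_subscribed and is_account_active) would remove the remaining not operators altogether, at the cost of changing the variables' meaning elsewhere in the program.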
27. Simplification 3
Understanding the logical effect of a piece of code allows you to rewrite it in a much simpler
form.
Let's analyze the conditions under which the function returns "Go outside":
● It is not raining.
● It is raining and you have an umbrella.
This can be expressed as a single boolean condition: (not is_raining) or (is_raining and
has_umbrella). This is a bit complex. Let's think about the only condition under which you
"Stay home": when is_raining is True AND has_umbrella is False.
Simplified Code:
Python
This version is much shorter, has less nesting, and is easier to understand, but it has the
exact same logical effect as the original.
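The following sketch shows one plausible nested original next to the simplified form, with an exhaustive equivalence check. Both function names and the exact shape of the nested version are assumptions; only the "Go outside" and "Stay home" conditions come from the analysis above.

```python
from itertools import product


def decide_nested(is_raining: bool, has_umbrella: bool) -> str:
    # Assumed shape of the original: one decision nested inside another.
    if is_raining:
        if has_umbrella:
            return "Go outside"
        else:
            return "Stay home"
    else:
        return "Go outside"


def decide_simplified(is_raining: bool, has_umbrella: bool) -> str:
    # The only way to stay home: it is raining and there is no umbrella.
    if is_raining and not has_umbrella:
        return "Stay home"
    return "Go outside"


same = all(
    decide_nested(r, u) == decide_simplified(r, u)
    for r, u in product([False, True], repeat=2)
)
print(same)  # True
```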
28. Identifying an unexpected behaviour
Bugs can arise when a general check is not specific enough for the subsequent code.
Tracing the buggy code with PythonTutor is an excellent way to find this kind of issue.
This function is supposed to log the first element of a data structure, but only if the data
exists.
Python
def log_first_element(data):
    """
    Prints the first element of 'data' if 'data' is not None.
    """
    # The check is not specific enough.
    if data is not None:
        # This line will crash if 'data' is an empty list [].
        print(f"First element: {data[0]}")
    else:
        print("No data provided.")
In a boolean context (like an if statement), Python considers certain values to be "false" and
all others to be "true."
● Falsy Values:
○ False
○ None
○ Numeric zero of all types (0, 0.0)
○ Empty sequences and collections ("", [], (), {})
● Truthy Values: Everything else. Any non-empty string, non-zero number, or
non-empty list is considered truthy.
The bug in log_first_element occurred because the check if data is not None: rules out only
None. An empty list [] is not None, so it passes this check, and data[0] then raises an
IndexError. An empty list is, however, falsy, so a truthiness check would reject it.
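The difference between the two checks can be seen by applying both to a few values. In this sketch, compare_checks is a hypothetical helper that returns the result of the None check and of the truthiness check side by side.

```python
def compare_checks(data):
    """Return (passes the None check, passes the truthiness check)."""
    return (data is not None, bool(data))


for value in (None, [], "", 0, {}, [0], "a", 42):
    print(repr(value), compare_checks(value))

# The empty list is the troublesome case: it passes the None check
# but fails the truthiness check.
print(compare_checks([]))  # (True, False)
```

Note that [0] is truthy even though its only element is falsy: truthiness of a list depends on whether it is empty, not on what it contains.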
The Fix:
The real intention of the programmer was not just to check for None, but to check if data was
a non-empty, usable collection. A more idiomatic and correct way to write this check is to rely
on truthiness directly.
Python
def log_first_element_fixed(data):
    """
    Prints the first element of 'data' if 'data' exists and is not empty.
    """
    # This check works for None, [], "", etc.
    if data:  # This checks for truthiness
        print(f"First element: {data[0]}")
    else:
        print("No data provided or data is empty.")
Setting a Breakpoint: In a real debugger, you could set a breakpoint on the if line. The
program would pause there, and you could inspect the value of data to see exactly what it is
([]) and why your condition is behaving unexpectedly.
30. Short-Circuiting
Python's logical operators and and or use a behavior called short-circuiting to be more
efficient. This means they only evaluate the second operand if it's absolutely necessary to
determine the outcome.
Practical Example:
Imagine you want to check if a user is an admin and has access to a log file. Reading the log
file might be a slow operation.
Python
def is_admin(user):
    # a quick check
    return user.role == "admin"

def has_log_access(user):
    # a slow operation, maybe reads a large file
    print("...Checking log file access (slow)...")
    return True
If is_admin(current_user) returns False, the and expression already knows the final result
must be False, so it short-circuits and never even calls the slow has_log_access function.
This saves time and resources.
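The behavior described above can be observed by recording which checks actually run. In this sketch the User class, the calls list, and the can_view_logs wrapper are hypothetical additions made so the example is runnable on its own.

```python
from dataclasses import dataclass


@dataclass
class User:  # hypothetical stand-in for the real user object
    role: str


calls = []  # records which checks actually ran


def is_admin(user):
    calls.append("is_admin")
    return user.role == "admin"


def has_log_access(user):
    calls.append("has_log_access")  # stands in for the slow operation
    return True


def can_view_logs(user):
    # The right operand is evaluated only if is_admin(user) is True.
    return is_admin(user) and has_log_access(user)


calls.clear()
print(can_view_logs(User(role="guest")), calls)  # False ['is_admin']
calls.clear()
print(can_view_logs(User(role="admin")), calls)  # True ['is_admin', 'has_log_access']
```

Putting the cheap check first is a design choice: it lets the expensive one be skipped whenever the cheap one already decides the result.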
31. Inequivalence due to short-circuiting
While A and B is logically the same as B and A, the two are not programmatically equivalent
if A or B has a side effect (like printing to the screen, modifying a variable, or calling a
function that does either).
Example:
Python
def log_and_return_false():
    print("Function was called!")
    return False
Python
Output:
Python
Output:
Python
Output:
Inside else
Notice that "Function was called!" was not printed. The log_and_return_false() function was
never executed. This demonstrates that changing the order of operands can change the
behavior of your program if side effects are involved.
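Both orderings can be compared in a single runnable sketch. The use of the constant False as the other operand and the call_log list are assumptions made for illustration; log_and_return_false is the function from the example above with one extra line to count calls.

```python
call_log = []  # records each time the function actually runs


def log_and_return_false():
    call_log.append("called")
    print("Function was called!")
    return False


# Order 1: the function is the first operand, so it always runs.
if log_and_return_false() and False:
    print("Inside if")
else:
    print("Inside else")

# Order 2: False comes first, so `and` short-circuits and the call is skipped.
if False and log_and_return_false():
    print("Inside if")
else:
    print("Inside else")

print(len(call_log))  # 1: only the first ordering called the function
```

Both if statements take the else branch, yet only the first prints "Function was called!", which is the observable difference the section describes.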
32. Summary
1. Function Calls Create Frames: Each call gets a new, isolated workspace where
parameters are created and assigned the values of the arguments.
2. Refutation Requires Precision: To refute a function, you need three things: a
counterexample input, the observed incorrect output, and the expected output based
on the function's specification (its docstring).
3. Trace Code Mentally: The key to finding bugs is to trace the code's execution
step-by-step, paying special attention to boundaries and edge cases not covered in
the doctests.
4. Simplify and Refactor: Good code is simple code. Reading different solutions and
applying principles like De Morgan's Laws can help simplify complex conditional
logic.
5. Beware Truthiness: Relying on Python's "Truthy" and "Falsy" values can lead to
concise code (if my_list:), but be careful that the check is specific enough for the
code that follows (e.g., an empty list [] is Falsy, but it is not None).
6. Leverage Short-Circuiting: The and and or operators use short-circuiting, meaning
they only evaluate the second operand if necessary. This can be used for efficiency
and to prevent errors (e.g., if user is not None and user.is_admin:).
7. Order Matters with Side Effects: Because of short-circuiting, the order of operands
in a logical expression can change the program's behavior if the expressions have
side effects (like printing or modifying state).