Finding unneeded pragmas

Sunday 24 August 2025

A proof-of-concept tool for finding unneeded coverage.py exclusion pragmas

To answer a long-standing coverage.py feature request, I threw together an experiment: a tool to identify lines that have been excluded from coverage, but which were actually executed.

The program is a standalone file in the coverage.py repo. It is unsupported. I’d like people to try it to see what they think of the idea. Later we can decide what to do with it.

To try it: copy warn_executed.py from GitHub. Create a .toml file that looks something like this:

# Regexes that identify excluded lines:
warn-executed = [
    "pragma: no cover",
    "raise AssertionError",
    "pragma: cant happen",
    "pragma: never called",
    ]

# Regexes that identify partial branch lines:
warn-not-partial = [
    "pragma: no branch",
    ]

These are exclusion regexes that you’ve used in your coverage runs. The program will print out any line that matches one of the patterns but that actually ran during your tests. If a line ran, you might not need to exclude it after all.

The tool doesn’t read your coverage settings or assume the default regexes: you need to explicitly specify all the patterns you want flagged.

Run the program with Python 3.11 or higher, giving the name of the coverage data file and the name of your new TOML configuration file. It will print the lines that might not need excluding:

$ python3.12 warn_executed.py .coverage warn.toml

The reason for a new list of patterns instead of just reading the existing coverage settings is that some exclusions are “don’t care” rather than “this will never happen.” For example, I exclude “def __repr__” because some __repr__’s exist just to make my debugging easier. I don’t care whether the test suite runs them or not, so I don’t want a warning when they do.

This tool is not perfect. For example, I exclude “if TYPE_CHECKING:” because I want that entire clause excluded. But the if-line itself is actually run. If I include that pattern in the warn-executed list, it will flag all of those lines. Maybe I’m forgetting something: it would be good to be able to exclude the body of the if clause while understanding that the if-line itself is executed.
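
To make it concrete, the situation looks like this (big_module is a hypothetical import):

from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # This if-line executes when the module is imported...
    import big_module  # ...but the body never runs.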

Give warn_executed.py a try and comment on the issue about what you think of it.

Starting with pytest’s parametrize

Wednesday 13 August 2025

Pytest’s parametrize feature is powerful but it looks scary. I hope this step-by-step explanation helps people use it more.

Writing tests can be difficult and repetitive. Pytest has a feature called parametrize that can reduce duplication, but it can be hard to understand if you are new to the testing world. It’s not as complicated as it seems.

Let’s say you have a function called add_nums() that adds up a list of numbers, and you want to write tests for it. Your tests might look like this:

def test_123():
    assert add_nums([1, 2, 3]) == 6

def test_negatives():
    assert add_nums([1, 2, -3]) == 0

def test_empty():
    assert add_nums([]) == 0

This is great: you’ve tested some behaviors of your add_nums() function. But it’s getting tedious to write out more test cases. The names of the functions have to be different from each other, but they don’t mean anything, so it’s extra work for no benefit. The test functions all have the same structure, so you’re repeating uninteresting details. You want to add more cases, but the friction discourages you.

If we look at these functions, they are very similar. In any software, when we have functions that are similar in structure, but differ in some details, we can refactor them to be one function with parameters for the differences. We can do the same for our test functions.

Here the functions all have the same structure: call add_nums() and assert what the return value should be. The differences are the list we pass to add_nums() and the value we expect it to return. So we can turn those into two parameters in our refactored function:

def test_add_nums(nums, expected_total):
    assert add_nums(nums) == expected_total

Unfortunately, tests aren’t run like regular functions. We write the test functions, but we don’t call them ourselves. That’s the reason the names of the test functions don’t matter. The test runner (pytest) finds functions named test_* and calls them for us. When they have no parameters, pytest can call them directly. But now that our test function has two parameters, we have to give pytest instructions about how to call it.

To do that, we use the @pytest.mark.parametrize decorator. Using it looks like this:

import pytest

@pytest.mark.parametrize(
    "nums, expected_total",
    [
        ([1, 2, 3], 6),
        ([1, 2, -3], 0),
        ([], 0),
    ]
)
def test_add_nums(nums, expected_total):
    assert add_nums(nums) == expected_total

There’s a lot going on here, so let’s take it step by step.

If you haven’t seen a decorator before, it starts with @ and is like a prologue to a function definition. It can affect how the function is defined or provide information about the function.

The parametrize decorator is itself a function call that takes two arguments. The first is a string (“nums, expected_total”) that names the two arguments to the test function. Here the decorator is instructing pytest, “when you call test_add_nums, you will need to provide values for its nums and expected_total parameters.”

The second argument to parametrize is a list of the values to supply as the arguments. Each element of the list will become one call to our test function. In this example, the list has three tuples, so pytest will call our test function three times. Since we have two parameters to provide, each element of the list is a tuple of two values.

The first tuple is ([1, 2, 3], 6), so the first time pytest calls test_add_nums, it will call it as test_add_nums([1, 2, 3], 6). All together, pytest will call us three times, like this:

test_add_nums([1, 2, 3], 6)
test_add_nums([1, 2, -3], 0)
test_add_nums([], 0)

This will all happen automatically. With our original test functions, when we ran pytest, it showed the results as three passing tests because we had three separate test functions. Now even though we only have one function, it still shows as three passing tests! Each set of values is considered a separate test that can pass or fail independently. This is the main advantage of using parametrize instead of writing three separate assert lines in the body of a simple test function.
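
Running pytest in verbose mode shows the three generated tests. Pytest derives the test IDs from the parameter values, so assuming the file is named test_add_nums.py, the output looks something like:

$ python -m pytest -v
test_add_nums.py::test_add_nums[nums0-6] PASSED
test_add_nums.py::test_add_nums[nums1-0] PASSED
test_add_nums.py::test_add_nums[nums2-0] PASSED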

What have we gained?

  • We don’t have to write three separate functions with different names.
  • We don’t have to repeat the same details in each function (assert, add_nums(), ==).
  • The differences between the tests (the actual data) are written succinctly all in one place.
  • Adding another test case is as simple as adding another line of data to the decorator.

Coverage.py regex pragmas

Monday 28 July 2025

Coverage.py uses regexes to define pragma syntax. This is surprisingly powerful.

Coverage.py lets you indicate code to exclude from measurement by adding comments to your Python files. But coverage implements them differently than other similar tools. Rather than having fixed syntax for these comments, they are defined using regexes that you can change or add to. This has been surprisingly powerful.

The basic behavior: coverage finds lines in your source files that match the regexes. These lines are excluded from measurement, that is, it’s OK if they aren’t executed. If a matched line is part of a multi-line statement, the whole statement is excluded. If a matched line introduces a block of code, the entire block is excluded.

At first, these regexes were just to make it easier to implement the basic “here’s the comment you use” behavior for pragma comments. But it also enabled pragma-less exclusions. You could decide (for example) that you didn’t care to test any __repr__ methods. By adding def __repr__ as an exclusion regex, all of those methods were automatically excluded from coverage measurement without having to add a comment to each one. Very nice.

Not only did this let people add custom exclusions in their projects, but it enabled third-party plugins that could configure regexes in other interesting ways:

  • covdefaults adds a bunch of default exclusions, and also platform- and version-specific comment syntaxes.
  • coverage-conditional-plugin gives you a way to create comment syntaxes for entire files, for whether other packages are installed, and so on.

Then about a year ago, Daniel Diniz contributed a change that amped up the power: regexes could match multi-line patterns. That might not sound like a large change, but it enabled much more powerful exclusions. As a sign of that power, it made it possible to support four different feature requests.

To make it work, Daniel changed the matching code. Originally, it was a loop over the lines in the source file, checking each line for a match against the regexes. The new code uses the entire source file as the target string, and loops over the matches against that text. Each match is converted into a set of line numbers and added to the results.
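
In outline, the new approach works something like this sketch (not coverage.py’s actual code):

import re

def matched_lines(source: str, pattern: str) -> set[int]:
    """Return the 1-based line numbers covered by each match of pattern."""
    lines: set[int] = set()
    for match in re.finditer(pattern, source, re.MULTILINE):
        # Map the match's character span to line numbers. The end is
        # exclusive, so look at the last character actually matched.
        first = source.count("\n", 0, match.start()) + 1
        last = source.count("\n", 0, max(match.end() - 1, match.start())) + 1
        lines.update(range(first, last + 1))
    return lines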

The power comes from being able to use one pattern to match many lines. For example, one of the four feature requests was how to exclude an entire file. With configurable multi-line regex patterns, you can do this yourself:

\A(?s:.*# pragma: exclude file.*)\Z

With this regex, if you put the comment “# pragma: exclude file” in your source file, the entire file will be excluded. The \A and \Z match the start and end of the target text, which, remember, is the entire file. The (?s:...) means the s/DOTALL flag is in effect, so . can match newlines. This pattern matches the entire source file if the desired pragma is anywhere in the file.

Another requested feature was excluding the code between two marker lines. We can use “# no cover: start” and “# no cover: stop” as delimiters with this regex:

# no cover: start(?s:.*?)# no cover: stop

Here (?s:.*?) means any number of any character at all, but as few as possible. A star in regexes means as many as possible, but star-question-mark means as few as possible. We need the minimal match so that we don’t match from the start of one pair of comments all the way through to the end of a different pair of comments.
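
For example, with that regex configured, the middle of this function would be excluded (the names here are made up):

def handle_request(request):
    response = build_response(request)
    # no cover: start
    if DEBUG_SLOW_PATH:
        log_everything(request, response)
    # no cover: stop
    return response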

This regex approach is powerful, but still fairly shallow. For example, either of these two patterns would exclude the wrong lines if a string literal happened to contain the pragma text. There’s no easy way for a regex to skip over string literals.

This kind of difficulty hit home when I added a new default pattern to exclude empty placeholder methods like this:

def not_yet(self): ...

def also_not_this(self):
    ...

async def definitely_not_this(
    self,
    arg1,
):
    ...

We can’t just match three dots, because ellipses can be used in other places than empty function bodies. We need to be more delicate. I ended up with:

^\s*(((async )?def .*?)?\)(\s*->.*?)?:\s*)?\.\.\.\s*(#|$)

This craziness ensures the ellipsis is part of an (async) def, that the ellipsis appears first in the body (but no docstring allowed, doh!), allows for a comment on the line, and so on. And even with a pattern this complex, it would incorrectly match this contrived line:

def f(): print("(well): ... #2 false positive!")
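
One way to poke at a pattern like this is to try it against sample lines with re.search:

import re

PLACEHOLDER = r"^\s*(((async )?def .*?)?\)(\s*->.*?)?:\s*)?\.\.\.\s*(#|$)"

print(bool(re.search(PLACEHOLDER, "def not_yet(self): ...")))   # True
print(bool(re.search(PLACEHOLDER, "    ...")))                  # True: a body on its own line
print(bool(re.search(PLACEHOLDER, "x = ...")))                  # False: ellipsis as a value
print(bool(re.search(PLACEHOLDER, 'def f(): print("(well): ... #2 false positive!")')))  # True, alas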

So regexes aren’t perfect, but they’re a pretty good balance: flexible and powerful, and will work great on real code even if we can invent weird edge cases where they fail.

What started as a simple implementation expediency has turned into a powerful configuration option that has done more than I would have thought.

Coverage 7.10.0: patch

Thursday 24 July 2025

Coverage 7.10 has some significant new features that have solved some long-standing problems.

Years ago I greeted a friend returning from vacation and asked how it had been. She answered, “It was good, I got a lot done!” I understand that feeling. I just had a long vacation myself, and used the time to clean up some old issues and add some new features in coverage.py v7.10.

The major new feature is a configuration option, [run] patch. With it, you specify named patches that coverage can use to monkey-patch some behavior that gets in the way of coverage measurement.

The first is subprocess. Coverage works great when you start your program with coverage measurement, but has long had the problem of how to also measure the coverage of sub-processes that your program created. The existing solution had been a complicated two-step process of creating obscure .pth files and setting environment variables. Whole projects appeared on PyPI to handle this for you.

Now, patch = subprocess will do this for you automatically, and clean itself up when the program ends. It handles sub-processes created by the subprocess module, the os.system() function, and any of the execv or spawnv families of functions.

This alone has spurred one user to exclaim,

The latest release of Coverage feels like a Christmas present! The native support for Python subprocesses is so good!

Another patch is _exit. This patches os._exit() so that coverage saves its data before exiting. The os._exit() function is an immediate and abrupt termination of the program, skipping all kinds of registered clean up code. This patch makes it possible to collect coverage data from programs that end this way.

The third patch is execv. The execv functions end the current program and replace it with a new program in the same process. The execv patch arranges for coverage to save its data before the current program is ended.
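
In a .coveragerc file, turning on all three looks something like this (a sketch; check the coverage.py docs for the exact syntax in your configuration format):

[run]
patch =
    subprocess
    _exit
    execv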

Now that these patches are available, it seems silly that it’s taken so long. They (mostly) weren’t difficult. I guess it took looking at the old issues, realizing the friction they caused, and thinking up a new way to let users control the patching. Monkey-patching is a bit invasive, so I’ve never wanted to do it implicitly. The patch option gives the user an explicit way to request what they need without having to get into the dirty details themselves.

Another process-oriented feature was contributed by Arkady Gilinsky: with --save-signal=USR1 you can specify a user signal that coverage will attend to. When you send the signal to your running coverage process, it will save the collected data to disk. This gives a way to measure coverage in a long-running process without having to end the process.
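
Using it looks something like this (server.py and the process id are made up):

$ coverage run --save-signal=USR1 server.py &
[1] 12345
$ kill -USR1 12345   # coverage saves its data; the process keeps running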

There were some other fixes and features along the way, like better HTML coloring of multi-line statements, and more default exclusions (if TYPE_CHECKING: and ...).

It feels good to finally address some of these pain points. I also closed some stale issues and pull requests. There is more to do, always more to do, but this feels like a real step forward. Give coverage 7.10.0 a try and let me know how it works for you.

2048: iterators and iterables

Tuesday 15 July 2025

Making a simple game, I waded into a classic iterator/iterable confusion.

I wrote a low-tech terminal-based version of the classic 2048 game and had some interesting difficulties with iterators along the way.

2048 has a 4×4 grid with sliding tiles. Because the tiles can slide left or right and up or down, sometimes we want to loop over the rows and columns from 0 to 3, and sometimes from 3 to 0. My first attempt looked like this:

N = 4
if sliding_right:
    cols = range(N-1, -1, -1)   # 3 2 1 0
else:
    cols = range(N)             # 0 1 2 3

if sliding_down:
    rows = range(N-1, -1, -1)   # 3 2 1 0
else:
    rows = range(N)             # 0 1 2 3

for row in rows:
    for col in cols:
        ...

This worked, but those counting-down ranges are ugly. Let’s make it nicer:

cols = range(N)                 # 0 1 2 3
if sliding_right:
    cols = reversed(cols)       # 3 2 1 0

rows = range(N)                 # 0 1 2 3
if sliding_down:
    rows = reversed(rows)       # 3 2 1 0

for row in rows:
    for col in cols:
        ...

Looks cleaner, but it doesn’t work! Can you see why? It took me a bit of debugging to see the light.

range() produces an iterable: something that can be iterated over. Similar but different is that reversed() produces an iterator: something that is already iterating. Some iterables (like ranges) can be used more than once, creating a new iterator each time. But once an iterator like reversed() has been consumed, it is done. Iterating it again will produce no values.
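
A quick REPL session shows the difference:

>>> cols = range(4)
>>> list(cols)
[0, 1, 2, 3]
>>> list(cols)              # a range can be iterated again
[0, 1, 2, 3]
>>> cols = reversed(range(4))
>>> list(cols)
[3, 2, 1, 0]
>>> list(cols)              # the iterator is already consumed
[]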

If “iterable” vs “iterator” is already confusing, here’s a quick definition: an iterable is something that can be iterated, that can produce values in a particular order. An iterator tracks the state of an iteration in progress. An analogy: the pages of a book are iterable; a bookmark is an iterator. The English hints at it: an iter-able is able to be iterated at some point; an iterator is actively iterating.

The outer loop of my double loop was iterating only once over the rows, so the row iteration was fine whether it was going forward or backward. But the columns were being iterated again for each row. If the columns were going forward, they were a range, a reusable iterable, and everything worked fine.

But if the columns were meant to go backward, they were a one-use-only iterator made by reversed(). The first row would get all the columns, but the other rows would try to iterate using a fully consumed iterator and get nothing.

The simple fix was to use list() to turn my iterator into a reusable iterable:

cols = list(reversed(cols))

The code was slightly less nice, but it worked. An even better fix was to change my doubly nested loop into a single loop:

for row, col in itertools.product(rows, cols):

That also takes care of the original iterator/iterable problem, so I can get rid of that first fix:

cols = range(N)
if sliding_right:
    cols = reversed(cols)

rows = range(N)
if sliding_down:
    rows = reversed(rows)

for row, col in itertools.product(rows, cols):
    ...

Once I had this working, I wondered why product() solved the iterator/iterable problem. The docs have a sample Python implementation that shows why: internally, product() is doing just what my list() call did: it makes an explicit iterable from each of the iterables it was passed, then picks values from them to make the pairs. This lets product() accept iterators (like my reversed range) rather than forcing the caller to always pass reusable iterables.
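
Here’s the docs’ roughly equivalent code; the first line of the body is the key, materializing every input into a reusable tuple:

def product(*iterables, repeat=1):
    # Materialize each input so it can be used many times,
    # whether it arrived as a reusable iterable or a use-once iterator.
    pools = [tuple(pool) for pool in iterables] * repeat
    result = [[]]
    for pool in pools:
        result = [x + [y] for x in result for y in pool]
    for prod in result:
        yield tuple(prod)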

If your head is spinning from all this iterable / iterator / iteration talk, I don’t blame you. Just now I said, “it makes an explicit iterable from each of the iterables it was passed.” How does that make sense? Well, an iterator is an iterable. So product() can take either a reusable iterable (like a range or a list) or it can take a use-once iterator (like a reversed range). Either way, it populates its own reusable iterables internally.

Python’s iteration features are powerful but sometimes require careful thinking to get right. Don’t overlook the tools in itertools, and mind your iterators and iterables!

•    •    •

Some more notes:

1: Another way to reverse a range: you can slice them!

>>> range(4)
range(0, 4)
>>> range(4)[::-1]
range(3, -1, -1)
>>> reversed(range(4))
<range_iterator object at 0x10307cba0>

It didn’t occur to me to reverse-slice the range, since reversed is right there, but the slice gives you a new reusable range object while reversing the range gives you a use-once iterator.

2: Why did product() explicitly store the values it would need but reversed did not? Two reasons: first, reversed() depends on the __reversed__ dunder method, so it’s up to the original object to decide how to implement it. Ranges know how to produce their values in backward order, so they don’t need to store them all. Second, product() is going to need to use the values from each iterable many times and can’t depend on the iterables being reusable.

Math factoid of the day: 63

Monday 16 June 2025

Two geometric facts about 63, but how to connect them?

63 is a centered octahedral number. That means if you build an approximation of an octahedron with cubes, one size of octahedron will have 63 cubes.

In the late 1700’s René Just Haüy developed a theory about how crystals formed: successive layers of fundamental primitives in orderly arrangements. One of those arrangements was stacking cubes together to make an octahedron.

Start with one cube:

Just one lonely cube

Add six more cubes around it, one on each face. Now we have seven:

Seven cubes as a crude octahedron

Add another layer, adding a cube to touch each visible cube, making 25:

25 cubes arranged like an octahedron five cubes wide

One more layer and we have a total of 63:

63 cubes arranged like an octahedron seven cubes wide

The remaining numbers in the sequence less than 10,000 are 129, 231, 377, 575, 833, 1159, 1561, 2047, 2625, 3303, 4089, 4991, 6017, 7175, 8473, 9919.

63 also shows up in the Delannoy numbers: the number of ways to traverse a grid from the lower left corner to upper right using only steps north, east, or northeast. Here are the 63 ways of moving on a 3×3 grid:

63 different ways to traverse a 3x3 grid

(Diagram from Wikipedia)

In fact, the number of cubes in a Haüy octahedron with N layers is the same as the number of Delannoy paths on a 3×N grid!
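
The correspondence is easy to check numerically. Here’s a quick sketch, counting layers from zero so that the lone starting cube is layer 0:

from functools import lru_cache

@lru_cache(maxsize=None)
def delannoy(m, n):
    # Paths from lower left to upper right of an m-by-n grid,
    # stepping only north, east, or northeast.
    if m == 0 or n == 0:
        return 1
    return delannoy(m - 1, n) + delannoy(m, n - 1) + delannoy(m - 1, n - 1)

def hauy_octahedron(n):
    # Centered octahedral number: cubes after adding n layers
    # around the starting cube.
    return (2 * n + 1) * (2 * n * n + 2 * n + 3) // 3

for n in range(6):
    assert hauy_octahedron(n) == delannoy(3, n)
    print(n, hauy_octahedron(n))   # 1, 7, 25, 63, 129, 231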

Since the two ideas are both geometric and fairly simple, I would love to find a geometric explanation for the correspondence. The octahedron is three-dimensional, and the Delannoy grids have that tantalizing 3 in them. It seems like there should be a way to convert Haüy coordinates to Delannoy coordinates to show how they relate. But I haven’t found one...

•    •    •

Colophon: I made the octahedron diagrams by asking Claude to write a Python program to do it. It wasn’t a fast process because it took pushing and prodding to get the diagrams to come out the way I liked. But Claude was very competent, and I could think about the results rather than about projections or color spaces. I could dip into it for 10 minutes at a time over a number of days without having to somehow reconstruct a mental context.

This kind of casual hobby programming is perfect for AI assistance. I don’t need the code to be perfect or even good, I just want the diagrams to be nice. I don’t have the focus time to learn how to write the program, so I can leave it to an imperfect assistant.
