Cartesian Product of Lists in Python: itertools.product
In Python, you can generate a Cartesian product of multiple lists using itertools.product()
.
All sample code in this article assumes that the itertools
and pprint
modules have been imported. The pprint
module is used to make the results easier to read.
import itertools
import pprint
What is the Cartesian product?
The Cartesian product is the set of all combinations of elements from multiple sets.
Specific examples are shown below.
Basic usage of itertools.product()
itertools.product()
returns an iterator
When you pass two lists as arguments, itertools.product()
returns an object of type itertools.product
, which is an iterator. Therefore, the contents are not directly output by print()
.
l1 = [1, 2, 3]
l2 = ['A', 'B']
p = itertools.product(l1, l2)
print(p)
# <itertools.product object at 0x105e6e2c0>
print(type(p))
# <class 'itertools.product'>
You can obtain the combination of elements from each list as a tuple using a for
loop. Note that if an iterator has reached the end and is iterated over again in the for
loop, nothing will be output.
for v in p:
print(v)
# (1, 'A')
# (1, 'B')
# (2, 'A')
# (2, 'B')
# (3, 'A')
# (3, 'B')
for v in p:
print(v)
It is also possible to get each element separately instead of a tuple.
for v1, v2 in itertools.product(l1, l2):
print(v1, v2)
# 1 A
# 1 B
# 2 A
# 2 B
# 3 A
# 3 B
The result is the same as when using nested loops (multiple loops).
for v1 in l1:
for v2 in l2:
print(v1, v2)
# 1 A
# 1 B
# 2 A
# 2 B
# 3 A
# 3 B
Convert the result to a list
You can also use list()
to convert the result into a list of tuples.
l1 = [1, 2, 3]
l2 = ['A', 'B']
l_p = list(itertools.product(l1, l2))
print(l_p)
# [(1, 'A'), (1, 'B'), (2, 'A'), (2, 'B'), (3, 'A'), (3, 'B')]
print(type(l_p))
# <class 'list'>
print(type(l_p[0]))
# <class 'tuple'>
Multiple iterables with itertools.product()
You can pass multiple types of iterables (like tuple
, list
, range
, etc.) to itertools.product()
.
t = ('A', 'B')
d = {'key1': 'value1', 'key2': 'value2'}
r = range(2)
pprint.pprint(list(itertools.product(t, d, r)))
# [('A', 'key1', 0),
# ('A', 'key1', 1),
# ('A', 'key2', 0),
# ('A', 'key2', 1),
# ('B', 'key1', 0),
# ('B', 'key1', 1),
# ('B', 'key2', 0),
# ('B', 'key2', 1)]
As you can see from the result above, iterating over a dictionary returns the keys. If you need the values, use the values()
method. See the following article for details.
See the following article for more information about range()
.
Use the same list (iterable) repeatedly: repeat
You can specify the number of repetitions in the keyword argument, repeat
. The same iterable is used repeatedly to generate a Cartesian product.
l1 = ['A', 'B']
pprint.pprint(list(itertools.product(l1, repeat=3)))
# [('A', 'A', 'A'),
# ('A', 'A', 'B'),
# ('A', 'B', 'A'),
# ('A', 'B', 'B'),
# ('B', 'A', 'A'),
# ('B', 'A', 'B'),
# ('B', 'B', 'A'),
# ('B', 'B', 'B')]
Same as the following example without repeat
.
pprint.pprint(list(itertools.product(l1, l1, l1)))
# [('A', 'A', 'A'),
# ('A', 'A', 'B'),
# ('A', 'B', 'A'),
# ('A', 'B', 'B'),
# ('B', 'A', 'A'),
# ('B', 'A', 'B'),
# ('B', 'B', 'A'),
# ('B', 'B', 'B')]
If multiple iterables are specified:
l1 = [1, 2]
l2 = ['A', 'B']
pprint.pprint(list(itertools.product(l1, l2, repeat=2)))
# [(1, 'A', 1, 'A'),
# (1, 'A', 1, 'B'),
# (1, 'A', 2, 'A'),
# (1, 'A', 2, 'B'),
# (1, 'B', 1, 'A'),
# (1, 'B', 1, 'B'),
# (1, 'B', 2, 'A'),
# (1, 'B', 2, 'B'),
# (2, 'A', 1, 'A'),
# (2, 'A', 1, 'B'),
# (2, 'A', 2, 'A'),
# (2, 'A', 2, 'B'),
# (2, 'B', 1, 'A'),
# (2, 'B', 1, 'B'),
# (2, 'B', 2, 'A'),
# (2, 'B', 2, 'B')]
Same as the following example. Note that with repeat=2
, the order of arguments is l1, l2, l1, l2
instead of l1, l1, l2, l2
.
pprint.pprint(list(itertools.product(l1, l2, l1, l2)))
# [(1, 'A', 1, 'A'),
# (1, 'A', 1, 'B'),
# (1, 'A', 2, 'A'),
# (1, 'A', 2, 'B'),
# (1, 'B', 1, 'A'),
# (1, 'B', 1, 'B'),
# (1, 'B', 2, 'A'),
# (1, 'B', 2, 'B'),
# (2, 'A', 1, 'A'),
# (2, 'A', 1, 'B'),
# (2, 'A', 2, 'A'),
# (2, 'A', 2, 'B'),
# (2, 'B', 1, 'A'),
# (2, 'B', 1, 'B'),
# (2, 'B', 2, 'A'),
# (2, 'B', 2, 'B')]
Speed comparison with multiple loops (nested loops)
As mentioned above, multiple loops (nested loops) give the same result as itertools.product()
.
l1 = [1, 2, 3]
l2 = ['A', 'B']
for v1, v2 in itertools.product(l1, l2):
print(v1, v2)
# 1 A
# 1 B
# 2 A
# 2 B
# 3 A
# 3 B
for v1 in l1:
for v2 in l2:
print(v1, v2)
# 1 A
# 1 B
# 2 A
# 2 B
# 3 A
# 3 B
As you can see below, itertools.product()
is actually slower than nested loops.
The results may differ depending on the number of elements in the iterable and the number of loops, but following Q&A on Stack Overflow also answers that itertools.product()
is slower.
- loops - Python itertools - slow? - Stack Overflow
- python - itertools.product slower than nested for loops - Stack Overflow
The following examples use the Jupyter Notebook magic command %%timeit
. Note that these will not work if run as Python scripts.
Example using a double loop with 1000 elements
The result of itertools.product()
is faster to unpack.
import itertools
A = range(1000)
%%timeit
for x in itertools.product(A, A):
pass
# 30.8 ms ± 910 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
%%timeit
for a1, a2 in itertools.product(A, A):
pass
# 22.8 ms ± 293 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
Nested loops are about the same (slightly faster) as itertools.product()
when unpacked.
%%timeit
for a1 in A:
for a2 in A:
pass
# 22.6 ms ± 345 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
It's faster not to unpack when using a generator expression, which is the generator version of list comprehension. However, it's still slower than using itertools.product()
or nested loops.
%%timeit
for x in ((a1, a2) for a1 in A for a2 in A):
pass
# 82.2 ms ± 467 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
%%timeit
for a1, a2 in ((a1, a2) for a1 in A for a2 in A):
pass
# 91.4 ms ± 276 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
Example of calculating the sum of the products of each combination. Again, it's faster to use nested loops than itertools.product()
.
%%timeit
v = 0
for a1, a2 in itertools.product(A, A):
v += a1 * a2
# 98.8 ms ± 579 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
%%timeit
v = 0
for a1 in A:
for a2 in A:
v += a1 * a2
# 95.7 ms ± 4.05 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
In this example, passing the generator expression to sum()
is slightly faster.
%%timeit
v = sum(a1 * a2 for a1, a2 in itertools.product(A, A))
# 94 ms ± 2.36 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
%%timeit
v = sum(a1 * a2 for a1 in A for a2 in A)
# 92.7 ms ± 4.83 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
Example using a triple loop with 100 elements
Again, using nested loops is the fastest.
B = range(100)
%%timeit
for x in itertools.product(B, B, B):
pass
# 31.6 ms ± 725 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
%%timeit
for b1, b2, b3 in itertools.product(B, B, B):
pass
# 26.2 ms ± 490 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
%%timeit
for b1 in B:
for b2 in B:
for b3 in B:
pass
# 12.9 ms ± 176 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
%%timeit
for x in ((b1, b2, b3) for b1 in B for b2 in B for b3 in B):
pass
# 80.9 ms ± 1.27 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
%%timeit
for b1, b2, b3 in ((b1, b2, b3) for b1 in B for b2 in B for b3 in B):
pass
# 93.8 ms ± 3.22 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
As mentioned above, the time difference between using a double loop with 1000 elements and a triple loop with 100 elements is just a few tens of milliseconds.