Set Operations in Python (Union, Intersection, Symmetric Difference)

Tags: Python, Mathematics

In Python, set is a collection of unique elements. It can perform set operations such as union, intersection, difference, and symmetric difference.

A set is mutable: elements can be added and removed. Python also provides frozenset, which supports the same set operations but is immutable, so elements cannot be added or removed after it is created.

For both set and frozenset, just like with list and tuple, you can use the built-in len() function to determine the number of elements and the in operator to check for element existence.
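
As a minimal sketch of the point above, the following uses len() and the in operator on both types. The variable names are chosen only for illustration.

```python
s = {0, 1, 2}
fs = frozenset(s)

# len() returns the number of elements for both set and frozenset
n_s = len(s)
n_fs = len(fs)
print(n_s, n_fs)
# 3 3

# in / not in test for membership
has_one = 1 in s
has_ten = 10 in fs
missing = 10 not in s
print(has_one, has_ten, missing)
# True False True
```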

Create a set object: {}, set comprehensions

set objects can be created by enclosing elements in curly brackets {}.

Duplicate values are ignored, and only unique values remain as elements. Since a set is unordered, the order in which elements are added is not preserved.

s = {3, 1, 2, 2, 3, 1, 4}
print(s)
# {1, 2, 3, 4}

print(type(s))
# <class 'set'>

A set can contain elements of different types, but every element must be hashable, so mutable objects such as a list cannot be included.

s = {1.23, 'abc', (0, 1, 2)}
print(s)
# {(0, 1, 2), 1.23, 'abc'}

# s = {[0, 1, 2]}
# TypeError: unhashable type: 'list'

Values from different types are treated as duplicates if they are equal. For example, in Python, the boolean type (bool) is a subclass of the integer type (int). This means True is equivalent to 1, and False is equivalent to 0.

s = {1, 1.0, True}
print(s)
# {1}
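
The short sketch below illustrates why these values collapse into one element: they compare equal and have the same hash. It also suggests that the element kept is the one added first, which is how CPython behaves; set(...) is used here so that the insertion order is explicit.

```python
# Equal values of different types also share the same hash
all_equal = (1 == 1.0 == True)
same_hash = (hash(1) == hash(1.0) == hash(True))
print(all_equal, same_hash)
# True True

# The first value added is the one that stays in the set
first_kept = set([True, 1, 1.0])
print(first_kept)
# {True}
```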

Since an empty {} is considered a dictionary (dict), an empty set can be created using set().

s = set()
print(s)
# set()

print(type(s))
# <class 'set'>
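
For comparison, a quick check confirms that empty curly brackets produce a dict rather than a set. The variable names are illustrative.

```python
empty_braces = {}
print(type(empty_braces))
# <class 'dict'>

empty_set = set()
print(type(empty_set), len(empty_set))
# <class 'set'> 0
```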

Python supports set comprehensions, similar to list comprehensions. Use curly brackets {} instead of square brackets [].

s = {i**2 for i in range(5)}
print(s)
# {0, 1, 4, 9, 16}

Convert between set, list, and tuple: set(), list(), tuple()

set objects can also be created with set().

By providing an iterable object, such as a list or a tuple, as an argument, a set object is created that contains only unique values, excluding duplicates.

l = [2, 2, 3, 1, 3, 4]
print(set(l))
# {1, 2, 3, 4}

t = (2, 2, 3, 1, 3, 4)
print(set(t))
# {1, 2, 3, 4}

As demonstrated above, set() can be used to remove duplicate elements from a list or a tuple. However, the original order is not preserved. Removing duplicates while keeping the original order, or extracting only the duplicated elements, requires a different approach.
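
One possible approach, sketched here as an assumption rather than as the only method, is to use dict.fromkeys() to drop duplicates in the original order (dictionaries keep insertion order in Python 3.7 and later) and collections.Counter to pick out values that occur more than once.

```python
from collections import Counter

l = [3, 3, 2, 1, 5, 1, 4, 2, 3]

# dict keys are unique and keep insertion order
unique_in_order = list(dict.fromkeys(l))
print(unique_in_order)
# [3, 2, 1, 5, 4]

# Keep only the values that appear more than once
duplicates = [k for k, v in Counter(l).items() if v > 1]
print(duplicates)
# [3, 2, 1]
```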

You can use list() and tuple() to convert a set into a list or a tuple.

s = {1, 2, 3}

print(list(s))
# [1, 2, 3]

print(tuple(s))
# (1, 2, 3)

Add an element to the set: add()

Use the add() method to add an element to the set.

s = {0, 1, 2}

s.add(3)
print(s)
# {0, 1, 2, 3}

To add elements to a set by joining another set to it, use the | operator or the union() method described below.

Remove an element from the set: discard(), remove(), pop(), clear()

Use the discard(), remove(), or pop() method to remove a single element from a set, and the clear() method to remove all elements.

The discard() method removes the element specified by its argument. If a value that does not exist in the set is specified, no action is taken.

s = {0, 1, 2}

s.discard(1)
print(s)
# {0, 2}

s.discard(10)
print(s)
# {0, 2}

The remove() method also removes the element specified by its argument, but it raises a KeyError if the value does not exist in the set.

s = {0, 1, 2}

s.remove(1)
print(s)
# {0, 2}

# s.remove(10)
# KeyError: 10
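
The difference between the two methods can be sketched side by side. The flag name removed is only for illustration; catching KeyError is one way to handle a missing value when remove() is preferred.

```python
s = {0, 1, 2}

s.discard(10)  # absent value: silently ignored

try:
    s.remove(10)  # absent value: raises KeyError
    removed = True
except KeyError as e:
    removed = False
    print('not in set:', e)
# not in set: 10

print(s)
# {0, 1, 2}
```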

The pop() method removes an element from the set and returns it. You cannot choose which element is removed; it is arbitrary. It raises a KeyError if the set is empty.

s = {0, 1, 2}

v = s.pop()
print(v)
# 0

print(s)
# {1, 2}

s = set()

# v = s.pop()
# KeyError: 'pop from an empty set'
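
Because an empty set is falsy, pop() can be called in a while loop until the set is exhausted, which avoids the KeyError. This is a small sketch; the order in which elements come out is arbitrary, so the result is sorted before printing.

```python
s = {0, 1, 2}
popped = []

# Loop ends when the set becomes empty
while s:
    popped.append(s.pop())

print(sorted(popped))
# [0, 1, 2]

print(s)
# set()
```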

The clear() method removes all elements from the set, making it empty.

s = {0, 1, 2}

s.clear()
print(s)
# set()

Union: | operator, union()

You can get the union with the | operator or the union() method.

s1 = {0, 1, 2}
s2 = {1, 2, 3}
s3 = {2, 3, 4}

print(s1 | s2)
# {0, 1, 2, 3}

print(s1.union(s2))
# {0, 1, 2, 3}

Multiple arguments can be specified for union().

The arguments do not have to be set objects: any iterable, such as a list or a tuple, is accepted. The same applies to the methods described below. In contrast, operators such as | require both operands to be set or frozenset objects.

print(s1.union(s2, s3))
# {0, 1, 2, 3, 4}

print(s1.union(s2, [5, 6, 5, 7, 5]))
# {0, 1, 2, 3, 5, 6, 7}
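
Note that this flexibility applies to the methods only. The sketch below shows that the | operator raises a TypeError when one operand is a list, so the list has to be converted with set() first. The flag name operator_ok is illustrative.

```python
s1 = {0, 1, 2}
l = [2, 3]

# The method accepts any iterable
from_method = s1.union(l)
print(from_method)
# {0, 1, 2, 3}

# The operator requires both operands to be set or frozenset
try:
    s1 | l
    operator_ok = True
except TypeError:
    operator_ok = False
print(operator_ok)
# False

# Convert explicitly to use the operator
from_operator = s1 | set(l)
print(from_operator)
# {0, 1, 2, 3}
```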

The augmented assignment operator |= and the update() method update the set in place with the result.

s1 |= s2
print(s1)
# {0, 1, 2, 3}

s2.update(s3)
print(s2)
# {1, 2, 3, 4}

Intersection: & operator, intersection()

You can get the intersection with the & operator or the intersection() method.

s1 = {0, 1, 2}
s2 = {1, 2, 3}
s3 = {2, 3, 4}

print(s1 & s2)
# {1, 2}

print(s1.intersection(s2))
# {1, 2}

print(s1.intersection(s2, s3))
# {2}

Use the &= operator or the intersection_update() method to update the set in place.

s1 &= s2
print(s1)
# {1, 2}

s2.intersection_update(s3)
print(s2)
# {2, 3}

Difference: - operator, difference()

You can get the difference with the - operator or the difference() method.

s1 = {0, 1, 2}
s2 = {1, 2, 3}
s3 = {2, 3, 4}

print(s1 - s2)
# {0}

print(s1.difference(s2))
# {0}

print(s1.difference(s2, s3))
# {0}

Use the -= operator or the difference_update() method to update the set in place.

s1 -= s2
print(s1)
# {0}

s2.difference_update(s3)
print(s2)
# {1}

Symmetric difference: ^ operator, symmetric_difference()

You can get the symmetric difference with the ^ operator or symmetric_difference(). Unlike previous methods, only one argument can be specified for the symmetric_difference() method.

s1 = {0, 1, 2}
s2 = {1, 2, 3}
s3 = {2, 3, 4}

print(s1 ^ s2)
# {0, 3}

print(s1.symmetric_difference(s2))
# {0, 3}
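
Although symmetric_difference() takes a single argument, the ^ operator can be chained across several sets. As a sketch, the result contains the elements that appear in an odd number of the sets; functools.reduce() with operator.xor is one way to do the same for a list of sets.

```python
from functools import reduce
from operator import xor

s1 = {0, 1, 2}
s2 = {1, 2, 3}
s3 = {2, 3, 4}

# Chain the operator: (s1 ^ s2) ^ s3
chained = s1 ^ s2 ^ s3
print(chained)
# {0, 2, 4}

# Same result for an arbitrary list of sets
reduced = reduce(xor, [s1, s2, s3])
print(reduced == chained)
# True
```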

Use the ^= operator or the symmetric_difference_update() method to update the set in place.

s1 ^= s2
print(s1)
# {0, 3}

s2.symmetric_difference_update(s3)
print(s2)
# {1, 4}

Test if A is a subset of B: <= operator, issubset()

To test whether A is a subset of B, meaning all elements of A are contained in B, use the <= operator or the issubset() method.

s1 = {0, 1}
s2 = {0, 1, 2, 3}

print(s1 <= s2)
# True

print(s1.issubset(s2))
# True

Both the <= operator and the issubset() method return True for equal sets. To test whether a set is a proper subset, use the < operator, which returns False for equal sets.

print(s1 <= s1)
# True

print(s1.issubset(s1))
# True

print(s1 < s1)
# False
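
For completeness, here is a small sketch of the case where < returns True. It also shows that a proper subset test gives the same answer as combining <= with !=, and that issubset(), like the other methods, accepts a non-set iterable.

```python
s1 = {0, 1}
s2 = {0, 1, 2, 3}

# Proper subset: contained in s2 and not equal to it
proper = s1 < s2
same_test = s1 <= s2 and s1 != s2
print(proper, same_test)
# True True

# The method form accepts any iterable
subset_of_list = s1.issubset([0, 1, 2])
print(subset_of_list)
# True
```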

Test if A is a superset of B: >= operator, issuperset()

To test whether A is a superset of B, meaning all elements of B are contained in A, use the >= operator or the issuperset() method.

s1 = {0, 1}
s2 = {0, 1, 2, 3}

print(s2 >= s1)
# True

print(s2.issuperset(s1))
# True

Both the >= operator and the issuperset() method return True for equal sets. To test whether a set is a proper superset, use the > operator, which returns False for equal sets.

print(s1 >= s1)
# True

print(s1.issuperset(s1))
# True

print(s1 > s1)
# False

Test if A and B are disjoint: isdisjoint()

To test whether A and B are disjoint, i.e., whether A and B have no common elements, use the isdisjoint() method.

s1 = {0, 1}
s2 = {1, 2}
s3 = {2, 3}

print(s1.isdisjoint(s2))
# False

print(s1.isdisjoint(s3))
# True
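
Two sets are disjoint exactly when their intersection is empty, so the check can be cross-verified as in the sketch below. isdisjoint() also accepts other iterables such as a list.

```python
s1 = {0, 1}
s3 = {2, 3}

disjoint = s1.isdisjoint(s3)
empty_intersection = len(s1 & s3) == 0
print(disjoint, empty_intersection)
# True True

# A list argument works as well
disjoint_with_list = s1.isdisjoint([1, 5])
print(disjoint_with_list)
# False
```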

Immutable sets: frozenset

As explained, set is mutable, allowing elements to be added, removed, and so on. An immutable set type, frozenset, is also provided.

You can create a frozenset object by specifying a list or other iterable objects in the constructor, frozenset().

fs = frozenset([2, 2, 3, 1, 3, 4])
print(fs)
# frozenset({1, 2, 3, 4})

print(type(fs))
# <class 'frozenset'>

The add() and discard() methods for adding and removing elements are not available for frozenset.

# fs.add(5)
# AttributeError: 'frozenset' object has no attribute 'add'

# fs.discard(1)
# AttributeError: 'frozenset' object has no attribute 'discard'

Operators and methods for set operations, such as union, work the same way as for set.

fs1 = frozenset([0, 1, 2])
fs2 = frozenset([1, 2, 3])

print(fs1 | fs2)
# frozenset({0, 1, 2, 3})

print(fs1.difference(fs2))
# frozenset({0})

print(fs1.isdisjoint(fs2))
# False
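
set and frozenset can also be mixed in one operation. As a brief sketch, the result takes the type of the first (left) operand, and the two types compare equal when they contain the same elements.

```python
s = {0, 1}
fs = frozenset([1, 2])

# The result type follows the left operand
set_first = s | fs
frozen_first = fs | s
print(type(set_first))
# <class 'set'>
print(type(frozen_first))
# <class 'frozenset'>

# Comparison is based on the elements, not the type
equal_across_types = (s == frozenset([0, 1]))
print(equal_across_types)
# True
```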

A mutable set cannot be used as a key in a dictionary or as an element of another set, whereas an immutable frozenset can.

s = {0, 1, 2}
fs = frozenset([0, 1, 2])

# d = {s: 100}
# TypeError: unhashable type: 'set'

d = {fs: 100}
print(d)
# {frozenset({0, 1, 2}): 100}

# ss = {s}
# TypeError: unhashable type: 'set'

ss = {fs}
print(ss)
# {frozenset({0, 1, 2})}
