Set Operations in Python (Union, Intersection, Symmetric Difference)
In Python, set is a collection of unique elements. It supports set operations such as union, intersection, difference, and symmetric difference.
- Create a set object: {}, set comprehensions
- Convert between set, list, and tuple: set(), list(), tuple()
- Add an element to the set: add()
- Remove an element from the set: discard(), remove(), pop(), clear()
- Union: | operator, union()
- Intersection: & operator, intersection()
- Difference: - operator, difference()
- Symmetric difference: ^ operator, symmetric_difference()
- Test if A is a subset of B: <= operator, issubset()
- Test if A is a superset of B: >= operator, issuperset()
- Test if A and B are disjoint: isdisjoint()
- Immutable sets: frozenset
The set type is mutable, so elements can be added and removed. Python also offers frozenset, which supports the same set operations as set but is immutable: elements cannot be added to or removed from a frozenset.
For both set and frozenset, just like with list and tuple, you can use the built-in len() function to get the number of elements and the in operator to check whether an element exists.
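As a quick illustration of the point above, the following sketch applies len() and in to both types. The variable names are arbitrary.

```python
# Size and membership checks work the same way for set and frozenset.
s = {1, 2, 3}
fs = frozenset(s)

print(len(s), len(fs))
# 3 3
print(2 in s, 2 in fs)
# True True
print(10 in s, 10 not in fs)
# False True
```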
Create a set object: {}, set comprehensions
set objects can be created by enclosing elements in curly brackets {}.
Duplicate values are ignored, and only unique values remain as elements. Since a set is unordered, the order in which elements are added is not preserved.
s = {3, 1, 2, 2, 3, 1, 4}
print(s)
# {1, 2, 3, 4}
print(type(s))
# <class 'set'>
While a set can contain elements of different types, every element must be hashable, so it cannot include mutable objects such as a list.
s = {1.23, 'abc', (0, 1, 2)}
print(s)
# {(0, 1, 2), 1.23, 'abc'}
# s = {[0, 1, 2]}
# TypeError: unhashable type: 'list'
Values of different types are treated as duplicates if they are equal. For example, in Python, the boolean type (bool) is a subclass of the integer type (int), so True is equal to 1 and False is equal to 0. Of several equal values, only the first one is kept.
s = {1, 1.0, True}
print(s)
# {1}
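The reason these values collapse into one element is that a set compares elements by hash and equality rather than by type. A minimal sketch of that behavior, which also shows that the first inserted object is the one that survives:

```python
# Equal values have equal hashes, regardless of their type.
print(1 == 1.0 == True)
# True
print(hash(1) == hash(1.0) == hash(True))
# True

# The first object inserted is kept; later equal objects are ignored.
s = {True, 1, 1.0}
print(s)
# {True}
```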
Since an empty {} is a dictionary (dict), use set() to create an empty set.
s = set()
print(s)
# set()
print(type(s))
# <class 'set'>
Python supports set comprehensions, similar to list comprehensions. Use curly brackets {} instead of square brackets [].
s = {i**2 for i in range(5)}
print(s)
# {0, 1, 4, 9, 16}
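As with list comprehensions, a set comprehension can transform each element and filter with an if clause. The following is a small sketch with made-up sample data; sorted() is used only to make the printed order predictable.

```python
# Normalize strings while collecting unique values.
words = ['apple', 'Banana', 'APPLE', 'cherry', 'banana']
unique_lower = {w.lower() for w in words}
print(sorted(unique_lower))
# ['apple', 'banana', 'cherry']

# Filter with a condition.
evens = {i for i in range(10) if i % 2 == 0}
print(sorted(evens))
# [0, 2, 4, 6, 8]
```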
Convert between set, list, and tuple: set(), list(), tuple()
set objects can also be created with set().
Passing an iterable, such as a list or a tuple, as an argument creates a set object that contains only the unique values, with duplicates removed.
l = [2, 2, 3, 1, 3, 4]
print(set(l))
# {1, 2, 3, 4}
t = (2, 2, 3, 1, 3, 4)
print(set(t))
# {1, 2, 3, 4}
As demonstrated above, set() can be used to remove duplicate elements from a list or a tuple. However, the original order is not preserved.
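If the original order matters, one common approach (not the only one) is dict.fromkeys(), which keeps the first occurrence of each value. This sketch assumes Python 3.7 or later, where dict preserves insertion order, and hashable elements.

```python
l = [2, 2, 3, 1, 3, 4]

# dict keys are unique and keep insertion order (Python 3.7+).
deduped = list(dict.fromkeys(l))
print(deduped)
# [2, 3, 1, 4]
```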
You can use list() and tuple() to convert a set into a list or a tuple. The order of the elements in the result is not guaranteed.
s = {1, 2, 3}
print(list(s))
# [1, 2, 3]
print(tuple(s))
# (1, 2, 3)
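When a predictable order is needed, the built-in sorted() accepts a set and returns a sorted list. A brief sketch, assuming the elements are mutually comparable:

```python
s = {3, 1, 2}

# sorted() always returns a list, whatever iterable it is given.
print(sorted(s))
# [1, 2, 3]
print(sorted(s, reverse=True))
# [3, 2, 1]
print(tuple(sorted(s)))
# (1, 2, 3)
```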
Add an element to the set: add()
Use the add() method to add an element to the set.
s = {0, 1, 2}
s.add(3)
print(s)
# {0, 1, 2, 3}
To add all the elements of another set, use the | operator or the union() method described below.
Remove an element from the set: discard(), remove(), pop(), clear()
Use the discard(), remove(), pop(), and clear() methods to remove elements from a set.
The discard() method removes the element specified by its argument. If the specified value does not exist in the set, nothing happens.
s = {0, 1, 2}
s.discard(1)
print(s)
# {0, 2}
s.discard(10)
print(s)
# {0, 2}
The remove() method also removes the element specified by its argument, but it raises a KeyError if the specified value does not exist in the set.
s = {0, 1, 2}
s.remove(1)
print(s)
# {0, 2}
# s.remove(10)
# KeyError: 10
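The difference between the two methods matters mostly when a missing element should be reported. The following sketch shows one way to handle the KeyError from remove(), next to discard() with the same input.

```python
s = {0, 1, 2}

# remove() signals a missing element with KeyError, so guard it when needed.
try:
    s.remove(10)
except KeyError as e:
    print(f'not found: {e}')
# not found: 10

# discard() silently does nothing for the same input.
s.discard(10)
print(s)
# {0, 1, 2}
```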
The pop() method removes an arbitrary element and returns its value. You cannot choose which element is removed. It raises a KeyError if the set is empty.
s = {0, 1, 2}
v = s.pop()
print(v)
# 0
print(s)
# {1, 2}
s = set()
# v = s.pop()
# KeyError: 'pop from an empty set'
The clear() method removes all elements from the set, making it empty.
s = {0, 1, 2}
s.clear()
print(s)
# set()
Union: | operator, union()
You can get the union with the | operator or the union() method.
s1 = {0, 1, 2}
s2 = {1, 2, 3}
s3 = {2, 3, 4}
print(s1 | s2)
# {0, 1, 2, 3}
print(s1.union(s2))
# {0, 1, 2, 3}
Multiple arguments can be specified for union().
The arguments do not have to be set objects; any iterable that can be converted to a set, such as a list or a tuple, is accepted. The same applies to the methods described below. The operators, in contrast, require both operands to be set or frozenset objects.
print(s1.union(s2, s3))
# {0, 1, 2, 3, 4}
print(s1.union(s2, [5, 6, 5, 7, 5]))
# {0, 1, 2, 3, 5, 6, 7}
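Note that this flexibility belongs to the methods only. The sketch below shows that the | operator raises a TypeError when one operand is a list, and that converting the list with set() first avoids the error. The name result is used only for illustration.

```python
s1 = {0, 1, 2}

# The method accepts any iterable.
print(s1.union([2, 3]))
# {0, 1, 2, 3}

# The operator requires set (or frozenset) operands.
try:
    result = s1 | [2, 3]
except TypeError:
    result = None
    print('TypeError')
# TypeError

# Convert explicitly to use the operator.
print(s1 | set([2, 3]))
# {0, 1, 2, 3}
```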
There are also the augmented assignment operator |= and the update() method, which update the object in place with the result.
s1 |= s2
print(s1)
# {0, 1, 2, 3}
s2.update(s3)
print(s2)
# {1, 2, 3, 4}
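Because |= modifies the existing object, any other name bound to the same set sees the change, whereas a = a | b creates a new object and rebinds the name. A small sketch of the difference, using hypothetical names a and alias:

```python
a = {0, 1}
alias = a  # second name for the same set object

# |= updates the existing object in place.
a |= {2}
print(alias)
# {0, 1, 2}

# | builds a new set; alias still refers to the old object.
a = a | {3}
print(alias)
# {0, 1, 2}
print(a)
# {0, 1, 2, 3}
```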
Intersection: & operator, intersection()
You can get the intersection with the & operator or the intersection() method.
s1 = {0, 1, 2}
s2 = {1, 2, 3}
s3 = {2, 3, 4}
print(s1 & s2)
# {1, 2}
print(s1.intersection(s2))
# {1, 2}
print(s1.intersection(s2, s3))
# {2}
Use the &= operator or the intersection_update() method to update the set in place.
s1 &= s2
print(s1)
# {1, 2}
s2.intersection_update(s3)
print(s2)
# {2, 3}
Difference: - operator, difference()
You can get the difference with the - operator or the difference() method.
s1 = {0, 1, 2}
s2 = {1, 2, 3}
s3 = {2, 3, 4}
print(s1 - s2)
# {0}
print(s1.difference(s2))
# {0}
print(s1.difference(s2, s3))
# {0}
Use the -= operator or the difference_update() method to update the set in place.
s1 -= s2
print(s1)
# {0}
s2.difference_update(s3)
print(s2)
# {1}
Symmetric difference: ^ operator, symmetric_difference()
You can get the symmetric difference with the ^ operator or the symmetric_difference() method. Unlike the previous methods, symmetric_difference() accepts only one argument.
s1 = {0, 1, 2}
s2 = {1, 2, 3}
s3 = {2, 3, 4}
print(s1 ^ s2)
# {0, 3}
print(s1.symmetric_difference(s2))
# {0, 3}
Use the ^= operator or the symmetric_difference_update() method to update the set in place.
s1 ^= s2
print(s1)
# {0, 3}
s2.symmetric_difference_update(s3)
print(s2)
# {1, 4}
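The symmetric difference is the set of elements that belong to exactly one of the two sets, which is the union minus the intersection. The sketch below checks that identity and shows that, although the method takes a single argument, the ^ operator can be chained; the chained result contains the elements that appear in an odd number of the sets.

```python
s1 = {0, 1, 2}
s2 = {1, 2, 3}
s3 = {2, 3, 4}

# Symmetric difference equals union minus intersection.
print(s1 ^ s2 == (s1 | s2) - (s1 & s2))
# True

# Chaining ^ keeps elements that appear in an odd number of sets.
print(s1 ^ s2 ^ s3)
# {0, 2, 4}
```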
Test if A is a subset of B: <= operator, issubset()
To test whether A is a subset of B, meaning all elements of A are contained in B, use the <= operator or the issubset() method.
s1 = {0, 1}
s2 = {0, 1, 2, 3}
print(s1 <= s2)
# True
print(s1.issubset(s2))
# True
Both the <= operator and the issubset() method return True for equal sets. To test whether a set is a proper subset, use the < operator, which returns False for equal sets.
print(s1 <= s1)
# True
print(s1.issubset(s1))
# True
print(s1 < s1)
# False
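To make the relationship between the operators concrete: A < B is equivalent to A <= B and A != B. The following sketch also shows that issubset() accepts any iterable, and that two sets can be unrelated, with neither being a subset of the other.

```python
s1 = {0, 1}
s2 = {0, 1, 2, 3}

# Proper subset: a subset that is not equal.
print(s1 < s2)
# True
print((s1 < s2) == (s1 <= s2 and s1 != s2))
# True

# The method form accepts any iterable, such as a list.
print(s1.issubset([0, 1, 2]))
# True

# Neither set has to be a subset of the other.
print({0, 1} <= {1, 2}, {0, 1} >= {1, 2})
# False False
```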
Test if A is a superset of B: >= operator, issuperset()
To test whether A is a superset of B, meaning all elements of B are contained in A, use the >= operator or the issuperset() method.
s1 = {0, 1}
s2 = {0, 1, 2, 3}
print(s2 >= s1)
# True
print(s2.issuperset(s1))
# True
Both the >= operator and the issuperset() method return True for equal sets. To test whether a set is a proper superset, use the > operator, which returns False for equal sets.
print(s1 >= s1)
# True
print(s1.issuperset(s1))
# True
print(s1 > s1)
# False
Test if A and B are disjoint: isdisjoint()
To test whether A and B are disjoint, i.e., whether A and B have no common elements, use the isdisjoint() method.
s1 = {0, 1}
s2 = {1, 2}
s3 = {2, 3}
print(s1.isdisjoint(s2))
# False
print(s1.isdisjoint(s3))
# True
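Put differently, two sets are disjoint exactly when their intersection is empty. The sketch below checks that equivalence; it also shows that isdisjoint() accepts any iterable and that the empty set is disjoint from everything, including itself.

```python
s1 = {0, 1}
s2 = {1, 2}

# Disjoint means the intersection is empty.
print(s1.isdisjoint(s2) == (not (s1 & s2)))
# True

# Like the other methods, isdisjoint() accepts any iterable.
print(s1.isdisjoint([2, 3]))
# True

# The empty set shares no elements with any set.
print(set().isdisjoint(set()))
# True
```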
Immutable sets: frozenset
As explained, set is mutable, allowing elements to be added, removed, and so on. An immutable set type, frozenset, is also provided.
You can create a frozenset object by passing a list or another iterable to the constructor, frozenset().
fs = frozenset([2, 2, 3, 1, 3, 4])
print(fs)
# frozenset({1, 2, 3, 4})
print(type(fs))
# <class 'frozenset'>
Methods that modify the set, such as add() and discard(), are not available for frozenset.
# fs.add(5)
# AttributeError: 'frozenset' object has no attribute 'add'
# fs.discard(1)
# AttributeError: 'frozenset' object has no attribute 'discard'
Operators and methods for set operations, such as union, can be used in the same way as with set.
fs1 = frozenset([0, 1, 2])
fs2 = frozenset([1, 2, 3])
print(fs1 | fs2)
# frozenset({0, 1, 2, 3})
print(fs1.difference(fs2))
# frozenset({0})
print(fs1.isdisjoint(fs2))
# False
Because it is not hashable, a mutable set cannot be used as a key in a dictionary or as an element of another set, whereas an immutable frozenset can.
s = {0, 1, 2}
fs = frozenset([0, 1, 2])
# d = {s: 100}
# TypeError: unhashable type: 'set'
d = {fs: 100}
print(d)
# {frozenset({0, 1, 2}): 100}
# ss = {s}
# TypeError: unhashable type: 'set'
ss = {fs}
print(ss)
# {frozenset({0, 1, 2})}
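The two types also interoperate. As a minimal sketch: a set and a frozenset with the same elements compare equal, a binary operation between them returns the type of the left operand, and frozenset makes it possible to build a set of sets. The name groups is only illustrative.

```python
s = {0, 1, 2}
fs = frozenset([0, 1, 2])

# Equality depends on the elements, not on the type.
print(s == fs)
# True

# Mixed operations return the type of the left operand.
print(type(s | fs))
# <class 'set'>
print(type(fs | s))
# <class 'frozenset'>

# A set of sets: [1, 2] and [2, 1] collapse into one element.
groups = {frozenset(g) for g in [[1, 2], [2, 1], [3]]}
print(len(groups))
# 2
```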