Sort a 2D List in Python

Posted: 2023-07-27 | Tags: Python, List

This article explains how to sort a 2D list (list of lists) in Python.

Contents

Default behavior of sorted() or sort() for 2D lists
Sort rows/columns independently in 2D lists
- Use sorted() and list comprehension
- Use NumPy: np.sort()
Sort 2D lists according to given rows/columns
- Use the key argument of sorted() or sort()
- Use pandas: sort_values()

Working with external libraries like NumPy or pandas can simplify the task. We recommend using them if possible.

The sample code in this article uses the pprint module for improved readability.

Pretty-print with pprint in Python

import pprint

source: list_2d_sort.py

In the provided code, width=20 is specified as an argument to pprint.pprint(). This argument merely helps format the output and doesn't require special attention.

Default behavior of `sorted()` or `sort()` for 2D lists

Let's consider the following 2D list (list of lists) as an example:

l_2d = [[20, 3, 100], [1, 200, 30], [300, 10, 2]]
pprint.pprint(l_2d, width=20)
# [[20, 3, 100],
#  [1, 200, 30],
#  [300, 10, 2]]

source: list_2d_sort.py

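By default, the sorted() function and the sort() method treat each inner list as a single item and order the inner lists by comparing them with each other.

Lists are compared element by element, and the result is determined by the first pair of unequal elements. In this example, the first elements of the inner lists all differ, so the rows are ordered by their first element.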

Compare lists in Python

pprint.pprint(sorted(l_2d), width=20)
# [[1, 200, 30],
#  [20, 3, 100],
#  [300, 10, 2]]

source: list_2d_sort.py
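The difference between sorted() and sort() also applies to 2D lists. The following is a minimal sketch; the variable name l_sorted is introduced here for illustration only.

```python
l_2d = [[20, 3, 100], [1, 200, 30], [300, 10, 2]]

# sorted() returns a new outer list; the original order is left unchanged.
l_sorted = sorted(l_2d)
print(l_2d)
# [[20, 3, 100], [1, 200, 30], [300, 10, 2]]
print(l_sorted)
# [[1, 200, 30], [20, 3, 100], [300, 10, 2]]

# sort() reorders the list in place and returns None.
l_2d.sort()
print(l_2d)
# [[1, 200, 30], [20, 3, 100], [300, 10, 2]]
```

Note that sorted() creates only a new outer list. The inner lists are the same objects as in the original.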

For more details on sort() and sorted(), refer to the following article:

Sort a list, string, tuple in Python (sort, sorted)

Sort rows/columns independently in 2D lists

Use `sorted()` and list comprehension

Sorting each row means sorting each inner list. You can use a list comprehension to apply sorted() to each inner list.

List comprehensions in Python

l_2d = [[20, 3, 100], [1, 200, 30], [300, 10, 2]]
pprint.pprint(l_2d, width=20)
# [[20, 3, 100],
#  [1, 200, 30],
#  [300, 10, 2]]

pprint.pprint([sorted(l) for l in l_2d], width=20)
# [[3, 20, 100],
#  [1, 30, 200],
#  [2, 10, 300]]

source: list_2d_sort.py

If you want to sort each column, you need to transpose the original list of lists, sort each list, and then transpose it back. You can use zip() and * for transposing.

Transpose 2D list in Python (swap rows and columns)

pprint.pprint([list(x) for x in zip(*[sorted(l) for l in zip(*l_2d)])], width=20)
# [[1, 3, 2],
#  [20, 10, 30],
#  [300, 200, 100]]

source: list_2d_sort.py
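The one-liner above can be hard to read, so here is the same operation broken into three steps. The intermediate names (columns, sorted_columns, result) are chosen here for illustration.

```python
l_2d = [[20, 3, 100], [1, 200, 30], [300, 10, 2]]

# Step 1: transpose, so that each column becomes one tuple.
columns = list(zip(*l_2d))
print(columns)
# [(20, 1, 300), (3, 200, 10), (100, 30, 2)]

# Step 2: sort each column. sorted() returns a list.
sorted_columns = [sorted(c) for c in columns]
print(sorted_columns)
# [[1, 20, 300], [3, 10, 200], [2, 30, 100]]

# Step 3: transpose back. zip() yields tuples, so convert each row to a list.
result = [list(r) for r in zip(*sorted_columns)]
print(result)
# [[1, 3, 2], [20, 10, 30], [300, 200, 100]]
```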

Use NumPy: `np.sort()`

NumPy simplifies the process. Let's consider the following 2D list (list of lists) as an example:

l_2d = [[20, 3, 100], [1, 200, 30], [300, 10, 2]]
pprint.pprint(l_2d, width=20)
# [[20, 3, 100],
#  [1, 200, 30],
#  [300, 10, 2]]

source: list_2d_sort.py

By default, np.sort() sorts along the last axis (axis=-1), so each row of a 2D array is sorted. If you set axis=0, each column is sorted instead. The function accepts a list of lists directly and returns a NumPy array (ndarray).

import numpy as np

print(np.sort(l_2d))
# [[  3  20 100]
#  [  1  30 200]
#  [  2  10 300]]

print(np.sort(l_2d, axis=0))
# [[  1   3   2]
#  [ 20  10  30]
#  [300 200 100]]

print(type(np.sort(l_2d)))
# <class 'numpy.ndarray'>

source: list_2d_sort.py

If you need to convert ndarray back to a list, you can use the tolist() method.

Convert numpy.ndarray and list to each other

print(np.sort(l_2d).tolist())
# [[3, 20, 100], [1, 30, 200], [2, 10, 300]]

print(type(np.sort(l_2d).tolist()))
# <class 'list'>

source: list_2d_sort.py

Note that NumPy treats a list of lists as a multi-dimensional array only if all inner lists have the same number of elements. In recent NumPy versions, passing inner lists of unequal length raises an error.

l_2d_error = [[1, 2], [3, 4, 5]]

# print(np.sort(l_2d_error))
# ValueError: setting an array element with a sequence. The requested array has an inhomogeneous shape after 1 dimensions. The detected shape was (2,) + inhomogeneous part.

source: list_2d_sort.py
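For comparison, the pure-Python approach does not require rows of equal length. The sketch below uses a hypothetical ragged list, l_2d_ragged, to show that row-wise sorting still works, while transposing with zip() silently drops elements.

```python
# Hypothetical ragged list: the rows have different lengths.
l_2d_ragged = [[2, 1], [5, 3, 4]]

# Sorting each row independently works for rows of any length.
rows_sorted = [sorted(l) for l in l_2d_ragged]
print(rows_sorted)
# [[1, 2], [3, 4, 5]]

# zip() stops at the shortest row, so the transpose loses the element 4.
transposed = list(zip(*l_2d_ragged))
print(transposed)
# [(2, 5), (1, 3)]
```

Because of this truncation, the transpose-based column sort shown earlier is only safe when all rows have the same length.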

Sort 2D lists according to given rows/columns

Instead of sorting each row and column independently, this section explains how to sort according to given rows or columns.

Use the `key` argument of `sorted()` or `sort()`

Basic idea

As mentioned above, by default, sorting is based on the first column (the first element of each list).

l_2d = [[20, 3, 100], [1, 200, 30], [300, 10, 2]]
pprint.pprint(l_2d, width=20)
# [[20, 3, 100],
#  [1, 200, 30],
#  [300, 10, 2]]

pprint.pprint(sorted(l_2d), width=20)
# [[1, 200, 30],
#  [20, 3, 100],
#  [300, 10, 2]]

source: list_2d_sort.py

If you want to sort according to the second or third column, you can use the key argument in the sorted() function or the sort() method.

This argument should be a callable object, like a function. The sorting is based on the results of applying this function to each element.

In this case, you should specify a function that retrieves an element at a desired index in the list. You can use a lambda expression or the itemgetter() function from the operator module.

Here's an example using a lambda expression:

pprint.pprint(sorted(l_2d, key=lambda x: x[1]), width=20)
# [[20, 3, 100],
#  [300, 10, 2],
#  [1, 200, 30]]

pprint.pprint(sorted(l_2d, key=lambda x: x[2]), width=20)
# [[300, 10, 2],
#  [1, 200, 30],
#  [20, 3, 100]]

source: list_2d_sort.py

Here's an example using operator.itemgetter():

import operator

pprint.pprint(sorted(l_2d, key=operator.itemgetter(1)), width=20)
# [[20, 3, 100],
#  [300, 10, 2],
#  [1, 200, 30]]

pprint.pprint(sorted(l_2d, key=operator.itemgetter(2)), width=20)
# [[300, 10, 2],
#  [1, 200, 30],
#  [20, 3, 100]]

source: list_2d_sort.py
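The key argument can be combined with reverse=True for descending order, and it works the same way with the in-place sort() method. A minimal sketch (the variable name desc is introduced here for illustration):

```python
import operator

l_2d = [[20, 3, 100], [1, 200, 30], [300, 10, 2]]

# Descending order by the second column (index 1).
desc = sorted(l_2d, key=operator.itemgetter(1), reverse=True)
print(desc)
# [[1, 200, 30], [300, 10, 2], [20, 3, 100]]

# In-place sort by the third column (index 2).
l_2d.sort(key=lambda x: x[2])
print(l_2d)
# [[300, 10, 2], [1, 200, 30], [20, 3, 100]]
```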

To sort according to a specific row, transpose the list, sort it as described above, and then transpose it back.
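The transpose, sort, transpose-back procedure for rows can be sketched as follows. This example reorders the columns by their values in the second row (index 1); the intermediate names are chosen here for illustration.

```python
import pprint

l_2d = [[20, 3, 100], [1, 200, 30], [300, 10, 2]]

# Transpose so that each column becomes one tuple.
columns = zip(*l_2d)

# Order the columns by their value in the second row (index 1).
sorted_columns = sorted(columns, key=lambda c: c[1])

# Transpose back and convert each row from a tuple to a list.
result = [list(r) for r in zip(*sorted_columns)]
pprint.pprint(result, width=20)
# [[20, 100, 3],
#  [1, 30, 200],
#  [300, 2, 10]]
```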

Sort according to multiple rows/columns

Let's consider a case with duplicated values.

l_2d_dup = [[1, 3, 100], [1, 200, 30], [1, 3, 2]]
pprint.pprint(l_2d_dup, width=20)
# [[1, 3, 100],
#  [1, 200, 30],
#  [1, 3, 2]]

source: list_2d_sort.py

By default, sorting compares and orders each list based on the first unequal element. Thus, if the first column's elements are identical, sorting will proceed according to the second column, and so forth.

pprint.pprint(sorted(l_2d_dup), width=20)
# [[1, 3, 2],
#  [1, 3, 100],
#  [1, 200, 30]]

source: list_2d_sort.py
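The ordering above follows directly from how Python compares two lists. The comparisons can be checked individually, as in this short sketch (the variable names are introduced here for illustration):

```python
# The first unequal elements are at index 1: 3 < 200.
cmp_second = [1, 3, 100] < [1, 200, 30]
print(cmp_second)
# True

# The first two elements are equal, so index 2 decides: 2 < 100.
cmp_third = [1, 3, 2] < [1, 3, 100]
print(cmp_third)
# True

# If one list is a prefix of the other, the shorter list is smaller.
cmp_prefix = [1, 3] < [1, 3, 2]
print(cmp_prefix)
# True
```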

If you want to sort according to multiple columns in an arbitrary order, you can use the key argument.

operator.itemgetter() accepts multiple indices, in which case it returns a tuple of the corresponding elements. Tuples are compared like lists, so rows with an identical first value are ordered by the second value. In the following example, sorting is based on the first and third columns.

pprint.pprint(sorted(l_2d_dup, key=operator.itemgetter(0, 2)), width=20)
# [[1, 3, 2],
#  [1, 200, 30],
#  [1, 3, 100]]

source: list_2d_sort.py
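To see what sorted() actually compares in this case, you can call the object returned by operator.itemgetter() directly. In this sketch, the names getter and keys are introduced for illustration.

```python
import operator

l_2d_dup = [[1, 3, 100], [1, 200, 30], [1, 3, 2]]

# With two indices, itemgetter() returns a callable that builds a tuple.
getter = operator.itemgetter(0, 2)
print(getter([1, 200, 30]))
# (1, 30)

# These tuples are the values that sorted() compares for each row.
keys = [getter(row) for row in l_2d_dup]
print(keys)
# [(1, 100), (1, 30), (1, 2)]
```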

The same process can also be executed with a lambda expression.

pprint.pprint(sorted(l_2d_dup, key=lambda x: (x[0], x[2])), width=20)
# [[1, 3, 2],
#  [1, 200, 30],
#  [1, 3, 100]]

source: list_2d_sort.py
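A tuple key also makes it possible to mix ascending and descending order, which the examples above do not cover. The following sketch, assuming numeric values, sorts by the second column in ascending order and breaks ties by the third column in descending order. It shows two approaches: negating a value in the key, and sorting twice, which relies on Python's sort being stable.

```python
import operator

l_2d_dup = [[1, 3, 100], [1, 200, 30], [1, 3, 2]]

# Approach 1: negate the numeric value that should be in descending order.
by_negation = sorted(l_2d_dup, key=lambda x: (x[1], -x[2]))
print(by_negation)
# [[1, 3, 100], [1, 3, 2], [1, 200, 30]]

# Approach 2: sort by the secondary key first, then by the primary key.
# The second sort keeps the existing order of rows with equal primary keys.
tmp = sorted(l_2d_dup, key=operator.itemgetter(2), reverse=True)
by_two_passes = sorted(tmp, key=operator.itemgetter(1))
print(by_two_passes)
# [[1, 3, 100], [1, 3, 2], [1, 200, 30]]
```

The negation trick works only for values that can be negated, such as numbers. The two-pass approach also works for other types, such as strings.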

Again, to sort according to specific rows, transpose the list, sort it as described above, and then transpose it back.

Use pandas: `sort_values()`

Pandas offers an even simpler approach. Let's consider the following 2D list (list of lists) as an example:

l_2d_dup = [[1, 3, 100], [1, 200, 30], [1, 3, 2]]
pprint.pprint(l_2d_dup, width=20)
# [[1, 3, 100],
#  [1, 200, 30],
#  [1, 3, 2]]

source: list_2d_sort.py

You can generate a pandas.DataFrame from a list of lists. Row and column names are optional and can be anything that suits your needs.

import pandas as pd

df = pd.DataFrame(l_2d_dup, columns=['A', 'B', 'C'], index=['X', 'Y', 'Z'])
print(df)
#    A    B    C
# X  1    3  100
# Y  1  200   30
# Z  1    3    2

source: list_2d_sort.py

The sort_values() method sorts according to specific columns or rows. By default (axis=0), it reorders the rows based on the values in the specified column. With axis=1, it reorders the columns based on the values in the specified row.

print(df.sort_values('C'))
#    A    B    C
# Z  1    3    2
# Y  1  200   30
# X  1    3  100

print(df.sort_values('Z', axis=1))
#    A    C    B
# X  1  100    3
# Y  1   30  200
# Z  1    2    3

source: list_2d_sort.py

You can specify multiple columns or rows.

print(df.sort_values(['A', 'C']))
#    A    B    C
# Z  1    3    2
# Y  1  200   30
# X  1    3  100

source: list_2d_sort.py

If you omit the columns and index arguments when generating a pandas.DataFrame, the row and column names will default to a series of numbers. Even though this may appear confusing with both row and column names being numeric, it does not affect the processing.

df = pd.DataFrame(l_2d_dup)
print(df)
#    0    1    2
# 0  1    3  100
# 1  1  200   30
# 2  1    3    2

print(df.sort_values(2))
#    0    1    2
# 2  1    3    2
# 1  1  200   30
# 0  1    3  100

print(df.sort_values(2, axis=1))
#    0    2    1
# 0  1  100    3
# 1  1   30  200
# 2  1    2    3

print(df.sort_values([0, 2]))
#    0    1    2
# 2  1    3    2
# 1  1  200   30
# 0  1    3  100

source: list_2d_sort.py

For more details on sort_values(), such as specifying the sorting order with the ascending argument, refer to the following article:

pandas: Sort DataFrame, Series with sort_values(), sort_index()

You can also convert a pandas.DataFrame to a list or numpy.ndarray.

For example, you can convert a pandas.DataFrame to a list as follows:

print(df.sort_values([0, 2]).values.tolist())
# [[1, 3, 2], [1, 200, 30], [1, 3, 100]]

print(type(df.sort_values([0, 2]).values.tolist()))
# <class 'list'>

source: list_2d_sort.py
