Get the length of a string (number of characters) in Python

In Python, you can get the length of a string (str), which is the number of characters, using the built-in len() function.

Get the length of a string (number of characters) using len()

Passing a string to the built-in len() function returns its length (the number of characters) as an integer.

s = 'abcde'

print(len(s))
# 5
source: str_len.py
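
As a minimal sketch of a few edge cases (the variable names and values here are illustrative, not from the original sample file): an empty string has length 0, spaces count as characters, and the result is always an int.

```python
# Illustrative values, not part of the original sample file.
empty = ''
spaced = 'a b c'

print(len(empty))
# 0

# The two spaces are counted like any other character.
print(len(spaced))
# 5

print(type(len(spaced)))
# <class 'int'>
```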

Full-width and half-width characters

len() counts Unicode code points, so full-width and half-width characters each count as one character.

s = 'あいうえお'

print(len(s))
# 5

s = 'abcdeあいうえお'

print(len(s))
# 10
source: str_len.py
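
The count is therefore neither a byte count nor a display width. The following sketch, which assumes UTF-8 as the encoding and uses an illustrative combining-accent example, shows how the numbers can differ.

```python
import unicodedata

s = 'あいうえお'

# len() counts code points; UTF-8 uses 3 bytes for each hiragana character.
n_chars = len(s)
n_bytes = len(s.encode('utf-8'))
print(n_chars, n_bytes)
# 5 15

# 'e' followed by a combining acute accent is two code points,
# even though it is displayed as a single character.
decomposed = 'e\u0301'
composed = unicodedata.normalize('NFC', decomposed)
print(len(decomposed), len(composed))
# 2 1
```

If what you need is the number of user-perceived characters, normalizing first (as with NFC above) helps in some cases, but not in all of them.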

Escape sequences and special characters

In Python, special characters such as TAB are written as escape sequences that start with a backslash, as in \t. The backslash itself is written as \\. Each escape sequence counts as a single character.

s = 'a\tb\\c'
print(s)
# a b\c

print(len(s))
# 5
source: str_len.py

In raw strings, escape sequences are not interpreted and the string is treated as-is. The number of characters is also counted as-is.

s = r'a\tb\\c'
print(s)
# a\tb\\c

print(len(s))
# 7
source: str_len.py

A Unicode escape sequence, such as \uXXXX, is also treated as a single character.

s = '\u3042\u3044\u3046'
print(s)
# あいう

print(len(s))
# 3
source: str_len.py

In raw strings, even Unicode escape sequences are not interpreted and are treated as literal text.

s = r'\u3042\u3044\u3046'
print(s)
# \u3042\u3044\u3046

print(len(s))
# 18
source: str_len.py
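
A raw string is only a different way of writing a literal; the resulting str is the same as one written with doubled backslashes. The sketch below uses illustrative variable names and assumes Python 3.

```python
raw = r'a\tb'
escaped = 'a\\tb'

# Both literals produce the same four characters: a, \, t, b.
print(raw == escaped)
# True

print(len(raw), len(escaped))
# 4 4

# The eight-digit form \UXXXXXXXX also yields a single character.
emoji = '\U0001F600'
print(len(emoji))
# 1
```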

Line breaks

\n (LF: Line Feed) is also considered a single character.

s = 'a\nb'
print(s)
# a
# b

print(len(s))
# 3
source: str_len.py

If \r\n (CR: Carriage Return + LF: Line Feed) is used, it is counted as two characters, \r and \n.

s = 'a\r\nb'
print(s)
# a
# b

print(len(s))
# 4
source: str_len.py

If \n and \r\n are mixed in the same text, each line break adds either one or two characters to the length, depending on which type it is.

s = 'abc\nabcd\r\nab'
print(s)
# abc
# abcd
# ab

print(len(s))
# 12
source: str_len.py

To handle mixed \n and \r\n, you can use the splitlines() method, which splits the string into lines and returns them as a list.

print(s.splitlines())
# ['abc', 'abcd', 'ab']
source: str_len.py

The number of elements in the list returned by splitlines() corresponds to the number of lines.

print(len(s.splitlines()))
# 3
source: str_len.py

You can get the number of characters in each line using a list comprehension.

print([len(line) for line in s.splitlines()])
# [3, 4, 2]
source: str_len.py
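
To see how many characters each line break contributes, one option is to pass keepends=True to splitlines(), which keeps the terminator attached to each line. This is a small sketch that reuses the same sample string.

```python
s = 'abc\nabcd\r\nab'

# keepends=True keeps each line's terminator attached to the line.
lines = s.splitlines(keepends=True)
print(lines)
# ['abc\n', 'abcd\r\n', 'ab']

lengths = [len(line) for line in lines]
print(lengths)
# [4, 6, 2]

# With the terminators kept, the per-line lengths add up to len(s).
print(sum(lengths) == len(s))
# True
```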

To calculate the total number of characters excluding line breaks, use sum() with a generator expression. Generator expressions are enclosed in () instead of []. When a generator expression is the only argument of a function call, as in this example, its own parentheses can be omitted.

print(sum(len(line) for line in s.splitlines()))
# 9
source: str_len.py
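
If you instead want a count that includes line breaks but treats every break as one character, a possible approach (an assumption about the use case, not part of the original sample file) is to normalize \r\n to \n with replace() before calling len().

```python
s = 'abc\nabcd\r\nab'

# Normalize Windows-style line endings so each break is one character.
normalized = s.replace('\r\n', '\n')
print(len(normalized))
# 11

# Number of line-break characters: total length minus the characters on each line.
n_breaks = len(normalized) - sum(len(line) for line in normalized.splitlines())
print(n_breaks)
# 2
```

Note that this handles only \r\n; a lone \r, which splitlines() also treats as a line boundary, is left unchanged.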
