Get the Length of a String (Number of Characters) in Python
In Python, you can get the length of a string (str), which is the number of characters, using the built-in len() function.
To learn how to count specific characters or substrings, see the following article.
For more on using len() with other types like list, see the following article.
Get the length of a string (number of characters) using len()
When you pass a string to the built-in len() function, it returns the string's length (the number of characters) as an integer.
s = 'abcde'
print(len(s))
# 5
Full-width and half-width characters
Full-width and half-width characters are each counted as a single character.
s = 'あいうえお'
print(len(s))
# 5
s = 'abcdeあいうえお'
print(len(s))
# 10
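As a rough illustration of why width does not matter, len() counts characters (Unicode code points), not bytes. The comparison with the UTF-8 byte length below is an added sketch, not part of the original example.

```python
s = 'abcdeあいうえお'

# len() counts characters (Unicode code points), regardless of display width.
print(len(s))
# 10

# The UTF-8 encoded form is longer: 1 byte per ASCII letter, 3 bytes per hiragana.
print(len(s.encode('utf-8')))
# 20
```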
Escape sequences and special characters
In Python, special characters like TAB are represented with a backslash, as in \t. The backslash itself is represented by \\. Each of these special characters is counted as a single character.
s = 'a\tb\\c'
print(s)
# a b\c
print(len(s))
# 5
In raw strings, escape sequences are not interpreted and the string is treated as-is. Every backslash and every character that follows it is therefore counted individually.
s = r'a\tb\\c'
print(s)
# a\tb\\c
print(len(s))
# 7
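One way to check this count is to compare the raw string with an ordinary string in which every backslash is written as \\. The variable names raw and escaped below are only illustrative.

```python
raw = r'a\tb\\c'
escaped = 'a\\tb\\\\c'  # the same characters, written with escaped backslashes

# Both literals produce the identical 7-character string: a \ t b \ \ c
print(raw == escaped)
# True
print(len(raw), len(escaped))
# 7 7
```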
A Unicode escape sequence, such as \uXXXX, is also treated as a single character.
s = '\u3042\u3044\u3046'
print(s)
# あいう
print(len(s))
# 3
In raw strings, even Unicode escape sequences are not interpreted and are treated as literal text.
s = r'\u3042\u3044\u3046'
print(s)
# \u3042\u3044\u3046
print(len(s))
# 18
Line breaks
\n (LF: Line Feed) is also considered a single character.
s = 'a\nb'
print(s)
# a
# b
print(len(s))
# 3
If \r\n (CR: Carriage Return + LF: Line Feed) is used, it is counted as two characters, \r and \n.
s = 'a\r\nb'
print(s)
# a
# b
print(len(s))
# 4
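If you would rather count each \r\n as one line break, one possible approach (an assumption about what you need, not the only way) is to normalize it to \n with replace() before calling len().

```python
s = 'a\r\nb'

# Replace each CRLF pair with a single LF before counting.
normalized = s.replace('\r\n', '\n')

print(len(s))
# 4
print(len(normalized))
# 3
```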
If \n and \r\n are mixed in the same text, the number of characters used for each line break differs (one for \n, two for \r\n).
s = 'abc\nabcd\r\nab'
print(s)
# abc
# abcd
# ab
print(len(s))
# 12
To handle mixed \n and \r\n, you can use the splitlines() method, which splits the string into lines and returns them as a list.
print(s.splitlines())
# ['abc', 'abcd', 'ab']
The number of elements in the list returned by splitlines() corresponds to the number of lines.
print(len(s.splitlines()))
# 3
You can get the number of characters in each line using a list comprehension.
print([len(line) for line in s.splitlines()])
# [3, 4, 2]
To calculate the total number of characters excluding line breaks, use sum() with a generator expression. Generator expressions are enclosed in () instead of []. When a generator expression is the only argument to a function, as in this example, its own parentheses can be omitted.
print(sum(len(line) for line in s.splitlines()))
# 9
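The same idea can be wrapped in a small helper. The function name count_chars_without_newlines is hypothetical and used only for this sketch. Note that splitlines() also splits on some other line boundaries, such as form feed, so this assumes the text contains only ordinary line breaks.

```python
def count_chars_without_newlines(text):
    """Return the number of characters in text, excluding line-break characters."""
    return sum(len(line) for line in text.splitlines())


s = 'abc\nabcd\r\nab'

print(count_chars_without_newlines(s))
# 9

# The difference from len(s) is the number of line-break characters (\n, \r, \n).
print(len(s) - count_chars_without_newlines(s))
# 3
```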
For more information about handling line breaks, see the following article.