Unicodedata – Unicode Database in Python

Last Updated : 19 Nov, 2020

The Unicode Character Database (UCD) is defined by Unicode Standard Annex #44, which specifies the character properties for all Unicode characters. The unicodedata module provides access to the UCD and uses the same symbols and names as defined by the Unicode Character Database.

Functions defined by the module :

unicodedata.lookup(name)
This function looks up a character by name. If a character with the given name is found in the database, the corresponding character is returned; otherwise KeyError is raised.

Example :

Python3

    import unicodedata

    print(unicodedata.lookup('LEFT CURLY BRACKET'))
    print(unicodedata.lookup('RIGHT CURLY BRACKET'))
    print(unicodedata.lookup('ASTERISK'))

    # raises KeyError, as there is
    # no character named ASTER
    # print(unicodedata.lookup('ASTER'))

Output :

    {
    }
    *

unicodedata.name(chr[, default])
This function returns the name assigned to the given character as a string. If no name is defined, default is returned if it was given; otherwise ValueError is raised.

Example :

Python3

    import unicodedata

    print(unicodedata.name('/'))
    print(unicodedata.name('|'))
    print(unicodedata.name(':'))

Output :

    SOLIDUS
    VERTICAL LINE
    COLON

unicodedata.decimal(chr[, default])
This function returns the decimal value assigned to the given character as an integer. If no value is defined, default is returned if it was given; otherwise ValueError is raised.

Example :

Python3

    import unicodedata

    print(unicodedata.decimal('9'))
    print(unicodedata.decimal('a'))

Output :

    9
    Traceback (most recent call last):
      File "7e736755dd176cd0169eeea6f5d32057.py", line 4, in <module>
        print(unicodedata.decimal('a'))
    ValueError: not a decimal

unicodedata.digit(chr[, default])
This function returns the digit value assigned to the given character as an integer.
If no value is defined, default is returned if it was given; otherwise ValueError is raised.

Example :

Python3

    import unicodedata

    print(unicodedata.digit('9'))
    print(unicodedata.digit('143'))

Output :

    9

The second call then raises TypeError, because '143' is not a single character. The exact error message depends on the Python version.

unicodedata.numeric(chr[, default])
This function returns the numeric value assigned to the given character as a float. If no value is defined, default is returned if it was given; otherwise ValueError is raised.

Example :

Python3

    import unicodedata

    print(unicodedata.numeric('9'))
    print(unicodedata.numeric('143'))

Output :

    9.0

As with digit(), the second call raises TypeError, because the argument must be a single character.

unicodedata.category(chr)
This function returns the general category assigned to the given character as a string. For example, 'Lu' stands for "Letter, uppercase" and 'Ll' stands for "Letter, lowercase".

Example :

Python3

    import unicodedata

    print(unicodedata.category('A'))
    print(unicodedata.category('b'))

Output :

    Lu
    Ll

unicodedata.bidirectional(chr)
This function returns the bidirectional class assigned to the given character as a string. For example, 'AN' stands for "Arabic Number". An empty string is returned if no such value is defined.

Example :

Python3

    import unicodedata

    print(unicodedata.bidirectional('\u0660'))

Output :

    AN

unicodedata.normalize(form, unistr)
This function returns the normal form for the Unicode string unistr. Valid values for form are 'NFC', 'NFKC', 'NFD', and 'NFKD'.
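One common reason to normalize is that the same visible text can be stored as different code point sequences, which then compare as unequal. The following is a minimal sketch of that situation; the helper name same_text is hypothetical and not part of the unicodedata module, and NFC is just one reasonable choice of form.

```python
import unicodedata

composed = "\u00e9"      # "é" stored as a single code point
decomposed = "e\u0301"   # "e" followed by COMBINING ACUTE ACCENT


def same_text(a, b, form="NFC"):
    """Hypothetical helper: compare two strings after normalizing both."""
    return unicodedata.normalize(form, a) == unicodedata.normalize(form, b)


# The raw strings differ even though they render identically.
print(composed == decomposed)            # False
print(same_text(composed, decomposed))   # True

# NFD splits the precomposed character into base letter + combining mark.
print(len(unicodedata.normalize("NFD", composed)))   # 2

# The "K" forms also apply compatibility mappings, e.g. the "fi" ligature.
print(unicodedata.normalize("NFKC", "\ufb01"))       # fi
```

The NFC and NFD forms only change how canonically equivalent text is represented, while NFKC and NFKD can also replace characters with compatibility equivalents, so they are better suited to searching and matching than to storing text unchanged.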
Example :

Python3

    from unicodedata import normalize

    # ascii() escapes non-ASCII characters so the code points are visible
    print(ascii(normalize('NFD', '\u00C7')))
    print(ascii(normalize('NFC', 'C\u0327')))
    print(ascii(normalize('NFKD', '\u2460')))

Output :

    'C\u0327'
    '\xc7'
    '1'

Author : Aditi Gupta
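The lookup functions covered in this article can be combined to inspect a character in one pass. The sketch below is only an illustration: the function name describe and the "<unnamed>" placeholder are assumptions made for this example, not part of the module. Passing a default to name(), decimal(), digit(), and numeric() avoids the ValueError that would otherwise be raised for characters without that property.

```python
import unicodedata


def describe(ch):
    """Hypothetical helper: collect several UCD properties for one character."""
    return {
        "name": unicodedata.name(ch, "<unnamed>"),
        "category": unicodedata.category(ch),
        "decimal": unicodedata.decimal(ch, None),
        "digit": unicodedata.digit(ch, None),
        "numeric": unicodedata.numeric(ch, None),
    }


# "7" is a decimal digit, "\u00b2" is SUPERSCRIPT TWO,
# "\u00bd" is VULGAR FRACTION ONE HALF, and "A" has no numeric value.
for ch in ["7", "\u00b2", "\u00bd", "A"]:
    print(repr(ch), describe(ch))
```

The three numeric functions differ in scope: decimal() covers only decimal digit characters, digit() also covers characters such as superscript digits, and numeric() additionally covers fractions and returns a float.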