Certain characters must be encoded in URLs: control characters, non-ASCII characters, reserved characters used in URL syntax, and characters that are unsafe for other reasons. A character is encoded by replacing it with the % symbol followed by the two-digit hexadecimal value of its ISO-Latin code point. This allows characters like spaces, quotes, and other symbols to be used safely within URLs.
URL Encoded characters
What characters need to be encoded and why?
ASCII Control characters
Why: These characters are not printable.
Characters: Includes the ISO-8859-1 (ISO-Latin) character ranges 00-1F hex (0-31 decimal) and 7F (127 decimal).

Non-ASCII characters
Why: These are by definition not legal in URLs since they are not in the ASCII set.
Characters: Includes the entire "top half" of the ISO-Latin set, 80-FF hex (128-255 decimal).

"Reserved characters"
Why: URLs use some characters for special use in defining their syntax. When these characters are not used in their special role inside a URL, they need to be encoded.
Characters:

Character                      Code Point (Hex)   Code Point (Dec)
Dollar ("$")                   24                 36
Ampersand ("&")                26                 38
Plus ("+")                     2B                 43
Comma (",")                    2C                 44
Forward slash/Virgule ("/")    2F                 47
Colon (":")                    3A                 58
Semi-colon (";")               3B                 59
Equals ("=")                   3D                 61
Question mark ("?")            3F                 63
'At' symbol ("@")              40                 64

"Unsafe characters"
Why: Some characters present the possibility of being misunderstood within URLs for various reasons. These characters should also always be encoded.
Characters:

Character                      Code Point (Hex)   Code Point (Dec)   Why encode?
Space                          20                 32                 Significant sequences of spaces may be lost in some uses (especially multiple spaces).
Quotation marks                22                 34                 These characters are often used to delimit URLs in plain text.
'Less Than' symbol ("<")       3C                 60                 These characters are often used to delimit URLs in plain text.
'Greater Than' symbol (">")    3E                 62                 These characters are often used to delimit URLs in plain text.
'Pound' character ("#")        23                 35                 This is used in URLs to indicate where a fragment identifier (bookmarks/anchors in HTML) begins.
Percent character ("%")        25                 37                 This is used to URL encode/escape other characters, so it should itself also be encoded.

Misc. characters (some systems can possibly modify these characters):
Left Curly Brace ("{")         7B                 123
Right Curly Brace ("}")        7D                 125
Vertical Bar/Pipe ("|")        7C                 124
Backslash ("\")                5C                 92
Caret ("^")                    5E                 94
Tilde ("~")                    7E                 126
Left Square Bracket ("[")      5B                 91
Right Square Bracket ("]")     5D                 93
Grave Accent ("`")             60                 96
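
The four categories above can be sketched as a small lookup in Python. This is only an illustration of the tables in this document: the names RESERVED, UNSAFE, and classify are hypothetical, and the sketch treats every code point from 80 hex upward as "non-ASCII", which also covers characters beyond the ISO-Latin set.

```python
# Character sets copied from the tables in this document (names are hypothetical).
RESERVED = set("$&+,/:;=?@")
UNSAFE = set(' "<>#%{}|\\^~[]`')


def classify(ch: str) -> str:
    """Return the category from this document that a single character falls into."""
    code = ord(ch)
    if code <= 0x1F or code == 0x7F:
        return "control"      # 00-1F hex and 7F hex
    if code >= 0x80:
        return "non-ascii"    # 80 hex and above
    if ch in RESERVED:
        return "reserved"
    if ch in UNSAFE:
        return "unsafe"
    return "ok"               # none of the listed categories applies


if __name__ == "__main__":
    for ch in ["\t", "\xe9", "&", " ", "a"]:
        print(repr(ch), classify(ch))
```

A character classified as "reserved" only needs encoding when it is not playing its special role in the URL; the other three categories are encoded regardless.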
How are characters URL encoded?
URL encoding of a character consists of a "%" symbol, followed by the two-digit hexadecimal representation (case-insensitive) of the ISO-Latin code point for the character.
Example:
Space = decimal code point 32 in the ISO-Latin set.
32 decimal = 20 in hexadecimal
The URL encoded representation will be "%20"
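
The rule above can be sketched in a few lines of Python. The helper percent_encode_char is a hypothetical name used only for illustration. The standard library function urllib.parse.quote is shown for comparison, with one caveat: it encodes non-ASCII characters as UTF-8 bytes by default, so the sketch passes encoding="latin-1" to match the ISO-Latin rule described in this document.

```python
from urllib.parse import quote


def percent_encode_char(ch: str) -> str:
    """Encode one character as '%' plus its two-digit hex ISO-Latin code point."""
    code = ord(ch)
    if code > 0xFF:
        # Outside ISO-8859-1; the rule described here does not cover it.
        raise ValueError("character is outside the ISO-Latin range")
    return f"%{code:02X}"


if __name__ == "__main__":
    print(percent_encode_char(" "))   # %20 (32 decimal = 20 hex)
    print(percent_encode_char("#"))   # %23

    # Standard library equivalent; safe="" means no character is left unencoded
    # just because it is reserved.
    print(quote(" ", safe=""))                          # %20
    print(quote("\xe9", safe="", encoding="latin-1"))   # %E9
```

Because the hexadecimal digits are case-insensitive, "%e9" and "%E9" decode to the same character; the sketch emits uppercase only as a convention.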