Java - URL Processing
URL stands for Uniform Resource Locator and represents a resource on the World Wide Web, such as a Web page or FTP directory.
This section shows you how to write Java programs that communicate with a URL. A URL can be broken down into parts, as follows −
protocol://host:port/path?query#ref
Examples of protocols include HTTP, HTTPS, FTP, and File. The path is also referred to as the filename (in Java, getFile() returns the path together with the query string), and the host, together with the optional port, forms the authority.
The following is a URL to a web page whose protocol is HTTPS −
https://www.amrood.com/index.htm?language=en#j2se
Notice that this URL does not specify a port, in which case the default port for the protocol is used. With HTTPS, the default port is 443; with HTTP, it is 80.
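The default-port rule can be sketched in a few lines. This is a minimal illustration, not part of the original tutorial: the class name EffectivePortSketch, the helper effectivePort, and the host www.example.com are placeholders chosen for the example.

```java
import java.net.MalformedURLException;
import java.net.URL;

class EffectivePortSketch {
    // Hypothetical helper: returns the explicit port if the URL names one,
    // otherwise the default port of the URL's protocol.
    static int effectivePort(String spec) {
        try {
            URL url = new URL(spec);
            int port = url.getPort();           // -1 when the URL has no explicit port
            return port != -1 ? port : url.getDefaultPort();
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException("Not a valid URL: " + spec, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(effectivePort("https://www.example.com/index.htm"));     // 443
        System.out.println(effectivePort("http://www.example.com:8080/index.htm")); // 8080
    }
}
```

Neither call opens a network connection; the URL is only parsed.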
Constructors
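The java.net.URL class represents a URL and provides methods for creating a URL and reading its individual parts.
The URL class has several constructors for creating URLs, including the following (since Java 20 they are deprecated in favor of URI.create(...).toURL(), but they remain available) −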
| Sr.No. | Constructors & Description |
|---|---|
| 1 | public URL(String protocol, String host, int port, String file) throws MalformedURLException Creates a URL by putting together the given parts. |
| 2 | public URL(String protocol, String host, String file) throws MalformedURLException Identical to the previous constructor, except that the default port for the given protocol is used. |
| 3 | public URL(String url) throws MalformedURLException Creates a URL from the given String. |
| 4 | public URL(URL context, String url) throws MalformedURLException Creates a URL by parsing the given String relative to the context URL. |
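As a rough illustration of the four constructors listed above, the sketch below builds one URL with each of them. The class name UrlConstructorsSketch, the helper buildAll, the host www.example.com, and port 8443 are assumptions made for the example; on Java 20 and later the compiler may report deprecation warnings for these constructors.

```java
import java.net.MalformedURLException;
import java.net.URL;
import java.util.Arrays;
import java.util.List;

class UrlConstructorsSketch {
    // Hypothetical helper: builds one URL per constructor and returns their string forms.
    static List<String> buildAll() {
        try {
            URL withPort = new URL("https", "www.example.com", 8443, "/index.htm");
            URL defaultPort = new URL("https", "www.example.com", "/index.htm");
            URL fromString = new URL("https://www.example.com/docs/index.htm");
            // The second argument is resolved relative to the context URL.
            URL relative = new URL(fromString, "guide.htm");
            return Arrays.asList(withPort.toString(), defaultPort.toString(),
                    fromString.toString(), relative.toString());
        } catch (MalformedURLException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        buildAll().forEach(System.out::println);
    }
}
```

The last URL resolves to https://www.example.com/docs/guide.htm, because the relative name replaces the final segment of the context URL's path.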
The URL class contains many methods for accessing the various parts of the URL being represented. Some of the methods in the URL class include the following −
| Sr.No. | Method & Description |
|---|---|
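| 1 | public boolean equals(Object obj) This method compares this URL for equality with another object. |
| 2 | public String getAuthority() This method returns the authority of the URL. |
| 3 | public Object getContent() This method returns the contents of this URL. |
| 4 | public Object getContent(Class<?>[] classes) This method returns the contents of this URL as the first supported type among the given classes. |
| 5 | public int getDefaultPort() This method returns the default port for the protocol of the URL. |
| 6 | public String getFile() This method returns the filename of the URL, that is, the path followed by the query, if any. |
| 7 | public String getHost() This method returns the host of the URL. |
| 8 | public String getPath() This method returns the path of the URL. |
| 9 | public int getPort() This method returns the port of the URL, or -1 if no port is set. |
| 10 | public String getProtocol() This method returns the protocol of the URL. |
| 11 | public String getQuery() This method returns the query part of the URL. |
| 12 | public String getRef() This method returns the reference part (the anchor) of the URL. |
| 13 | public String getUserInfo() This method returns the userInfo part of the URL. |
| 14 | public int hashCode() This method creates and returns an integer suitable for hash table indexing. |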
| 15 | public URLConnection openConnection() This method returns a URLConnection instance that represents a connection to the remote object referred to by the URL. |
| 16 | public URLConnection openConnection(Proxy proxy) This method acts as openConnection(), except that the connection will be made through the specified proxy. Protocol handlers that do not support proxying will ignore the proxy parameter and make a normal connection. |
| 17 | public InputStream openStream() This method opens a connection to this URL and returns an InputStream for reading from that connection. |
| 18 | public boolean sameFile(URL other) This method compares two URLs, excluding the fragment component. |
| 19 | public static void setURLStreamHandlerFactory(URLStreamHandlerFactory fac) This method sets an application's URLStreamHandlerFactory. |
| 20 | public String toExternalForm() This method constructs and returns a string representation of this URL. |
| 21 | public String toString() This method constructs and returns a string representation of this URL. |
| 22 | public URI toURI() This method returns a URI equivalent to this URL. |
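One practical use of toURI() is comparison. URL.equals() and URL.hashCode() may resolve host names, which can block on a DNS lookup, whereas java.net.URI compares components textually. The following is a minimal sketch under that assumption; the class name UrlCompareSketch, the helper sameAsUri, and the example host are hypothetical.

```java
import java.net.MalformedURLException;
import java.net.URISyntaxException;
import java.net.URL;

class UrlCompareSketch {
    // Hypothetical helper: compares two URLs through their URI form,
    // which does not need any name resolution.
    static boolean sameAsUri(String first, String second) {
        try {
            return new URL(first).toURI().equals(new URL(second).toURI());
        } catch (MalformedURLException | URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        // Host names are compared without regard to case.
        System.out.println(sameAsUri("https://www.example.com/index.htm",
                                     "https://WWW.EXAMPLE.COM/index.htm"));
        // Unlike sameFile(), URI equality also takes the fragment into account.
        System.out.println(sameAsUri("https://www.example.com/index.htm#a",
                                     "https://www.example.com/index.htm#b"));
    }
}
```

The first comparison yields true and the second false.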
Example
The following URLDemo program demonstrates the various parts of a URL. The URL is hard-coded in the program, and URLDemo prints each of its parts.
// File Name : URLDemo.java
import java.net.MalformedURLException;
import java.net.URL;

public class URLDemo {
   public static void main(String[] args) {
      try {
         URL url = new URL("https://www.tutorialspoint.com/index.htm?language=en#j2se");

         // Print each component of the parsed URL.
         System.out.println("URL is " + url.toString());
         System.out.println("protocol is " + url.getProtocol());
         System.out.println("authority is " + url.getAuthority());
         System.out.println("file name is " + url.getFile());
         System.out.println("host is " + url.getHost());
         System.out.println("path is " + url.getPath());
         System.out.println("port is " + url.getPort());            // -1: no explicit port
         System.out.println("default port is " + url.getDefaultPort());
         System.out.println("query is " + url.getQuery());
         System.out.println("ref is " + url.getRef());
      } catch (MalformedURLException e) {
         // Thrown when the string cannot be parsed as a URL.
         e.printStackTrace();
      }
   }
}
A sample run of this program will produce the following result −
Output
URL is https://www.tutorialspoint.com/index.htm?language=en#j2se
protocol is https
authority is www.tutorialspoint.com
file name is /index.htm?language=en
host is www.tutorialspoint.com
path is /index.htm
port is -1
default port is 443
query is language=en
ref is j2se
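getQuery() returns the query as one raw string (language=en above). As a hedged sketch, it could be split into key-value pairs along the following lines; the class name QueryParamsSketch, the helper queryParams, and the extra page=2 parameter are made up for illustration, and URLDecoder.decode(String, Charset) requires Java 10 or later.

```java
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

class QueryParamsSketch {
    // Hypothetical helper: splits the query part of a URL into decoded key-value pairs.
    // If a key repeats, the last value wins.
    static Map<String, String> queryParams(String spec) {
        URL url;
        try {
            url = new URL(spec);
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException("Not a valid URL: " + spec, e);
        }
        Map<String, String> params = new LinkedHashMap<>();
        String query = url.getQuery();          // null when the URL has no query
        if (query == null || query.isEmpty()) {
            return params;
        }
        for (String pair : query.split("&")) {
            if (pair.isEmpty()) {
                continue;                       // skip empty segments such as "a=1&&b=2"
            }
            int eq = pair.indexOf('=');
            String key = eq >= 0 ? pair.substring(0, eq) : pair;
            String value = eq >= 0 ? pair.substring(eq + 1) : "";
            params.put(URLDecoder.decode(key, StandardCharsets.UTF_8),
                       URLDecoder.decode(value, StandardCharsets.UTF_8));
        }
        return params;
    }

    public static void main(String[] args) {
        System.out.println(
            queryParams("https://www.example.com/index.htm?language=en&page=2#j2se"));
    }
}
```

The fragment (#j2se) is not part of the query, so it does not appear in the resulting map.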
URLConnection Class Methods
The openConnection() method returns a java.net.URLConnection, an abstract class whose subclasses represent the various types of URL connections.
For example −
If you connect to a URL whose protocol is HTTP, the openConnection() method returns an HttpURLConnection object.
If you connect to a URL that represents a JAR file, the openConnection() method returns a JarURLConnection object, etc.
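The two cases above can be checked without any network traffic, because openConnection() only creates the connection object and does not connect. The sketch below is illustrative only: the class name ConnectionKindSketch, the helper connectionKind, and the path /tmp/app.jar are hypothetical (the JAR file does not need to exist for this check).

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.JarURLConnection;
import java.net.URL;
import java.net.URLConnection;

class ConnectionKindSketch {
    // Hypothetical helper: reports which URLConnection subclass openConnection() returns.
    static String connectionKind(String spec) {
        try {
            URLConnection connection = new URL(spec).openConnection();
            if (connection instanceof HttpURLConnection) {
                return "HttpURLConnection";     // also covers https URLs
            }
            if (connection instanceof JarURLConnection) {
                return "JarURLConnection";
            }
            return "URLConnection";
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(connectionKind("https://www.example.com/"));
        System.out.println(connectionKind("jar:file:/tmp/app.jar!/"));
        System.out.println(connectionKind("file:/tmp/notes.txt"));
    }
}
```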
The URLConnection class has many methods for setting or determining information about the connection, including the following −
| Sr.No. | Method & Description |
|---|---|
| 1 | void addRequestProperty(String key, String value) Adds a general request property specified by a key-value pair. |
| 2 | boolean getAllowUserInteraction() Returns the value of the allowUserInteraction field for this object. |
| 3 | int getConnectTimeout() Returns the setting for the connect timeout. |
| 4 | Object getContent() Retrieves the contents of this URL connection. |
| 5 | Object getContent(Class[] classes) Retrieves the contents of this URL connection. |
| 6 | String getContentEncoding() Returns the value of the content-encoding header field. |
| 7 | int getContentLength() Returns the value of the content-length header field. |
| 8 | long getContentLengthLong() Returns the value of the content-length header field as a long. |
| 9 | String getContentType() Returns the value of the content-type header field. |
| 10 | long getDate() Returns the value of the date header field. |
| 11 | static boolean getDefaultAllowUserInteraction() Returns the default value of the allowUserInteraction field. |
| 12 | boolean getDefaultUseCaches() Returns the default value of a URLConnection's useCaches flag. |
| 13 | static boolean getDefaultUseCaches(String protocol) Returns the default value of the useCaches flag for the given protocol. |
| 14 | boolean getDoInput() Returns the value of this URLConnection's doInput flag. |
| 15 | boolean getDoOutput() Returns the value of this URLConnection's doOutput flag. |
| 16 | long getExpiration() Returns the value of the expires header field. |
| 17 | static FileNameMap getFileNameMap() Loads filename map (a mimetable) from a data file. |
| 18 | String getHeaderField(int n) Returns the value for the nth header field. |
| 19 | String getHeaderField(String name) Returns the value of the named header field. |
| 20 | long getHeaderFieldDate(String name, long Default) Returns the value of the named field parsed as date. |
| 21 | int getHeaderFieldInt(String name, int Default) Returns the value of the named field parsed as a number. |
| 22 | String getHeaderFieldKey(int n) Returns the key for the nth header field. |
| 23 | long getHeaderFieldLong(String name, long Default) Returns the value of the named field parsed as a number. |
| 24 | Map<String,List<String>> getHeaderFields() Returns an unmodifiable Map of the header fields. |
| 25 | long getIfModifiedSince() Returns the value of this object's ifModifiedSince field. |
| 26 | InputStream getInputStream() Returns an input stream that reads from this open connection. |
| 27 | long getLastModified() Returns the value of the last-modified header field. |
| 28 | OutputStream getOutputStream() Returns an output stream that writes to this connection. |
| 29 | Permission getPermission() Returns a permission object representing the permission necessary to make the connection represented by this object. |
| 30 | int getReadTimeout() Returns the setting for the read timeout. A return value of 0 means the option is disabled (that is, the timeout is infinite). |
| 31 | Map<String,List<String>> getRequestProperties() Returns an unmodifiable Map of general request properties for this connection. |
| 32 | String getRequestProperty(String key) Returns the value of the named general request property for this connection. |
| 33 | URL getURL() Returns the value of this URLConnection's URL field. |
| 34 | boolean getUseCaches() Returns the value of this URLConnection's useCaches field. |
| 35 | static String guessContentTypeFromName(String fname) Tries to determine the content type of an object, based on the specified "file" component of a URL. |
| 36 | static String guessContentTypeFromStream(InputStream is) Tries to determine the type of an input stream based on the characters at the beginning of the input stream. |
| 37 | void setAllowUserInteraction(boolean allowuserinteraction) Sets the value of the allowUserInteraction field of this URLConnection. |
| 38 | void setConnectTimeout(int timeout) Sets a specified timeout value, in milliseconds, to be used when opening a communications link to the resource referenced by this URLConnection. |
| 39 | static void setContentHandlerFactory(ContentHandlerFactory fac) Sets the ContentHandlerFactory of an application. |
| 40 | static void setDefaultAllowUserInteraction(boolean defaultallowuserinteraction) Sets the default value of the allowUserInteraction field for all future URLConnection objects to the specified value. |
| 41 | void setDefaultUseCaches(boolean defaultusecaches) Sets the default value of the useCaches field to the specified value. |
| 42 | static void setDefaultUseCaches(String protocol, boolean defaultVal) Sets the default value of the useCaches field for the named protocol to the given value. |
| 43 | void setDoInput(boolean doinput) Sets the value of the doInput field for this URLConnection to the specified value. |
| 44 | void setDoOutput(boolean dooutput) Sets the value of the doOutput field for this URLConnection to the specified value. |
| 45 | static void setFileNameMap(FileNameMap map) Sets the FileNameMap. |
| 46 | void setIfModifiedSince(long ifmodifiedsince) Sets the value of the ifModifiedSince field of this URLConnection to the specified value. |
| 47 | void setReadTimeout(int timeout) Sets the read timeout to a specified timeout, in milliseconds. |
| 48 | void setRequestProperty(String key, String value) Sets the general request property. |
| 49 | void setUseCaches(boolean usecaches) Sets the value of the useCaches field of this URLConnection to the specified value. |
| 50 | String toString() Returns a String representation of this URL connection. |
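Several of the setters in the table are typically called before the connection is actually opened. The following sketch shows one possible setup; the class name ConnectionConfigSketch, the helper configure, the timeout values, and the Accept header value are assumptions picked for the example. No request is sent, since the code never calls connect() or reads from the connection.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.net.URLConnection;

class ConnectionConfigSketch {
    // Hypothetical helper: creates a connection object and configures it
    // without opening the communications link.
    static URLConnection configure(String spec) {
        try {
            URLConnection connection = new URL(spec).openConnection();
            connection.setConnectTimeout(5_000);    // milliseconds
            connection.setReadTimeout(10_000);      // milliseconds
            connection.setUseCaches(false);
            connection.setRequestProperty("Accept", "text/html");
            return connection;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        URLConnection connection = configure("https://www.example.com/");
        System.out.println("connect timeout is " + connection.getConnectTimeout());
        System.out.println("read timeout is " + connection.getReadTimeout());
        System.out.println("use caches is " + connection.getUseCaches());
        System.out.println("Accept is " + connection.getRequestProperty("Accept"));
    }
}
```

Calling these setters after the connection has been established generally fails with an IllegalStateException, so configuration belongs between openConnection() and the first read.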
Example
The following URLConnDemo program connects to a URL that is hard-coded in the program.
If the URL represents an HTTP resource, the connection is cast to HttpURLConnection, and the data in the resource is read one line at a time.
package com.tutorialspoint;

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLConnection;

public class URLConnDemo {
   public static void main(String[] args) {
      try {
         URL url = new URL("https://www.tutorialspoint.com");
         URLConnection urlConnection = url.openConnection();

         // This demo only handles HTTP(S) connections.
         if (!(urlConnection instanceof HttpURLConnection)) {
            System.out.println("Please enter an HTTP URL.");
            return;
         }
         HttpURLConnection connection = (HttpURLConnection) urlConnection;

         // try-with-resources closes the reader and the underlying stream.
         StringBuilder content = new StringBuilder();
         try (BufferedReader in = new BufferedReader(
               new InputStreamReader(connection.getInputStream()))) {
            String current;
            while ((current = in.readLine()) != null) {
               content.append(current).append('\n');   // keep the line breaks
            }
         }
         System.out.println(content);
      } catch (IOException e) {
         e.printStackTrace();
      }
   }
}
A sample run of this program will produce the following result −
Output
$ java com.tutorialspoint.URLConnDemo
.....a complete HTML content of home page of tutorialspoint.com.....
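The same line-by-line reading works for any protocol through openStream(), which is shorthand for openConnection().getInputStream(). To try it without network access, the sketch below reads a temporary file through a file: URL. The class name OpenStreamSketch and the helpers readAll and roundTrip are hypothetical, and UTF-8 is assumed as the encoding.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.stream.Collectors;

class OpenStreamSketch {
    // Hypothetical helper: reads everything behind a URL as UTF-8 text,
    // joining the lines with '\n'.
    static String readAll(URL url) {
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(url.openStream(), StandardCharsets.UTF_8))) {
            return reader.lines().collect(Collectors.joining("\n"));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Hypothetical helper: writes text to a temporary file and reads it back
    // through a file: URL, so the example runs offline.
    static String roundTrip(String text) {
        try {
            Path file = Files.createTempFile("url-demo", ".txt");
            try {
                Files.write(file, text.getBytes(StandardCharsets.UTF_8));
                return readAll(file.toUri().toURL());
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("first line\nsecond line"));
    }
}
```

Passing an http or https URL to readAll would fetch the remote resource in the same way, subject to the usual network errors.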