htmlparser-user Mailing List for HTML Parser
Brought to you by:
derrickoswald
From: william l. <wil...@ya...> - 2007-08-28 04:27:30
I have HTML like the following:

    "<li>info</li> <li>info</li> <li>info</li> <li>info</li>"

or:

    "<a title="taught" href="/https/sourceforge.net/index.html" rel="section">info</a> info <a title="research-led" href="/https/sourceforge.net/index.html" rel="section">info</a> info."

How can I extract each of these "info" values in a loop? Thanks in advance!
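HTML Parser's filters (e.g. TagNameFilter) would be the idiomatic answer on this list; as a dependency-free illustration of the extraction loop the poster is after, here is a JDK-only regex sketch (the class name and sample strings are invented for the example, and a regex is fragile on real-world HTML):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: collects the text content of every <li> element.
// A regex is used here only so the sketch runs with the JDK alone;
// a proper solution would walk the parse tree with org.htmlparser filters.
public class LiTextExtractor {
    private static final Pattern LI = Pattern.compile("<li>(.*?)</li>");

    public static List<String> extract(String html) {
        List<String> out = new ArrayList<String>();
        Matcher m = LI.matcher(html);
        while (m.find()) {          // loop over every repeated match
            out.add(m.group(1));
        }
        return out;
    }

    public static void main(String[] args) {
        String html = "<li>info</li> <li>info2</li> <li>info3</li>";
        System.out.println(extract(html)); // prints [info, info2, info3]
    }
}
```

The same while-find loop works for the `<a>` case by matching the anchor tag instead.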
From: Derrick O. <der...@ro...> - 2007-08-24 22:29:07
You probably have to hit the login page first.
Then use the same ConnectionManager to access the desired page.

----- Original Message ----
From: "mic...@Ta..." <mic...@Ta...>
To: htmlparser user list <htm...@li...>
Sent: Friday, August 24, 2007 10:36:05 AM
Subject: Re: [Htmlparser-user] How to login to web page

Derrick,
I'm trying this approach (strictly HtmlParser) as well, but I can't get
logged in. The array of URLs is that of a page shown when not logged in.

Can you suggest anything else?
Thanks,
Mick

    URL[] urlArray;

    ConnectionManager connectionManager = new ConnectionManager();
    url = new URL("www.someloginpage.com");
    connectionManager.openConnection(url);

    connectionManager.setRedirectionProcessingEnabled(true);
    connectionManager.setCookieProcessingEnabled(true);
    connectionManager.setUser(USER_NAME);
    connectionManager.setPassword(PASSWORD);

    // go to link with stuff
    url = new URL("a page beyond the login page");
    connectionManager.openConnection(url);
    linkBean.setConnection(connectionManager.openConnection(url));
    urlArray = linkBean.getLinks(); // get all links

---------------------------------------------------------------------
> You might try setRedirectionProcessingEnabled(true). Often the first URL
> is only a gateway.
> Also, it's setCookieProcessingEnabled(true), not addCookies.
From: <mic...@Ta...> - 2007-08-24 14:36:12
Derrick,
I'm trying this approach (strictly HtmlParser) as well, but I can't get
logged in. The array of URLs is that of a page shown when not logged in.

Can you suggest anything else?
Thanks,
Mick

    URL[] urlArray;

    ConnectionManager connectionManager = new ConnectionManager();
    url = new URL("www.someloginpage.com");
    connectionManager.openConnection(url);

    connectionManager.setRedirectionProcessingEnabled(true);
    connectionManager.setCookieProcessingEnabled(true);
    connectionManager.setUser(USER_NAME);
    connectionManager.setPassword(PASSWORD);

    // go to link with stuff
    url = new URL("a page beyond the login page");
    connectionManager.openConnection(url);
    linkBean.setConnection(connectionManager.openConnection(url));
    urlArray = linkBean.getLinks(); // get all links

---------------------------------------------------------------------
> You might try setRedirectionProcessingEnabled(true). Often the first URL
> is only a gateway.
> Also, it's setCookieProcessingEnabled(true), not addCookies.
From: <mic...@Ta...> - 2007-08-24 13:35:25
Thanks for all the information!

In your first response to my post, in your code, you call these methods:

    getCookiesArrayList(client)
    client.getResponseHeaders()
    getRequestHeaders()

I removed these calls, but I guess now I need to use them. Can you post these
methods?
From: Mattia T. <mat...@gm...> - 2007-08-24 07:00:24
2007/8/24, mic...@ta... <mic...@ta...>:
>
> Mattia,
>
> In my original post, I showed this code that uses HtmlParser to connect to
> a web page and get all links from that page.
>
>     url = new URL("http://www.google.com");
>     urlConnection = url.openConnection();
>     ConnectionManager connectionManager = new ConnectionManager();
>     connectionManager.setRedirectionProcessingEnabled(true);
>     linkBean.setConnection(connectionManager.openConnection(url));
>     urlArray = linkBean.getLinks(); // get all links

OK, so this works for you, and you are in exactly the same situation I was in
some months ago. I needed to parse some pages, and when I faced pages
protected by a login I used HttpClient. Sorry for confusing you a bit! :o)

> My problem was that I couldn't get a page that required a login. You
> replied with some code, which I have modified a bit (below). This works,
> but it is not using HtmlParser. It uses Apache Commons.
>
> Can I get the HtmlParser code to work with the Apache Commons code? In
> particular, can I use the connection established with the HttpClient (from
> Apache Commons) in the HtmlParser code? If so, how?

The site I'm working on requires 2 cookies to be sent to pages protected by
login, so the first step is logging in with the HttpClient commons lib, taking
the cookies and sending them inside the HTTP headers to the second page (the
page protected by login), so I can enter and parse that second page.

I think you have to investigate whether the site you are trying to enter has a
particular system to tell whether you are logged in or not. In Firefox,
right-click on the page and select "Page Info"; you will see headers, forms
and so on. Let's see if there are cookies.

Supposing you are logged in and you took a cookie that tells the site you are
logged in, I use the HtmlParser in this way (similar to what you do in your
code):

    public Hashtable getMySecondPage(String url, ArrayList cookies,
            Header[] headers) {
        logger.info("url: " + url);
        try {
            // I pass headers and cookies when making the request, so the
            // site sees that I'm logged in
            setUpConnectionManager(cookies, headers);
            Parser parser = new Parser(url);
            NodeList nodelist = parser.parse(null);
            for (int i = 0; i < nodelist.size(); i++) {
                System.out.print(nodelist.toHtml());
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
        return null;
    }

If the site doesn't use cookies, just try to make 2 requests sequentially:
FIRST the login call, then the call to the page protected by login.

Cheers

Mattia
From: <mic...@Ta...> - 2007-08-23 22:46:47
Mattia,

In my original post, I showed this code that uses HtmlParser to connect to a
web page and get all links from that page.

    url = new URL("http://www.google.com");
    urlConnection = url.openConnection();
    ConnectionManager connectionManager = new ConnectionManager();
    connectionManager.setRedirectionProcessingEnabled(true);
    linkBean.setConnection(connectionManager.openConnection(url));
    urlArray = linkBean.getLinks(); // get all links

My problem was that I couldn't get a page that required a login. You replied
with some code, which I have modified a bit (below). This works, but it is
not using HtmlParser. It uses Apache Commons.

Can I get the HtmlParser code to work with the Apache Commons code? In
particular, can I use the connection established with the HttpClient (from
Apache Commons) in the HtmlParser code? If so, how?

Thanks,
Mick

--------------------------------------------------------------------------

    import java.io.IOException;
    import java.util.logging.*;
    import java.util.ArrayList;
    import org.apache.commons.httpclient.NameValuePair;

    /**
     * WebScraper2
     */
    public class WebScraper2 {

        private Logger logger;
        private HttpClientUtil httpClientUtil;
        private String loginURL;

        /**
         * constructor
         */
        public WebScraper2() {
            createLogger();
            loginURL = "https://www.ctslink.com/login.do";
            httpClientUtil = new HttpClientUtil();
            login();
        }

        /*
         * login to site
         */
        public String login() {
            String responseString = null;
            //logger.info(this.getClass().getName() + " - login");
            try {
                ArrayList<NameValuePair> parameters = new ArrayList<NameValuePair>();
                parameters.add(new NameValuePair("username", "joeUser"));
                parameters.add(new NameValuePair("password", "somePassword"));

                int response = httpClientUtil.submitPostForm(this.loginURL, "",
                        parameters, null);
                logger.info("Response = " + response);
                parameters.clear();
            } catch (Exception e) {
                logger.warning(" LOGIN PROBLEM!");
                e.printStackTrace();
            }
            return responseString;
        }

        /*
         * create the logger
         */
        public void createLogger() {
            // Get a logger; the logger is automatically created if
            // it doesn't already exist
            try {
                // Create a file handler that writes log records to a file
                FileHandler handler = new FileHandler("webscraper.log");
                handler.setFormatter(new SimpleFormatter()); // plain text, not xml

                // Add to the desired logger
                logger = Logger.getLogger("webscraper.Webscraper");
                logger.addHandler(handler);
                logger.setLevel(Level.INFO);
            } catch (IOException e) {
                System.out.println("WebScraper2:createLogger(): Error creating logger");
            }
        }

        /**
         * @param args
         */
        public static void main(String[] args) {
            new WebScraper2();
            System.out.println("Done");
        }
    }

---------------------------------------------------------------------

    import java.io.*;
    import java.util.ArrayList;
    import org.apache.commons.httpclient.methods.PostMethod;
    import org.apache.commons.httpclient.*;
    import org.apache.commons.httpclient.cookie.CookiePolicy;
    import org.apache.commons.httpclient.NameValuePair;

    /**
     * HttpClientUtil
     */
    public class HttpClientUtil extends HttpClient {

        private PostMethod postMethod;

        /**
         * constructor
         */
        public HttpClientUtil() {
        }

        /**
         * submitPostForm
         *
         * @param relativeUrl
         * @param formName
         * @param params
         * @param requestHeaders
         * @return
         */
        public int submitPostForm(String relativeUrl, String formName,
                ArrayList<NameValuePair> params, Header[] requestHeaders) {
            BufferedReader bufferedReader = null;
            int statusCode = -999;

            byte[] result = null;
            try {
                NameValuePair[] data = null;
                if (params != null) {
                    data = new NameValuePair[params.size()];
                    for (int i = 0; i < params.size(); i++) {
                        data[i] = (NameValuePair) params.get(i);
                    }
                }
                PostMethod method = new PostMethod(relativeUrl);
                this.postMethod = method;
                method.getParams().setCookiePolicy(CookiePolicy.RFC_2109);

                if (params != null) {
                    method.addParameters(data);
                }
                statusCode = this.executeMethod(method);

                bufferedReader = new BufferedReader(new
                        InputStreamReader(method.getResponseBodyAsStream()));
                String readLine;
                while ((readLine = bufferedReader.readLine()) != null) {
                    System.out.println(readLine);
                }
            } catch (IOException ioe) {
                ioe.printStackTrace();
            }
            return statusCode;
        }
    }
From: Mattia T. <mat...@gm...> - 2007-08-23 21:07:05
Ok, PERFECT!

200 is the HTTP code that stands for "page successfully reached or loaded".
Search for "HTTP status codes" to look at the others; you'll need them: 404 is
page not found, 500 is server error...

Good job.

Mattia

2007/8/23, mic...@ta... <mic...@ta...>:
>
> I think I got it.
> The 'response' returned is the HTML code of loginURL.
> The statusCode returned = 200
>
> Did it work? (i.e., did it log in successfully?)
> Where can I find what status code = 200 means?
From: <mic...@Ta...> - 2007-08-23 20:38:47
I think I got it. The 'response' returned is the HTML code of loginURL. The
statusCode returned = 200.

Did it work? (i.e., did it log in successfully?)
Where can I find what status code = 200 means?
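The status-code lookup the poster asks about follows a simple rule: HTTP codes group by their first digit (2xx success, 3xx redirection, 4xx client error, 5xx server error). A tiny JDK-only helper (class name invented for the illustration) captures it:

```java
// Minimal illustration of HTTP status code classes (per RFC 2616):
// 2xx = success (200 OK means the request succeeded),
// 3xx = redirection, 4xx = client error (e.g. 404 Not Found),
// 5xx = server error (e.g. 500 Internal Server Error).
public class HttpStatus {
    public static String classify(int code) {
        if (code >= 200 && code < 300) return "success";
        if (code >= 300 && code < 400) return "redirection";
        if (code >= 400 && code < 500) return "client error";
        if (code >= 500 && code < 600) return "server error";
        return "other";
    }

    public static void main(String[] args) {
        System.out.println(classify(200)); // prints success
    }
}
```

One caveat for login flows: a successful form POST often answers 302 (a redirect to the logged-in page), so a 200 from the login URL is worth checking against the response body, which may just be the login form again.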
From: Mattia T. <mat...@gm...> - 2007-08-23 19:41:34
Exactly, those are global objects of the class with the login method.

For the login page, look at this example: open Firefox on the page where the
login form is, view the page's source code, and look at the form tag. You will
see something like:

    <FORM name=form enctype=x-www-form-urlencoded method=post action="/forms/Myform.jsp">

loginPage, for me, is the page where the submission of the form sends the
navigation.

I hope it's all clear now.

Bye

Mattia

2007/8/23, mic...@ta... <mic...@ta...>:
>
> So I did this:
>
>     private String baseUrl;
>     private Any user;
>     private Any password;
>
>     this.user.insert_string("giovanni_doe");
>     this.password.insert_string("mypassword");
>
> I think all I need is to understand what "loginPage" is and it will compile,
> at least.
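Mattia's tip above — read the `action` attribute off the `<FORM>` tag in the page source to find where the login POST goes — can be sketched with a JDK-only snippet (the class name and sample markup are illustrative; a regex like this handles simple form tags, not every quoting variant):

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Pulls the action attribute out of the first <form> tag in a page:
// that action URL is where a login form submission should be POSTed.
public class FormActionFinder {
    private static final Pattern ACTION = Pattern.compile(
            "<form[^>]*\\baction\\s*=\\s*[\"']?([^\"'\\s>]+)",
            Pattern.CASE_INSENSITIVE);

    public static String findAction(String html) {
        Matcher m = ACTION.matcher(html);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String page = "<FORM name=form method=post action=\"/forms/Myform.jsp\">";
        System.out.println(findAction(page)); // prints /forms/Myform.jsp
    }
}
```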
From: <mic...@Ta...> - 2007-08-23 16:46:08
So I did this:

    private String baseUrl;
    private Any user;
    private Any password;

    this.user.insert_string("giovanni_doe");
    this.password.insert_string("mypassword");

I think all I need is to understand what "loginPage" is and it will compile,
at least.
From: <mic...@Ta...> - 2007-08-23 16:24:22
I guess I still don't understand what "loginPage" is. Can you give me an
example?

> loginData: forget it, it's just a check so as not to log in every time.
> Delete the if instruction.
>
> loginPage is the action of the form you are trying to submit.
>
> For now forget Headers, both Request and Response, and also cookies.
>
> Just call submitPostForm(String url, ArrayList params), passing a url and
> the login parameters.
>
> client inside HttpClientUtil is an HttpClient instance.
>
> HttpClientUtil is a wrapper of HttpClient, just to reuse some basic methods
> to post forms in post or get way.
>
> Delete what you don't need for now.
>
> Bye.
>
> Mattia
>
> 2007/8/23, mic...@ta... <mic...@ta...>:
>>
>> I find some problems with this code:
>>
>> loginData, loginPage, getRequestHeaders(), getResponseHeaders(),
>> getCookiesArrayList() are undefined.
>>
>> In HttpClientUtil:
>>
>> If client is an instantiation of HttpClientUtil and submitPostForm() is a
>> method in HttpClientUtil, then how can submitPostForm() use 'client'?
>>
>> Method addHeaders() is undefined for HttpClientUtil.
>>
>> cookies (in printCookies(cookies);) is undefined.
>>
>> > Hi,
>> >
>> > this is code I used to login:
>> >
>> >     public Object login() {
>> >         logger.info(this.getClass().getName() + " - login");
>> >         if (loginData == null) {
>> >             try {
>> >                 client.setBaseURL(new URL(this.baseUrl));
>> >
>> >                 ArrayList parametri = new ArrayList();
>> >                 parametri.add(new NameValuePair("username", this.user));
>> >                 parametri.add(new NameValuePair("password", this.password));
>> >
>> >                 byte[] response = client.submitPostForm(this.loginPage, "",
>> >                         parametri, getRequestHeaders());
>> >                 String workingResponse = new String(response);
>> >                 Header[] headers = client.getResponseHeaders();
>> >                 for (int j = 0; j < headers.length; j++) {
>> >                     logger.info("headers " + j + ": " + headers[j].getName()
>> >                             + "=" + headers[j].getValue());
>> >                 }
>> >                 logger.info(workingResponse);
>> >
>> >                 parametri.clear();
>> >             } catch (Exception e) {
>> >                 logger.error(" LOGIN PROBLEM!");
>> >                 e.printStackTrace();
>> >             }
>> >         }
>> >         return getCookiesArrayList(client);
>> >     }
>> >
>> > client is a utility class HttpClientUtil that contains the method
>> > submitPostForm below.
>> >
>> >     public byte[] submitPostForm(String relativeUrl, String formName,
>> >             ArrayList params, Header[] requestHeaders) {
>> >         byte[] result = null;
>> >         try {
>> >             NameValuePair[] data = null;
>> >             if (params != null) {
>> >                 data = new NameValuePair[params.size()];
>> >                 for (int i = 0; i < params.size(); i++) {
>> >                     data[i] = (NameValuePair) params.get(i);
>> >                 }
>> >             }
>> >             PostMethod method = new PostMethod(getBaseURL() + relativeUrl);
>> >             this.method = method;
>> >             logger.info("BASE URL: " + getBaseURL());
>> >             logger.info("RELATIVE URL: " + relativeUrl);
>> >             method.getParams().setCookiePolicy(CookiePolicy.RFC_2109);
>> >
>> >             addHeaders(requestHeaders);
>> >             logger.info("URI PRE QueryString Setting>>> " + method.getURI());
>> >             if (params != null) {
>> >                 method.addParameters(data);
>> >             }
>> >             int statusCode = client.executeMethod(method);
>> >
>> >             logger.info("QueryString>>> " + method.getQueryString());
>> >             logger.info("URI>>> " + method.getURI());
>> >
>> >             result = method.getResponseBody();
>> >
>> >             setCookies(client.getState().getCookies());
>> >             printCookies(cookies);
>> >             Header[] headers = method.getRequestHeaders();
>> >             addHeaders(headers);
>> >         } catch (IOException ioe) {
>> >             ioe.printStackTrace();
>> >         }
>> >         return result;
>> >     }
>> >
>> > Hope it helps.
>> >
>> > Cheers
>> >
>> > Mattia
>> >
>> > 2007/8/23, Derrick Oswald <der...@ro...>:
>> >>
>> >> You might try setRedirectionProcessingEnabled(true). Often the first URL
>> >> is only a gateway.
>> >> Also, it's setCookieProcessingEnabled(true), not addCookies.
>> >>
>> >> ----- Original Message ----
>> >> From: "mic...@Ta..." <mic...@Ta...>
>> >> To: htm...@li...
>> >> Sent: Wednesday, August 22, 2007 7:05:36 PM
>> >> Subject: [Htmlparser-user] How to login to web page
>> >>
>> >> How does one get to a web page that requires login?
>> >> The code that I wrote (below) doesn't seem to work.
>> >>
>> >>     url = new URL("http://www.somewebsite.com");
>> >>     urlConnection = url.openConnection();
>> >>
>> >>     ConnectionManager connectionManager = new ConnectionManager();
>> >>     connectionManager.setUser(USER_NAME);
>> >>     connectionManager.setPassword(PASSWORD);
>> >>     connectionManager.addCookies(urlConnection);
>> >>
>> >>     linkBean.setConnection(connectionManager.openConnection(url));
>> >>     URL[] urlArray = linkBean.getLinks(); // get all links
>> >>
>> >> Thanks for any help.
>> > Still grepping through log files to find problems? Stop. >> > Now Search log events and configuration files using AJAX and a >> browser. >> > Download your FREE copy of Splunk now >> >> > http://get.splunk.com/_______________________________________________ >> > Htmlparser-user mailing list >> > Htm...@li... >> > https://lists.sourceforge.net/lists/listinfo/htmlparser-user >> > >> >> >> ------------------------------------------------------------------------- >> This SF.net email is sponsored by: Splunk Inc. >> Still grepping through log files to find problems? Stop. >> Now Search log events and configuration files using AJAX and a browser. >> Download your FREE copy of Splunk now >> http://get.splunk.com/ >> _______________________________________________ >> Htmlparser-user mailing list >> Htm...@li... >> https://lists.sourceforge.net/lists/listinfo/htmlparser-user >> > ------------------------------------------------------------------------- > This SF.net email is sponsored by: Splunk Inc. > Still grepping through log files to find problems? Stop. > Now Search log events and configuration files using AJAX and a browser. > Download your FREE copy of Splunk now >> > http://get.splunk.com/_______________________________________________ > Htmlparser-user mailing list > Htm...@li... > https://lists.sourceforge.net/lists/listinfo/htmlparser-user > |
From: <mic...@Ta...> - 2007-08-23 15:57:43
What are the classes of this.baseUrl, this.user, and this.password?
Apparently user and password are not Strings, which was my guess.

> loginData: forget it, it's just a check not to log in every time. Delete
> the if statement.
>
> loginPage is the action of the form you are trying to submit.
>
> For now, forget about the Headers, both Request and Response, and also
> the cookies.
>
> Just call submitPostForm(String url, ArrayList params), passing a URL and
> the login parameters.
>
> client inside HttpClientUtil is an HttpClient instance.
>
> HttpClientUtil is a wrapper around HttpClient, just to reuse some basic
> methods for submitting forms by POST or GET.
>
> Delete what you don't need for now.
>
> Bye.
>
> Mattia
From: Mattia T. <mat...@gm...> - 2007-08-23 15:06:07
loginData: forget it, it's just a check not to log in every time. Delete
the if statement.

loginPage is the action of the form you are trying to submit.

For now, forget about the Headers, both Request and Response, and also the
cookies.

Just call submitPostForm(String url, ArrayList params), passing a URL and
the login parameters.

client inside HttpClientUtil is an HttpClient instance.

HttpClientUtil is a wrapper around HttpClient, just to reuse some basic
methods for submitting forms by POST or GET.

Delete what you don't need for now.

Bye.

Mattia

2007/8/23, mic...@ta... <mic...@ta...>:
>
> I find some problems with this code:
>
> loginData, loginPage, getRequestHeaders(), getResponseHeaders(), and
> getCookiesArrayList() are undefined.
>
> In HttpClientUtil:
>
> If client is an instantiation of HttpClientUtil, and submitPostForm() is
> a method in HttpClientUtil, then how can submitPostForm() use 'client'?
>
> The method addHeaders() is undefined for HttpClientUtil.
>
> cookies (in printCookies(cookies);) is undefined.
From: <mic...@Ta...> - 2007-08-23 14:49:28
I find some problems with this code:

loginData, loginPage, getRequestHeaders(), getResponseHeaders(), and
getCookiesArrayList() are undefined.

In HttpClientUtil:

If client is an instantiation of HttpClientUtil, and submitPostForm() is a
method in HttpClientUtil, then how can submitPostForm() use 'client'?

The method addHeaders() is undefined for HttpClientUtil.

cookies (in printCookies(cookies);) is undefined.
From: Mattia T. <mat...@gm...> - 2007-08-23 06:42:48
Hi,

this is the code I used to log in:

    public Object login() {
        logger.info(this.getClass().getName() + " - login");
        if (loginData == null) {
            try {
                client.setBaseURL(new URL(this.baseUrl));

                ArrayList parametri = new ArrayList();
                parametri.add(new NameValuePair("username", this.user));
                parametri.add(new NameValuePair("password", this.password));

                byte[] response = client.submitPostForm(this.loginPage, "",
                        parametri, getRequestHeaders());
                String workingResponse = new String(response);
                Header[] headers = client.getResponseHeaders();
                for (int j = 0; j < headers.length; j++) {
                    logger.info("headers " + j + ": " + headers[j].getName()
                            + "=" + headers[j].getValue());
                }
                logger.info(workingResponse);

                parametri.clear();
            } catch (Exception e) {
                logger.error(" LOGIN PROBLEM!");
                e.printStackTrace();
            }
        }
        return getCookiesArrayList(client);
    }

client is a utility class, HttpClientUtil, that contains the method
submitPostForm below.

    public byte[] submitPostForm(String relativeUrl, String formName,
            ArrayList params, Header[] requestHeaders) {
        byte[] result = null;
        try {
            NameValuePair[] data = null;
            if (params != null) {
                data = new NameValuePair[params.size()];
                for (int i = 0; i < params.size(); i++) {
                    data[i] = (NameValuePair) params.get(i);
                }
            }
            PostMethod method = new PostMethod(getBaseURL() + relativeUrl);
            this.method = method;
            logger.info("BASE URL: " + getBaseURL());
            logger.info("RELATIVE URL: " + relativeUrl);
            method.getParams().setCookiePolicy(CookiePolicy.RFC_2109);

            addHeaders(requestHeaders);
            logger.info("URI PRE QueryString Setting>>> " + method.getURI());
            if (params != null) {
                method.addParameters(data);
            }
            int statusCode = client.executeMethod(method);

            logger.info("QueryString>>> " + method.getQueryString());
            logger.info("URI>>> " + method.getURI());

            result = method.getResponseBody();

            setCookies(client.getState().getCookies());
            printCookies(cookies);
            Header[] headers = method.getRequestHeaders();
            addHeaders(headers);
        } catch (IOException ioe) {
            ioe.printStackTrace();
        }
        return result;
    }

Hope it helps.

Cheers

Mattia

2007/8/23, Derrick Oswald <der...@ro...>:
>
> You might try setRedirectionProcessingEnabled(true). Often the first URL
> is only a gateway.
> Also, it's setCookieProcessingEnabled(true), not addCookies.
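Mattia's code leans on his private HttpClientUtil wrapper and the Jakarta Commons HttpClient classes, which is why several identifiers look undefined when pasted elsewhere. As a point of comparison, here is a minimal sketch of the same submit-a-login-form flow using only java.net from the JDK; the class name, method names, and field values are all invented for illustration and are not part of HTMLParser or HttpClient.

```java
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLEncoder;

// Hypothetical stand-in for the HttpClientUtil wrapper discussed above:
// URL-encode the form fields and POST them, returning the response body.
public class SimpleFormPoster {

    // Build an application/x-www-form-urlencoded body from name/value pairs.
    public static String formBody(String[][] params) throws Exception {
        StringBuilder sb = new StringBuilder();
        for (String[] p : params) {
            if (sb.length() > 0) sb.append('&');
            sb.append(URLEncoder.encode(p[0], "UTF-8"))
              .append('=')
              .append(URLEncoder.encode(p[1], "UTF-8"));
        }
        return sb.toString();
    }

    // POST the encoded body to the form's action URL and return the raw response.
    public static byte[] submitPostForm(String url, String[][] params) throws Exception {
        HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
        conn.setRequestMethod("POST");
        conn.setDoOutput(true);
        conn.setRequestProperty("Content-Type", "application/x-www-form-urlencoded");
        byte[] body = formBody(params).getBytes("UTF-8");
        try (OutputStream out = conn.getOutputStream()) {
            out.write(body);
        }
        try (InputStream in = conn.getInputStream()) {
            java.io.ByteArrayOutputStream buf = new java.io.ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            int n;
            while ((n = in.read(chunk)) != -1) buf.write(chunk, 0, n);
            return buf.toByteArray();
        }
    }

    public static void main(String[] args) throws Exception {
        // No network here: just show what the encoded login body looks like.
        System.out.println(formBody(new String[][] {
                { "username", "alice" }, { "password", "p@ss w" } }));
        // prints: username=alice&password=p%40ss+w
    }
}
```

This is only the transport half; a real login also needs cookie handling so the session survives across requests.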
From: Derrick O. <der...@ro...> - 2007-08-23 00:04:01
You might try setRedirectionProcessingEnabled(true). Often the first URL
is only a gateway.
Also, it's setCookieProcessingEnabled(true), not addCookies.

----- Original Message ----
From: "mic...@Ta..." <mick...@Ta...>
To: htm...@li...
Sent: Wednesday, August 22, 2007 7:05:36 PM
Subject: [Htmlparser-user] How to login to web page

How does one get to a web page that requires login?
The code that I wrote (below) doesn't seem to work.

url = new URL("http://www.somewebsite.com");
urlConnection = url.openConnection();

ConnectionManager connectionManager = new ConnectionManager();
connectionManager.setUser(USER_NAME);
connectionManager.setPassword(PASSWORD);
connectionManager.addCookies(urlConnection);

linkBean.setConnection(connectionManager.openConnection(url));
URL[] urlArray = linkBean.getLinks(); // get all links

Thanks for any help.

-------------------------------------------------------------------------
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems? Stop.
Now Search log events and configuration files using AJAX and a browser.
Download your FREE copy of Splunk now >> http://get.splunk.com/
_______________________________________________
Htmlparser-user mailing list
Htm...@li...
https://lists.sourceforge.net/lists/listinfo/htmlparser-user
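For anyone following along without HTMLParser's ConnectionManager, the JDK has a rough equivalent of the cookie switch Derrick mentions: installing a java.net.CookieManager as the default CookieHandler makes HttpURLConnection store and replay cookies automatically (redirect following is already on by default for HttpURLConnection). A small sketch under those assumptions; the host name and session id below are placeholders.

```java
import java.net.CookieHandler;
import java.net.CookieManager;
import java.net.CookiePolicy;
import java.net.HttpCookie;
import java.net.URI;
import java.util.List;

// JDK analogue of setCookieProcessingEnabled(true): once a CookieManager is
// installed as the default CookieHandler, HttpURLConnection records Set-Cookie
// headers and sends matching Cookie headers on later requests automatically.
public class CookieSetup {

    public static CookieManager install() {
        CookieManager manager = new CookieManager();
        manager.setCookiePolicy(CookiePolicy.ACCEPT_ALL);
        CookieHandler.setDefault(manager);
        return manager;
    }

    public static void main(String[] args) throws Exception {
        CookieManager manager = install();
        // Simulate what a login response's Set-Cookie would leave behind
        // (www.somewebsite.com and the session id are placeholders).
        URI site = new URI("http://www.somewebsite.com/");
        manager.getCookieStore().add(site, new HttpCookie("JSESSIONID", "abc123"));
        // A later request to the same host would carry the cookie back.
        List<HttpCookie> sent = manager.getCookieStore()
                .get(new URI("http://www.somewebsite.com/members"));
        System.out.println(sent.get(0).getName() + "=" + sent.get(0).getValue());
    }
}
```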
From: <mic...@Ta...> - 2007-08-22 23:05:42
How does one get to a web page that requires login?
The code that I wrote (below) doesn't seem to work.

    url = new URL("http://www.somewebsite.com");
    urlConnection = url.openConnection();

    ConnectionManager connectionManager = new ConnectionManager();
    connectionManager.setUser(USER_NAME);
    connectionManager.setPassword(PASSWORD);
    connectionManager.addCookies(urlConnection);

    linkBean.setConnection(connectionManager.openConnection(url));
    URL[] urlArray = linkBean.getLinks(); // get all links

Thanks for any help.
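One likely mismatch in the snippet above: ConnectionManager's setUser/setPassword typically feed HTTP authentication, while most sites with a login page expect a form POST followed by a session cookie on every later request. Independent of any parser library, the cookie half of that round trip looks like this; the JSESSIONID value is a made-up example.

```java
import java.net.HttpCookie;
import java.util.List;

// Why cookie processing matters for login: the login page answers with a
// Set-Cookie header, and every later request must echo it back in a Cookie
// header, or the server treats you as logged out.
public class CookieRoundTrip {

    // Parse a Set-Cookie response header into cookies.
    public static List<HttpCookie> fromResponse(String setCookieHeader) {
        return HttpCookie.parse(setCookieHeader);
    }

    // Build the Cookie request-header value to send back.
    public static String toRequestHeader(List<HttpCookie> cookies) {
        StringBuilder sb = new StringBuilder();
        for (HttpCookie c : cookies) {
            if (sb.length() > 0) sb.append("; ");
            sb.append(c.getName()).append('=').append(c.getValue());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<HttpCookie> cookies =
                fromResponse("Set-Cookie: JSESSIONID=abc123; Path=/");
        // On the next request: conn.setRequestProperty("Cookie", ...)
        System.out.println(toRequestHeader(cookies));
        // prints: JSESSIONID=abc123
    }
}
```

Libraries such as Commons HttpClient or HTMLParser's ConnectionManager do exactly this bookkeeping for you once cookie processing is switched on.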