htmlparser-user Mailing List for HTML Parser
Brought to you by:
derrickoswald
From: Leos L. <lit...@ce...> - 2004-05-21 12:39:22
|
Pietro wrote:
> But if I pass in something like:
> "showsearchbysubjectpage.do?categoryid=automoveis&localeufid=sp&localecityid=sao+paulo&localemarketid=5&guideid=1class="
> what is returned to me is:
> Can I obtain the parameters of a link without having to write my own
> method with regex to do it?

Hi,

Attributes are properties of the tag, so you can get the HREF attribute of an A tag with htmlparser. I am not aware of a class that would split its value into request parameters. IMHO you're on your own, but try looking through Jakarta Commons; there might be a solution there.

Regards,
Leos |
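Once the HREF value is in hand as a plain string, splitting it into request parameters needs nothing beyond the JDK. The following is a minimal sketch under that assumption; `QueryParams` and `parse` are hypothetical names for illustration, not part of htmlparser or Jakarta Commons, and the sample HREF is the one from the question without its trailing `guideid` part.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper (not an htmlparser class): splits the query string of an HREF value. */
class QueryParams {

    /** Returns the name/value pairs after '?' in order of appearance; empty if there is no query. */
    static Map<String, String> parse(String href) {
        Map<String, String> params = new LinkedHashMap<>();
        int q = href.indexOf('?');
        if (q < 0) {
            return params;
        }
        String query = href.substring(q + 1);
        int hash = query.indexOf('#');            // drop a trailing #fragment, if any
        if (hash >= 0) {
            query = query.substring(0, hash);
        }
        for (String pair : query.split("&")) {
            if (pair.isEmpty()) {
                continue;
            }
            int eq = pair.indexOf('=');           // split on the first '=' only
            String name = eq < 0 ? pair : pair.substring(0, eq);
            String value = eq < 0 ? "" : pair.substring(eq + 1);
            params.put(decode(name), decode(value));
        }
        return params;
    }

    private static String decode(String s) {
        try {
            return URLDecoder.decode(s, "UTF-8"); // also turns '+' into a space
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);   // UTF-8 is always available
        }
    }

    public static void main(String[] args) {
        String href = "showsearchbysubjectpage.do?categoryid=automoveis"
                + "&localeufid=sp&localecityid=sao+paulo&localemarketid=5";
        for (Map.Entry<String, String> e : parse(href).entrySet()) {
            System.out.println(e.getKey() + "/" + e.getValue());
        }
    }
}
```

If a parameter name occurs more than once, this sketch keeps only the last value; a `Map<String, List<String>>` would be needed to keep them all.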
From: Pietro <pie...@ma...> - 2004-05-21 12:21:41
|
Hi,

I'm trying to get the parameters of a given URI, and for this I wrote the method below.

But if I pass in something like:
"showsearchbysubjectpage.do?categoryid=automoveis&localeufid=sp&localecityid=sao+paulo&localemarketid=5&guideid=1class="
what is returned to me is:

Nome ==> A => null
Nome ==> HREF => showsearchbysubjectpage.do?categoryid=automoveis&localeufid=sp&localecityid=sao+paulo&localemarketid=5&guideid=1class=
Nome ==> null =>

But I was expecting something like:
categoryid/automoveis
localeufid/sp
and so on...

Can I obtain the parameters of a link without having to write my own method with regex to do it?

Regards.

public Map getParameters(final String uri) {
    final HashMap parameters = new HashMap();

    LinkTag link = new LinkTag();
    link.setLink(uri);

    // The attributes of the tag itself (tag name, HREF, ...), not the request parameters
    Vector v = link.getAttributesEx();
    for (Iterator i = v.iterator(); i.hasNext();) {
        Attribute attr = (Attribute) i.next();
        parameters.put(attr.getName(), attr.getValue());
    }

    return parameters;
}
 |
From: Derrick O. <Der...@Ro...> - 2004-05-21 03:05:58
|
Dizzy,

Just a sanity check: if you haven't registered this tag, it won't be used. See
http://htmlparser.sourceforge.net/wiki/index.php/CustomTagLinks
but basically:

    Parser parser = new Parser ("http://urlIWantToParse.com");
    PrototypicalNodeFactory factory = new PrototypicalNodeFactory ();
    factory.registerTag (new PreTag ());
    parser.setNodeFactory (factory);

You shouldn't need the scanner if you subclass CompositeTag, since it comes with a default CompositeTagScanner, unless of course you're doing something fancy....

Derrick

Dizzy Reed wrote:

> Hello,
>
> I need to manipulate every tag scanned by my parser, but I want
> preformatted text to be left alone; after searching a bit (in the code
> and in the archive), it appeared that I had to write both a PreTag
> class and a PreScanner class, modeled on StyleTag and StyleScanner
> respectively. So I copied those two files into my working
> directory, and all I did was more or less replace "style" with
> "pre" where it was needed. So basically, it comes down to changing
> names and replacing:
>
> private static final String[] mIds = new String[] {"STYLE"};
> private static final String[] mEndTagEnders = new String[] {"STYLE"};
>
> with:
>
> private static final String[] mIds = new String[] {"PRE", "CODE", "TT"};
> private static final String[] mEndTagEnders = new String[] {"PRE",
> "CODE", "TT"};
>
> This is the beginning of the "handleNode" method of my inherited _parser_,
> to see if this worked:
>
> public void handleNode(Node node) {
>     NodeList children = node.getChildren();
>
>     if(node instanceof PreTag)
>         System.out.println("PreTag works !!!"); // debug
>
>     /* ... some more code below ... */
> }
>
> Though the detection works with, for instance, a DoctypeTag, this
> doesn't. I tested it on the main.html page of htmlparser's website, which
> contains some preformatted text, but it is not detected. Can anyone
> tell me what I've done wrong?
>
> Have a nice day all,
> Anthony.
 |
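Why an unregistered tag class is never seen can be illustrated with a tiny prototype registry in plain Java. This is only a sketch of the general idea, assuming the factory looks tags up by name and falls back to a generic tag; `GenericTag`, `PreLikeTag` and `TagRegistry` are hypothetical stand-ins, not the htmlparser classes or their API.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.function.Supplier;

/** Illustrative stand-in for a plain, non-specialised tag (not an htmlparser class). */
class GenericTag {
    final String name;
    GenericTag(String name) { this.name = name; }
}

/** Illustrative stand-in for a custom tag class such as the PreTag from the thread. */
class PreLikeTag extends GenericTag {
    PreLikeTag() { super("PRE"); }
}

/** Hypothetical factory: only names that were registered produce specialised tags. */
class TagRegistry {
    private final Map<String, Supplier<GenericTag>> prototypes = new HashMap<>();

    void register(String name, Supplier<GenericTag> maker) {
        prototypes.put(name.toUpperCase(Locale.ROOT), maker);
    }

    GenericTag create(String name) {
        String key = name.toUpperCase(Locale.ROOT);
        Supplier<GenericTag> maker = prototypes.get(key);
        // Without a registration the parser can only hand back a generic tag,
        // so an "instanceof PreLikeTag" check in client code never succeeds.
        return maker != null ? maker.get() : new GenericTag(key);
    }

    public static void main(String[] args) {
        TagRegistry registry = new TagRegistry();
        System.out.println(registry.create("pre") instanceof PreLikeTag); // false
        registry.register("PRE", PreLikeTag::new);
        System.out.println(registry.create("pre") instanceof PreLikeTag); // true
    }
}
```

The first check fails for the same reason the `instanceof PreTag` test in the question does: writing the class is not enough if nothing tells the factory to use it.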
From: Dizzy R. <diz...@ho...> - 2004-05-20 13:20:20
|
Hello,

I need to manipulate every tag scanned by my parser, but I want preformatted text to be left alone; after searching a bit (in the code and in the archive), it appeared that I had to write both a PreTag class and a PreScanner class, modeled on StyleTag and StyleScanner respectively. So I copied those two files into my working directory, and all I did was more or less replace "style" with "pre" where it was needed. So basically, it comes down to changing names and replacing:

    private static final String[] mIds = new String[] {"STYLE"};
    private static final String[] mEndTagEnders = new String[] {"STYLE"};

with:

    private static final String[] mIds = new String[] {"PRE", "CODE", "TT"};
    private static final String[] mEndTagEnders = new String[] {"PRE", "CODE", "TT"};

This is the beginning of the "handleNode" method of my inherited _parser_, to see if this worked:

    public void handleNode(Node node) {
        NodeList children = node.getChildren();

        if(node instanceof PreTag)
            System.out.println("PreTag works !!!"); // debug

        /* ... some more code below ... */
    }

Though the detection works with, for instance, a DoctypeTag, this doesn't. I tested it on the main.html page of htmlparser's website, which contains some preformatted text, but it is not detected. Can anyone tell me what I've done wrong?

Have a nice day all,
Anthony. |
From: Martin W. <mar...@ya...> - 2004-05-19 05:17:51
|
Marcio,

I think reading the JSP page directly is not going to tell you much, assuming the JSP tags/scriptlets are generating the useful content. At best you could check the JSP page for XML well-formedness using something like Tidy.

Using something like HtmlUnit, HttpUnit or JWebUnit, you can actually request the page from the web server, which causes the JSP code to execute. Then you are not looking at JSP but at actual HTML.

--Marty

--- "Pietrosoft Informatica Ltda." <pie...@ma...> wrote:
> Hi Martin,
>
> I haven't tried anything yet. I really don't know which is better: parsing the raw JSP or the generated HTML. I think parsing the raw JSP is the easier option, no? My only work in that case would be to find the JSP at its physical path and parse the file. But is it possible to parse the generated HTML? That seems to be the more elegant solution.
>
> Regards,
> Marcio.
>
> ---- Original Message ----
> From: "Martin Wegner"
> To: htm...@li...
> Sent: Tue, May 18, 2004 12:39 am
> Subject: Re: [Htmlparser-user] Parsing JSP with Struts...
>
> Are you trying to parse the raw JSP page or the HTML file generated by Struts?
>
> --- Pietro wrote:
> > Hi,
> >
> > I already used HtmlParser a couple of years ago. On that occasion I was looking for just a simple parser to extract all the strings from static HTML files, and HtmlParser worked perfectly. But now I have a new situation: I have to extract strings by HTML tag, which is fine. The problem is that the HTML files are now dynamic, generated by JSP, and these JSPs are returned by Struts actions (.do).
> >
> > Does anybody here have a solution for this case? I confess that I don't know where to start; any help or guideline will be very welcome.
> >
> > Regards,
> > Pietro.

-------------------------------------------------------
This SF.Net email is sponsored by: SourceForge.net Broadband
Sign-up now for SourceForge Broadband and get the fastest
6.0/768 connection for only $19.95/mo for the first 3 months!
http://ads.osdn.com/?ad_id=2562&alloc_id=6184&op=click
_______________________________________________
Htmlparser-user mailing list
Htm...@li...
https://lists.sourceforge.net/lists/listinfo/htmlparser-user |
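The point about requesting the page from the server, so that the JSP has already run, can be sketched with the JDK alone. This is an illustration only, not HtmlUnit, HttpUnit or JWebUnit: `PageFetcher` is a hypothetical name, the address is supplied by the caller, and the response is assumed to be UTF-8.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper: fetches whatever the server sends back for a URL. */
class PageFetcher {

    /** Returns the response body as text, assuming it is UTF-8 encoded. */
    static String fetch(String url) throws IOException {
        URLConnection conn = new URL(url).openConnection();
        conn.setConnectTimeout(10_000);   // milliseconds
        conn.setReadTimeout(10_000);
        try (InputStream in = conn.getInputStream()) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java PageFetcher <url of a .do action>");
            return;
        }
        // The text printed here is the HTML the JSP generated, not the JSP source.
        System.out.println(fetch(args[0]));
    }
}
```

The returned string can then be handed to any HTML parser. Pages that require a login or a session would need cookie handling, which this sketch leaves out.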
From: Pietro <pie...@ma...> - 2004-05-18 12:18:55
|
Hi Martin,

I haven't tried anything yet. I really don't know which is better: parsing the raw JSP or the generated HTML. I think parsing the raw JSP is the easier option, no? My only work in that case would be to find the JSP at its physical path and parse the file. But is it possible to parse the generated HTML? That seems to be the more elegant solution.

Regards,
Marcio.

---- Original Message ----
From: "Martin Wegner" <mar...@ya...>
To: htm...@li...
Sent: Tue, May 18, 2004 12:39 am
Subject: Re: [Htmlparser-user] Parsing JSP with Struts...

Are you trying to parse the raw JSP page or the HTML file generated by Struts?

--- Pietro <pie...@ma...> wrote:
> Hi,
>
> I already used HtmlParser a couple of years ago. On that occasion I was looking for just a simple parser to extract all the strings from static HTML files, and HtmlParser worked perfectly. But now I have a new situation: I have to extract strings by HTML tag, which is fine. The problem is that the HTML files are now dynamic, generated by JSP, and these JSPs are returned by Struts actions (.do).
>
> Does anybody here have a solution for this case? I confess that I don't know where to start; any help or guideline will be very welcome.
>
> Regards,
> Pietro.
 |
From: Pietrosoft I. Ltda. <pie...@ma...> - 2004-05-18 12:18:09
|
Hi Martin,

I haven't tried anything yet. I really don't know which is better: parsing the raw JSP or the generated HTML. I think parsing the raw JSP is the easier option, no? My only work in that case would be to find the JSP at its physical path and parse the file. But is it possible to parse the generated HTML? That seems to be the more elegant solution.

Regards,
Marcio.

---- Original Message ----
From: "Martin Wegner" <mar...@ya...>
To: htm...@li...
Sent: Tue, May 18, 2004 12:39 am
Subject: Re: [Htmlparser-user] Parsing JSP with Struts...

Are you trying to parse the raw JSP page or the HTML file generated by Struts?

--- Pietro <pie...@ma...> wrote:
> Hi,
>
> I already used HtmlParser a couple of years ago. On that occasion I was looking for just a simple parser to extract all the strings from static HTML files, and HtmlParser worked perfectly. But now I have a new situation: I have to extract strings by HTML tag, which is fine. The problem is that the HTML files are now dynamic, generated by JSP, and these JSPs are returned by Struts actions (.do).
>
> Does anybody here have a solution for this case? I confess that I don't know where to start; any help or guideline will be very welcome.
>
> Regards,
> Pietro.
 |
From: Martin W. <mar...@ya...> - 2004-05-18 03:39:59
|
Are you trying to parse the raw JSP page or the HTML file generated by Struts?

--- Pietro <pie...@ma...> wrote:
> Hi,
>
> I already used HtmlParser a couple of years ago. On that occasion I was looking for just a simple parser to extract all the strings from static HTML files, and HtmlParser worked perfectly. But now I have a new situation: I have to extract strings by HTML tag, which is fine. The problem is that the HTML files are now dynamic, generated by JSP, and these JSPs are returned by Struts actions (.do).
>
> Does anybody here have a solution for this case? I confess that I don't know where to start; any help or guideline will be very welcome.
>
> Regards,
> Pietro.
 |
From: Pietro <pie...@ma...> - 2004-05-17 19:47:42
|
Hi,

I already used HtmlParser a couple of years ago. On that occasion I was looking for just a simple parser to extract all the strings from static HTML files, and HtmlParser worked perfectly. But now I have a new situation: I have to extract strings by HTML tag, which is fine. The problem is that the HTML files are now dynamic, generated by JSP, and these JSPs are returned by Struts actions (.do).

Does anybody here have a solution for this case? I confess that I don't know where to start; any help or guideline will be very welcome.

Regards,
Pietro. |
From: Derrick O. <Der...@Ro...> - 2004-05-07 11:35:22
|
Gerret,

Yes, <p> tags are not recognized as special; they are plain Tags and aren't derived from CompositeTag, so they won't have children. This is because <p> tags aren't necessarily closed with a </p> and, per the HTML spec, can't contain other block-level elements:
http://www.w3.org/TR/html4/

To try making them special on a well-formed document that uses </p> tags, you would need to add a P tag class derived from CompositeTag to the PrototypicalNodeFactory, similar to the custom tag example illustrated here:
http://htmlparser.sourceforge.net/wiki/index.php/CustomTagLinks

In general, the custom P tag class would need to be carefully crafted to finish scanning when it encounters a </p>, another <p> tag, or any of many other tags, by correctly implementing getEnders() and getEndTagEnders(). I'm not sure how much success you would have. It might work for your specific HTML file, but not against all the various HTML that's out there in the wild.

Derrick

Gerret Apelt wrote:

> Hi --
>
> I am having a problem with HTMLParser that is illustrated in the code
> snippet below.
> My goal is to extract the text content of all "<P>" elements in an
> HTML document.
> It appears that my NodeFilter does not find text nodes that are children
> of P-Tag nodes.
> However, if I instead try to find the text children of some other tag
> nodes (e.g., the text node children of "table" tag nodes),
> this appears to work fine.
> Is there a difference between P and TABLE tags in terms of
> HTMLParser's model?
>
> Any help much appreciated.
>
> cheers,
> Gerret
>
> // Parser parser =
> //     Parser.createParser("<HTML><BODY><TABLE>HiThere</TABLE></BODY></HTML>");
> Parser parser =
>     Parser.createParser("<HTML><BODY><P>HiThere</P></BODY></HTML>");
>
> NodeList list = new NodeList ();
> NodeFilter filter = new AndFilter(new NodeClassFilter(StringNode.class),
>                                   new HasParentFilter(new TagNameFilter("P")));
> for (NodeIterator e = parser.elements (); e.hasMoreNodes (); )
>     e.nextNode().collectInto(list, filter);
>
> System.out.println(list.size());
 |
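The role of the "enders" described above can be shown without htmlparser at all. The sketch below is a plain-Java scan, not the library's scanner: `ParagraphText` is a hypothetical name, and both ender lists are assumed, deliberately partial choices. An open <p> is closed by its own </p>, by the next <p> or another listed start tag, or by the end tag of an enclosing element.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch: collects the text of each <p>, closing it on "ender" tags. */
class ParagraphText {
    // Start tags that implicitly close an open <p> (assumed, partial list).
    private static final Set<String> ENDERS = new HashSet<>(Arrays.asList(
            "P", "DIV", "TABLE", "UL", "OL", "PRE", "H1", "H2", "H3", "H4", "H5", "H6"));
    // End tags that close an open <p>: its own, or an enclosing element's (assumed, partial).
    private static final Set<String> END_TAG_ENDERS = new HashSet<>(Arrays.asList(
            "P", "DIV", "TD", "TABLE", "BODY", "HTML"));
    // Very rough tag matcher; comments and '>' inside attribute values are not handled.
    private static final Pattern TAG = Pattern.compile("<(/?)([A-Za-z][A-Za-z0-9]*)[^>]*>");

    static List<String> extract(String html) {
        List<String> paragraphs = new ArrayList<>();
        StringBuilder current = null;          // non-null while inside a <p>
        int pos = 0;                           // end of the previous tag
        Matcher m = TAG.matcher(html);
        while (m.find()) {
            if (current != null) {
                current.append(html, pos, m.start());   // text since the previous tag
            }
            pos = m.end();
            boolean isEndTag = !m.group(1).isEmpty();
            String name = m.group(2).toUpperCase(Locale.ROOT);
            boolean closes = isEndTag ? END_TAG_ENDERS.contains(name) : ENDERS.contains(name);
            if (current != null && closes) {
                paragraphs.add(current.toString().trim());
                current = null;
            }
            if (!isEndTag && name.equals("P")) {
                current = new StringBuilder();           // a new paragraph starts here
            }
        }
        if (current != null) {                 // input ended with the paragraph still open
            current.append(html, pos, html.length());
            paragraphs.add(current.toString().trim());
        }
        return paragraphs;
    }

    public static void main(String[] args) {
        System.out.println(extract("<HTML><BODY><P>HiThere</P></BODY></HTML>"));
        System.out.println(extract("<body><p>one<p>two <b>bold</b><div>x</div></body>"));
    }
}
```

The second call in `main` has no </p> at all, which is the case that makes a reliable P tag class hard to write: every tag missing from the lists is a paragraph that runs on too long.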
From: Gerret A. <ga...@cs...> - 2004-05-07 04:31:22
|
Hi --

I am having a problem with HTMLParser that is illustrated in the code snippet below.
My goal is to extract the text content of all "<P>" elements in an HTML document.
It appears that my NodeFilter does not find text nodes that are children of P-Tag nodes.
However, if I instead try to find the text children of some other tag nodes (e.g., the text node children of "table" tag nodes), this appears to work fine.
Is there a difference between P and TABLE tags in terms of HTMLParser's model?

Any help much appreciated.

cheers,
Gerret

    // Parser parser = Parser.createParser("<HTML><BODY><TABLE>HiThere</TABLE></BODY></HTML>");
    Parser parser = Parser.createParser("<HTML><BODY><P>HiThere</P></BODY></HTML>");

    NodeList list = new NodeList ();
    NodeFilter filter = new AndFilter(new NodeClassFilter(StringNode.class),
                                      new HasParentFilter(new TagNameFilter("P")));
    for (NodeIterator e = parser.elements (); e.hasMoreNodes (); )
        e.nextNode().collectInto(list, filter);

    System.out.println(list.size());
 |