Hi Folks,
Integration release 1.2-20021016 is out. You can get it from
http://htmlparser.sourceforge.net
Here's the change log:
Integration Build 1.2 - 20021016
--------------------------------
[1] Fixed bug 621117 - JSP tags not recognized if within string node
[2] Fixed bug 617228 - Links with > symbol in query strings were not
being recognized.
[3] build.xml completely automatic - no manual changes needed before running
[4] build.xml included in release package, inside src.zip
[5] Refactored HTMLTag - design modified, introduced HTMLTagParser helper
class
[6] Optimized scanning process - 20% faster now
There have been some refactorings and optimizations in this release. Most
notably, the scanners are no longer enumerated sequentially. Instead, they
are stored inside hashtables, and are identified by the first word that
occurs in a tag (in uppercase). There is now a default implementation of
evaluate() which returns true, and most of the scanners don't override it,
since their evaluation is based simply on matching the first word. However,
if the matching logic is more complex, evaluate() should be overridden.
An additional method has been introduced in HTMLTagScanner, which all
scanners have to override: getID(). It is called only once, inside
addScanner(), to register the scanner in the hashtable.
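To make the lookup scheme concrete, here is a minimal sketch of the idea in Java. It is not the actual htmlparser source: SketchScanner, LinkScanner, MetaRefreshScanner, ScannerRegistry and find() are hypothetical names made up for illustration. Only getID(), evaluate() and addScanner() echo the methods mentioned above, and their signatures here are assumptions.

```java
import java.util.Hashtable;
import java.util.Locale;
import java.util.Map;

// Hypothetical stand-in for a tag scanner; NOT the real HTMLTagScanner API.
abstract class SketchScanner {
    // First word of the tags this scanner handles, e.g. "A" or "META".
    abstract String getID();

    // Default: a first-word match is enough, so accept the tag.
    boolean evaluate(String tagContents) {
        return true;
    }
}

// Simple case: matching on the first word is sufficient, no override needed.
class LinkScanner extends SketchScanner {
    String getID() {
        return "A";
    }
}

// More complex case (illustrative): only META tags carrying HTTP-EQUIV qualify.
class MetaRefreshScanner extends SketchScanner {
    String getID() {
        return "META";
    }

    @Override
    boolean evaluate(String tagContents) {
        return tagContents.toUpperCase(Locale.ROOT).contains("HTTP-EQUIV");
    }
}

// Hypothetical registry: scanners are looked up by key instead of being
// tried one after another.
class ScannerRegistry {
    private final Map<String, SketchScanner> scanners = new Hashtable<>();

    // getID() is called once per scanner, at registration time.
    void addScanner(SketchScanner scanner) {
        scanners.put(scanner.getID().toUpperCase(Locale.ROOT), scanner);
    }

    // Returns the scanner for the tag's first word, or null if none applies.
    SketchScanner find(String tagContents) {
        String trimmed = tagContents.trim();
        if (trimmed.isEmpty()) {
            return null;
        }
        String firstWord = trimmed.split("\\s+", 2)[0].toUpperCase(Locale.ROOT);
        SketchScanner candidate = scanners.get(firstWord);
        return (candidate != null && candidate.evaluate(trimmed)) ? candidate : null;
    }
}
```

With this layout, finding a scanner costs one hashtable lookup plus one evaluate() call, however many scanners are registered, which is presumably where the speedup comes from.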
In addition, feedback is being incorporated - you will find feedback if you
run the test cases.
The performance improvement is substantial. Running
com.kizna.htmlTests.PerformanceTest.java with all scanners registered, on
the MySQL installation guide page, I saw the time drop by 500 ms, from
2500 ms to 2000 ms.
For developers (or folks who want to join) - the build script is now
included in the distribution. It is a whole lot more powerful: it
autodetects the code version, etc. Making your package ready for
distribution is exceedingly simple now - so do go ahead and explore.
Regards,
Somik