It contains a web crawler, an HTML parser and ("in the near future") NER and REX components.
It also includes JWikiDocs, a Java tool for crawling and downloading Wikipedia documents.
HTML Parser is a Java library used to parse HTML in either a linear or nested fashion. Primarily used for transformation or extraction, it features filters, visitors, custom tags and easy-to-use JavaBeans. It is a fast, robust and well-tested package.
It is a real-time parser for real-world HTML. What has attracted most developers to HTML Parser is its simplicity of design, its speed and its ability to handle streaming real-world HTML.
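To make the idea of linear, filter-based extraction concrete, here is a minimal sketch that uses only the Java standard library. It is not HTML Parser's API: the class name LinearTagScan, the method extractLinks and the regular expressions are all hypothetical, and a regex scan is far less robust than a real parser. It only shows the pattern of walking tags in document order and keeping the ones a filter accepts.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.function.Predicate;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only; this is NOT the HTML Parser library's API.
class LinearTagScan {
    // Matches an opening tag such as <a href="x"> and captures its name and raw attribute text.
    private static final Pattern OPEN_TAG =
            Pattern.compile("<([a-zA-Z][a-zA-Z0-9]*)\\b([^>]*)>");
    // Matches a double-quoted href attribute inside the captured attribute text.
    private static final Pattern HREF =
            Pattern.compile("href\\s*=\\s*\"([^\"]*)\"", Pattern.CASE_INSENSITIVE);

    /**
     * Walks opening tags in document order (linear, no tree is built) and
     * collects the href value of every tag whose lower-cased name passes the filter.
     */
    static List<String> extractLinks(String html, Predicate<String> tagNameFilter) {
        List<String> links = new ArrayList<>();
        Matcher tag = OPEN_TAG.matcher(html);
        while (tag.find()) {
            String name = tag.group(1).toLowerCase(Locale.ROOT);
            if (!tagNameFilter.test(name)) {
                continue; // the filter rejected this tag
            }
            Matcher href = HREF.matcher(tag.group(2));
            if (href.find()) {
                links.add(href.group(1));
            }
        }
        return links;
    }

    public static void main(String[] args) {
        String html = "<p>See <a href=\"http://example.org/a\">A</a> and "
                + "<A HREF=\"b.html\">B</A><img src=\"c.png\"></p>";
        // Keep only anchor tags, in the order they appear in the input.
        System.out.println(extractLinks(html, name -> name.equals("a")));
    }
}
```

A nested parse would instead build a tree of elements and their children. The linear form shown here suits streaming input, since each tag can be handled as soon as it is seen.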
OpenLaszlo programs are written in XML and JavaScript and transparently compiled to Flash and, with OpenLaszlo 4, DHTML. The OpenLaszlo APIs provide animation, layout, data binding, server communication, and declarative UI. An OpenLaszlo application can be as short as a single source file, or factored into multiple files that define reusable classes and libraries.
OpenLaszlo is "write once, run everywhere." An OpenLaszlo application developed on one machine will run on all leading Web browsers on all leading desktop operating systems.