WebSPHINX (Website-Specific Processors for HTML INformation eXtraction) is a Java class library and interactive development environment for web crawlers.
G. Manku, A. Jain, and A. Sarma. WWW '07: Proceedings of the 16th International Conference on World Wide Web, pages 141–150. New York, NY, USA, ACM (2007)
Z. Bar-Yossef, I. Keidar, and U. Schonfeld. WWW '07: Proceedings of the 16th International Conference on World Wide Web, pages 111–120. New York, NY, USA, ACM (2007)
A. Broder, M. Najork, and J. Wiener. WWW '03: Proceedings of the 12th International Conference on World Wide Web, pages 679–689. New York, NY, USA, ACM (2003)