The Crawljax team is pleased to announce the crawljax-2.0 release. This release supports multi-browser crawling and includes many improvements. Crawljax is
Webstemmer is a web crawler and HTML layout analyzer that automatically extracts the main text of a news site without having banners, ads and/or navigation links mixed up
H. Zhang, A. Santos, and J. Freire. Proceedings of the 30th ACM International Conference on Information & Knowledge Management, ACM, (October 2021)
J. Rennie and A. McCallum. Proceedings of the Sixteenth International Conference on Machine Learning, pp. 335--343. San Francisco, CA, USA, Morgan Kaufmann Publishers Inc., (1999)
X. Wang, L. Tokarchuk, F. Cuadrado, and S. Poslad. Proceedings of the 2013 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, pp. 311--315. New York, NY, USA, ACM, (2013)
M. Ehrig, J. Hartmann, and C. Schmitz. Workshop ``Semantische Technologien für Informationsportale'' (GI-Jahrestagung 2004), Gesellschaft für Informatik, (September 2004)
G. Gossen, E. Demidova, and T. Risse. Proceedings of the 15th ACM/IEEE-CS Joint Conference on Digital Libraries, pp. 75--84. New York, NY, USA, ACM, (2015)