Introduction
On several occasions while developing database-driven web applications, I've been approached by clients who want Google-style search implemented at the last minute of the development cycle. Usually this leads to using some canned script that crawls the website, or a hacked-up search function that uses the database but returns either too many results or none at all. On top of that, the search runs too many queries, or queries that are too slow.
Until now, most developers have been forced to use relational databases to power search, install extra component packages, or seek out other non-PHP solutions. The problem with relying on a relational database feature, such as MySQL's full-text indexing, is that scalability problems crop up as your search criteria become more complicated.
One of the features that sets the Zend Framework apart from the others is the inclusion of a decent search module. Zend_Search_Lucene is a PHP port of the Apache Lucene project, a full-text search engine framework. Zend_Search_Lucene promises a simple way to add search functionality to an application without requiring additional PHP extensions or even a database.
Zend_Search_Lucene overcomes the usual limitations of relational databases with features such as fast indexing, ranked result sets, a powerful but simple query syntax, and the ability to index multiple fields. Better still, a Zend_Search_Lucene index can live happily alongside your relational database, providing fast searching without the duplicated effort of storing all of your data twice. In this tutorial, I'll show you how to use Zend_Search_Lucene to index and search some RSS feeds.
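To make those features concrete, here is a minimal sketch of indexing and searching a couple of feed items. It assumes Zend Framework 1.x is on the include path; the sample items, the index directory name, and the query string are hypothetical and only there for illustration. It is a rough outline of the idea, not the finished code from this tutorial.

```php
<?php
// Assumption: Zend Framework 1.x is installed and on the include path.
require_once 'Zend/Search/Lucene.php';

// Hypothetical sample data standing in for parsed RSS items.
$items = array(
    array(
        'title'       => 'Lucene ported to PHP',
        'link'        => 'http://example.com/lucene-php',
        'description' => 'A full-text search engine that needs no extensions.',
    ),
    array(
        'title'       => 'Last-minute feature requests',
        'link'        => 'http://example.com/feature-requests',
        'description' => 'Adding search to a web application late in the cycle.',
    ),
);

// Create a fresh index in a directory on disk (hypothetical location).
$indexPath = sys_get_temp_dir() . '/feed-index-example';
$index = Zend_Search_Lucene::create($indexPath);

foreach ($items as $item) {
    $doc = new Zend_Search_Lucene_Document();
    // Text: tokenized, indexed, and stored, so it can be searched and displayed.
    $doc->addField(Zend_Search_Lucene_Field::Text('title', $item['title']));
    // UnIndexed: stored for display only, not searchable.
    $doc->addField(Zend_Search_Lucene_Field::UnIndexed('link', $item['link']));
    // UnStored: searchable, but the text itself is not kept in the index.
    $doc->addField(Zend_Search_Lucene_Field::UnStored('description', $item['description']));
    $index->addDocument($doc);
}
$index->commit();

// Field-specific term combined with a term searched across all indexed fields.
$hits = $index->find('title:lucene OR search');

// Hits come back ordered by relevance score.
foreach ($hits as $hit) {
    printf("%.3f  %s  %s\n", $hit->score, $hit->title, $hit->link);
}
```

The `UnStored` field type is what lets the index sit beside a relational database without holding a second copy of the data: the description is searchable, but the full text stays in the original store and only the fields needed to display a result are kept in the index.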
Swish-e is a fast, flexible, and free open source system for indexing collections of Web pages or other files. Swish-e is ideally suited to collections of a million documents or fewer. Using the GNOME™ libxml2 parser and a collection of filters, Swish-e can index plain text, e-mail, PDF, HTML, XML, Microsoft® Word/PowerPoint/Excel, and just about any file that can be converted to XML or HTML text. Swish-e is also often used to supplement databases like the MySQL® DBMS for very fast full-text searching.
Find and download data in any format, from financial to social networking to GIS data. Or sell data in our data marketplace, at a price you set. We have large data sets, spreadsheets, and databases packed with statistics.
Y. Yanbe, A. Jatowt, S. Nakamura, and K. Tanaka. JCDL '07: Proceedings of the 2007 conference on Digital libraries, pp. 107-116. New York, NY, USA, ACM Press, (2007)
R. Jäschke, B. Krause, A. Hotho, and G. Stumme. Proceedings of the Second International Conference on Weblogs and Social Media (ICWSM 2008), AAAI Press, (2008)
A. Hotho, R. Jäschke, C. Schmitz, and G. Stumme. The Semantic Web: Research and Applications, volume 4011 of Lecture Notes in Computer Science, pp. 411-426. Heidelberg, Springer, (June 2006)