Webbots, Spiders, and Screen Scrapers is "unmatched to my knowledge in how it covers PHP/CURL. It explains to great details on how to write web clients using PHP/CURL, what pitfalls there are, how to make your code behave well and much more."
OpenAcoon is a search engine available as open source. The software has been used for years by the Acoon search engine, which also continues to develop it.
OpenAcoon is written in Pascal and currently runs only on Windows. We are already working on adapting the sources for FreePascal so that the software runs on both Windows and Linux.
In the past few months we have been exploring some HTML forms to try to discover new web pages and URLs that we otherwise couldn't find and index for users who search on Google. Specifically, when we encounter a <FORM> element on a high-quality site, we might choose to do a small number of queries using the form. For text boxes, our computers automatically choose words from the site that has the form; for select menus, check boxes, and radio buttons on the form, we choose from among the values of the HTML. Having chosen the values for each input, we generate and then try to crawl URLs that correspond to a possible query a user may have made. If we ascertain that the web page resulting from our query is valid, interesting, and includes content not in our index, we may include it in our index much as we would include any other web page.
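The URL-generation step described above can be sketched roughly as follows. This is an illustration only, not Google's actual crawler: the names `FormParser` and `candidate_urls`, the `limit` parameter, and the restriction to GET forms are all assumptions made for the example. Text boxes are filled with words supplied by the caller (standing in for "words from the site"), while select menus, check boxes, and radio buttons use the values found in the HTML.

```python
from html.parser import HTMLParser
from itertools import islice, product
from urllib.parse import urlencode, urljoin


class FormParser(HTMLParser):
    """Collect action, method, and candidate values from a page's form (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.action = ""
        self.method = "get"
        self.fields = {}          # field name -> list of candidate values
        self.text_fields = set()  # names of text boxes, filled with site words later
        self._select = None       # name of the <select> currently open, if any

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "form":
            self.action = a.get("action") or ""
            self.method = (a.get("method") or "get").lower()
        elif tag == "select":
            self._select = a.get("name")
            if self._select:
                self.fields.setdefault(self._select, [])
        elif tag == "option" and self._select and a.get("value"):
            self.fields[self._select].append(a["value"])
        elif tag == "input" and a.get("name"):
            kind = (a.get("type") or "text").lower()
            if kind in ("radio", "checkbox") and a.get("value"):
                self.fields.setdefault(a["name"], []).append(a["value"])
            elif kind == "text":
                self.fields.setdefault(a["name"], [])
                self.text_fields.add(a["name"])

    def handle_endtag(self, tag):
        if tag == "select":
            self._select = None


def candidate_urls(page_url, html, site_words, limit=10):
    """Return up to `limit` query URLs a user could have produced with the form."""
    parser = FormParser()
    parser.feed(html)
    if parser.method != "get":   # assumption: only GET forms map cleanly to URLs
        return []
    for name in parser.text_fields:
        parser.fields[name] = list(site_words)
    names = [n for n, values in parser.fields.items() if values]
    if not names:
        return []
    # Assumes the form action carries no query string of its own.
    base = urljoin(page_url, parser.action)
    combos = product(*(parser.fields[n] for n in names))
    return [base + "?" + urlencode(dict(zip(names, c))) for c in islice(combos, limit)]


if __name__ == "__main__":
    sample = (
        '<form action="/search" method="get">'
        '<input type="text" name="q">'
        '<select name="cat"><option value="a">A</option>'
        '<option value="b">B</option></select></form>'
    )
    for url in candidate_urls("http://example.com/page", sample, ["spider"]):
        print(url)
```

Capping the number of combinations with `limit` mirrors the "small number of queries" mentioned in the text; a real crawler would also need to fetch each URL and judge whether the resulting page is valid and new, which this sketch leaves out.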