The Web Data Commons project extracts structured data from the Common Crawl, the largest web corpus available to the public, and provides the extracted data for public download in order to support researchers and companies in exploiting the wealth of information that is available on the Web.
D. Sonntag, R. Neßelrath, G. Sonnenberg, and G. Herzog. Paper presented at the First International Workshop on Spoken Dialogue Systems Technology (IWSDS-2009), Kloster Irsee, Germany (December 2009). Available from http://www.dfki.de/web/forschung/publikationen?pubid=4673.