The Web Data Commons project extracts structured data from the Common Crawl, the largest publicly available web corpus, and offers the extracted data for public download to help researchers and companies exploit the wealth of information available on the Web.
RDFa is an extension to HTML5 that helps you mark up things like people, places, events, recipes, and reviews. Search engines and web services use this markup to generate better search listings and give you better visibility on the Web, so that people can find your website more easily.
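As a rough illustration of what such markup looks like and how a program might read it, the sketch below embeds a small RDFa-annotated snippet and collects its `typeof` and `property` values using only Python's standard `html.parser`. The sample page, the class `RDFaPropertyCollector`, and the function `extract_rdfa` are hypothetical names chosen for this example. This is not a conforming RDFa processor, and it is not the extraction pipeline used by any project mentioned here; it ignores nesting, prefixes, and attributes such as `content` or `resource`.

```python
from html.parser import HTMLParser

# Hypothetical sample page; the vocabulary URL and the values are illustrative only.
SAMPLE_HTML = """
<div vocab="https://schema.org/" typeof="Person">
  <span property="name">Alice Example</span>
  <span property="jobTitle">Librarian</span>
</div>
"""


class RDFaPropertyCollector(HTMLParser):
    """Collects `typeof` values and the text of elements carrying `property`."""

    def __init__(self):
        super().__init__()
        self.types = []        # values of every `typeof` attribute seen
        self.properties = {}   # property name -> text content of that element
        self._current = None   # property of the element currently open, if any

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        if "typeof" in attr_map:
            self.types.append(attr_map["typeof"])
        # Remember the property (or None) so handle_data knows where text belongs.
        self._current = attr_map.get("property")

    def handle_data(self, data):
        text = data.strip()
        if self._current and text:
            self.properties[self._current] = text

    def handle_endtag(self, tag):
        # Simplification: no nesting support, so any end tag clears the state.
        self._current = None


def extract_rdfa(html):
    """Return (types, properties) found in the given HTML string."""
    collector = RDFaPropertyCollector()
    collector.feed(html)
    collector.close()
    return collector.types, collector.properties


if __name__ == "__main__":
    types, properties = extract_rdfa(SAMPLE_HTML)
    print(types)
    print(properties)
```

A real extractor would resolve each property against the declared `vocab` to produce full RDF triples; this sketch only shows where the annotations live in the HTML.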
S. Staab, M. Erdmann, A. Maedche, and S. Decker. Proc. of the First Workshop on the Semantic Web at the Fourth European Conference on Research and Advanced Technology for Digital Libraries, Lisbon, Portugal, 18-20 September 2000. (2000)
J. Tane, W. Siberski, W. Nejdl, and B. Simon. Proc. of the First International Semantic Web Conference 2002 (ISWC 2002), June 9-12, 2002, Sardinia, Italy, volume 2342 of LNCS, pages 236-249. Springer. (2002)