Abstract. To help web applications understand the content of HTML pages, an increasing number of websites have started to annotate structured data within their pages using markup formats such as Microdata, RDFa, and Microformats. Google, Yahoo!, Yandex, Bing, and Facebook use these annotations to enrich search results and to display entity descriptions within their applications. In this paper, we present a series of publicly accessible Microdata, RDFa, and Microformats datasets that we have extracted from three large web corpora dating from 2010, 2012, and 2013.