The web can be represented as a directed graph with distinct regions: SCC (the strongly connected component), IN, OUT and TENDRILS.
The regions are defined by which websites can reach which others along paths of links.
The number of links to and from a website (in- and out-degree) appears to follow a power law, which is also discussed in this document.
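The reachability-based definition of the regions can be sketched in a few lines of Python. This is a minimal illustration, not the method of the annotated document: the function names `reachable` and `bow_tie`, the edge-list input format and the toy graph are all assumptions made for the example, and everything outside SCC, IN and OUT is lumped into a single "OTHER" set (tendrils plus disconnected pages) rather than separated further.

```python
from collections import deque


def reachable(adj, start):
    """Return every node reachable from the nodes in `start` (BFS over `adj`)."""
    seen = set(start)
    queue = deque(start)
    while queue:
        node = queue.popleft()
        for nxt in adj.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def bow_tie(edges):
    """Split a directed graph, given as (source, target) pairs, into regions.

    Hypothetical helper for illustration only. "OTHER" holds tendrils and
    disconnected nodes together.
    """
    nodes, fwd, bwd = set(), {}, {}
    for src, dst in edges:
        nodes.update((src, dst))
        fwd.setdefault(src, []).append(dst)   # links going out
        bwd.setdefault(dst, []).append(src)   # links coming in

    # Find the largest strongly connected component: the SCC of v is the set
    # of nodes that v can reach and that can also reach v.
    core, unassigned = set(), set(nodes)
    while unassigned:
        v = next(iter(unassigned))
        scc = reachable(fwd, [v]) & reachable(bwd, [v])
        unassigned -= scc
        if len(scc) > len(core):
            core = scc

    out_region = reachable(fwd, core) - core  # reachable from the core
    in_region = reachable(bwd, core) - core   # can reach the core
    other = nodes - core - in_region - out_region
    return {"SCC": core, "IN": in_region, "OUT": out_region, "OTHER": other}


# Toy graph (made up): a cycle a -> b -> c -> a forms the core.
example_edges = [
    ("a", "b"), ("b", "c"), ("c", "a"),  # core cycle
    ("i", "a"),                          # i links into the core
    ("c", "o"),                          # the core links out to o
    ("i", "t"),                          # t hangs off IN without reaching the core
    ("x", "y"),                          # disconnected pair
]
regions = bow_tie(example_edges)
```

The quadratic search for the largest component is acceptable for a toy graph; for web-scale data a linear-time algorithm such as Tarjan's or Kosaraju's would be the usual choice.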
"This publication presents central results of the joint project
'Entwicklung zukunftsträchtiger Mediendienste' (DeMeS
- Development of Media Services), which was carried out in 1997 and 1998
and funded by the German Federal Ministry of Education, Science,
Research and Technology (BMBF) as part of the programme 'Arbeit und
Technik' (AuT).
Large quantities of historical newspapers are being digitized and OCR'd. We describe a framework for processing the OCR'd text to identify articles and extract their metadata. We describe the article schema and give examples of features that facilitate automatic indexing of the articles. For this processing, we employ lexical semantics, structural models, and community content. Furthermore, we describe visualization and summarization techniques that can be used to present the extracted events.
K. Varani, S. Gessi, S. Merighi, F. Vincenzi, E. Cattabriga, A. Benini, K.-N. Klotz, P. G. Baraldi, M. A. Tabrizi, S. Mac Lennan, E. Leung and P. A. Borea. Biochemical Pharmacology (Biochem Pharmacol), 70 (11):
1601-12 (25 November 2005). Epub 10 October 2005. England.